order in which they are to be displayed. Consider the
following example:

Display Order    I1   B2   B3   P4   B5   B6   P7   B8
Decoder Input    I1   P4   B2   B3   P7   B5   B6   I10

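The relationship between the two rows of this example can be reproduced with a short sketch. It is only an illustration, not SLIB code: the function name is hypothetical, and the display sequence is extended with two assumed pictures (B9 and I10) so that the last B-pictures have a future reference. The rule it applies is that a reference picture (I or P) must reach the decoder before any B-picture that depends on it.

```python
def decoder_input_order(display_order):
    """Reorder pictures so each reference precedes the B-pictures that need it.

    Hypothetical helper for illustration; pictures are strings such as "B2".
    """
    out = []
    pending_b = []  # B-pictures waiting for their future reference picture
    for pic in display_order:
        if pic.startswith("B"):
            pending_b.append(pic)
        else:
            out.append(pic)        # I- or P-picture goes to the decoder first
            out.extend(pending_b)  # then the B-pictures displayed before it
            pending_b = []
    return out + pending_b         # any trailing B-pictures (no later reference)


# Display order from the example, extended with B9 and I10 (an assumption).
display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7", "B8", "B9", "I10"]
reordered = decoder_input_order(display)
print(" ".join(reordered))
```

The first eight entries of the result correspond to the decoder input row of the example.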
The order mismatch is an artifact of the compression algorithm: a B-picture
cannot be decoded until both its past and future reference frames have been
decoded. Similarly, a P-picture cannot be decoded until its past reference
frame has been decoded. To get around this problem, SLIB defines
an output multibuffer. The size of this multibuffer is approximately equal to
three times the size of a single uncompressed frame. For example, for a 4:2:0
subsampled CIF image, the size of the multibuffer would be 352 by 288 by 1.5
by 3 bytes, that is, 456,192 bytes (the exact size is returned by the library
during initial codec setup). After steady state has been reached, each
invocation to the decompress call yields the correct next frame to be
displayed as shown in Figure 11. To avoid expensive copy operations, the
multibuffer is allocated and owned by the software above SLIB.

ITU-T's Recommendation H.261 (a.k.a. p × 64)

At the library level, decompressing an H.261 stream is very similar to MPEG-1
decoding with one exception: instead of three types of pictures, the H.261
recommendation defines only two, key frames and non-key frames (no
bidirectional prediction). The implication for implementation is that the size
of the multibuffer is approximately twice the size of a single decompressed
frame. Furthermore,
the order in which compressed frames are presented to the decompressor is the
same as the order in which they are to be displayed.

To satisfy the H.261 recommendation, SLIB implements a streaming interface for
compression and decompression. In this
model, the application feeds input buffers to the codec, which processes the
data in the buffers and returns the processed data to the application through
a callback routine. During decompression, the application layer passes input
buffers containing sections of an H.261 bit stream. The bit stream can be
divided arbitrarily, or, in the case of live teleconferencing, each buffer can
contain data from a transmission packet. Empty output buffers are also passed
to the codec
to fill with reconstructed images. Picture frames do not have to be aligned on
buffer boundaries. The codec parses the bit stream and, when enough data is
available, reconstructs an image. Input buffers are freed by calling the
callback routine. When an image is reconstructed, it is placed in an output
buffer and the buffer is returned to the application through the callback
routine. The compression process is similar, but input buffers contain images
and output buffers contain bit-stream data.
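
The buffer flow described above can be sketched with a toy decoder. This is a minimal illustration under stated assumptions, not SLIB's API: the class name, method names, and event strings are hypothetical, and a fixed four-byte "picture" stands in for real H.261 bit-stream parsing. What it shows is that the application may split the stream arbitrarily, that input buffers are released through the callback, and that completed images come back through the same callback.

```python
from collections import deque

FRAME_BYTES = 4  # assumption: each toy "compressed picture" is exactly 4 bytes


class ToyStreamingDecoder:
    """Hypothetical stand-in for a streaming codec; not the SLIB interface."""

    def __init__(self, callback):
        self.callback = callback    # called as callback(event_name, buffer)
        self.pending = bytearray()  # bit-stream bytes not yet turned into images
        self.outputs = deque()      # empty output buffers owned by the application

    def add_output_buffer(self, buf):
        self.outputs.append(buf)
        self._reconstruct()

    def add_input_buffer(self, data):
        self.pending += data                    # simplification: copy, then
        self.callback("input_released", data)   # hand the buffer straight back
        self._reconstruct()

    def _reconstruct(self):
        # Emit an image whenever a whole picture and an output buffer are available.
        while len(self.pending) >= FRAME_BYTES and self.outputs:
            picture = bytes(self.pending[:FRAME_BYTES])
            del self.pending[:FRAME_BYTES]
            out = self.outputs.popleft()
            out[:] = picture                    # toy "decode": copy the bytes
            self.callback("image_ready", out)


events = []
decoder = ToyStreamingDecoder(lambda kind, buf: events.append((kind, bytes(buf))))
decoder.add_output_buffer(bytearray(FRAME_BYTES))
decoder.add_output_buffer(bytearray(FRAME_BYTES))

stream = b"AAAABBBB"  # two toy pictures
for chunk in (stream[:3], stream[3:6], stream[6:]):  # not aligned on pictures
    decoder.add_input_buffer(chunk)

images = [buf for kind, buf in events if kind == "image_ready"]
print(images)
```

Because the application only supplies buffers and reacts to callbacks, it never needs to know where one picture ends and the next begins.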
One advantage to this streaming interface is that the application layer does
not need to know the syntax of the H.261 bit stream. The codec is responsible
for all bit-stream parsing. Another advantage is that the callback mechanism
for returning completed images or bit-stream buffers allows the application to
do other tasks without implementing multithreading.

SLIB's architecture and API can easily accommodate ISO's MPEG-2 and ITU-T's
H.263 video compression algorithms because of their similarity to the MPEG-1
and H.261 algorithms.

Implementation of Video Rendering

Our software implementation of video rendering essentially parallels the
hardware realization detailed elsewhere in this issue. As with the hardware
implementation, the software renderer is fast and simple because the
complicated computations are performed off line in building the various
look-up tables. In both hardware and software cases, a shortcut is achieved by
dithering in YUV space and then converting to some small number of RGB index
values in a look-up table. Although in
most cases the mapping values in the look-up tables remain fixed for the
duration of the run, the video library provides routines to dynamically adjust
image brightness, contrast, saturation, and the number of colors. Image
scaling is possible but affects performance. When quality is important, the
software performs scaling before dithering, and when speed is the primary
concern, it is done after dithering.

Optimizations

We approached the problem of optimization from two directions:
Platform-independent optimizations, or algorithmic enhancements, were done by
exploiting knowledge of the compression algorithm and the input data stream.
Platform-dependent optimizations were done by examining the services available
from the underlying operating system and by evaluating the attributes of the
system's processor.

[Figure 11. Multibuffering in SLIB (axis label: TIME)]

As can be seen from Table 2, the DCT is one of the most computationally
intensive components in the compression pipeline.
It is also common to all five international standards. Therefore, a special
effort was made in choosing and optimizing the DCT. Since all five standards
call for the inverse DCT (IDCT) to be postprocessed with inverse quantization,
significant algorithmic savings were obtained by computing a scalar multiple
of the DCT and merging the appropriate scaling into the quantizer. The
DCT implemented in the library is a modified version of the one-dimensional
scaled DCT proposed by Arai et al. The two-dimensional DCT is obtained by
performing a one-dimensional DCT on the columns followed by a one-dimensional
DCT on the rows.
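
The column-then-row construction can be illustrated with a small sketch. Note the assumptions: the one-dimensional transform below is the plain, unoptimized orthonormal DCT-II definition, not the fast scaled algorithm of Arai et al. that the library uses, and the function names are hypothetical. The point is only that an 8 x 8 two-dimensional DCT can be built from sixteen one-dimensional transforms (eight columns, then eight rows).

```python
import math

N = 8  # block size used by the DCT-based standards


def dct_1d(x):
    """Direct orthonormal 8-point DCT-II (textbook definition, not a fast algorithm)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        total = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for n in range(N))
        out.append(scale * total)
    return out


def dct_2d(block):
    """2-D DCT of an 8x8 block: 1-D DCT on the columns, then on the rows."""
    # Transform every column; cols[c][k] is coefficient k of column c.
    cols = [dct_1d([block[r][c] for r in range(N)]) for c in range(N)]
    # Put the results back in row-major layout: tmp[k][c].
    tmp = [[cols[c][k] for c in range(N)] for k in range(N)]
    # Transform every row of the intermediate result.
    return [dct_1d(row) for row in tmp]


flat = [[1.0] * N for _ in range(N)]  # a constant block
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))         # DC term; all other terms are ~0
```

For a constant block, all the energy lands in the DC coefficient, which is a quick sanity check on the separable construction.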
A total of 80 multiplies and