ECMCS'99
EURASIP Conference
DSP for Multimedia Communications and Services
Kraków, 24-26 June
BANDWIDTH-EFFICIENT WIRELESS MULTIMEDIA
COMMUNICATIONS:
LIMITATIONS, SOLUTIONS AND CHALLENGES
INVITED PAPER
Lajos Hanzo
Dept. of Electr. and Comp. Sc., Univ. of Southampton, SO17 1BJ, UK.
Tel: +44-703-593125, Fax: +44-703-594 508
Email: [email protected]
http://www-mobile.ecs.soton.ac.uk
ABSTRACT
Commencing with a brief history of mobile communications and the
portrayal of the basic concept of wireless multimedia communications,
the implications of Shannon's theorems regarding joint source and
channel coding for wireless communications are addressed. Following a
brief introduction to speech, video and graphical source coding as well
as the cellular concept, a rudimentary overview of flexible,
reconfigurable mobile radio schemes is provided. We then summarise the
fundamental concepts of modulation, introduce an adaptive modem scheme
and argue that third-generation transceivers might become adaptively
reconfigurable under network control in order to meet backwards
compatibility requirements with existing systems and to achieve the
best compromise amongst a range of conflicting system requirements in
terms of communications quality, bandwidth requirements, complexity and
power consumption, robustness against channel errors, etc.
© 1998 IEEE. REPRINTED WITH PERMISSION FROM PROC. OF THE IEEE,
JULY 1998, VOL. 86, NO. 7, PP 1342-1382, CENTENNIAL SPECIAL ISSUE -
100 YEARS OF RADIO COMMUNICATIONS
THE HELPFUL SUGGESTIONS OF THE ANONYMOUS REVIEWERS ARE GRATEFULLY
ACKNOWLEDGED
A RANGE OF ASSOCIATED PAPERS CAN BE FOUND UN-
1 THE WIRELESS COMMUNICATIONS SCENE
Since the end of the last century, when Marconi and Hertz demonstrated
the feasibility of radio transmissions, mankind has endeavoured to
fulfil the dream of wireless multimedia personal communications,
enabling people to communicate with anyone, anywhere, at any time,
using a range of multimedia services. The evolution of wireless systems
and their subsystems has been well documented in a range of monographs
by Jakes [1], Lee [2], Parsons and Gardiner [3], Feher [4] and others.
Glisic and Vucetic [5] as well as Prasad [6] concentrated on various
aspects of Code Division Multiple Access (CDMA) in their monographs,
while the compilation of excellent overviews edited by Glisic and
Leppanen [7] treated both Time Division Multiple Access (TDMA) and CDMA
along with a range of other associated aspects, such as smart antennae
[53, 55, 56], trellis coding as well as emerging topics, referred to as
'time-space' processing [55], 'per-survivor' processing [57] etc.
Meyer et al [8] focused on various modern receiver techniques in their
monograph. Steele [9] compiled a monograph, which considers most
physical layer aspects of modern TDMA systems, including speech and
channel coding, modulation, frequency hopping, and so on, amalgamating
them in the last chapter in the context of the Global System of Mobile
Communications known as GSM. Further important references are for
example those by Rappaport [10], Garg and Wilkes [11] or the
compilation edited by Gibson [12]. These developments are also
portrayed in magazine special issues [13]-[18] and in the broad field
of wireless multimedia communications. Let us commence our discourse
with a glimpse of history.
The first mobile radio systems were introduced by the military, police
and other emergency services, most of which were limited to voice-only
communications. During the pre-VLSI era the realisable signal
processing complexity was severely limited and hence the handsets
typically provided poor voice quality at a high cost. This was due to
the phenomenon of multi-path wave propagation, where the different
multi-path components arriving at the receiver's antenna suffer
different attenuation and phase rotation and hence they sometimes add
constructively, sometimes destructively. This situation is further
aggravated by the so-called delay-spread, when the various propagation
paths have rather different path-lengths and consequently exhibit
different delays, spilling inter-symbol interference (ISI) into the
adjacent signalling or symbol intervals. These phenomena can today
often be combated by sophisticated signal processing methods at the
cost of added implementational complexity, which was not possible in
the pre-VLSI era. Hence, until quite recently, the quality and variety
of wireless services have been inferior to those of conventional
tethered communications.
The first public cellular radio system, known as the Advanced Mobile
Phone Service (AMPS), was introduced in 1979 in the United States,
shortly followed by the Nordic Mobile Telephone (NMT) system in
Scandinavia in 1981. The first British system was the Total Access
Communications System (TACS) operated by Cellnet and Vodafone, while
the Japanese introduced the Nippon Advanced Mobile Telephone System
(NAMTS). All of these so-called first-generation national systems were
based on analogue frequency modulation (FM) but used digital network
control. However, they did not support international roaming.
In 1982 CEPT (Conference Europeene des Postes et Telecommunication),
the main governing body of the European PTTs, created the Groupe
Speciale Mobile (GSM) Committee and tasked it with standardising a
digital cellular Pan-European public mobile communication system to
operate in the 900 MHz band. This was followed by the launch of
experimental programmes of different types of digital cellular radio
systems in a number of European countries. By the middle of 1986 nine
proposals were received for the future Pan-European system, and GSM
organised a trial in Paris to identify the one having the best
performance. The technical details of the candidate systems are
described in references [33], [34], [35], [36] and [37], while a short
summary of their salient features was given in reference [39]. A
detailed description of the standardised GSM system's main features can
be found in reference [40]. This scheme constitutes the first so-called
second-generation public land mobile radio (PLMR) system, which was
designed for the worst-case [...]ing channel conditions and techniques
for mitigating their effects will be highlighted during our further
discourse.
Following GSM, in 1989 the American second generation scheme known as
the Digital Advanced Mobile Phone (DAMPS) system [41] was also
standardised, with the advantage of being able to accommodate three
higher quality digital channels in a conventional 30 kHz analogue AMPS
channel slot. Its unique feature is that, similarly to the Japanese
second generation scheme referred to as the Public Digital Cellular
(PDC) system [42], it uses a 2 bits/symbol non-binary modem, which
implicitly assumes a more benign propagation environment than that of
the GSM PLMR system. The improved wave propagation conditions are a
consequence of employing so-called micro-cells, where, in contrast to
the hostile PLMR systems, the high antenna elevation is reduced to
below the urban sky-line. Hence there is typically a strong
line-of-sight (LOS) path between the base station (BS) and mobile
station (MS), reducing the fading depth and mitigating the effect of
delay-spread induced ISI. These issues will be re-visited in more depth
at a later stage.
With respect to the improved propagation conditions the multi-level
IS-54 and PDC systems provide a seamless transition towards the
so-called cordless telecommunications (CT) system concept contrived
mainly for friendly indoors office and domestic propagation
environments. Hence CT products are designed to have a low transmitted
power and small coverage area, where typically there is a dominant
line-of-sight (LOS) propagation path between the Fixed Station (FS) and
Portable Station (PS). The low transmitted power and small transmission
range facilitate a low-complexity, low-cost, light-weight construction.
The standardisation and development of CT products was hallmarked by
the British CT2 system, the Digital European Cordless Telephone (DECT)
and the Japanese Handyphone (PHP) systems. A further important
milestone was the standardisation of the British DCS-1800 system, which
is essentially an up-converted GSM system implemented at 1.8 GHz. The
definition of the so-called half-rate GSM system, supporting twice as
many subscribers within the 200 kHz channel bandwidth as the full-rate
system, was also an important development in the field. These second
generation systems and CT schemes were described in dedicated chapters
of reference [12].
Currently there exists a range of initiatives world-wide, which attempt
to define the third generation personal communications network (PCN),
which is referred to as a personal communications system (PCS) in North
America. The European Community's Research in Advanced Communications
Equipment (RACE) programme [13, 12] and the consecutive framework
referred to as Advanced Communications Technologies and ini- [...]
-cated projects, endeavouring to resolve the on-going debate as regards
the most appropriate multiple access scheme, studying Time Division
Multiple Access (TDMA) [13, 40, 9, 41, 12] and Code Division Multiple
Access (CDMA) [13, 9, 43, 12].
Figure 1. Stylised mobility versus bitrate plane classification of
existing and future wireless systems (axes: mobility, graded as fixed,
portable and mobile, versus data rate; systems shown: ISDN, B-ISDN,
cordless, WLAN, MBS, second-generation GSM and IS-54, UMTS)
European third generation research is conducted under the umbrella of
the so-called Universal Mobile Telecommunications System (UMTS) [13]
initiative and so far the following proposals have been submitted to
ETSI [54]: wideband CDMA [46, 47, 48], Adaptive TDMA (ATDMA) [49],
hybrid TDMA/CDMA [50], Orthogonal Frequency Division Multiplex (OFDM)
[51, 68] and Opportunity Driven Multiple Access (ODMA). We note that
the Nokia testbed portrayed in [48] was designed with video
transmission capabilities up to 128 kbps in mind. Similarly, cognizance
was given to the aspects of less bandwidth-constrained, i.e.
higher-rate video communications by the Japanese wideband CDMA proposal
[52] for the Intelligent Mobile Terminal IMT 2000 emerging from NTT
DoCoMo. These standardisation activities are portrayed in more depth in
[54].
In the ACTS workplan [44] there are a number of projects dealing with
multimedia source and channel coding, modulation and multiple access
techniques for both cellular and wireless local area networks (WLANs).
These studies will design the architecture and produce demonstration
models of the universal mobile telecommunications system (UMTS), which
the Europeans intend to accomplish before the turn of the century.
Somewhere along the line UMTS is expected to merge with the CCIR study
on the future public land mobile telecommunications system (FPLMTS).
These systems are characterised with the help of Figure 1 in terms of
their expected grade of mobility and bitrate. These fundamental
features predetermine the range of potential applications.
Specifically, the fixed networks are evolving from [...] network (ISDN)
towards higher-rate broad-band ISDN or B-ISDN. A higher grade of
mobility, which we refer to here as portability, is a feature of
cordless telephones, such as the DECT, CT2, PHP etc systems, although
their transmission rate is more limited. The DECT system is the most
flexible one amongst them, allowing the multiplexing of 23 single-user
channels in one direction, which provides rates up to
23 × 32 kbps = 736 kbps for advanced services. Wireless local area
networks (WLAN) can support bitrates up to 155 Mbits/s in order to
extend existing Asynchronous Transfer Mode (ATM) links to portable
terminals, but they usually do not support full mobility functions,
such as location update or handover from one BS to another. A rapidly
evolving field, also gaining considerable commercial interest, is
associated with the research and development of High-Performance LANs
(HIPERLAN) [66, 67] for 'customer premises' type communications.
Contemporary second generation PLMR systems, such as GSM and IS-54,
cannot support high bitrate services, since they typically have to
communicate over lower quality channels, but they exhibit the highest
grade of mobility, including high-speed international roaming
capabilities.
The third generation UMTS is expected to have the highest grade of
flexibility both in terms of its service bitrate range and in terms of
mobility. In its design cognizance is given to the second generation
systems. Indeed, we may anticipate that some of the subsystems of GSM
and DECT may find their way into UMTS, either as a primary sub-system,
or as a component to achieve backward compatibility with systems in the
field. This approach may result in hand-held transceivers that are
intelligent multimode terminals, able to communicate with existing
networks, while having more advanced and adaptive features that we
would expect to see in the next generation of wireless multimedia
personal communication networks. Following the above brief overview of
the wireless communications scene let us now briefly speculate on the
practical embodiment of the multimedia communicator of the near future.
2 OUTLINE
Following the above brief historical overview, in the rest of this
treatise we concentrate mainly on bandwidth-efficient low-rate systems,
although many of the proposed techniques are suitable for high-rate
systems as well. Following some introductory conceptual notes as
regards a possible manifestation of the future wireless multimedia
communicator in Section 3, we analyse the ramifications of Shannon's
message for wireless systems in Section 4. This is followed by three
Sections on speech, video and graphical source coding, before we focus
our attention on transmission aspects. Section 8.1 highlights the basic
cellular concept, while Section 8.2 introduces a few multiple [...] of
modulation schemes in Section 11 and forward error correction (FEC)
coding, before concluding with the portrayal of the expected system
performance figures characterising such an intelligent multimode speech
system in Section 13 and the characterisation of a videophone
transceiver in Section 14.
The paper addresses the so-called physical-layer functions of wireless
systems in more depth, but attempts also to devote some attention to
higher-layer aspects, such as multiple access, dynamic channel
allocation, handover, etc. Given the wide scope of this treatise, it is
inevitable that some important trends and seminal contributions by
highly acclaimed authors remain beyond its coverage, although with the
number of references provided there is sufficient scope for the
interested reader to probe further in certain deeper subject areas.
3 WIRELESS MULTIMEDIA
COMMUNICATOR
A possible manifestation of the multimedia PS is portrayed in Figure 2,
which is equipped with a bird's-eye camera, microphone and
liquid-crystal screen, serving both as a video-telephone screen and as
a computer screen. The conventional keyboard is likely to be replaced
by a pressure-sensitive writing tablet, facilitating optical
handwriting recognition [208]-[213], signature verification etc.
The pivotal implementational point of such a multimedia PS is that of
finding the best compromise amongst a number of contradicting design
factors, such as low power consumption, high robustness against
transmission errors amongst various channel conditions, high spectral
efficiency, good audio/video quality, low delay, high-capacity
networking and so forth. In this contribution we will address a few of
these issues in the context of the proposed PS depicted in Figure 2.
The time-variant optimisation criteria of a flexible multimedia system
can only be met by an adaptive scheme, comprising the firmware of a
suite of system components and invoking that combination of speech
codecs, video codecs, embedded channel codecs, voice activity detector
(VAD) and modems, which fulfils the currently prevalent requirement
[68].
These requirements lead to the concept of arbitrarily programmable,
flexible so-called software radios [16], which is virtually synonymous
with the so-called tool-box concept invoked for example in the
forthcoming Motion Pictures Expert Group (MPEG) 4 video codec proposed
for wireless video communications [70]. This concept appears attractive
also for UMTS-type transceivers. A few examples of such optimisation
criteria are maximising the teletraffic carried or the robustness
against channel errors, while in other cases minimisation of the
bandwidth occupancy, the blocking probability or the power consumption
is of prime concern.
Figure 2. Wireless Multimedia Communicator
Figure 3. Wireless Multimedia Network
[...] Figure 3. The multimedia PSs communicate with the so-called BSs
in their vicinity, which are interconnected either directly with each
other using optical fibre, or in more complex systems via the so-called
Mobile Switching Centres (MSC). The PSs can access through the BS a
range of services, including business databases, multimedia databases,
main-frame computers, etc. Let us now turn our attention to some of the
information theoretical aspects of wireless communications, in order to
be able to understand the technical ramifications of the underlying
systems.
4 SHANNON'S MESSAGE AND ITS
IMPLICATIONS FOR WIRELESS CHANNELS
In mobile multimedia communications it is always of prime concern to
maintain an optimum compromise in terms of the contradictory
requirements of low bit rate, high robustness against channel errors,
low delay and low complexity. The minimum bit rate at which the
condition [...] transmission rate required for the lossless
representation of the source signal, which is referred to as the source
entropy, is only asymptotically achievable, as the encoding memory
length or delay tends to infinity. Any further compression is
associated with information loss or coding distortion. Note that the
optimum source encoder generates a perfectly uncorrelated source coded
stream, where all the source redundancy has been removed, therefore the
encoded symbols are independent and each one has the same significance.
Having the same significance implies that the corruption of any of the
source encoded symbols results in identical reconstructed signal
distortion over imperfect channels.
Under these conditions, according to Shannon's fundamental work
[72, 73, 75], the best protection against transmission errors is
achieved if source and channel coding are treated as separate entities.
When using a block code of length N channel coded symbols in order to
encode K source symbols with a coding rate of R = K/N, the symbol error
rate can be rendered arbitrarily low if N tends to infinity and the
coding rate to zero. This condition also implies an infinite coding
delay. Based on the above considerations and on the assumption of
Additive White Gaussian Noise (AWGN) channels, source and channel
coding have historically been separately optimised.
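The trade-off between the block length N, the coding rate R = K/N and
the residual error rate can be made tangible with a deliberately simple
sketch. We assume, purely for illustration, an N-fold repetition code
(K = 1) with majority-vote decoding over a binary symmetric channel
having a crossover probability p; this toy code and channel are our own
assumptions and are not the schemes considered in the references.

```python
from math import comb


def repetition_error_rate(n, p):
    """Probability that majority voting over n repeated bits fails on a
    binary symmetric channel with crossover probability p (n odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that the vote cannot tie")
    # Decoding fails when more than half of the n bits are inverted.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))


if __name__ == "__main__":
    p = 0.1  # assumed channel bit error probability
    for n in (1, 3, 9, 27, 81):
        rate = 1 / n  # R = K/N with K = 1
        print(f"N={n:3d}  R={rate:.4f}  "
              f"error rate={repetition_error_rate(n, p):.3e}")
```

The error rate falls monotonically as N grows, but only because the
coding rate, and hence the useful throughput, tends to zero while the
decoding delay grows with N.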
Mobile radio channels are typically subjected to multipath propagation
and hence constitute a more hostile transmission medium than AWGN
channels, exhibiting pathloss, lognormal slow fading and Rayleigh fast
fading [217, 216]. Furthermore, if the signalling rate used is higher
than the channel's so-called coherence bandwidth [217, 216], additional
impairments are inflicted by dispersion, which is associated with
frequency domain linear distortions. Under these circumstances the
channel's error distribution versus time becomes bursty and an
infinite-memory symbol interleaver is required in order to disperse the
bursty errors and render the errors as independent as possible, such as
over AWGN channels. Clearly, for mobile channels many of the above
mentioned, asymptotically valid ramifications of Shannon's theorem have
a limited applicability.
A range of practical limitations must be observed when designing
wireless multimedia links. Although it is often possible to reduce the
required bit rate of state-of-art multimedia source codecs while
maintaining a certain reconstructed signal quality, in practical terms
this is only possible at a concomitant increase of the implementational
complexity and encoding delay. A good example of these limitations is
the half-rate GSM speech codec, which was required to approximately
halve the encoding rate of the 13 kbps full-rate codec, while
maintaining less than quadrupled complexity, similar robustness against
channel errors and less than doubled encoding delay. Naturally, the
increased algorithmic complexity is typically associated with higher
power consumption, while the re- [...] segment intuitively implies that
each bit will have an increased relative significance. Accordingly,
their corruption may inflict increasingly objectionable speech
degradations, unless special attention is devoted to this problem. It
is worth noting that despite its quadruple complexity the half-rate GSM
speech codec maintains a lower power consumption, due to low-power 3V
technology, than the first launched full-rate codec had.
Figure 4. Intelligent transceiver schematic
In a somewhat simplistic approach one could argue that due to the
reduced source rate we could accommodate an increased number of parity
symbols using a more powerful, implementationally more complex and
lower rate channel codec, while maintaining the same transmission
bandwidth. However, the complexity, quality and robustness trade-off of
such a scheme would not be very attractive.
A more intelligent approach will be required in order to design better
wireless multimedia transceivers [73, 74] for bursty mobile radio
channels. The simplified schematic of such an intelligent transceiver
is portrayed in Figure 4. Perfect source encoders operating close to
the information-theoretical limits of Shannon's predictions can only be
designed for stationary source signals, a condition not satisfied by
most multimedia source signals. Further previously mentioned
limitations are the encoding complexity and delay. As a consequence of
these limitations the source-coded stream will inherently contain
residual redundancy and the correlated source symbols will exhibit
unequal error sensitivity, requiring unequal error protection.
Following Hagenauer [73, 74] we will refer to the additional knowledge
as regards the different importance or vulnerability of various source
coded bits as source significance information (SSI), whereas to the
confidence associated with the channel decoder's decisions as decoder
reliability information (DRI).
These additional links between the source and channel codecs are also
indicated in Figure 4. Further [...] the difference between consecutive
decoded symbols violates some threshold condition and thereby
facilitates the detection of a channel decoding error. Then the channel
decoder can attempt a second tentative decoding by passing the second
most likely corrected message to the source decoder, which in turn
subjects this again to the previously failed threshold test, etc. A
variety of such techniques have successfully been used in robust
source-matched source and channel coding [73, 74, 82, 83]. Another
practical manifestation of the time-variant source statistics of speech
signals is the fact that during silent speech spurts some speech codecs
do not surrender their reserved physical link, they reduce their output
bit rate instead, which can reduce the interference inflicted upon
other users in so-called Code Division Multiple Access (CDMA) systems,
such as the American IS-95 system [43]. Video codecs, such as the
variable-rate MPEG1 [80] and MPEG2 [81] codecs, even more explicitly
rely on the fluctuation of the source statistics. For example, when a
new object is introduced in the scope of the camera, which cannot be
predicted on the basis of already known previous video frames, then the
bit rate is typically increased.
The role of the Interleaver and Deinterleaver [79] seen in Figure 4 is
to rearrange the channel coded bits before transmission. The mobile
radio channel typically inflicts bursts of errors during deep channel
fades, which often overload the channel decoder's error correction
capability in certain source signal segments, while other segments are
not benefiting from the channel codec at all, since they may have been
transmitted between fades and hence are error-free even without channel
coding. This problem can be circumvented by dispersing the bursts of
errors more randomly between fades so that the channel codec is always
faced with an 'average-quality' channel, rather than the bi-modal
faded/non-faded condition, although only at the cost of increased
system delay, which may become an impediment in interactive multimedia
communications. In other words, channel codecs are most efficient if
the channel errors are near-uniformly dispersed over consecutive
received segments.
In its simplest manifestation an interleaver is a memory matrix that is
filled with channel coded symbols on a row-by-row basis, which are then
passed on to the modulator on a column-by-column basis. If the
transmitted sequence is corrupted by a burst of errors, the
deinterleaver maps the received symbols back to their original
positions, thereby dispersing the bursty channel errors. An infinite
memory channel interleaver is required in order to perfectly randomise
the bursty errors and therefore to transform the Rayleigh-fading
channel's error statistics into that of an AWGN channel, for which
Shannon's information theoretical predictions apply. Since in
interactive multimedia communications the tolerable delay is strictly
limited, the interleaver's memory length and efficiency is also
limited. For further details on the effects of various [...] interested
reader is referred to Reference [79].
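The row-by-row write and column-by-column read operation of the
rectangular interleaver described above can be sketched in a few lines
of Python. The function names and the 4 x 6 matrix dimensions are
illustrative assumptions of ours; practical interleavers, such as those
of Reference [79], are dimensioned according to the fading statistics
and the tolerable delay.

```python
def interleave(symbols, rows, cols):
    """Fill a rows x cols matrix row by row, read it column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Invert interleave(): restore the original row-by-row order."""
    if len(symbols) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    restored = [None] * (rows * cols)
    for i, symbol in enumerate(symbols):
        c, r = divmod(i, rows)  # i-th transmitted symbol sat at (r, c)
        restored[r * cols + c] = symbol
    return restored


def burst_after_deinterleaving(rows, cols, start, length):
    """Original positions corrupted by a channel burst that hits
    `length` consecutive transmitted symbols beginning at `start`."""
    sent = interleave(list(range(rows * cols)), rows, cols)
    return sorted(sent[start:start + length])


if __name__ == "__main__":
    # A burst of four adjacent channel errors in a 4 x 6 block.
    print(burst_after_deinterleaving(rows=4, cols=6, start=8, length=4))
```

In this example the four adjacent channel errors end up six positions
apart after deinterleaving, so that a channel decoder sees isolated
rather than bursty errors, at the cost of a delay of one full matrix.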
A specific deficiency of the above mentioned rectangular interleavers
is that in case of a constant vehicular speed the Rayleigh-fading
mobile channel typically produces periodic fades [217, 216] and error
bursts at travelled distances of λ/2, where λ is the carrier's
wavelength, which may be mapped by the rectangular interleaver into
another set of periodic bursts of errors. Again, a range of more random
re-arrangement or interleaving algorithms exhibiting a higher
performance than rectangular interleavers have been proposed for mobile
channels in Reference [79], where also a variety of practical channel
coding schemes have been portrayed. Section 5 gives a brief overview of
the recent activities in speech source coding, Section 6 provides a
rudimentary introduction to video source coding, while Section 7
highlights the principles of graphical source coding. For a full review
of speech source coding schemes for mobile systems the interested
reader is referred to references [84]-[91], joint source and channel
coding was the subject of [92], whereas modulation and transmission
arrangements for wireless channels have been studied in
[4, 6, 69, 9, 68].
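In order to give a feel for the half-wavelength fade spacing mentioned
above in the context of rectangular interleavers, the sketch below
converts it into a distance and into a time interval at a constant
vehicular speed. The 900 MHz carrier corresponds to the GSM band quoted
earlier, while the speed of 50 km/h and the function names are
illustrative assumptions of ours.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def fade_spacing_m(carrier_hz):
    """Distance between successive fades, i.e. half a wavelength."""
    if carrier_hz <= 0:
        raise ValueError("carrier frequency must be positive")
    return SPEED_OF_LIGHT / carrier_hz / 2.0


def fade_period_s(carrier_hz, speed_mps):
    """Time between successive fades at a constant vehicular speed."""
    if speed_mps <= 0:
        raise ValueError("speed must be positive")
    return fade_spacing_m(carrier_hz) / speed_mps


if __name__ == "__main__":
    carrier = 900e6            # GSM band quoted in the text
    speed = 50 / 3.6           # assumed 50 km/h expressed in m/s
    print(f"fade spacing: {fade_spacing_m(carrier):.4f} m")
    print(f"fade period : {1e3 * fade_period_s(carrier, speed):.2f} ms")
```

If the interleaver's column read-out period happens to coincide with
this fade period, the periodic bursts are not dispersed effectively,
which is the deficiency noted above.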
Returning to Figure 4, soft decision information (SDI) is passed by the
demodulator to the FEC decoder, indicating that the demodulator
refrained from making a hard decision concerning the received bit.
Instead, it passes the estimated reliability of the received
information to the FEC decoder, thereby improving its efficiency. The
channel state information (CSI), which is in simple terms
representative of the current fade depth, can be used to weight the SDI
in the detection process. This weighted reliability information is then
often used by the channel decoder in order to invoke maximum likelihood
sequence estimation (MLSE) based on the Viterbi algorithm [311, 79] in
order to improve the system's performance with respect to conventional
hard decision decoding. Following the above rudimentary review of
Shannon's information theory, the rest of this treatise is devoted to
practical issues of wireless multimedia communications. Let us
initially consider briefly the recent advances in speech source coding.
5 SPEECH SOURCE CODING
5.1 A historical perspective on speech codecs
Following the 64 kbits/s Pulse Code Modulation (PCM) and 32 kbps
Adaptive Differential PCM (ADPCM) G.721 Recommendations standardised by
the International Telecommunications Union (ITU), in 1986 the
13 kbits/s Regular Pulse Excitation (RPE) [105, 106] codec was selected
for the Pan-European mobile system known as GSM, and more recently
Vector Sum Excited Linear Prediction (VSELP) [107, 108] codecs
operating at 8 and 6.7 kbits/s were favoured in the American IS-54 and
[...] develop- [...] art was documented in a range of excellent
monographs by O'Shaughnessy [87], Furui [88], Anderson and Seshadri
[92], Kondoz [89], Kleijn and Paliwal [90] and in a tutorial review by
Gersho [78]. More recently the 5.6 kbits/s half-rate GSM quadruple-mode
Vector Sum Excited Linear Predictive (VSELP) speech codec standard
developed by Gerson et al [109] was approved, while in Japan the
3.45 kbits/s half-rate PDC speech codec invented by Ohya, Suda and Miki
[113] using the so-called Pitch Synchronous Innovation (PSI) CELP
principle was standardised. Other currently investigated schemes are
the Prototype Waveform Interpolation (PWI) proposed by Kleijn [114],
Multi-Band Excitation (MBE) suggested by Griffin et al [115] and
Interpolated Zinc Function Prototype Excitation (IZFPE) codecs
advocated by Hiotakakos and Xydeas [116]. In the low-delay, but more
error sensitive backward adaptive class the 16 kbps ITU G.728 codec
[117] developed by Chen et al from the AT&T speech team hallmarks a
significant step. This was followed by the equally significant
development of the more robust, forward-adaptive 15 ms delay G.729
ACELP arrangement proposed by the University of Sherbrooke team
[122, 123], AT&T and NTT [118]. Lastly, the standardisation of the
2.4 kbps DoD codec led to intensive research in this very low-rate
range and the Mixed Excitation Linear Predictive (MELP) codec by Texas
Instruments was identified [119] in 1996 as the best overall candidate
scheme.
Before concluding our discourse on speech codecs let us briefly
highlight the problems associated with 7 kHz bandwidth, so-called
commentary quality speech coding.
5.2 Wideband speech codecs
For the sake of completeness we note briefly that 7 kHz bandwidth
speech codecs offer more transparent speech quality than their
narrowband counterparts at a typically higher bitrate and algorithmic
complexity.
One of the problems associated with full-band coding of wideband speech
is the codec's inability to treat the less predictable high-frequency,
low-energy speech band, which was tackled by the ITU G.722 codec using
split-band or sub-band coding. Although the upper subband is important
for maintaining an improved intelligibility and naturalness, it only
contains a small fraction of the speech energy, which is on the order
of 1%, and therefore its bitrate contribution has to be limited
appropriately. The ITU G.722 codec [131] uses two equal-width subbands,
whose signals are encoded employing ADPCM techniques, and has the
ability of transmitting speech at 64, 56 or 48 kbps, while allocating
0, 8 or 16 kbps capacity for data transmission.
Quackenbush [132] suggested a transform-coded approach in order to
allow for a higher flexibility in terms [...] audio signals and reduced
the bitrate required according to the lower sampling rate of 16 kHz.
Ordentlich and Shoham proposed a low-delay CELP-based 32 kbps wideband
codec [134], which achieved a similar speech quality to the G.722
64 kbps codec at a concomitant higher complexity. The backward-adaptive
LPC filter used had an order of 32, which was significantly lower than
the filter order of 50 used in the G.728 codec [117]. The G.728 filter
order of 50 was able to cater for long-term periodicities of up to
6.25 ms, corresponding to pitch frequencies down to 160 Hz at a
sampling rate of 8 kHz without an LTP, allowing better reconstruction
for female speakers. The filter order of 32 at a sampling frequency of
16 kHz cannot cater for long-term periodicities. Nonetheless, the
authors opted for using no LTP. In contrast to the G.728 codebook of
128 entries, here 1024 entries were used to model the 5-sample
excitations.
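The relationship between the LPC filter order, the sampling rate and
the longest periodicity the filter can capture follows from simple
arithmetic, which the sketch below reproduces for the two filter orders
quoted above. The function names are illustrative assumptions of ours.

```python
def predictor_span_ms(order, sampling_rate_hz):
    """Time span in milliseconds covered by `order` past samples."""
    return 1000.0 * order / sampling_rate_hz


def lowest_pitch_hz(order, sampling_rate_hz):
    """Lowest pitch frequency whose period fits into the filter memory."""
    return sampling_rate_hz / order


if __name__ == "__main__":
    # G.728: order 50 at 8 kHz, as quoted in the text.
    print(predictor_span_ms(50, 8000), "ms,",
          lowest_pitch_hz(50, 8000), "Hz")
    # Wideband codec: order 32 at 16 kHz, as quoted in the text.
    print(predictor_span_ms(32, 16000), "ms,",
          lowest_pitch_hz(32, 16000), "Hz")
```

The order-32 filter at 16 kHz spans only 2 ms, so it captures
periodicities only down to 500 Hz, which lies above typical pitch
frequencies; this is consistent with the observation above that it
cannot cater for long-term periodicities.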
In a contribution by Black, Kondoz and Evans [135] the
backward-adaptive principle was retained for the sake of low delay, but
it was combined with a split-band approach. The low-band was encoded by
a backward-adaptive CELP codec using a 10-th order LPC filter updated
over 14 samples at a sampling rate of 8 kHz, i.e. 1.75 ms, and the
authors argued that it was necessary to incorporate a forward adaptive
LTP in order to counteract the potentially damaging error feedback
effect of the backward-adaptive LPC analysis. The upper-band typically
contains a less structured, noise-like signal, which has a slowly
varying dynamic range. Black et al here proposed to use a 6th order
forward-adaptive predictor updated over a 56-sample interval, which is
quadrupled in comparison to the low-band. Backward-adaptive prediction
would be unsuitable for this less accurately quantised band, which
would precipitate the effect of quantization errors in future segments.
The prestigious speech coding group at Sherbrooke University
[136, 137, 138] proposed a range of ACELP-based codecs, since Laflamme,
Adoul, Salami et al argued that ACELP codecs are amenable to wideband
coding, when employing vast codebooks in conjunction with a
reduced-complexity focused codebook search strategy using a number of
encapsulated search loops. This technique facilitates searching only a
fraction of a large codebook, while achieving a similar performance to
that of a full search. Suffice to say here that this technique was
proposed by the authors also for the ITU G.729 8 kbps low-delay codec
using a 15-bit ACELP codebook and five encapsulated loops [121, 122].
Here we conclude our discussion of speech source codecs and briefly
classify a range of video codecs suitable for wireless videophony and
other wireless visual wire- [...]
6.1 Motivation and Background
Motivated by the proliferation of wireless multimedia services
[139, 140], a plethora of video codec schemes have been proposed for
various applications [141]-[156], but perhaps the most significant
advances in the field are hallmarked by the MPEG4 initiative [70]. The
design of videophone schemes centres around the best compromise amongst
a number of inherently contradictory specifications, such as video
quality, bit rate, implementational complexity, robustness against
channel errors, coding delay, bitrate fluctuation and the associated
buffer length requirement, etc. Many of these aspects have been treated
in a number of established monographs by Netravali and Haskell [143],
Jain [191], Jayant and Noll [85] as well as Gersho and Gray [149]. A
plethora of video codecs have been proposed in the excellent special
issues edited by Tzou, Mussmann and Aigawa [157], by Hubing [158] and
Girod et al [159] for a range of bitrates and applications, but the
individual contributions by a number of renowned authors are too
numerous to review. Khansari, Jalali, Dubois and Mermelstein [166] as
well as Mann Pelz [180] reported promising results on adopting the
H.261 codec for wireless applications by invoking powerful signal
processing and error-control techniques in order to remedy the inherent
source coding problems due to stretching its application domain to
hostile wireless environments. Farber, Steinbach and Girod [167]-[170]
also contributed substantially towards advancing the state of art in
the context of the H.263 codec as well as in motion compensation
[168, 169], as did Eryurtlu, A.H. Sadka and A.M. Kondoz [174, 175].
Further important contributions in the field were due to Chen et al
[181], Illgner and Lappe [182], Zhang [183], Ibaraki, Fujimoto and
Nakano [184], Watanabe et al [185] etc, the MPEG4 consortium's
endeavours [71] and the efforts of the mobile audio-video terminal
(MAVT) consortium. Vector quantisation based schemes were advocated by
Ramamurthy and Gersho [149] as well as by Torres and Huguet [150]. A
major feature topic of the European Community's Fourth Framework
Programme [44, 45] on Advanced Communications Technologies and Services
(ACTS) is video communications over a range of wireless and fixed
links.
In this Section we initially focus our attention on the design and
performance evaluation of wireless video telephone systems, suitable
for the robust transmission of Quarter Common Intermediate Format
(QCIF) sequences over conventional mobile radio links, such as the
Pan-European GSM system [40], the American IS-54 [41] and IS-95 [43]
systems as well as the Japanese PDC system [42]. In contrast to
existing standard codecs, such as the ITU H.261 scheme and the MPEG1
[80], MPEG2 [81] and MPEG4 [70] arrangements, our proposed video
codec's fixed, but arbitrarily programmable bi-
bi-Search Area
n-1
f
n
f
b x b
p x q
MCER
Position of the best match
Figure 5. Simplied schematic of motioncompensation
c
J.Streit[186],1996
systems, which are likely to vary their bitrate in response to
various propagation and teletraffic conditions. We will conclude the
Section with a brief overview of the ITU H.263 standard video codec,
which is a flexible scheme, suitable for a range of multimedia
visual applications at various bitrates and video resolutions.
6.2 Motion Compensation
The ultimate goal of low-rate image coding is to remove redundancy
in both spatial and temporal domains and thereby reduce the required
transmission bit rate. The temporal correlation between successive
image frames is typically removed using block-based motion
compensation, where each block to be encoded is assumed to be a
motion-translated version of a block of the previous locally decoded
frame.
The vector of motion translation or motion vector (MV) is typically
found by the help of correlation techniques, as seen in Figure 5.
Specifically, a legitimate motion translation region or search scope
is stipulated within the previous locally decoded frame, the block
to be encoded is slid over this region according to a certain
algorithm and the location of highest correlation is deemed to be
the destination of the motion translation. Motion compensation (MC)
is then carried out by subtracting the appropriately motion
translated previous decoded block from the one to be encoded in
order to generate the so-called motion compensated error residual
(MCER). Clearly, the image is decomposed into the motion translation
and the MCER, and both components have to be encoded and transmitted
to the decoder for image reconstruction. The motion compensation
removes some of the temporal redundancy and the variance of the MCER
becomes much lower than that of the original image, which ensures
bit rate economy.
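The block-matching search described above can be illustrated by the following minimal sketch. It is not the codec of [188]: the function name `block_match`, the block size `b`, the search range `search` and the use of the sum of absolute differences (SAD) in place of a correlation measure are all assumptions made for the sake of a compact example.

```python
import numpy as np

def block_match(prev, cur, top, left, b=8, search=2):
    """Full-search block matching for one b x b block of the current frame.

    prev : previous locally decoded frame (2-D array)
    cur  : frame to be encoded (2-D array)
    Returns the motion vector (dy, dx) and the b x b MCER block.
    SAD is used as the matching measure (an assumption; the text
    speaks of correlation techniques in general).
    """
    block = cur[top:top + b, left:left + b].astype(np.int32)
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate positions that fall outside the previous frame.
            if y < 0 or x < 0 or y + b > prev.shape[0] or x + b > prev.shape[1]:
                continue
            cand = prev[y:y + b, x:x + b].astype(np.int32)
            sad = int(np.abs(block - cand).sum())
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    dy, dx = best_mv
    matched = prev[top + dy:top + dy + b, left + dx:left + dx + b].astype(np.int32)
    mcer = block - matched  # motion compensated error residual
    return best_mv, mcer
```

If the current frame is merely a translated copy of the previous one within the search range, the sketch returns the corresponding displacement and an all-zero MCER, which is the idealised case of perfect motion compensation.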
Figure 6. Simple video codec schematic
145] (SBC), wavelet coding [146], Discrete Cosine Transformation
[191, 80, 81, 188] (DCT), vector quantisation (VQ) [149]-[151] or
Quad-tree [147, 148, 155, 189] (QT) coding. Some of these techniques
will be highlighted in the forthcoming Subsections.
When a low codec complexity and low bit rate are required, the
motion compensation technique described above can be replaced by
simple frame-differencing. In frame-differencing the whole of the
previous locally decoded image frame is subtracted from the one to
be encoded without the need for the above correlation-based motion
prediction, which may become very computationally intensive for
high-resolution, high-quality video portraying high-dynamic scenes.
Such a simple video codec schematic based on simple
frame-differencing is shown in Figure 6. Although the MCER variance
remains somewhat higher for frame-differencing than in the case of
full motion compensation, there is no pattern-matching search, which
reduces the complexity, and no MVs have to be encoded, which may
reduce the overall bit rate. Observe in Figure 6 that after
frame-differencing the encoded MCER is conveyed to the transceiver
and also locally decoded. This is necessary in order to generate the
locally reconstructed video signal, which is invoked by the encoder
in subsequent MC steps. The encoder uses the locally reconstructed,
rather than the original input video frames, since the original
frames are not available at the decoder and relying on them would
result in misalignment between the encoder and decoder. This local
reconstruction operation is carried out by the adder in the Figure,
superimposing the decoded MCER on the previous locally decoded video
frame. The operations are similar if full MC is used.
Practical codecs, such as for example the ITU H.263 scheme, often
combine the so-called inter-frame and intra-frame coding techniques
on a block-by-block basis, where MC is employed only if it is deemed
advantageous in MCER reduction terms.
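The role of the local reconstruction loop can be made concrete with the following sketch of a frame-differencing codec. The uniform quantiser `quantise` is a hypothetical stand-in for whatever MCER coding scheme is actually used, and the step size is an arbitrary assumption; only the loop structure mirrors the description above.

```python
import numpy as np

def quantise(residual, step=8):
    """Hypothetical stand-in for the MCER encoder/decoder pair:
    a plain uniform quantiser with the given step size."""
    return (np.round(residual / step) * step).astype(np.int32)

def encode_sequence(frames, step=8):
    """Frame-differencing loop with local reconstruction.

    Returns the frames as the decoder would reconstruct them.
    """
    # The encoder's copy of the decoder state (previous locally decoded frame).
    recon = np.zeros_like(frames[0], dtype=np.int32)
    decoded = []
    for frame in frames:
        residual = frame.astype(np.int32) - recon   # frame difference signal
        residual_hat = quantise(residual, step)     # what the decoder receives
        recon = recon + residual_hat                # the adder of the local decoder
        decoded.append(recon.copy())
    return decoded
```

Because the difference is always formed against the locally reconstructed frame, the reconstruction error of every frame stays within half a quantiser step in this sketch, rather than accumulating from frame to frame.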
tics, where large sections of the frame difference signal are
'flat', characterised by low pixel magnitude values, while the
motion contours, where the frame differencing has failed to predict
the current pixels on the basis of the previous locally decoded
frame, are represented by larger values, as seen at the centre of
Figure 9. Consequently, efficient MCER coding algorithms must be
able to represent such textured MCER patterns adequately, a topic to
be addressed in the forthcoming subsections. Let us initially
consider a bandwidth-efficient cost-gain quantised DCT-based codec
[188].
6.3 DCT-based Video Codec
Our DCT-based video codec's outline is depicted in Figure 7. The DCT
[191] has been popular in video compression standards [80, 81],
since it exhibits a so-called energy compaction property, implying
that upon transforming a correlated or predictable signal to the
spatial frequency domain most of its energy will be compacted to a
few high-energy, low-frequency coefficients. This is a consequence
of the Wiener-Khintchine theorem, stating that the power spectral
density (PSD) and the autocorrelation function (ACF) are Fourier
transform pairs. Hence the flat ACF of a predictable, slowly-varying
signal implies a compact low-pass type PSD, which is amenable to
compression, since in the spatial frequency domain a lower number of
coefficients has to be transmitted than in the temporal domain. It
is important to note that the MC often removes most of the
redundancy from the correlated temporal domain video frame and hence
the DCT of the MCER may even result in an expanded spatial frequency
domain representation, which can be counteracted for example by
adaptive bit allocation schemes. Strobach [147] proposed quad-tree
coding in order to encode the MCER and mitigate this problem.
Alternative frequency domain solutions include subband coding
[144, 145] (SBC) or wavelet coding [146], which facilitate a
flexible control over the allocation of bits in the spatial
frequency domain. The MPEG standard codecs [80, 81] and the H.261,
H.263 codecs scan and entropy code the DCT coefficients and also
allow direct encoding of the more correlated video signal on a
block-by-block basis. Vector quantisation (VQ) [149]-[151] can be
carried out both in the frequency and the time domains, but a
persistent deficiency is its difficulty in handling sharp edges
adequately.
Returning to the DCT principle, our proposed DCT-based codec was
designed to achieve a time-invariant compression ratio associated
with a fixed but programmable encoded video rate of 5-13 kbps. The
codec's operation is initialised in the intra-frame mode, but once
it has switched to the inter-frame mode, any further mode switches
are optional and only required if a drastic scene change occurs.
Figure 7. DCT-codec schematic © IEEE, Hanzo & Streit [188], 1995
Figure 8. PSNR (dB) versus frame index performance at various
bitrates (5, 8 and 10 kb/s) for the 'Miss America' sequence © IEEE,
Hanzo & Streit [188], 1995
In the intra-frame mode the encoder transmits the coarsely quantised
block averages for the current frame, which provides a
low-resolution initial frame required for the operation of the
inter-frame codec at both the commencement and during later stages
of communications in order to prevent encoder/decoder misalignment.
For 176 x 144 pixel ITU standard Quarter Common Intermediate Format
(QCIF) images in a specific scenario [188] we limited the number of
video encoding bits per frame to 1136, corresponding to a bitrate of
11.36 kbps at 10 frames/s.
In the motion compensation 8 x 8 blocks are used. At the
commencement of the encoding procedure the motion compensation (MC)
scheme determines a motion vector (MV) for each of the 8 x 8 blocks
using full search. The search is confined to 16 possible positions
around the centre of each block and hence a total of 4 bits is
required for the encoding of each MV. Before the actual motion
compensation takes place, the codec tentatively determines the
potential benefit of the compensation in terms of motion compensated
error energy reduction. Then the codec selects those blocks as
'motion-active' whose gain exceeds a certain threshold. This method
of classifying the blocks as motion-active and motion-passive
results in an active/passive table, which consists of a one bit flag
for each block, marking it as passive or active.
Pursuing a similar approach, gain control is also applied to the
Discrete Cosine Transform (DCT) based compression. Every block is
DCT transformed and quantised. In order to take account of the
non-stationary nature of the motion compensated error residual
(MCER) and its time-variant frequency-domain distribution, four
different sets of DCT quantisers were designed. The quantisation
distortion associated with each quantiser is computed in order to be
able to choose the best one. Ten bits are allocated for each
quantiser, each of which is a trained Max-Lloyd quantiser catering
for a specific frequency-domain energy distribution class. All DCT
blocks whose coding gain exceeds a certain threshold are marked as
DCT-active, resulting in a similar active/passive table as for the
motion vectors. For this second table we apply the same run length
compression technique, as above. Again, if the number of bits
required for the encoding of the DCT-active blocks exceeds half of
the maximum allowable number, blocks around the fringes of the image
are considered DCT-passive, rather than those in the central eye and
lip sections. If, however, the active DCT coefficients and the
activity table do not fill up the fixed-length transmission burst,
the threshold for active DCT blocks is lowered and all tables are
recomputed.
The bit allocation scheme was designed to deliver 1136 bits per
frame, which is summarised in Table 1. The encoded bitstream begins
with a 22 bit frame alignment word (FAW). This is necessary to
assist the video decoder's operation in order to resume synchronous
operation after loss of frame synchronisation over hostile fading
channels. The partial intra-frame update refreshes only 22 out of
396 blocks every frame. Therefore every 18 frames or 1.8 seconds the
update refreshes the same blocks. This periodicity is signalled to
the decoder by transmitting the inverted FAW. A MV is stored using
13 bits, where 9 bits are required to identify one of the 396 block
indexes using the enumerative method and 4 bits for encoding the 16
possible combinations of the X and Y displacements. The 8 x 8
DCT-compressed blocks use a total of 21 bits, again 9 for the block
index, 10 for the DCT coefficient quantisers, and 2 bits to indicate
which of the four quantisers has been applied. The total number of
bits becomes 30 x (13 + 21) + 22 x 4 + 22 + 6 = 1136, where six dummy
bits were added in order to pack-

FAW   Intra update   MV index   MV       DCT index   DCT       Padding   Total
22    22 x 4         30 x 9     30 x 4   30 x 9      30 x 12   6         1136

Table 1. Bit Allocation Table per QCIF Video Frame for the Fixed-rate
DCT Codec © IEEE, Hanzo & Streit [188] 199
block codec used.
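As a quick cross-check of the bit budget quoted above, the arithmetic can be reproduced in a few lines. The constant names are ours, and reading the budget as 30 active blocks carrying both an MV and a DCT description, plus 4 bits for each of the 22 intra-updated blocks, is an interpretation of the quoted sum rather than a statement from [188].

```python
import math

FAW_BITS = 22                 # frame alignment word
INTRA_UPDATE_BLOCKS = 22      # blocks refreshed per frame
INTRA_UPDATE_BITS = 4         # bits per refreshed block (read off the 22 x 4 term)
ACTIVE_BLOCKS = 30            # active MVs and active DCT blocks per frame
TOTAL_BLOCKS = 396            # 8 x 8 blocks in a 176 x 144 QCIF frame
BLOCK_INDEX_BITS = math.ceil(math.log2(TOTAL_BLOCKS))  # 9 bits
MV_BITS = 4                   # 16 possible displacements
DCT_QUANTISER_BITS = 10
DCT_SELECTOR_BITS = 2         # one of four quantisers
PADDING_BITS = 6

def bits_per_frame():
    """Total number of bits in one encoded frame."""
    mv = ACTIVE_BLOCKS * (BLOCK_INDEX_BITS + MV_BITS)
    dct = ACTIVE_BLOCKS * (BLOCK_INDEX_BITS + DCT_QUANTISER_BITS + DCT_SELECTOR_BITS)
    intra = INTRA_UPDATE_BLOCKS * INTRA_UPDATE_BITS
    return FAW_BITS + intra + mv + dct + PADDING_BITS

def bitrate_kbps(frame_rate=10):
    """Bitrate in kbps at the given frame rate."""
    return bits_per_frame() * frame_rate / 1000.0

def intra_refresh_period_frames():
    """Number of frames after which the same blocks are refreshed again."""
    return TOTAL_BLOCKS // INTRA_UPDATE_BLOCKS
```

Under these assumptions the sketch reproduces the 1136 bits per frame, the 11.36 kbps bitrate at 10 frames/s and the 18-frame refresh period mentioned in the text.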
The encoded parameters are transmitted to the decoder and also
locally decoded in order to be used in future motion predictions.
The video codec's Peak SNR (PSNR) versus frame index performance is
shown in Figure 8, where the PSNR is defined as the conventional
Signal-to-Noise Ratio (SNR), except that instead of the actual video
signal power a video pixel value of 255 is assumed, yielding a pixel
power of 255^2 for all pixel positions across the video frame. Since
255 is the highest possible value for an 8-bit pixel representation,
the PSNR is typically higher than the conventional SNR. The codec
proposed was subjected to bit-sensitivity analysis and a Quadrature
Amplitude Modulation [68] (QAM) based source-sensitivity matched
transceiver was designed in order to transmit the video stream over
wireless channels. The interested reader is referred to reference
[188] for further details. Having described the principles of
DCT-based video coding let us now consider QT coding of the MCER
[189].
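The PSNR definition given above translates directly into code. The sketch below assumes 8-bit pixels, so that the peak value is 255, and takes the noise power to be the mean squared error between the original and the reconstructed frame; the function name is ours.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for frames with the given peak value."""
    err = np.asarray(original, dtype=float) - np.asarray(reconstructed, dtype=float)
    mse = np.mean(err ** 2)
    if mse == 0:
        return float("inf")   # identical frames
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A uniform error of one grey level per pixel corresponds to a PSNR of about 48.1 dB, which gives a feel for the 20 to 36 dB range of Figure 8.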
6.4 Quadtree structured coding
The proposed QT-codec shares the structure of the previous DCT-based
scheme portrayed in Figure 7, but employs QT-coding of the MCER.
Quad-trees (QT) represent a sub-class of the so-called region
growing techniques, where the image, in our case the MCER generated
by the MC scheme, is described by the help of variable size sectors
characterised by similar features, in this case similar grey levels.
Explicitly, the MCER is described in terms of two sets of
parameters, the structure of similar regions and their grey levels.
Note that the information characteristic of the QT structure is
potentially much more sensitive to bit errors than the grey level
coding bits.
Before QT decomposition takes place, the frame difference signal is
divided into 16 x 16-pixel blocks perfectly tiling the original
difference frame. Creating the QT regions is a recursive operation.
Considering each individual pixel, two or more neighbours are merged
together if a certain merging criterion is satisfied. This criterion
may be, for instance, a similar grey level. This merging procedure
is repeated until no more regions satisfy the merging criterion,
hence no more merging is possible. Similarly, the QT regions can be
obtained in a top-down approach, dividing the MCER into a number of
sections if the sections do not satisfy the similarity criterion,
and continuing until the pixel level is reached and no further
splits are possible.
The quad-tree approach is one possible implementation of the
so-called region growing techniques. This process can be observed in
Figure 9. For a rectangular region an algorithmically attractive
implementation is that, commencing at the pixel level, the four
quadrants of a square are merged together if the matching criterion
is met. The grey levels of the quadrants of a square are represented
by $m_1 \ldots m_4$ and their mean is computed according to
$m = (m_1 + m_2 + m_3 + m_4)/4$. If the absolute difference between
each of the four pixels and the mean grey level is less than the
system parameter $\sigma$, then these pixels satisfy the merging
criterion. Explicitly, a simple merging criterion can be formulated
as follows:
$$(|m - m_1| < \sigma) \cap (|m - m_2| < \sigma) \cap (|m - m_3| < \sigma) \cap (|m - m_4| < \sigma) = \mathrm{True}, \qquad (1)$$
where $\cap$ represents the logical AND operation.
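The merging criterion of Equation (1) amounts to a four-way comparison against the mean, as the following sketch shows. The function name `can_merge` and the argument name `threshold` for the system parameter are our own choices.

```python
def can_merge(m1, m2, m3, m4, threshold):
    """Evaluate the quad-tree merging criterion for four quadrant grey levels.

    Returns (decision, mean): the decision is True if every quadrant
    differs from the mean grey level by less than the threshold.
    """
    mean = (m1 + m2 + m3 + m4) / 4.0
    decision = all(abs(mean - mi) < threshold for mi in (m1, m2, m3, m4))
    return decision, mean
```

For instance, the grey levels 10, 12, 11 and 11 with a threshold of 2 are merged into a single region of grey level 11, whereas 10, 20, 10 and 20 are not merged, since each of them differs from the mean of 15 by 5.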
It is expected that if the system parameter σ is reduced, the
matching criterion becomes more stringent and hence less merging
takes place, which is likely to increase the required encoding rate
at a concomitant improvement of the MCER's representation quality.
In contrast, an increased σ value is expected to allow more merging
to take place and hence reduce the bit rate, as we will show in our
results Section.
If the merging criterion is satisfied, the mean grey level m becomes
the grey level of the merged quadrant in the next generation, and so
on. At this stage it is important to note that the quality control
threshold σ does not need to be known to the QT decoder. Therefore
the image representation quality can be rendered position-dependent
within the frame being processed, which allows weighting to be
applied to important image sections, such as the eyes and lips,
without increasing the complexity of the decoder or the transmission
rate.
Pursuing the top-down QT decomposition approach, the frame
difference signal constitutes a so-called node in the QT. Splitting
this node gives rise to four further nodes, which are classified on
the basis of the 'similarity criterion'. Specifically, if all the
pixels at this level of the QT differ from the mean m by less than
the threshold σ, then they are considered to be a so-called 'leaf
node' in the QT. Hence they do not have to be subjected to further
'similarity tests', they can be represented simply by the mean value
m.
If, however, the pixels constituting the current node do not satisfy
this criterion, they cannot be adequately represented by their mean
m and thus they must be split further, until the threshold condition
is met. This repetitive splitting process is continued until there
are no more nodes to split, since all the leaf nodes satisfy the
threshold criterion, as shown in Figure 9. Consequently the QT
structure describes the contours of similar grey levels in the frame
difference signal.
Figure 10. Enhanced sample codebook with 128 8 x 8 vectors © Streit
and Hanzo [190], 1997
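The top-down decomposition can be sketched recursively as shown below. The representation of the QT structure as a list of split/leaf flags (1 = split, 0 = leaf, with no flag spent on single pixels) is only an assumed illustration of a variable-length structure description, not the code used in [189], and the sketch assumes square blocks whose side is a power of two.

```python
import numpy as np

def qt_decompose(block, threshold):
    """Top-down quad-tree decomposition of a square block.

    Returns (structure_bits, leaf_means), both in depth-first order with
    the quadrants visited as top-left, top-right, bottom-left, bottom-right.
    """
    bits, means = [], []

    def visit(region):
        mean = region.mean()
        is_pixel = region.shape[0] == 1
        if is_pixel or np.all(np.abs(region - mean) < threshold):
            if not is_pixel:
                bits.append(0)          # leaf node: represented by its mean
            means.append(float(mean))
            return
        bits.append(1)                  # similarity test failed: split further
        h = region.shape[0] // 2
        for quad in (region[:h, :h], region[:h, h:], region[h:, :h], region[h:, h:]):
            visit(quad)

    visit(np.asarray(block, dtype=float))
    return bits, means
```

A flat block is described by a single leaf flag and one mean value, whereas an isolated large pixel value forces the tree to split down to the pixel level only in the quadrant that contains it, which is how the QT structure follows the contours of the frame difference signal.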
In order to be able to reproduce the encoded image, not only the
grey levels of the leaf nodes, but also the QT structure must be
efficiently encoded and communicated to the decoder. Fortunately,
the QT structure can be efficiently described by the help of a
variable-length code. At the commencement of image communications a
low-resolution version of the first image frame is encoded and
transmitted to the decoder in order to assist in its operation. Then
the MCER signal is computed, which is subjected to QT coding (QTC)
before transmission to the decoder. Again, the schematic of the QT
codec obeys the structure of Figure 6. The QT coded MCER is locally
decoded and added to the previous locally decoded frame and stored
in the frame buffer for the duration of one frame in order to
generate the next block estimates for the MC operation. A range of
techniques related to the optimum QT splitting and bit allocation
were suggested in [189], where details of the source-matched video
transceiver can also be found. Adequate video quality was achieved
for the Miss America sequence at a bitrate of 11.36 kbps, when using
QCIF images scanned at 10 frames/s.
Without aiming for an in-depth treatment we briefly allude to the
concept of vector quantised video codecs, where the MCER of an 8 x 8
pixel block is represented by the best matching entry of the
two-dimensional codebook shown in Figure 10. This principle also
allowed
Reference [187], where the full transceiver performance over fading
channels is also characterised. The above-mentioned range of
fixed-rate video codecs is compared in terms of error resilience and
video quality in Reference [190]. In References [187]-[189] a range
of flexible reconfigurable multi-level transceivers were designed
for the transmission of the VQ-, DCT- and QT-coded video streams by
allocating an additional physical speech channel for video
telephony. However, for reasons of space economy these results were
not included here. Similar video PSNR versus channel SNR results are
provided here using the ITU H.263 video codec and a reconfigurable
transceiver in order to characterise the expected video performance
in Figures 27 and 28.
2
In closing we note again that the literature of video compression is
very rich [139]-[156] and recent developments led to the definition
of the MPEG1, MPEG2, H.261 and H.263 standards. Although these
codecs rely on vulnerable variable-length coding techniques, work is
also under way towards contriving more robust coding algorithms,
such as those to be incorporated in the forthcoming MPEG4 scheme
[71, 70]. In the next Subsection we briefly highlight the features
of the standard H.263 scheme, which is an error-sensitive,
variable-rate scheme, but achieves a very high compression ratio and
hence to date it is the best existing standardised video codec. We
will also propose appropriate transmission techniques to support its
operation in a wireless videophone scheme.
6.5 The H.263 ITU Codec
The H.263 codec was detailed in References [193, 194], while a
number of transmission schemes designed for accommodating its rather
error-sensitive bit-stream were proposed in [195]-[177]. As an
illustrative example, in Table 2 we summarise the various video
resolutions supported by the H.261 and H.263 ITU codecs, in order to
demonstrate their flexibility [192]. Their uncompressed bitrates at
frame scanning rates of 10 and 30 frames/s for both grey and colour
video are also listed. The mature H.261 standard defined two
different picture resolutions, namely QCIF and CIF, while the H.263
codec has the ability to support five different resolutions. All
H.263 decoders must be able to operate in sub-QCIF (SQCIF) and QCIF
modes and optionally support CIF, 4CIF and 16CIF formats.
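The uncompressed bitrates of Table 2 follow from simple arithmetic, which the sketch below reproduces. Two assumptions are made that are consistent with the tabulated values but are not stated in the text: 8 bits per luminance pel, and a colour overhead of 50 % (as would result from two chrominance components subsampled by two in each dimension).

```python
def uncompressed_mbps(width, height, frame_rate, colour=False, bits_per_pel=8):
    """Uncompressed video bitrate in Mbit/s.

    colour=True adds an assumed 50 % chrominance overhead on top of
    the luminance component.
    """
    pels_per_frame = width * height
    colour_factor = 1.5 if colour else 1.0
    return pels_per_frame * bits_per_pel * colour_factor * frame_rate / 1e6
```

For QCIF (176 x 144) grey-scale video at 10 frames/s this gives about 2.03 Mbit/s, and for colour CIF (352 x 288) video at 30 frames/s about 36.5 Mbit/s.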
The H.261 and H.263 codecs share the simplified schematic of Figure
11, which operates under the instructions of the coding control
block, selecting the required inter/intra frame mode, the
quantisation and bit allocation scheme etc. The DCT is invoked to
compress either the original or the MCER blocks and the encoded
video signal is also locally decoded and stored in the frame memory
2
The exposition of this Section can be augmented by
http://www-Hanzo,[189], 1996
Video      Luminance     No. of pels   Uncompressed bitrate (Mbit/s)
format     dimensions    per frame     10 frame/s           30 frame/s
                                       Grey      Colour     Grey      Colour
SQCIF      128 x 96      12288         0.983     1.47       2.95      4.42
QCIF       176 x 144     25344         2.03      3.04       6.09      9.12
CIF        352 x 288     101376        8.1       12.2       24.3      36.5
4CIF       704 x 576     405504        32.4      48.7       97.3      146.0
16CIF      1408 x 1152   1622016       129.8     194.6      389.3     583.9
CCIR601    720 x 480     345600        27.65     41.472     82.944    124.416
HDTV1440   1440 x 960    1382400       110.592   165.888    331.776   497.664
HDTV       1920 x 1080   2073600       165.9     248.832    497.664   746.496

SQCIF: Sub-Quarter Common Intermediate Format
QCIF: Quarter Common Intermediate Format
CIF: Common Intermediate Format
HDTV: High Definition Television
Table 2. Various video formats and their uncompressed bitrate. Upon
using compression 10-100 times lower average
Figure 11. Simplified H.261/H.263 schematic © Cherriman [192], 1995
Figure 12. Image quality (average PSNR in dB) versus coded bitrate
(Kbit/s), for H.263 "Miss America" simulations at 10 and 30 frames/s
using SQCIF, QCIF and CIF sequences © Cherriman [192], 1995
in order to be used in future MC steps. All encoded information is
multiplexed for transmission by the video multiplex coder. The
codec's PSNR versus encoded bitrate performance is portrayed in
Figure 12 for "Miss America" simulations at 10 and 30 frames/s using
SQCIF, QCIF and CIF sequences [192]. Observe in the Figure that the
codec guarantees near-linear rate-scalability over a wide operating
range, which is partly explained by the extensive employment of
entropy coding schemes. The performance of a complete adaptive
videophone system will be portrayed after considering the associated
wireless transmission aspects.
Here we curtail our discussions on video codecs and provide some
notes on another aspect of multimedia com-
7 GRAPHICAL SOURCE CODING
7.1 Background
Telewriting is a multimedia telecommunication service enabling the
bandwidth-efficient transmission of handwritten text and line
graphics through fixed and wireless communication networks
[196]-[201]. Differential chain coding (DCC) has been successfully
used for graphical communications over E-mail networks [196] or
teletext systems [199], where bitrate economy is achieved by
exploiting the correlation between successive vectors. References
[197] and [202] also addressed some of the associated communications
aspects. A plethora of further excellent treatises were contributed
to the literature of chain coding by R. Prasad and his colleagues
from Delft University [203]-[205].
7.2 Fixed length differential chain coding [206]
In chain coding (CC) a square-shaped coding ring is slid along the
graphical trace from the current pixel, which is the origin of the
legitimate motion vectors, in steps represented by the vectors
portrayed in Figure 13. The bold dots in the Figure represent the
next legitimate pixels during the graphical trace's evolution. In
principle the graphical trace can evolve to any of the surrounding
eight pixels and hence a three-bit codeword is required for lossless
coding. Differential chain coding [203] (DCC) exploits that the most
likely direction of stylus movement is a straight extension,
corresponding to vector 0, with a gradually reducing probability of
sharp turns corresponding to vectors having higher indices.
Explicitly, we have found that while vector 0 typically has a
probability of around 0.5 for a range of graphical source signals,
including English and Chinese handwriting, a Map and a technical
Drawing, the relative frequency of vectors ±1 is around 0.2, while
vectors ±2, ±3 have probabilities around 0.05. This suggests that
the coding efficiency can be improved using the principle of entropy
coding by allocating shorter codewords to more likely transitions
and longer ones to less likely transitions.
In reference [206] we embarked on exploring the potential of a
graphical coding scheme dispensing with variable length coding,
which we refer to as fixed length differential chain coding
(FL-DCC). FL-DCC was contrived in order to comply with the
time-variant resolution and/or bitrate constraints of intelligent
adaptive multimode terminals, which can be re-configured under
network control to satisfy the momentarily prevailing tele-traffic,
robustness, quality, etc. system requirements. In order to maintain
lossless graphics quality under lightly loaded traffic conditions,
the FL-DCC codec can operate at a rate of b = 3 bits/vector,
although it has a higher bit rate than DCC. However, since in voice
and video coding typically perceptually unimpaired lossy
quantisation re-
Figure 13. Coding ring (differential vectors 0, +1, -1, +2, -2, +3,
-3 and +4; b = 1 codewords (0) and (1); b = 2 Gray codewords (00),
(01), (10) and (11))
Figure 14. FL-DCC Coding Syntax (fields: VC, X0, Y0, SV and FV)
© IEEE, Yuen and Hanzo [206], 1995
conditions.
Based on our findings as regards the relative frequencies of the
various differential vectors, we decided to evaluate the performance
of the FL-DCC codec using the b = 1 and b = 2 bit/vector lossy
schemes. As demonstrated by Figure 13, in the b = 2-bit mode the
transitions to pixels -2, -3, +2, +3 are illegitimate, while vectors
0, +1, -1 and +4 are legitimate. In order to minimise the effects of
transmission errors the Gray codes seen in Figure 13 were assigned.
It will be demonstrated that, due to the low probability of
occurrence of the illegitimate vectors, the associated subjective
coding impairment is minor. Under degrading channel conditions or
higher tele-traffic load the FL-DCC coding rate has to be reduced to
b = 1, in order to be able to invoke a less bandwidth-efficient, but
more robust modulation scheme or to generate fewer packets
contending for transmission. In this case only vectors +1 and -1 of
Figure 13 are legitimate. The subjective effects of the associated
zig-zag trace will be removed by the decoder, which can detect these
characteristic patterns and replace them by a fitted straight line.
In general terms the size of the coding ring is given by 2n times a
scaling parameter, where n = 1, 2, 3, ... is referred to as the order
of the ring and the scaling parameter is characteristic of the pixel
separation distance. Hence the ring shown in Figure 13 is a first
order one. The number of nodes in the ring is M = 8n.
in Figure 14. The beginning of a trace can be marked by a typically
8 bit long pen-down (PD) code, while the end of the trace by a
pen-up (PU) code. In order to ensure that these codes are not
emulated by the remaining data, bit stuffing must be invoked
whenever such emulation would otherwise occur. We found that in
complexity and robustness terms using a 'vector counter' (VC)
constituted a more attractive alternative for our system. The
starting coordinates X_0, Y_0 of a trace are directly encoded using
for example 10 and 9 bits in case of a video graphics array (VGA)
resolution of 640 x 480 pixels.
The first vector displacement along the trace is encoded by the best
fitting vector defined by the coding ring as the starting vector
(SV). The coding ring is then translated along this starting vector
to determine the next vector. A differential approach is used for
the encoding of all the following vectors along the trace, in that
the differences in direction between the present vector and its
predecessor are calculated and these vector differences are mapped
into a set of 2^b fixed length b-bit codewords, which we refer to as
'fixed vectors' (FV).
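The differential mapping can be illustrated by the following simplified sketch. It is not the FL-DCC codec of [206]: the trace is assumed to be given as a list of direction indices 0..7 on a first-order ring, the numbering of directions is our own, each differential vector is simply quantised to the nearest legitimate one, and the actual codeword assignment of Figure 13 is not modelled.

```python
# Legitimate differential vectors for the lossy modes described in the text.
LEGAL = {2: (0, 1, -1, 4), 1: (1, -1)}

def signed_diff(cur, prev):
    """Differential direction in the range -3..+4 on an eight-node ring."""
    d = (cur - prev) % 8
    return d - 8 if d > 4 else d

def ring_dist(a, b):
    """Distance between two differential vectors measured around the ring."""
    d = abs(a - b) % 8
    return min(d, 8 - d)

def fl_dcc_encode(directions, b=2):
    """Map a list of absolute directions to a starting vector plus
    fixed-length differential vectors (b bits each)."""
    start = directions[0]
    prev = start
    diffs = []
    for cur in directions[1:]:
        want = signed_diff(cur, prev)
        # Choose the nearest legitimate differential vector (assumed rule).
        best = min(LEGAL[b], key=lambda v: ring_dist(v, want))
        diffs.append(best)
        prev = (prev + best) % 8   # track what the decoder will reconstruct
    return start, diffs

def fl_dcc_decode(start, diffs):
    """Rebuild the absolute directions from the starting vector and the
    differential vectors."""
    dirs = [start]
    for d in diffs:
        dirs.append((dirs[-1] + d) % 8)
    return dirs
```

In this sketch a gently curving trace is reproduced without loss in the b = 2 mode, while a straight line encoded in the b = 1 mode comes out as an alternating +1, -1 pattern, which corresponds to the zig-zag trace that the decoder is expected to smooth.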
We designed a wireless 4QAM-based [68] transceiver for the
transmission of FL-DCC encoded graphical source signals and
evaluated the system's robustness over Rayleigh-fading channels with
second-order switched-diversity, using automatic repeat requests
limited to a maximum of three transmission attempts (TX3) [207].
Here we refrain from providing PSNR versus channel SNR curves; for
these the interested reader is referred to [207]. However, the
corresponding subjective graphical quality and the associated PSNR
values are summarised in Figure 15 for the channel SNR range of
5-12 dB. Due to its low channel capacity requirement the FL-DCC
coded signal is readily accommodated by the voice signal during
passive speech spurts, when using a voice activity detector (VAD)
[40]. Finally, it is noteworthy that the ITU standardised two
different chain coding schemes in the T.150 Recommendation for use
over conventional low-BER fixed telephone lines. However, for
wireless channels the proposed FL-DCC scheme is preferable due to
its higher robustness and programmable-rate operation.
We note that an associated multimedia signal manipulation relying on
writing tablets is the field of handwriting recognition for both
on-line and off-line applications [208]-[213]. Many of the
techniques used are based on hidden Markov models (HMMs), which are
widely employed in the field of speech recognition. Previous
research has shown that HMMs are applicable to both off-line [212]
and on-line [208, 210, 213] handwriting recognition problems. The
advantage of such statistical methods is that they can handle
variability in the writing process of an individual, but they are
also capable of identifying and capturing the individual features of
the handwritten characters by taking into account dynamic, pressure
Figure 15. Subjective effects of transmission errors for the b = 1,
16QAM, RD, TX3 scheme for PSNR values of (left to right, top to
bottom) 49.47, 42.57, 37.42, 32.01, 27.58 and 21.74 dB,
respectively. © IEEE, Yuen and Hanzo [206], 1995
function of the distance along the writing trajectory.
Following the above brief excursion to graphical source compression
and signal processing, here we turn our attention to wireless
communications aspects, commencing with a review of the frequency
re-use concept of cellular systems.
8 CELLULAR COMMUNICATIONS BASICS
8.1 The Cellular Concept
A common feature of the previously mentioned mobile radio systems is
that communications take place between a stationary base station
(BS) and a number of roaming mobile stations (MSs) or portable
stations (PSs) [1]-[9]. The BS's and the MS's transmitters are
expected to provide a sufficiently high received signal level for
the far-end receivers in order to maintain the required
communications integrity. This is usually ensured by power control.
The geographical area in which this condition is satisfied is termed
a traffic cell, which typically has an irregular shape, depending on
the prevailing propagation environment determined by terrain and
architectural features as well as the local paraphernalia. In
theoretical studies often a simple hexagonal cell structure is
favoured for its simplicity, where the BSs are located at the
centres of the cells.
In an ideal situation the total bandwidth available to
Figure 16. Hexagonal cells and seven-cell clusters
each cell, assuming that there is no energy spilt into the adjacent
cell's coverage area. However, since wave propagation cannot be
shielded at the cell boundary, PSs near the cell edge would
experience approximately the same signal energy within their channel
bandwidth from at least two BSs. This phenomenon is called
co-channel interference. A remedy to this problem is to divide the
total bandwidth $B_{total}$ into frequency slots of
$B_{cell} = B_{total}/N$, and assign a mutually exclusive reduced
bandwidth of $B_{cell}$ to each traffic cell within a so-called
cluster of N cells, as demonstrated in Figure 16 for N = 7. The
seven-cell clusters are then tessellated in order to provide
contiguous radio coverage. Observe from the figu