
ECMCS'99

EURASIP Conference

DSP for Multimedia Communications and Services

Kraków, 24-26 June

BANDWIDTH-EFFICIENT WIRELESS MULTIMEDIA COMMUNICATIONS: LIMITATIONS, SOLUTIONS AND CHALLENGES

INVITED PAPER

Lajos Hanzo

Dept. of Electr. and Comp. Sc., Univ. of Southampton, SO17 1BJ, UK.

Tel: +44-703-593125, Fax: +44-703-594 508

Email: [email protected]

http://www-mobile.ecs.soton.ac.uk

ABSTRACT

Commencing with a brief history of mobile communications and a portrayal of the basic concept of wireless multimedia communications, the implications of Shannon's theorems regarding joint source and channel coding for wireless communications are addressed. Following a brief introduction to speech, video and graphical source coding as well as the cellular concept, a rudimentary overview of flexible, reconfigurable mobile radio schemes is provided. We then summarise the fundamental concepts of modulation, introduce an adaptive modem scheme and argue that third-generation transceivers might become adaptively reconfigurable under network control in order to meet backwards compatibility requirements with existing systems and to achieve the best compromise amongst a range of conflicting system requirements in terms of communications quality, bandwidth requirements, complexity and power consumption, robustness against channel errors, etc.

© 1998 IEEE. REPRINTED WITH PERMISSION FROM PROC. OF THE IEEE, JULY 1998, VOL. 86, NO. 7, PP 1342-1382, CENTENNIAL SPECIAL ISSUE - 100 YEARS OF RADIO COMMUNICATIONS

THE HELPFUL SUGGESTIONS OF THE ANONYMOUS REVIEWERS ARE GRATEFULLY ACKNOWLEDGED

A RANGE OF ASSOCIATED PAPERS CAN BE FOUND UN-

1 THE WIRELESS COMMUNICATIONS SCENE

Since the end of the last century, when Marconi and Hertz demonstrated the feasibility of radio transmissions, mankind has endeavoured to fulfil the dream of wireless multimedia personal communications, enabling people to communicate with anyone, anywhere, at any time, using a range of multimedia services. The evolution of wireless systems and their subsystems has been well documented in a range of monographs by Jakes [1], Lee [2], Parsons and Gardiner [3], Feher [4] and others. Glisic and Vucetic [5] as well as Prasad [6] concentrated on various aspects of Code Division Multiple Access (CDMA) in their monographs, while the compilation of excellent overviews edited by Glisic and Leppanen [7] treated both Time Division Multiple Access (TDMA) and CDMA along with a range of other associated aspects, such as smart antennae [53, 55, 56], trellis coding as well as emerging topics, referred to as 'time-space' processing [55], 'per-survivor' processing [57] etc. Meyer et al [8] focused on various modern receiver techniques in their monograph. Steele [9] compiled a monograph, which considers most physical layer aspects of modern TDMA systems, including speech and channel coding, modulation, frequency hopping, and so on, amalgamating them in the last chapter in the context of the Global System of Mobile Communications known as GSM. Further important references are for example those by Rappaport [10], Garg and Wilkes [11] or the compilation edited by Gibson [12]. These developments are also portrayed in magazine special issues [13]-[18] and [...] in the broad field of wireless multimedia communications. Let us commence our discourse with a glimpse of history.

The first mobile radio systems were introduced by the military, police and other emergency services, most of which were limited to voice only communications. During the pre-VLSI era the realisable signal processing complexity was severely limited and hence the handsets provided typically poor voice quality at a high cost. This was due to the phenomenon of multi-path wave propagation, where the different multi-path components arriving at the receiver's antenna suffer different attenuation and phase rotation and hence they sometimes add constructively, sometimes destructively. This situation is further aggravated by the so-called delay-spread, when the various propagation paths have rather different path-lengths and consequently exhibit different delays, spilling inter-symbol interference (ISI) into the adjacent signalling or symbol intervals. These phenomena can today often be combated by sophisticated signal processing methods at the cost of added implementational complexity, which was not possible in the pre-VLSI era. Hence, until quite recently, the quality and variety of wireless services have been inferior to conventional tethered communications.
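The constructive and destructive addition of multipath components described above can be illustrated numerically. The following is a minimal sketch, assuming a narrowband signal so that each path reduces to a single complex gain; the function name `received_amplitude` and the example path values are illustrative assumptions and do not appear in the paper.

```python
import cmath
import math


def received_amplitude(paths):
    """Magnitude of the sum of multipath components.

    paths: iterable of (attenuation, phase_in_radians) pairs, one per path.
    Each path is modelled as a complex phasor (narrowband assumption).
    """
    total = sum(a * cmath.exp(1j * phi) for a, phi in paths)
    return abs(total)


# Two equal-strength paths arriving in phase add constructively ...
in_phase = received_amplitude([(1.0, 0.0), (1.0, 0.0)])
# ... while the same two paths arriving in anti-phase cancel (a deep fade).
anti_phase = received_amplitude([(1.0, 0.0), (1.0, math.pi)])
```

As the mobile moves, the relative phases of the paths change continuously, so the received amplitude wanders between these two extremes.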

The first public cellular radio system, known as the Advanced Mobile Phone Service (AMPS), was introduced in 1979 in the United States, shortly followed by the Nordic Mobile Telephone (NMT) system in Scandinavia in 1981. The first British system was the Total Access Communications System (TACS) operated by Cellnet and Vodafone, while the Japanese introduced the Nippon Advanced Mobile Telephone System (NAMTS). All of these so-called first-generation national systems were based on analogue frequency modulation (FM) but used digital network control. However, they did not support international roaming.

In 1982 CEPT (Conférence Européenne des Postes et Télécommunications), the main governing body of the European PTTs, created the Groupe Spécial Mobile (GSM) Committee and tasked it with standardising a digital cellular Pan-European public mobile communication system to operate in the 900 MHz band. This was followed by the launch of experimental programmes of different types of digital cellular radio systems in a number of European countries. By the middle of 1986 nine proposals were received for the future Pan-European system, and GSM organised a trial in Paris to identify the one having the best performance. The technical details of the candidate systems are described in references [33], [34], [35], [36] and [37], while a short summary of their salient features was given in reference [39]. A detailed description of the standardised GSM system's main features can be found in reference [40]. This scheme constitutes the first so-called second-generation public land mobile radio (PLMR) system, which was designed for the worst-case [...]ing channel conditions and techniques for mitigating their effects will be highlighted during our further discourse.

Following GSM, in 1989 the American second generation scheme known as the Digital Advanced Mobile Phone (DAMPS) system [41] was also standardised, with the advantage of being able to accommodate three higher quality digital channels in a conventional 30 kHz analogue AMPS channel slot. Its unique feature is that, similarly to the Japanese second generation scheme referred to as the Public Digital Cellular (PDC) system [42], it uses a 2 bits/symbol non-binary modem, which implicitly assumes a more benign propagation environment than that of the GSM PLMR system. The improved wave propagation conditions are a consequence of employing so-called micro-cells, where, in contrast to hostile PLMR systems, the high antenna elevation is reduced to below the urban sky-line. Hence there is typically a strong line-of-sight (LOS) path between the base station (BS) and mobile station (MS), reducing the fading depth and mitigating the effect of delay-spread induced ISI. These issues will be re-visited in more depth at a later stage.

With respect to the improved propagation conditions the multi-level IS-54 and PDC systems provide a seamless transition towards the so-called cordless telecommunications (CT) system concept contrived mainly for friendly indoor office and domestic propagation environments. Hence CT products are designed to have a low transmitted power and small coverage area, where typically there is a dominant line-of-sight (LOS) propagation path between the Fixed Station (FS) and Portable Station (PS). The low transmitted power and small transmission range facilitate a low-complexity, low-cost, light-weight construction. The standardisation and development of CT products was hallmarked by the British CT2 system, the Digital European Cordless Telephone (DECT) and the Japanese Handyphone (PHP) systems. A further important milestone was the standardisation of the British DCS-1800 system, which is essentially an up-converted GSM system implemented at 1.8 GHz. The definition of the so-called half-rate GSM system, supporting twice as many subscribers within the 200 kHz channel bandwidth as the full-rate system, was also an important development in the field. These second generation systems and CT schemes were described in dedicated chapters of reference [12].

Currently there exist a range of initiatives worldwide, which attempt to define the third generation personal communications network (PCN), which is referred to as a personal communications system (PCS) in North America. The European Community's Research in Advanced Communications Equipment (RACE) programme [13, 12] and the consecutive framework referred to as Advanced Communications Technologies and [...] projects, endeavouring to resolve the on-going debate regarding the most appropriate multiple access scheme, studying Time Division Multiple Access (TDMA) [13, 40, 9, 41, 12] and Code Division Multiple Access (CDMA) [13, 9, 43, 12].

[Figure 1: MOBILITY (fixed, portable, mobile) versus DATA RATE plane, showing ISDN, B-ISDN, cordless, WLAN, MBS, second-generation (GSM, IS-54) and UMTS systems]

Figure 1. Stylised mobility versus bitrate plane classification of existing and future wireless systems

European third generation research is conducted under the umbrella of the so-called Universal Mobile Telecommunications System (UMTS) [13] initiative and so far the following proposals have been submitted to ETSI [54]: wideband CDMA [46, 47, 48], Adaptive TDMA [49] (ATDMA), hybrid TDMA/CDMA [50], Orthogonal Frequency Division Multiplex (OFDM) [51, 68] and Opportunity Driven Multiple Access (ODMA). We note that the Nokia testbed portrayed in [48] was designed with video transmission capabilities up to 128 kbps in mind. Similarly, cognizance was given to the aspects of less bandwidth-constrained, i.e. higher-rate video communications by the Japanese wideband CDMA proposal [52] for the Intelligent Mobile Terminal IMT 2000 emerging from NTT DoCoMo. These standardisation activities are portrayed in more depth in [54].

In the ACTS workplan [44] there are a number of projects dealing with multimedia source and channel coding, modulation and multiple access techniques for both cellular and wireless local area networks (WLANs). These studies will design the architecture and produce demonstration models of the universal mobile telecommunications system (UMTS), which the Europeans intend to accomplish before the turn of the century. Somewhere along the line UMTS is expected to merge with the CCIR study on the future public land mobile telecommunications system (FPLMTS). These systems are characterised with the help of Figure 1 in terms of their expected grade of mobility and bitrate. These fundamental features predetermine the range of potential applications.

Specifically, the fixed networks are evolving from [...] network (ISDN) towards higher-rate broad-band ISDN or B-ISDN. A higher grade of mobility, which we refer to here as portability, is a feature of cordless telephones, such as the DECT, CT2, PHP etc. systems, although their transmission rate is more limited. The DECT system is the most flexible one amongst them, allowing the multiplexing of 23 single-user channels in one direction, which provides rates up to 23 × 32 kbps = 736 kbps for advanced services. Wireless local area networks (WLAN) can support bitrates up to 155 Mbits/s in order to extend existing Asynchronous Transfer Mode (ATM) links to portable terminals, but they usually do not support full mobility functions, such as location update or handover from one BS to another. A rapidly evolving field that is also gaining considerable commercial interest is associated with the research and development of High-Performance LANs (HIPERLAN) [66, 67] for 'customer premises' type communications. Contemporary second generation PLMR systems, such as GSM and IS-54, cannot support high bitrate services, since they typically have to communicate over lower quality channels, but they exhibit the highest grade of mobility, including high-speed international roaming capabilities.

The third generation UMTS is expected to have the highest grade of flexibility both in terms of its service bitrate range and in terms of mobility. In its design cognizance is given to the second generation systems. Indeed, we may anticipate that some of the subsystems of GSM and DECT may find their way into UMTS, either as a primary sub-system, or as a component to achieve backward compatibility with systems in the field. This approach may result in hand-held transceivers that are intelligent multimode terminals, able to communicate with existing networks, while having more advanced and adaptive features that we would expect to see in the next generation of wireless multimedia personal communication networks. Following the above brief overview of the wireless communications scene let us now briefly speculate on the practical embodiment of the multimedia communicator of the near future.

2 OUTLINE

Following the above brief historical overview, in the rest of this treatise we concentrate mainly on bandwidth-efficient low-rate systems, although many of the proposed techniques are suitable for high-rate systems as well. Following some introductory conceptual notes regarding a possible manifestation of the future wireless multimedia communicator in Section 3, we analyse the ramifications of Shannon's message for wireless systems in Section 4. This is followed by three Sections on speech, video and graphical source coding, before we focus our attention on transmission aspects. Section 8.1 highlights the basic cellular concept, while Section 8.2 introduces a few multiple [...] of modulation schemes in Section 11 and forward error correction (FEC) coding, before concluding with the portrayal of the expected system performance figures characterising such an intelligent multimode speech system in Section 13 and the characterisation of a videophone transceiver in Section 14.

The paper addresses the so-called physical-layer functions of wireless systems in more depth, but also attempts to devote some attention to higher-layer aspects, such as multiple access, dynamic channel allocation, handover, etc. Given the wide scope of this treatise, it is inevitable that some important trends and seminal contributions by highly acclaimed authors remain beyond its coverage, although with the number of references provided there is sufficient scope for the interested reader to probe further in certain deeper subject areas.

3 WIRELESS MULTIMEDIA COMMUNICATOR

A possible manifestation of the multimedia PS is portrayed in Figure 2, which is equipped with a bird's-eye camera, a microphone and a liquid-crystal screen, serving both as a video-telephone screen and as a computer screen. The conventional keyboard is likely to be replaced by a pressure-sensitive writing tablet, facilitating optical handwriting recognition [208]-[213], signature verification etc.

The pivotal implementational point of such a multimedia PS is that of finding the best compromise amongst a number of contradicting design factors, such as low power consumption, high robustness against transmission errors under various channel conditions, high spectral efficiency, good audio/video quality, low delay, high-capacity networking and so forth. In this contribution we will address a few of these issues in the context of the proposed PS depicted in Figure 2. The time-variant optimisation criteria of a flexible multimedia system can only be met by an adaptive scheme, comprising the firmware of a suite of system components and invoking that combination of speech codecs, video codecs, embedded channel codecs, voice activity detector (VAD) and modems, which fulfils the currently prevalent requirement [68].

These requirements lead to the concept of arbitrarily programmable, flexible so-called software radios [16], which is virtually synonymous with the so-called tool-box concept invoked for example in the forthcoming Motion Pictures Expert Group (MPEG) 4 video codec proposed for wireless video communications [70]. This concept appears attractive also for UMTS-type transceivers. A few examples of such optimisation criteria are maximising the teletraffic carried or the robustness against channel errors, while in other cases minimisation of the bandwidth occupancy, the blocking probability or the power consumption is of prime concern.

Figure 2. Wireless Multimedia Communicator

Figure 3. Wireless Multimedia Network

[...] Figure 3. The multimedia PSs communicate with the so-called BSs in their vicinity, which are interconnected either directly using optical fibre with each other, or in more complex systems via the so-called Mobile Switching Centres (MSC). The PSs can access through the BS a range of services, including business databases, multimedia databases, main-frame computers, etc. Let us now turn our attention to some of the information theoretical aspects of wireless communications, in order to be able to understand the underlying systems' technical ramifications.

4 SHANNON'S MESSAGE AND ITS IMPLICATIONS FOR WIRELESS CHANNELS

In mobile multimedia communications it is always of prime concern to maintain an optimum compromise in terms of the contradictory requirements of low bit rate, high robustness against channel errors, low delay and low complexity. The minimum bit rate at which the condition [...] transmission rate required for the lossless representation of the source signal, which is referred to as the source entropy, is only asymptotically achievable, as the encoding memory length or delay tends to infinity. Any further compression is associated with information loss or coding distortion. Note that the optimum source encoder generates a perfectly uncorrelated source coded stream, where all the source redundancy has been removed, therefore the encoded symbols are independent and each one has the same significance. Having the same significance implies that the corruption of any of the source encoded symbols results in identical reconstructed signal distortion over imperfect channels.

Under these conditions, according to Shannon's fundamental work [72, 73, 75], the best protection against transmission errors is achieved if source and channel coding are treated as separate entities. When using a block code of length N channel coded symbols in order to encode K source symbols with a coding rate of R = K/N, the symbol error rate can be rendered arbitrarily low if N tends to infinity and the coding rate to zero. This condition also implies an infinite coding delay. Based on the above considerations and on the assumption of Additive White Gaussian Noise (AWGN) channels, source and channel coding have historically been separately optimised.
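The trade-off between block length, coding rate and error rate stated above can be made concrete with the simplest possible block code. The following is a minimal sketch, assuming an N-fold repetition code (K = 1, R = 1/N) with majority-vote decoding over a binary symmetric channel; this code is chosen purely for illustration and is not one of the schemes considered in the paper. The function name and the crossover probability of 0.1 are assumptions.

```python
from math import comb


def repetition_error_rate(n, p):
    """Decoded bit error probability of an n-fold repetition code.

    n: block length (must be odd so that the majority vote is unambiguous).
    p: crossover probability of the binary symmetric channel.
    A decoding error occurs when more than half of the n symbols are flipped.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd for an unambiguous majority vote")
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range((n + 1) // 2, n + 1)
    )


p = 0.1
# Map block length N to (coding rate R = 1/N, decoded error probability).
rates = {n: (1 / n, repetition_error_rate(n, p)) for n in (1, 3, 5, 11)}
```

As N grows the coding rate falls towards zero and the decoded error probability shrinks, at the cost of a proportionally longer coding delay.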

Mobile radio channels are typically subjected to multipath propagation and hence constitute a more hostile transmission medium than AWGN channels, exhibiting pathloss, lognormal slow fading and Rayleigh fast fading [217, 216]. Furthermore, if the signalling rate used is higher than the channel's so-called coherence bandwidth [217, 216], additional impairments are inflicted by dispersion, which is associated with frequency domain linear distortions. Under these circumstances the channel's error distribution versus time becomes bursty and an infinite-memory symbol interleaver is required in order to disperse the bursty errors and render the errors as independent as possible, as they are over AWGN channels. Clearly, for mobile channels many of the above mentioned, asymptotically valid ramifications of Shannon's theorems have a limited applicability.

A range of practical limitations must be observed when designing wireless multimedia links. Although it is often possible to reduce the required bitrate of state-of-the-art multimedia source codecs while maintaining a certain reconstructed signal quality, in practical terms this is only possible at a concomitant increase of the implementational complexity and encoding delay. A good example of these limitations is the half-rate GSM speech codec, which was required to approximately halve the encoding rate of the 13 kbps full-rate codec, while maintaining less than quadrupled complexity, similar robustness against channel errors and less than doubled encoding delay. Naturally, the increased algorithmic complexity is typically associated with higher power consumption, while the [...] segment intuitively implies that each bit will have an increased relative significance. Accordingly, their corruption may inflict increasingly objectionable speech degradations, unless special attention is devoted to this problem. It is worth noting that despite its quadruple complexity the half-rate GSM speech codec maintains a lower power consumption than the first launched full-rate codec had, due to low-power 3V technology.

Figure 4. Intelligent transceiver schematic

In a somewhat simplistic approach one could argue that due to the reduced source rate we could accommodate an increased number of parity symbols using a more powerful, implementationally more complex and lower rate channel codec, while maintaining the same transmission bandwidth. However, the complexity, quality, robustness trade-off of such a scheme would not be very attractive.

A more intelligent approach will be required in order to design better wireless multimedia transceivers [73, 74] for bursty mobile radio channels. The simplified schematic of such an intelligent transceiver is portrayed in Figure 4. Perfect source encoders operating close to the information-theoretical limits of Shannon's predictions can only be designed for stationary source signals, a condition not satisfied by most multimedia source signals. Further previously mentioned limitations are the encoding complexity and delay. As a consequence of these limitations the source-coded stream will inherently contain residual redundancy and the correlated source symbols will exhibit unequal error sensitivity, requiring unequal error protection. Following Hagenauer [73, 74] we will refer to the additional knowledge regarding the different importance or vulnerability of various source coded bits as source significance information (SSI), whereas to the confidence associated with the channel decoder's decisions as decoder reliability information (DRI).

These additional links between the source and channel codecs are also indicated in Figure 4. Further [...] the difference between consecutive decoded symbols violates some threshold condition and thereby facilitates the detection of a channel decoding error. Then the channel decoder can attempt a second tentative decoding by passing the second most likely corrected message to the source decoder, which in turn subjects this again to the previously failed threshold test, etc. A variety of such techniques have successfully been used in robust source-matched source and channel coding [73, 74, 82, 83]. Another practical manifestation of the time-variant source statistics of speech signals is the fact that during silent speech spurts some speech codecs do not surrender their reserved physical link, they reduce their output bitrate instead, which can reduce the interference inflicted on other users in so-called Code Division Multiple Access (CDMA) systems, such as the American IS-95 system [43]. Video codecs, such as the variable-rate MPEG 1 [80] and MPEG 2 [81] codecs, rely even more explicitly on the fluctuation of the source statistics. For example, when a new object is introduced in the scope of the camera, which cannot be predicted on the basis of already known previous video frames, then the bitrate is typically increased.

The role of the Interleaver and Deinterleaver [79] seen in Figure 4 is to rearrange the channel coded bits before transmission. The mobile radio channel typically inflicts bursts of errors during deep channel fades, which often overload the channel decoder's error correction capability in certain source signal segments, while other segments are not benefiting from the channel codec at all, since they may have been transmitted between fades and hence are error-free even without channel coding. This problem can be circumvented by dispersing the bursts of errors more randomly between fades so that the channel codec is always faced with an 'average-quality' channel, rather than the bi-modal faded/non-faded condition, although only at the cost of increased system delay, which may become an impediment in interactive multimedia communications. In other words, channel codecs are most efficient if the channel errors are near-uniformly dispersed over consecutive received segments.

In its simplest manifestation an interleaver is a memory matrix that is filled with channel coded symbols on a row-by-row basis, which are then passed on to the modulator on a column-by-column basis. If the transmitted sequence is corrupted by a burst of errors, the deinterleaver maps the received symbols back to their original positions, thereby dispersing the bursty channel errors. An infinite memory channel interleaver is required in order to perfectly randomise the bursty errors and therefore to transform the Rayleigh-fading channel's error statistics into that of an AWGN channel, for which Shannon's information theoretical predictions apply. Since in interactive multimedia communications the tolerable delay is strictly limited, the interleaver's memory length and efficiency is also limited. For further details on the effects of various [...] interested reader is referred to Reference [79].
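The row-by-row write and column-by-column read of the rectangular interleaver described above can be sketched as follows. This is a minimal illustration, assuming that the symbol sequence fills the matrix exactly; the function names and the 3 × 4 matrix size are assumptions chosen for the example, not parameters from the paper.

```python
def interleave(symbols, rows, cols):
    """Write symbols row-by-row into a rows x cols matrix, read column-by-column."""
    if len(symbols) != rows * cols:
        raise ValueError("symbols must fill the matrix exactly")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse mapping: restore the original row-by-row symbol order."""
    if len(symbols) != rows * cols:
        raise ValueError("symbols must fill the matrix exactly")
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]


rows, cols = 3, 4
original = list(range(12))              # symbol indices 0..11
tx = interleave(original, rows, cols)   # order seen by the channel

# A burst corrupting three consecutive transmitted symbols ...
burst_positions = [4, 5, 6]
# ... hits these original symbol indices, which are no longer adjacent.
hit = sorted(tx[i] for i in burst_positions)
```

After deinterleaving, the three burst errors land on original positions 2, 5 and 9, so a channel decoder processing consecutive symbols sees isolated errors rather than a burst. The price is that a whole matrix must be buffered at both ends, which is the delay penalty noted above.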

A specific deficiency of the above mentioned rectangular interleavers is that in case of a constant vehicular speed the Rayleigh-fading mobile channel typically produces periodic fades [217, 216] and error bursts at travelled distances of λ/2, where λ is the carrier's wavelength, which may be mapped by the rectangular interleaver into another set of periodic bursts of errors. Again, a range of more random re-arrangement or interleaving algorithms exhibiting a higher performance than rectangular interleavers have been proposed for mobile channels in Reference [79], where also a variety of practical channel coding schemes have been portrayed. Section 5 gives a brief overview of the recent activities in speech source coding, Section 6 provides a rudimentary introduction to video source coding, while Section 7 highlights the principles of graphical source coding. For a full review of speech source coding schemes for mobile systems the interested reader is referred to references [84]-[91], joint source and channel coding was the subject of [92], whereas modulation and transmission arrangements for wireless channels have been studied in [4, 6, 69, 9, 68].
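The half-wavelength fade spacing mentioned in connection with rectangular interleavers translates into a fade period once a carrier frequency and vehicular speed are fixed. The following is a minimal sketch; the 900 MHz carrier corresponds to the GSM band mentioned earlier, whereas the speed of 30 m/s and the function name `fade_period` are assumptions made purely for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # propagation speed in m/s


def fade_period(carrier_hz, speed_m_per_s):
    """Time in seconds between fades spaced half a wavelength apart."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return (wavelength / 2) / speed_m_per_s


# 900 MHz carrier, vehicle travelling at 30 m/s (assumed speed).
period = fade_period(900e6, 30.0)
```

Under these assumptions the fades recur roughly every 5.6 ms. If this period happens to coincide with a multiple of the interleaver's row or column duration, the periodic bursts can be mapped onto another periodic pattern, which is the deficiency noted above.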

Returning to Figure 4, soft decision information (SDI) is passed by the demodulator to the FEC decoder, indicating that the demodulator refrained from making a hard decision concerning the received bit. Instead, it passes the estimated reliability of the received information to the FEC decoder, thereby improving its efficiency. The channel state information (CSI), which is in simple terms representative of the current fade depth, can be used to weight the SDI in the detection process. This weighted reliability information is then often used by the channel decoder in order to invoke maximum likelihood sequence estimation (MLSE) based on the Viterbi algorithm [311, 79] in order to improve the system's performance with respect to conventional hard decision decoding. Following the above rudimentary review of Shannon's information theory, the rest of this treatise is devoted to practical issues of wireless multimedia communications. Let us initially consider briefly the recent advances in speech source coding.

5 SPEECH SOURCE CODING

5.1 A historical perspective on speech codecs

Following the 64 kbits/s Pulse Code Modulation (PCM) and 32 kbps Adaptive Differential PCM (ADPCM) G.721 Recommendations standardised by the International Telecommunications Union (ITU), in 1986 the 13 kbits/s Regular Pulse Excitation (RPE) [105, 106] codec was selected for the Pan-European mobile system known as GSM, and more recently Vector Sum Excited Linear Prediction (VSELP) [107, 108] codecs operating at 8 and 6.7 kbits/s were favoured in the American IS-54 and [...] art was documented in a range of excellent monographs by O'Shaughnessy [87], Furui [88], Anderson and Seshadri [92], Kondoz [89], Kleijn and Paliwal [90] and in a tutorial review by Gersho [78]. More recently the 5.6 kbits/s half-rate GSM quadruple-mode Vector Sum Excited Linear Predictive (VSELP) speech codec standard developed by Gerson et al [109] was approved, while in Japan the 3.45 kbits/s half-rate PDC speech codec invented by Ohya, Suda and Miki [113] using the so-called Pitch Synchronous Innovation (PSI) CELP principle was standardised. Other currently investigated schemes are the Prototype Waveform Interpolation (PWI) proposed by Kleijn [114], Multi-Band Excitation (MBE) suggested by Griffin et al [115] and Interpolated Zinc Function Prototype Excitation (IZFPE) codecs advocated by Hiotakakos and Xydeas [116]. In the low-delay, but more error sensitive backward adaptive class the 16 kbps ITU G.728 codec [117] developed by Chen et al from the AT&T speech team hallmarks a significant step. This was followed by the equally significant development of the more robust, forward-adaptive 15 ms delay G.729 ACELP arrangement proposed by the University of Sherbrooke team [122, 123], AT&T and NTT [118]. Lastly, the standardisation of the 2.4 kbps DoD codec led to intensive research in this very low-rate range and the Mixed Excitation Linear Predictive (MELP) codec by Texas Instruments was identified [119] in 1996 as the best overall candidate scheme.

Before concluding our discourse on speech codecs let us briefly highlight the problems associated with 7 kHz bandwidth, so-called commentatory quality speech coding.

5.2 Wideband speech codecs

For the sake of completeness we note briefly that 7 kHz bandwidth speech codecs offer more transparent speech quality than their narrowband counterparts at a typically higher bitrate and algorithmic complexity.

One of the problems associated with full-band coding of wideband speech is the codec's inability to treat the less predictable high-frequency, low-energy speech band, which was tackled by the ITU G.722 codec using split-band or sub-band coding. Although the upper subband is important for maintaining an improved intelligibility and naturalness, it only contains a small fraction of the speech energy, which is on the order of 1%, and therefore its bitrate contribution has to be limited appropriately. The ITU G.722 codec [131] uses two equal-width subbands, whose signals are encoded employing ADPCM techniques, and has the ability of transmitting speech at 64, 56 or 48 kbps, while allocating 0, 8 or 16 kbps capacity for data transmission.

Quackenbush [132] suggested a transform-coded approach in order to allow for a higher flexibility in terms [...] audio signals and reduced the bitrate required according to the lower sampling rate of 16 kHz. Ordentlich and Shoham proposed a low-delay CELP-based 32 kbps wideband codec [134], which achieved a similar speech quality to the G.722 64 kbps codec at a concomitant higher complexity. The backward-adaptive LPC filter used had an order of 32, which was significantly lower than the filter order of 50 used in the G.728 codec [117]. The G.728 filter order of 50 was able to cater for long-term periodicities of up to 6.25 ms, corresponding to pitch frequencies down to 160 Hz at a sampling rate of 8 kHz without an LTP, allowing better reconstruction for female speakers. The filter order of 32 at a sampling frequency of 16 kHz cannot cater for long-term periodicities. Nonetheless, the authors opted for using no LTP. In contrast to the G.728 codebook of 128 entries, here 1024 entries were used to model the 5-sample excitations.
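The figures quoted above follow from simple arithmetic on the predictor order, the sampling rate and the codebook size, which can be checked with a short sketch. The function names are illustrative assumptions. The bitrate estimate additionally assumes that the excitation codebook index is the only parameter transmitted per excitation vector, which is plausible for a backward-adaptive scheme but is not stated explicitly in the text.

```python
import math


def predictor_memory_ms(order, sampling_rate_hz):
    """Time span in milliseconds covered by an LPC predictor of the given order."""
    return 1000.0 * order / sampling_rate_hz


def lowest_pitch_hz(order, sampling_rate_hz):
    """Lowest pitch frequency whose period still fits in the predictor memory."""
    return sampling_rate_hz / order


def excitation_bitrate_bps(codebook_entries, vector_length, sampling_rate_hz):
    """Bitrate if one codebook index is sent per excitation vector (assumption)."""
    return math.log2(codebook_entries) * sampling_rate_hz / vector_length


g728_span = predictor_memory_ms(50, 8000)    # G.728: order 50 at 8 kHz
g728_pitch = lowest_pitch_hz(50, 8000)
wb_span = predictor_memory_ms(32, 16000)     # wideband codec: order 32 at 16 kHz
wb_pitch = lowest_pitch_hz(32, 16000)
wb_rate = excitation_bitrate_bps(1024, 5, 16000)
```

The order-50 predictor at 8 kHz spans 6.25 ms (pitch down to 160 Hz), whereas the order-32 predictor at 16 kHz spans only 2 ms (pitch down to 500 Hz), which is why it cannot capture typical long-term periodicities. Under the stated assumption, 1024 entries per 5-sample vector at 16 kHz account for the full 32 kbps.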

In a contribution by Black, Kondoz and Evans [135] the backward-adaptive principle was retained for the sake of low delay, but it was combined with a split-band approach. The low-band was encoded by a backward-adaptive CELP codec using a 10th-order LPC filter updated over 14 samples at 8 kHz, i.e. 1.75 ms, and the authors argued that it was necessary to incorporate a forward adaptive LTP in order to counteract the potentially damaging error feedback effect of the backward-adaptive LPC analysis. The upper-band typically contains a less structured, noise-like signal, which has a slowly varying dynamic range. Black et al here proposed to use a 6th-order forward-adaptive predictor updated over a 56-sample interval, which is quadrupled in comparison to the low-band. Backward-adaptive prediction would be unsuitable for this less accurately quantised band, which would precipitate the effect of quantization errors in future segments.

The prestigious speech coding group at Sherbrooke University [136, 137, 138] proposed a range of ACELP-based codecs, since Laflamme, Adoul, Salami et al argued that ACELP codecs are amenable to wideband coding, when employing vast codebooks in conjunction with a reduced-complexity focused codebook search strategy using a number of encapsulated search loops. This technique facilitates searching only a fraction of a large codebook, while achieving a similar performance to that of a full search. Suffice to say here that this technique was proposed by the authors also for the ITU G.729 8 kbps low-delay codec using a 15-bit ACELP codebook and five encapsulated loops [121, 122].

Here we conclude our discussion of speech source codecs and briefly classify a range of
video codecs suitable for wireless videophony and other wireless visual

6.1 Motivation and Background

Motivated by the proliferation of wireless multimedia services [139, 140], a plethora of
video codec schemes have been proposed for various applications [141]-[156], but perhaps the
most significant advances in the field are hallmarked by the MPEG4 initiative [70]. The
design of videophone schemes centres around the best compromise amongst a number of
inherently contradictory specifications, such as video quality, bit rate, implementational
complexity, robustness against channel errors, coding delay, bitrate fluctuation and the
associated buffer length requirement, etc. Many of these aspects have been treated in a
number of established monographs by Netravali and Haskell [143], Jain [191], Jayant and Noll
[85] as well as Gersho and Gray [149]. A plethora of video codecs have been proposed in the
excellent special issues edited by Tzou, Mussmann and Aigawa [157], by Hubing [158] and
Girod et al [159] for a range of bitrates and applications, but the individual contributions
by a number of renowned authors are too numerous to review. Khansari, Jalali, Dubois and
Mermelstein [166] as well as Mann Pelz [180] reported promising results on adopting the
H.261 codec for wireless applications by invoking powerful signal processing and
error-control techniques in order to remedy the inherent source coding problems due to
stretching its application domain to hostile wireless environments. Farber, Steinbach and
Girod [167]-[170] also contributed substantially towards advancing the state of the art in
the context of the H.263 codec as well as in motion compensation [168, 169], as did
Eryurtlu, A.H. Sadka and A.M. Kondoz [174, 175]. Further important contributions in the
field were due to Chen et al [181], Illgner and Lappe [182], Zhang [183], Ibaraki, Fujimoto
and Nakano [184], Watanabe et al [185] etc, the MPEG4 consortium's endeavours [71] and the
efforts of the mobile audio-video terminal (MAVT) consortium. Vector quantisation based
schemes were advocated by Ramamurthy and Gersho [149] as well as by Torres and Huguet [150].
A major feature topic of the European Community's Fourth Framework Programme [44, 45] on
Advanced Communications Technologies and Services (ACTS) is video communications over a
range of wireless and fixed links.

In this Section we initially focus our attention on the design and performance evaluation of
wireless video telephone systems, suitable for the robust transmission of Quarter Common
Intermediate Format (QCIF) sequences over conventional mobile radio links, such as the
Pan-European GSM system [40], the American IS-54 [41] and IS-95 [43] systems as well as the
Japanese PDC system [42]. In contrast to existing standard codecs, such as the ITU H.261
scheme and the MPEG1 [80], MPEG2 [81] and MPEG4 [70] arrangements, our proposed video
codec's fixed, but arbitrarily programmable bi-

bi-Search Area

n-1

f

n

f

b x b

p x q

MCER

Position of the best match

Figure 5. Simplied schematic of motioncompensation

c

J.Streit[186],1996

systems, which are likely to vary their bitrate in response to various propagation and
teletraffic conditions. We will conclude the Section with a brief overview of the ITU H.263
standard video codec, which is a flexible scheme, suitable for a range of multimedia visual
applications at various bitrates and video resolutions.

6.2 Motion Compensation

The ultimate goal of low-rate image coding is to remove redundancy in both the spatial and
temporal domains and thereby reduce the required transmission bit rate. The temporal
correlation between successive image frames is typically removed using block-based motion
compensation, where each block to be encoded is assumed to be a motion-translated version of
a block in the previous locally decoded frame.

The vector of motion translation or motion vector (MV) is typically found with the help of
correlation techniques, as seen in Figure 5. Specifically, a legitimate motion translation
region or search scope is stipulated within the previous locally decoded frame, the block to
be encoded is slid over this region according to a certain algorithm and the location of
highest correlation is deemed to be the destination of the motion translation. Motion
compensation (MC) is then carried out by subtracting the appropriately motion-translated
previous decoded block from the one to be encoded in order to generate the so-called motion
compensated error residual (MCER). Clearly, the image is decomposed into motion translation
and MCER, and both components have to be encoded and transmitted to the decoder for image
reconstruction. The motion compensation removes some of the temporal redundancy and the
variance of the MCER becomes much lower than that of the original image, which ensures bit
rate economy.
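The block-matching search described above may be sketched as follows. This is a minimal illustration rather than the codec of [188]: the function name `block_matching`, the search range and the use of the minimum sum of squared errors as the matching criterion (instead of an explicit correlation measure) are all assumptions made for brevity.

```python
def block_matching(prev, cur, top, left, b=8, search=2):
    """Full search for the b x b block of `cur` at (top, left) within `prev`.

    Assumption: the best match minimises the sum of squared errors (SSE).
    Returns the motion vector (dy, dx) and the b x b MCER block.
    """
    rows, cols = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r0, c0 = top + dy, left + dx
            if r0 < 0 or c0 < 0 or r0 + b > rows or c0 + b > cols:
                continue  # candidate block would leave the frame
            sse = sum((cur[top + i][left + j] - prev[r0 + i][c0 + j]) ** 2
                      for i in range(b) for j in range(b))
            if best is None or sse < best[0]:
                best = (sse, dy, dx)
    _, dy, dx = best
    mcer = [[cur[top + i][left + j] - prev[top + dy + i][left + dx + j]
             for j in range(b)] for i in range(b)]
    return (dy, dx), mcer


# Toy frames: the current frame is the previous one translated by one row and two columns.
prev = [[r * 16 + c for c in range(16)] for r in range(16)]
cur = [[prev[r - 1][c + 2] if 0 <= r - 1 < 16 and 0 <= c + 2 < 16 else 0
        for c in range(16)] for r in range(16)]
mv, mcer = block_matching(prev, cur, top=4, left=4, b=4, search=2)
print(mv)  # (-1, 2)
```

For this purely translated toy frame the MCER block is all-zero, which illustrates why the residual is cheaper to encode than the original block.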

Figure 6. Simple video codec schematic

145] (SBC), wavelet coding [146], Discrete Cosine Transformation [191, 80, 81, 188] (DCT),
vector quantisation (VQ) [149]-[151] or Quad-tree [147, 148, 155, 189] (QT) coding. Some of
these techniques will be highlighted in the forthcoming Subsections.

When a low codec complexity and low bit rate are required, the motion compensation technique
described above can be replaced by simple frame-differencing. In frame-differencing the
whole of the previous locally decoded image frame is subtracted from the one to be encoded
without the need for the above correlation-based motion prediction, which may become very
computationally intensive for high-resolution, high-quality video portraying high-dynamic
scenes. Such a simple video codec schematic based on frame-differencing is shown in
Figure 6. Although the MCER residual variance remains somewhat higher for frame-differencing
than in case of full motion compensation, there is no pattern-matching search, which reduces
the complexity, and no MVs have to be encoded, which may reduce the overall bit rate.
Observe in Figure 6 that after frame-differencing the encoded MCER is conveyed to the
transceiver and also locally decoded. This is necessary to be able to generate the locally
reconstructed video signal, which is invoked by the encoder in subsequent MC steps. The
encoder uses the locally reconstructed, rather than the original input video frames, since
the original frames are not available at the decoder and relying on them would result in
misalignment between the encoder and decoder. This local reconstruction operation is carried
out by the adder in the Figure, superimposing the decoded MCER on the previous locally
decoded video frame. The operations are similar if full MC is used. Practical codecs, such
as for example the ITU H.263 scheme, often combine the so-called inter-frame and intra-frame
coding techniques on a block-by-block basis, where MC is employed only if it is deemed
advantageous in MCER reduction terms.
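The role of the local reconstruction loop can be illustrated by a small frame-differencing sketch. The coarse uniform quantiser below is merely a stand-in for the MCER coders discussed later, and the names `encode_sequence` and `decode_sequence` as well as the step size are hypothetical.

```python
def encode_sequence(frames, step=8):
    """Frame-differencing encoder with a local decoder in the loop."""
    height, width = len(frames[0]), len(frames[0][0])
    recon = [[0] * width for _ in range(height)]  # reference shared with the decoder
    stream, local_recons = [], []
    for frame in frames:
        # The residual is formed against the locally decoded frame, not the original one.
        residual = [[frame[r][c] - recon[r][c] for c in range(width)] for r in range(height)]
        indices = [[round(v / step) for v in row] for row in residual]
        stream.append(indices)
        # The adder: superimpose the decoded residual on the previous reconstruction.
        recon = [[recon[r][c] + indices[r][c] * step for c in range(width)]
                 for r in range(height)]
        local_recons.append(recon)
    return stream, local_recons


def decode_sequence(stream, step=8):
    """Remote decoder: repeats exactly the encoder's local reconstruction."""
    height, width = len(stream[0]), len(stream[0][0])
    recon = [[0] * width for _ in range(height)]
    decoded = []
    for indices in stream:
        recon = [[recon[r][c] + indices[r][c] * step for c in range(width)]
                 for r in range(height)]
        decoded.append(recon)
    return decoded


frames = [[[100, 104], [96, 100]], [[103, 110], [95, 100]]]
stream, local_recons = encode_sequence(frames)
print(decode_sequence(stream) == local_recons)  # True
```

Because both ends add the same decoded residuals to the same reference, the quantisation error does not accumulate from frame to frame in the absence of channel errors.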

tics, where large sections of the frame difference signal are 'flat', characterised by low
pixel magnitude values, while the motion contours, where the frame differencing has failed
to predict the current pixels on the basis of the previous locally decoded frame, are
represented by larger values, as seen at the centre of Figure 9. Consequently, efficient
MCER residual coding algorithms must be able to represent such textured MCER patterns
adequately, a topic to be addressed in the forthcoming subsections. Let us initially
consider a bandwidth-efficient cost-gain quantised DCT-based codec [188].

6.3 DCT-based Video Codec

Our DCT-based video codec's outline is depicted in Figure 7. The DCT [191] has been popular
in video compression standards [80, 81], since it exhibits a so-called energy compaction
property, implying that upon transforming a correlated or predictable signal to the spatial
frequency domain most of its energy will be compacted to a few high-energy, low-frequency
coefficients. This is a consequence of the Wiener-Khintchine theorem, stating that the power
spectral density (PSD) and the autocorrelation function (ACF) are Fourier transform pairs.
Hence the flat ACF of a predictable, slowly-varying signal implies a compact low-pass type
PSD, which is amenable to compression, since in the spatial frequency domain a lower number
of coefficients has to be transmitted than in the temporal domain. It is important to note
that the MC often removes most of the redundancy from the correlated temporal domain video
frame and hence the DCT of the MCER may even result in an expanded spatial frequency domain
representation, which can be counteracted for example by adaptive bit allocation schemes.
Strobach [147] proposed quad-tree coding in order to encode the MCER and mitigate this
problem. Alternative frequency domain solutions include subband coding [144, 145] (SBC) or
wavelet coding [146], which facilitate a flexible control over the allocation of bits in the
spatial frequency domain. The MPEG standard codecs [80, 81] and the H.261, H.263 codecs scan
and entropy code the DCT coefficients and also allow direct encoding of the more correlated
video signal on a block-by-block basis. Vector quantisation (VQ) [149]-[151] can be carried
out both in the frequency and the time domains, but a persistent deficiency is its
difficulty in handling sharp edges adequately.

Returning to the DCT principle, our proposed DCT-based codec was designed to achieve a
time-invariant compression ratio associated with a fixed but programmable encoded video rate
of 5-13 kbps. The codec's operation is initialised in the intra-frame mode, but once it has
switched to the inter-frame mode, any further mode switches are optional and only required
if a drastic scene change occurs.


Figure 7. DCT-codec schematic. © IEEE, Hanzo & Streit [188], 1995
(Block labels: Motion Prediction, MV Selection and Gain Scaling, Motion Compensation, Classified DCT, DCT Selection and Gain Control, Quantisation, Table of Classified DCT Quantisers, Inverse DCT, Inverse Motion Compensation, Previous Local Reconstructed Frame, Local Reconstructed Frame; outputs: MV to Rec., Video to Rec.)

Figure 8. PSNR versus frame index performance at various bitrates for the 'Miss America' sequence. © IEEE, Hanzo & Streit [188], 1995
(Axes: frame index, 0-100, versus PSNR in dB, 20-36; curves for 10 kb/s, 8 kb/s and 5 kb/s.)

In the intra-frame mode the encoder transmits the coarsely quantised block averages for the
current frame, which provides a low-resolution initial frame required for the operation of
the inter-frame codec at both the commencement and during later stages of communications in
order to prevent encoder/decoder misalignment. For 176x144 pixel ITU standard Quarter Common
Intermediate Format (QCIF) images in a specific scenario [188] we limited the number of
video encoding bits per frame to 1136, corresponding to a bitrate of 11.36 kbps at
10 frames/s.

In the motion compensation 8x8 blocks are used. At the commencement of the encoding
procedure the motion compensation (MC) scheme determines a motion vector (MV) for each of
the 8x8 blocks using full search. The center of each block and hence a total of 4 bits are
required for the encoding of 16 possible positions for each MV. Before the actual motion
compensation takes place, the codec tentatively determines the potential benefit of the
compensation in terms of motion compensated error energy reduction. Then the codec selects
those blocks as 'motion-active' whose gain exceeds a certain threshold. This method of
classifying the blocks as motion-active and motion-passive results in an active/passive
table, which consists of a one-bit flag for each block, marking it as passive or active.
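The gain-based classification of blocks can be sketched as below. The definition of the gain as the difference between the residual energy without and with motion compensation is an assumption made for illustration, as are the function names and the threshold value; the exact cost-gain measure is detailed in [188].

```python
def energy(block):
    """Sum of squared sample values of a residual block."""
    return sum(v * v for row in block for v in row)


def motion_activity_table(fd_blocks, mc_blocks, threshold):
    """One-bit flag per block: 1 = motion-active, 0 = motion-passive.

    fd_blocks: residual blocks obtained by plain frame-differencing
    mc_blocks: residual blocks obtained after motion compensation
    Assumed gain: reduction of the residual energy due to motion compensation.
    """
    table = []
    for fd, mc in zip(fd_blocks, mc_blocks):
        gain = energy(fd) - energy(mc)
        table.append(1 if gain > threshold else 0)
    return table


fd_blocks = [[[10, -8], [9, -7]], [[1, 0], [0, 1]], [[5, 5], [5, 5]]]
mc_blocks = [[[1, 0], [0, -1]], [[1, 0], [0, 1]], [[4, 5], [5, 4]]]
print(motion_activity_table(fd_blocks, mc_blocks, threshold=50))  # [1, 0, 0]
```

Only the first block yields an energy reduction above the threshold, hence only its MV would be transmitted in this toy example.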

Pursuing a similar approach, gain control is also applied to the Discrete Cosine Transform
(DCT) based compression. Every block is DCT transformed and quantised. In order to take
account of the non-stationary nature of the motion compensated error residual (MCER) and its
time-variant frequency-domain distribution, four different sets of DCT quantisers were
designed. The quantisation distortion associated with each quantiser is computed in order to
be able to choose the best one. Ten bits are allocated for each quantiser, each of which is
a trained Max-Lloyd quantiser catering for a specific frequency-domain energy distribution
class. All DCT blocks whose coding gain exceeds a certain threshold are marked as
DCT-active, resulting in a similar active/passive table as for the motion vectors. For this
second table we apply the same run-length compression technique as above. Again, if the
number of bits required for the encoding of the DCT-active blocks exceeds half of the
maximum allowable number, blocks around the fringes of the image are considered DCT-passive,
rather than those in the central eye and lip sections. If, however, the active DCT
coefficients and activity table do not fill up the fixed-length transmission burst, the
threshold for active DCT blocks is lowered and all tables are recomputed.

The bit allocation scheme was designed to deliver 1136 bits per frame, which is summarised
in Table 1. The encoded bitstream begins with a 22 bit frame alignment word (FAW). This is
necessary to assist the video decoder's operation in order to resume synchronous operation
after loss of frame synchronisation over hostile fading channels. The partial intra-frame
update refreshes only 22 out of 396 blocks every frame. Therefore every 18 frames or 1.8
seconds the update refreshes the same blocks. This periodicity is signalled to the decoder
by transmitting the inverted FAW. A MV is stored using 13 bits, where 9 bits are required to
identify one of the 396 block indexes using the enumerative method and 4 bits for encoding
the 16 possible combinations of the X and Y displacements. The 8x8 DCT-compressed blocks use
a total of 21 bits, again 9 for the block index, 10 for the DCT coefficient quantisers, and
2 bits to indicate which of the four quantisers has been applied. The total number of bits
becomes 30 x (13 + 21) + 22 x 4 + 22 + 6 = 1136, where six dummy bits were added in order to
pack-

FAW   Intra-frame   MV       MV       DCT      DCT       Padding   Total
      update        index             index
22    22 x 4        30 x 9   30 x 4   30 x 9   30 x 12   6         1136

Table 1. Bit Allocation Table per QCIF Video Frame for the Fixed-rate DCT Codec. © IEEE, Hanzo & Streit [188] 199

block codec used.
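The bit budget quoted above can be re-derived in a few lines. The sketch assumes 30 active MVs and 30 active DCT blocks per frame as well as 4 bits per intra-frame updated block, as suggested by the entries of Table 1; the variable names are illustrative.

```python
import math

blocks_per_frame = (176 // 8) * (144 // 8)                 # 8x8 blocks in a QCIF frame
index_bits = math.ceil(math.log2(blocks_per_frame))        # bits needed to address one block

faw_bits = 22
intra_update_bits = 22 * 4          # assumed: 22 refreshed blocks, 4 bits each
mv_bits = index_bits + 4            # block index + 16 displacement combinations
dct_bits = index_bits + 10 + 2      # block index + quantised coefficients + quantiser choice
active_blocks = 30                  # assumed number of active MVs and of active DCT blocks
padding_bits = 6

total = active_blocks * (mv_bits + dct_bits) + intra_update_bits + faw_bits + padding_bits
refresh_period_frames = blocks_per_frame // 22

print(blocks_per_frame, index_bits)    # 396 9
print(total, total * 10 / 1000)        # 1136 11.36
print(refresh_period_frames)           # 18
```

At 10 frames/s the 18-frame refresh period corresponds to the 1.8 s quoted in the text.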

The encoded parameters are transmitted to the decoder and also locally decoded in order to
be used in future motion predictions. The video codec's Peak SNR (PSNR) versus frame index
performance is shown in Figure 8, where the PSNR is defined as the conventional
Signal-to-Noise Ratio (SNR), except that instead of the actual video signal power a video
pixel value of 255 is assumed, yielding a pixel power of $255^2$ for all pixel positions
across the video frame. Since 255 is the highest possible value for an 8-bit pixel
representation, the PSNR is typically higher than the conventional SNR. The codec proposed
was subjected to bit-sensitivity analysis and a Quadrature Amplitude Modulation [68] (QAM)
based source-sensitivity matched transceiver was designed in order to transmit the video
stream over wireless channels. The interested reader is referred to reference [188] for
further details. Having described the principles of DCT-based video coding let us now
consider QT coding of the MCER [189].
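The PSNR definition given above translates directly into code. The following sketch uses hypothetical function names and a toy 2x2 frame, and it also evaluates the conventional SNR for comparison.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized frames."""
    errors = [(o - r) ** 2
              for row_o, row_r in zip(original, reconstructed)
              for o, r in zip(row_o, row_r)]
    return sum(errors) / len(errors)


def psnr_db(original, reconstructed, peak=255):
    """PSNR: the signal power is replaced by the peak pixel power peak**2."""
    err = mse(original, reconstructed)
    return math.inf if err == 0 else 10 * math.log10(peak * peak / err)


def snr_db(original, reconstructed):
    """Conventional SNR using the actual mean signal power."""
    power = sum(o * o for row in original for o in row) / sum(len(row) for row in original)
    err = mse(original, reconstructed)
    return math.inf if err == 0 else 10 * math.log10(power / err)


original = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 100]]
print(psnr_db(original, decoded) > snr_db(original, decoded))  # True
```

For this example the mean squared error is 0.5, so the PSNR exceeds the SNR by the ratio of the peak power $255^2$ to the actual pixel power $100^2$, i.e. by about 8.1 dB.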

6.4 Quad-tree Structured Coding

The proposed QT-codec shares the structure of the previous DCT-based scheme portrayed in
Figure 7, but employs QT-coding of the MCER. Quad-trees (QT) represent a sub-class of the
so-called region growing techniques, where the image, in our case the MCER generated by the
MC scheme, is described with the help of variable-size sectors characterized by similar
features, in this case, similar grey levels. Explicitly, the MCER is described in terms of
two sets of parameters, the structure of similar regions and their grey levels. Note that
the information characteristic of the QT structure is potentially much more sensitive to bit
errors than the grey level coding bits.

Before QT decomposition takes place, the frame difference signal is divided into 16x16-pixel
blocks perfectly tiling the original difference frame. Creating the QT regions is a
recursive operation. Considering each individual pixel, two or more neighbours are merged
together if a certain merging criterion is satisfied. This criterion may be, for instance, a
similar grey level. This merging procedure is repeated until no more regions satisfy the
merging criterion, hence no more merging is possible. Similarly, the QT regions can be
obtained in a top-down approach, dividing the MCER into a number of sections, if the
sections do not satisfy the similarity criterion, and continuing until the pixel level is
reached and no further splits are possible.

The quad-tree approach is one possible implementation of the so-called region growing
techniques. This process can be observed in Figure 9. For a rectangular region an
algorithmically attractive implementation is, when commencing at the pixel level, that four
quadrants of a square are merged together, if the matching criterion is met. The grey levels
of the quadrants of a square are represented by $m_1 \ldots m_4$ and their mean is computed
according to $m = (m_1 + m_2 + m_3 + m_4)/4$. If the absolute difference of all four pixels
and the mean grey level is less than the system parameter $\sigma$, then these pixels
satisfy the merging criterion. Explicitly, a simple merging criterion can be formulated as
follows:

$$(|m - m_1| < \sigma) \cap (|m - m_2| < \sigma) \cap (|m - m_3| < \sigma) \cap (|m - m_4| < \sigma) = \mathrm{True}, \qquad (1)$$

where $\cap$ represents the logical AND operation.
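The merging criterion of Equation (1) may be expressed compactly in code. This is a hedged sketch in which `try_merge` and the argument name `sigma`, standing for the threshold of the criterion, are hypothetical names.

```python
def try_merge(m1, m2, m3, m4, sigma):
    """Return the mean grey level if the four quadrants may be merged, else None.

    Implements Equation (1): every quadrant must differ from the mean
    by less than the threshold `sigma`.
    """
    m = (m1 + m2 + m3 + m4) / 4
    if all(abs(m - mi) < sigma for mi in (m1, m2, m3, m4)):
        return m
    return None


print(try_merge(10, 12, 11, 11, sigma=2))  # 11.0  (merged)
print(try_merge(10, 12, 11, 11, sigma=1))  # None  (a more stringent threshold prevents merging)
print(try_merge(10, 20, 10, 10, sigma=2))  # None  (one quadrant deviates too much)
```

The second call shows the effect of tightening the threshold: the same four quadrants are no longer merged, so more regions and hence more bits would be required.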

It is expected that if the system parameter $\sigma$ is reduced, the matching criterion
becomes more stringent and hence less merging takes place, which is likely to increase the
required encoding rate at a concomitant improvement of the MCER's representation quality. In
contrast, an increased $\sigma$ value is expected to allow more merging to take place and
hence reduce the bit rate, as we will show in our results Section.

If the merging criterion is satisfied, the mean grey level m becomes the grey level of the
merged quadrant in the next generation, and so on. At this stage it is important to note
that the quality control threshold $\sigma$ does not need to be known to the QT decoder.
Therefore the image representation quality can be rendered position-dependent within the
frame being processed, which allows weighting to be applied to important image sections,
such as the eyes and lips, without increasing the complexity of the decoder or the
transmission rate.

Pursuing the top-down QT decomposition approach, the frame difference signal constitutes a
so-called node in the QT. Splitting this node gives rise to four further nodes, which are
classified on the basis of the 'similarity criterion'. Specifically, if all the pixels at
this level of the QT differ from the mean m by less than the threshold $\sigma$, then they
are considered to be a so-called 'leaf node' in the QT. Hence they do not have to be
subjected to further 'similarity tests'; they can be represented simply by the mean value m.

Figure 10. Enhanced sample codebook with 128 8x8 vectors. © Streit and Hanzo [190], 1997

If, however, the pixels constituting the current node differ from the mean by more than the
threshold, they cannot be adequately represented by their mean m and thus they must be
further split, until the threshold condition is met. This repetitive splitting process is
continued until there are no more nodes to split, since all the leaf nodes satisfy the
threshold criterion, as shown in Figure 9. Consequently the QT structure describes the
contours of similar grey levels in the frame difference signal.
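The top-down decomposition can be sketched recursively as follows. The tuple-based tree representation, the function names and the threshold value are assumptions made for illustration, and the block side is assumed to be a power of two; the bit allocation and optimum splitting techniques of [189] are not modelled.

```python
def quadtree(block, top=0, left=0, size=None, threshold=4.0):
    """Top-down quad-tree decomposition of a square block.

    Returns ("leaf", mean) if all pixels differ from the mean by less than
    `threshold`, otherwise ("split", [four child nodes]).
    """
    if size is None:
        size = len(block)
    pixels = [block[top + i][left + j] for i in range(size) for j in range(size)]
    mean = sum(pixels) / len(pixels)
    if size == 1 or all(abs(p - mean) < threshold for p in pixels):
        return ("leaf", mean)
    half = size // 2
    children = [quadtree(block, top + dy, left + dx, half, threshold)
                for dy in (0, half) for dx in (0, half)]
    return ("split", children)


def count_leaves(node):
    """Number of leaf nodes, i.e. of grey levels to be transmitted."""
    kind, payload = node
    return 1 if kind == "leaf" else sum(count_leaves(child) for child in payload)


flat = [[0] * 4 for _ in range(4)]
edge = [row[:] for row in flat]
edge[0][0] = 100                         # a single 'motion contour' pixel
print(count_leaves(quadtree(flat)))      # 1
print(count_leaves(quadtree(edge)))      # 7
```

The flat block is represented by a single leaf, whereas the block containing one large-magnitude pixel is split down to the pixel level only in the affected quadrant.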

In order to be able to reproduce the encoded image, not only the grey levels of the leaf
nodes, but also the QT structure must be efficiently encoded and communicated to the
decoder. Fortunately, the QT structure can be efficiently described with the help of a
variable-length code. At the commencement of image communications a low-resolution version
of the first image frame is encoded and transmitted to the decoder in order to assist in its
operation. Then the MCER signal is computed, which is subjected to QT coding (QTC) before
transmission to the decoder. Again, the schematic of the QT codec obeys the structure of
Figure 6. The QT coded MCER is locally decoded and added to the previous locally decoded
frame and stored in the frame buffer for the duration of one frame in order to generate the
next block estimates for the MC operation. A range of techniques related to the optimum QT
splitting and bit allocation were suggested in [189], where details of the source-matched
video transceiver can also be found. Adequate video quality was achieved for the Miss
America sequence for a bitrate of 11.36 kbps, when using 10 frames/s scanned QCIF images.

Without aiming for an in-depth treatment we briefly allude to the concept of vector
quantised video codecs, where the MCER of an 8x8 pixel block is represented by the best
matching entry of the two-dimensional codebook shown in Figure 10. This principle also
allowed
Reference [187], where the full transceiver performance over fading channels is also
characterised. The above-mentioned range of fixed-rate video codecs is compared in terms of
error resilience and video quality in Reference [190]. In References [187]-[189] a range of
flexible reconfigurable multi-level transceivers were designed for the transmission of the
VQ-, DCT- and QT-coded video streams by allocating an additional physical speech channel for
video telephony. However, for reasons of space economy these results were not included here.
Similar video PSNR versus channel SNR results are here provided using the ITU H.263 video
codec and a reconfigurable transceiver in order to characterise the expected video
performance in Figures 27 and 28.

2

In closing we note again that the literature of video compression is very rich [139]-[156]
and recent developments led to the definition of the MPEG1, MPEG2, H.261 and H.263
standards. Although these codecs rely on vulnerable variable-length coding techniques, work
is also under way towards contriving more robust coding algorithms, such as those to be
incorporated in the forthcoming MPEG4 scheme [71, 70]. In the next Subsection we briefly
highlight the features of the standard H.263 scheme, which is an error-sensitive,
variable-rate scheme, but achieves a very high compression ratio and hence to date it is the
best existing standardised video codec. We will also propose appropriate transmission
techniques to support its operation in a wireless videophone scheme.

6.5 The H.263 ITU Codec

The H.263 codec was detailed in References [193, 194], while a number of transmission
schemes designed for accommodating its rather error-sensitive bit-stream were proposed in
[195]-[177]. As an illustrative example, in Table 2 we summarised the various video
resolutions supported by the H.261 and H.263 ITU codecs, in order to demonstrate their
flexibility [192]. Their uncompressed bitrates at frame scanning rates of both 10 and 30
frames/sec for both grey and colour video are also listed. The mature H.261 standard defined
two different picture resolutions, namely QCIF and CIF, while the H.263 codec has the
ability to support five different resolutions. All H.263 decoders must be able to operate in
sub-QCIF (SQCIF) and QCIF modes and optionally support CIF, 4CIF and 16CIF formats.
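The uncompressed bitrates of the five H.263 formats can be reproduced with the short sketch below. It assumes 8 bits per pel for grey-scale video and an average of 12 bits per pel for colour video, which is consistent with the factor of 1.5 between the grey and colour columns of Table 2; the names used are illustrative.

```python
FORMATS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def uncompressed_mbps(width, height, frames_per_s, colour=False):
    """Uncompressed bitrate in Mbit/s (assumed: 8 bits/pel grey, 12 bits/pel colour)."""
    bits_per_pel = 12 if colour else 8
    return width * height * bits_per_pel * frames_per_s / 1e6


for name, (w, h) in FORMATS.items():
    print(name, w * h,
          round(uncompressed_mbps(w, h, 10), 2),
          round(uncompressed_mbps(w, h, 10, colour=True), 2),
          round(uncompressed_mbps(w, h, 30), 2),
          round(uncompressed_mbps(w, h, 30, colour=True), 2))
```

The computed values agree with the corresponding rows of Table 2 to within rounding.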

The H.261 and H.263 codecs share the simplified schematic of Figure 11, which operates under
the instructions of the coding control block, selecting the required inter/intra frame mode,
the quantisation and bit allocation scheme etc. DCT is invoked to compress either the
original or the MCER blocks and the encoded video signal is also locally decoded and stored
in the frame memory

2 The exposition of this Section can be augmented by http://www-

Hanzo, [189], 1996

Video      Luminance    No. of pels   Uncompressed bitrate (Mbit/s)
format     dimensions   per frame     10 frame/s            30 frame/s
                                      Grey      Colour      Grey      Colour
SQCIF      128x96       12288         0.983     1.47        2.95      4.42
QCIF       176x144      25344         2.03      3.04        6.09      9.12
CIF        352x288      101376        8.1       12.2        24.3      36.5
4CIF       704x576      405504        32.4      48.7        97.3      146.0
16CIF      1408x1152    1622016       129.8     194.6       389.3     583.9
CCIR601    720x480      345600        27.65     41.472      82.944    124.416
HDTV1440   1440x960     1382400       110.592   165.888     331.776   497.664
HDTV       1920x1080    2073600       165.9     248.832     497.664   746.496

SQCIF: Sub-Quarter Common Intermediate Format
QCIF: Quarter Common Intermediate Format
CIF: Common Intermediate Format
HDTV: High Definition Television

Table 2. Various video formats and their uncompressed bitrate. Upon using compression 10-100 times lower average

Figure 11. Simplified H.261/H.263 schematic. © Cherriman [192], 1995
(Block labels: DCT, Quant, inverse Quant, inverse DCT, Loop Filter, Motion Estimation, Motion Compensation, Frame Memory, Coding Control, Multiplexers, Video Multiplex Coder; signals: Current Frame, Prediction Error, Quantised Coefficients, Quantiser Index, Motion Vector, INTRA/INTER mode.)

Figure 12. Image quality (PSNR) versus coded bitrate, for H.263 "Miss America" simulations at 10 and 30 frames/s using SQCIF, QCIF and CIF sequences. © Cherriman [192], 1995
(Axes: bitrate in Kbit/s on a logarithmic scale versus average PSNR in dB, 30-50; curves for Miss America at 30 fps and 10 fps in CIF, QCIF and SQCIF format.)

in order to be used in future MC steps. All encoded information is multiplexed for
transmission by the video multiplex coder. The codec's PSNR versus encoded bitrate
performance is portrayed in Figure 12 for "Miss America" simulations at 10 and 30 frames/s
using SQCIF, QCIF and CIF sequences [192]. Observe in the Figure that the codec guarantees
near-linear rate-scalability over a wide operating range, which is partly explained by the
extensive employment of entropy coding schemes. The performance of a complete adaptive
videophone system will be portrayed after considering the associated wireless transmission
aspects.

Here we curtail our discussions on video codecs and provide some notes on another aspect of
multimedia com-

7 GRAPHICAL SOURCE CODING

7.1 Background

Telewriting is a multimedia telecommunication service enabling the bandwidth-efficient
transmission of handwritten text and line graphics through fixed and wireless communication
networks [196]-[201]. Differential chain coding (DCC) has been successfully used for
graphical communications over E-mail networks [196] or teletext systems [199], where bitrate
economy is achieved by exploiting the correlation between successive vectors. References
[197] and [202] also addressed some of the associated communications aspects. A plethora of
further excellent treatises were contributed to the literature of chain coding by R. Prasad
and his colleagues from Delft University [203]-[205].

7.2 Fixed Length Differential Chain Coding [206]

In chain coding (CC) a square-shaped coding ring is slid along the graphical trace from the
current pixel, which is the origin of the legitimate motion vectors, in steps represented by
the vectors portrayed in Figure 13. The bold dots in the Figure represent the next
legitimate pixels during the graphical trace's evolution. In principle the graphical trace
can evolve to any of the surrounding eight pixels and hence a three-bit codeword is required
for lossless coding. Differential chain coding [203] (DCC) exploits that the most likely
direction of stylus movement is a straight extension, corresponding to vector 0, with a
gradually reducing probability of sharp turns corresponding to vectors having higher
indices. Explicitly, we have found that while vector 0 typically has a probability of around
0.5 for a range of graphical source signals, including English and Chinese handwriting, a
Map and a technical Drawing, the relative frequency of vectors ±1 is around 0.2, while
vectors ±2, ±3 have probabilities around 0.05. This suggests that the coding efficiency can
be improved using the principle of entropy coding by allocating shorter codewords to more
likely transitions and longer ones to less likely transitions.

In reference [206] we embarked on exploring the potential of a graphical coding scheme
dispensing with variable length coding, which we refer to as fixed length differential chain
coding (FL-DCC). FL-DCC was contrived in order to comply with the time-variant resolution
and/or bitrate constraints of intelligent adaptive multimode terminals, which can be
re-configured under network control to satisfy the momentarily prevailing tele-traffic,
robustness, quality, etc system requirements. In order to maintain lossless graphics quality
under lightly loaded traffic conditions, the FL-DCC codec can operate at a rate of b = 3
bits/vector, although it has a higher bit rate than DCC. However, since in voice and video
coding typically perceptually unimpaired lossy quantisation

Figure 13. Coding ring
(Diagram labels: differential vectors 0, +1, -1, +2, -2, +3, -3 and +4; codewords (0) and (1) for b=1; codewords (00), (01), (10) and (11) for b=2.)

Figure 14. FL-DCC Coding Syntax. © IEEE, Yuen and Hanzo [206], 1995
(Fields shown: X0, Y0, VC, SV and a sequence of FVs.)

conditions.

Based on our findings as regards the relative frequencies of the various differential
vectors, we decided to evaluate the performance of the FL-DCC codec using the b = 1 and
b = 2 bit/vector lossy schemes. As demonstrated by Figure 13, in the b = 2-bit mode the
transitions to pixels -2, -3, +2, +3 are illegitimate, while vectors 0, +1, -1 and +4 are
legitimate. In order to minimise the effects of transmission errors the Gray codes seen in
Figure 13 were assigned. It will be demonstrated that, due to the low probability of
occurrence of the illegitimate vectors, the associated subjective coding impairment is
minor. Under degrading channel conditions or higher tele-traffic load the FL-DCC coding rate
has to be reduced to b = 1, in order to be able to invoke a less bandwidth-efficient, but
more robust modulation scheme or to generate fewer packets contending for transmission. In
this case only vectors +1 and -1 of Figure 13 are legitimate. The subjective effects of the
associated zig-zag trace will be removed by the decoder, which can detect these
characteristic patterns and replace them by a fitted straight line.

In general terms the size of the coding ring is given by $2n\tau$, where n = 1, 2, 3 ... is
referred to as the order of the ring and $\tau$ is a scaling parameter, characteristic of
the pixel separation distance. Hence the ring shown in Figure 13 is a first order one. The
number of nodes in the ring is M = 8n.

in Figure 14. The beginning of a trace can be marked by a typically 8 bit long pen-down (PD)
code, while the end of trace by a pen-up (PU) code. In order to ensure that these codes are
not emulated by the remaining data, bit stuffing must be invoked whenever such emulation
would occur. We found that in complexity and robustness terms using a 'vector counter' (VC)
constituted a more attractive alternative for our system. The starting coordinates
$X_0, Y_0$ of a trace are directly encoded using for example 10 and 9 bits in case of a
video graphics array (VGA) resolution of 640x480 pixels.

The first vector displacement along the trace is encoded by the best fitting vector defined
by the coding ring as the starting vector (SV). The coding ring is then translated along
this starting vector to determine the next vector. A differential approach is used for the
encoding of all the following vectors along the trace, in that the differences in direction
between the present vector and its predecessor are calculated and these vector differences
are mapped into a set of $2^b$ fixed length b-bit codewords, which we refer to as 'fixed
vectors' (FV).
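The differential principle described above may be sketched for the lossless b = 3 case on a first-order ring. The direction numbering (0 = east, increasing counter-clockwise), the function names and the example trace are assumptions made for illustration; the Gray-coded codeword assignment of Figure 13 and the lossy b = 1 and b = 2 modes are not reproduced here.

```python
# Assumed numbering of the eight ring nodes: 0 = east, increasing counter-clockwise.
RING = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
LEGITIMATE_B2 = {0, +1, -1, +4}   # differential vectors retained in the b = 2 mode


def dcc_encode(points):
    """Return (start point, starting vector, list of differential vectors)."""
    directions = [RING.index((x1 - x0, y1 - y0))
                  for (x0, y0), (x1, y1) in zip(points, points[1:])]
    # Map each change of direction into the range -3 ... +4.
    diffs = [((d - p + 3) % 8) - 3 for p, d in zip(directions, directions[1:])]
    return points[0], directions[0], diffs


def dcc_decode(start, starting_vector, diffs):
    """Rebuild the trace by translating the ring along each decoded vector."""
    x, y = start
    direction = starting_vector
    trace = [(x, y)]
    for diff in [0] + diffs:          # the first step uses the starting vector itself
        direction = (direction + diff) % 8
        dx, dy = RING[direction]
        x, y = x + dx, y + dy
        trace.append((x, y))
    return trace


trace = [(0, 0), (1, 0), (2, 0), (3, 1), (3, 2), (2, 2)]
start, sv, diffs = dcc_encode(trace)
print(sv, diffs)                                    # 0 [0, 1, 1, 2]
print(dcc_decode(start, sv, diffs) == trace)        # True
print([d for d in diffs if d not in LEGITIMATE_B2]) # [2]
```

The last line lists the differential vectors of the example trace that would be illegitimate in the b = 2 mode and would therefore have to be approximated by a legitimate one.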

We designed a wireless 4QAM-based [68] transceiver for the transmission of FL-DCC encoded
graphical source signals and evaluated the system's robustness over Rayleigh-fading channels
with second-order switched-diversity, using automatic repeat requests limited to a maximum
of three transmission attempts (TX3) [207]. Here we refrain from providing PSNR versus
channel SNR curves; for these the interested reader is referred to [207]. However, the
corresponding subjective graphical quality and the associated PSNR values are summarised in
Figure 15 for the channel SNR range of 5-12 dB. Due to its low channel capacity requirement
the FL-DCC coded signal is readily accommodated by the voice signal during passive speech
spurts, when using a voice activity detector (VAD) [40]. Finally, it is noteworthy that the
ITU standardised two different chain coding schemes in the T.150 Recommendation for use over
conventional low-BER fixed telephone lines. However, for wireless channels the proposed
FL-DCC scheme is preferable due to its higher robustness and programmable-rate operation.

We note that an associated multimedia signal manipulation relying on writing tablets is the
field of handwriting recognition for both on-line and off-line applications [208]-[213].
Many of the techniques used are based on hidden Markov models (HMMs), which are widely
employed in the field of speech recognition. Previous research has shown that HMMs are
applicable to both off-line [212] and on-line [208, 210, 213] handwriting recognition
problems. The advantage of such statistical methods is that they can handle variability in
the writing process of an individual, but they are also capable of identifying and capturing
the individual features of the handwritten characters by taking into account dynamic,
pressure


Figure 15. Subjective effects of transmission errors for the b = 1 16QAM, RD, TX3 scheme for PSNR values of (left to right, top to bottom) 49.47, 42.57, 37.42, 32.01, 27.58 and 21.74 dB, respectively. © IEEE, Yuen and Hanzo [206], 1995

function of the distance along the writing trajectory.

Following the above brief excursion to graphical source compression and signal processing, here we turn our attention to wireless communications aspects, commencing with a review of the frequency re-use concept of cellular systems.

8 CELLULAR COMMUNICATIONS BASICS

8.1 The Cellular Concept

A common feature of the previously mentioned mobile radio systems is that communications take place between a stationary base station (BS) and a number of roaming mobile stations (MSs) or portable stations (PSs) [1]-[9]. The BS's and the MS's transmitters are expected to provide a sufficiently high received signal level for the far-end receivers in order to maintain the required communications integrity. This is usually ensured by power control. The geographical area in which this condition is satisfied is termed a traffic cell, which typically has an irregular shape, depending on the prevailing propagation environment determined by terrain and architectural features as well as the local paraphernalia. In theoretical studies a simple hexagonal cell structure is often favoured for its simplicity, where the BSs are located at the centres of the cells.

In an ideal situation the total bandwidth available to

Figure 16. Hexagonal cells and seven-cell clusters

each cell, assuming that there is no energy spilt in the adjacent cell's coverage area. However, since wave propagation cannot be shielded at the cell boundary, PSs near the cell edge would experience approximately the same signal energy within their channel bandwidth from at least two BSs. This phenomenon is called co-channel interference. A remedy to this problem is to divide the total bandwidth $B_{total}$ into frequency slots of $B_{cell} = B_{total}/N$, and assign a mutually exclusive reduced bandwidth of $B_{cell}$ to each traffic cell within a so-called cluster of N cells, as demonstrated in Figure 16 for N = 7. The seven-cell clusters are then tessellated in order to provide contiguous radio coverage. Observe from the figu
