
Transporting Real-time Video over the Internet: Challenges and Approaches

Dapeng Wu, Student Member, IEEE, Yiwei Thomas Hou, Member, IEEE, and Ya-Qin Zhang, Fellow, IEEE

Abstract: Delivering real-time video over the Internet is an important component of many Internet multimedia applications. Transmission of real-time video has bandwidth, delay and loss requirements. However, the current Internet does not offer any quality of service (QoS) guarantees to video transmission. In addition, the heterogeneity of the networks and end-systems makes it difficult to multicast Internet video in an efficient and flexible way. Thus, designing protocols and mechanisms for Internet video transmission poses many challenges. In this paper, we take a holistic approach to these challenges and present solutions from both transport and compression perspectives. With the holistic approach, we design a framework for transporting real-time Internet video, which includes two components, namely, congestion control and error control. Specifically, congestion control consists of rate control, rate adaptive encoding, and rate shaping; error control consists of forward error correction (FEC), retransmission, error-resilience and error concealment. For the design of each component in the framework, we classify approaches and summarize representative research work. We point out that there exists a design space which can be explored by video application designers, and suggest that the synergy of both transport and compression could provide good solutions.

Keywords: Internet, real-time video, congestion control, error control.

I. Introduction

Unicast and multicast delivery of real-time video are important building blocks of many Internet multimedia applications, such as Internet television (see Fig. 1), video conferencing, distance learning, digital libraries, tele-presence, and video-on-demand. Transmission of real-time video has bandwidth, delay and loss requirements. However, there is no quality of service (QoS) guarantee for video transmission over the current Internet. In addition, for video multicast, the heterogeneity of the networks and receivers makes it difficult to achieve bandwidth efficiency and service flexibility. Therefore, there are many challenging issues that need to be addressed in designing protocols and mechanisms for Internet video transmission.

We list the challenging QoS issues as follows.

1. Bandwidth. To achieve acceptable presentation quality, transmission of real-time video typically has a minimum bandwidth requirement (say, 28 Kbps). However, the current Internet does not provide bandwidth reservation to meet such a requirement. Furthermore, since traditional routers typically do not actively participate in congestion control [7], excessive traffic can cause congestion collapse, which can further degrade the throughput of real-time video.

2. Delay. In contrast to data transmission, which is usually not subject to strict delay constraints, real-time video requires bounded end-to-end delay (say, 1 second). That is, every video packet must arrive at the destination in time to be decoded and displayed. This is because real-time video must be played out continuously. If a video packet does not arrive in time, the playout process will pause, which is annoying to human eyes. In other words, a video packet that arrives beyond its time constraint is useless and can be considered lost. Although real-time video requires timely delivery, the current Internet does not offer such a delay guarantee. In particular, congestion in the Internet could incur excessive delay, which exceeds the delay requirement of real-time video.

3. Loss. Loss of packets can potentially make the presentation displeasing to human eyes, or, in some cases, make the presentation impossible. Thus, video applications typically impose some packet loss requirements. Specifically, the packet loss ratio is required to be kept below a threshold (say, 1%) to achieve acceptable visual quality. Although real-time video has a loss requirement, the current Internet does not provide any loss guarantee. In particular, the packet loss ratio could be very high during network congestion, causing severe degradation of video quality.

Manuscript received February 2, 2000; revised August 3, 2000. This paper was recommended by Editor Jim Calder. D. Wu is with Carnegie Mellon University, Dept. of Electrical & Computer Engineering, 5000 Forbes Ave., Pittsburgh, PA 15213, USA. Y. T. Hou is with Fujitsu Laboratories of America, 595 Lawrence Expressway, Sunnyvale, CA 94085, USA. Y.-Q. Zhang is with Microsoft Research China, 5F, Beijing Sigma Center, No. 49 Zhichun Road, Haidian District, Beijing 100080.

Besides the above QoS problems, video multicast applications face another challenge: heterogeneity. Before addressing the heterogeneity problem, we first describe the advantages and disadvantages of unicast and multicast. The unicast delivery of real-time video uses point-to-point transmission, where only one sender and one receiver are involved. In contrast, the multicast delivery of real-time video uses point-to-multipoint transmission, where one sender and multiple receivers are involved. For applications such as video conferencing and Internet television, delivery using multicast can achieve high bandwidth efficiency since the receivers can share links. On the other hand, unicast delivery of such applications is inefficient in terms of bandwidth utilization. An example is given in Fig. 2, where, for unicast, five copies of the video content flow across Link 1 and three copies flow across Link 2 (Fig. 2(a)).

Fig. 1. Internet television uses multicast (point-to-multipoint communication) instead of unicast (point-to-point communication) to deliver real-time video so that users can share the common links to reduce bandwidth usage in the network.

Fig. 2. (a) Unicast video distribution using multiple point-to-point connections. (b) Multicast video distribution using point-to-multipoint transmission. Each panel shows one sender, five receivers, Link 1 and Link 2.

With multicast, there is only one copy of the video content traversing any link in the network (Fig. 2(b)), resulting in substantial bandwidth savings. However, the efficiency of multicast is achieved at the cost of losing the service flexibility of unicast (i.e., in unicast, each receiver can individually negotiate service parameters with the source). Such lack of flexibility in multicast can be problematic in a heterogeneous network environment, which we elaborate as follows.

Heterogeneity. There are two kinds of heterogeneity, namely, network heterogeneity and receiver heterogeneity. Network heterogeneity refers to the sub-networks in the Internet having unevenly distributed resources (e.g., processing, bandwidth, storage and congestion control policies). Network heterogeneity could make different users experience different packet loss/delay characteristics. Receiver heterogeneity means that receivers have different or even varying latency requirements, visual quality requirements, and/or processing capabilities. For example, in live multicast of a lecture, participants who want to ask questions and interact with the lecturer desire stringent real-time constraints on the video while passive listeners may be willing to sacrifice latency for higher video quality.

The heterogeneity of networks and receivers sometimes presents a conflicting dilemma. For example, the receivers in Fig. 2(b) may attempt to request different video quality with different bandwidth. But only one copy of the video content is sent out from the source. As a result, all the receivers have to receive the same video content with the same quality. It is thus a challenge to design a multicast mechanism that not only achieves efficiency in network bandwidth but also meets the various requirements of the receivers.
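The bandwidth side of this trade-off can be made concrete by counting how many copies of a stream cross each link under unicast and under multicast. The sketch below is only an illustration: the tree in `PATHS`, the receiver names, and the access-link names are hypothetical, chosen so that all five receivers sit behind Link 1 and three of them also sit behind Link 2.

```python
from collections import Counter

# Hypothetical topology (an assumption, loosely modeled on Fig. 2):
# each receiver is reached through an ordered list of links from the sender.
PATHS = {
    "R1": ["Link 1", "Link 2", "access 1"],
    "R2": ["Link 1", "Link 2", "access 2"],
    "R3": ["Link 1", "Link 2", "access 3"],
    "R4": ["Link 1", "access 4"],
    "R5": ["Link 1", "access 5"],
}


def unicast_copies(paths):
    """One separate stream per receiver: every link on a path carries a copy."""
    copies = Counter()
    for links in paths.values():
        copies.update(links)
    return copies


def multicast_copies(paths):
    """One shared stream: each link that leads to any receiver carries one copy."""
    return Counter({link: 1 for links in paths.values() for link in links})


if __name__ == "__main__":
    for name, count in (("unicast", unicast_copies(PATHS)),
                        ("multicast", multicast_copies(PATHS))):
        print(name, {link: count[link] for link in ("Link 1", "Link 2")})
```

Under these assumptions the shared links carry several copies with unicast and a single copy with multicast, which is the saving the text describes; the price is that every receiver gets the same stream.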

To address the above technical issues, two general approaches have been proposed. The first approach is network-centric. That is, the routers/switches in the network are required to provide QoS support to guarantee bandwidth, bounded delay, delay jitter, and packet loss for video applications (e.g., integrated services [6], [11], [42], [65] or differentiated services [2], [27], [35]). The second approach is solely end system-based and does not impose any requirements on the network. In particular, the end systems employ control techniques to maximize the video quality without any QoS support from the transport network. In this paper, we focus on the end system-based approach, which is applicable to both the current and future Internet.

Extensive research based on the end system-based approach has been conducted and various solutions have been proposed. This paper aims at giving the reader a big picture of this challenging area and identifying a design space that can be explored by video application designers. We take a holistic approach to present solutions from both transport and compression perspectives. By transport perspective, we refer to the use of control/processing techniques without regard to the specific video semantics. In other words, these control/processing techniques are applicable to generic data. By compression perspective, we mean employing signal processing techniques with consideration of the video semantics on the compression layer. With the holistic approach, we design a framework, which consists of two components, namely, congestion control and error control.

1. Congestion control. Bursty loss and excessive delay have a devastating effect on video presentation quality and they are usually caused by network congestion. Thus, congestion control is required to reduce packet loss and delay. One congestion control mechanism is rate control [5]. Rate control attempts to minimize network congestion and the amount of packet loss by matching the rate of the video stream to the available network bandwidth. In contrast, without rate control, the traffic exceeding the available bandwidth would be discarded in the network. To force the source to send the video stream at the rate dictated by the rate control algorithm, rate adaptive video encoding [63] or rate shaping [18] is required. Note that rate control is from the transport perspective, while rate adaptive video encoding is from the compression perspective; rate shaping is in both the transport and compression domains.

2. Error control. The purpose of congestion control is to prevent packet loss. However, packet loss is unavoidable in the Internet and may have significant impact on perceptual quality. Thus, other mechanisms must be in place to maximize video presentation quality in the presence of packet loss. Such mechanisms include error control mechanisms, which can be classified into four types, namely, forward error correction (FEC), retransmission, error-resilience, and error concealment. The principle of FEC is to add extra (redundant) information to a compressed video bit-stream so that the original video can be reconstructed in the presence of packet loss. There are three kinds of FEC: (1) channel coding, (2) source coding-based FEC, and (3) joint source/channel coding. FEC is used primarily because of its small transmission delay [14]. But FEC could be ineffective when bursty packet loss occurs and such loss exceeds the recovery capability of the FEC codes. Conventional retransmission-based schemes such as automatic repeat request (ARQ) are usually dismissed as a means for transporting real-time video since the delay requirement may not be met. However, if the one-way trip time is short with respect to the maximum allowable delay, a retransmission-based approach (called delay-constrained retransmission) is a viable option for error control. Error-resilient schemes deal with packet loss on the compression layer. Unlike traditional FEC (i.e., channel coding), which directly corrects bit errors or packet losses, error-resilient schemes consider the semantic meaning of the compression layer and attempt to limit the scope of damage (caused by packet loss) on the compression layer. As a result, error-resilient schemes could reconstruct the video picture with gracefully degraded quality. Error concealment is a post-processing technique used by the decoder. When uncorrectable bit errors occur, the decoder uses error concealment to hide the glitch from the viewer so that a more visually pleasing rendition of the decoded video can be obtained. Note that channel coding and retransmission recover packet loss from the transport perspective, while source coding-based FEC, error-resilience, and error concealment deal with packet loss from the compression perspective; joint source/channel coding falls in both the transport and compression domains.

The remainder of this paper is organized as follows. Section II presents the approaches for congestion control. In Section III, we describe the mechanisms for error control. Section IV summarizes this paper and points out future research directions.

II. Congestion Control

There are three mechanisms for congestion control: rate control, rate adaptive video encoding, and rate shaping. Rate control follows the transport approach; rate adaptive video encoding follows the compression approach; rate shaping could follow either the transport approach or the compression approach.

For the purpose of illustration, we present an architecture including the three congestion control mechanisms in Fig. 3, where the rate control is source-based (i.e., the source is responsible for adapting the rate). Although the architecture in Fig. 3 is targeted at transporting live video, this architecture is also applicable to stored video if the rate adaptive encoding is excluded. At the sender side, the compression layer compresses the live video based on a rate adaptive encoding algorithm. After this stage, the compressed video bit-stream is first filtered by the rate shaper and then passed through the RTP/UDP/IP layers before entering the Internet, where RTP is the Real-time Transport Protocol [41]. Packets may be dropped inside the Internet (due to congestion) or at the destination (due to excess delay). Packets that are successfully delivered to the destination first pass through the IP/UDP/RTP layers before being decoded at the video decoder.

Under our architecture, a QoS monitor is maintained at the receiver side to infer network congestion status based on the behavior of the arriving packets, e.g., packet loss and delay. Such information is used in the feedback control protocol, which sends information back to the video source. Based on such feedback information, the rate control module estimates the available network bandwidth and conveys the estimated network bandwidth to the rate adaptive encoder or the rate shaper. Then, the rate adaptive encoder or the rate shaper regulates the output rate of the video stream according to the estimated network bandwidth.

Fig. 3. A layered architecture for transporting real-time video. (Sender side: compression layer with rate adaptive encoding, rate shaper, rate control, and RTP/UDP/IP layers. Receiver side: IP/UDP/RTP layers, video decoder, QoS monitor, and feedback control protocol. The blocks are grouped into a compression domain and a transport domain, with the Internet between the two sides.)

It is clear that the source-based congestion control must include: (1) rate control, and (2) rate adaptive video encoding or rate shaping.

We organize the rest of this section as follows. In Section II-A, we survey the approaches for rate control. Section II-B describes basic methods for rate-adaptive video encoding. In Section II-C, we classify methodologies for rate shaping and summarize representative schemes.

A. Rate Control: A Transport Approach

Since TCP retransmission introduces delays that may not be acceptable for real-time video applications, UDP is usually employed as the transport protocol for real-time video streams [63]. However, UDP provides no congestion control and cannot overcome the lack of service guarantees in the Internet. Therefore, it is necessary to implement a control mechanism on the upper layer (higher than UDP) to prevent congestion.

There are two types of control for congestion prevention: one is window-based [26] and the other is rate-based [52]. Window-based control such as TCP works as follows: it probes for the available network bandwidth by slowly increasing a congestion window (used to control how much data is outstanding in the network); when congestion is detected (indicated by the loss of one or more packets), the protocol reduces the congestion window greatly (see Fig. 4). The rapid reduction of the window size in response to congestion is essential to avoid network collapse. On the other hand, rate-based control sets the sending rate based on the estimated available bandwidth in the network; if the estimation of the available network bandwidth is relatively accurate, rate-based control could also prevent network collapse.
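The sawtooth behavior of window-based control can be sketched with a few lines of simulation. This is a simplified illustration, not TCP itself: the text only says the window grows slowly and is reduced greatly on loss, so the growth of one packet per round trip, the halving on loss, and the function name `window_trace` are all assumptions made for the example.

```python
def window_trace(loss_rounds, rounds=30, start=1.0, increase=1.0, decrease=0.5):
    """Return the congestion window (in packets) at the start of each round trip.

    loss_rounds: set of round-trip indices in which a packet loss is detected.
    increase:    packets added per loss-free round trip (assumed value).
    decrease:    factor applied to the window after a loss (assumed value).
    """
    window = start
    trace = []
    for rtt in range(rounds):
        trace.append(window)
        if rtt in loss_rounds:
            # Congestion detected: shrink the window sharply, never below 1 packet.
            window = max(1.0, window * decrease)
        else:
            # No loss: keep probing for bandwidth by growing the window slowly.
            window += increase
    return trace


if __name__ == "__main__":
    print(window_trace({5}, rounds=8))
```

Plotting such a trace against the round-trip index gives the kind of curve that Fig. 4 describes: a slow climb interrupted by sharp drops at each loss.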

Since window-based control like TCP is typically coupled with retransmission, which can introduce intolerable delays, rate-based control (i.e., rate control) is usually employed for transporting real-time video [63]. Existing rate control schemes for real-time video can be classified into three categories, namely, source-based, receiver-based, and hybrid rate control, which are described in Sections II-A.1 to II-A.3, respectively.

Fig. 4. Congestion window behavior under window-based control. (Horizontal axis: time in round trip times; vertical axis: congestion window in packets; packet loss events are marked.)

A.1 Source-based Rate Control

Under source-based rate control, the sender is responsible for adapting the transmission rate of the video stream. Source-based rate control can minimize the amount of packet loss by matching the rate of the video stream to the available network bandwidth. In contrast, without rate control, the traffic exceeding the available bandwidth could be discarded in the network.

Typically, feedback is employed by source-based rate control mechanisms to convey the changing status of the Internet. Based upon the feedback information about the network, the sender could regulate the rate of the video stream. Source-based rate control can be applied to both unicast [63] and multicast [3].

For unicast video, the existing source-based rate control mechanisms can be classified into two approaches, namely, the probe-based approach and the model-based approach, which are presented as follows.

Probe-based approach. Such an approach is based on probing experiments. Specifically, the source probes for the available network bandwidth by adjusting the sending rate so that some QoS requirements are met, e.g., the packet loss ratio $p$ is below a certain threshold $P_{th}$ [63]. The value of $P_{th}$ is determined according to the minimum video perceptual quality required by the receiver. There are two ways to adjust the sending rate: Additive Increase and Multiplicative Decrease (AIMD) [63], and Multiplicative Increase and Multiplicative Decrease (MIMD) [52]. Probe-based rate control could avoid congestion since it always tries to adapt to the congestion status, e.g., keep the packet loss at an acceptable level.

Fig. 5. Source rate behavior under the AIMD rate control. (Horizontal axis: time in seconds; vertical axis: rate in kbps; the rate drops where the packet loss ratio exceeds the threshold.)

For the purpose of illustration, we briefly describe the source-based rate control based on additive increase and multiplicative decrease. The AIMD rate control algorithm is shown as follows [63].

if ($p \le P_{th}$)
    $r := \min\{(r + AIR), MaxR\}$;
else
    $r := \max\{(\alpha \times r), MinR\}$.

where $p$ is the packet loss ratio; $P_{th}$ is the threshold for the packet loss ratio; $r$ is the sending rate at the source; $AIR$ is the additive increase rate; $MaxR$ and $MinR$ are the maximum rate and the minimum rate of the sender, respectively; and $\alpha$ is the multiplicative decrease factor.

The packet loss ratio $p$ is measured by the receiver and conveyed back to the sender. An example of source rate behavior under the AIMD rate control is illustrated in Fig. 5.
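The AIMD update is small enough to write out directly. The sketch below is one possible rendering of the rule, assuming the rate is increased when the measured loss ratio is at or below the threshold; the function and parameter names are illustrative, and the numeric values in the usage lines are made up.

```python
def aimd_update(rate, loss_ratio, loss_threshold, air, alpha, min_rate, max_rate):
    """Return the next sending rate under the AIMD rule.

    rate:           current sending rate r
    loss_ratio:     packet loss ratio p reported by the receiver
    loss_threshold: threshold P_th
    air:            additive increase rate AIR
    alpha:          multiplicative decrease factor (0 < alpha < 1)
    """
    if loss_ratio <= loss_threshold:
        # Loss is acceptable: probe for more bandwidth, capped at MaxR.
        return min(rate + air, max_rate)
    # Loss is too high: back off multiplicatively, floored at MinR.
    return max(alpha * rate, min_rate)


if __name__ == "__main__":
    rate = 10.0  # kbps, illustrative starting point
    for p in (0.0, 0.0, 0.05, 0.0):  # hypothetical loss reports
        rate = aimd_update(rate, p, loss_threshold=0.01, air=2.0,
                           alpha=0.5, min_rate=1.0, max_rate=20.0)
        print(rate)
```

Each call corresponds to one feedback report from the receiver, so the time scale of adaptation is set by how often the loss ratio is conveyed back.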

Model-based approach. Different from the probe-based approach, which implicitly estimates the available network bandwidth, the model-based approach attempts to estimate the available network bandwidth explicitly. This can be achieved by using a throughput model of a TCP connection, which is characterized by the following formula [19]:

$$\lambda = \frac{1.22 \times MTU}{RTT \times \sqrt{p}} \tag{1}$$

where $\lambda$ is the throughput of a TCP connection, $MTU$ (Maximum Transmission Unit) is the maximum packet size used by the connection, $RTT$ is the round trip time for the connection, and $p$ is the packet loss ratio experienced by the connection. Under the model-based rate control, Eq. (1) is used to determine the sending rate of the video stream. That is, the rate-controlled video flow gets its bandwidth share like a TCP connection. As a result, the rate-controlled video flow could avoid congestion in a way similar to that of TCP, and can co-exist with TCP flows in a "friendly" manner. Hence, the model-based rate control is also called "TCP friendly" rate control [57]. In contrast to this TCP friendliness, a flow without rate control can get much more bandwidth than a TCP flow when the network is congested. This may lead to possible starvation of competing TCP flows due to the rapid reduction of the TCP window size in response to congestion.

To compute the sending rate $\lambda$ in Eq. (1), it is necessary for the source to obtain the MTU, the RTT, and the packet loss ratio $p$. The MTU can be found through the mechanism proposed by Mogul and Deering [34]. When the MTU information is not available, the default MTU, i.e., 576 bytes, will be used. The parameter RTT can be obtained through feedback of timing information. In addition, the receiver can periodically send the parameter $p$ to the source on the time scale of the round trip time. Upon receipt of the parameter $p$, the source estimates the sending rate $\lambda$ and then a rate control action may be taken.
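As a worked illustration of Eq. (1), the sketch below computes the TCP-friendly rate from the three measured quantities. The function name and the choice of units (bytes, seconds, bytes per second) are assumptions made for this example; the 576-byte default is the one stated in the text.

```python
import math

DEFAULT_MTU_BYTES = 576  # default used when the path MTU is unknown


def tcp_friendly_rate(rtt_s, loss_ratio, mtu_bytes=DEFAULT_MTU_BYTES):
    """Return the rate given by Eq. (1), in bytes per second.

    rtt_s:      round trip time in seconds
    loss_ratio: packet loss ratio p, with 0 < p <= 1
    mtu_bytes:  maximum packet size in bytes
    """
    if rtt_s <= 0 or not 0 < loss_ratio <= 1:
        # Eq. (1) is undefined for zero loss or a non-positive RTT.
        raise ValueError("need rtt_s > 0 and 0 < loss_ratio <= 1")
    return 1.22 * mtu_bytes / (rtt_s * math.sqrt(loss_ratio))


if __name__ == "__main__":
    # Illustrative inputs: 100 ms round trip time and 1% packet loss.
    rate = tcp_friendly_rate(rtt_s=0.1, loss_ratio=0.01)
    print(f"{rate * 8 / 1000:.1f} kbps")
```

Because the rate varies with the inverse square root of $p$, quadrupling the loss ratio halves the allowed rate, and the formula gives no finite answer when no loss has been observed, so a practical sender would need a separate rule (such as a configured maximum rate) for that case.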

Single-channel multicast vs. unicast. For multicast under source-based rate control, the sender uses a single channel or one IP multicast group to transport the video stream to the receivers. Thus, such multicast is called single-channel multicast.

For single-channel multicast, only the probe-based rate control can be employed [3]. A representative work is the IVS (INRIA Video-conference System) [3]. The rate control in IVS is based on additive increase and multiplicative decrease, which is summarized as follows. Each receiver estimates its packet loss ratio, based on which each receiver can determine the network status to be in one of three states: UNLOADED, LOADED, and CONGESTED. The source solicits the network status information from the receivers through probabilistic polling, which helps to avoid feedback implosion. In this way, the fraction of UNLOADED and CONGESTED receivers can be estimated. Then, the source adjusts the sending rate according to the following algorithm.

if ($F_{con} > T_{con}$)
    $r := \max(r/2, MinR)$;
else if ($F_{un} == 100\%$)
    $r := \min\{(r + AIR), MaxR\}$;

where $F_{con}$, $F_{un}$, and $T_{con}$ are the fraction of CONGESTED receivers, the fraction of UNLOADED receivers, and a preset threshold, respectively; $r$, $MaxR$, $MinR$, and $AIR$ are the sending rate, the maximum rate, the minimum rate, and the additive increase rate, respectively.
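The source-side rule can be sketched as follows. This is a minimal illustration of the algorithm above, not the IVS implementation: the helper that turns polled states into fractions, the function names, and the sample values are assumptions, and fractions are expressed in the range 0 to 1 rather than as percentages.

```python
def receiver_fractions(states):
    """Return (fraction CONGESTED, fraction UNLOADED) among polled receivers."""
    total = len(states)
    if total == 0:
        raise ValueError("no receiver reports")
    return states.count("CONGESTED") / total, states.count("UNLOADED") / total


def ivs_update(rate, frac_congested, frac_unloaded, congested_threshold,
               air, min_rate, max_rate):
    """Return the next sending rate for the single multicast channel."""
    if frac_congested > congested_threshold:
        # Too many congested receivers: halve the rate, floored at MinR.
        return max(rate / 2, min_rate)
    if frac_unloaded >= 1.0:
        # Every polled receiver is unloaded: increase additively, capped at MaxR.
        return min(rate + air, max_rate)
    # Otherwise the rate is left unchanged.
    return rate


if __name__ == "__main__":
    f_con, f_un = receiver_fractions(["UNLOADED", "CONGESTED", "LOADED", "UNLOADED"])
    print(ivs_update(100.0, f_con, f_un, congested_threshold=0.1,
                     air=10.0, min_rate=20.0, max_rate=200.0))
```

Note the asymmetry in the rule: a modest fraction of congested receivers is enough to cut the rate, while an increase requires all polled receivers to report UNLOADED, so one slow receiver can hold the whole group back.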

Single-channel multicast has good bandwidth efficiency since all the receivers share one channel (e.g., the IP multicast group in Fig. 2(b)). But single-channel multicast is unable to provide service flexibility and differentiation to different receivers with diverse access link capacities, processing capabilities and interests.

Fig. 6. Trade-off between bandwidth efficiency and service flexibility. (Axes: bandwidth efficiency and service flexibility, each from low to high; plotted schemes: unicast, receiver-based/hybrid rate control, and single-channel multicast.)

Fig. 7. Layered video encoding/decoding. D denotes the decoder.

On the other hand, multicast video delivered through individual unicast streams (see Fig. 2(a)) can offer differentiated services to receivers since each receiver can individually negotiate the service parameters with the source. But the problem with unicast-based multicast video is bandwidth inefficiency.

Single-channel multicast and unicast-based multicast are the two extreme cases shown in Fig. 6. To achieve a good trade-off between bandwidth efficiency and service flexibility for multicast video, two mechanisms, namely, receiver-based and hybrid rate control, have been proposed, which we discuss as follows.

A.2 Receiver-based Rate Control

Under receiver-based rate control, the receivers regulate the receiving rate of video streams by adding/dropping channels. In contrast to source-based rate control, the sender does not participate in rate control here. Typically, receiver-based rate control is applied to layered multicast video rather than unicast video. This is primarily because source-based rate control works reasonably well for unicast video and receiver-based rate control is targeted at solving the heterogeneity problem in the multicast case.

Before we address receiver-based rate control, we first briefly describe layered multicast video as follows. At the sender side, a raw video sequence is compressed into multiple layers: a base layer (i.e., Layer 0) and one or more enhancement layers (e.g., Layers 1 and 2 in Fig. 7). The base layer can be independently decoded and it provides basic video quality; the enhancement layers can only be decoded together with the base layer and they further refine the quality of the base layer. This is illustrated in Fig. 7. The base layer consumes the least bandwidth (e.g., 64 Kbps in Fig. 7); the higher the layer is, the more bandwidth the layer consumes (see Fig. 7). After compression, each video layer is sent to a separate IP multicast group. At the receiver side, each receiver subscribes to a certain set of video layers by joining the corresponding IP multicast groups. In addition, each receiver tries to achieve the highest subscription level of video layers without incurring congestion. In the example shown in Fig. 8, each layer has a separate IP multicast group. Receiver 1 joins all three IP multicast groups. As a result, it consumes 1 Mbps and receives all three layers. Receiver 2 joins the two IP multicast groups for Layer 0 and Layer 1 with bandwidth usage of 256 Kbps. Receiver 3 only joins the IP multicast group for Layer 0 with bandwidth consumption of 64 Kbps.

As with source-based rate control, we classify existing receiver-based rate control mechanisms into two approaches, namely, the probe-based approach and the model-based approach, which are presented as follows.

Probe-based approach. This approach was first employed in Receiver-driven Layered Multicast (RLM) [33]. Basically, the probe-based rate control consists of two parts:

1. When no congestion is detected, a receiver probes for the available bandwidth by joining a layer, which leads to an increase of its receiving rate. If no congestion is detected after the joining, the join-experiment is considered "successful". Otherwise, the receiver drops the newly added layer.

2. When congestion is detected, the receiver drops a layer, resulting in a reduction of its receiving rate.
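The two parts above amount to a small decision rule that each receiver runs repeatedly. The sketch below illustrates that rule only; it is not the RLM protocol. The function name, the callback used to report the outcome of a join-experiment, and the choice to always keep the base layer are assumptions made for the example, and timers and shared learning are left out.

```python
def probe_step(level, top_level, congested_now, congested_after_join):
    """Return the highest subscribed layer index after one control step.

    level:                highest layer currently subscribed (0 is the base layer)
    top_level:            highest layer offered by the sender
    congested_now:        True if congestion is currently detected
    congested_after_join: callable taking a candidate level and returning True
                          if congestion appears after joining it (hypothetical hook)
    """
    if congested_now:
        # Part 2: drop a layer. Keeping the base layer is an assumption here.
        return max(level - 1, 0)
    if level < top_level:
        # Part 1: join-experiment on the next layer up.
        candidate = level + 1
        if congested_after_join(candidate):
            return level  # failed experiment: drop the newly added layer
        return candidate  # successful experiment: keep the new layer
    return level


if __name__ == "__main__":
    # Hypothetical bottleneck that can carry layers 0 and 1 but not layer 2.
    too_much = lambda candidate: candidate > 1
    level = 0
    for _ in range(3):
        level = probe_step(level, top_level=2, congested_now=False,
                           congested_after_join=too_much)
    print(level)
```

In this toy run the receiver settles on the highest layer its bottleneck can carry, and every further step repeats a failing join-experiment, which is the cost that motivates coordinating experiments across receivers.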

The above control has a potential problem when the number of receivers becomes large. If each receiver carries out the above join-experiment independently, the aggregate frequency of such experiments increases with the number of receivers. Since a failed join-experiment could incur congestion in the network, an increase of join-experiments could aggravate network congestion.

To minimize the frequency of join-experiments, a shared learning algorithm was proposed in [33]. The essence of the shared learning algorithm is to have a receiver multicast its intent to the group before it starts a join-experiment. In this way, each receiver can learn from other receivers' failed join-experiments, resulting in a decrease in the number of failed join-experiments.

The shared learning algorithm in [33] requires each receiver to maintain a comprehensive group knowledge base, which contains the results of all the join-experiments for the multicast group. In addition, the use of multicasting to update the comprehensive group knowledge base may decrease usable bandwidth on low-speed links and lead to lower quality for receivers on these links. To reduce message processing overhead at each receiver and to decrease bandwidth usage of the shared learning algorithm, a hierarchical rate control mechanism called Layered Video Multicast with Retransmissions (LVMR) [30] was proposed. The methodology of the hierarchical rate control is to partition the comprehensive group knowledge base, organize the partitions in a hierarchical way, and distribute only relevant information (rather than all the information) to the receivers. In addition, the partitioning of the comprehensive group knowledge base allows multiple experiments to be conducted simultaneously, making it faster for the rate to converge to the stable state. Although the hierarchical rate control could reduce control protocol traffic, it requires installing agents in the network so that the comprehensive group knowledge base can be partitioned and organized in a hierarchical way.

Fig. 8. IP multicast for layered video.

Model-based approach. Unlike the probe-based approach, which implicitly estimates the available network bandwidth through probing experiments, the model-based approach attempts to explicitly estimate the available network bandwidth. The model-based approach is based on the throughput model of a TCP connection, which was described in Section II-A.1.

Fig. 9 shows the flow chart of the basic model-based rate control executed by each receiver, where $\gamma_i$ is the transmission rate of Layer $i$. In the algorithm, it is assumed that each receiver knows the transmission rate of all the layers. For ease of description, we divide the algorithm into the following steps.

Initialization: A receiver starts by subscribing to the base layer (i.e., Layer 0) and initializes the variable $L$ to 0. The variable $L$ represents the highest layer currently subscribed.

Step 1: The receiver estimates the MTU, the RTT, and the packet loss ratio $p$ for a given period. The MTU can be found through the mechanism proposed by Mogul and Deering [34]. The packet loss ratio $p$ can be easily obtained. However, the RTT cannot be measured through a simple feedback mechanism due to the feedback implosion problem. A mechanism [53], based on the RTCP protocol, has been proposed to estimate the RTT.

Step 2: Upon obtaining the MTU, the RTT, and $p$ for a given period, the target rate $\lambda$ can be computed through Eq. (1).

Step 3: Upon obtaining $\lambda$, a rate control action can be taken. If $\lambda < \gamma_0$, drop the base layer and stop receiving video (the network cannot deliver even the base layer due to congestion); otherwise, determine $L'$, the largest integer such that $\sum_{i=0}^{L'} \gamma_i \le \lambda$. If $L' > L$, add the layers from Layer $L+1$ to Layer $L'$, and Layer $L'$ becomes the highest layer currently subscribed (let $L = L'$); if $L' < L$, drop the layers from Layer $L'+1$ to Layer $L$, and Layer $L'$ becomes the highest layer currently subscribed (let $L = L'$). Return to Step 1.
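Step 3 can be sketched as two small functions: one that finds $L'$ from the target rate, and one that lists the layers to add or drop. This is an illustration under stated assumptions: the function names are invented, and the per-layer rates of 64, 192 and 744 Kbps are hypothetical values chosen so that the cumulative rates are 64 Kbps, 256 Kbps and 1000 Kbps.

```python
def highest_affordable_layer(layer_rates, target_rate):
    """Return L', the largest index whose cumulative layer rate fits the target.

    layer_rates: per-layer transmission rates, index 0 being the base layer
    target_rate: rate computed from Eq. (1), in the same unit as layer_rates
    Returns None when even the base layer exceeds the target rate.
    """
    if target_rate < layer_rates[0]:
        return None  # drop the base layer and stop receiving video
    total = 0.0
    best = 0
    for index, rate in enumerate(layer_rates):
        total += rate
        if total > target_rate:
            break
        best = index
    return best


def layers_to_change(current, new):
    """Return the action and the layer indices that move from L to L'."""
    if new > current:
        return "add", list(range(current + 1, new + 1))
    if new < current:
        return "drop", list(range(new + 1, current + 1))
    return "keep", []


if __name__ == "__main__":
    rates_kbps = [64, 192, 744]  # hypothetical per-layer rates
    new_level = highest_affordable_layer(rates_kbps, target_rate=300)
    print(new_level, layers_to_change(0, new_level))
```

Unlike the probe-based approach, the receiver here moves directly to the computed level in one action instead of testing one layer at a time, so the quality of the result depends on how well Eq. (1) and the measured RTT and loss ratio reflect the path.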

The above algorithm has a potential problem when the number of receivers becomes large. If each receiver carries out the rate control independently, the aggregate frequency of join-experiments increases with the number of receivers. Since a failed join-experiment could incur congestion in the network, an increase of join-experiments could aggravate network congestion. To coordinate the joining/leaving actions of the receivers, a scheme based on synchronization points [55] was proposed. With small protocol overhead, the scheme proposed in [55] helps to reduce the frequency and duration of join-experiments, resulting in a smaller possibility of congestion.

A.3 Hybrid Rate Control

Under hybrid rate control, the receivers regulate the receiving rate of video streams by adding/dropping channels while the sender also adjusts the transmission rate of each channel based on feedback information from the receivers. Since hybrid rate control consists of rate control at both the sender and the receivers, the approaches described in Sections II-A.1 and II-A.2 can be employed.

Hybrid rate control is targeted at multicast video and is applicable to both layered video [44] and non-layered video [8]. Different from the source-based rate control framework, where the sender uses a single channel, the hybrid rate control framework uses multiple channels. On the other hand, different from the receiver-based rate control framework, where the rate for each channel is constant, the hybrid rate control enables the sender to dynamically change the rate for each channel based on congestion status.

Fig. 9. Flow chart of the basic model-based rate control for a receiver. (Subscribe to the base layer, i.e., Layer 0, and let $L = 0$; estimate MTU, RTT, and packet loss ratio $p$; compute $\lambda$ through Eq. (1); if $\lambda < \gamma_0$, drop the base layer and stop; otherwise determine $L'$, the largest integer such that $\sum_{i=0}^{L'} \gamma_i \le \lambda$; if $L' > L$, add the layers from Layer $L+1$ to Layer $L'$ and let $L = L'$; if $L' < L$, drop the layers from Layer $L'+1$ to Layer $L$ and let $L = L'$.)

One representative work using hybrid rate control is the destination set grouping (DSG) protocol [8]. Before we present the DSG protocol, we first briefly describe the architecture associated with DSG. At the sender side, a raw video sequence is compressed into multiple streams (called replicated adaptive streams), which carry the same video information with different rate and quality. Different from layered video, each stream in DSG can be decoded independently. After compression, each video stream is sent to a separate IP multicast group. At the receiver side, each receiver can choose a multicast group to join by taking into account its capability and congestion status. The receivers also send feedback to the source, and the source uses this feedback to adjust the transmission rate for each stream.

The DSG protocol consists of two main components:

1. Rate control at the source. For each stream, the rate control at the source is essentially the same as that used in IVS (see Section II-A.1), but the feedback control for each stream works independently.

2. Rate control at a receiver. A receiver can change its subscription and join a higher or lower quality stream based on network status, i.e., the fraction of UNLOADED, LOADED and CONGESTED receivers. The mechanism to determine the fraction of UNLOADED, LOADED and CONGESTED receivers is similar to that used in IVS. The rate control at a receiver takes the probe-based approach as presented in Section II-A.2.

B. Rate-adaptive Video Encoding: A Compression Approach

Rate-adaptive video encoding has been studied extensively for various standards and applications, such as video conferencing with H.261 and H.263 [31], [61], storage media with MPEG-1 and MPEG-2 [16], [28], [48], real-time transmission with MPEG-1 and MPEG-2 [17], [23], and the recent object-based coding with MPEG-4 [54], [63]. The objective of a rate-adaptive encoding algorithm is to maximize the perceptual quality under a given encoding rate. Such adaptive encoding can be achieved by the alteration of the encoder's quantization parameter (QP) and/or the alteration of the video frame rate.

Traditional video encoders (e.g., H.261, MPEG-1/2) typically rely on altering the QP of the encoder to achieve rate adaptation. These encoding schemes must perform coding with constant frame rates. This is because even a slight reduction in frame rate can substantially degrade the perceptual quality at the receiver, especially during a dynamic scene change. Since altering the QP is not enough to achieve very low bit-rate, these encoding schemes may not be suitable for very low bit-rate video applications.

Fig. 10. An example of the video object (VO) concept in MPEG-4 video. A video frame (left) is segmented into two VO planes, where VO1 (middle) is the background and VO2 (right) is the foreground.

On the contrary, MPEG-4 and H.263 coding schemes are suitable for very low bit-rate video applications since they allow the alteration of the frame rate. In fact, the alteration of the frame rate is achieved by frame-skip. Specifically, if the encoder buffer is in danger of overflow (i.e., the bit budget is over-used by the previous frame), a complete frame can be skipped at the encoder. This will allow the coded bits of the previous frames to be transmitted during the time period of this frame, therefore reducing the buffer level (i.e., keeping the encoded bits within the budget).
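The frame-skip rule can be illustrated with a small buffer simulation. This is only a sketch under stated assumptions: the function name `frame_skip_schedule`, the per-frame bit counts, and the simple drain model (a fixed number of bits leaves the buffer per frame interval) are hypothetical and are not taken from H.263 or MPEG-4.

```python
def frame_skip_schedule(frame_bits, channel_bits_per_frame, buffer_size):
    """Return one boolean per frame: True if coded, False if skipped.

    frame_bits: bits each frame would produce if it were coded (assumed known).
    channel_bits_per_frame: bits drained from the buffer per frame interval.
    buffer_size: encoder buffer capacity in bits.
    """
    level = 0
    coded = []
    for bits in frame_bits:
        if level + bits > buffer_size:
            # Coding this frame would overflow the buffer: skip it entirely.
            coded.append(False)
        else:
            level += bits
            coded.append(True)
        # The channel keeps draining the buffer during this frame interval,
        # whether or not the frame was coded.
        level = max(0, level - channel_bits_per_frame)
    return coded


schedule = frame_skip_schedule([800, 800, 200, 200], 400, 1000)
print(schedule)
```

In this toy run the second frame is skipped, which gives the channel one frame interval to drain the bits left over from the first frame.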

In addition, MPEG-4 is the first international standard addressing the coding of video objects (VOs) (see Fig. 10) [24]. With the flexibility and efficiency provided by coding video objects, MPEG-4 is capable of addressing interactive content-based video services as well as conventional stored and live video [39]. In MPEG-4, a frame of a video object is called a video object plane (VOP), which is encoded separately. Such isolation of video objects provides us with much greater flexibility to perform adaptive encoding. In particular, we can dynamically adjust target bit-rate distribution among video objects, in addition to the alteration of QP on each VOP (such a scheme is proposed in [63]). This can upgrade the perceptual quality for the regions of interest (e.g., head and shoulder) while lowering the quality for other regions (e.g., background).

For all the video coding algorithms, a fundamental problem is how to determine a suitable QP to achieve the target bit-rate. The rate-distortion (R-D) theory is a powerful tool to solve this problem. Under the R-D framework, there are two approaches for encoding rate control in the literature: the model-based approach and the operational R-D based approach. The model-based approach assumes various input distributions and quantizer characteristics [9], [63]. Under this approach, closed-form solutions can be obtained by using continuous optimization theory. On the other hand, the operational R-D based approach considers practical coding environments where only a finite set of quantizers is admissible [23], [28], [48], [61]. Under the operational R-D based approach, the admissible quantizers are used by the rate control algorithm to determine the optimal strategy to minimize the distortion under the constraint of a given bit budget. The optimal discrete solutions can be found through applying integer programming theory.
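To make the operational R-D idea concrete, the sketch below picks one quantizer per frame from a finite admissible set so that total distortion is minimized under a bit budget. It uses exhaustive search purely for illustration; the cited schemes rely on more efficient optimization, and the function name `best_quantizers` and the (QP, bits, distortion) triples are made-up assumptions.

```python
from itertools import product


def best_quantizers(options, bit_budget):
    """Exhaustively search one (qp, bits, distortion) choice per frame.

    options[i] lists the admissible (qp, bits, distortion) triples for frame i.
    Returns (total_distortion, total_bits, [qp per frame]), or None if no
    combination fits within bit_budget.
    """
    best = None
    for choice in product(*options):
        bits = sum(c[1] for c in choice)
        dist = sum(c[2] for c in choice)
        if bits <= bit_budget and (best is None or dist < best[0]):
            best = (dist, bits, [c[0] for c in choice])
    return best


# Hypothetical operating points for two frames.
options = [
    [(4, 100, 1.0), (8, 60, 3.0)],
    [(4, 120, 2.0), (8, 70, 5.0)],
]
print(best_quantizers(options, bit_budget=180))
```

Exhaustive search grows exponentially with the number of frames, which is why practical rate control algorithms do not enumerate every combination.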

C. Rate Shaping

Rate shaping is a technique to adapt the rate of compressed video bit-streams to the target rate constraint. A rate shaper is an interface (or filter) between the encoder and the network, with which the encoder's output rate can be matched to the available network bandwidth. Since rate shaping does not require interaction with the encoder, rate shaping is applicable to any video coding scheme and is applicable to both live and stored video. Rate shaping can be achieved through two approaches: one is from the transport perspective [22], [45], [67] and the other is from the compression perspective [18].

A representative mechanism from the transport perspective is server selective frame discard [67]. The server selective frame discard is motivated by the following fact. Usually, a server transmits each frame without any awareness of the available network bandwidth and the client buffer size. As a result, the network may drop packets if the available bandwidth is less than required, which leads to frame losses. In addition, the client may also drop packets that arrive too late for playback. This wastes network bandwidth and client buffer resources. To address this problem, the selective frame discard scheme preemptively drops frames at the server in an intelligent manner by considering available network bandwidth and client QoS requirements. The selective frame discard has two major advantages. First, by taking the network bandwidth and client buffer constraints into account, the server can make the best use of network resources by selectively discarding frames in order to minimize the likelihood of future frames being discarded, thereby increasing the overall quality of the video delivered. Second, unlike frame dropping in the network or at the client, the server can also take advantage of application-specific information, such as regions of interest and group of pictures (GOP) structure, in its decision on which frames to discard. As a result, the server optimizes the perceived quality at the client while maintaining efficient utilization of the network resources.

A representative mechanism from the compression perspective is dynamic rate shaping [18]. Based on the R-D theory, the dynamic rate shaper selectively discards the Discrete Cosine Transform (DCT) coefficients of the high frequencies. The dynamic rate shaper selects the highest frequencies and discards the DCT coefficients of these frequencies until the target rate is met.

Fig. 11. An architecture for error control mechanisms.
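The coefficient-dropping step can be sketched as follows. This is a simplified illustration, not the R-D optimized shaper of [18]: it assumes the coefficients of a block are already in zig-zag order (lowest frequency first) and that every nonzero coefficient costs a fixed number of bits, both of which are hypothetical simplifications, as is the function name `shape_block`.

```python
def shape_block(zigzag_coeffs, bits_per_coeff, target_bits):
    """Zero out the highest-frequency coefficients until the block fits.

    zigzag_coeffs: DCT coefficients ordered from lowest to highest frequency.
    bits_per_coeff: assumed fixed cost of coding one nonzero coefficient.
    target_bits: bit budget for this block.
    """
    kept = list(zigzag_coeffs)

    def cost(coeffs):
        return bits_per_coeff * sum(1 for c in coeffs if c != 0)

    index = len(kept) - 1
    while index >= 0 and cost(kept) > target_bits:
        kept[index] = 0  # discard the highest remaining frequency
        index -= 1
    return kept


print(shape_block([50, -12, 7, 0, 3, 1], bits_per_coeff=8, target_bits=24))
```

A real bit-stream uses variable-length codes, so the cost of each coefficient would have to be measured rather than assumed constant.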

Congestion control attempts to prevent packet loss by matching the rate of video streams to the available bandwidth in the network. However, packet loss is unavoidable in the Internet and may have significant impact on perceptual quality. Therefore, we need other mechanisms to maximize the video presentation quality in presence of packet loss. Such mechanisms include error control mechanisms, which are presented in the next section.

III. Error Control

In the Internet, packets may be dropped due to congestion at routers, they may be mis-routed, or they may reach the destination with such a long delay as to be considered useless or lost. Packet loss may severely degrade the visual presentation quality. To enhance the video quality in presence of packet loss, error control mechanisms have been proposed.

For certain types of data (such as text), packet loss is intolerable while delay is acceptable. When a packet is lost, there are two ways to recover the packet: the corrupted data must be corrected by traditional FEC (i.e., channel coding), or the packet must be retransmitted. On the other hand, for real-time video, some visual quality degradation is often acceptable while delay must be bounded. This feature of real-time video introduces many new error control mechanisms, which are applicable to video applications but not applicable to traditional data such as text. Basically, the error control mechanisms for video applications can be classified into four types, namely, FEC, retransmission, error-resilience, and error concealment. FEC, retransmission, and error-resilience are performed at both the source and the receiver side, while error concealment is carried out only at the receiver side. Fig. 11 shows the location of each error control mechanism in a layered architecture. As shown in Fig. 11, retransmission recovers packet loss from the transport perspective; error-resilience and error concealment deal with packet loss from the compression perspective; and FEC falls in both the transport and compression domains. For the rest of this section, we present FEC, retransmission, error-resilience, and error concealment, respectively.

A. FEC

FEC is used primarily because of its small transmission delay compared with TCP [14]. The principle of FEC is to add extra (redundant) information to a compressed video bit-stream so that the original video can be reconstructed in presence of packet loss. Based on the kind of redundant information to be added, the existing FEC schemes can be classified into three categories: (1) channel coding, (2) source coding-based FEC, and (3) joint source/channel coding, which will be presented in Sections III-A.1 to III-A.3, respectively.

A.1 Channel Coding

For Internet applications, channel coding is typically used in terms of block codes. Specifically, a video stream is first chopped into segments, each of which is packetized into k packets; then for each segment, a block code (e.g., Tornado code [1]) is applied to the k packets to generate an n-packet block, where n > k. That is, the channel encoder places the k packets into a group and then creates additional packets from them so that the total number of packets in the group becomes n, where n > k (shown in Fig. 12). This group of packets is transmitted to the receiver, which receives K packets. To perfectly recover a segment, a user must receive K (K ≥ k) packets in the n-packet block (see Fig. 12). In other words, a user only needs to receive any k packets in the n-packet block so that it can reconstruct all the original k packets.
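The "any k out of n" property can be demonstrated with the simplest possible block code: a single XOR parity packet, so that n = k + 1. This is a minimal sketch and not a Tornado code; the function names are hypothetical, and all packets are assumed to have equal length.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets):
    """Append one parity packet, turning k packets into an n = k + 1 block."""
    parity = reduce(xor_bytes, packets)
    return list(packets) + [parity]


def decode(received, k):
    """Recover the k original packets; lost packets are marked as None."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("more losses than this code can repair")
    received = list(received)
    if missing and missing[0] < k:
        present = [p for p in received if p is not None]
        # XOR of the k surviving packets reproduces the single lost one.
        received[missing[0]] = reduce(xor_bytes, present)
    return received[:k]


packets = [b"abcd", b"efgh", b"ijkl"]
block = encode(packets)
block[1] = None  # simulate the loss of one packet
print(decode(block, k=3))
```

Codes with more redundancy (larger n - k) tolerate more losses per block, at the cost of a higher transmission rate.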

Since recovery is carried out entirely at the receiver, the channel coding approach can scale to an arbitrary number of receivers in a large multicast group. In addition, due to its ability to recover from any k out of n packets regardless of which packets are lost, it allows the network and receivers to discard some of the packets which cannot be handled due to limited bandwidth or processing power. Thus, it [...]

Fig. 12. Channel coding/decoding operation.

There are also disadvantages associated with channel coding, as follows.

1. It increases the transmission rate. This is because channel coding adds n - k redundant packets to every k original packets, which increases the rate by a factor of n/k. In addition, the higher the loss rate is, the higher the transmission rate required to recover from the loss. The higher the transmission rate is, the more congested the network gets, which leads to an even higher loss rate. This makes channel coding vulnerable to short-term congestion. However, efficiency may be improved by using unequal error protection [1].

2. It increases delay. This is because (1) a channel encoder must wait for all k packets in a segment before it can generate the n - k redundant packets; and (2) the receiver must wait for at least k packets of a block before it can play back the video segment. In addition, recovery from bursty loss requires the use of either longer blocks (i.e., larger k and n) or techniques like interleaving. In either case, delay will be further increased. But for video streaming applications, which can tolerate relatively large delay, the increase in delay may not be an issue.

3. It is not adaptive to varying loss characteristics and it works best only when the packet loss rate is stable. If more than n - k packets of a block are lost, channel coding cannot recover any portion of the original segment. This makes channel coding useless when the short-term loss rate exceeds the recovery capability of the code. On the other hand, if the loss rate is well below the code's recovery capability, the redundant information is more than necessary (a smaller ratio n/k would be more appropriate). To improve the adaptive capability of channel coding, feedback can be used. That is, if the receiver conveys the loss characteristics to the source, the channel encoder can adapt the redundancy accordingly. Note that this requires a closed loop rather than an open loop in the original channel coding design.
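The trade-off between redundancy and recoverability can be quantified with a short calculation. The sketch below assumes that packets are lost independently with probability p, which is an assumption made here for illustration (Internet loss is often bursty); the function names are hypothetical.

```python
from math import comb


def recovery_probability(n, k, p):
    """Probability that at least k of the n packets in a block arrive,
    assuming independent packet loss with probability p."""
    return sum(comb(n, j) * (1 - p) ** j * p ** (n - j) for j in range(k, n + 1))


def rate_expansion(n, k):
    """Factor by which an (n, k) block code increases the transmission rate."""
    return n / k


for n in (3, 4, 6):
    print(n, rate_expansion(n, 3), recovery_probability(n, 3, 0.1))
```

Under this model, increasing n for a fixed k raises the recovery probability but also the rate expansion n/k, which mirrors the first and third disadvantages listed above.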

A significant portion of previous research on channel coding for video transmission has involved equal error protection (EEP), in which all the bits of the compressed video stream are treated equally and given an equal amount of redundancy. However, the compressed video stream typically does not consist of bits of equal significance. For example, in MPEG, an I-frame is more important than a P-frame while a P-frame is more important than a B-frame. Current research is heavily weighted towards unequal error protection (UEP) schemes, in which the more significant information bits are given more protection. A representative work of UEP is the Priority Encoding Transmission (PET) [1]. A key feature of the PET scheme is to allow a user to set different levels (priorities) of error protection for different segments of the video stream. This unequal protection makes PET efficient (less redundancy) and suitable for transporting MPEG video, which has an inherent priority hierarchy (i.e., I-, P-, and B-frames).

To provide error recovery in layered multicast video, Tan and Zakhor proposed a receiver-driven hierarchical FEC (HFEC) [50]. In HFEC, additional streams with only FEC redundant information are generated along with the video layers. Each of the FEC streams is used for recovery of a different video layer, and each of the FEC streams is sent to a different multicast group. Subscribing to more FEC groups corresponds to a higher level of protection. Like other receiver-driven schemes, HFEC also achieves a good trade-off between flexibility of providing recovery and bandwidth efficiency, that is:

Flexibility of providing recovery: Each receiver can independently adjust the desired level of protection based on past reception statistics and the application's delay tolerance.

Bandwidth efficiency: Each receiver will subscribe to only as many redundancy layers as necessary, reducing overall bandwidth utilization.

A.2 Source Coding-based FEC

Source coding-based FEC (SFEC) is a recently devised variant of FEC for Internet video [4]. Like channel coding, SFEC also adds redundant information to recover from loss. For example, SFEC could add redundant information as follows: the nth packet contains the nth GOB (Group of Blocks) and redundant information about the (n-1)th GOB. If the (n-1)th packet is lost but the nth packet is received, the receiver can still reconstruct the (n-1)th GOB from the redundant information about the (n-1)th GOB, which is contained in the nth packet. However, the reconstructed (n-1)th GOB has a coarser quality. This is because the redundant information about the (n-1)th GOB is a compressed version of the (n-1)th GOB with a larger quantizer, resulting in less redundancy added to the nth packet.
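The packetization described above can be sketched in a few lines. The example is a toy model under stated assumptions: a GOB is represented as a list of non-negative integer samples, "compression" is reduced to a truncating uniform quantizer, and the names `quantize`, `build_packets`, and `reconstruct` are hypothetical.

```python
def quantize(samples, step):
    """Truncating uniform quantizer; a larger step gives a coarser copy."""
    return [(s // step) * step for s in samples]


def build_packets(gobs, fine_step, coarse_step):
    """Packet n carries GOB n (fine) plus a coarse copy of GOB n-1."""
    packets = []
    for n, gob in enumerate(gobs):
        redundant = quantize(gobs[n - 1], coarse_step) if n > 0 else None
        packets.append({"seq": n,
                        "primary": quantize(gob, fine_step),
                        "redundant": redundant})
    return packets


def reconstruct(received, count):
    """received maps sequence number -> packet; lost packets are absent."""
    output = []
    for n in range(count):
        if n in received:
            output.append(received[n]["primary"])
        elif n + 1 in received:
            # Fall back to the coarse copy carried by the next packet.
            output.append(received[n + 1]["redundant"])
        else:
            output.append(None)  # unrecoverable with this scheme
    return output


gobs = [[10, 21, 33], [40, 52, 61], [70, 85, 93]]
packets = build_packets(gobs, fine_step=1, coarse_step=10)
received = {p["seq"]: p for p in packets if p["seq"] != 1}  # packet 1 is lost
print(reconstruct(received, count=3))
```

The lost GOB is recovered from the following packet, but only at the coarse quantization level, which matches the reduced-quality recovery described in the text.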

The main difference between SFEC and channel coding is the kind of redundant information being added to a compressed video stream. Specifically, channel coding adds redundant information according to a block code (irrelevant to the video), while the redundant information added by SFEC consists of more heavily compressed versions of the raw video. As a result, when there is packet loss, channel coding could achieve perfect recovery while SFEC recovers the video with reduced quality.

One advantage of SFEC over channel coding is lower delay. This is because each packet can be decoded in SFEC while, under the channel coding approach, both the channel encoder and the channel decoder have to wait for at least k packets of a segment. The disadvantages of SFEC are: (1) an increase in the transmission rate, and (2) inflexibility to varying loss characteristics. However, such inflexibility to varying loss characteristics can also be improved through feedback [4]. That is, if the receiver conveys the loss characteristics to the source, the SFEC encoder can adjust the redundancy accordingly. Note that this requires a closed loop rather than an open loop in the original SFEC coding scheme.

Fig. 13. (a) Rate-distortion relation for source coding. (b) Rate-distortion relation for the case of joint source/channel coding.

A.3 Joint Source/Channel Coding

Due to Shannon's separation theorem [43], the coding world was generally divided into two camps: source coding and channel coding. The camp of source coding was concerned with developing efficient source coding techniques while the camp of channel coding was concerned with developing robust channel coding techniques [21]. In other words, the camp of source coding did not take channel coding into account and the camp of channel coding did not consider source coding. However, Shannon's separation theorem is not strictly applicable when the delay is bounded, which is the case for such real-time services as video over the Internet [10]. The motivation of joint source/channel coding for video comes from the following observations:

Case A: According to the rate-distortion theory (shown in Fig. 13(a)) [13], the lower the source-encoding rate $R$ for a video unit, the larger the distortion $D$ of the video unit. That is, $R \downarrow \;\Rightarrow\; D \uparrow$.

Case B: Suppose that the total rate (i.e., the source-encoding rate $R$ plus the channel-coding redundancy rate $R'$) is fixed and channel loss characteristics do not change. The higher the source-encoding rate for a video unit is, the lower the channel-coding redundancy rate would be. This leads to a higher probability $P_c$ of the event that the video unit gets corrupted, which translates into a larger distortion of the video unit. That is, $R \uparrow \;\Rightarrow\; R' \downarrow \;\Rightarrow\; P_c \uparrow \;\Rightarrow\; D \uparrow$.

Combining Cases A and B, it can be argued that there exists an optimal source-encoding rate $R_o$ that achieves the minimum distortion $D_o$ (see Fig. 13(b)), given a constant total rate. As illustrated in Fig. 13(b), the left part of the curve shows Case A while the right part of the curve shows Case B. The two parts meet at the optimal point ($R_o$, $D_o$).

The objective of joint source/channel coding is to find the optimal point shown in Fig. 13(b) and design source/channel coding schemes to achieve the optimal point. In other words, finding an optimal point in joint source/channel coding is to make an optimal rate allocation between source coding and channel coding.
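The existence of an interior optimum can be illustrated numerically. The sketch below is a toy model only: the source distortion $2^{-2R}$, the corruption probability $e^{-aR'}$, the loss distortion of 1, and all function names are assumptions chosen for illustration, not models taken from the cited work. With the total rate fixed, a grid search over the source rate $R$ locates the point ($R_o$, $D_o$).

```python
import math


def source_distortion(r):
    """Assumed source R-D curve: distortion falls as the source rate rises."""
    return 2.0 ** (-2.0 * r)


def corruption_probability(redundancy, a=1.0):
    """Assumed channel model: more redundancy, lower corruption probability."""
    return math.exp(-a * redundancy)


def expected_distortion(r, total_rate, d_loss=1.0):
    """Expected distortion when r goes to source coding and the rest to FEC."""
    p_c = corruption_probability(total_rate - r)
    return (1.0 - p_c) * source_distortion(r) + p_c * d_loss


def optimal_source_rate(total_rate, steps=80):
    """Grid search for the source rate R_o that minimizes expected distortion."""
    candidates = [total_rate * i / steps for i in range(steps + 1)]
    r_o = min(candidates, key=lambda r: expected_distortion(r, total_rate))
    return r_o, expected_distortion(r_o, total_rate)


r_o, d_o = optimal_source_rate(8.0)
print(r_o, d_o)
```

Under these assumed curves the minimum lies strictly between the two extremes (all bits to source coding, or all bits to channel coding), which is the qualitative behavior described by Cases A and B.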

Basically, joint source/channel coding is accomplished by three tasks:

Task 1: finding an optimal rate allocation between source coding and channel coding for a given channel loss characteristic,

Task 2: designing a source coding scheme (including specifying the quantizer) to achieve its target rate,

Task 3: designing/choosing channel codes to match the channel loss characteristic and achieve the required robustness.

For the purpose of illustration, Fig. 14 shows an architecture for joint source/channel coding. Under the architecture, a QoS monitor is kept at the receiver side to infer the channel loss characteristics. Such information is conveyed back to the source side through the feedback control protocol. Based on such feedback information, the joint source/channel optimizer makes an optimal rate allocation between the source coding and the channel coding (Task 1) and conveys the optimal rate allocation to the source encoder and the channel encoder. Then the source encoder chooses an appropriate quantizer to achieve its target rate (Task 2) and the channel encoder chooses a suitable channel code to match the channel loss characteristic (Task 3).

An example of joint source/channel coding is the scheme introduced by Davis and Danskin [14] for transmitting images over the Internet. In this scheme, source and channel coding bits are allocated in a way that can minimize an expected distortion measure. As a result, more perceptually important low frequency sub-bands of images are shielded heavily using channel codes while higher frequencies are shielded lightly. This unequal error protection reduces channel coding overhead, which is most pronounced on bursty channels where a uniform application of channel codes is expensive.

B. Delay-constrained Retransmission: A Transport Approach

A conventional retransmission scheme, ARQ, works as follows: when packets are lost, the receiver sends feedback to notify the source; then the source retransmits the lost packets. The conventional ARQ is usually dismissed as a method for transporting real-time video since a retransmitted packet arrives at least three one-way trip times after the transmission of the original packet, which might exceed the delay required by the application. However, if the one-way trip time is short with respect to the maximum allowable delay, a retransmission-based approach, called delay-constrained retransmission, is a viable option for error control [37], [38].

Typically, one-way trip time is relatively small within a local area network (LAN), which makes delay-constrained retransmission feasible for loss recovery in a LAN environment [15]. Delay-constrained retransmission may also be applicable to streaming video, which can tolerate relatively large delay due to a large receiver buffer and relatively long delay for display. As a result, even in a wide area network (WAN), streaming video applications may have sufficient time to recover from lost packets through retransmission and thereby avoid unnecessary degradation in reconstructed video quality.

Fig. 14. An architecture for joint source/channel coding.

In the following, we present various delay-constrained retransmission schemes for unicast (Section III-B.1) and multicast (Section III-B.2), respectively.

B.1 Unicast

Based on who determines whether to send and/or respond to a retransmission request, we design three delay-constrained retransmission mechanisms for unicast, namely, receiver-based, sender-based, and hybrid control.

Receiver-based control. The objective of the receiver-based control is to minimize the requests for retransmissions that will not arrive in time for display. Under the receiver-based control, the receiver executes the following algorithm.

When the receiver detects the loss of packet $N$:
    if ($T_c + RTT + D_s < T_d(N)$)
        send the request for retransmission of packet $N$ to the sender;

where $T_c$ is the current time, $RTT$ is an estimated round-trip time, $D_s$ is a slack term, and $T_d(N)$ is the time when packet $N$ is scheduled for display. The slack term $D_s$ could include tolerance of error in estimating $RTT$, the sender's response time to a request, and/or the receiver's processing delay (e.g., decoding). If $T_c + RTT + D_s < T_d(N)$ holds, it is expected that the retransmitted packet will arrive in time for display. The timing diagram for receiver-based control is shown in Fig. 15.
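The receiver-side test translates directly into code. The following is a minimal sketch; the function name and the example timing values (in seconds) are hypothetical, and estimating the round-trip time is left outside the example.

```python
def should_request_retransmission(t_c, rtt, d_s, t_d):
    """Receiver-based control: request packet N only if the retransmission
    is expected to arrive before its scheduled display time.

    t_c: current time at the receiver
    rtt: estimated round-trip time
    d_s: slack term (estimation error, sender response, decoding delay)
    t_d: scheduled display time T_d(N) of the lost packet
    """
    return t_c + rtt + d_s < t_d


# Packet due for display at t = 1.20 s; loss detected at t = 1.00 s.
print(should_request_retransmission(1.00, 0.08, 0.02, 1.20))  # short RTT
print(should_request_retransmission(1.00, 0.25, 0.02, 1.20))  # long RTT
```

With the short round-trip time the request is worth sending; with the long one the retransmitted packet would miss its display time, so the request is suppressed.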

Fig. 15. Timing diagram for receiver-based control.

Sender-based control. The objective of the sender-based control is to suppress retransmission of packets that will miss their display time at the receiver. Under the sender-based control, the sender executes the following algorithm.

When the sender receives a request for retransmission of packet $N$:
    if ($T_c + T_o + D_s < T'_d(N)$)
        retransmit packet $N$ to the receiver

where $T_o$ is the estimated one-way trip time (from the sender to the receiver), and $T'_d(N)$ is an estimate of $T_d(N)$. To obtain $T'_d(N)$, the receiver has to feed back $T_d(N)$ to the sender. Then, based on the difference between the sender's system time and the receiver's system time, the sender can derive $T'_d(N)$. The slack term $D_s$ may include error terms in estimating $T_o$ and $T'_d(N)$, as well as tolerance in the receiver's processing delay (e.g., decoding). If $T_c + T_o + D_s < T'_d(N)$ holds, it can be expected that the retransmitted packet will reach the receiver in time for display. The timing diagram for sender-based control is shown in Fig. 16.
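The sender-side decision can be sketched in the same way. The example assumes, hypothetically, that the clock difference is available as a single offset (receiver clock minus sender clock); how that offset is measured is not specified here, and the function names and timing values are illustrative only.

```python
def estimate_display_time(t_d_receiver, clock_offset):
    """Convert the receiver-reported T_d(N) into the sender's clock.

    clock_offset: receiver system time minus sender system time (assumed known).
    """
    return t_d_receiver - clock_offset


def should_retransmit(t_c, t_o, d_s, t_d_est):
    """Sender-based control: retransmit packet N only if it is expected to
    reach the receiver before the estimated display time T'_d(N).

    t_c: current time at the sender
    t_o: estimated one-way trip time from sender to receiver
    d_s: slack term
    t_d_est: T'_d(N), the display time expressed in the sender's clock
    """
    return t_c + t_o + d_s < t_d_est


t_d_est = estimate_display_time(t_d_receiver=105.30, clock_offset=100.0)
print(should_retransmit(5.00, 0.10, 0.05, t_d_est))  # request arrived early
print(should_retransmit(5.20, 0.10, 0.05, t_d_est))  # request arrived late
```

Only the one-way trip time matters here, because by the time the sender evaluates the condition the request has already traveled from the receiver to the sender.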

Fig. 16. Timing diagram for sender-based control.

Hybrid control. The objective of the hybrid control is to minimize the requests for retransmissions that will not arrive in time for display, and to suppress retransmission of the packets that will miss their display time at the receiver. The hybrid control is a simple combination of the sender-based control and the receiver-based control. Specifically, the receiver makes decisions on whether to send retransmission requests while the sender makes decisions on whether to retransmit the requested packets. The hybrid control comes at the cost of higher complexity.

B.2 Multicast

In the multicast case, retransmission has to be restricted within closely located multicast members. This is because one-way trip times between these members tend to be small, making retransmissions effective in timely recovery. In addition, feedback implosion of retransmission requests is a problem that must be addressed under the retransmission-based approach. Thus, methods are required to limit the number or scope of retransmission requests.

Typically, a logical tree is configured to limit the number/scope of retransmission requests and to achieve local recovery among closely located multicast members [29], [32], [64]. The logical tree can be constructed by statically assigning Designated Receivers (DRs) at each level of the tree to help with retransmission of lost packets [29]. Or it can be dynamically constructed through the protocol used in STructure-Oriented Resilient Multicast (STORM) [64]. By adapting the tree structure to changing network traffic conditions and group memberships, the system could achieve a higher probability of receiving retransmissions in time.

Similar to the receiver-based control for unicast, receivers in a multicast group can make decisions on whether to send retransmission requests. By suppressing the requests for retransmission of those packets that cannot be recovered in time, bandwidth efficiency can be improved [29]. Besides, using a receiving buffer with appropriate size could not only absorb the jitter but also increase the likelihood of receiving retransmitted packets before their display time [29].

To address the heterogeneity problem, a receiver-initiated mechanism for error recovery can be adopted, as done in STORM [64]. Under this mechanism, each receiver can dynamically select the best possible DR to achieve a good trade-off between desired latency and the degree of reliability.

C. Error-resilience: A Compression Approach

Error-resilient schemes address loss recovery from the compression perspective. Specifically, they attempt to prevent error propagation or limit the scope of the damage (caused by packet losses) on the compression layer. The standardized error-resilient tools include re-synchronization marking, data partitioning, and data recovery (e.g., reversible variable length codes (RVLC)) [24], [49]. However, re-synchronization marking, data partitioning, and data recovery are targeted at error-prone environments like wireless channels and may not be applicable to the Internet. For Internet video, the boundary of a packet already provides a synchronization point in the variable-length coded bit-stream at the receiver side. On the other hand, since a packet loss may cause the loss of all the motion data and its associated shape/texture data, mechanisms such as re-synchronization marking, data partitioning, and data recovery may not be useful for Internet video applications. Therefore, we do not intend to present the standardized error-resilient tools. Instead, we present two techniques which are promising for robust Internet video transmission, namely, optimal mode selection and multiple description coding.

Fig. 17. Illustration of optimal mode selection.

C.1 Optimal Mode Selection

In many video coding schemes, a block, which is a video unit, is coded by reference to a previously coded block so that only the difference between the two blocks needs to be coded, resulting in high coding efficiency. This is called inter mode. Constantly referring to previously coded blocks has the danger of error propagation. By occasionally turning off this inter mode, error propagation can be limited. But it will be more costly in bits to code a block all by itself, without any reference to a previously coded block. Such a coding mode is called intra mode. Intra-coding can effectively stop error propagation at the cost of compression efficiency while inter-coding can achieve compression efficiency at the risk of error propagation. Therefore, there is a trade-off in selecting a coding mode for each block (see Fig. 17). How to optimally make these choices is the subject of many research investigations [12], [62], [66].

For video communication over a network, a block-based coding algorithm such as H.263 or MPEG-4 [24] usually employs rate control to match the output rate to the available network bandwidth. The objective of rate-controlled compression algorithms is to maximize the video quality under the constraint of a given bit budget. This can be achieved by choosing a mode that minimizes the quantization distortion between the original block and the reconstructed one under a given bit budget [36], [46], which is the so-called R-D optimized mode selection. We refer to such R-D optimized mode selection as the classical approach. The classical approach cannot achieve global optimality under the error-prone environment since it does not consider the network congestion status and the receiver behavior.

Fig. 18. Factors that have impact on the video presentation quality: source behavior, path characteristics, and receiver behavior.

To address this problem, an end-to-end approach to R-D optimized mode selection was proposed [62]. Under the end-to-end approach, three factors were identified to have impact on the video presentation quality at the receiver: (1) the source behavior, e.g., quantization and packetization, (2) the path characteristics, and (3) the receiver behavior, e.g., error concealment (see Fig. 18). Based on the characterizations, a theory [62] for globally optimal mode selection was developed. By taking into consideration the network congestion status and the receiver behavior, the end-to-end approach is shown to be capable of offering superior performance over the classical approach for Internet video applications [62].

C.2 Multiple Description Coding

Multiple description coding (MDC) is another way to achieve a trade-off between compression efficiency and robustness to packet loss [59]. With MDC, a raw video sequence is compressed into multiple streams (referred to as descriptions). Each description provides acceptable visual quality; more combined descriptions provide a better visual quality. The advantages of MDC are: (1) robustness to loss: even if a receiver gets only one description (other descriptions being lost), it can still reconstruct video with acceptable quality; and (2) enhanced quality: if a receiver gets multiple descriptions, it can combine them together to produce a better reconstruction than that produced from any single description.

However, the advantages do not come for free. To make each description provide acceptable visual quality, each description must carry sufficient information about the original video. This will reduce the compression efficiency compared to conventional single description coding (SDC). In addition, although more combined descriptions provide a better visual quality, a certain degree of correlation between the multiple descriptions has to be embedded in each description, resulting in further reduction of the compression efficiency. Current research effort is to find a good trade-off between the compression efficiency and the reconstruction quality from a single description.

re-D. ErrorConcealment: A CompressionApproach

When packet loss is detected, the receiver can employ error concealment to conceal the lost data and make the presentation more pleasing to human eyes. Since human eyes can tolerate a certain degree of distortion in video signals, error concealment is a viable technique to handle packet loss [60].

There are two basic approaches for error concealment, namely, spatial and temporal interpolation. In spatial interpolation, missing pixel values are reconstructed using neighboring spatial information, whereas in temporal interpolation, the lost data is reconstructed from data in the previous frames. Typically, spatial interpolation is used to reconstruct missing data in intra-coded frames while temporal interpolation is used to reconstruct missing data in inter-coded frames.
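The difference between the two approaches can be sketched on a toy frame stored as a list of pixel rows. This is only a minimal illustration under assumed conventions: `lost` is a set of `(row, col)` positions, spatial interpolation is approximated by the integer mean of the received 4-neighbours, and temporal interpolation copies the co-located pixel from the previous frame. None of these choices comes from the schemes surveyed in [60].

```python
def conceal_spatial(frame, lost):
    """Fill each lost pixel with the mean of its received 4-neighbours."""
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for (r, c) in lost:
        neighbours = [
            frame[rr][cc]
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if 0 <= rr < h and 0 <= cc < w and (rr, cc) not in lost
        ]
        if neighbours:  # leave the pixel unchanged if no neighbour was received
            out[r][c] = sum(neighbours) // len(neighbours)
    return out


def conceal_temporal(frame, prev_frame, lost):
    """Fill each lost pixel with the co-located pixel of the previous frame."""
    out = [row[:] for row in frame]
    for (r, c) in lost:
        out[r][c] = prev_frame[r][c]
    return out


current = [[10, 20, 30], [40, 0, 60], [70, 80, 90]]
previous = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
lost = {(1, 1)}
spatial = conceal_spatial(current, lost)
temporal = conceal_temporal(current, previous, lost)
```

Spatial concealment needs only the current frame, which is why it suits intra-coded frames; temporal concealment relies on a previously decoded frame, which inter-coded frames already reference.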

In recent years, numerous error-concealment schemes have been proposed in the literature (refer to [60] for a good survey). Examples include maximally smooth recovery [58], projection onto convex sets [47], and various motion vector and coding mode recovery methods such as motion compensated temporal prediction [20]. However, most error concealment techniques discussed in [60] are only applicable to either ATM or wireless environments, and require substantial additional computational complexity, which is acceptable for decoding still images but not tolerable in decoding real-time video. Therefore, we only describe simple error concealment schemes that are applicable to Internet video communication.

We describe three simple error concealment (EC) schemes as follows.

EC-1: The receiver replaces the whole frame (where some blocks are corrupted due to packet loss) with the previous reconstructed frame.

EC-2: The receiver replaces a corrupted block with the block at the same location from the previous frame.

EC-3: The receiver replaces the corrupted block with the block from the previous frame pointed to by a motion vector. The motion vector is copied from its neighboring block when available; otherwise the motion vector is set to zero.

EC-1 and EC-2 are special cases of EC-3. If the motion vector of the corrupted block is available, EC-3 can achieve better performance than EC-1 and EC-2, while EC-1 and EC-2 have lower complexity than EC-3.
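The three schemes can be sketched as follows. This is a minimal illustration under assumptions that are not in the text: frames are lists of pixel rows, blocks are `size` x `size` squares indexed by `(block_row, block_col)`, the neighbour search order in `ec3` (block above, then block to the left) is arbitrary, and displaced positions are clamped to the frame boundary. All function and parameter names are hypothetical.

```python
def copy_block(dst, src, br, bc, size, mv=(0, 0)):
    """Copy one size x size block into dst from src, displaced by mv = (dy, dx)."""
    h, w = len(src), len(src[0])
    for y in range(br * size, (br + 1) * size):
        for x in range(bc * size, (bc + 1) * size):
            sy = min(max(y + mv[0], 0), h - 1)  # clamp to frame (assumption)
            sx = min(max(x + mv[1], 0), w - 1)
            dst[y][x] = src[sy][sx]


def ec1(cur, prev, corrupted):
    """EC-1: if any block is corrupted, repeat the whole previous frame."""
    return [row[:] for row in (prev if corrupted else cur)]


def ec2(cur, prev, corrupted, size):
    """EC-2: replace each corrupted block with the co-located previous block."""
    out = [row[:] for row in cur]
    for br, bc in corrupted:
        copy_block(out, prev, br, bc, size)
    return out


def ec3(cur, prev, corrupted, size, motion_vectors):
    """EC-3: replace each corrupted block with a motion-compensated block.

    motion_vectors maps (block_row, block_col) -> (dy, dx) for received blocks.
    The vector is borrowed from the block above or to the left (an assumed
    search order); if neither is available it is set to zero.
    """
    out = [row[:] for row in cur]
    for br, bc in corrupted:
        mv = (0, 0)
        for nb in ((br - 1, bc), (br, bc - 1)):
            if nb in motion_vectors and nb not in corrupted:
                mv = motion_vectors[nb]
                break
        copy_block(out, prev, br, bc, size, mv)
    return out
```

In this sketch `ec3` with no available neighbouring motion vector produces exactly the output of `ec2`, which reflects the zero-vector fallback described for EC-3.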

IV. Summary

Transporting video over the Internet is an important component of many multimedia applications. Lack of QoS support in the current Internet and the heterogeneity of the networks and end-systems pose many challenging problems for designing video delivery systems. In this paper, we identified four problems for video delivery systems: bandwidth, delay, loss, and heterogeneity. There are two general approaches that address these problems: the network-centric approach and the end system-based approach. We

Figure

Fig. 1. Internet television uses multicast (point-to-multipoint communication) instead of unicast (point-to-point communication) to deliver
Fig. 3. A layered architecture for transporting real-time video.
Fig. 5. Source rate behavior under the AIMD rate control.
Fig. 7. Layered video encoding/decoding. D denotes the decoder.