Motivation for Multiple Description Coding 7

Chapter 1 : Introduction 1

1.5 Motivation for Multiple Description Coding 7

Presently, image and video compression are well developed and widely used in the field of signal processing. Modern state of the art coders provide better compression with better quality. This is particularly true for video, as a video sequence possesses correlation in both spatial and temporal domains. The most efficient video compression schemes use motion estimation as we can see in the later part of this thesis and compensation algorithms that exploit prediction to remove the correlation in temporal domain [7]. The correlation in spatial domain is usually removed by methods similar to those of image compression.

A new field in signal processing is representation of scenes in 3D. Recently, it has drawn significant attention of industry, research community and end users. 3D video scenes may be captured by stereoscopic or multi-view camera settings. The captured multi-view video can be compressed directly or converted to more representations and then compressed.

In any case, the efficiently compressed video data has to be transmitted over communication channels, such as wireless channels. This raises the issue of error protection, since most of these channels are error prone. Compressed videos/images and especially video sequences are

vulnerable to transmission errors. If the error occurs in a video frame, it may propagate further into subsequent frames because of motion-compensated prediction. Multiview video compression methods utilize comprehensive temporal and inter-view prediction and therefore channel errors occurring in one view can propagate not only to the subsequent frames of the same view but also to the other views.

Common approach to error protection is to consider it as a pure channel problem, separate from the source compression problem. This approach is based on Shannon theory, which states that in principle the source and channel coding tasks can be carried out independently with no loss of efficiency. Following this approach, raw source video sequences are processed in a way which reduces the data rate as much as possible. Reliable transmission of the bit stream to the receiver is provided by a channel coder. The transport mechanism has to be as reliable as possible since a single error in the compressed bitstream might severely damage the reconstructed signal. According to the noisy channel coding theorem (establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data i.e digital information nearly error-free up to a computable maximum rate through the channel), a near- optimum transmission may be achieved if the data rate does not exceed the channel capacity. However, this can be hardly achieved in practice.

Widely used error-protection methods operate at the data link layer of the OSI, reference model are Selective Repeat (SR), Stop and Wait (S &W) and go-back-N (GBN) [8]. Therefore, error- free transmission is achieved by retransmitting packets that have been lost or corrupted. A problem with such a mechanism is that it causes delays and thus requires larger memory buffers. The latency is at least a packet round trip delay time. A second problem arises when packet losses are caused by network congestions. Trying to retransmit lost packets generates extra data traffic and makes the network even more congested for. Furthermore, retransmissions are virtually impossible in digital broadcasting. During broadcasting, loss of even a single packet may cause the transmitter to receive multiple retransmission requests, an effect called a feedback retransmission such as the Go-Back-N (GBN), Automatic Repeat Request etc.

This problem of congestion caused by retransmission is solve at the transport layer of OSI reference, as TCP backs off before retransmission starts. Congestion Avoidance (CA) is the mechanism designed in TCP to deal with lost packets. When TCP connection experiences packet loss, it then switches to CA, where the CA algorithm increments the size of the congestion window much more slowly than in low-start making the congestion window grows linearly than exponentially.

Another approach for reliable transmission over error-prone channels is so called forward error correction (FEC). The compressed bitstream data is distributed between packets and protected by block channel codes by adding extra bits. The data from the lost packets can be reconstructed from the received packets. The choice of the block code length is important. In terms of efficiency, long blocks are preferred since short blocks generate bitstreams with a relatively large number of additional symbols. The law of large numbers also dictates the choice of longer blocks since the number of errors is predicted more easily by long sequences. Just as with the previous approach based on retransmissions, a problem of delays and large memory buffers exists, this time caused by long blocks.

The decoders using the two above-mentioned approaches i.e., ARQ and FEC tolerate no errors. They assume that all the data transmitted is correctly received. Consequently, they spend a considerable amount of resources to guarantee this. Pure channel coding approaches might not be quite feasible in such cases. As an alternative, one can tolerate channel related losses. Assuming that not all data sent has reached the decoder, one can concentrate on ensuring efficient decoding of the correctly received data. In this case, one needs to change the source coding accordingly and, more broadly, to consider the error protection problem as a joint source-channel problem. As the result multiple description coding (MDC) is an attractive coding approach as it provides a reliable and error resilience source reconstruction using only part of the data sent to the decoder and employs no priority network transmission mechanisms [9]. The MDC is especially advantageous in short-delay media streaming scenarios such as video conferencing and when broadcast over error-prone channels where it provides acceptable reconstruction quality and prevents feedback impulsion in case of packet loss [10][11].

In document 3D multiple description coding for error resilience over wireless networks (Page 32-35)