
3.3 Conventional Distributed Video Coding

3.3.4 Comparison of DVC Architectures

The similarities and differences between these three DVC systems are summarized as follows.

(i) Frame Classification

In both the Stanford system and DISCOVER, the input video sequence is divided into WZ frames and key frames. In PRISM, no frame classification is performed; all video frames are treated alike.

(ii) Spatial Transformation

In all three architectures, block-based DCT is used. In the Stanford system and DISCOVER, only the WZ frames are transformed. The transform coefficients of each frame are grouped into bands according to their position within the block, that is, their frequency.

In the Stanford and DISCOVER codecs, after turbo/LDPC decoding, an inverse DCT is performed to decode the WZ frames. In the PRISM codec, a block is reconstructed from the corresponding SI and the quantized bit stream.

(iii) Quantization

In the Stanford system and DISCOVER, each DCT band is uniformly quantized with a number of levels that depends on the target quality or on the DCT coefficients. For a given band, the bits of the quantized symbols are grouped together to form bit-planes, which are then independently turbo encoded or LDPC encoded. In the PRISM architecture, a scalar quantizer is used.
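The band quantization and bit-plane grouping described above can be sketched as follows. This is a minimal illustration, not the quantizer of either codec: the function names, the symmetric range `max_abs`, and the sample band values are all assumptions made for the example.

```python
def quantize_band(coeffs, num_levels, max_abs):
    """Uniformly quantize values in [-max_abs, max_abs) to indices 0..num_levels-1.

    Assumption: one symmetric range per band; real codecs may treat DC and AC
    bands differently.
    """
    step = 2.0 * max_abs / num_levels
    indices = []
    for c in coeffs:
        q = int((c + max_abs) // step)
        indices.append(min(max(q, 0), num_levels - 1))  # clamp out-of-range values
    return indices


def bit_planes(indices, num_bits):
    """Group the bits of the quantized symbols into bit-planes, MSB first."""
    return [[(q >> b) & 1 for q in indices] for b in range(num_bits - 1, -1, -1)]


band = [-7.5, -1.0, 0.0, 3.2, 7.9]  # hypothetical DCT band values
symbols = quantize_band(band, num_levels=4, max_abs=8.0)
planes = bit_planes(symbols, num_bits=2)
print(symbols)  # [0, 1, 2, 2, 3]
print(planes)   # [[0, 0, 1, 1, 1], [0, 1, 0, 0, 1]]
```

Each inner list in `planes` is one bit-plane, which would be passed on its own to the turbo or LDPC encoder.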

(iv) Block Classification

This is only done in PRISM, since the other two are frame-based codecs.

(v) Turbo/LDPC Coding

Only turbo encoding is used in the Stanford system, while DISCOVER makes use of both turbo and LDPC encoding for the coefficient bit-planes. The turbo/LDPC decoder receives successive chunks of parity bits, which it requests over the feedback channel. To decide whether more bits are needed for successful decoding, the decoder uses a simple request stopping criterion, which checks that all turbo/LDPC parity-check equations are satisfied for the decoded codeword. In DISCOVER, an additional CRC check is performed to ensure good reconstruction quality.
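The request stopping criterion can be illustrated with a small sketch. It assumes a binary parity-check matrix `H` and a hypothetical callback `try_decode(n)` that returns the decoder's hard decision after `n` chunks of parity bits. The (7,4) Hamming matrix below is only a toy stand-in for a turbo or LDPC code.

```python
def parity_checks_satisfied(H, codeword):
    """Return True if every parity-check equation H * c = 0 (mod 2) holds."""
    return all(sum(h * c for h, c in zip(row, codeword)) % 2 == 0 for row in H)


def decode_with_feedback(H, try_decode, max_requests):
    """Request one more chunk of parity bits until all parity checks pass.

    try_decode is a hypothetical stand-in for the iterative decoder.
    Returns (codeword, number of requests), or (None, max_requests) on failure.
    """
    for n in range(1, max_requests + 1):
        candidate = try_decode(n)
        if parity_checks_satisfied(H, candidate):
            return candidate, n
    return None, max_requests


# Toy stand-in: parity-check matrix of the (7,4) Hamming code.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
# Made-up decoder outputs that improve as more parity bits arrive.
estimates = {
    1: [1, 1, 0, 0, 0, 0, 1],
    2: [1, 1, 1, 0, 0, 0, 1],
    3: [1, 1, 1, 0, 0, 0, 0],
}
decoded, requests = decode_with_feedback(H, estimates.get, max_requests=3)
print(decoded, requests)  # [1, 1, 1, 0, 0, 0, 0] 3
```

A codeword can satisfy every parity check and still be the wrong one, which is consistent with DISCOVER adding a CRC check on top of this criterion.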

(vi) Syndrome Coding and Hash Generation

This is performed in the PRISM codec only. For the syndrome class, only the least significant bits of the quantized DCT coefficients are syndrome encoded. In addition, for each block, the encoder sends a 16-bit cyclic redundancy check (CRC) checksum as a signature of the quantized DCT coefficients. This signature is needed to select the best candidate block (SI) at the decoder. Each candidate block is used for syndrome decoding, and a hash signature is generated for each decoded candidate. Decoding is declared successful when the generated hash signature matches the CRC received from the encoder.
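The hash comparison at the decoder can be sketched as below. The text does not say which 16-bit CRC polynomial PRISM uses, so the CRC-CCITT variant from Python's `binascii.crc_hqx` is an assumption. The function names, the one-byte packing of each coefficient, and the sample blocks are also illustrative only.

```python
import binascii


def crc16(symbols):
    """16-bit CRC over a block's quantized coefficients.

    Assumption: each coefficient fits in one byte and CRC-CCITT is acceptable.
    """
    return binascii.crc_hqx(bytes(s & 0xFF for s in symbols), 0)


def select_candidate(decoded_candidates, received_crc):
    """Return the first decoded candidate whose hash matches the encoder's CRC."""
    for block in decoded_candidates:
        if crc16(block) == received_crc:
            return block
    return None  # no candidate matched: decoding failed for this block


true_block = [12, 3, 0, 1]          # hypothetical quantized coefficients
received_crc = crc16(true_block)    # signature sent by the encoder
candidates = [[12, 3, 1, 1], [11, 3, 0, 1], [12, 3, 0, 1]]
best = select_candidate(candidates, received_crc)
```

In this sketch `best` is the third candidate, the only one whose signature matches the one sent by the encoder.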

(vii) Side Information Creation

This is an important step in DVC decoding. For both the Stanford and DISCOVER codecs, SI is created from previously decoded key frames using motion-compensated frame interpolation. The SI is an estimate of the WZ frame: the better the estimate, the fewer parity bits are needed for correction. In PRISM, motion estimation is performed on a reference frame by positioning a window around the center of the block to be decoded.

(viii) Correlation Noise Modelling

The correlation statistics between the side information and the WZ frames are modelled by a Laplacian distribution. This modelling is needed in both the Stanford system and DISCOVER. PRISM does not require this step.
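For reference, the Laplacian model can be written in its standard zero-mean form. The symbols used here are assumptions introduced for illustration and do not appear in the text: $d$ is the difference between a WZ coefficient and the corresponding SI coefficient, $\alpha$ is the Laplacian parameter, and $\sigma^2$ is the variance of $d$.

```latex
f(d) = \frac{\alpha}{2}\, e^{-\alpha \lvert d \rvert},
\qquad
\alpha = \sqrt{\frac{2}{\sigma^{2}}}
```

A larger $\alpha$ corresponds to a smaller variance, that is, to side information that is closer to the WZ frame.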

3.4 Summary

In this chapter, a review of CS-based image and video coding is presented. CS image coding schemes are classified into categories, and the key points of each category are discussed. A similar classification of CS video coding schemes is also discussed. Finally, the differences between the work done in this thesis and the available CS image/video literature are discussed.

Chapter 4 Sensing Matrix, Quantization Matrix and Reconstruction Algorithms for Image Compression

In a conventional lossy image compression system, an invertible transform is applied to the image which provides its expansion in terms of transform coefficients. Typically most of the energy of the signal is concentrated in a relatively small subset of the transform coefficients. Consequently, when quantization is then applied to the coefficients, a significant number of quantized coefficients will be zero and therefore need not be encoded. After quantization, a lossless compression process called “entropy coding” encodes the data into a bit stream for storage or transmission. Decompression is performed by inverse quantization followed by inverse transformation. This process is used in JPEG [1]. The choice of transformation and the design of the quantization matrix are important factors in the performance of the compression system.
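The transform, quantize, and reconstruct chain described above can be illustrated on a one-dimensional signal. This is a minimal sketch: an 8-point orthonormal DCT and a single uniform step size stand in for the block DCT and quantization matrix of a real codec such as JPEG, entropy coding is omitted, and the sample values are made up.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    return [
        math.sqrt((1 if k == 0 else 2) / n)
        * sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i, v in enumerate(x))
        for k in range(n)
    ]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(
            math.sqrt((1 if k == 0 else 2) / n)
            * v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k, v in enumerate(c)
        )
        for i in range(n)
    ]


x = [100, 102, 104, 106, 108, 110, 112, 114]  # smooth, hypothetical pixel row
step = 16.0                                   # assumed uniform step size

coeffs = dct(x)
quantized = [round(c / step) for c in coeffs]   # many of these become zero
zeros = sum(1 for q in quantized if q == 0)

x_hat = idct([q * step for q in quantized])     # inverse quantization + inverse DCT
max_error = max(abs(a - b) for a, b in zip(x, x_hat))
```

For a smooth input such as this one, most of the energy lies in the first few coefficients, so most entries of `quantized` are zero and need not be encoded.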

For a system based on compressed sensing, the process is somewhat different. Instead of applying a transform to the image, a set of linear measurements is obtained through a sensing matrix. The number of measurements is typically much smaller than the number of pixels in the original image. Figure 4.1 illustrates this process in block diagram form. Here the measurement vector y is obtained by applying a sensing matrix Φ to an image x with a total of N pixels. Φ is an m × N matrix, so the length of the measurement vector y = Φx is m. The CS measurements are then quantized and entropy encoded. At the decoder, inverse quantization is followed by a CS recovery process to reconstruct the image. In this case, the performance of the compression system is determined by the number of measurements, the sensing matrix, the quantization matrix, and the CS reconstruction algorithm.

[Block diagram. Encoder: Image → CS Encoding → Quantization. Decoder: Inverse Quantization → CS Decoding (using a sparsity transform / TV) → Recovered Image.]

Figure 4.1: CS Image Compression
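The encoder side of this pipeline, y = Φx followed by quantization, can be sketched as follows. Several choices here are assumptions made for illustration rather than the design studied in this chapter: a random Gaussian sensing matrix, a single scalar step size in place of a quantization matrix, and a made-up 16-pixel image. Entropy coding and CS recovery are omitted.

```python
import random


def gaussian_sensing_matrix(m, n, seed=0):
    """m x n matrix with i.i.d. Gaussian entries, scaled by 1/sqrt(m)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Compute the measurement vector y = phi * x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def quantize(y, step):
    """Uniform scalar quantization of the measurements to integer indices."""
    return [round(v / step) for v in y]


def dequantize(q, step):
    """Inverse quantization, applied at the decoder before CS recovery."""
    return [k * step for k in q]


N, m = 16, 6                                  # N pixels, m measurements (m < N)
x = [float((3 * i) % 7) for i in range(N)]    # hypothetical vectorized image
phi = gaussian_sensing_matrix(m, N)
y = measure(phi, x)

step = 0.5
q = quantize(y, step)
y_hat = dequantize(q, step)                   # input to the CS reconstruction
```

The decoder works only from `y_hat`, so both the step size and the number of measurements m limit the achievable reconstruction quality.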

In this chapter, the effects of the choice of sensing and quantization matrices and of the CS reconstruction algorithm are studied in a non-distributed image compression setting. The efficacy of several different sensing matrices is evaluated in terms of encoding complexity and ease of implementation. A quantization matrix is designed and its performance is evaluated. Finally, several different CS reconstruction algorithms are compared in terms of reconstruction time and reconstruction quality. The results obtained in this chapter are then applied to distributed image and video coding in subsequent chapters.

In document Distributed image and video coding based on compressed sensing : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Engineering at Massey University, New Zealand (Page 69-74)