Iterative Joint Video and Channel Decoding in a Trellis Based Vector Quantized Video Codec and Trellis Coded Modulation Aided Wireless Videophone

(1)

VTC 2006 Fall

Dallas

R. G. Maunder, J. Kliewer, S. X. Ng, J. Wang, L-L. Yang, L. Hanzo

School of ECS

(2)

Outline

❏ Introduction

❏ System Overview

❏ Trellis-Based Vector Quantization

❏ Results

(3)

Introduction

Shannonian Video and Channel Coding

❏ Video and channel coding may be performed independently without penalty.

❏ However, this requires a number of conditions to be met, including:

  ❏ infinite complexity and

  ❏ infinite latency.

❏ Hence, Shannonian video and channel coding is not practical.

(4)

Introduction

Joint Video and Channel Coding

❏ As in the Shannonian approach, the video codec achieves compression.

❏ However, like the channel codec, the video codec also has a Forward Error Correction (FEC) capability.

❏ In the receiver, video and channel decoding are performed jointly.

❏ We employ:

  ❏ a frame differencing and trellis-based Vector Quantization (VQ) video codec and

(5)

System Overview

Transmitter

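[Figure: transmitter schematic. Blocks: frame differencer, frame difference decomposer ($M$ sub-frames), $M$ VQ encoders, frame difference recomposer, reconstruction buffer, bit concatenator, bit interleaver, TCM encoder. Signals: $f_n$, $\hat{f}_{n-1}$, $\hat{f}_n$, $e$, $\hat{e}$, $e^1 \ldots e^M$, $\hat{e}^1 \ldots \hat{e}^M$, $u^1 \ldots u^M$, $u$.]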

(7)

System Overview

Receiver

❏ The TCM decoder and the VQ decoders iteratively exchange mutually extrinsic Log Likelihood Ratio (LLR) soft information.

❏ This extrinsic information is exploited as a priori information during decoding.

❏ The a priori information is subtracted from the resultant a posteriori information to obtain more reliable extrinsic information for use in the next iteration.
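The subtraction in the last bullet can be sketched in a few lines. This is only an illustration: the LLR values are hypothetical, the helper name `extrinsic` is not from the presentation, and the interleaving between the two decoders is omitted.

```python
def extrinsic(l_posteriori, l_apriori):
    """Extrinsic LLRs: a posteriori LLRs minus the a priori LLRs that were fed in."""
    return [lp - la for lp, la in zip(l_posteriori, l_apriori)]

# Hypothetical LLR values for four bits of u (not taken from the presentation).
l_a1 = [0.5, -1.0, 0.0, 2.0]    # a priori input of the TCM decoder
l_p1 = [1.5, -3.0, 0.75, 2.5]   # a posteriori output of the TCM decoder
l_e1 = extrinsic(l_p1, l_a1)    # would become the a priori input of the VQ decoders
print(l_e1)  # [1.0, -2.0, 0.75, 0.5]
```

Removing the a priori contribution keeps each decoder from being fed back information that it produced itself in an earlier iteration.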

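[Figure: receiver schematic. Blocks: (1) TCM decoder, fed by the received signal $y$; LLR deinterleaver; LLR interleaver; LLR partitioner; (2) $M$ VQ decoders; LLR concatenator; frame difference recomposer ($M$ sub-frames). Signals: $L^2_a(u^1) \ldots L^2_a(u^M)$, $L^2_p(u^1) \ldots L^2_p(u^M)$, $\tilde{e}^1 \ldots \tilde{e}^M$, $\tilde{e}$, $\tilde{f}_{n-1}$, $\tilde{f}_n$. Labelled relations:

$L^1_a(u) = L^2_e(u)$

$L^1_p(u) = L^2_e(u) + L^1_e(u)$

$L^2_a(u) = L^1_e(u)$

$L^2_p(u) = L^1_e(u) + L^2_e(u)$]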

(9)

Trellis-Based Vector Quantization

Frame Difference Decomposition

❏ The Frame Difference (FD) $e$ is decomposed into $M$ sub-frames $[e^1 \ldots e^M]$ to reduce the complexity of trellis-based VQ.

❏ Each FD sub-frame $e^m$, $m \in [1 \ldots M]$, comprises $J$ video blocks $[e^m_1 \ldots e^m_J]$, obtained by choosing one Macro-Block (MB) from each MB group.

❏ This ensures that the sub-frames have similar characteristics.

❏ For example:

[Figure: decomposition of the FD $e$ into sub-frames. Annotations: macro-block grouping boundaries, each macro-block group comprising $M = 33$ macro-blocks; video block, comprising $(8 \times 8)$ pixels; macro-block, comprising $J^{MB} = 4$ video blocks; FD sub-frame $e^m$, comprising $J = 12$ video blocks $e^m_1 \ldots e^m_{12}$; FD $e$, comprising $M \cdot J = 396$ video blocks.]
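A minimal sketch of this decomposition for a QCIF frame is given below. The figure's actual grouping boundaries are not recoverable from the text, so the raster-order grouping used here (three groups of 33 consecutive macro-blocks) is an assumption, as are all function and variable names.

```python
BLOCK = 8                   # video block size in pixels
MB_BLOCKS = 2               # a macro-block is 2 x 2 video blocks (J_MB = 4)
WIDTH, HEIGHT = 176, 144    # QCIF luminance resolution
M = 33                      # macro-blocks per group = number of sub-frames

mb_cols = WIDTH // (BLOCK * MB_BLOCKS)    # 11 macro-block columns
mb_rows = HEIGHT // (BLOCK * MB_BLOCKS)   # 9 macro-block rows
macro_blocks = [(r, c) for r in range(mb_rows) for c in range(mb_cols)]

# ASSUMPTION: groups are consecutive runs of M macro-blocks in raster order.
groups = [macro_blocks[g:g + M] for g in range(0, len(macro_blocks), M)]

def sub_frame_blocks(m):
    """Return the (row, col) video-block coordinates of sub-frame m (0-based)."""
    blocks = []
    for group in groups:                  # one macro-block from each group
        r, c = group[m]
        for dr in range(MB_BLOCKS):
            for dc in range(MB_BLOCKS):
                blocks.append((MB_BLOCKS * r + dr, MB_BLOCKS * c + dc))
    return blocks

print(len(groups), len(sub_frame_blocks(0)))  # 3 12
```

With three groups and four video blocks per macro-block, each sub-frame holds $J = 12$ video blocks, and the 33 sub-frames together cover all 396 video blocks of the frame.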

(11)

Trellis-Based Vector Quantization

Vector Quantization Codebook

❏ The LBG-algorithm designed VQ codebook comprises $K$ VQ tiles $[\mathrm{VQ}^1 \ldots \mathrm{VQ}^K]$ of various dimensions.

❏ This allows the efficient encoding of large areas of low FD activity.

❏ Each VQ tile $\mathrm{VQ}^k$, $k \in [1 \ldots K]$, is represented by a minimum free-distance 2 Reversible Variable Length Code (RVLC) $\mathrm{RVLC}^k$ with an entropy-considered length.

❏ For example:

VQ tile   Index $k$   Video blocks $(J^k_x \times J^k_y) = J^k$   RVLC code   Bits $I^k$
VQ1       1           (2 × 2) = 4                                  0           1
VQ2       2           (1 × 2) = 2                                  11          2
VQ3       3           (1 × 1) = 1                                  101         3
VQ4       4           (1 × 1) = 1                                  1001        4
VQ5       5           (1 × 1) = 1                                  10001       5

Each video block comprises (8 × 8) pixels.
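The properties claimed for the example RVLC can be checked by brute force. The sketch below is an illustration only: it takes the five example codewords 0, 11, 101, 1001 and 10001, treats the free distance as the smallest Hamming distance between distinct equal-length concatenations of codewords, and limits the search to 8 bits. The function names are hypothetical.

```python
from itertools import combinations

CODES = ["0", "11", "101", "1001", "10001"]   # example RVLC codewords

def is_prefix_free(codes):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)

def bit_sequences(n_bits, codes=CODES):
    """All n_bits-long strings that are concatenations of whole codewords."""
    seqs = {0: {""}}
    for n in range(1, n_bits + 1):
        seqs[n] = {s + c for c in codes for s in seqs.get(n - len(c), ())}
    return seqs[n_bits]

def free_distance(max_bits=8):
    """Smallest Hamming distance between distinct equal-length codeword sequences."""
    return min(
        sum(x != y for x, y in zip(a, b))
        for n in range(1, max_bits + 1)
        for a, b in combinations(sorted(bit_sequences(n)), 2)
    )

symmetric = all(c == c[::-1] for c in CODES)  # palindromes decode in both directions
print(symmetric, is_prefix_free(CODES), free_distance())  # True True 2
```

The result also follows without a search: every codeword has an even number of ones, so any two equal-length codeword sequences differ in an even number of bit positions, which is at least 2 when they are distinct.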

(13)

Trellis-Based Vector Quantization

Vector Quantization Code Constraints

❏ Each $J$-block FD sub-frame $e^m$ must be represented by an exact tessellation of VQ tiles from the VQ codebook.

❏ The corresponding transmission sub-frame $u^m$ is formed by concatenating the RVLC codes of these tiles and must comprise exactly $I$ bits $[u^m_1 \ldots u^m_I]$.

❏ These code constraints must be adhered to during VQ encoding and may be exploited to achieve FEC during VQ decoding.

❏ For example:

$u^m = \{1,0,0,0,1 \mid 1,0,0,1 \mid 1,0,1 \mid 1,0,1 \mid 0 \mid 0\}$

Transition $T$   $T_a$   $T_b$   $T_c$   $T_d$   $T_e$   $T_f$
$k^T$            5       4       3       3       1       1
$J^{k^T}$        1       1       1       1       4       4
$I^{k^T}$        5       4       3       3       1       1

[Figure: the corresponding reconstructed FD sub-frame $\hat{e}^m$, shown as a tessellation of VQ tiles over the video blocks $e^m_1 \ldots e^m_{12}$. Annotations: video block, comprising $(8 \times 8)$ pixels; macro-block, comprising $J^{MB} = 4$ video blocks; FD sub-frame, comprising $J = 12$ video blocks.]

Transmission sub-frame, comprising $I = 17$ bits.
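The code constraints can be illustrated with a small validity check. This is a sketch under stated assumptions: it uses the five-tile example codebook, a 17-bit string read from the garbled example on this slide, and a hypothetical function `parse_sub_frame`. It verifies only the bit count and the total block count, not the two-dimensional geometry of the tessellation.

```python
RVLC = {"0": 1, "11": 2, "101": 3, "1001": 4, "10001": 5}   # codeword -> tile index k
BLOCKS = {1: 4, 2: 2, 3: 1, 4: 1, 5: 1}                     # tile index k -> J^k
MAX_LEN = max(len(code) for code in RVLC)

def parse_sub_frame(bits, J, I):
    """Split bits into RVLC codewords; return tile indices, or None if invalid."""
    if len(bits) != I:
        return None                       # wrong transmission sub-frame length
    tiles, word = [], ""
    for b in bits:
        word += b
        if word in RVLC:                  # prefix-free code: first match is the codeword
            tiles.append(RVLC[word])
            word = ""
        elif len(word) >= MAX_LEN:
            return None                   # no codeword starts like this
    if word or sum(BLOCKS[k] for k in tiles) != J:
        return None                       # dangling bits or wrong number of blocks
    return tiles

u_m = "10001100110110100"                 # 17-bit example string (reconstructed reading)
print(parse_sub_frame(u_m, J=12, I=17))   # [5, 4, 3, 3, 1, 1]

# Every single-bit error breaks the constraints, so it is at least detectable.
flips = [u_m[:i] + ("1" if u_m[i] == "0" else "0") + u_m[i + 1:] for i in range(len(u_m))]
print(all(parse_sub_frame(f, J=12, I=17) is None for f in flips))  # True
```

A hard-decision check like this can only detect errors. The soft-decision trellis decoding described in the presentation is what exploits the same constraints for correction.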

(15)

Trellis-Based Vector Quantization

Vector Quantization Trellis Structure

❏ Each transition $T$ in the trellis represents a possible application of the $J^{k^T}$-block VQ tile $\mathrm{VQ}^{k^T}$ and the corresponding $I^{k^T}$-bit RVLC $\mathrm{RVLC}^{k^T}$.

❏ The trellis represents every possible combination of a $J$-block FD sub-frame $\hat{e}^m$ and an $I$-bit transmission sub-frame $u^m$.

❏ For example:

[Figure: VQ trellis for the example. Axes: the FD sub-frame $\hat{e}^m$, comprising $J = 12$ video blocks $\hat{e}^m_1 \ldots \hat{e}^m_{12}$, and the transmission sub-frame bits $u^m_1 \ldots u^m_{17}$. The transitions $T_a \ldots T_f$ are marked.]
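To give a feel for such a trellis, the sketch below counts paths through a simplified version of it. It is an assumption-laden illustration: states are reduced to pairs (blocks covered, bits used), the two-dimensional placement of tiles is ignored, and only the five example tiles are used. The function name `count_paths` is hypothetical.

```python
from functools import lru_cache

# (J^k, I^k) for the five example tiles: blocks covered and RVLC bits emitted.
TILES = [(4, 1), (2, 2), (1, 3), (1, 4), (1, 5)]

def count_paths(J, I):
    """Count tile sequences that cover exactly J blocks using exactly I bits."""
    @lru_cache(maxsize=None)
    def paths(j, i):
        if j == J and i == I:
            return 1                      # reached the final trellis state
        if j >= J or i >= I:
            return 0                      # overshoot or dead end
        return sum(paths(j + dj, i + di) for dj, di in TILES)
    return paths(0, 0)

print(count_paths(12, 3))   # 1: three (2 x 2) tiles, one bit each
print(count_paths(12, 2))   # 0: twelve blocks cannot be covered in two bits
```

Each transition advances both coordinates at once, which is how a single trellis can enforce the block-count and bit-count constraints together.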

(17)

Trellis-Based Vector Quantization

Vector Quantization Encoding and Decoding

❏ VQ encoding performs the Viterbi algorithm on the trellis using a mean squared error distortion metric.

❏ This ensures that the VQ code constraints are adhered to and obtains the optimal Minimum Mean Squared Error (MMSE) encoding.

❏ The trellis is also employed during VQ decoding to guarantee the recovery of valid information and to provide an FEC capability by exploiting the VQ code constraints and the minimum free distance of the RVLCs.

❏ The BCJR algorithm is performed on the basis of the transmission sub-frame a priori soft information $L^2_a(u^m)$.

❏ A posteriori soft information $L^2_p(u^m)$ is obtained by considering the a posteriori probabilities of transitions on vertical cross sections through the trellis.

❏ Similarly, an MMSE soft FD sub-frame reconstruction $\tilde{e}^m$ is obtained by
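The encoding step can be sketched as a Viterbi search over a simplified trellis. This is not the authors' implementation: each video block is reduced to a single number, tiles are one-dimensional, the codebook values are hypothetical, and the summed squared error is used in place of the mean (the two differ only by a constant factor).

```python
# HYPOTHETICAL 1-D codebook: (tile values, RVLC codeword).
CODEBOOK = [
    ((0.0, 0.0, 0.0, 0.0), "0"),
    ((1.0, 1.0), "11"),
    ((1.0,), "101"),
    ((-1.0,), "1001"),
    ((2.0,), "10001"),
]

def vq_encode(e, I):
    """Viterbi search over states (blocks covered, bits used).

    Returns (distortion, bits, reconstruction) for the best path that covers
    all of e with exactly I bits, or None if no such path exists.
    """
    J = len(e)
    best = {(0, 0): (0.0, "", ())}        # survivor path for each reached state
    for j in range(J):                    # every transition increases j, so this
        for i in range(I):                # order visits predecessors first
            if (j, i) not in best:
                continue
            d, bits, rec = best[(j, i)]
            for tile, code in CODEBOOK:
                nj, ni = j + len(tile), i + len(code)
                if nj > J or ni > I:
                    continue              # would violate a code constraint
                nd = d + sum((x - t) ** 2 for x, t in zip(e[j:nj], tile))
                if (nj, ni) not in best or nd < best[(nj, ni)][0]:
                    best[(nj, ni)] = (nd, bits + code, rec + tile)
    return best.get((J, I))

result = vq_encode([0.1, -0.2, 0.0, 0.1, 0.9, 1.2, -1.1, 2.2], I=12)
print(result)
```

Because only paths ending in the state (J, I) are accepted, the returned bit string always satisfies both code constraints, whatever the input.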

(18)

Results

Simulation parameters

❏ 100 frames of the ‘Lab’ QCIF (176 × 144)-pixel, 10 fps video sequence

❏ $M = 33$, $J = 12$, $I = 45$, $K = 512$, 14.85 kbps

❏ 3/4-rate TCM using 16QAM modulation

❏ Bandwidth efficiency of $\eta = 2.00$ bit/s/Hz

❏ $E_b/N_0 = 3.96$ dB at the Rayleigh fading channel capacity limit
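The quoted video bitrate follows directly from these parameters, as the short check below shows. The variable names are chosen for illustration.

```python
M, J, I = 33, 12, 45          # sub-frames per frame, blocks and bits per sub-frame
FPS = 10                      # frames per second

blocks_per_frame = (176 // 8) * (144 // 8)   # (8 x 8)-pixel blocks in a QCIF frame
bits_per_frame = M * I                       # each sub-frame is sent as I bits
bitrate_kbps = bits_per_frame * FPS / 1000

print(blocks_per_frame == M * J)   # True: 396 video blocks
print(bitrate_kbps)                # 14.85
```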

(19)

Results

EXIT chart

❏ Using minimum free-distance 2 RVLCs ensures that VQ decoding can achieve unity extrinsic mutual information and an infinitesimally low decoding error.

❏ An open EXIT chart tunnel exists for $E_b/N_0 > 5.25$ dB, just 1.29 dB from the channel capacity limit.

[Figure: EXIT chart. Horizontal axis: $I_{A1}, I_{E2}$ (0.0 to 1.0); vertical axis: $I_{E1}, I_{A2}$ (0.0 to 1.0). Curves for $E_b/N_0$ = 4, 5.25, 6, 7, 8, 9, 10 and 11 dB; EXIT trajectories for the high latency and low latency cases at $E_b/N_0 = 6$ dB.]
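How an open or closed tunnel shapes the decoding trajectory can be illustrated with toy transfer functions. The curves below are hypothetical stand-ins chosen for simplicity and are not the measured characteristics of the TCM or VQ decoders. Only the 5.25 dB and 3.96 dB figures come from the presentation.

```python
def trajectory(t1, t2, iterations=50):
    """Staircase EXIT trajectory alternating between two transfer functions."""
    points, i_a1 = [], 0.0
    for _ in range(iterations):
        i_e1 = t1(i_a1)                   # decoder 1 output, used as I_A2
        i_e2 = t2(i_e1)                   # decoder 2 output, used as I_A1
        points.append((i_a1, i_e1))       # vertical step of the staircase
        points.append((i_e2, i_e1))       # horizontal step of the staircase
        i_a1 = i_e2
    return points

def toy_inner(offset):
    """HYPOTHETICAL inner-decoder curve: linear, rising from offset to 1."""
    return lambda i_a: offset + (1.0 - offset) * i_a

def toy_outer(i_a):
    """HYPOTHETICAL outer-decoder curve that reaches the (1, 1) point."""
    return i_a ** 2

open_tunnel = trajectory(toy_inner(0.6), toy_outer)[-1]
closed_tunnel = trajectory(toy_inner(0.3), toy_outer)[-1]
print(open_tunnel[0] > 0.99, closed_tunnel[0] < 0.5)   # True True

gap_db = round(5.25 - 3.96, 2)   # tunnel threshold minus capacity limit
print(gap_db)                    # 1.29
```

With the higher offset the curves never cross before (1, 1), so the staircase climbs to the top right corner. With the lower offset they intersect early and the trajectory stalls at the crossing point.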

(21)

Results

PSNR performance

❏ The VQ- and MPEG4-based bench-markers employ iterative channel decoding, have a 0.1 s latency and have the same computational complexity as our approach.

[Figure: average PSNR [dB] (24 to 32) versus $E_b/N_0$ [dB] (4 to 12) for the MPEG4-based bench-marker, the VQ-based bench-marker, VQ-TCM low latency and VQ-TCM high latency.]