Achievable rates for multi-terminal lossy source coding with feedback

In this section, we examine the network $\mathcal{N}$ shown in Figure 4.8. Sources $X$ and $Y$ are present at nodes 1 and 2, respectively. The receiver (node 3) wishes to reconstruct both sources subject to the fidelity criteria:

$E\,d_X(X, \hat{X}) \le D_X$ and $E\,d_Y(Y, \hat{Y}) \le D_Y$,

where $d_X$ and $d_Y$ are finite-valued distortion measures, and $D_X$ and $D_Y$ are the respective distortion thresholds. We derive an achievable region $\mathcal{R}_{in,fb}$ with feedback, and show that $\mathcal{R}_{in,fb}$ always contains $\mathcal{R}_{in}$, the best known achievable region without feedback, and that the containment is strict for some sources. The strict containment is proved by evaluating both $\mathcal{R}_{in,fb}$ and $\mathcal{R}_{in}$ for the network considered in Example 6, which is a special case.

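Let $\mathcal{D}(D_X, D_Y)$ denote the set of pairs of random variables $(U, V) \in \mathcal{U} \times \mathcal{V}$ for which there exist functions $f : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{X}}$ and $g : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{Y}}$ such that $E\,d_X(X, f(U, V)) \le D_X$ and $E\,d_Y(Y, g(U, V)) \le D_Y$. Define $\mathcal{R}_X$ to be the set of all rate pairs $(R_{13}, R_{23})$ that satisfy the conditions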

$R_{(1,3)} > I(X; U \mid V)$, (4.14)

and $R_{(2,3)} > I(Y; V)$, (4.15)

for some pair of random variables $(U, V) \in \mathcal{D}(D_X, D_Y)$ for which $X \to Y \to V$ and $U \to (X, V) \to Y$ form Markov chains. In a symmetric fashion, define $\mathcal{R}_Y$ to be the set of all rate pairs $(R_{13}, R_{23})$ that satisfy the conditions

$R_{(1,3)} > I(X; U)$, (4.16)

and $R_{(2,3)} > I(Y; V \mid U)$, (4.17)

for some pair of random variables $(U, V) \in \mathcal{D}(D_X, D_Y)$ for which $Y \to X \to U$ and $V \to (Y, U) \to X$ form Markov chains. Both $\mathcal{R}_X$ and $\mathcal{R}_Y$ are non-empty, since choosing $(U, V) = (X, Y)$ satisfies all the Markov chain conditions. Finally, let $\mathcal{R}_{in,fb}$ be the convex hull of $\mathcal{R}_X \cup \mathcal{R}_Y$, and again, let $\mathcal{R}_\infty(\mathcal{N})$ denote the set of achievable rates with feedback for the network shown in Figure 4.8. The following theorem relates $\mathcal{R}_\infty(\mathcal{N})$ and $\mathcal{R}_{in,fb}$.

Theorem 10. $\mathcal{R}_\infty(\mathcal{N}) \supseteq \mathcal{R}_{in,fb}$.
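As a hedged numerical aid for the definition of $\mathcal{R}_X$, the sketch below evaluates the two bounds in (4.14) and (4.15) for a finite-alphabet example. The function name `corner_point_rx`, the array layouts, and the choice of test channels are assumptions made for illustration and do not come from the original text. The joint distribution is built as $p(x, y)\,p(v \mid y)\,p(u \mid x, v)$, which enforces the Markov chains $X \to Y \to V$ and $U \to (X, V) \to Y$ by construction.

```python
import numpy as np


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as an array of any shape."""
    p = np.asarray(pmf, dtype=float).ravel()
    p = p[p > 0]  # 0 * log(0) is treated as 0
    return float(-(p * np.log2(p)).sum())


def corner_point_rx(p_xy, p_v_given_y, p_u_given_xv):
    """Return (I(X;U|V), I(Y;V)) for the joint pmf p(x,y) p(v|y) p(u|x,v).

    Assumed (hypothetical) array layouts:
      p_xy[x, y], p_v_given_y[y, v], p_u_given_xv[x, v, u].
    """
    # joint[x, y, u, v] = p(x, y) * p(v | y) * p(u | x, v)
    joint = np.einsum("xy,yv,xvu->xyuv", p_xy, p_v_given_y, p_u_given_xv)

    def h(*keep):
        # Entropy of the marginal over the axes listed in `keep`.
        drop = tuple(ax for ax in range(4) if ax not in keep)
        return entropy(joint.sum(axis=drop))

    X, Y, U, V = 0, 1, 2, 3
    i_xu_given_v = h(X, V) + h(U, V) - h(X, U, V) - h(V)
    i_yv = h(Y) + h(V) - h(Y, V)
    return i_xu_given_v, i_yv


if __name__ == "__main__":
    q = 0.25  # illustrative crossover probability between X and Y
    p_xy = np.array([[0.5 * (1 - q), 0.5 * q], [0.5 * q, 0.5 * (1 - q)]])
    p_v_given_y = np.eye(2)            # V = Y
    p_u_given_xv = np.zeros((2, 2, 2))
    for x in range(2):
        p_u_given_xv[x, :, x] = 1.0    # U = X
    print(corner_point_rx(p_xy, p_v_given_y, p_u_given_xv))
```

With $(U, V) = (X, Y)$, the two bounds reduce to $H(X \mid Y)$ and $H(Y)$, which is a convenient sanity check on the computation.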

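Before proving the theorem, we briefly describe the notion of typicality used in the proof. Let $N_{abcd}(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v})$ denote the number of occurrences of the quadruplet $(a, b, c, d)$ in the sequence $(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v})$. Define the typical set: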

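$A_\epsilon^{(n)}(X, Y, U, V) \triangleq \left\{ (\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v}) : \left| \frac{1}{n} N_{abcd}(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v}) - p(a, b, c, d) \right| < \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}||\mathcal{U}||\mathcal{V}|} \ \ \forall (a, b, c, d) \in \mathcal{X} \times \mathcal{Y} \times \mathcal{U} \times \mathcal{V} \right\}$. (4.18)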

Similarly, for each subset $W$ of $\{X, Y, U, V\}$, define $A_\epsilon^{(n)}(W)$ to be the typical set corresponding to $n$-length sequences drawn from the distribution of $W$. The above definition implies that if a collection of sequences is jointly typical with respect to their joint distribution, then any subset of the collection is also jointly typical with respect to the joint distribution of that subset; for example, if $(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v}) \in A_\epsilon^{(n)}(X, Y, U, V)$, then $(\mathbf{x}, \mathbf{y}, \mathbf{u}) \in A_\epsilon^{(n)}(X, Y, U)$. Therefore, whenever the set of underlying random variables is clear from the context, we denote the corresponding typical set by the simplified notation $A_\epsilon^{(n)}$. Another useful property of this notion of typicality is that it implies distortion typicality; namely, if $(U, V) \in \mathcal{D}(D_X, D_Y)$ and $(\mathbf{x}, \mathbf{u}, \mathbf{v}) \in A_\epsilon^{(n)}$, then

$\frac{1}{n} \sum_{i=1}^{n} d_X(x_i, f(u_i, v_i)) < D_X + d_{\max} \cdot \epsilon$.
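The typicality test in (4.18) can be sketched in a few lines. This is a minimal illustration, assuming sequences are given as Python lists and the joint pmf as a dictionary keyed by symbol tuples; the function name `is_typical` and its argument layout are hypothetical and not taken from the original text.

```python
from collections import Counter
from itertools import product


def is_typical(seqs, pmf, alphabets, eps):
    """Check the typicality condition of (4.18) for a tuple of sequences.

    seqs      : tuple of equal-length sequences, e.g. (x, y, u, v)
    pmf       : dict mapping symbol tuples to probabilities (missing keys = 0)
    alphabets : list of alphabets, one per sequence, in the same order
    eps       : typicality parameter
    """
    n = len(seqs[0])
    counts = Counter(zip(*seqs))  # joint occurrence counts N_{ab...}

    size = 1
    for alphabet in alphabets:
        size *= len(alphabet)
    threshold = eps / size        # eps divided by the product of alphabet sizes

    return all(
        abs(counts[symbols] / n - pmf.get(symbols, 0.0)) < threshold
        for symbols in product(*alphabets)
    )


if __name__ == "__main__":
    pmf = {(0, 0): 0.5, (1, 1): 0.5}   # two perfectly correlated fair bits
    alphabets = [(0, 1), (0, 1)]
    x = [0, 1] * 50
    print(is_typical((x, x), pmf, alphabets, eps=0.1))          # matching pair
    print(is_typical((x, [0] * 100), pmf, alphabets, eps=0.1))  # mismatched pair
```

Because the threshold is scaled by the product of all alphabet sizes, summing the per-symbol deviations over any dropped coordinates keeps every marginal within its own threshold, which is the subset property stated above.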

Proof of Theorem 10: By the symmetry in the definition of $\mathcal{R}_{in,fb}$ and the convexity of $\mathcal{R}_\infty(\mathcal{N})$, it suffices to show that $\mathcal{R}_X \subseteq \mathcal{R}_\infty(\mathcal{N})$. Let $R = (R_{13}, R_{23}) \in \mathcal{R}_X$. By definition, there exists a pair $(U, V) \in \mathcal{D}(D_X, D_Y)$ for which $X \to Y \to V$ and $U \to (X, V) \to Y$ form Markov chains and the inequalities (4.14) and (4.15) are satisfied.

Fix an integer $n$ and an $\epsilon > 0$. Choose $R'_{13}$ such that $I(X, V; U) < R'_{13} < R_{13} + I(U; V)$. The reason for this choice will become clear later. To show that the rate pair $(R_{13}, R_{23})$ is achievable, consider the following encoding and decoding strategy over a block of length $n$.

Codebook generation: At encoder 1, first generate $2^{nR'_{13}}$ $n$-length sequences $U^n(1), U^n(2), \dots, U^n(2^{nR'_{13}})$ by drawing each $U^n(j)$ i.i.d. according to the probability rule

$\Pr(U^n(j) = \mathbf{u}) = \prod_{i=1}^{n} p(u_i)$

for every $\mathbf{u} \in \mathcal{U}^n$. Uniformly bin these $2^{nR'_{13}}$ sequences into $2^{nR_{13}}$ bins. We use $B_n(j)$ to describe the index of the bin into which $U^n(j)$ falls. At the second encoder, generate $2^{nR_{23}}$ sequences $V^n(1), V^n(2), \dots, V^n(2^{nR_{23}})$ drawn i.i.d. with the probability rule

$\Pr(V^n(j) = \mathbf{v}) = \prod_{i=1}^{n} p(v_i)$.

Both these codebooks are assumed known to both encoders and the decoder.
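The random codebook and uniform binning steps can be sketched as follows. This is only an illustration under stated assumptions: the function names `generate_codebook` and `uniform_binning` are hypothetical, symbols are represented by integer indices, and the number of codewords and bins is rounded up to an integer, a detail the original text leaves implicit.

```python
import math

import numpy as np


def generate_codebook(n, rate, pmf, rng):
    """Draw ceil(2^(n*rate)) codewords of length n, i.i.d. from `pmf`.

    Returns an integer array of shape (num_codewords, n) whose entries are
    indices into the alphabet.
    """
    num_codewords = math.ceil(2 ** (n * rate))
    return rng.choice(len(pmf), size=(num_codewords, n), p=pmf)


def uniform_binning(num_codewords, n, bin_rate, rng):
    """Assign each codeword independently and uniformly to one of
    ceil(2^(n*bin_rate)) bins; returns the bin index of each codeword."""
    num_bins = math.ceil(2 ** (n * bin_rate))
    return rng.integers(0, num_bins, size=num_codewords)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8
    # Illustrative rates: R'_13 = 0.75 (codebook rate), R_13 = 0.5 (bin rate).
    u_codebook = generate_codebook(n, 0.75, [0.5, 0.5], rng)
    u_bins = uniform_binning(len(u_codebook), n, 0.5, rng)
    print(u_codebook.shape, int(u_bins.max()) + 1)
```

Only the bin index is sent over the link from node 1 to node 3, so the transmitted rate is $R_{13}$ even though the codebook itself has rate $R'_{13}$.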

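Encoding: At node 2, let $\alpha_n^{(1)}(\mathbf{Y}) = k$ if $(\mathbf{Y}, V^n(k)) \in A_\epsilon^{(n)}$. Otherwise, let $\alpha_n^{(1)}(\mathbf{Y}) = 1$. Transmit $\alpha_n^{(1)}(\mathbf{Y})$ to node 3, and also to node 1 via the feedback link. At node 1, let $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X}) = B_n(j)$ if $(\mathbf{X}, V^n(\alpha_n^{(1)}(\mathbf{Y})), U^n(j)) \in A_\epsilon^{(n)}$. Otherwise, let $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X}) = 1$. Node 1 transmits $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X})$ to node 3.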

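Decoding: The decoder first decodes $\alpha_n^{(1)}(\mathbf{Y})$ to the sequence $\hat{V}^n = V^n(\alpha_n^{(1)}(\mathbf{Y}))$. Next, it looks for a sequence $\hat{U}^n$ in the bin $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X})$ such that $(\hat{U}^n, \hat{V}^n) \in A_\epsilon^{(n)}$. Finally, it produces the reconstructions $\hat{X}^n = (f(\hat{U}_1, \hat{V}_1), \dots, f(\hat{U}_n, \hat{V}_n))$ and $\hat{Y}^n = (g(\hat{U}_1, \hat{V}_1), \dots, g(\hat{U}_n, \hat{V}_n))$.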
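The decoder's search inside a bin can be illustrated with a toy sketch. The names `jointly_typical` and `decode_bin` are hypothetical, the pmf is assumed to be a dictionary listing every pair in $\mathcal{U} \times \mathcal{V}$, and returning `None` when the bin holds zero or several typical candidates is an assumption of this sketch; the original text only says that the decoder looks for such a sequence.

```python
from collections import Counter


def jointly_typical(u, v, p_uv, eps):
    """Typicality test in the style of (4.18), restricted to the pair (U, V).

    p_uv must list every pair in U x V, including zero-probability pairs.
    """
    n = len(u)
    counts = Counter(zip(u, v))
    threshold = eps / len(p_uv)  # len(p_uv) equals |U| * |V|
    return all(abs(counts[pair] / n - prob) < threshold
               for pair, prob in p_uv.items())


def decode_bin(bin_index, codebook, bins, v_hat, p_uv, eps):
    """Return the index of the unique codeword in bin `bin_index` that is
    jointly typical with v_hat, or None if there is no unique candidate."""
    candidates = [
        j for j, b in enumerate(bins)
        if b == bin_index and jointly_typical(codebook[j], v_hat, p_uv, eps)
    ]
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    # Toy example: U and V are perfectly correlated fair bits.
    p_uv = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}
    codebook = [[0, 1, 0, 1], [0, 0, 1, 1], [1, 1, 0, 0]]
    bins = [0, 0, 1]
    v_hat = [0, 1, 0, 1]
    print(decode_bin(0, codebook, bins, v_hat, p_uv, eps=0.4))
```

The side information $\hat{V}^n$ is what lets the decoder tell codewords in the same bin apart, which is why each bin can hold more than one codeword.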

Since $\epsilon$ can be made arbitrarily small, it is clear that the above code can operate at rates as close to $R$ as desired. Further, since $d_X$ and $d_Y$ are finite distortion measures, in order to show that the expected distortion of this code can be made arbitrarily close to $(D_X, D_Y)$, it suffices to show that $\Pr\left(\frac{1}{n} d_X(\mathbf{X}, f(\hat{U}^n, \hat{V}^n)) > D_X + \delta\right)$ and $\Pr\left(\frac{1}{n} d_Y(\mathbf{Y}, g(\hat{U}^n, \hat{V}^n)) > D_Y + \delta\right)$ can be made arbitrarily small for each $\delta > 0$. Thus, it is enough to prove that $\Pr\{(\mathbf{X}, \mathbf{Y}, \hat{U}^n, \hat{V}^n) \notin A_\epsilon^{(n)}\}$ can be made arbitrarily small for each $\epsilon > 0$. Note that

$\{(\mathbf{X}, \mathbf{Y}, \hat{U}^n, \hat{V}^n) \notin A_\epsilon^{(n)}\} \subseteq E_1 \cup E_2 \cup E_3 \cup E_4$,

where the events $E_1$, $E_2$, $E_3$, and $E_4$ are defined as follows:

• $E_1 = \{(\mathbf{X}, \mathbf{Y}) \notin A_\epsilon^{(n)}\}$. By the AEP, the probability of this event can be made arbitrarily small by choosing $n$ large enough.

• $E_2 = E_1^c \cap \{(\mathbf{X}, \mathbf{Y}, \hat{V}^n) \notin A_\epsilon^{(n)}\}$. By noting that $X \to Y \to V$ is a Markov chain, and using the Markov lemma [47], the probability of this event can be made to vanish asymptotically with $n$ as long as $R_{23} > I(Y; V)$ (see the proof of the rate distortion theorem in [46, 26] for further details on this argument).

• $E_3 = (E_1 \cup E_2)^c \cap \{(\mathbf{X}, \mathbf{Y}, \hat{V}^n, U^n(j)) \notin A_\epsilon^{(n)} \ \forall j = 1, 2, \dots, 2^{nR'_{13}}\}$. By following a similar reasoning as above, as long as $R'_{13} > I(X, V; U)$, the probability of this event can be made arbitrarily small.

• $E_4 = (E_1 \cup E_2 \cup E_3)^c \cap \{(u^n, \hat{V}^n) \in A_\epsilon^{(n)}$ for some codeword $u^n$ in the bin $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X})$ other than the codeword $U^n(j)$ selected by encoder 1$\}$. The probability of this event can also be made arbitrarily small by choosing a large enough $n$, since the number of elements in each bin is less than $2^{nI(U;V)}$ with probability approaching 1 as $n$ grows without bound.

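Thus, for any rate $R = (R_{13}, R_{23}) \in \mathcal{R}_X$, there exists a sequence of 2-round $((2^{nR_{13}}, 2^{nR_{23}}), n)$ codes for this network. By a similar reasoning, $\mathcal{R}_Y$ is achievable. By the convexity of $\mathcal{R}_\infty(\mathcal{N})$, $\mathcal{R}_{in,fb}$ is achievable. Hence, $\mathcal{R}_{in,fb} \subseteq \mathcal{R}_\infty(\mathcal{N})$.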
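The choice of $R'_{13}$ in the proof can be unpacked with the chain rule for mutual information. The following is a sketch of that reasoning, using only the rate conditions stated in the proof:

```latex
\begin{align*}
I(X, V; U) &= I(U; V) + I(X; U \mid V), \\
I(X, V; U) < R_{13} + I(U; V)
  &\iff I(X; U \mid V) < R_{13}.
\end{align*}
```

Hence an $R'_{13}$ with $I(X, V; U) < R'_{13} < R_{13} + I(U; V)$ exists exactly when (4.14) holds. The lower bound on $R'_{13}$ is what makes the probability of $E_3$ small, while the upper bound keeps the expected bin size $2^{n(R'_{13} - R_{13})}$ below $2^{nI(U;V)}$, which is what the argument for $E_4$ relies on.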

Let $\mathcal{R}_F(\mathcal{N})$ denote the set of achievable rate pairs for the network in Figure 4.8 without the feedback links. Example 6 demonstrates that $\mathcal{R}_F(\mathcal{N}) \subsetneq \mathcal{R}_\infty(\mathcal{N})$. It should be pointed out that a single-letter characterization of $\mathcal{R}_F(\mathcal{N})$ is not known. Berger and Tung proposed an inner bound [47, 48] $\mathcal{R}_{in} \subseteq \mathcal{R}_F(\mathcal{N})$, which was shown to be optimal for Gaussian sources [49]. For other classes of sources, the question of tightness of this bound is still open. The inner bound is defined as follows:

Definition 3 (Berger-Tung inner bound). [47, 48] The Berger-Tung inner bound $\mathcal{R}_{in}$ is defined to be the set of all rate pairs $(R_{(1,3)}, R_{(2,3)})$ that satisfy the conditions

$R_{(1,3)} > I(X; U \mid V)$, (4.19)

$R_{(2,3)} > I(Y; V \mid U)$, (4.20)

and $R_{(1,3)} + R_{(2,3)} > I(X, Y; U, V)$, (4.21)

for some random variables $U$ and $V$ taking values in alphabets $\mathcal{U}$ and $\mathcal{V}$, respectively, and satisfying the following properties:

1. $U \to X \to Y \to V$ forms a Markov chain, and

2. $(U, V) \in \mathcal{D}(D_X, D_Y)$.

Our next result relates $\mathcal{R}_{in,fb}$ to the Berger-Tung inner bound.

Theorem 11. For every source pair $(X, Y)$, $\mathcal{R}_{in,fb} \supseteq \mathcal{R}_{in}$. Further, there exists a source pair such that $\mathcal{R}_{in,fb} \supsetneq \mathcal{R}_{in}$.

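Proof. In order to prove that $\mathcal{R}_{in,fb} \supseteq \mathcal{R}_{in}$, first note that $\mathcal{R}_{in}$ can be viewed as the convex hull of $\mathcal{R}_{X,nf} \cup \mathcal{R}_{Y,nf}$, where $\mathcal{R}_{X,nf}$ (and in a similar manner, $\mathcal{R}_{Y,nf}$) is defined as the set of all rate pairs $R = (R_{13}, R_{23}) \in \mathcal{R}_{in}$ satisfying

$R_{(1,3)} \ge I(X; U)$, (4.22)

$R_{(2,3)} \ge I(Y; V \mid U)$. (4.23)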

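To prove that $\mathcal{R}_{in} = \mathrm{conv}(\mathcal{R}_{X,nf} \cup \mathcal{R}_{Y,nf})$, note that each $R \in \mathcal{R}_{in}$ can be written as a convex combination of points from $\mathcal{R}_{X,nf}$ and $\mathcal{R}_{Y,nf}$. Furthermore, the Markov chain $U \to X \to Y \to V$ that is satisfied by every element in $\mathcal{R}_{X,nf}$ implies the Markov conditions $U \to X \to Y$ and $X \to (Y, U) \to V$, which are the conditions required in the definition of $\mathcal{R}_Y$. Hence, $\mathcal{R}_{X,nf} \subseteq \mathcal{R}_Y \subseteq \mathcal{R}_{in,fb}$ and, by the symmetric argument, $\mathcal{R}_{Y,nf} \subseteq \mathcal{R}_X \subseteq \mathcal{R}_{in,fb}$. Since $\mathcal{R}_{in,fb}$ is convex, $\mathcal{R}_{in} \subseteq \mathcal{R}_{in,fb}$.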
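One way to see why the Berger-Tung region splits into the two pieces $\mathcal{R}_{X,nf}$ and $\mathcal{R}_{Y,nf}$ is to expand the sum-rate bound (4.21) with the chain rule. The derivation below is a sketch under the Markov chain $U \to X \to Y \to V$ and is not necessarily the argument used in the original proof:

```latex
\begin{align*}
I(X, Y; U, V)
  &= I(X, Y; U) + I(X, Y; V \mid U) \\
  &= I(X; U) + \underbrace{I(Y; U \mid X)}_{=0}
     + I(Y; V \mid U) + \underbrace{I(X; V \mid Y, U)}_{=0} \\
  &= I(X; U) + I(Y; V \mid U), \\
I(X, Y; U, V)
  &= I(Y; V) + I(X; U \mid V) \quad \text{(by the symmetric expansion).}
\end{align*}
```

The sum-rate face of the region for a fixed $(U, V)$ therefore has the two endpoints $(I(X; U), I(Y; V \mid U))$ and $(I(X; U \mid V), I(Y; V))$, which match the rate constraints defining $\mathcal{R}_Y$ and $\mathcal{R}_X$ respectively, and every point on that face is a convex combination of the two.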

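Next, we show that there exist sources $X$ and $Y$ for which $\mathcal{R}_{in,fb} \supsetneq \mathcal{R}_{in}$. As in Example 6, let $\mathcal{X} = \mathcal{Y} = \hat{\mathcal{X}} = \hat{\mathcal{Y}} = \{0, 1\}$, with $p(X = 0) = p(X = 1) = 1/2$ and $p(Y = x \mid X = x) = p$ for some $p \in (0, 1/2)$. Let $d_X$ and $d_Y$ be Hamming distortion measures on $\mathcal{X} \times \hat{\mathcal{X}}$ and $\mathcal{Y} \times \hat{\mathcal{Y}}$ respectively, i.e., $d_i(z, \hat{z}) = 1$ if $z \ne \hat{z}$ and $0$ if $z = \hat{z}$. Let $0 < D_X < 1/2$ and $D_Y = 0$. Fix the rate for the second encoder to be $R_{23} = H(Y) = 1$, which is achievable by choosing $V = Y$. Then, $\min\{R_{13} : (R_{13}, R_{23}) \in \mathcal{R}_{in,fb}\} = \min_{U : (U, Y) \in \mathcal{D}(D_X, 0)} I(X; U \mid V) = R_{X|Y}(D_X)$, where $R_{X|Y}(\cdot)$ is the conditional rate distortion function of $X$ when $Y$ is known. Thus, the rate pair $(R_{X|Y}(D_X), H(Y))$ lies in $\mathcal{R}_{in,fb}$. On the other hand, $\min\{R_{13} : (R_{13}, R_{23}) \in \mathcal{R}_{in}\} = \min_{U : (U, Y) \in \mathcal{D}(D_X, 0),\, U \to X \to Y} I(X; U \mid Y)$. By the result of [45], this is strictly greater than $R_{X|Y}(D_X)$. Hence, $\mathcal{R}_{in,fb} \supsetneq \mathcal{R}_{in}$.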
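The gap between the two minima can be checked numerically. The sketch below assumes the standard closed forms for a doubly symmetric binary source with Hamming distortion: $R_{X|Y}(D) = h(q) - h(D)$ for $0 \le D \le q$, and the Wyner-Ziv rate as the lower convex envelope of $h(q * D) - h(D)$ and the point $(q, 0)$, where $q * D = q(1 - D) + (1 - q)D$. Here `q` stands for the effective crossover probability between $X$ and $Y$ (equal to $p$ after relabeling $Y$). Whether these are exactly the expressions used in [45] is an assumption, and the function names are hypothetical.

```python
import math


def h2(t):
    """Binary entropy function in bits."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)


def conditional_rd(q, d):
    """Conditional rate distortion function R_{X|Y}(d), with q <= 1/2."""
    return h2(q) - h2(d) if d < q else 0.0


def wyner_ziv_rd(q, d, grid=2000):
    """Wyner-Ziv rate via a grid search over the time-sharing form of the
    lower convex envelope:
        min over beta in [0, d] of theta * (h(q * beta) - h(beta)),
        where theta * beta + (1 - theta) * q = d.
    """
    if d >= q:
        return 0.0
    best = float("inf")
    for k in range(grid + 1):
        beta = d * k / grid
        theta = (q - d) / (q - beta)             # time-sharing weight
        conv = q * (1 - beta) + (1 - q) * beta   # binary convolution q * beta
        best = min(best, theta * (h2(conv) - h2(beta)))
    return best


if __name__ == "__main__":
    q, d = 0.25, 0.1  # illustrative values with d < q
    print("conditional R-D:", conditional_rd(q, d))
    print("Wyner-Ziv rate :", wyner_ziv_rd(q, d))
```

For distortion levels strictly between 0 and `q`, the second quantity exceeds the first, which is the rate loss that separates $\mathcal{R}_{in}$ from $\mathcal{R}_{in,fb}$ in this example.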
