In this section, we examine the network $N$ shown in Figure 4.8. Sources $X$ and $Y$ are present at nodes 1 and 2, respectively. The receiver (node 3) wishes to reconstruct both sources subject to the fidelity criteria
$$E\,d_X(X, \hat{X}) \le D_X \quad \text{and} \quad E\,d_Y(Y, \hat{Y}) \le D_Y,$$
where $d_X$ and $d_Y$ are finite-valued distortion measures, and $D_X$ and $D_Y$ are the respective distortion thresholds. We derive an achievable region $R_{\mathrm{in,fb}}$ with feedback, and show that $R_{\mathrm{in,fb}}$ is a strict superset of $R_{\mathrm{in}}$, the best known achievable region without feedback. This is proved by evaluating both $R_{\mathrm{in,fb}}$ and $R_{\mathrm{in}}$ for the network considered in Example 6, which is a special case.
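Let $\mathcal{D}(D_X, D_Y)$ denote the set of pairs of random variables $(U, V) \in \mathcal{U} \times \mathcal{V}$ for which there exist functions $f : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{X}}$ and $g : \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{Y}}$ such that $E\,d_X(X, f(U, V)) \le D_X$ and $E\,d_Y(Y, g(U, V)) \le D_Y$. Define the set $R_X$ to be the set of all rate pairs $(R_{13}, R_{23})$ that satisfy the conditions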
$$R_{(1,3)} > I(X; U \mid V), \tag{4.14}$$
$$\text{and} \quad R_{(2,3)} > I(Y; V), \tag{4.15}$$
for some pair of random variables $(U, V) \in \mathcal{D}(D_X, D_Y)$ for which $X \to Y \to V$ and $U \to (X, V) \to Y$ form Markov chains. In a symmetric fashion, define the set $R_Y$ to be the set of all rate pairs $(R_{13}, R_{23})$ that satisfy the conditions
$$R_{(1,3)} > I(X; U), \tag{4.16}$$
$$\text{and} \quad R_{(2,3)} > I(Y; V \mid U), \tag{4.17}$$
for some pair of random variables $(U, V) \in \mathcal{D}(D_X, D_Y)$ for which $Y \to X \to U$ and $V \to (Y, U) \to X$ form Markov chains. Both $R_X$ and $R_Y$ are non-empty, since choosing $(U, V) = (X, Y)$ satisfies all the Markov chain conditions. Finally, let $R_{\mathrm{in,fb}}$ be the convex hull of $R_X \cup R_Y$ and, as before, let $R_\infty(N)$ denote the set of achievable rates with feedback for the network shown in Figure 4.8. The following theorem relates $R_\infty(N)$ and $R_{\mathrm{in,fb}}$.
Theorem 10. R∞(N)⊇Rin,fb.
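As a quick sanity check on the definition of $R_X$, the choice $(U, V) = (X, Y)$ can be worked out explicitly. The sketch below assumes, beyond what is stated above, that the reconstruction alphabets contain the source alphabets and that $d_X(x, x) = d_Y(y, y) = 0$, so that $f(u, v) = u$ and $g(u, v) = v$ give zero distortion.

```latex
% Assumption (not from the text): d_X(x,x) = d_Y(y,y) = 0, f(u,v) = u, g(u,v) = v.
\begin{align*}
I(X; U \mid V) &= I(X; X \mid Y) = H(X \mid Y), \\
I(Y; V)        &= I(Y; Y) = H(Y).
\end{align*}
% Both Markov chains X -> Y -> V and U -> (X,V) -> Y hold trivially when V = Y and U = X,
% so R_X contains every rate pair with
%   R_{(1,3)} > H(X | Y)  and  R_{(2,3)} > H(Y),
% whose sum exceeds H(X,Y).
```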
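We first recall the notion of typicality briefly. Let $N_{abcd}(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v})$ denote the number of occurrences of the quadruplet $(a, b, c, d)$ in the sequence $(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v})$. Define the typical set:
$$A_\epsilon^{(n)}(X, Y, U, V) \triangleq \Big\{ (\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v}) : \Big| \tfrac{1}{n} N_{abcd}(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v}) - p(a, b, c, d) \Big| < \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}||\mathcal{U}||\mathcal{V}|} \ \ \forall (a, b, c, d) \in \mathcal{X} \times \mathcal{Y} \times \mathcal{U} \times \mathcal{V} \Big\}. \tag{4.18}$$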
Similarly, for each subset $W$ of $\{X, Y, U, V\}$, define $A_\epsilon^{(n)}(W)$ to be the typical set corresponding to $n$-length sequences drawn from the distribution of $W$. The above definition implies that if a collection of sequences is jointly typical with respect to their joint distribution, then any subset of the collection is also jointly typical with respect to the joint distribution of that subset; for example, if $(\mathbf{x}, \mathbf{y}, \mathbf{u}, \mathbf{v}) \in A_\epsilon^{(n)}(X, Y, U, V)$, then $(\mathbf{x}, \mathbf{y}, \mathbf{u}) \in A_\epsilon^{(n)}(X, Y, U)$. Therefore, whenever the set of underlying random variables is clear from the context, we denote the corresponding typical set by the simplified notation $A_\epsilon^{(n)}$. Another useful property of this notion of typicality is that it implies distortion typicality; namely, if $(U, V) \in \mathcal{D}(D_X, D_Y)$ and $(\mathbf{x}, \mathbf{u}, \mathbf{v}) \in A_\epsilon^{(n)}$, then
$$\frac{1}{n} \sum_{i=1}^{n} d_X(x_i, f(u_i, v_i)) < D_X + d_{\max} \cdot \epsilon.$$
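The typicality condition in (4.18) can be checked mechanically for short sequences. The following is a minimal sketch, not part of the original development: the function name `is_typical` and the dictionary representation of the joint distribution are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def is_typical(seqs, pmf, alphabets, eps):
    """Check the condition of (4.18) for a tuple of equal-length sequences.

    seqs      -- tuple of sequences, e.g. (x, y, u, v)
    pmf       -- dict mapping symbol tuples to probabilities (missing means 0)
    alphabets -- tuple of alphabets, one per sequence
    eps       -- the typicality parameter epsilon
    """
    n = len(seqs[0])
    counts = Counter(zip(*seqs))          # empirical counts of symbol tuples
    size = 1
    for alphabet in alphabets:
        size *= len(alphabet)             # product of alphabet sizes
    threshold = eps / size
    return all(
        abs(counts[sym] / n - pmf.get(sym, 0.0)) < threshold
        for sym in product(*alphabets)
    )


# Hypothetical toy distribution: X = Y, uniform on {0, 1}.
toy_pmf = {(0, 0): 0.5, (1, 1): 0.5}
toy_alphabets = ((0, 1), (0, 1))
balanced = is_typical(([0, 1, 0, 1], [0, 1, 0, 1]), toy_pmf, toy_alphabets, 0.1)
skewed = is_typical(([0, 0, 0, 0], [0, 0, 0, 0]), toy_pmf, toy_alphabets, 0.1)
```

Dividing `eps` by the product of the alphabet sizes is what makes typicality of the full tuple carry over to every sub-tuple, since each marginal count is a sum of at most that many joint counts.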
Proof of Theorem 10: By the symmetry in the definition of $R_{\mathrm{in,fb}}$ and the convexity of $R_\infty(N)$, it suffices to show that $R_X \subseteq R_\infty(N)$. Let $R = (R_{13}, R_{23}) \in R_X$. By definition, there exists a pair $(U, V) \in \mathcal{D}(D_X, D_Y)$ for which $X \to Y \to V$ and $U \to (X, V) \to Y$ form Markov chains and the inequalities (4.14) and (4.15) are satisfied.
Fix an integer $n$ and an $\epsilon > 0$. Choose $R'_{13}$ such that $I(X, V; U) < R'_{13} < R_{13} + I(U; V)$. The reason for this choice will become clear later. To show that the rate pair $(R_{13}, R_{23})$ is achievable, consider the following encoding and decoding strategy over a block of length $n$.
Codebook generation: At encoder 1, first generate $2^{nR'_{13}}$ sequences $\mathbf{U}(1), \mathbf{U}(2), \ldots, \mathbf{U}(2^{nR'_{13}})$ of length $n$ by drawing each $\mathbf{U}(j)$ i.i.d. according to the probability rule $\Pr(\mathbf{U}(j) = \mathbf{u}) = \prod_{i=1}^{n} p(u_i)$ for every $\mathbf{u} \in \mathcal{U}^n$. Uniformly bin these $2^{nR'_{13}}$ sequences into $2^{nR_{13}}$ bins. We use $B_n(j)$ to denote the index of the bin into which $\mathbf{U}(j)$ falls. At the second encoder, generate $2^{nR_{23}}$ sequences $\mathbf{V}(1), \mathbf{V}(2), \ldots, \mathbf{V}(2^{nR_{23}})$ drawn i.i.d. with the probability rule
$$\Pr(\mathbf{V}(j) = \mathbf{v}) = \prod_{i=1}^{n} p(v_i).$$
Both these codebooks are assumed known to both encoders and the decoder.
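The codebook generation step can be sketched in a few lines. This is only an illustration under stated assumptions: the block length, the rates, and the uniform binary marginals for $U$ and $V$ below are hypothetical values chosen so that the codebook sizes are small integers, and the helper names are not from the text.

```python
import random


def make_codebook(num_words, n, pmf, rng):
    """Draw num_words codewords of length n, each symbol i.i.d. from pmf."""
    symbols = list(pmf)
    weights = [pmf[s] for s in symbols]
    return [tuple(rng.choices(symbols, weights=weights, k=n))
            for _ in range(num_words)]


def assign_bins(num_words, num_bins, rng):
    """Uniformly assign each codeword index j to a bin index B_n(j)."""
    return [rng.randrange(num_bins) for _ in range(num_words)]


rng = random.Random(0)
n = 8                                   # hypothetical block length
rate_13_prime, rate_13, rate_23 = 0.75, 0.5, 0.5   # hypothetical rates (bits)
p_u = {0: 0.5, 1: 0.5}                  # hypothetical marginal of U
p_v = {0: 0.5, 1: 0.5}                  # hypothetical marginal of V

# Encoder 1: 2^{n R'_13} codewords, binned uniformly into 2^{n R_13} bins.
u_codebook = make_codebook(2 ** round(n * rate_13_prime), n, p_u, rng)
bins = assign_bins(len(u_codebook), 2 ** round(n * rate_13), rng)

# Encoder 2: 2^{n R_23} codewords, no binning.
v_codebook = make_codebook(2 ** round(n * rate_23), n, p_v, rng)
```

Only the bin index is sent from encoder 1, which is why its transmission rate is $R_{13}$ even though the codebook itself has rate $R'_{13}$.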
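Encoding: Let $\alpha_n^{(1)}(\mathbf{Y}) = k$ if $(\mathbf{Y}, V^n(k)) \in A_\epsilon^{(n)}$. Otherwise, let $\alpha_n^{(1)}(\mathbf{Y}) = 1$. Transmit $\alpha_n^{(1)}(\mathbf{Y})$ to node 3, and also to node 1 via the feedback link. Let $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X}) = B_n(j)$ if $(\mathbf{X}, V^n(\alpha_n^{(1)}(\mathbf{Y})), U^n(j)) \in A_\epsilon^{(n)}$. Otherwise, let $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X}) = 1$.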
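Decoding: The decoder first decodes $\alpha_n^{(1)}(\mathbf{Y})$ to the sequence $\hat{V}^n = V^n(\alpha_n^{(1)}(\mathbf{Y}))$. Next, it looks for a sequence $\hat{U}^n$ in the bin $\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X})$ such that $(\hat{U}^n, \hat{V}^n) \in A_\epsilon^{(n)}$. Finally, it produces the reconstructions $\hat{X}^n = (f(\hat{U}_1, \hat{V}_1), \ldots, f(\hat{U}_n, \hat{V}_n))$ and $\hat{Y}^n = (g(\hat{U}_1, \hat{V}_1), \ldots, g(\hat{U}_n, \hat{V}_n))$.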
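Since $\epsilon$ can be made arbitrarily small, it is clear that the above code can operate at rates as close to $R$ as desired. Further, since $d_X$ and $d_Y$ are finite distortion measures, in order to show that the expected distortion of this code can be made arbitrarily close to $(D_X, D_Y)$, it suffices to show that $\Pr\big(\tfrac{1}{n} d_X(\mathbf{X}, f(\hat{U}^n, \hat{V}^n)) > D_X + \delta\big)$ and $\Pr\big(\tfrac{1}{n} d_Y(\mathbf{Y}, g(\hat{U}^n, \hat{V}^n)) > D_Y + \delta\big)$ can be made arbitrarily small for each $\delta > 0$. Thus, it is enough to prove that $\Pr\big(\{(\mathbf{X}, \mathbf{Y}, \hat{U}^n, \hat{V}^n) \notin A_\epsilon^{(n)}\}\big)$ can be made arbitrarily small for each $\epsilon > 0$. Note that
$$\{(\mathbf{X}, \mathbf{Y}, \hat{U}^n, \hat{V}^n) \notin A_\epsilon^{(n)}\} \subseteq E_1 \cup E_2 \cup E_3 \cup E_4,$$
where the events $E_1$, $E_2$, $E_3$, and $E_4$ are defined as follows: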
• $E_1 = \{(\mathbf{X}, \mathbf{Y}) \notin A_\epsilon^{(n)}\}$. By the AEP, the probability of this event can be made arbitrarily small by choosing $n$ large enough.
• $E_2 = E_1^c \cap \{(\mathbf{X}, \mathbf{Y}, \hat{V}^n) \notin A_\epsilon^{(n)}\}$. By noting that $X \to Y \to V$ is a Markov chain, and using the Markov lemma [47], the probability of this event can be made to vanish asymptotically with $n$ as long as $R_{23} > I(Y; V)$ (see the proof of the rate distortion theorem in [46, 26] for further details on this argument).
• $E_3 = (E_1 \cup E_2)^c \cap \{(\mathbf{X}, \mathbf{Y}, \hat{V}^n, U^n(j)) \notin A_\epsilon^{(n)} \ \forall\, j = 1, 2, \ldots, 2^{nR'_{13}}\}$. By a reasoning similar to the above, as long as $R'_{13} > I(X, V; U)$, the probability of this event can be made arbitrarily small.
• $E_4 = (E_1 \cup E_2 \cup E_3)^c \cap \{(u^n, \hat{V}^n) \in A_\epsilon^{(n)}$ for some $u^n \ne U^n(\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X}))$ such that $u^n$ is in the bin $B_n(\alpha_n^{(2)}(\mathbf{Y}, \mathbf{X}))\}$. The probability of this event can also be made arbitrarily small by choosing $n$ large enough, since the number of elements in each bin is less than $2^{nI(U;V)}$ with probability approaching 1 as $n$ grows without bound.
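The role of the intermediate rate $R'_{13}$ in the events $E_3$ and $E_4$ can be made explicit with a short calculation. The following is a sketch using the usual joint-typicality estimate $2^{-nI(U;V)}$ for the probability that an independently drawn codeword is jointly typical with $\hat{V}^n$; that estimate is a standard fact assumed here, stated only up to terms that vanish with $\epsilon$, and is not derived in the text.

```latex
% Why the interval for R'_{13} is non-empty:
\begin{align*}
I(X, V; U) &= I(U; V) + I(X; U \mid V) && \text{(chain rule)} \\
           &< I(U; V) + R_{13}         && \text{(by (4.14))}.
\end{align*}
% Lower end of the interval (event E_3): R'_{13} > I(X,V;U) ensures that some
% codeword U^n(j) is jointly typical with (X, \hat{V}^n) with high probability.
%
% Upper end of the interval (event E_4): each bin holds about 2^{n(R'_{13} - R_{13})}
% codewords, so the expected number of wrong codewords in the bin that are
% jointly typical with \hat{V}^n is roughly
\[
2^{n(R'_{13} - R_{13})} \cdot 2^{-n I(U; V)}
  = 2^{-n\,(R_{13} + I(U;V) - R'_{13})} \longrightarrow 0
  \quad \text{since } R'_{13} < R_{13} + I(U; V).
\]
```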
Thus, for any rate pair $R = (R_{13}, R_{23}) \in R_X$, there exists a sequence of 2-round $((2^{nR_{13}}, 2^{nR_{23}}), n)$ codes for this network. By a similar reasoning, $R_Y$ is achievable. By the convexity of $R_\infty(N)$, $R_{\mathrm{in,fb}}$ is achievable. Hence, $R_{\mathrm{in,fb}} \subseteq R_\infty(N)$.
Let $R_F(N)$ denote the set of achievable rate pairs for the network in Figure 4.8 without the feedback links. Example 6 demonstrates that $R_F(N) \subsetneq R_\infty(N)$. It should be pointed out that a single-letter characterization of $R_F(N)$ is not known. Berger and Tung proposed an inner bound [47, 48] $R_{\mathrm{in}} \subseteq R_F(N)$, which was shown to be optimal for Gaussian sources [49]. For other classes of sources, the question of tightness of this bound is still open. The inner bound is defined as follows:
Definition 3 (Berger-Tung inner bound). [47, 48] The Berger-Tung inner bound $R_{\mathrm{in}}$ is defined to be the set of all rate pairs $(R_{(1,3)}, R_{(2,3)})$ that satisfy the conditions
$$R_{(1,3)} > I(X; U \mid V), \tag{4.19}$$
$$R_{(2,3)} > I(Y; V \mid U), \tag{4.20}$$
$$\text{and} \quad R_{(1,3)} + R_{(2,3)} > I(X, Y; U, V), \tag{4.21}$$
for some random variables $U$ and $V$ taking values in alphabets $\mathcal{U}$ and $\mathcal{V}$ respectively, and satisfying the following properties:
1. $U \to X \to Y \to V$ forms a Markov chain, and
2. $(U, V) \in \mathcal{D}(D_X, D_Y)$.
Our next result relates Rin,fb to the Berger-Tung inner bound.
Theorem 11. For every source pair $(X, Y)$, $R_{\mathrm{in,fb}} \supseteq R_{\mathrm{in}}$. Further, there exists a source pair such that $R_{\mathrm{in,fb}} \supsetneq R_{\mathrm{in}}$.
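Proof. In order to prove that $R_{\mathrm{in,fb}} \supseteq R_{\mathrm{in}}$, first note that $R_{\mathrm{in}}$ can be viewed as the convex hull of $R_{X,\mathrm{nf}} \cup R_{Y,\mathrm{nf}}$, where $R_{X,\mathrm{nf}}$ (and in a similar manner, $R_{Y,\mathrm{nf}}$) is defined as the set of all rate pairs $R = (R_{13}, R_{23}) \in R_{\mathrm{in}}$ satisfying
$$R_{(1,3)} \ge I(X; U), \tag{4.22}$$
$$R_{(2,3)} \ge I(Y; V \mid U). \tag{4.23}$$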
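To prove that $R_{\mathrm{in}} = \mathrm{conv}(R_{X,\mathrm{nf}} \cup R_{Y,\mathrm{nf}})$, note that for each $R \in R_{\mathrm{in}}$ and $\lambda \in [0, 1]$,
$$\begin{aligned}
R_{13} + R_{23} &> I(X, Y; U, V) \\
&= (1 - \lambda)\, I(X, Y; U, V) + \lambda\, I(X, Y; U, V) \\
&= (1 - \lambda)\big(I(X; U) + I(Y; U \mid X) + I(Y; V \mid U) + I(X; V \mid Y, U)\big) \\
&\quad + \lambda\big(I(X; V \mid Y) + I(Y; V) + I(Y; U \mid V, X) + I(X; U \mid V)\big).
\end{aligned} \tag{4.24}$$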
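It follows that $R$ can be written as a convex combination of points from $R_{X,\mathrm{nf}}$ and $R_{Y,\mathrm{nf}}$, and therefore $R_{\mathrm{in}} = \mathrm{conv}(R_{X,\mathrm{nf}} \cup R_{Y,\mathrm{nf}})$. Next, the Markov chain $U \to X \to Y \to V$ that is satisfied by every element in $R_{X,\mathrm{nf}}$ implies the Markov conditions $U \to X \to Y$ and $X \to (Y, U) \to V$, which are the conditions required in the definition of $R_Y$, and (4.22) and (4.23) have the same form as (4.16) and (4.17). Hence, $R_{X,\mathrm{nf}} \subseteq R_Y$ and, symmetrically, $R_{Y,\mathrm{nf}} \subseteq R_X$; therefore, $R_{\mathrm{in}} \subseteq R_{\mathrm{in,fb}}$.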
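The two chain-rule expansions of $I(X, Y; U, V)$ used in (4.24) hold for every joint distribution, so they can be checked numerically. The following is a minimal sketch on a randomly generated joint distribution over four binary variables; the helper names `entropy` and `cmi` and the random test distribution are illustrative assumptions, and the distribution is not required to satisfy the Markov chain.

```python
import itertools
import math
import random


def entropy(pmf, keep):
    """Entropy in bits of the marginal of pmf on the coordinate indices keep."""
    marginal = {}
    for symbols, prob in pmf.items():
        key = tuple(symbols[i] for i in keep)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def cmi(pmf, a, b, c=()):
    """Conditional mutual information I(A; B | C) in bits."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))


X, Y, U, V = 0, 1, 2, 3                 # coordinate positions in each tuple
rng = random.Random(1)
weights = {s: rng.random() for s in itertools.product((0, 1), repeat=4)}
norm = sum(weights.values())
pmf = {s: w / norm for s, w in weights.items()}

total = cmi(pmf, (X, Y), (U, V))
# First expansion in (4.24): I(X;U) + I(Y;U|X) + I(Y;V|U) + I(X;V|Y,U)
first = (cmi(pmf, (X,), (U,)) + cmi(pmf, (Y,), (U,), (X,))
         + cmi(pmf, (Y,), (V,), (U,)) + cmi(pmf, (X,), (V,), (Y, U)))
# Second expansion in (4.24): I(X;V|Y) + I(Y;V) + I(Y;U|V,X) + I(X;U|V)
second = (cmi(pmf, (X,), (V,), (Y,)) + cmi(pmf, (Y,), (V,))
          + cmi(pmf, (Y,), (U,), (V, X)) + cmi(pmf, (X,), (U,), (V,)))
```

Under the Markov chain $U \to X \to Y \to V$, the terms $I(Y; U \mid X)$, $I(X; V \mid Y, U)$, $I(X; V \mid Y)$ and $I(Y; U \mid V, X)$ are zero, which leaves the pairs of terms that appear in (4.22)-(4.23) and in (4.14)-(4.15).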
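Next, we show that there exist sources $X$ and $Y$ for which $R_{\mathrm{in,fb}} \supsetneq R_{\mathrm{in}}$. As in Example 6, let $\mathcal{X} = \mathcal{Y} = \hat{\mathcal{X}} = \hat{\mathcal{Y}} = \{0, 1\}$ such that $p(X = 0) = p(X = 1) = 1/2$ and $p(Y = x \mid X = x) = p$ for some $p \in (0, 1/2)$. Let $d_X$ and $d_Y$ be Hamming distortion measures on $\mathcal{X} \times \hat{\mathcal{X}}$ and $\mathcal{Y} \times \hat{\mathcal{Y}}$ respectively, i.e., $d_i(z, \hat{z}) = 1$ if $z \ne \hat{z}$ and $0$ if $z = \hat{z}$. Let $0 < D_X < 1/2$ and $D_Y = 0$. Fix the rate for the second encoder to be $R_{23} = H(Y) = 1$, which is achievable by choosing $V = Y$. Then, $\min\{R_{13} : (R_{13}, R_{23}) \in R_{\mathrm{in,fb}}\} = \min_{U : (U, Y) \in \mathcal{D}(D_X, 0)} I(X; U \mid V) = R_{X|Y}(D_X)$, where $R_{X|Y}(\cdot)$ is the conditional rate distortion function of $X$ when $Y$ is known. Thus, the rate pair $(R_{X|Y}(D_X), H(Y))$ lies in $R_{\mathrm{in,fb}}$. On the other hand, $\min\{R_{13} : (R_{13}, R_{23}) \in R_{\mathrm{in}}\} = \min_{U : (U, Y) \in \mathcal{D}(D_X, 0),\ U \to X \to Y} I(X; U \mid Y)$. By the result of [45], this is strictly greater than $R_{X|Y}(D_X)$. Hence, $R_{\mathrm{in,fb}} \supsetneq R_{\mathrm{in}}$.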
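For the binary example, the corner point $(R_{X|Y}(D_X), H(Y))$ can be evaluated numerically. The sketch below assumes the standard closed form for a uniform binary source with binary symmetric side information under Hamming distortion, $R_{X|Y}(D) = h(q) - h(D)$ for $0 \le D \le q$ and $0$ otherwise, where $q = \min(p, 1 - p)$ and $h$ is the binary entropy function. This closed form is not stated in the text above, and the function names are illustrative.

```python
import math


def binary_entropy(q):
    """Binary entropy h(q) in bits, with h(0) = h(1) = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def conditional_rd_binary(p, d):
    """Assumed closed form for R_{X|Y}(D) in the binary example.

    p -- the parameter of the example, p(Y = x | X = x) = p
    d -- the Hamming distortion threshold D_X
    """
    q = min(p, 1 - p)           # probability of the less likely relation of Y to X
    if d >= q:
        return 0.0              # the target is met by guessing X from Y alone
    return binary_entropy(q) - binary_entropy(d)


# Hypothetical parameter values for illustration.
p_example, d_example = 0.25, 0.1
corner_point = (conditional_rd_binary(p_example, d_example), 1.0)  # (R_13, R_23 = H(Y))
```

At $D_X = 0$ the expression reduces to $H(X \mid Y) = h(p)$, the rate needed when the decoder must recover $X$ exactly given $Y$.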