The Distributed Karhunen–Loève Transform - Distributed Source Coding Theory, Algorithms and App

particular, other transforms such as the wavelet transform and the discrete cosine transform are considered.

2.3.1 Problem Statement and Notation

We now assume that the $L$ terminals depicted in Figure 2.2 each observe a part of an $N$-dimensional, jointly Gaussian, real-valued vector $\mathbf{x}$. We further assume that $\mathbf{x}$ has zero mean and covariance matrix $\Sigma_x$. If the encoding of $\mathbf{x}$ is performed jointly, the best transform is the KLT, and this is true in both the linear approximation and compression scenarios. In the distributed scenario, however, encoders cannot communicate with each other, and each terminal devises the best local transform coding strategy based on its partial observation of $\mathbf{x}$ and on the knowledge of the global covariance matrix $\Sigma_x$. To be more specific, we assume that the first terminal observes the first $M_1$ components of $\mathbf{x}$, denoted by $\mathbf{x}_1 = (x_1, x_2, \ldots, x_{M_1})$, the second encoder observes the next $M_2$ components $\mathbf{x}_2 = (x_{M_1+1}, x_{M_1+2}, \ldots, x_{M_1+M_2})$, and so on. Each terminal provides an approximation of the observed partial vector to the central decoder, whose goal is to produce the best possible estimate of the entire vector $\mathbf{x}$ from the received approximations. More precisely, the goal is to find a vector $\hat{\mathbf{x}}$ that minimizes the mean-squared error (MSE) $E[\|\mathbf{x} - \hat{\mathbf{x}}\|^2]$.

Let us consider for the time being the linear approximation problem. That is, each terminal produces a $k_l$-dimensional approximation of the observed components, which is equivalent to saying that terminal $l$ applies a $k_l \times M_l$ matrix $T_l$ to its local observation. Hence, stacking all the transforms together, we can say that the central receiver observes the vector $\mathbf{y}$ given by

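$$
\mathbf{y} = T\mathbf{x} =
\begin{pmatrix}
T_1 & 0 & \cdots & 0 \\
0 & T_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & T_L
\end{pmatrix} \mathbf{x}. \tag{2.11}
$$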

The decoder then has to estimate $\mathbf{x}$ from $\mathbf{y}$. This situation is clearly more constrained than the one encountered in the centralized transform coding scenario, since the transform $T$ is block-diagonal in the present distributed scenario, whereas it is unconstrained in the centralized case.

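Under the MSE criterion, the optimal estimate $\hat{\mathbf{x}}$ of $\mathbf{x}$ is given by the conditional expectation of $\mathbf{x}$ given $\mathbf{y}$. Moreover, in the Gaussian case, the estimator is linear and is given by

$$
\hat{\mathbf{x}} = E[\mathbf{x} \mid \mathbf{y}] = E[\mathbf{x} \mid T\mathbf{x}] = \Sigma_x T^T (T \Sigma_x T^T)^{-1} T \mathbf{x}.
$$

The corresponding MSE can be written as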

$$
D = E[\|\mathbf{x} - \hat{\mathbf{x}}\|^2] = \operatorname{trace}\left(\Sigma_x - \Sigma_x T^T (T \Sigma_x T^T)^{-1} T \Sigma_x\right). \tag{2.12}
$$

Therefore, for fixed $k_l$'s, the distributed approximation problem can be stated as the minimization of (2.12) over all block-diagonal matrices $T$ as given in (2.11). A simple solution to this problem does not seem to exist. However, in some particular settings, some precise answers can be provided [12]. Moreover, in [12] an iterative algorithm that finds (locally) optimal solutions is also proposed.
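As a rough numerical illustration of (2.11) and (2.12), the sketch below evaluates $D$ for a block-diagonal $T$ and compares it with the centralized KLT of the same total dimension. The helper names (`block_diag`, `approximation_mse`, `local_klt`), the random covariance, and the dimensions are assumptions made for this example. Using the local KLT of each terminal's own covariance block is only a simple heuristic choice of $T_l$; it is not the iterative algorithm of [12] and is not claimed to be optimal.

```python
import numpy as np

def block_diag(blocks):
    """Stack per-terminal transforms T_l (k_l x M_l) into the block-diagonal T of (2.11)."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    T = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        T[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return T

def approximation_mse(sigma_x, T):
    """Evaluate D in (2.12) for a given T (assumes T Sigma_x T^T is invertible)."""
    cross = sigma_x @ T.T                    # Sigma_x T^T
    gram = T @ sigma_x @ T.T                 # T Sigma_x T^T
    return np.trace(sigma_x - cross @ np.linalg.solve(gram, cross.T))

def local_klt(sigma_block, k):
    """Rows are the k principal eigenvectors of a covariance block."""
    _, eigvecs = np.linalg.eigh(sigma_block)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k].T

# Hypothetical setup: N = 4, two terminals with M_1 = M_2 = 2 and k_1 = k_2 = 1.
rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
sigma_x = G @ G.T + 0.1 * np.eye(4)          # a valid (positive definite) covariance

T_dist = block_diag([local_klt(sigma_x[:2, :2], 1),
                     local_klt(sigma_x[2:, 2:], 1)])
T_joint = local_klt(sigma_x, 2)              # centralized KLT, same total dimension

D_dist = approximation_mse(sigma_x, T_dist)
D_joint = approximation_mse(sigma_x, T_joint)
print(f"distributed D = {D_dist:.4f}, centralized D = {D_joint:.4f}")
```

Since the centralized KLT minimizes (2.12) over all unconstrained $T$ of a given rank, `D_joint` should never exceed `D_dist`, whatever block-diagonal transform is tried.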

The compression problem is to some extent even more involved, since the transformed vectors $\mathbf{y}_l = T_l \mathbf{x}_l$ need to be quantized and the binary representation of the quantized version of $\mathbf{y}_l$ is transmitted to the receiver. The objective now is to obtain the best possible estimate of $\mathbf{x}$ from the quantized versions of the approximations $\mathbf{y}_l$, $l = 1, 2, \ldots, L$, under an overall bit budget $R$. Thus the problem now is not only to find the right block-diagonal transform $T$ but also the right allocation of the bits among the encoders and the correct quantization strategy.

In the next subsection, we consider the two-terminal scenario and review the linear approximation and compression results presented in [12]. One special case leading to the Conditional KLT is also reviewed, and examples are given. The following subsection then discusses the multiterminal scenario.

2.3.2 The Two-terminal Scenario

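We now consider the two-terminal scenario depicted in Figure 2.3, where Terminal 1 observes the vector $\mathbf{x}_1$ given by the first $M$ components of $\mathbf{x}$ and the second terminal observes the vector $\mathbf{x}_2$ given by the last $N - M$ components of $\mathbf{x}$. The covariance matrix of $\mathbf{x}_1$ (i.e., $E[\mathbf{x}_1 \mathbf{x}_1^T]$) is denoted by $\Sigma_1$. Similarly, we have $\Sigma_2 = E[\mathbf{x}_2 \mathbf{x}_2^T]$ and $\Sigma_{12} = E[\mathbf{x}_1 \mathbf{x}_2^T]$.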
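In terms of the global covariance matrix, these three quantities are simply sub-blocks of $\Sigma_x$. A minimal sketch, assuming $\Sigma_x$ is stored as a NumPy array with the components of $\mathbf{x}_1$ first (the function name `partition_covariance` and the numerical values are hypothetical):

```python
import numpy as np

def partition_covariance(sigma_x, M):
    """Split Sigma_x into Sigma_1, Sigma_2, Sigma_12 for x1 = x[:M], x2 = x[M:]."""
    sigma_1 = sigma_x[:M, :M]     # E[x1 x1^T], M x M
    sigma_2 = sigma_x[M:, M:]     # E[x2 x2^T], (N-M) x (N-M)
    sigma_12 = sigma_x[:M, M:]    # E[x1 x2^T], M x (N-M)
    return sigma_1, sigma_2, sigma_12

# Hypothetical 3-dimensional example with M = 1.
sigma_x = np.array([[2.0, 0.5, 0.3],
                    [0.5, 1.0, 0.2],
                    [0.3, 0.2, 1.5]])
sigma_1, sigma_2, sigma_12 = partition_covariance(sigma_x, M=1)
```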

The transform $T_2$ applied by the second terminal to $\mathbf{x}_2$ is assumed to be fixed and known at both terminals. The decoder, however, does not have access to $T_2 \mathbf{x}_2$ but to the noisy version $\mathbf{y}_2 = T_2 \mathbf{x}_2 + \mathbf{z}_2$, where $\mathbf{z}_2$ is a zero-mean jointly Gaussian vector independent of $\mathbf{x}_2$. Notice that $T_2$ has dimension $k_2 \times (N - M)$ and $\mathbf{z}_2$ has dimension $k_2 \times 1$. The noise might be due to the transmission channel or may model the effect of the compression of the transformed coefficients $T_2 \mathbf{x}_2$.

FIGURE 2.3
The two-terminal scenario: Encoder 2 applies a transform $T_2$ to the observed vector $\mathbf{x}_2$. The transform is fixed and known at both encoders. The decoder receives a noisy version $\mathbf{y}_2 = T_2 \mathbf{x}_2 + \mathbf{z}_2$; the noise may model the quantization error or be due to noise in the communication channel. The open issue is to find the best transform coding strategy at Encoder 1 under these circumstances.

In line with the problem statement of the previous section, two perspectives are considered. In the first one, Terminal 1 has to provide the best $k_1$-dimensional approximation of the observed vector $\mathbf{x}_1$ given that $\mathbf{y}_2$ is available at the decoder (but not at Encoder 1). In the second scenario, $\mathbf{x}_1$ has to be compressed using an available bit budget $R$, and the task is again to devise the best compression strategy given that $\mathbf{y}_2$ is available at the decoder. In both scenarios, the goal is the minimization of

$$
E[\|\mathbf{x} - \hat{\mathbf{x}}\|^2 \mid \mathbf{y}_2], \tag{2.13}
$$

where $\hat{\mathbf{x}}$ is the reconstructed vector.

Let us focus on the approximation problem first. Because of the assumption that $\mathbf{x}$ and $\mathbf{z}_2$ are Gaussian, there exist constant matrices $A_1$ and $A_2$ such that

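$$
\mathbf{x}_2 = A_1 \mathbf{x}_1 + A_2 \mathbf{y}_2 + \mathbf{v}, \tag{2.14}
$$

where $\mathbf{v}$ is a Gaussian random vector independent of $\mathbf{x}_1$ and $\mathbf{y}_2$. Fundamentally, the term $A_1 \mathbf{x}_1 + A_2 \mathbf{y}_2$ represents the linear estimation of $\mathbf{x}_2$ from $\mathbf{x}_1$ and $\mathbf{y}_2$, and $\mathbf{v}$ is the uncertainty. Namely, $\mathbf{v}$ represents what cannot be estimated of $\mathbf{x}_2$ from $\mathbf{x}_1$ and $\mathbf{y}_2$.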
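The matrices $A_1$ and $A_2$ in (2.14) can be obtained from the standard linear MMSE formula applied to the stacked observation $(\mathbf{x}_1, \mathbf{y}_2)$. The sketch below is one way to compute them under the model $\mathbf{y}_2 = T_2 \mathbf{x}_2 + \mathbf{z}_2$ with $\mathbf{z}_2$ independent of $\mathbf{x}$. The function name, the symbol `sigma_z` for the covariance of $\mathbf{z}_2$, and all numerical values are assumptions made for this illustration.

```python
import numpy as np

def x2_estimation_coefficients(sigma_x, M, T2, sigma_z):
    """Return A1, A2 and Cov(v) for x2 = A1 x1 + A2 y2 + v, with y2 = T2 x2 + z2."""
    s1, s2, s12 = sigma_x[:M, :M], sigma_x[M:, M:], sigma_x[:M, M:]
    cov_x1_y2 = s12 @ T2.T                    # E[x1 y2^T]
    cov_y2 = T2 @ s2 @ T2.T + sigma_z         # E[y2 y2^T]
    # Covariance of the stacked observation s = (x1, y2) and cross-covariance E[x2 s^T].
    cov_s = np.block([[s1, cov_x1_y2], [cov_x1_y2.T, cov_y2]])
    cov_x2_s = np.hstack([s12.T, s2 @ T2.T])
    # [A1 A2] = E[x2 s^T] E[s s^T]^{-1}; cov_s is symmetric, so solve and transpose.
    A = np.linalg.solve(cov_s, cov_x2_s.T).T
    sigma_v = s2 - A @ cov_x2_s.T             # covariance of the residual v
    return A[:, :M], A[:, M:], sigma_v

# Hypothetical example: N = 5, M = 2, k_2 = 2, white noise z2.
rng = np.random.default_rng(1)
N, M, k2 = 5, 2, 2
G = rng.standard_normal((N, N))
sigma_x = G @ G.T + 0.1 * np.eye(N)
T2 = rng.standard_normal((k2, N - M))
sigma_z = 0.05 * np.eye(k2)

A1, A2, sigma_v = x2_estimation_coefficients(sigma_x, M, T2, sigma_z)
```

With these coefficients, the residual $\mathbf{v}$ is uncorrelated with both $\mathbf{x}_1$ and $\mathbf{y}_2$, which in the jointly Gaussian case is equivalent to the independence stated above.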

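Using the same arguments, we can also write

$$
\begin{pmatrix} I_M \\ A_1 \end{pmatrix} \mathbf{x}_1 = B_2 \mathbf{y}_2 + \mathbf{w}, \tag{2.15}
$$

where $I_M$ is the $M$-dimensional identity matrix and $\mathbf{w}$ is a Gaussian random vector independent of $\mathbf{y}_2$ and with correlation matrix $\Sigma_w$. It is worth mentioning at this stage that, because of the peculiar structure of the vector $\begin{pmatrix} I_M \\ A_1 \end{pmatrix} \mathbf{x}_1$, the matrix $\Sigma_w$ has rank $\le M$. It is also possible to show that $\Sigma_w$ is given by [12]:

$$
\Sigma_w = \begin{pmatrix} I_M \\ A_1 \end{pmatrix} \left( \Sigma_1 - \Sigma_{12} T_2^T \left( T_2 \Sigma_2 T_2^T + \Sigma_z \right)^{-1} T_2 \Sigma_{12}^T \right) \begin{pmatrix} I_M & A_1^T \end{pmatrix},
$$

where $\Sigma_z$ denotes the covariance matrix of the noise $\mathbf{z}_2$.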
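The rank property of $\Sigma_w$ is easy to check numerically. The following sketch evaluates the expression above and confirms that the resulting $N \times N$ matrix has rank at most $M$. The function names and numerical values are assumptions for illustration; in particular, `A1` here is an arbitrary placeholder rather than the $A_1$ of (2.14), since the rank bound follows from the $N \times M$ stacked factor alone and does not depend on the value of $A_1$.

```python
import numpy as np

def conditional_cov_x1(sigma_x, M, T2, sigma_z):
    """Sigma_1 - Sigma_12 T2^T (T2 Sigma_2 T2^T + Sigma_z)^{-1} T2 Sigma_12^T."""
    s1, s2, s12 = sigma_x[:M, :M], sigma_x[M:, M:], sigma_x[:M, M:]
    cov_x1_y2 = s12 @ T2.T
    cov_y2 = T2 @ s2 @ T2.T + sigma_z
    return s1 - cov_x1_y2 @ np.linalg.solve(cov_y2, cov_x1_y2.T)

def sigma_w(sigma_x, M, T2, sigma_z, A1):
    """Correlation matrix of w in (2.15), using the closed form quoted from [12]."""
    stack = np.vstack([np.eye(M), A1])        # the N x M factor (I_M; A_1)
    return stack @ conditional_cov_x1(sigma_x, M, T2, sigma_z) @ stack.T

# Hypothetical example: N = 5, M = 2, k_2 = 2.
rng = np.random.default_rng(2)
N, M, k2 = 5, 2, 2
G = rng.standard_normal((N, N))
sigma_x = G @ G.T + 0.1 * np.eye(N)
T2 = rng.standard_normal((k2, N - M))
sigma_z = 0.05 * np.eye(k2)
A1 = rng.standard_normal((N - M, M))          # placeholder, not derived from (2.14)

Sw = sigma_w(sigma_x, M, T2, sigma_z, A1)
rank_Sw = np.linalg.matrix_rank(Sw, tol=1e-9)  # explicit tolerance for the rank test
print("shape:", Sw.shape, "rank:", rank_Sw)
```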

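where in (a) we have used (2.14) and in (b) we have used the fact that the optimal estimation of $\mathbf{x}_2$ from $(\hat{\mathbf{x}}_1, \mathbf{y}_2)$ is $\hat{\mathbf{x}}_2 = A_1 \hat{\mathbf{x}}_1 + A_2 \mathbf{y}_2$. Also recall that $\mathbf{v}$ is independent of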
