Optimal Encoding and Decoding - Single Sensor, Network of Arbitrary Topology

3.5 Single Sensor, Network of Arbitrary Topology

3.5.2 Optimal Encoding and Decoding

For the nodei, denote by Ii_{(k) the information set that it can use to generate the message that it}

transmits at time stepk. This set contains the aggregate of the information the node has received on the incoming edges at time stepst= 0, 1,· · ·,k−1. As an example, for the source nodes,

Is(k) =_{y(0), y(1),_{· · ·}, y(k₋1)_}.

Based on the information set _Ii_{(k), the} _{i-th node can calculate its minimum mean squared error}

ˆ xi_(k

|Ii_{(k)) or more shortly as ˆ}_xi_{(k), where it is understood that the estimate is based on the informa-}

tion set_Ii_{(k). We denote the error covariance associated with this estimate by}_Pi_(k

|Ii_{(k)) or more}

compactly asPi_(k). _{We aim to minimize the error covariance}_Pd_{(k). Clearly, for two information}

sets_Ii_(k,_{1) and} Ii_(k,_{2) related by} Ii_(k,₁₎ ⊆ Ii_(k,_{2), we have}_Pi_(k |Ii_(k,₁₎₎ ≤Pi_(k |Ii_(k,_2)).

The packet drops occur according to a random process. Letλuv(k) be the binary random variable

describing the packet drop event on link (u, v) _{∈ E} at time k. λuv(k) assumes the value ‘dropped’

if the packet is dropped on link (u, v) at time k and ‘received’ otherwise. For a network with independent and memoryless packet drops,λuv(k) has a Bernoulli distribution with parameterpuv.

We defineλuu(k) = ‘received’. Once again, we refer to instantiations of the packet drop process as

packet drop sequences. Given a packet drop sequence, at time stepk we can define a time stamp ti_{(k) for node}_i_{such that the packet drops did not allow any information transmitted by the source}

afterti_{(k) to reach the}_{i-th node in time for it to be a part of} Ii_(k).

Now, consider an algorithm_A1as follows. At time stepk, every node takes the following actions:

1. Calculate the estimate of statex(k) based on the information set at the node. 2. Transmit its entire information set on the outgoing edges.

3. Receive any data successfully transmitted along the incoming edges.

4. Update its information set and affix a time stamp corresponding to the time of the latest measurement in it.

When this algorithm is executed for a particular drop sequence, the information set at node iwill be of the form

Ii(k) =_{y(0), y(1),_{· · ·}, y ti(k)_},

whereti_(k)_{< k} _{is the time stamp as defined above. This is the maximal information set}_Ii,max_(k)

that the node i can possibly have access to with any algorithm. For any other algorithm, the information set will be smaller than this since earlier packets, and hence measurements, might have been dropped. The error covariance at any node when it calculates its estimate of the state x(k) is the least when the information set it has access to is _Ii,max_{(k). The algorithm}

A1 requires an

increasing amount of memory and transmission as time goes on. We will now describe an algorithm

A2that achieves the same performance at the expense of constant memory and transmission (modulo

the transmission of the time stamp). The algorithm proceeds as follows. At each time stepk, every node takes the following actions:

1. Calculate its estimate ˆxi_{(k) of the state}_{x(k) based on any data received at the previous time}

stepk−1 and its previous estimate. The estimate can be computed using a switched linear filter, as shown later.

2. Affix a time stamp corresponding to the latest measurement used in the calculation of the estimate in step 1 and transmit the estimate on the outgoing edges.

3. Receive any data on the incoming edges and store it for the next time step. To prove that algorithm_A2is indeed optimal, we need the following intermediate result.

Lemma 3.6 Consider any edge (i, j) and any packet drop pattern. At time stepk, let the nodei transmit the measurement set

Sij ={y(0), y(1),· · ·, y(l)}

on the edge (i, j) if algorithm _A1 is executed. If, instead, algorithm A2 is executed, the node i

transmits the estimate

x k_|Sij= ˆx(k_|{y(0), y(1),_{· · ·}, y(l)_}) along the edge(i, j)at time step k.

Proof The proof readily follows by induction on the time stepk. For time k= 1, the source node stransmits_{y(0)_}along all edges of the form (s, .) while following algorithm_A1 and the estimate

x(1|y(0)) while executing algorithmA2. If any edge is not of the form (s, .), there is no information

transmitted along that edge in either algorithm. Thus, the statement is true fork= 1. Now, assume that the statement is true for k=n. Consider the node i at timek =n+ 1. If the nodei is the source node, the statement is true by an argument similar to that at k = 1. Let us assume that nodeiis not the source node. Consider all edges that transmitted data at time stepk=nto node i. If algorithm_A1 is being executed, suppose the edge (p, i) transmits the measurement set

Spi={y(0), y(1),· · · , y(t(p))},

where p_{∈ N}i is an in-neighbor of node i. Also, denote the measurement set that the node i has

access to from time stepk=n₋1 as

Sii=_{y(0), y(1),_{· · ·}, y(t(i))_}.

Note that at time stepk=n, the nodeitransmitted the set Sii along all outgoing edges. Let vbe the node for which

t(v) = max_{t(i)_{∪ {}t(p)_|p_{∈ N}i}}.

Then, at time k = n+ 1 under algorithm A1, the node i transmits along all outgoing edges the

measurement set

S1₌

Now, consider the case when algorithm _A2 is being executed. By the assumption of the statement

being true at time stepk=n, the edges (p, i) transmit the estimate

x(n|Spi) = ˆx(n|{y(0), y(1),· · ·, y(t(p))}),

for all p_{∈ N}i. Also, since at time k=n the node transmittedSii on any edge (i, .) in algorithm A1, it has access to the estimate ˆx(n|Sii) when algorithmA2 is executed. Clearly, the setSviis the

superset of all sets Sii _and _Sji _where _j _{∈ N}

i andv have been defined above. Thus, the estimate

that the nodeicalculates at timek=n+ 1 is ˆx(n+ 1|Svi_{). But the measurement set}_Svi_{is simply}

the setS1_{. Hence, at time step}_k₌_n_{+ 1, the node}_i_{transmits along all outgoing edges the estimate}

x(n+ 1_|S1_{). Thus, the statement is true at time step}_k ₌_n_{+ 1 along all edges of the form (i, .).}

Since the nodeiwas arbitrary, the statement is true for all edges in the graph. Thus, we have proven that if the statement is true at timek=n, it is true at timek=n+ 1. But it is true at timek= 1. Thus, by the principle of mathematical induction, it is true at all time steps.

Note that we have also shown that if at time stepk, the node has access to the measurement setSii

from time stepk₋1 when algorithm_A1 is executed; it has access to the estimate ˆx(k−1|Sii) from

time stepk−1 when algorithm A2is executed. We can now state the following result.

Proposition 3.7 The algorithm A2 is optimal in the sense that it leads to the minimum possible

error covariance at any node at any time step.

Proof Consider a nodei. At timek, letj_{∈ N}i∪ {i}such thatλji(k−1) = ‘received’. Denote the

measurement set that is transmitted from nodej to nodei at time stepk under algorithm _A1 by

Sji_{. As in the proof of Lemma 3.6, there is a node}_{v, such that}_Svi _{is the superset of all the sets}

Sji_{. Thus, the estimate of node}_i_{at time}_k _{under algorithm} A1 is

xA1_{(k) = ˆ}_x(k

|Svi_).

From Lemma 3.6, when algorithm A2 is executed, at time step k, the node i has access to the

estimates ˆx(k−1|Sji_{). Once again, since}_Svi_{is the superset of all the sets}_Sji_{, the estimate of node}

iat time stepkis simply ˆ

xA2_{(k) =}_Aˆ_x(k

−1_|Svi_{) = ˆ}_x(k |Svi_).

Thus, we see that for any nodei, the estimates ˆxA1_{(k) and ˆ}_xA2_{(k) are identical for any time step} _k

for any packet drop pattern. But algorithm A1 leads to the minimum possible error covariance at

each node at each time step. Thus, algorithmA2 is optimal.

1. The step of calculating the estimate at each node in the algorithm_A2can be implemented using

a switched linear filter as follows. The source node implements a Kalman filter and updates its estimate at every time step with the new measurement received. Every other nodeichecks the time-stamps on the data coming on the incoming edges. The time-stamps correspond to the latest measurement used in the calculation of the estimate being transmitted. Then, node iupdates its time-stamp using the relation

ti(k) = max

j∈Ni∪{i}

λji(k−1)tj(k−1). (3.19)

Suppose the maximum of (3.19) is achieved by nodev ∈ Ni∪ {i}. Then, the nodei updates

its estimate as

xi_{(k) =}_Aˆ_xv_(k −1),

where ˆxv_{(k) denotes the estimate of the state}_{x(k) maintained by the node}_v.

2. As in the case of a single link, the algorithm is optimal for any packet drop pattern, i.e., irrespective of whether the packet drops are occurring in an i.i.d. fashion or are correlated across time or space or even adversarial in nature. We also do not assume any knowledge of the statistics of the packet drops at any of the nodes. Finally, if the communication links introduce finite delays and packet reordering, the algorithm can be modified along the lines of Section 3.4.2.

3. We have proved that the algorithm is optimal for any node. Thus, we do not need to assume only one sink. The algorithm is also optimal for multiple sources if all sources have access to measurements from the same sensor. For multiple sources with each source obtaining measurements from a different sensor, the problem remains open.

4. A priori we had not made any assumption about a node transmitting the same message along all the out-going edges. It turned out that in this optimal algorithm, the messages are the same along all the edges. Thus, the algorithm is suitable for broadcast channels such as the wireless channel.

5. The communication requirements can be reduced somewhat by adopting an event-based pro- tocol in which a node transmits only if it updated its estimate based on data arriving on an incoming edge at the previous time step. This will not degrade the performance but reduce the number of transmissions, especially if packet drop probabilities are high.

In a sense our algorithm corresponds to communication of information over a digital channel while the strategy of using no encoding is an analog communication scheme. Our algorithm allows the intermediate nodes to play the role of repeaters that help to limit the effect of the channel by

decoding and re-encoding the data along the way. In analog channels, repeaters make no difference in the received Signal-to-Noise ratio and hence the signal quality. Similarly, in our setting, if raw measurements are being transmitted, presence of intermediate nodes does not help in improving the estimation performance.

3.5.3 Stability Analysis

For the rest of the chapter, we will assume that packets are dropped in each link in an i.i.d. fashion with the loss probability in link e = (u, v) being pe. We also assume packet drop processes in

two different links to be independent of each other. We begin by computing the conditions for the estimate error at the sink node to be stable under algorithm A2 (or equivalently A1) when the

source and the sink are connected with a network of arbitrary topology. Since the algorithm A2 is

optimal, these form necessary conditions for the estimate error covariance to be stable under any other algorithm (in particular, for the case when the nodes do not do any processing and simply transmit measurements).

Once again, we are interested in stability in the bounded second moment sense. Thus, for the destination node trying to estimate a process of the form (3.18), denote the error at time stepkas

ed_{(k) =}_x(k)

−xˆd_(k),

where ˆxd_{(k) is the estimate of the destination node. We can compute the covariance of the error}

e(k) at timekas

Pd(k) =Eed(k)(ed(k))T,

where the expectation is taken over the initial conditionx(0), the process noisew(j), the measurement noise v(j) and the packet dropping processes in the network. We consider the steady-state error covariance in the limit askgoes to infinity, i.e.,

Pd(_∞) = lim

k→∞P

d_(k). _(3.20)

IfPd₍

∞) is bounded, we will say that the estimate error is stable; otherwise it is unstable.

We can isolate the effect of the network by explicitly taking the expectation with respect to the packet dropping process. For nodedand timek,td_{(k) denotes the time-stamp of the most recent}

to (3.19). The expected estimation error covariance at timekat node dcan thus be written as Pd(k) = Eed(k)(ed(k))T = Eh x(k)₋xˆd(k) x(k)₋xˆd(k)Ti = k X l=0 Eh x(k)₋xˆd(k_|td(k) =l) x(k)₋xˆd(k_|td(k) =l)TiProb td(k) =l,

where, in the final equation, we have explicitly taken the expectation with respect to the packet dropping process and ˆxd_(k

|td_{(k) =} _{l) denotes the estimate of} _{x(k) at the destination node given}

all the measurements{y(0), y(1),· · ·, y(l)}. We see that the effect of the packet dropping process shows up in the distribution of the time-stamp of the most recent observation used in estimating x(k)6_{. For future use, we denote the}_latency_{for the the node}_d_{at time}_k_as

ld(k) =k₋1₋td(k).

Also, denote the mmse estimate of x(k) given all the measurements_{y(0), y(1), _{· · ·}, y(k₋1)_} by M(k). ClearlyM(k) evolves in time according to a Riccati recursion. We can now rewrite the error covariancePd_{(k) as} Pd(k) = k−1 X l=0 Eh x(k)₋xˆd(k_|ld(k) =l) x(k)₋xˆd(k_|ld(k) =l)Ti_×Prob ld(k) =l = k−1 X l=0  AlM(k₋l) AlT + l−1 X j=0 AjQ AjT  Prob ld(k) =l. (3.21)

The above equation gives the expected estimation error covariance for a general network with any packet dropping process. The effect of the packet dropping process appears in the distribution of the latency ld_{(k). As we can see from (3.21), the stability of the system depends on how fast the}

probability distribution of the latency decreases.

To analyze the stability, we use Proposition 3.4, which is restated here for the case of independent packet drops.

Proposition 3.8 Consider a process of the form (3.18) being estimated using measurements from a sensor of the form (3.17) over a packet-dropping link that drops packets in an i.i.d. fashion with probability p. Suppose that the sensor calculates the mmse estimate of the measurements at every time step and transmits it over the channel. Then, the estimate error at the receiver is stable in the

Note that Prob`

td₍_k_{) =}_k´

bounded second moment sense if and only if

p|ρ(A)|2_<_1,

whereρ(A)is the spectral radius of the matrixA appearing in (3.18).

Network with Links in Parallel

We begin by considering a network consisting only of links in parallel. Consider the source and the sink node being connected by a network withmlinks in parallel with the probability of packet drop in linke being pe. Since the same data is being transmitted over all the links, the distribution of

the latency in (3.21) remains the same if the network is replaced by a single link that drops packets when all the links in the original network drop packets and transmits the information if even one link in the original network allows transmission. Thus, the packet drop probability of this equivalent link is p1p2· · ·pm. The necessary and sufficient condition for the error covariance to diverge thus

becomes

p|ρ(A)|2_<_1,

where

p=p1p2· · ·pm.

Necessary Condition for Stability in Arbitrary Networks

Using the result for parallel networks, we can obtain a necessary condition for stability for general networks as follows.

Proposition 3.9 Consider a process of the form (3.18) being estimated using measurements from a sensor of the form (3.17) through an arbitrary network of packet dropping links with drop probabilities pe’s. Consider every possible division of the nodes of the network into two sets with the source and

the sink node being in different sets (also called a cut-set). For any such division, letp1, p2,· · · , pn

denote the packet erasure probabilities of the edges that connect the two sets (equivalently, the edges that are in the cut). Define the cut-set erasure probability as

pcut set=p1p2· · ·pn.

Then, a necessary condition for the error covariance to converge is

wherepnetwork is the network erasure probability defined as

pnetwork= max

all possible cut-setspcut set.

Proof Denote the given network by_N1. Consider a cut setC of the networkN1, with the sources

being in setAand the destination nodedin setB and the links 1, 2, _{· · ·},njoining the setsAand B. Form another network_N2 by replacing all links within the setsA and B by perfect links, i.e.,

links that do not drop packets and further do not introduce any delay as data is transmitted across them. Now for any packet drop pattern, denote the information sets that the destination node has access to at any time stepk over networksN2 andN1 byId,N2(k) andId,N1(k) respectively. It is

obvious that

Id,N1_(k)

⊆ Id,N2_(k).

Thus, the estimate error covariances at the destination node for the two networks are related by

Pd(k_|Id,N1_(k))

≥Pd(k_|Id,N2_(k)).

Hence, by considering the stability of error covariance over network _N2, we can obtain a necessary

condition for the stability of error covariance over network _N1. NowN2 consists of the source and

the sink joined by edges 1, 2,_{· · ·},nin parallel. The condition for the error covariance across_N2 to

converge is thus

pcut set|ρ(A)|2<1,

where

pcut set =p1p2· · ·pn.

This is thus a necessary condition for error covariance acrossN1 to be stable. One such condition

is obtained by considering each cut-set. Thus, a necessary condition for the error covariance to converge is

pnetwork|ρ(A)|2<1,

where

pnetwork= max

all possible cut-setspcut set.

In document Distributed estimation and control in networked systems (Page 110-135)