III. Sensor Data Collection
7. Data Collection with Hidden Markov Models
7.5. Cooperative Sensing by Spatial Correlation: An Extension
7.5.3. A Case Study
In this section, we present an evaluation of the proposed solution on the Intel Lab data [1] as a case study. Figure 7.7 presents the cluster decomposition of the nodes.
Each circle represents a spatial cluster. The clustering result is obtained by the unsupervised hierarchical clustering algorithm [120] based on the following attributes: the x and y coordinates of the sensor and the mean of the first 100 temperature readings. Other standard clustering algorithms such as K-means [121] and EM [122] can also be used. However, a comparison of clustering algorithms is beyond the scope of this thesis.
Figure 7.8.: Illustration of the cooperative data collection with HMM. [Six panels, one column each for Node 1 and Node 2: Original Data (top), HMM Collected Data (middle), Restored Data at Sink (bottom); horizontal axis: Time, vertical axis: Temperature.]
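As a rough illustration of the clustering step described above, the sketch below groups sensors by their x and y coordinates and the mean of their first 100 temperature readings, using a small average-linkage agglomerative procedure. The node positions, the readings, the linkage choice, the absence of feature scaling and the target number of clusters are all assumptions made for this example; the text only states that hierarchical clustering [120] is applied to these three attributes.

```python
import math
from statistics import mean


def features(x, y, readings, n_first=100):
    """Build the attribute vector (x, y, mean of the first n_first readings)."""
    return (x, y, mean(readings[:n_first]))


def average_linkage(a, b, points):
    """Average Euclidean distance over all cross-cluster pairs of points."""
    return mean(math.dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average linkage.
        _, i, j = min(
            (average_linkage(clusters[i], clusters[j], points), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical nodes: (x, y, temperature readings); all values are made up.
nodes = [
    (1.0, 1.0, [20.0] * 100),
    (1.5, 1.2, [20.4] * 100),
    (9.0, 8.0, [24.0] * 100),
    (9.5, 8.5, [24.2] * 100),
]
points = [features(x, y, r) for x, y, r in nodes]
clusters = agglomerate(points, n_clusters=2)
print(clusters)
```

In practice the coordinates and temperatures live on different scales, so some normalisation of the attributes would probably be needed before computing distances; the sketch leaves this out for brevity.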
Figure 7.8 demonstrates the data collection and restoration process graphically. The sensor readings from the two nodes are clearly highly correlated, and each node is given a smaller profile to collect data with. The middle frames show that each node focuses only on its given profile region and does not report or update any reading outside it. However, the readings from the two cluster members complement each other and are restored at the sink, as shown in the bottom frames. Comparing the bottom frames with the original data (top frames), the two are visibly similar.
Table 7.5 reports the simulation results regarding the energy saving and data collection accuracy. The cooperative data collection framework outperforms the independent G-HMMs, especially for simulations with smaller (stricter) precision requirements: for instance, when the precision requirement is 0.1, the communication is reduced by 93.6% (compared with 75.2% for the independent case). The mean absolute differences all lie below the corresponding precision requirements, so the proposed solution satisfies the precision requirement in the average sense. However, the precision satisfaction degrades compared with the independent counterpart. The decrease can be explained by the uncertainties associated with the spatial restoration.
Table 7.5.: Evaluation of the cooperative data collection method on temperature sensor
Precision     # of Data          Data            Mean Absolute      Precision
Requirement   Messages           Saving (%)      Difference (℃)     Satisfaction (%)
0.1           635.3 (±18.38)     93.6 (±0.16)    0.095 (±0.042)     84.1 (±19.2)
0.2           438.3 (±15.56)     95.6 (±0.51)    0.145 (±0.033)     87.0 (±18.6)
0.5           307.5 (±24.75)     96.9 (±0.25)    0.247 (±0.021)     95.0 (±8.7)
0.8           266.8 (±26.57)     97.3 (±0.57)    0.408 (±0.031)     95.9 (±5.3)
1.0           257.17 (±14.85)    97.4 (±0.15)    0.458 (±0.025)     98.8 (±2.9)
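For readers who wish to compute quantities of the kind reported in Table 7.5, the sketch below shows one plausible way to derive them from an original series, the series restored at the sink, and the number of messages sent. The definitions are assumptions rather than the exact formulas used in the evaluation: data saving is taken as the fraction of readings that did not require a message, and precision satisfaction as the percentage of restored readings whose absolute error does not exceed the precision requirement. The function name and the toy numbers are hypothetical.

```python
def evaluate(original, restored, n_messages, precision):
    """Return (data saving %, mean absolute difference, precision satisfaction %)."""
    if len(original) != len(restored) or not original:
        raise ValueError("series must be non-empty and of equal length")
    errors = [abs(o - r) for o, r in zip(original, restored)]
    # Assumed definition: share of readings that did not need a message.
    saving = 100.0 * (1.0 - n_messages / len(original))
    mad = sum(errors) / len(errors)
    # Assumed definition: share of readings restored within the requirement.
    satisfied = 100.0 * sum(e <= precision for e in errors) / len(errors)
    return saving, mad, satisfied


# Toy data (made up): four readings, one message sent, requirement 0.5.
original = [20.0, 20.5, 21.0, 22.0]
restored = [20.0, 20.0, 21.0, 21.0]
saving, mad, satisfied = evaluate(original, restored, n_messages=1, precision=0.5)
print(saving, mad, satisfied)
```

Under these definitions a mean absolute difference below the requirement can coexist with a precision satisfaction below 100%, which matches the pattern visible in the table.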
7.6. Discussion
In summary, this chapter presents an HMM-enhanced data collection method which features on-line fault detection as well as requirement-based compressed data collection. Simulation results show that the method not only effectively filters out sensor data faults but also keeps a very low false positive rate with the help of the statistical model. Moreover, based on a pre-specified precision requirement, the method can significantly reduce the data communication by inferring the HMM. This chapter again demonstrates the power of statistical models: if properly specified, they can bring benefits to WSNs.
We also extend the solution to accommodate both spatial and temporal correlations. A case study shows that the extension further reduces unnecessary data communication, especially for applications with a demanding precision requirement.
Note that the proposed HMM method suffers from a linearly growing hidden state space (see Section 7.3.2) as the user's precision requirement gets refined. Therefore, when a WSN application requires precise sensor readings, sensors need to host more complex state models locally, which may not be desirable for some low-end sensor hardware. To resolve this shortcoming, in Chapter 8 we investigate another statistical model, the Dynamical Linear Model (DLM). The DLM is a close relative of the HMM: they share the same model structure, possessing both hidden and observation layers, but a DLM has a continuous-valued hidden process, which resolves the growing hidden state space problem faced by the HMM method. We will show that the DLM can also significantly reduce the communication effort, but from an adaptive sampling perspective.
Chapter 8. Data Collection with Dynamical Linear Models
8.1. Introduction
In this chapter, we address the same problem, namely how to collect data energy-efficiently, but from a different perspective and with a different model.
In terms of the employed statistical model, the proposed solution can be viewed as an extension of the HMM-based approach, because Dynamical Linear Models (DLMs), the main model that the proposed data collection method exploits, are essentially Hidden Markov Models with a continuous-valued hidden process [72]¹. DLMs have been successfully applied to engineering problems such as object tracking [123] and robot localization [124], and have become a popular method for studying time series data. In time series analysis, a DLM provides a way to analyse the data component-wise (for example, in terms of trend and seasonal components). More importantly, because the model allows the components to evolve, it is a suitable candidate for analysing non-stationary, irregular processes [93].
However, the proposed solution solves the problem quite differently from the HMM-based method. The idea is to maintain identical DLMs at both the sink and the nodes in the network. Instead of sending all the observed raw data to the sink, the distributed nodes decide locally which sensor data need to be sampled and reported. By refraining from sending all the raw data, the nodes save energy locally.
The chapter starts with an introduction to the DLM in Section 8.2, where the essential results on DLMs are presented and explained. Section 8.3 presents the proposed solution in detail and shows how the DLM results introduced previously can be employed in the sensor context. In Section 8.4, the experimental results for the proposed solution are listed and explained. Finally, we finish this chapter by discussing the pros and cons of the solution.
¹ More rigorously, DLMs also require that all the variables be Gaussian distributed. See Section 8.2.
8.2. Dynamical Linear Models
8.2.1. General Definition
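A dynamical linear model (DLM) can be specified by a pair of equations for each time $t \ge 1$,
\begin{align}
Y_t &= F_t \theta_t + v_t, & v_t &\sim \mathcal{N}_m(0, V_t), \tag{8.1a} \\
\theta_t &= G_t \theta_{t-1} + w_t, & w_t &\sim \mathcal{N}_p(0, W_t), \tag{8.1b}
\end{align}
together with a prior for $\theta_0$,
\begin{equation}
\theta_0 \sim \mathcal{N}_p(m_0, C_0), \tag{8.1c}
\end{equation}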
where $G_t$ (of order $p \times p$) and $F_t$ (of order $m \times p$) are fixed, known matrices, and $v_t$, $w_t$ for $t > 0$ are independent zero-mean Gaussian random vectors with corresponding variance matrices $V_t$ and $W_t$. Note that (8.1a) is called the observation equation, and $Y_t$ is the counterpart of $X_t$ in the Hidden Markov Model definition, while (8.1b) is called the state equation, which is equivalent to the hidden state process of an HMM.
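To make the observation and state equations concrete, the following sketch simulates the scalar special case of (8.1a) to (8.1c), that is, m = p = 1 with time-invariant F, G, V and W. The time-invariance, the parameter values and the function name are assumptions chosen for illustration only; setting F = G = 1 gives a random walk observed with noise.

```python
import random


def simulate_dlm(n, F=1.0, G=1.0, V=0.25, W=0.01, m0=20.0, C0=1.0, seed=0):
    """Simulate n steps of a scalar, time-invariant DLM.

    Y_t     = F * theta_t     + v_t,  v_t ~ N(0, V)   (observation equation)
    theta_t = G * theta_{t-1} + w_t,  w_t ~ N(0, W)   (state equation)
    theta_0 ~ N(m0, C0)                               (prior)
    V, W and C0 are variances, so their square roots are passed to gauss().
    """
    rng = random.Random(seed)
    theta = rng.gauss(m0, C0 ** 0.5)  # draw theta_0 from the prior
    states, observations = [], []
    for _ in range(n):
        theta = G * theta + rng.gauss(0.0, W ** 0.5)  # hidden state evolves
        y = F * theta + rng.gauss(0.0, V ** 0.5)      # noisy observation
        states.append(theta)
        observations.append(y)
    return states, observations


states, observations = simulate_dlm(200)
print(len(states), len(observations))
```

Unlike the hidden state of an HMM, `theta` here is a real number that can take any value, which is the structural difference the chapter builds on.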
8.2.2. DLM for Sensor
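It can be proved [93] that the hidden states $(\theta_t)$ satisfy the Markov property, i.e.,
\begin{equation}
p(\theta_t \mid \theta^{(t-1)}) = p(\theta_t \mid \theta_{t-1}); \tag{8.2}
\end{equation}
and the observation $Y_t$ depends only on $\theta_t$, i.e.,
\begin{equation}
p(Y_t \mid \theta^{(t)}, Y^{(t-1)}) = p(Y_t \mid \theta_t). \tag{8.3}
\end{equation}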
Therefore, a DLM is indeed an extension of an HMM, since it satisfies the two defining requirements in Definition 7.1, except that the hidden state $\theta_t$ is a continuous value