Local Fault Test - Learning and Inference for the Local Model


5. Fault Detection Based on Hierarchical Physical Models

5.3. Learning and Inference for the Local Model

5.3.2. Local Fault Test

This section shows how the learnt statistical model can be used for fault detection. Two results on the multivariate Gaussian distribution are used. First, the conditional Gaussian distribution theorem is used to find univariate data faults. Second, multivariate faults are filtered out using the Mahalanobis distance theorem.

Single Variate Test

The single variate test, or univariate test, requires Result 5.2, which shows how to compute the conditional distribution of a subset of a multivariate Gaussian random vector given observations of the remaining components.

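Result 5.2 (Conditional Gaussian Theorem). Let $Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}$ be distributed as $N_p(\mu, \Sigma)$ with $\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$, $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$, and $|\Sigma_{22}| > 0$. Then the conditional distribution of $Y_1$, given that $Y_2 = y_2$, is still Gaussian: $Y_1 \mid (Y_2 = y_2) \sim N_{p_1}(\mu_{1|2}, \Sigma_{1|2})$, where

$$\mu_{1|2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(y_2 - \mu_2) \tag{5.11}$$

$$\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}. \tag{5.12}$$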

Proof. The conditional density can be calculated from Bayes' theorem: $p(Y_1 \mid Y_2) = \dfrac{p(Y_1, Y_2)}{p(Y_2)}$. Note that the marginal distribution of $Y_2$ is still Gaussian with its corresponding mean vector and covariance matrix. See [84] for a detailed proof.

According to Result 5.2, the conditional distribution of temperature $T$ given a humidity observation $h$ is univariate Gaussian, i.e. $T \mid (H = h) \sim N(\mu_{T|H}, \sigma^2_{T|H})$, where

$$\mu_{T|H} = \mu_T + \frac{\sigma^2_{T,H}}{\sigma^2_H}(h - \mu_H), \qquad \sigma^2_{T|H} = \sigma^2_T - \frac{(\sigma^2_{T,H})^2}{\sigma^2_H}. \tag{5.13}$$

Here $\sigma^2_{T,H}$ plays the role of $\Sigma_{12}$ in Result 5.2, i.e. it is the covariance between $T$ and $H$.

Note that the conditional mean is adjusted from the marginal mean $\mu_T$ by a term accounting for the humidity observation and the bivariate correlation. Also note that the conditional variance is never larger than the marginal one, and is strictly smaller whenever $T$ and $H$ are correlated, i.e. $\sigma^2_{T|H} < \sigma^2_T$, which means the uncertainty shrinks with this additional piece of evidence. Figure 5.3 demonstrates this idea graphically using real-world sensor data.
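As a minimal sketch of Equation (5.13), the conditional mean and variance can be computed as follows. The function name and the numeric parameters are hypothetical and chosen only for illustration; they are not estimates from the sensor data in Figure 5.3.

```python
def conditional_gaussian(mu_t, mu_h, var_t, var_h, cov_th, h):
    """Mean and variance of T | H = h for a bivariate Gaussian (Eq. 5.13).

    cov_th is the covariance of T and H (written sigma^2_{T,H} in the text).
    """
    if var_h <= 0:
        raise ValueError("var_h must be positive")
    cond_mean = mu_t + cov_th / var_h * (h - mu_h)
    cond_var = var_t - cov_th ** 2 / var_h
    return cond_mean, cond_var


# Hypothetical model with negatively correlated temperature and humidity.
cond_mean, cond_var = conditional_gaussian(
    mu_t=18.0, mu_h=39.0, var_t=0.25, var_h=0.04, cov_th=-0.08, h=38.8
)
print(cond_mean, cond_var)  # roughly 18.4 and 0.09
```

With these assumed values, a humidity reading below its mean shifts the expected temperature upwards, and the conditional variance (about 0.09) is smaller than the marginal variance (0.25).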

Since the conditional distribution is Gaussian, the test presented in Theorem 4.4 (case 1) can be reused here to carry out the univariate fault test, i.e.

$$\mu_{T|H} + t_{\alpha,\infty}\sqrt{\sigma^2_{T|H}} < T \;\lor\; \mu_{T|H} - t_{\alpha,\infty}\sqrt{\sigma^2_{T|H}} > T, \tag{5.14}$$

where $T$ is a new temperature observation under test. Note that humidity observa-

[Figure 5.3: contour plot with axes T and H; legend: Marginal, Conditional]

Figure 5.3.: Data fault test via conditional Gaussian distribution. The joint density is shown as a 2-D contour plot. The black bell-shaped curves are the marginal distributions of the temperature (below) and humidity (right) respectively. Suppose a humidity observation of 38.8 (%) is sampled; note the change in the conditional distribution: the mean is shifted to take the negative correlation into account, while the variance shrinks as a result of the new evidence.
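The univariate test in Equation (5.14) can be sketched as below. This is an illustration under assumptions: $t_{\alpha,\infty}$ is taken to be the upper $\alpha$ quantile of the standard normal distribution (a t distribution with infinite degrees of freedom is standard normal), since Theorem 4.4 is not reproduced in this section. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def univariate_fault_test(t, cond_mean, cond_var, alpha=0.05):
    """Return True if observation t falls outside the band of Eq. (5.14).

    Assumption: t_{alpha, inf} is the upper-alpha quantile of N(0, 1).
    """
    if cond_var <= 0:
        raise ValueError("cond_var must be positive")
    critical = NormalDist().inv_cdf(1.0 - alpha)
    half_width = critical * sqrt(cond_var)
    return t > cond_mean + half_width or t < cond_mean - half_width


# Hypothetical conditional model T | (H = h) ~ N(18.4, 0.09).
for reading in (18.5, 19.2, 17.5):
    print(reading, univariate_fault_test(reading, cond_mean=18.4, cond_var=0.09))
```

Only the conditional mean and variance enter the test, so the humidity observation influences the decision solely through Equation (5.13).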

Bivariate Test

Faulty data in higher dimensions is more difficult to detect than in the univariate case. For example, as shown in Section 2.2.1, an ensemble of individually non-faulty univariate readings may still be faulty as a whole because of their statistical correlation.

To find multivariate faulty data, we use the following result of a multivariate Gaussian distribution.

Definition 5.1 (Mahalanobis Distance/Statistical Distance). Let $X \sim N_p(\mu, \Sigma)$ with $|\Sigma| > 0$. For any observation $x$, the Mahalanobis distance is

$$M(x) = (x - \mu)'\Sigma^{-1}(x - \mu). \tag{5.15}$$

Result 5.3 (Chi-square Distribution of Mahalanobis Distance). Let $X \sim N_p(\mu, \Sigma)$ with $|\Sigma| > 0$. The Mahalanobis distance of a random sample $x$ is distributed as $\chi^2_p$, where $\chi^2_p$ denotes the chi-square distribution with $p$ degrees of freedom.

Proof. See [84] for a sketched proof.
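Result 5.3 can be checked empirically for the bivariate case ($p = 2$). The sketch below is not part of the original derivation: it draws samples from a hypothetical bivariate Gaussian and measures how often the Mahalanobis distance exceeds the upper-$\alpha$ critical value. For $p = 2$ the chi-square survival function is $e^{-x/2}$, so the critical value is $-2\ln\alpha$; all names and parameter values are assumptions made for illustration.

```python
import random
from math import log, sqrt


def sample_bivariate_gaussian(rng, mu, cov):
    """Draw one sample via the Cholesky factor of a 2x2 covariance matrix."""
    l11 = sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = sqrt(cov[1][1] - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2)


def mahalanobis_sq(x, mu, cov):
    """Quadratic form of Eq. (5.15) for p = 2, using the explicit 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    return (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det


def exceedance_rate(alpha, n=20000, seed=0):
    """Fraction of samples whose distance exceeds the chi-square(2) critical value."""
    rng = random.Random(seed)
    mu = (18.0, 39.0)                       # hypothetical mean (T, H)
    cov = ((0.25, -0.08), (-0.08, 0.04))    # hypothetical covariance
    threshold = -2.0 * log(alpha)
    hits = sum(
        mahalanobis_sq(sample_bivariate_gaussian(rng, mu, cov), mu, cov) > threshold
        for _ in range(n)
    )
    return hits / n


print(exceedance_rate(0.05))  # expected to be close to 0.05
```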

According to Result 5.3, to validate a multivariate observation $x$, one first calculates the corresponding Mahalanobis distance according to Equation (5.15), and then checks whether the distance is smaller than a predefined critical value, such as $\chi^2_p(0.05)$ for $\alpha = 0.05$. If the distance is larger than the critical value, the observation should be classified as a faulty ensemble, as the probability of observing an instance at least as distant as $x$ is smaller than $\alpha = 0.05$. The test is summarised in the following theorem.

Theorem 5.3 (Bivariate Fault Test). Let $X \sim N_p(\mu, \Sigma)$ with $|\Sigma| > 0$. For an observation $y$, if

$$M(y) > \chi^2_p(\alpha), \tag{5.16}$$

then

$$P\{d : (d - \mu)'\Sigma^{-1}(d - \mu) > M(y)\} \le \alpha, \tag{5.17}$$

where $\chi^2_p(\alpha)$ denotes the upper $(100\alpha)$th percentile of the $\chi^2_p$ distribution.

Proof. The theorem follows from Result 5.3. Since $M(x) \sim \chi^2_p$ (by Result 5.3), we have

$$P\{d : (d - \mu)'\Sigma^{-1}(d - \mu) \le \chi^2_p(\alpha)\} = 1 - \alpha; \tag{5.18}$$

therefore, Equation (5.17) follows directly.
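A minimal sketch of the test in Theorem 5.3 for the bivariate case is given below. It assumes $p = 2$, for which the upper-$\alpha$ chi-square critical value has the closed form $-2\ln\alpha$; for general $p$ a statistics library would be needed. The function names and the model parameters are hypothetical.

```python
from math import log


def mahalanobis_sq_2d(x, mu, cov):
    """Mahalanobis distance of Eq. (5.15) for p = 2 (explicit 2x2 inverse)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    return (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det


def chi2_critical_2dof(alpha):
    """Upper-alpha critical value of chi-square with 2 degrees of freedom."""
    return -2.0 * log(alpha)


def bivariate_fault_test(x, mu, cov, alpha=0.05):
    """Return True if x is classified as a faulty ensemble (Eq. 5.16)."""
    return mahalanobis_sq_2d(x, mu, cov) > chi2_critical_2dof(alpha)


mu = (18.0, 39.0)                       # hypothetical mean (T, H)
cov = ((0.25, -0.08), (-0.08, 0.04))    # hypothetical covariance, negative correlation

# Both readings lie within one standard deviation of each marginal mean.
print(bivariate_fault_test((18.4, 38.8), mu, cov))  # consistent with the correlation
print(bivariate_fault_test((18.4, 39.2), mu, cov))  # contradicts the correlation
```

With these assumed parameters the second reading is flagged although neither coordinate is unusual on its own, which is the situation the bivariate test is meant to catch.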

5.3.3. Robust Learning

To make the solution robust to faulty learning data, a learning data filter similar to the one used for spatial fault detection in Section 4.3.4 is introduced. The filter tests the current learning data against the model learnt so far, using the chi-square test in Theorem 5.3. If the result is negative, the data entry is incorporated into the model by Equation (5.9); otherwise, it is discarded as a fault. For the same reason presented in ??, namely that a strict chi-square test will over-screen benign learning data and lead to a biased model with underestimated variances, a more conservative critical value, $\alpha_l$, is used for the learning data test than the value used for the fault data test in the operational phase. Algorithm 5.1 summarises the efficient and robust learning algorithm for the local model.

Algorithm 5.1 Robust learning of the local bivariate Gaussian model

Input: $T_i$, $H_i$: sensor readings of temperature and humidity observed at $i$; $n \leftarrow 0$ initially, denoting the current learning data size

    if n < N_l then
        bivariate learning data test via Theorem 5.3 with α = α_l
        if test is positive then
            discard (T_i, H_i) ensemble
        else
            recursively learn model parameters via Theorem 5.2 (update)
            n++
        end if
    end if
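The control flow of Algorithm 5.1 can be sketched as follows. Several parts are assumptions rather than the author's method: the recursive update of Theorem 5.2 and Equation (5.9) is not reproduced in this section, so a Welford-style running mean and covariance update stands in for it; a hypothetical `warmup` count accepts the first few samples untested, because the covariance is not yet invertible; and the chi-square critical value uses the closed form $-2\ln\alpha_l$ valid for $p = 2$.

```python
from math import log


class RobustBivariateLearner:
    """Sketch of Algorithm 5.1 with a stand-in (Welford-style) recursive update."""

    def __init__(self, n_learn, alpha_l, warmup=10):
        self.n_learn = n_learn          # N_l: learning data budget
        self.alpha_l = alpha_l          # significance level for the learning filter
        self.warmup = max(2, warmup)    # assumption: untested initial samples
        self.n = 0
        self.mean = [0.0, 0.0]
        self.scatter = [[0.0, 0.0], [0.0, 0.0]]  # running sum of deviation products

    def covariance(self):
        return [[s / (self.n - 1) for s in row] for row in self.scatter]

    def _is_fault(self, x):
        (a, b), (c, d) = self.covariance()
        det = a * d - b * c
        if det <= 0:
            return False  # model not usable yet, accept the sample
        dx0, dx1 = x[0] - self.mean[0], x[1] - self.mean[1]
        m = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
        return m > -2.0 * log(self.alpha_l)

    def observe(self, t, h):
        """Return True if (t, h) was incorporated into the model."""
        if self.n >= self.n_learn:
            return False  # learning phase is over
        x = (t, h)
        if self.n >= self.warmup and self._is_fault(x):
            return False  # test positive: discard the ensemble
        self.n += 1
        old = [x[i] - self.mean[i] for i in range(2)]   # deviation from old mean
        for i in range(2):
            self.mean[i] += old[i] / self.n
        new = [x[i] - self.mean[i] for i in range(2)]   # deviation from new mean
        for i in range(2):
            for j in range(2):
                self.scatter[i][j] += old[i] * new[j]
        return True


learner = RobustBivariateLearner(n_learn=100, alpha_l=0.01, warmup=5)
for t, h in [(17.5, 39.2), (18.0, 39.0), (18.5, 38.8), (17.8, 39.0), (18.2, 39.1)]:
    learner.observe(t, h)
print(learner.observe(30.0, 39.0))    # implausible temperature, expected to be discarded
print(learner.observe(18.1, 38.95))   # benign reading, expected to be learnt
```

A small $\alpha_l$ makes the filter reject only grossly inconsistent samples, which matches the motivation above of not over-screening benign learning data.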

In document Wireless sensor network control through statistical methods (Page 95-99)