The Proposed Enhanced Reputation-based Secure Data Aggregation Scheme


RSDA, presented in the previous chapter, integrates aggregation functionality with the advantages of a reputation system in order to extend the network lifetime and enhance the accuracy of the aggregated data. However, RSDA is prone to the On-Off attack. Recall the adversary’s behavior when launching this attack, as discussed in Section 3.2. Once the adversary has succeeded in compromising a sensor $x$ in cell $C_k$, the sensor behaves normally until it earns a high reputation score and thereby becomes eligible to be the next cell representative $C_k^{rep}$. Once $x$ has been elected as $C_k^{rep}$ because of its good reputation value, it misbehaves intermittently in order to affect the aggregation results of $C_k$. Switching

Table 5.1: Description of notations used in Chapter 5

| Notation | Description |
|---|---|
| $K_1, K_2$ | Two network-wide shared keys. |
| $C_i$ | The $i$-th cell. |
| $K_{C_i}$ | Intra-cell key for the $i$-th cell. |
| $K_{C_{ij}}$ | Inter-cell key shared between the $i$-th and $j$-th cells. |
| $H(\cdot)$ | Hash function. |
| $MAC_{K_{C_i}}$ | Message authentication code computed by using $K_{C_i}$. |
| $ADV$ | An adversary around the WSN. |
| $T$ | The number of nodes in each cell. |
| $W$ | The total number of compromised nodes in the whole deployment area. |
| $t$ | The minimum number of cell members that are required to revoke a misbehaving $C^{rep}$ or to confirm a new $C^{read}$. |
| $x, y$ | Sensor nodes $x$ and $y$, respectively. |
| $p_x, p_y$ | The physical phenomena reported by sensor nodes $x$ and $y$, respectively. |
| $B$ | The base station. |
| $C_i^{read}$ | The reported (sensed) physical phenomenon from $C_i$. |
| $F$ | An aggregation function. |
| $AR^{Q_n}_{C_i}$ | An aggregation result for query number $Q_n$, obtained by applying $F$ at $C_i$. |
| $\hat{AR}^{Q_n^-}_{C_i}$ | A previous estimate of the aggregation result for query number $Q_n$, which is predicted at $C_i$. |
| $Q_n$ | A query number. |
| $R^{x}_{S/A/F}$ | Reputation value of sensor node $x$ for the Sensing, Aggregation, or Forwarding functionality. |
| $\alpha^{x}_{S/A/F}$ | The number of correct behaviors of sensor node $x$ for the Sensing, Aggregation, or Forwarding functionality. |
| $\beta^{x}_{S/A/F}$ | The number of incorrect behaviors of sensor node $x$ for the Sensing, Aggregation, or Forwarding functionality. |
| $Thr_{A/S/R}$ | The pre-defined threshold for Aggregation, Sensing, or Reputation. |
| $C_i^{\#}$ | The number of inputs to the aggregation function. |
| $a^{Q_n}$ | The absolute deviation score at $Q_n$. |
| $g^{Q_n}$ | The CUSUM score at $Q_n$. |
| $\wedge, \vee$ | The AND and OR operators, respectively. |

between normal and anomalous behavior is important to ensure that the compromised node’s reputation value stays at least equal to the predefined reputation threshold $Thr_R$. For example, $C_k^{rep}$ can alter the aggregation result for consecutive aggregation queries and stop just before its reputation value falls below $Thr_R$, the point at which the other cell members in $C_k$ would initiate the revocation mechanism in order to replace and blacklist the misbehaving cell representative. By doing this, the adversary affects the reported aggregation results and extends the time required to detect its malicious behavior.

Sun et al. [119] discovered that using a fixed forgetting factor can make it easier for an adversary to launch On-Off attacks against a reputation-based trust system. The main idea behind the fixed forgetting factor technique is that performing $k$ good actions at time $t_1$ is equivalent to performing $k\beta^{t_2 - t_1}$ good actions at time $t_2$, where $0 < \beta \le 1$. Sun et al. therefore proposed a scheme inspired by a social phenomenon: it takes long-term interaction and consistently good behavior to build up a good reputation value, whereas only a few bad actions can ruin it. They mimic this phenomenon by introducing an adaptive forgetting factor to defend against OO attacks. In Sun et al.’s solution, the numbers of successful ($r$) and failed ($s$) interactions between two nodes are updated at time $t_2$ as follows:

$$r_{t_2} = r_{t_1}\hat{\beta} + r_{t_2 - t_1} \quad \text{and} \quad s_{t_2} = s_{t_1}\hat{\beta} + s_{t_2 - t_1}$$

where $t_2 > t_1$ and $\hat{\beta} = 1 - \dfrac{r_{t_1} + 1}{r_{t_1} + s_{t_1} + 2}$. However, we found that Sun et al.’s solution is insufficient, because a single misbehavior by a trustworthy sensor node can bring its reputation value into the distrust category. This single misbehavior can be an undelivered message that occurs not because the sensor node intends to misbehave, but because of unreliable wireless communication, which is common in WSNs.

Let us assume that the reputation value of a sensor node $x$ is 0.944 as a result of its 34 successful interactions and 1 failed interaction at $t_1$. If the behavior of sensor $x$ for the current activity (at $t_2$) is judged to be a misbehavior, then the updated reputation value of sensor $x$ will be 0.58. If the predetermined reputation threshold is set to 0.7, then this single failure moves sensor node $x$ from the trust category to the distrust category. In other words, a single failure changes the security state of sensor node $x$ from trusted to distrusted.

In this section, a solution against such an attack is proposed that takes a different approach from Sun et al.’s solution. To mitigate On-Off (OO) attacks in RSDA, a combination of estimation theory and an online change point detection mechanism is suggested. The detection is based on measuring the deviation between the reputation-based aggregation result and the estimate of the aggregation result. Estimation theory provides the estimated value of the aggregation result by finding the mean of the aggregation results based on good historic data. The deviation from this mean helps an intermediate cell evaluate the behavior of its children cells, as will be discussed later. The evaluation result is incorporated into the information gathering and sharing phase of the reputation system as a direct observation of the intermediate cell (see Section 3.1). Consequently, cell representatives at intermediate cells are able to evaluate the aggregation behavior of the representatives of downstream (children) cells, as well as their forwarding behavior, as will be discussed below.
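The single-failure example above can be checked numerically. The following is a minimal sketch, not Sun et al.’s implementation: it assumes that the reputation value is the beta-reputation expectation $(r+1)/(r+s+2)$, which is the same ratio that appears inside $\hat{\beta}$, and the function names are hypothetical. Under this assumption the starting value comes out close to, though not exactly, the 0.944 quoted above, while the updated value matches the quoted 0.58.

```python
def beta_reputation(r, s):
    """Reputation under the ASSUMED beta model: (r + 1) / (r + s + 2)."""
    return (r + 1.0) / (r + s + 2.0)


def adaptive_update(r_old, s_old, r_new, s_new):
    """Discount the old counts by the adaptive forgetting factor, then add
    the interactions observed between t1 and t2."""
    beta_hat = 1.0 - (r_old + 1.0) / (r_old + s_old + 2.0)
    return r_old * beta_hat + r_new, s_old * beta_hat + s_new


# 34 successful and 1 failed interactions at t1, then one failure at t2.
r1, s1 = 34, 1
r2, s2 = adaptive_update(r1, s1, r_new=0, s_new=1)

before = beta_reputation(r1, s1)
after = beta_reputation(r2, s2)
threshold = 0.7  # reputation threshold used in the example

print(f"before={before:.3f} after={after:.3f} trusted={after >= threshold}")
```

Because a high reputation makes $\hat{\beta}$ very small, almost all of the 34 good interactions are forgotten at the moment the failure is recorded, which is why one lost message is enough to cross the threshold.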

Since the proposal extends RSDA, it applies the same network assumptions, data model, and adversarial model. Consequently, the notations used to describe RSDA are also used to describe E-RSDA, with a few additions listed in the last three lines of Table 5.1. The forwarding and sensing behaviors are evaluated in the same way as in RSDA. However, the aggregation behavior is evaluated differently, depending on whether the evaluation is performed on the aggregation results of a cell member’s own cell representative or on the aggregation results of other cell representatives, specifically those of downstream cells. In the former case, a cell member considers the overheard aggregation result, which is calculated by its $C^{rep}$, as normal if the difference between its own aggregation calculation and that of its $C^{rep}$ is bounded by a predefined threshold; otherwise, it is considered abnormal. In the latter case, an intermediate $C^{rep}$ compares the aggregation result, which it calculates itself from the aggregated data reported by downstream cell representatives, with its prediction for the aggregation result, which is calculated based on estimation theory. The aggregation behavior is considered normal if the difference is bounded by a predefined threshold; otherwise, it is considered anomalous. It is important to note that, in this chapter, only the duties of intermediate cells are enhanced to mitigate the OO attack; no modification has been made at leaf cells. The simplified deployment area shown in Figure 5.2 is used in the subsequent paragraphs to illustrate the modifications made to the intermediate cell $C_j$.

Figure 5.2: A simplified deployment area for E-RSDA (legend: intermediate cell, leaf cell, cell member, cell representative, single cell reading, aggregated readings; labelled cells: $C_j$, $C_k$, $C_m$, $C_b$)

At Intermediate Cells. The cell representative $C_j^{rep}$ finds it challenging to evaluate the aggregation behavior of its children cells, because it is not able to overhear all inputs to the aggregation functions they apply. This can be due to poor radio coverage or a limited authentication capability. For example, $C_j^{rep}$ in Figure 5.2 has no access to the key shared between cells $C_b$ and $C_k$ because of its geographic location. This limitation is addressed from the anomaly detection perspective. Most existing anomaly detection approaches follow a centralized architecture in which all the observed data are collected by a central entity. This architecture prevents in-network aggregation within the deployment area, which quickly depletes the limited energy resources of the sensor nodes. Thus, a distributed architecture for anomaly detection is preferable for WSNs because of its flexibility in applying in-network processing, which helps reduce communication energy consumption at intermediate cells.

Estimation theory, online change point detection, and stopping rules are proposed, respectively, to predict the future aggregation result, to detect deviation from the mean of the previous aggregation results, and to verify the nature of the detected change at intermediate cells. The estimation function (the “estimator”) helps the representative of an intermediate cell predict the estimated aggregation result for the next query number, taking previously accepted aggregation results into account. Then, the reputation-based aggregation result is compared with the estimated value to detect any major change in the aggregation behavior of the children cell representatives, while the cumulative sum (CUSUM) score is evaluated to detect small deviations. Once a deviation has been detected, further investigation is needed to identify the nature of the change, which could be due to OO attacks, abrupt changes, or temporary changes in the environment. In this regard, each intermediate cell performs the following tasks:

The aggregation function: Let $C_j^{rep}$, for example, apply the average aggregation function (AVE) to the received readings in order to answer $Q_n$ as follows:

$$AR^{Q_n}_{C_j} = AVE(C_1^{read}, C_2^{read}, \ldots, C_i^{read}, \ldots, C_j^{read}) = \frac{R^{C_1^{rep}} C_1^{read} + R^{C_2^{rep}} C_2^{read} + \cdots + R^{C_i^{rep}} C_i^{read} + \cdots + R^{C_j^{rep}} C_j^{read}}{C_1^{\#} + C_2^{\#} + \cdots + C_i^{\#} + \cdots + C_j^{\#}} \tag{5.3}$$
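Equation 5.3 can be illustrated with a few lines of code. This is a minimal sketch that follows the equation term by term; the function name and the tuple layout `(reputation, reading, input_count)` are assumptions made for the example, not part of E-RSDA.

```python
def reputation_weighted_average(reports):
    """Reputation-based AVE in the spirit of Equation 5.3.

    reports: iterable of (reputation, reading, input_count) tuples, one per
    reporting cell (hypothetical layout chosen for this sketch).
    """
    reports = list(reports)
    # Numerator: each reported reading scaled by its representative's reputation.
    numerator = sum(rep * reading for rep, reading, _ in reports)
    # Denominator: total number of inputs contributed by the reporting cells.
    denominator = sum(count for _, _, count in reports)
    if denominator == 0:
        raise ValueError("no inputs to aggregate")
    return numerator / denominator


# Two cells with one input each: a fully trusted one and a half-trusted one.
example = reputation_weighted_average([(1.0, 20.0, 1), (0.5, 30.0, 1)])
print(example)
```

In this toy call the reading from the less reputable representative contributes only half of its value to the numerator, so a low reputation directly limits a cell’s influence on the result.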

The recursive estimation function: The aggregation result at the intermediate cell can be estimated recursively. In other words, the estimate of the aggregated result of cell $C_j$ ($\hat{AR}_{C_j}$) depends on its previous estimate and the current aggregation result. There is no need to keep all old aggregation results in order to detect changes in the aggregation results. It is believed that the recursive form of the estimation is more practical for real-time applications in WSNs, because it does not require a large amount of memory to store old aggregation results [133, page 594]. The new estimate of the aggregated data, which answers $Q_n$, is calculated as follows:

$$\hat{AR}^{Q_n}_{C_j} = \frac{Q_n - 1}{Q_n}\,\hat{AR}^{Q_n^-}_{C_j} + \frac{1}{Q_n}\,AR^{Q_n}_{C_j} \tag{5.4}$$

which can be further rewritten as:

$$\hat{AR}^{Q_n}_{C_j} = \hat{AR}^{Q_n^-}_{C_j} - \frac{1}{Q_n}\,\hat{AR}^{Q_n^-}_{C_j} + \frac{1}{Q_n}\,AR^{Q_n}_{C_j} \tag{5.5}$$

By combining the last two terms of the right-hand side of Equation 5.5, we get

$$\hat{AR}^{Q_n}_{C_j} = \hat{AR}^{Q_n^-}_{C_j} + \frac{1}{Q_n}\left(AR^{Q_n}_{C_j} - \hat{AR}^{Q_n^-}_{C_j}\right) \tag{5.6}$$
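The recursive form in Equation 5.6 is a running mean that needs only the previous estimate and the query number. The sketch below is illustrative: the function name and the sample values are assumptions, and the query number is taken to start at 1.

```python
def update_estimate(prev_estimate, current_result, query_number):
    """One step of Equation 5.6: move the previous estimate towards the
    current aggregation result by 1/Qn of the residual."""
    return prev_estimate + (current_result - prev_estimate) / query_number


# Hypothetical accepted aggregation results for queries 1..4.
results = [10.0, 12.0, 11.0, 13.0]

estimate = 0.0  # the initial value is irrelevant because Q1 = 1 overwrites it
for q, ar in enumerate(results, start=1):
    estimate = update_estimate(estimate, ar, q)

batch_mean = sum(results) / len(results)
print(estimate, batch_mean)
```

The recursive estimate agrees with the batch mean of all stored results, which is the point of the derivation: the intermediate cell obtains the same value while storing a single number.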

The difference between the reputation-based aggregation result and the estimate of the aggregation result is called the residual. The basic idea here is to compare the current reputation-based aggregation result with the estimated aggregation result $\hat{AR}^{Q_n^-}_{C_j}$ in order to measure the scatter or spread of the aggregation results in a series of aggregation queries. We use the absolute deviation, which is the absolute difference between the current reputation-based aggregation result and the estimate of the aggregation result, to measure the magnitude of the variation in the aggregation results as follows:

$$a^{Q_n} = \left| AR^{Q_n} - \hat{AR}^{Q_n^-} \right| \tag{5.7}$$

If the absolute deviation score ($a^{Q_n}$) is greater than the threshold $Thr_A$, then a major change in the mean of the aggregation results is detected. This change can be either an abrupt or an incipient change in the aggregation results, and it has to be investigated. Thus, the decision rule at this stage can be expressed as:

$$d_1^{Q_n}(a^{Q_n}) = \begin{cases} \text{normal} & a^{Q_n} \le Thr_A \\ \text{alarm} & a^{Q_n} > Thr_A \end{cases} \tag{5.8}$$

Unfortunately, an adversary with a reasonable reputation value can affect the aggregation result slightly, with small deviations (less than $Thr_A$), in order to manipulate the estimate calculation in Equation 5.6. Such a change passes the absolute deviation test and is classified as normal by Equation 5.8. Thus, the CUSUM is used to compute the cumulative sum of the differences between the reputation-based aggregation results and the estimate values. According to Equation 5.2, the CUSUM score ($g^{Q_n}$) can be represented as:

$$g^{Q_n} = g^{Q_{n-1}} + \left( AR^{Q_n} - \hat{AR}^{Q_n^-} \right), \quad \text{where } g^{Q_0} = 0$$

This CUSUM score is then compared with the predefined threshold $Thr_A$ to identify whether small deviations have accumulated in a way that affects the aggregation results. In heterogeneous environments that lack a complete model of the physical phenomena, it is difficult to compute $g^{Q_n}$, since no prior information about the underlying process distribution is available. One way to solve this problem is to use a non-parametric approach, which makes no assumptions about the underlying process probability distribution. In the case of a non-parametric CUSUM algorithm [7], the corresponding decision rule can be expressed as:

$$d_2^{Q_n}(g^{Q_n}) = \begin{cases} \text{normal} & -Thr_A \le g^{Q_n} \le Thr_A \\ \text{alarm} & \text{otherwise} \end{cases} \tag{5.9}$$

If the CUSUM score falls in the range $[-Thr_A, Thr_A]$, then the aggregation behavior is considered normal. However, if the CUSUM score is outside the range $[-Thr_A, Thr_A]$, then an alarm is raised, indicating that small deviations have accumulated, which may or may not affect the estimator function and, in turn, the aggregation result. A stopping rule is used as part of the change point detection algorithm, because no statistical assumptions about the input to the aggregation function are given. Furthermore, whereas the change detection method treats any change in the mean of the aggregation results as abrupt, the stopping rule method allows the change to be either abrupt or incipient [52, page 17]. The latter is what can be expected in heterogeneous environments.
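The two decision rules (Equations 5.8 and 5.9) can be sketched together to show why the CUSUM test is needed in addition to the absolute deviation test. This is an illustrative sketch only: the function names, the threshold value, and the readings are assumptions, and the estimate is deliberately held fixed so that the effect of accumulation is isolated.

```python
def absolute_deviation_rule(ar, prev_estimate, thr_a):
    """Equations 5.7 and 5.8: alarm when the residual magnitude exceeds ThrA."""
    a = abs(ar - prev_estimate)
    return a, ("alarm" if a > thr_a else "normal")


def cusum_rule(g_prev, ar, prev_estimate, thr_a):
    """CUSUM update and Equation 5.9: alarm when the accumulated residual
    leaves the interval [-ThrA, ThrA]."""
    g = g_prev + (ar - prev_estimate)
    return g, ("normal" if -thr_a <= g <= thr_a else "alarm")


thr_a = 1.0      # hypothetical threshold
estimate = 20.0  # estimate held fixed for the illustration (assumption)
g = 0.0          # CUSUM score starts at zero

history = []
for ar in [20.6, 20.6, 20.6]:  # each deviation stays below thr_a
    a, d1 = absolute_deviation_rule(ar, estimate, thr_a)
    g, d2 = cusum_rule(g, ar, estimate, thr_a)
    history.append((d1, d2))

print(history)
```

Every individual residual stays below the threshold, so the first rule never fires, but the accumulated score crosses it at the second query, which is the small-deviation manipulation described above.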

Figure 5.3: A simplified E-RSDA model (the diagram shows measurement sources and a Yes/No decision branch)

Figure 5.3 summarizes the process that should be performed by any intermediate cell. The intermediate cell representative, $C_j^{rep}$, receives aggregation results from its children cells. It performs reputation-based aggregation as described in Equation 5.3. Subsequently, it calculates the absolute deviation score and the CUSUM score, which are subject to a threshold test with $Thr_A$. These two scores $(a^{Q_n}, g^{Q_n})$ are considered error indicators, and a change in the mean of the aggregation results is detected based on them. If the error indicators are within the threshold, then no change in the mean of the aggregation results is detected, because the reputation-based aggregation result is correlated closely enough with $C_j$’s prediction for the aggregation result. After that, $C_j^{rep}$ computes its prediction for the new aggregation result. In other words, the cell representative $C_j^{rep}$ accepts the reported aggregated data from its children cells if $a^{Q_n} \le Thr_A \wedge g^{Q_n} \in [-Thr_A, Thr_A]$. Then, $C_j^{rep}$ updates the reputation values of its children cell representatives by increasing their $\alpha_A$ values, and computes its prediction for the next aggregation result. In contrast with Equation 4.9, $C_j^{rep}$ calculates the reputation value of each of its children cell representatives ($C_m$, $C_i$, and $C_k$ in Figure 5.2) by using the available reputation information about the forwarding and aggregation activities as follows:

$$R^{C^{rep}} = \mu_2\, \frac{\alpha_A^{C^{rep}}}{\alpha_A^{C^{rep}} + \beta_A^{C^{rep}}} + (1 - \mu_2)\, \frac{\alpha_F^{C^{rep}}}{\alpha_F^{C^{rep}} + \beta_F^{C^{rep}}}, \quad \text{where } 0 < \mu_2 \le 1 \tag{5.10}$$
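Equation 5.10 is a weighted combination of two success ratios. The following minimal sketch mirrors it directly; the function name, the default weight, and the counts in the example are assumptions made for illustration.

```python
def combined_reputation(alpha_a, beta_a, alpha_f, beta_f, mu2=0.5):
    """Equation 5.10: weight the aggregation success ratio by mu2 and the
    forwarding success ratio by (1 - mu2)."""
    if not 0.0 < mu2 <= 1.0:
        raise ValueError("mu2 must satisfy 0 < mu2 <= 1")
    if alpha_a + beta_a == 0 or alpha_f + beta_f == 0:
        raise ValueError("at least one observation per functionality is needed")
    aggregation_ratio = alpha_a / (alpha_a + beta_a)
    forwarding_ratio = alpha_f / (alpha_f + beta_f)
    return mu2 * aggregation_ratio + (1.0 - mu2) * forwarding_ratio


# Hypothetical counts: 9 correct and 1 incorrect aggregations,
# 8 correct and 2 incorrect forwardings, equal weights.
score = combined_reputation(9, 1, 8, 2, mu2=0.5)
print(round(score, 2))
```

With $\mu_2 = 1$ the forwarding term vanishes and the reputation depends on the aggregation behavior alone, so the weight controls how strongly a detected aggregation fault is reflected in the reputation of a child cell representative.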

Once $C_j^{rep}$ has detected a change, it starts a fixed window (buffer) of size $S$, keeps a copy of the estimate of the aggregation result as it was before the detected change (temp estimate), and computes the new estimate value taking the change into account. During the window’s lifetime, temp estimate is always used as $\hat{AR}^{Q_n^-}_{C_j}$, since it is the last estimate of the aggregation result before the change was detected. Then, $C_j^{rep}$ classifies the detected change into one of the following categories:

OO Attack. An unpermitted deviation of the reputation-based aggregation result from the estimate of the aggregation result is detected if $(a^{Q_n} > Thr_A \wedge g^{Q_n} < Thr_A) \vee (a^{Q_n} < Thr_A \wedge g^{Q_n} > Thr_A)$ occurs $l$ times during the window length, where $l$ is the attack frequency, meaning that the adversary misbehaves once per $l$ query responses. Once this unpermitted deviation is detected, it is classified as an OO attack, and $C_j^{rep}$ then updates $\beta_A$ for the node that caused the fault and resets the current estimate.

Perturbation. A temporary departure of the aggregation result from the current estimate is detected if $a^{Q_n} > Thr_A \vee g^{Q_n} \notin [-Thr_A, Thr_A]$ for $S$ consecutive responses. The difference between this type of change and the OO attack is that the detected change continues for the whole length of the window. This temporary departure can be either an

Table 5.2: Data sets used in the experiment evaluation

| Scenario | Dataset | Description | Duration | Frequency | # Attacks |
|---|---|---|---|---|---|
| Scenario 1 | Dataset-1 | No Attacks | - | - | - |
| Scenario 2 | Dataset-2 | Abrupt Change | 28 | 1 | 1 |
| Scenario 2 | Dataset-3 | Incipient Change | 28 | 1 | 1 |
| Scenario 3 | Dataset-4 | 1-per-2 OO - F. Block | 1 | 2 | 7 |
| Scenario 3 | Dataset-5 | 1-per-2 OO - L. Block | 1 | 2 | 7 |
| Scenario 4 | Dataset-6 | 1-per-3 OO - F. Block | 1 | 3 | 5 |
| Scenario 4 | Dataset-7 | 1-per-3 OO - L. Block | 1 | 3 | 5 |

abrupt or incipient change. Unfortunately, the absolute deviation and the CUSUM scores
