2.4 Fault Diagnosis Algorithms
2.4.12 Fault Type Based
Based on the persistence of the fault, faulty sensor nodes are classified into four sub-categories: permanent, intermittent, transient, and Byzantine. Permanently faulty sensor nodes are identified by time-out mechanisms [75, 86] or by a minimum energy threshold [3]. A sensor node is considered permanently faulty when it does not reply to a request message within the expected time duration, or when its remaining energy falls below a threshold value. Transient faults occur once during the lifetime of a sensor node. Therefore, they are captured by checking the status of the sensor node at consecutive periods. This consumes more energy than diagnosing a permanent fault. Transient faulty sensor nodes are diagnosed by using any one of the fault diagnosis algorithms discussed in Section 2.4. When a sensor node is fault free for some duration and faulty for some other duration, it is considered intermittently faulty. Intermittent faults are more likely in distributed systems such as multiprocessor and multicomputer systems, computer networks, wireless ad hoc networks, and WSNs.
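The two permanent-fault checks described above (missed reply within a time-out, or remaining energy below a threshold) can be illustrated with a minimal Python sketch. The record layout, the function name `is_permanently_faulty`, and the units are assumptions made for illustration; they are not taken from [3], [75] or [86].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NeighborRecord:
    """State a node keeps about one neighbor (hypothetical layout)."""
    request_sent_at: float               # time the request message was sent (s)
    reply_received_at: Optional[float]   # None if no reply has arrived yet
    remaining_energy: float              # last known residual energy (J)


def is_permanently_faulty(rec: NeighborRecord, now: float,
                          timeout: float, energy_threshold: float) -> bool:
    """Return True if the neighbor fails the time-out or the energy check."""
    if rec.reply_received_at is None:
        # No reply so far: faulty only once the waiting period has elapsed.
        timed_out = (now - rec.request_sent_at) > timeout
    else:
        # A reply arrived: it still counts as a miss if it came too late.
        timed_out = (rec.reply_received_at - rec.request_sent_at) > timeout
    energy_depleted = rec.remaining_energy < energy_threshold
    return timed_out or energy_depleted
```

Treating a late reply the same as a missing reply is a design choice of this sketch; a real protocol may retry the request before declaring the node faulty.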
The intermittent faulty behavior of distributed systems was first explored by Blough et al. [87]. Their algorithms diagnose intermittently faulty processors by using comparison models such as MM and MM*. As multiprocessor systems can be powered at any time, this approach is well suited to them and provides better accuracy in fault diagnosis. Bondavalli et al. [11] have proposed a threshold and count based intermittent fault diagnosis protocol in which they draw a clear distinction between transient and intermittent faulty processors.
Khilar et al. [88] have presented a probabilistic fault diagnosis approach which identifies only the intermittently faulty sensor nodes, based on the remaining energy of each sensor node in a WSN. In their approach, the sensor nodes exchange messages containing their remaining energy. These extra messages consume additional energy for transmission and reception; as a result, the batteries drain quickly and the lifespan of the network is reduced. The approach also places an extra burden on the network in terms of energy, memory and bandwidth, because the diagnosis process follows the broadcast comparison model, in which every sensor node broadcasts its energy value to achieve diagnosis.
Lee et al. [25] have presented a comparison and time redundancy matrix based fault diagnosis approach which detects both intermittent and transient faulty sensor nodes by comparing a node's own sensed reading with its neighboring sensor nodes' data for r consecutive rounds. In each round, the sensor node collects data from its neighboring nodes, computes the absolute difference between its own sensed data and the collected data, and compares the result with a threshold. Two threshold values are used for finding the fault status of a sensor node. One threshold is used to identify the partial fault status of the sensor node in each test interval, and the other is the minimum number of times the node must be found faulty for its final status to be faulty. This approach may not give good accuracy with a constant threshold. Therefore, an optimal and adaptive threshold (which changes dynamically with variation in the neighboring nodes) should be designed to improve the performance of the algorithm.
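The two-threshold idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the algorithm of [25]: the names `theta1` and `theta2`, and the rule that a round counts as faulty when the node disagrees with more than half of its neighbors, are assumptions introduced here because the text does not specify how the per-round comparisons are aggregated.

```python
def partial_status(own, neighbor_readings, theta1):
    """Per-round (partial) status: 1 = looks faulty, 0 = looks fault free.

    Assumption: the node looks faulty in a round when its reading differs
    by more than theta1 from more than half of its neighbors' readings.
    """
    if not neighbor_readings:
        return 0  # no evidence in this round; treated as fault free here
    disagreements = sum(1 for x in neighbor_readings if abs(own - x) > theta1)
    return 1 if disagreements > len(neighbor_readings) / 2 else 0


def diagnose_over_rounds(rounds, theta1, theta2):
    """rounds: list of (own_reading, [neighbor readings]) for r rounds.

    The node is finally declared faulty when it looked faulty in at
    least theta2 of the r rounds.
    """
    faulty_rounds = sum(partial_status(own, nbrs, theta1) for own, nbrs in rounds)
    return faulty_rounds >= theta2
```

With `theta2 = 2`, a single glitch in five rounds is tolerated, while two faulty rounds lead to a faulty decision; this is how time redundancy separates a one-off disturbance from recurring misbehavior in this sketch.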
To overcome the demerits discussed above, Yim et al. [89] have proposed an event diagnosis protocol based on an adaptive, dynamically changing threshold to detect events locally in the presence of intermittently faulty sensor nodes. The confidence level of each sensor node and a threshold based neighbor coordination approach are used for detecting transient and intermittent faulty sensor nodes. The thresholds are adjusted dynamically to detect events more accurately. The traditional time-out mechanism is also used for detecting permanently faulty sensor nodes.
Arunashu et al. [62] have proposed a hybrid fault diagnosis algorithm which diagnoses both intermittent and hard faulty sensor nodes over a static arbitrary topology network. The time-out mechanism is used for identifying hard faulty sensor nodes, and a neighbor coordination based comparison technique is used for identifying intermittently faulty sensor nodes. In the time-out mechanism, each node is associated with a clock value. Before the clock value expires, each node should receive some information from its neighbors. If a node does not receive any information from a neighbor, it declares that missing neighbor to be hard faulty. In the neighbor coordination based comparison technique, each sensor node compares its own sensed data with its neighbors' data, and the comparison is carried out against an application specific threshold value. If more than 50% of the comparison results indicate that the node is faulty, the node is identified as faulty. To determine the test duration, the number of tests required for testing a node, and how many times a node must behave abnormally before it is declared faulty, the authors put emphasis on diagnostic accuracy, diagnosis latency and energy overhead. These three parameters are modeled as a multi-objective optimization problem, which is solved by using the multi-objective particle swarm optimization technique.
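To make the three-objective formulation above concrete, the following Python sketch shows the Pareto-dominance test on which multi-objective methods rely. It is not an implementation of multi-objective particle swarm optimization and is not taken from [62]; the `Candidate` type, its field names, and the numeric values in the usage example are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One diagnosis configuration with its three objective values."""
    name: str
    accuracy: float  # diagnostic accuracy, higher is better
    latency: float   # diagnosis latency, lower is better
    energy: float    # energy overhead, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse in every objective and better in one."""
    no_worse = (a.accuracy >= b.accuracy and a.latency <= b.latency
                and a.energy <= b.energy)
    strictly_better = (a.accuracy > b.accuracy or a.latency < b.latency
                       or a.energy < b.energy)
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical values, for illustration only.
configs = [
    Candidate("A", 0.95, 10.0, 5.0),
    Candidate("B", 0.90, 12.0, 6.0),   # worse than A in all three objectives
    Candidate("C", 0.99, 20.0, 9.0),
    Candidate("D", 0.80, 5.0, 2.0),
]
front = pareto_front(configs)
```

Because accuracy, latency and energy conflict with each other, there is generally no single best configuration; an optimizer returns a set of non-dominated trade-offs such as `front`, from which one is chosen for the application.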
Andreas et al. [79] have proposed a Byzantine fault diagnosis method in which each sensor node sends a set of messages to a group of sensor nodes and also receives messages from the same group. If the number of messages sent is equal to the number of messages received, the sensor node is identified as fault free; otherwise it is identified as faulty. This approach needs multi-hop communication and requires coordination among the nodes to identify the faulty node. Recently, Kuo Feng Su et al. [90] presented a fault diagnosis method for WSNs in which each sensor node establishes two node-disjoint shortest paths [91] and sends its message along these paths. If the sensor node receives the same message that it sent, the node is identified as fault free; otherwise it is labeled as faulty. This approach needs multi-hop communication and requires more time to establish the paths.
2.5 Conclusion
A comprehensive study of fault diagnosis algorithms is given in this chapter. The literature study shows that a good number of fault diagnosis schemes have been proposed to date for various kinds of distributed networks such as ad hoc networks, WSNs, and wireless networks. The system and fault models for the various kinds of systems to which the diagnosis algorithms are applicable have been discussed. The classification of fault diagnosis algorithms has been presented. The suitability of self fault diagnosis algorithms has been emphasized in comparison with centralized and distributed diagnosis, which are not energy efficient. The shortcomings and advantages of various fault diagnosis algorithms have also been discussed.
Distributed Self Fault Diagnosis Algorithm in WSNs Using Neighbor Coordination
In this chapter, a distributed self fault diagnosis algorithm is proposed to identify both hard and soft faulty sensor nodes in wireless sensor networks. The algorithm is distributed and self diagnosable, and can diagnose the most common faults such as stuck at zero, stuck at one, random data and hard faults. In this approach, each sensor node gathers the observed data from its neighboring sensor nodes and computes the mean to check for the presence of a faulty sensor node, which reduces the processing overhead. If a sensor node detects the presence of a faulty sensor node, it compares its own observed data with the data of the neighbors and predicts the probable fault status. The final fault status is determined by diffusing the fault information from the neighbors. The accuracy and completeness of the algorithm are verified by statistical analysis over sensor data.
3.1 Introduction
During the life span of a wireless sensor network, a number of unexpected situations arise, such as the misbehavior of sensor nodes due to the occurrence of various kinds of faults [3, 9, 33, 35]. Faults occur in wireless sensor networks (WSNs) due to a number of causes such as malfunctioning of hardware and software units, malicious interference, battery exhaustion or natural calamities. The presence of faulty sensor nodes [...] fault diagnosis of sensor nodes in order to obtain correct data from WSNs.
The proposed distributed self fault diagnosis algorithm considers both the soft and the hard faulty behavior of sensor nodes. In the proposed algorithm, every sensor node in the network shares its sensed data with its neighbors and predicts the probable fault status of every other sensor node. After the probable fault statuses are shared, a voting scheme is used to decide the final fault status. The main contributions of this chapter are: (i) the design and evaluation of an efficient distributed self fault diagnosis algorithm for diagnosing hard and soft faulty sensor nodes in WSNs; (ii) the use of the mean to detect the presence of a faulty sensor node in the neighborhood, which reduces the computational time; (iii) an implementation of the algorithm in NS3 [38]; and (iv) a comparison of the performance of the algorithm with existing algorithms [6, 40]. The results show that the proposed distributed self fault diagnosis using neighbor co-ordination (DSFDNC) algorithm requires fewer communications than the existing algorithms, which makes it energy efficient.
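The three steps outlined above (sharing data with neighbors, a mean-based check for the presence of a fault, and voting on the final status) might be sketched in Python as follows. This is a simplified illustration for soft faults only and is not the DSFDNC algorithm as specified later in the chapter. The threshold `theta`, the rule that the neighborhood is considered fault free when every reading lies within `theta` of the mean, the majority vote, and all function names are assumptions; the neighbor relation is assumed to be symmetric.

```python
from statistics import mean

FAULTY, FAULT_FREE = "faulty", "fault_free"


def probable_statuses(own, neighbor_data, theta):
    """Probable status of each neighbor as seen by one node.

    neighbor_data maps neighbor id -> reading received from that neighbor.
    """
    readings = list(neighbor_data.values()) + [own]
    m = mean(readings)
    # Cheap check first (assumed rule): if every reading is close to the
    # neighborhood mean, no fault is present and pairwise tests are skipped.
    if all(abs(x - m) <= theta for x in readings):
        return {n: FAULT_FREE for n in neighbor_data}
    # Otherwise compare the node's own reading with each neighbor's reading.
    return {n: FAULTY if abs(own - x) > theta else FAULT_FREE
            for n, x in neighbor_data.items()}


def final_status(votes):
    """Majority vote over the probable statuses reported about one node."""
    return FAULTY if votes.count(FAULTY) > len(votes) / 2 else FAULT_FREE


def diagnose(readings, neighbors, theta):
    """readings: node id -> sensed value; neighbors: node id -> neighbor ids."""
    opinions = {
        i: probable_statuses(readings[i],
                             {j: readings[j] for j in neighbors[i]}, theta)
        for i in readings
    }
    # Each node's final status is decided from what its neighbors say about it.
    return {i: final_status([opinions[j][i] for j in neighbors[i]])
            for i in readings}


# Hypothetical four-node neighborhood in which node "d" is stuck at zero.
readings = {"a": 20.0, "b": 20.2, "c": 19.9, "d": 0.0}
neighbors = {i: [j for j in readings if j != i] for i in readings}
result = diagnose(readings, neighbors, theta=1.0)
```

In this example the faulty node "d" marks all of its neighbors as faulty, but it is outvoted by the fault free majority, so only "d" ends up with a faulty final status.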
The remainder of the chapter is organized as follows. The system model is presented in Section 3.2. The proposed distributed self fault diagnosis algorithm using a neighbor co-ordination approach (DSFDNC) is described in Section 3.3. The algorithm is shown analytically to be correct in Section 3.4. The simulation results are described and the performance is compared with existing fault detection algorithms in Section 3.5. Finally, Section 3.6 concludes the chapter with discussions.