1.5 Publications
The work in this thesis on the GAD anomaly detection framework and the DMGA grouping technique has been published in IEEE Transactions on Industrial Electronics and IEEE Communications Magazine. Our research on the multi-feature dimensionality reduction (MFDR) technique is currently under revision, while an energy-aware data fusion technique has been accepted and is to appear in IEEE INFOCOM 2016. The complete list of publications related to this thesis is as follows:
1. P.Y. Chen, S. Yang, J.A. McCann, Distributed Real-time Anomaly Detection in Networked Industrial Sensing Systems, IEEE Transactions on Industrial Electronics, pp. 3832-3842, 2015.
2. P.Y. Chen, S. Yang, J.A. McCann, J. Lin, X. Yang, Detection of False Data Injection in Smart-Grid Systems, IEEE Communications Magazine, pp. 206-213, 2015.
3. P.Y. Chen, W. Yu, J.A. McCann, MFDR: A Multi-Feature Dimensionality Reduction Representation of Time Series (under revision).
4. S. Yang, Y. Tahir, P.Y. Chen, J.A. McCann, Distributed Optimization in Energy Harvesting Sensor Networks with Dynamic In-network Data Processing, IEEE INFOCOM, 2016 (to appear).
Chapter 2
Background
In this chapter, we present the background to our research on anomaly detection in NDSSs. In Chapter 2.1, we introduce networked distributed sensing systems (NDSSs) and formally define their components. The spatiotemporal correlation in NDSSs is discussed in Chapter 2.2 with real-world examples. In Chapter 2.3, we discuss the anomalies that can arise in NDSSs from two perspectives: their causes and their patterns. Finally, in Chapter 2.3.3, a general anomaly-management flowchart is presented to summarise how anomalies are handled in real-world NDSSs. Note that related work on NDSS anomaly detection is presented in Chapter 3.
2.1 Networked Distributed Sensing Systems (NDSSs)
Networked distributed sensing systems (NDSSs) exist in a large body of industrial applications [GH09, Che14, YCK+10], such as weather forecasting [MZA12], pollution control [BSP12], structural-health monitoring [CCXS10], smart water systems [OUS+08], machine-condition monitoring [LG09, KNS14], and smart power-grid systems [GLH10]. Fig. 2.1 illustrates an example of an NDSS in a smart grid system. As can be seen, NDSSs consist of spatially distributed autonomous sensor nodes, which dynamically collect comprehensive information from their environments in real time. These sensor nodes are connected by information and communication technology (ICT) infrastructures, such as wireless communications. Through these ICT infrastructures, sensor nodes are able to communicate with each other and send their measurements back to base stations (e.g. server and perhaps laptop-class devices) for further analysis.
[Figure 2.1 labels: Nation-wide Smart Grid; Local Smart Grid; Smart Meter; Bulk Generators; Energy. Legend: Energy Transmission Lines; Cyber Links; Demand Nodes; Supply Nodes.]
Figure 2.1: The conceptual illustration of a smart grid system.
With NDSSs, we are able to collect measurements from large-scale sensor deployments in real time. In cyber-physical systems, this makes it possible to realise real-time control systems, perhaps even at nation-wide scale. Taking the smart-grid system as an example again, the data collected by NDSSs allows information systems to perform predictive analysis, which helps balance power production and consumption in the grid through smart pricing techniques. Therefore, energy distribution (which controls the energy generation, consumption, and transmission processes) can be performed in a more dynamic and efficient manner [MKB+12, LNR11].
In the Internet of Things (IoT) and smart monitoring systems (e.g. agriculture and structural health monitoring), the comprehensive information provided by NDSSs allows us to access and better understand complex residential behaviours and environments, respectively. This can potentially help to solve difficult problems, such as the spread of viruses or the correlation between water, power and food production, which are very difficult to address without fine-grained analytic data.
2.1.1 Limitations and Challenges in Real-World NDSSs
Although NDSSs have been widely adopted in many real-world applications, they still suffer from various limitations and technical challenges [GH09], which can be summarised as follows:
• Resource-limited and unreliable sensor nodes: In order to realise large-scale, cost-effective and sustainable deployments, NDSSs typically consist of resource-limited sensor nodes. These sensor nodes are usually powered by batteries or charged by energy-harvesting components (e.g. solar panels). Due to the limited energy supply, they are typically built from electro-mechanical sensors and micro-controllers to reduce their size and power consumption. Thus, complicated algorithms (e.g. video streaming) that perform well in traditional PC environments may not be directly adoptable in NDSSs. Furthermore, to reduce deployment costs, many sensor nodes are constructed from low-cost components without comprehensive protection. Consequently, they are prone to unexpected faults.
• Limited communication capacity: In NDSSs, radio-frequency (RF) modules and network interface controllers (NICs) are the most power-consuming components of a sensor node. Performing data analysis in a centralised manner with such devices can be very expensive, since all sensor measurements must be sent back to data sinks. Therefore, it is preferable to push the data analysis task to the edge of the network (i.e. each sensor node) in a distributed manner, where analytical data is first extracted from the raw data by light-weight algorithms to reduce its size before being sent back to the base station for further analysis. This edge analytics can also drive local control decisions. With this design, the amount of data that needs to be transmitted back can be significantly reduced.
• Heterogeneous and dynamic environmental conditions: NDSSs can be deployed in heterogeneous environments where the environmental conditions, surrounding physical phenomena, and sensor types can have totally different properties. Therefore, it can be difficult to transplant a given algorithm from one NDSS to another since the basic