In Cyber-Physical Systems, especially in the Internet of Things, sensors monitor physical attributes such as light, temperature, noise, movement and humidity. The data communicated by sensors consist of time-series values that are sampled over a defined period and then transmitted to a sink/gateway for further processing. In this section, we introduce information processing and abstraction methods for Cyber-Physical Data, in particular time-series and numerical data processing algorithms.
Time-series data is not as easily interpretable as, for instance, a document, a video or other data available on the Internet. Platforms such as Xively^9 (formerly Cosm) or Nimbits^10 allow publishing and visualising streaming data from sensor devices; however, they lack processing and analytic features. The data remains in its raw condition, which makes it difficult to detect interesting information, especially given the vast number of sensors that will be connected to the Internet in the future and the resulting challenges that form the Big IoT Data issue [34].
In the research domain of sensor networks there are well-investigated topics such as event and pattern detection, data mining and context-aware computing [134]. However, most approaches use raw sensor data for their analysis in a specific application domain [29, 124, 66, 128, 32], where it can be assumed which events and which particular information are going to be detected. With the emerging large volumes of heterogeneous data and their various application scenarios, new domain-independent approaches are needed that can abstract from the underlying data and enable a human/machine-interpretable representation of the data. Sensor abstraction from raw data has two major advantages: a) As a replacement for raw sensor data, abstractions can be used for further processing and annotation. Abstractions are less granular than raw data and therefore require less storage space and communication traffic. b) Abstractions are easier for the end-user to understand and for automated machine processes to interpret. For instance, instead of transmitting the raw samples [-5 C, -3 C, ..., -2 C, 0 C, -4 C] it might be more valuable and require less communication to transmit an abstract concept such as "cold". The higher the level to which the data is abstracted, the lower its communication cost. However, this comes with the trade-off of losing some part of the information, and it also requires information about the context in which the data has been obtained [118]. The required granularity of the information depends on the application and/or the requirements of the end-user.
In this section, we focus our attention on approaches and methods that can be used to abstract from the raw data to higher-level representations and that can be run on middleware solutions. In the following, we state more precisely the definition of information abstraction and the motivations behind its application. We introduce a workflow with several steps from pre-processing to the representation of abstractions. For each step, we provide some possible algorithms and methods that can be applied, give an overview of the state of the art in information abstraction from a technical and research point of view, and discuss the current requirements for information abstraction.
9 https://xively.com/
2.4.1 Abstraction and Knowledge Representation
This section defines and discusses the term information abstraction from sensor data and its different forms of representation, including different levels of abstraction and its distinction from other research areas, and discusses the motivation for and challenges of creating abstractions from sensor data.
2.4.1.1 What is an Abstraction?
The term abstraction, as we use it in this work, was coined in the area of context-aware computing, where it describes the transition between different levels of context incorporation, from a sensing layer to a perception layer and finally to a situation layer [21]. This transitioning process is defined by Chen and Kotz [16] as deriving higher-level context data from lower-level (raw) sensor data by collecting, aggregating and inferring raw data with additional knowledge from the environment, with the goal of adjusting the sensor device's behaviour to the current context. With the Internet of Things, where data eventually has to be made available and understandable for the end-user, the focus of abstraction moves from a device point of view to a more user-centric position. Sigg et al. [118] define abstraction as the "amount of processing applied to the data" with the goal of raising the level of context abstraction, including the error probability induced by each transition.
2.4 DATA PROCESSING FOR SENSOR NETWORKS
We define two granularity levels of abstraction with the aim of representing knowledge with a user-centric focus: lower-level abstraction (or data abstraction) and higher-level abstraction (or semantic abstraction). We define the process of abstraction as the derivation of more valuable and understandable information from raw data.
Lower-level abstractions represent atomic and static information that can be obtained by gathering data from a single local sensor stream and by combining the data with information from the local sensor's metadata such as type, range and capabilities. Atomic in this case means that this is the lowest abstraction level after processing of raw sensor data. Static in this context means that the abstraction is a single and independent observation made at a fixed point in time and does not include information about a sequence of observations. Mäntyjärvi [91] describes this as the "smallest atomary quantity of context information with semantic meaning". For instance, a door sensor can measure two states: either the door is open (0) or closed (1) (assuming that a door cannot be half-open and must be either open or closed). The abstractions "open" and "closed" represent the situation and cannot be split into smaller abstractions. Neither abstraction refers to a sequence of actions over time. Such data-level information can be obtained through data processing techniques such as pattern and event detection that analyse the raw sensor data of a single node and inform the user/network about the occurrence of the event.
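The door example can be illustrated with a minimal sketch that maps a single raw reading to an atomic, static label using the sensor's metadata. The function name, the metadata layout and the temperature threshold for "cold" are assumptions made for illustration only; the text does not prescribe them.

```python
# Hypothetical threshold for the label "cold"; the text gives no numeric value.
COLD_BELOW_CELSIUS = 5.0


def lower_level_abstraction(metadata, raw_value):
    """Map one raw reading to an atomic label using sensor metadata.

    `metadata` is an assumed dictionary with the keys "type" and "range".
    The result depends on a single observation only, not on a sequence.
    """
    low, high = metadata["range"]
    if not low <= raw_value <= high:
        raise ValueError(f"reading {raw_value} is outside the sensor range")
    if metadata["type"] == "door":
        if raw_value not in (0, 1):
            raise ValueError("a door sensor reports either 0 or 1")
        # Encoding taken from the text: 0 = open, 1 = closed.
        return "open" if raw_value == 0 else "closed"
    if metadata["type"] == "temperature":
        return "cold" if raw_value < COLD_BELOW_CELSIUS else "not cold"
    raise ValueError(f"unsupported sensor type: {metadata['type']}")


door = {"type": "door", "range": (0, 1)}
thermometer = {"type": "temperature", "range": (-40.0, 60.0)}
print(lower_level_abstraction(door, 0))
print(lower_level_abstraction(thermometer, -5.0))
```

Keeping the mapping per sensor type reflects the idea that the metadata, not the raw number alone, determines which label is meaningful.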
Higher-level abstractions, however, can be inferred by observing several sources of lower-level abstractions to get the global picture of occurring activities and multivariate events. A certain pattern of open and closed doors during specific times of the day, together with other lower-level abstractions, can lead to the higher-level abstractions "beginning of work day" and "end of work day". Higher-level abstractions can be obtained by machine learning techniques such as classification and clustering of lower-level abstractions over time. Other approaches, such as logical inference with the help of reasoning mechanisms and rule-based systems, can also be used for this purpose.
The representation form of the abstraction can vary across applications for sensor data. Graphical user interfaces, including geographical maps, can visualise the abstracted data and allow the end-user to perceive information, events and changes in the environment quickly, sometimes even without being an expert in the field. Semantic representations of information, such as those defined in the Semantic Sensor Ontology [19], can provide interlinked information obtained by the abstraction process to the user and can be used to query the status of the real world. Transferring the abstractions into a machine-understandable format also leads to a more machine-interpretable representation and can raise the interoperability of the data.
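The rule-based route from lower-level to higher-level abstractions mentioned in this section can be sketched as follows. This is only an illustration under stated assumptions: the event format, the function name and the hour ranges that count as morning and evening are hypothetical and would have to come from the application context.

```python
def infer_work_day_events(door_events, start_hours=(6, 10), end_hours=(16, 20)):
    """Derive higher-level abstractions from lower-level door abstractions.

    `door_events` is an assumed list of (hour_of_day, label) tuples whose
    labels are the lower-level abstractions "open" and "closed".
    The hour ranges are hypothetical rule parameters.
    """
    opens = [hour for hour, label in door_events
             if label == "open" and start_hours[0] <= hour < start_hours[1]]
    closes = [hour for hour, label in door_events
              if label == "closed" and end_hours[0] <= hour < end_hours[1]]
    higher_level = []
    if opens:
        # Rule: the first door opening in the morning marks the start of work.
        higher_level.append((min(opens), "beginning of work day"))
    if closes:
        # Rule: the last door closing in the evening marks the end of work.
        higher_level.append((max(closes), "end of work day"))
    return higher_level


events = [(7.5, "open"), (7.6, "closed"), (12.0, "open"),
          (12.1, "closed"), (17.2, "open"), (17.3, "closed")]
print(infer_work_day_events(events))
```

A learned classifier could replace the two hand-written rules without changing the input and output of the function.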
[Figure 7 diagram. Stages: Raw Data Collection, Preprocessing, Dimensionality Reduction, Feature Extraction, Abstraction/Inference, Representation. Labels: Low-level abstractions, High-level abstractions. Techniques listed: signal processing (bandpass filter, high/lowpass); mathematical (min, max, mean, median, variance, standard deviation, correlation, integration); wavelets (Haar, discrete); spectrum (FFT); other (PAA, variable PAA, SAX); KMeans; Markov chains; rules; Semantic Web]
Figure 7: Common Information Abstraction process, defined by examining different approaches
2.4.1.2 Motivation for Information Abstraction
There is a huge demand for new data processing techniques and concepts to cope with the issues of the big data problem. We argue that information abstraction can be used as a means to reduce the deluge of data. Focusing on the abstracted information rather than the numerical data can bring two main advantages: network traffic reduction and enhanced comprehensibility for the end-user. Instead of transmitting the raw data to the user, abstracted data can contain less data but focus on the information that is useful for the user. In contrast to lossless compression techniques, abstraction does not focus on reconstructing the initial data but allows extracting the information that is interesting for the user. Data abstraction can be used as a fundamental base for existing approaches such as outlier detection, activity recognition and other emerging areas in the domain of sensor networks.
Information abstraction exploits several techniques and methods from different research areas to provide the user with comprehensible information from a large amount of raw data; these techniques are introduced in the following.
2.4.1.3 Creating abstractions
In the following, we introduce a general workflow that has been defined by examining several different approaches for information abstraction in the domain of sensor data (details in Section 2.4.6). The approaches that have been examined either follow the workflow as shown in Figure 7 or implement certain parts of it. Therefore, we extracted the following main steps that serve as a common ground for the workflow: pre-processing to bring the data into shape for further processing, dimensionality reduction to either aggregate the data or reduce its feature vectors, feature extraction to find lower-level abstractions in local sensor data as defined in Section 2.4.1, abstraction from lower-level abstractions to higher-level abstractions, and finally representation to make the abstracted data available to the end-user and/or machines that can interpret the abstracted data. We introduce the different steps and key techniques used in this domain. All methods that are demonstrated use a synthesised test data set. The synthesised data set consists of 2048 samples. The first 1024 samples are Gaussian random numbers between 0 and 100, the next 512 samples are Gaussian random numbers between 0 and 300, and the last 512 samples are again between 0 and 100. This has been chosen to model some kind of activity in between two periods of no activity and also to represent dynamicity in the data.
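A data set with this shape could be generated roughly as follows. The text does not state the mean, the standard deviation or how values are kept inside the stated bounds, so the sketch assumes a mean at the centre of each interval, a standard deviation of one sixth of its width, and clipping to the bounds. The function names and the seed are likewise hypothetical.

```python
import random


def gaussian_segment(rng, count, low, high):
    """Draw `count` Gaussian values and clip them to [low, high].

    Assumptions (not given in the text): mean at the interval centre,
    standard deviation of one sixth of the interval width, clipping.
    """
    mean = (low + high) / 2.0
    std = (high - low) / 6.0
    return [min(high, max(low, rng.gauss(mean, std))) for _ in range(count)]


def synthesise_test_data(seed=0):
    """Build the 2048-sample set: quiet period, activity, quiet period."""
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    return (gaussian_segment(rng, 1024, 0.0, 100.0)
            + gaussian_segment(rng, 512, 0.0, 300.0)
            + gaussian_segment(rng, 512, 0.0, 100.0))


data = synthesise_test_data()
print(len(data))
```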
2.4.2 Pre-Processing Methods
The raw sensory data passes through a pre-processing stage to prepare the data for further steps. Pre-processing can be done on the sensor node to reduce transmission cost and filter unwanted data. This can include mathematical/statistical methods that smooth the data by applying moving-average windows, or methods from signal processing such as band-, low- and high-pass filters that focus on certain frequency spectra. Transmission cost can be reduced by sending only certain information about the current sampling window to the base station/gateway, such as the minimum and/or maximum values or the mean value of the current window.
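The idea of sending only a summary of each sampling window can be sketched as below. The function name, the non-overlapping window layout and the dictionary format of the summary are assumptions for illustration; a real node would choose them according to its protocol.

```python
def summarise_windows(samples, window_size):
    """Reduce each non-overlapping window to its min, max and mean.

    Only the summaries would be sent to the base station, so three values
    replace `window_size` raw samples per window.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    summaries = []
    for start in range(0, len(samples), window_size):
        window = samples[start:start + window_size]  # last window may be shorter
        summaries.append({
            "min": min(window),
            "max": max(window),
            "mean": sum(window) / len(window),
        })
    return summaries


print(summarise_windows([1, 2, 3, 4, 5, 6], window_size=3))
```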
Pre-processing is not limited to a single sensor node: certain approaches use in-network processing to aggregate the data before further processing by finding the minimum, mean or maximum value in a set of sensor nodes before transmitting the data to the base station. Besides local aggregation, in-network techniques can also be used to improve the accuracy of the data by calculating the correlation with data from neighbouring nodes. The survey of Figo et al. [37] describes pre-processing techniques in more detail with regard to time and space complexity. The pre-processing techniques introduced in this section are shown in Figure 8 and described in the following:
2.4.2.1 Signal Pre-Processing
A filter can be either a simple hardware circuit or a simple algorithm that removes unwanted parts of a signal in the frequency domain by cutting the signal above/below a certain frequency. This has the advantages that less data has to be transmitted and that further processing steps work on a focused dataset instead of analysing the raw data including the background noise. However, the trade-off for a cut in the data is that outliers or other interesting data can go missing.
Figure 8: Pre-Processing Techniques
Low/High-Pass Filter: A low-/high-pass filter cuts off the signal in the frequency domain above/below a certain threshold, called the cut-off frequency. Arora et al. [6] use a low-pass filter to smooth the signal in order to prevent a split of activities in the later processing. Eriksson et al. [32] use a high-pass filter to remove low-frequency components in a road-anomaly detection scenario where sensors are deployed on a car. The filter removes subtle changes in the acceleration signal and passes only high-frequency signals that are most probably caused by holes and cracks in the road.
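As a rough illustration, low- and high-pass filtering can be approximated on a sensor node with first-order recursive (RC-style) filters. This is a swapped-in stand-in, not the method of the cited works: such filters attenuate gradually around the cut-off frequency instead of cutting sharply, and the function names and the choice of initial output values are assumptions.

```python
import math


def _time_constant(cutoff_hz):
    """RC time constant that corresponds to the cut-off frequency."""
    return 1.0 / (2.0 * math.pi * cutoff_hz)


def low_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order low-pass: keeps slow changes, damps fast ones."""
    if not samples:
        return []
    dt = 1.0 / sample_rate_hz
    alpha = dt / (_time_constant(cutoff_hz) + dt)
    out = [float(samples[0])]  # assumption: start at the first sample
    for x in samples[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out


def high_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order high-pass: removes the slow trend, keeps fast changes."""
    if not samples:
        return []
    dt = 1.0 / sample_rate_hz
    rc = _time_constant(cutoff_hz)
    alpha = rc / (rc + dt)
    out = [0.0]  # assumption: the signal starts in a steady state
    for prev_x, x in zip(samples, samples[1:]):
        out.append(alpha * (out[-1] + x - prev_x))
    return out


constant = [5.0] * 50
alternating = [(-1.0) ** i for i in range(200)]  # fastest possible oscillation
print(low_pass(alternating, 100.0, 1.0)[-1], high_pass(alternating, 100.0, 1.0)[-1])
```

With a 100 Hz sampling rate and a 1 Hz cut-off, the constant signal passes the low-pass filter unchanged and is removed by the high-pass filter, while the rapidly alternating signal behaves the other way round.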
Bandpass Filter: A bandpass filter has two cut-off frequencies, a lower and an upper one, and passes only the part of the signal in between. Stocker et al. [120] use a bandpass filter to pre-process signals from a vibration sensor deployed on a road pavement to retrieve only data that is created by passing cars. Wang et al. [128] use bandpass filters for bird observation, as it is known that the birds produce sound only in a certain frequency range. Olfati-Saber [99] introduces an approach for a distributed filter that includes several high- and low-pass filters deployed over a sensor network to minimise the overall background noise and increase the accuracy of the observations by combining data from several sensor nodes.
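A bandpass filter with two sharp cut-off frequencies can be sketched in the frequency domain: transform the window, zero every frequency bin outside the band, and transform back. The sketch uses a naive O(N^2) discrete Fourier transform from the standard library to stay self-contained; the function names and the example signal are assumptions, and a practical implementation would use an FFT library.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(N^2)), fine for short windows."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def inverse_dft(spectrum):
    """Inverse transform; returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def band_pass(samples, sample_rate_hz, low_cut_hz, high_cut_hz):
    """Keep only components whose frequency lies in [low_cut, high_cut]."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 mirror the lower bins for a real-valued signal.
        frequency = min(k, n - k) * sample_rate_hz / n
        if not low_cut_hz <= frequency <= high_cut_hz:
            spectrum[k] = 0j
    return inverse_dft(spectrum)


# Example: one second sampled at 64 Hz with components at 2, 10 and 25 Hz.
n = 64
signal = [math.sin(2 * math.pi * 2 * t / n)
          + math.sin(2 * math.pi * 10 * t / n)
          + math.sin(2 * math.pi * 25 * t / n) for t in range(n)]
filtered = band_pass(signal, 64.0, 5.0, 15.0)  # only the 10 Hz part remains
```

Setting the lower cut-off to 0 turns the same function into a sharp low-pass filter, and setting the upper cut-off to half the sampling rate turns it into a sharp high-pass filter.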
2.4.2.2 Mathematical/Statistical Pre-Processing
In contrast to signal processing, mathematical pre-processing techniques work on the data values instead of on the signal in the frequency domain. Data windows are used to aggregate the data over a window period and either transmit it to the base station (e.g. gateway) for further processing or disseminate the aggregated data over the network for in-network processing.
Min, Max: The difference between the minimum and maximum inside a sample window can be used as a pre-processing step for further feature detection. Farringdon et al. [35] use the averages together with the range of the min/max difference to detect the orientation of a sensor badge attached to a person. Based on these values they detect whether the person is standing, sitting or lying down.
Mean, Median: The mean or median is usually used to smooth the data by removing peaks and noise from the signal. The moving average (median) can be applied to streaming data by taking only the last n values into consideration and then successively shifting the sliding window forward. Ghasemzadeh et al. [47] use the moving average as a pre-processing step in a body sensor network to detect patterns in the neuromuscular system based on EEG signals. In their application scenario the moving average is used to cancel high-frequency noise.
Variance, Standard Deviation: Both variance and standard deviation are used to represent the volatility of the data. Golding and Lesh [50] calculate the variance and standard deviation of the raw data to track people with cheap sensor devices.
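The sliding-window statistics described above can be sketched with a fixed-length buffer over a stream. The generator name and the output format are assumptions, and the sketch uses the population variance and standard deviation; whether the cited works use the population or the sample variant is not stated in the text.

```python
import statistics
from collections import deque


def sliding_statistics(stream, n):
    """Yield mean, median, variance and standard deviation of the last n values.

    Nothing is emitted until the window holds n values; afterwards the
    window shifts forward by one sample per incoming value.
    """
    window = deque(maxlen=n)  # the oldest value drops out automatically
    for value in stream:
        window.append(value)
        if len(window) == n:
            yield {
                "mean": statistics.fmean(window),
                "median": statistics.median(window),
                "variance": statistics.pvariance(window),
                "std": statistics.pstdev(window),
            }


readings = [1, 2, 3, 100, 5]  # 100 is a single peak
for stats in sliding_statistics(readings, 3):
    print(stats)
```

In this example the median stays close to the neighbouring values while the mean is pulled up by the peak, which is why the median is often preferred when isolated spikes should be suppressed.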
Correlation, Integration: Especially with multi-dimensional data from accelerometers, correlation and integration are used to obtain velocity and position. By integrating the acceleration over time, the velocity is obtained, and by integrating the velocity, the distance travelled can be approximated.
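Numerical integration of accelerometer samples can be sketched with the trapezoidal rule, applied once to obtain velocity and a second time to obtain distance. The function name, the constant sampling interval and the zero initial values are assumptions; in practice sensor bias makes the integrated error grow over time, which this sketch does not address.

```python
def cumulative_trapezoid(values, dt, initial=0.0):
    """Running integral of equally spaced samples using the trapezoidal rule."""
    result = [initial]
    for previous, current in zip(values, values[1:]):
        result.append(result[-1] + 0.5 * (previous + current) * dt)
    return result


# Example: constant acceleration of 2 m/s^2 sampled at 10 Hz for one second.
dt = 0.1
acceleration = [2.0] * 11
velocity = cumulative_trapezoid(acceleration, dt)   # ends near 2 m/s
distance = cumulative_trapezoid(velocity, dt)       # ends near 1 m
print(velocity[-1], distance[-1])
```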
2.4.3 Dimensionality Reduction
To cope with the large amount of data that has to be processed and stored, dimensionality reduction techniques can be applied to reduce the size and length of the data while keeping its key features and patterns.
The goal of dimensionality reduction is to reduce an input vector of length n to a reduced vector of length M, where M ≪ n. Different methods have been introduced that either aggregate the data or filter certain samples of the original data to reduce the length of the initial data. This section gives an overview of some of the frequently used techniques.
Discrete Fourier Transformation: The Discrete Fast Fourier Transformation (DFFT) transforms a signal from the time domain to the frequency domain. The signal is aligned along the frequency axis, resulting in an output vector of coefficients ranging from low to high frequencies. To reduce the dimensionality of the original time-series data, the data is transformed via DFFT into its Fourier coefficients. Then only the first few coefficients are used to represent the original sequence. The shortened transformed vector is subsequently used in the inverse DFFT to reconstruct an approximation of the original data. The formulas for the transformation and the inverse transformation (= reconstruction) are shown in Equation 1, where N is the length of the signal, X_k the transformed signal and x_n the reconstructed signal. In Figure 9, the original data and the data reconstructed with only n coefficients are depicted. Here n describes the length of the reduced output: the smaller the reduced vector, the lower its resolution.
[Figure 9 panels: Original Data; n coefficients = 512, 256, 128, 64, 32, 16, 8; x-axis: Time]
Figure 9: Original data and reconstructed Fourier transformation with fewer coefficients
\[ X_k = \sum_{n=0}^{N-1} x_n \cdot e^{-i 2\pi k n / N}, \qquad x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k \cdot e^{i 2\pi k n / N} \tag{1} \]
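The reduction step can be sketched as follows: compute only the first m Fourier coefficients and reconstruct an approximation from them. The sketch uses a naive O(N^2) transform from the standard library instead of an FFT, and it assumes a real-valued signal, so that each retained coefficient is mirrored onto its complex-conjugate bin to obtain a real reconstruction. The function names and the example signal are hypothetical.

```python
import cmath
import math


def first_coefficients(samples, m):
    """Return the first m Fourier coefficients X_0 .. X_(m-1) of the signal."""
    n = len(samples)
    if not 1 <= m <= n // 2 + 1:
        raise ValueError("m must be between 1 and n // 2 + 1")
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(m)]


def reconstruct(coefficients, n):
    """Rebuild n samples from the retained coefficients; the rest are zero."""
    spectrum = [0j] * n
    for k, coefficient in enumerate(coefficients):
        spectrum[k] = coefficient
        if 0 < k < n - k:
            # Assumption: real-valued input, so bin n-k is the conjugate of bin k.
            spectrum[n - k] = coefficient.conjugate()
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


# Example: an offset, a slow component (bin 1) and a faster one (bin 5).
n = 32
signal = [3.0 + 2.0 * math.cos(2 * math.pi * t / n)
          + 0.5 * math.cos(2 * math.pi * 5 * t / n) for t in range(n)]
reduced = first_coefficients(signal, 2)   # 2 coefficients instead of 32 samples
approximation = reconstruct(reduced, n)   # offset and slow component only
```

With two coefficients the faster component is lost, which corresponds to the reduced resolution of the smaller vectors in Figure 9; with six coefficients the example signal is recovered completely.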
Wavelet Transformation: In comparison to the Fourier transformation, which loses the time information of the data and transforms the data globally, the discrete wavelet transformation (DWT) preserves the time dimension and transforms the data locally, leading to a faster calculation. The Haar wavelet transformation, introduced in 1910 by Alfred Haar [55], is still frequently used in the domain of time-series analysis [121]. The transformation takes a 1-D input vector and trans