
To compare the four labelling techniques, we need several datasets that differ in the number of features, the number of sensor nodes, and the number of days. We also need a dataset in which the correlation between sensor nodes varies. The Grand St. Bernard dataset [108] matches this description.

The Grand St. Bernard dataset was collected by a multi-hop wireless sensor network deployed at the Grand St. Bernard pass, located between Switzerland and Italy. The setup consists of 23 sensor nodes. Among other meteorological characteristics of the environment, these nodes measure temperature and humidity over a period of two months [108], taking one sample every two minutes. The nodes are deployed in two clusters. The small cluster consists of 5 nodes, while the big cluster consists of 18 nodes. Each cluster has a base station. Figure 3.2 depicts the topology of the network.

Even though the nodes are relatively close to each other (for instance, the distance between node 9 and node 18 is approximately 0.764 km), the correlation between the sensor observations is not necessarily strong. Figure 3.3 illustrates the mean correlation for humidity and temperature for both clusters. Humidity

Figure 3.2: Wireless sensor network located in the Grand St. Bernard Pass, Switzerland [108]

Days    09/26  09/27  09/28  09/29  09/30  10/01  10/02  10/03  10/04  10/05  10/06  10/07  10/08  10/09  10/10   mean
Min     -0.95  -0.98  -0.92  -0.20  -0.89  -0.72  -0.39  -0.52  -0.47  -0.61  -0.91  -0.68   0.00  -0.32  -0.33  -0.59
Max      0.96   0.98   0.97   0.99   0.96   0.94   0.94   0.99   0.97   0.98   0.98   0.97   0.97   0.96   0.96   0.97
Std      0.48   0.48   0.43   0.43   0.46   0.41   0.30   0.40   0.37   0.47   0.44   0.43   0.34   0.33   0.38   0.41
Mean     0.08   0.07   0.10   0.53   0.18   0.23   0.38   0.60   0.47   0.64   0.16   0.31   0.55   0.52   0.35   0.34
|Mean|   0.37   0.36   0.33   0.54   0.37   0.34   0.40   0.66   0.53   0.77   0.37   0.43   0.55   0.54   0.40   0.46

Table 3.1: Humidity correlation for each day

correlation for each day is depicted in Figure 3.4 and Table 3.1. Temperature correlation for each day is depicted in Figure 3.5 and Table 3.2. In these tables, the mean correlation lies around 0.34 for humidity and 0.42 for temperature. The standard deviation is high (about 0.4 in both tables), which means that the correlation varies strongly between individual nodes. The humidity correlation between nodes of the small cluster is clearly stronger than their temperature correlation. For the big cluster it is the other way around: the temperature correlation between its nodes is stronger than their humidity correlation.
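The per-day statistics reported in Tables 3.1 and 3.2 can be illustrated with a minimal Python sketch. It assumes that each table column summarises the Pearson correlations of all node pairs on one day; the text does not spell out the exact procedure, so the function names, the use of the population standard deviation, and the choice to treat an undefined correlation (constant readings) as 0 are assumptions, and the readings are synthetic.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: an undefined correlation is reported as 0
    return cov / sqrt(var_x * var_y)


def daily_summary(readings):
    """Summarise all pairwise node correlations for one day.

    `readings` maps a node id to that node's measurements for the day
    (hypothetical layout, one list per node).
    """
    corrs = [pearson(readings[a], readings[b])
             for a, b in combinations(sorted(readings), 2)]
    return {
        "min": min(corrs),
        "max": max(corrs),
        "std": pstdev(corrs),  # assumption: population standard deviation
        "mean": mean(corrs),
        "abs_mean": mean([abs(c) for c in corrs]),
    }


# Synthetic example: nodes 1 and 2 move together, node 3 moves against them.
day = {
    1: [10.0, 11.0, 12.0, 13.0],
    2: [20.0, 22.0, 24.0, 26.0],
    3: [5.0, 4.0, 3.0, 2.0],
}
summary = daily_summary(day)
print(summary)
```

In this toy example one pair is perfectly correlated and two pairs are perfectly anti-correlated, so the mean is negative while the mean of the absolute values is 1. This is the same effect that separates the Mean and |Mean| rows of the tables.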

Another observation is that the correlation between humidity and temperature of nodes of the two clusters is relatively weak. The correlation between humidity and temperature on each node is also low (this is displayed using the color of the nodes). This correlation between humidity and temperature is also depicted for each day separately in Figure 3.6 and Table 3.3. The standard deviation, minimum and maximum values show that the correlation is very unstable.

Figure 3.3: Mean correlation of humidity and temperature during the 15-day period (2007/09/26 - 2007/10/10)

Figure 3.4: Humidity correlation for each day

Days    09/26  09/27  09/28  09/29  09/30  10/01  10/02  10/03  10/04  10/05  10/06  10/07  10/08  10/09  10/10   mean
Min     -0.58  -0.79  -0.85   0.00   0.00   0.00  -0.72  -0.27  -0.14  -0.23  -0.61  -0.74  -0.50  -0.57  -0.56  -0.44
Max      1.00   0.99   0.99   0.99   1.00   0.98   0.99   0.99   1.00   0.99   1.00   0.98   0.99   0.98   0.98   0.99
Std      0.44   0.52   0.50   0.39   0.33   0.44   0.33   0.37   0.35   0.37   0.42   0.49   0.45   0.46   0.37   0.42
Mean     0.40   0.56   0.60   0.65   0.30   0.55   0.20   0.55   0.39   0.48   0.17   0.29   0.38   0.37   0.33   0.42
|Mean|   0.45   0.66   0.72   0.65   0.30   0.55   0.26   0.58   0.40   0.51   0.32   0.46   0.44   0.45   0.35   0.47

Table 3.2: Temperature correlation for each day

Figure 3.5: Temperature correlation for each day

Days    09/26  09/27  09/28  09/29  09/30  10/01  10/02  10/03  10/04  10/05  10/06  10/07  10/08  10/09  10/10   mean
Min     -0.86  -0.99  -0.89  -0.93  -0.91  -0.69   0.03  -0.84  -0.60  -0.75  -0.24  -0.82  -0.94  -0.82  -0.84  -0.74
Max      0.99   0.99   0.99   0.68   0.46   0.85   0.99   0.72   0.72   0.69   0.97   0.96   0.86   0.94   0.87   0.85
Std      0.54   0.63   0.55   0.53   0.30   0.39   0.31   0.67   0.37   0.46   0.35   0.54   0.66   0.49   0.52   0.49
Mean     0.20   0.18   0.02  -0.56  -0.09  -0.36   0.51  -0.27  -0.20  -0.34   0.33  -0.15  -0.38  -0.20  -0.08  -0.09
|Mean|   0.47   0.54   0.45   0.69   0.21   0.45   0.51   0.69   0.34   0.52   0.41   0.44   0.66   0.44   0.43   0.48

Table 3.3: Correlation between humidity and temperature for each day

The overall correlation between humidity and temperature is close to zero (-0.09). The row displaying the |mean| (the mean of the absolute value of the correlation) shows that the magnitude of the correlation is around 0.5. Overall, the correlation depends on the distance between the nodes. However, the correlation between humidity and temperature for different nodes varies from day to day. Table 3.4 lists the correlation between humidity, temperature and distance.
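The gap between the mean and the |mean| can be checked directly from the per-day values in Table 3.3. The short Python sketch below copies the Mean and |Mean| rows from that table and averages them; the variable names are illustrative, and the assumption is that the final "mean" column is the plain average of the 15 per-day values.

```python
from statistics import mean

# Per-day values copied from the "Mean" and "|Mean|" rows of Table 3.3.
daily_mean = [0.20, 0.18, 0.02, -0.56, -0.09, -0.36, 0.51, -0.27,
              -0.20, -0.34, 0.33, -0.15, -0.38, -0.20, -0.08]
daily_abs_mean = [0.47, 0.54, 0.45, 0.69, 0.21, 0.45, 0.51, 0.69,
                  0.34, 0.52, 0.41, 0.44, 0.66, 0.44, 0.43]

overall_mean = round(mean(daily_mean), 2)
overall_abs_mean = round(mean(daily_abs_mean), 2)
positive_days = sum(1 for m in daily_mean if m > 0)

print(overall_mean, overall_abs_mean, positive_days)  # → -0.09 0.48 5
```

Five days have a positive mean correlation and ten a negative one, so the signed values largely cancel, whereas the absolute values do not. A near-zero overall mean therefore does not imply that humidity and temperature are uncorrelated on any given day.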

Figure 3.7 (left) illustrates that the humidity correlation is overall very weak for one day. In the big cluster there are strong correlations spanning relatively large distances, while most nodes located very close to each other have a weak correlation. The temperature correlation plot in Figure 3.7 (right) illustrates that there is a very strong correlation between nodes of the two clusters for the same day. Figure 3.8 illustrates the correlation plot for humidity and temperature for another day. It can be clearly seen that distance and humidity are correlated, while temperature and distance are not. This shows that in this

Figure 3.6: Correlation between humidity and temperature for each day

Correlation   Humidity   Temperature   Distance
Humidity          1         -0.09        -0.37
Temperature     -0.09         1          -0.30
Distance        -0.37       -0.30          1

Table 3.4: Correlation between humidity, temperature and distance

Figure 3.7: Correlation of humidity and temperature on 2007/09/28

Figure 3.8: Correlation of humidity and temperature on 2007/10/06

Features   Days per set   Nodes   Nodes per set   Correlation   Datasets
h,t,h+t    1              18      1               -             329
h,t,h+t    1-2            18      1-5             0.7           34
h,t,h+t    1              5       1               0.7           2
h,t,h+t    1              18      18              -             15
h,t,h+t    1              5       5               -             15

Table 3.5: Different subdatasets (h: humidity, t: temperature, h+t: humidity and temperature combined)

dataset no systematic trend exists and, moreover, no unified assumption regarding the correlation between humidity and temperature can be made.
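The relation between distance and correlation summarised in Table 3.4 can be sketched as follows. This is only an illustration of one plausible way to obtain such a figure: it correlates the pairwise distance between nodes with the pairwise correlation of their readings. The node positions, the correlation values and the function names are hypothetical and are not taken from the dataset.

```python
from itertools import combinations
from math import dist, sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def distance_vs_correlation(positions, pair_corr):
    """Correlate pairwise node distance with pairwise node correlation."""
    pairs = list(combinations(sorted(positions), 2))
    distances = [dist(positions[a], positions[b]) for a, b in pairs]
    correlations = [pair_corr[pair] for pair in pairs]
    return pearson(distances, correlations)


# Hypothetical node positions (km) and pairwise humidity correlations.
positions = {1: (0.0, 0.0), 2: (0.1, 0.0), 3: (0.8, 0.0)}
pair_corr = {(1, 2): 0.9, (1, 3): 0.2, (2, 3): 0.4}

r = distance_vs_correlation(positions, pair_corr)
print(round(r, 2))
```

A negative result, as in the Distance column of Table 3.4, indicates that node pairs that are further apart tend to be less correlated.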

From the Grand St. Bernard dataset, we select various subdatasets with different characteristics in terms of number of days, number of nodes, number of features, and correlation between humidity and temperature. Table 3.5 provides an overview of these subdatasets. As can be seen from the table, for two of the subdatasets we consider the correlation between humidity and temperature as well as the correlation between the nodes. In these two subdatasets, the minimum correlation value is 0.7. Through extensive experiments, we found that no subdataset satisfies a higher minimum correlation value. For the other three subdatasets reported in the table, we have made a combination of days and nodes. We separate the two clusters (containing 5 and 18 nodes) because the distance between the clusters is large compared to the distances between

correlation between the nodes depends on the distance.
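The selection of subdatasets with a minimum correlation of 0.7 can be illustrated with a small Python sketch. The text does not describe the selection procedure, so the approach shown here (an exhaustive search for node sets in which every pair reaches the threshold) is an assumption, as are the function name and the example correlation values. Sets of a single node are skipped because they satisfy the criterion trivially.

```python
from itertools import combinations


def correlated_node_sets(pair_corr, nodes, max_size=5, threshold=0.7):
    """Return node sets (size 2..max_size) whose pairwise correlations
    all reach `threshold`."""
    selected = []
    for size in range(2, max_size + 1):
        for group in combinations(sorted(nodes), size):
            if all(pair_corr[pair] >= threshold
                   for pair in combinations(group, 2)):
                selected.append(group)
    return selected


# Hypothetical pairwise correlations for four nodes on one day.
pair_corr = {
    (1, 2): 0.85, (1, 3): 0.75, (1, 4): 0.10,
    (2, 3): 0.72, (2, 4): 0.30, (3, 4): 0.65,
}

groups = correlated_node_sets(pair_corr, nodes=[1, 2, 3, 4])
print(groups)  # → [(1, 2), (1, 3), (2, 3), (1, 2, 3)]
```

Raising `threshold` shrinks the result quickly; with 0.8 only the pair (1, 2) remains in this example, which mirrors the observation that no subdataset could be found for a higher minimum correlation value.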