Spatio-temporal Interpolation (STI) Algorithm

model proposed by De Iaco [53]:

$$\gamma_{st}(h_s, h_t) = \gamma_{st}(h_s, 0) + \gamma_{st}(0, h_t) - k\,\gamma_{st}(h_s, 0)\,\gamma_{st}(0, h_t) \tag{3.19}$$

with

$$k = \frac{\operatorname{sill}\gamma_{st}(h_s, 0) + \operatorname{sill}\gamma_{st}(0, h_t) - \operatorname{sill}\gamma_{st}(h_s, h_t)}{\operatorname{sill}\gamma_{st}(h_s, 0)\,\operatorname{sill}\gamma_{st}(0, h_t)} \tag{3.20}$$

where the sills in the different dimensions (space, time, and space-time) are estimated from the variance of the corresponding data set. Finally, the following condition must hold for $\gamma_{st}$ in Equation 3.19 to be admissible:

$$0 < k \le \frac{1}{\max\{\operatorname{sill}\gamma_{st}(h_s, 0),\ \operatorname{sill}\gamma_{st}(0, h_t)\}} \tag{3.21}$$
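The product-sum model above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: the exponential marginal variogram, its `rng` (range) parameter and all function names are assumptions introduced here. The cross term is subtracted, which is the sign under which the model reaches the space-time sill implied by Equation 3.20 at large lags.

```python
import math


def exponential_variogram(h, sill, rng):
    """Hypothetical marginal model: exponential variogram without nugget."""
    return sill * (1.0 - math.exp(-3.0 * abs(h) / rng))


def product_sum_k(sill_s, sill_t, sill_st):
    """Coefficient k of Equation 3.20, checked against Equation 3.21."""
    k = (sill_s + sill_t - sill_st) / (sill_s * sill_t)
    if not 0.0 < k <= 1.0 / max(sill_s, sill_t):
        raise ValueError(f"k={k:.4g} violates the admissibility condition")
    return k


def product_sum_variogram(h_s, h_t, gamma_s, gamma_t, k):
    """Space-time variogram built from the two marginal variograms."""
    g_s = gamma_s(h_s)  # gamma_st(h_s, 0)
    g_t = gamma_t(h_t)  # gamma_st(0, h_t)
    return g_s + g_t - k * g_s * g_t


# Illustrative sills only; in the text they are estimated from data variances.
k = product_sum_k(sill_s=2.0, sill_t=1.5, sill_st=3.0)
gamma_s = lambda h: exponential_variogram(h, sill=2.0, rng=10.0)
gamma_t = lambda h: exponential_variogram(h, sill=1.5, rng=5.0)
value = product_sum_variogram(4.0, 1.0, gamma_s, gamma_t, k)
```

Passing the marginal variograms as callables keeps the sketch independent of whichever marginal model is actually fitted.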

3.5.2 Raw Data Pre-processing

Figure 3.11 illustrates the pre-processing for a 2D spatial-only case. The entire RoI is divided into partitions such that each partition contains one sample data point on average. Each partition then holds the mean (µ) and standard deviation (σ) calculated from the sample data within it, and is located spatially at the midpoint of its rectangle (light green on the right of Figure 3.11). In other words, the interpolation algorithm uses the µ value for the estimation, while σ is used to compute the estimation error (Section 3.5.4).

Fig. 3.11 Data pre-processing (spatial-only case). Left: 2D spatial map; Right: ‘partitioned’ sub-area from the map.

This procedure extends to the spatio-temporal case (2D spatial plus 1D temporal), as shown in Figure 3.12, where the swarm sensing field simulation generates the high-resolution sample observations (left of the figure). As with the example on the right of Figure 3.11, the right of Figure 3.12 shows a partition value, located at the temporal midpoint $t_x = (t_1 + t_2)/2$, that holds the µ and σ of the observations (blue dots).

Fig. 3.12 Data pre-processing for the 2D-spatial + 1D-temporal case. Left: 3D spatio-temporal data ‘cube’; Right: ‘partitioned’ sub-area of the 3D data cube.

After execution, the raw data from the field simulation will be in the form of a regularly-spaced spatio-temporal grid (called the ‘processed’ data/observations hereafter) that holds the µ and σ values for the purpose of the STI estimation (Section 3.5.3) and its corresponding error measurements (Section 3.5.4) respectively.
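The partitioning step can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions: the function name `partition_stats`, the `bins` argument (number of partitions along x, y and t) and the use of the population standard deviation are choices made here, and choosing `bins` so that each partition holds one sample on average is left to the caller.

```python
import numpy as np


def partition_stats(x, y, t, values, bins):
    """Bin scattered (x, y, t) samples onto a regular grid.

    Returns the per-partition mean, the per-partition (population) standard
    deviation, and the midpoints of the partitions along each axis.
    Partitions without samples are NaN.
    """
    coords = np.column_stack([x, y, t]).astype(float)
    values = np.asarray(values, dtype=float)
    bins = tuple(int(b) for b in bins)

    # Regular partition edges spanning the data along each axis.
    edges = [np.linspace(c.min(), c.max(), n + 1) for c, n in zip(coords.T, bins)]
    # Partition index of every sample along each axis (last edge is inclusive).
    idx = tuple(
        np.clip(np.searchsorted(e, c, side="right") - 1, 0, n - 1)
        for e, c, n in zip(edges, coords.T, bins)
    )
    flat = np.ravel_multi_index(idx, bins)

    n_cells = int(np.prod(bins))
    count = np.bincount(flat, minlength=n_cells)
    total = np.bincount(flat, weights=values, minlength=n_cells)
    total_sq = np.bincount(flat, weights=values**2, minlength=n_cells)

    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count
        var = total_sq / count - mean**2
    std = np.sqrt(np.clip(var, 0.0, None))

    midpoints = [(e[:-1] + e[1:]) / 2.0 for e in edges]
    return mean.reshape(bins), std.reshape(bins), midpoints


# Four samples in one spatial partition, split into two time partitions.
mean, std, mids = partition_stats(
    x=[0, 0, 1, 1], y=[0, 0, 1, 1], t=[0, 0, 1, 1],
    values=[1.0, 3.0, 10.0, 20.0], bins=(1, 1, 2),
)
```

The returned `mean` array plays the role of the µ values used for estimation, and `std` the σ values used for the error measurements.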

3.5.3 The Hybrid Approach STI Algorithm

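Let $V = \{v_{st,1}, v_{st,2}, v_{st,3}, \cdots, v_{st,i}, \cdots, v_{st,N}\} = \{\mu_{st,1}, \mu_{st,2}, \mu_{st,3}, \cdots, \mu_{st,i}, \cdots, \mu_{st,N}\}$ be a list of space-time ‘processed’ observations with a total of $N$ elements:

$$\mathit{STI}_{red}(x, y, t) = \mathit{STI}_{ext}(x, y, t_{lo}) \times \frac{t_{hi} - t}{t_{hi} - t_{lo}} + \mathit{STI}_{ext}(x, y, t_{hi}) \times \frac{t - t_{lo}}{t_{hi} - t_{lo}} \tag{3.22}$$

$$\mathit{STI}_{ext}(x, y, t) = \sum_{i=1}^{N} w_{st,i} \cdot v_{st,i} = \sum_{i=1}^{N} \gamma_{st,i}^{-2} \cdot v_{st,i} \tag{3.23}$$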

where $\mathit{STI}_{red}$ and $\mathit{STI}_{ext}$ are the reduction and extension approaches of the STI algorithm respectively; $(x, y)$ is the spatial location and $t$ the time to be interpolated; $t_{lo}$ and $t_{hi}$ are the lower and higher time indices bracketing $t$; and finally, $\gamma_{st}$ is the spatio-temporal variogram that models the space-time interaction for the weighting mechanism (Section 3.5.1). An example of the above algorithm is illustrated in Figure 3.13.
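The two steps of the hybrid approach can be sketched as below. This is a sketch under assumptions rather than the implementation used in this work: the function names, the `eps` guard against a zero variogram value, and the normalisation of the weights to sum to one are introduced here (Equation 3.23 does not state a normalisation). The variogram values between the target and each processed observation are assumed to be computed beforehand.

```python
import numpy as np


def sti_ext(gamma, values, eps=1e-12):
    """Extension step: variogram-weighted sum of processed observations."""
    gamma = np.asarray(gamma, dtype=float)
    values = np.asarray(values, dtype=float)
    weights = np.maximum(gamma, eps) ** -2.0  # w_i = gamma_i^-2; eps guards gamma = 0
    weights /= weights.sum()                  # assumption: weights normalised to 1
    return float(weights @ values)


def sti_red(est_lo, est_hi, t, t_lo, t_hi):
    """Reduction step: linear interpolation in time between t_lo and t_hi."""
    span = t_hi - t_lo
    return est_lo * (t_hi - t) / span + est_hi * (t - t_lo) / span


# Hypothetical variogram values and processed means at the two time indices.
est_lo = sti_ext(gamma=[1.0, 2.0], values=[10.0, 20.0])  # at t_lo = 2
est_hi = sti_ext(gamma=[1.0, 2.0], values=[14.0, 24.0])  # at t_hi = 3
estimate = sti_red(est_lo, est_hi, t=2.25, t_lo=2.0, t_hi=3.0)
```

Because the weights fall off with the square of the variogram value, observations that the variogram treats as dissimilar to the target contribute little to the estimate.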


Fig. 3.13 Visual illustration of the proposed STI algorithm. In this example, the value at time $t$ is to be interpolated. The algorithm first estimates the values at both $t_{lo} = t_2$ (red) and $t_{hi} = t_3$ (blue) using the extension approach (Equation 3.23), and then interpolates the value at time $t$ (purple) using the reduction approach (Equation 3.22).

In this work, only the time indices one step either side of $t$ (i.e. ±1 hour in this case) are used, on the grounds that an observation at a particular time is assumed not to be correlated with the observation at exactly the same time on the previous or following day. Furthermore, this configuration has been shown to be superior on the empirical data set (the South Esk Modelled data set in Section 3.1.1), which is the same data set used in this work [86]. Finally, the spatio-temporal variogram (Section 3.5.1) is modelled only on a daily basis for computational efficiency.

3.5.4 STI Algorithm Error Measurements

Finally, the proposed STI algorithm is able to calculate the measurement error using the following equation (similar to Equation 3.23):

$$\mathit{STI}_{err}(x, y, t) = \sum_{i=1}^{N} w_{st,i} \cdot \sigma_{st,i} = \sum_{i=1}^{N} \gamma_{st,i}^{-2} \cdot \sigma_{st,i} \tag{3.24}$$

This calculation is mainly used to address the Quality Control (QC) component of the data utilised within the STI process, so that the proposed algorithm is able to provide a certain confidence level during interpretation.
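As a rough illustration, the error measure can be evaluated for several target locations at once. The sketch below assumes a precomputed matrix of variogram values (one row per target, one column per processed observation) and, as in any practical weighting scheme, normalises each row of weights to sum to one; the function name, the `eps` guard and the normalisation are assumptions made here and are not stated in Equation 3.24.

```python
import numpy as np


def sti_err(gamma_matrix, sigma, eps=1e-12):
    """Variogram-weighted sum of partition standard deviations per target."""
    gamma_matrix = np.atleast_2d(np.asarray(gamma_matrix, dtype=float))
    sigma = np.asarray(sigma, dtype=float)
    weights = np.maximum(gamma_matrix, eps) ** -2.0  # w_i = gamma_i^-2
    weights /= weights.sum(axis=1, keepdims=True)    # assumption: normalised rows
    return weights @ sigma


# Two targets, two processed observations with sigma 0.5 and 1.5.
errors = sti_err([[1.0, 2.0], [2.0, 1.0]], sigma=[0.5, 1.5])
```

A target dominated by a partition with a large σ receives a correspondingly large error value, which is what allows the result to be used as a confidence indicator.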

3.5.5 Performance Assessment

Many statistical techniques can be used to evaluate the performance of an interpolation method. Li and Heap [54] list the most commonly used ones. One of these techniques, leave-one-out cross-validation, will be implemented to evaluate the capability of the interpolation method that has been developed, e.g. how well it can predict an omitted value compared with the original value at that location [58, 72, 76].

The interpolation technique will also be assessed using the Root Mean Square Error (RMSE), which measures the error between the predicted values and the values of the model being estimated [49, 72, 75, 76, 87]. RMSE is given by the following equation:

$$\mathit{RMSE} = \left[ \frac{1}{N} \sum_{i=1}^{N} (o_i - p_i)^2 \right]^{1/2} \tag{3.25}$$

where $N$ is the total number of samples, and $o_i$ and $p_i$ are the observed and the estimated/interpolated values at the $i$th node. However, RMSE has the drawback of being sensitive to outliers [54]. The results can therefore also be examined using the Mean Absolute Error (MAE), a measure that is less sensitive to extreme values:

$$\mathit{MAE} = \frac{1}{N} \sum_{i=1}^{N} |o_i - p_i| \tag{3.26}$$

Li and Heap [54] suggest that, in assessing an interpolation technique’s performance, a combination of exact (cross-validation) and inexact (RMSE and MAE) methods is desirable to have confidence in the overall capability of the method.
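The two error measures and the leave-one-out procedure can be sketched as follows. The `predict` callback is a hypothetical stand-in for the interpolator (it receives the indices of the retained nodes and the index of the omitted node); the mean-of-the-others predictor in the usage lines is only a placeholder.

```python
import numpy as np


def rmse(observed, predicted):
    """Root Mean Square Error between observed and predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((o - p) ** 2)))


def mae(observed, predicted):
    """Mean Absolute Error between observed and predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(o - p)))


def leave_one_out(values, predict):
    """Predict every node from all the others; returns the predictions."""
    n = len(values)
    predictions = np.empty(n)
    for i in range(n):
        train_idx = np.delete(np.arange(n), i)  # every node except node i
        predictions[i] = predict(train_idx, i)
    return predictions


# Placeholder predictor: the mean of the retained observations.
observed = np.array([1.0, 2.0, 3.0])
loo = leave_one_out(observed, lambda train_idx, i: observed[train_idx].mean())
loo_rmse, loo_mae = rmse(observed, loo), mae(observed, loo)

# A single large error inflates RMSE more than MAE.
outlier_rmse = rmse([1, 2, 3, 4], [1, 2, 3, 8])
outlier_mae = mae([1, 2, 3, 4], [1, 2, 3, 8])
```

In the last two lines the only non-zero error is 4 out of four samples, giving an RMSE of 2 but an MAE of 1, which illustrates the outlier sensitivity noted above.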

Pearson’s product-moment correlation coefficient $r$ will also be used to measure how ‘correlated’ two sets of variables are. It is calculated as follows:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{3.27}$$

where $x = \{x_1, x_2, ..., x_n\}$ and $y = \{y_1, y_2, ..., y_n\}$ are the two sets of values being examined, and $\bar{x}$ and $\bar{y}$ are the mean values of $x$ and $y$ respectively. The resulting value always lies between -1 (perfect negative correlation) and +1 (perfect positive correlation), with 0 indicating no linear correlation. In this study, $x$ and $y$ can be seen as the observed (sample points) and estimated values respectively.
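A minimal sketch of the correlation calculation is given below, using the standard definition in which the denominator is the square root of the product of the two sums of squared deviations. The function name and the sample values are illustrative only.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))


# Illustrative observed and estimated values.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [1.1, 1.9, 3.2, 3.8, 5.3]
r = pearson_r(observed, estimated)
```

The result can be cross-checked against `np.corrcoef(observed, estimated)[0, 1]`, which computes the same coefficient.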
