Construction of the (semi)variogram - Surface maps constructed with kriging

Mapping Spatial Data

[Figure 5.6 map legend: As in C-horizon, mg/kg; percentile scale 25, 50, 75, 90, 95, 100; legend values 0.27, 0.37, 0.59, 0.74, 1.07, 1.61, 1.95, 4.42; scale bar 0 to 100 km; north arrow]

Figure 5.6 Smoothed surface maps of the variable As in the Kola C-horizon; left: constructed using a “continuous grey scale” (100 percentiles), right: the As element distribution map constructed using seven selected percentile-related classes

mineral prospectors, or with data that have a very large proportion of “less than detection limit” values.

5.7 Surface maps constructed with kriging

In general, kriging is a more informative approach for generating surface maps from point source data than smoothing (Section 5.6) because it will deliver a statistically optimised estimator for each point on the grid selected for the interpolation (see, e.g., Cressie, 1993). As an additional advantage, kriging will provide an approximation of the prediction quality at each point within the grid. It is possible to estimate the values either for the blocks defined by the intersections of the grid lines (block kriging) or for any location in space (point kriging). The basic idea behind spatial analysis and kriging is that, because samples taken at neighbouring sites have a closer relationship to each other than to samples collected at more distant sites, the closer samples should have greater influence on the interpolated estimate. The question then becomes how to set the weights as a function of distance.
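To make the idea of distance-dependent weights concrete, the following is a minimal sketch of point kriging at a single location, using the standard ordinary kriging equations. It is an illustration under stated assumptions, not the procedure used for the Kola maps: the function names, the three sample sites, and the linear semivariogram gamma(h) = h are all hypothetical placeholders.

```python
import numpy as np

def ordinary_kriging(coords, values, target, gamma):
    """Ordinary kriging estimate, kriging variance and weights at one target."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Semivariances between all pairs of sample sites.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Kriging system, bordered by the constraint that the weights sum to 1.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    # Semivariances between each sample site and the target location.
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    b = np.append(gamma(d0), 1.0)
    solution = np.linalg.solve(A, b)
    weights, lagrange = solution[:n], solution[n]
    estimate = weights @ values
    variance = weights @ gamma(d0) + lagrange
    return estimate, variance, weights

def linear_gamma(h):
    # Hypothetical semivariogram used only for this illustration.
    return np.asarray(h, dtype=float)

# Hypothetical sites (km) and concentrations (mg/kg).
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 50.0)]
values = [1.0, 2.0, 5.0]
est, var, w = ordinary_kriging(coords, values, (2.0, 0.0), linear_gamma)
print(w, est, var)
```

In this toy setting the site closest to the target receives the largest weight, and the returned variance is the approximation of prediction quality mentioned above.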

5.7.1 Construction of the (semi)variogram

The basis of kriging is the (semi)variogram (see, e.g., Cressie, 1993), which visualises the similarity between data points, expressed as a variance (y-axis), at defined distances (x-axis). The distance plotted along the x-axis in the (semi)variogram is the Euclidean distance (distance of the direct connection) between two points in the survey area. Usually about 30 to 50 per cent of the maximum distance in a survey area is used as the maximum for the x-axis of the (semi)variogram.

One reason for this choice is that the variability of the difference in the element concentrations of very distant samples can be expected to be high (approaching the total variability of the element in the survey area). For the Kola Project area the maximum distance is about 700 km. For plotting the (semi)variograms, a maximum distance of 300 km (43 per cent) was chosen when preparing the Kola Atlas (Reimann et al., 1998a) and is thus used here, although a shorter distance, e.g., 200 km, would in many cases be more appropriate.

Figure 5.7 Visualisation of the basic information for the computation of the semivariogram for the variable As in the Kola Project C-horizon

The basis for the calculation of the (semi)variogram is the semivariogram cloud shown in Figure 5.7 (upper left). The semivariogram cloud is built from the squared differences of the element concentrations between all possible pairs of data points. The pairs are plotted according to their separation distance along the x-axis. The y-axis records the semivariance, which is half of the squared difference between the values of a pair. The semivariance is used because it allows an easier visualisation of the total variance in the semivariogram. The values along the x-axis can then be summarised in distance classes. Such a summary is shown using boxplots in Figure 5.7 (upper right). The MEDIANS displayed in the boxes increase with distance. The plot shows many outliers, i.e. pairs of data points that have an unusually large variability. Instead of summarising in boxplots, it is also possible to smooth the data along the x-axis. Figure 5.7 (lower left) shows the smoothed line for the semivariogram cloud in Figure 5.7 (upper left). Again the increase of the semivariance with increasing distance is visible. At a certain distance the smoothed line flattens out, i.e. the variance becomes constant. From this distance on, the concentrations will no longer show a spatial dependency. For small distances the semivariance is small and thus the spatial dependency between the points is high.
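The semivariogram cloud and its summary by distance classes can be sketched as follows. This is a minimal illustration, assuming planar coordinates in km; the function names and the three-point toy data set are hypothetical and are not taken from the Kola data.

```python
import numpy as np

def semivariogram_cloud(coords, values, max_dist):
    """Distance and semivariance for every pair of points up to max_dist."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Index every unordered pair (i < j) exactly once.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    # Semivariance of a pair: half of the squared difference.
    semivar = 0.5 * (values[i] - values[j]) ** 2
    keep = dist <= max_dist
    return dist[keep], semivar[keep]

def class_medians(dist, semivar, width):
    """Median semivariance per distance class of the given width."""
    classes = (dist // width).astype(int)
    return {int(k): float(np.median(semivar[classes == k]))
            for k in np.unique(classes)}

# Hypothetical toy data: three sites on a line, 5 km apart.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
values = [1.0, 3.0, 7.0]
dist, semivar = semivariogram_cloud(coords, values, max_dist=10.0)
medians = class_medians(dist, semivar, width=5.0)
print(medians)
```

The medians per class correspond to the centre lines of the boxplots described above; with real survey data the cut-off `max_dist` would be set to a fraction of the maximum distance in the survey area.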

Progressing from the very noisy semivariogram cloud to the boxplot presentations, the trend becomes smoother and smoother. However, for finally constructing a semivariogram, an even smoother appearance is desirable. For this it is necessary to leave the semivariogram cloud and to no longer look at the pairs of all data points but to summarise more points using distance intervals. The squared differences of the element concentrations for all pairs of points whose separation falls within a distance interval (here 20 km wide) are averaged. This procedure results in the few points shown in Figure 5.7 (lower right), which are then the basis for fitting the semivariogram model.
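The averaging over distance intervals can be sketched as below. This is a minimal illustration, not the code used for Figure 5.7: the function name and toy data are hypothetical, and the averaged squared differences are halved so that the result is on the semivariance scale defined for the semivariogram cloud. The defaults of 20 km and 300 km follow the values quoted in the text.

```python
import numpy as np

def empirical_semivariogram(coords, values, width=20.0, max_dist=300.0):
    """Average semivariance per distance interval (omnidirectional)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    # Interval edges 0, width, 2*width, ..., max_dist.
    edges = np.arange(0.0, max_dist + width, width)
    # Half-open intervals: edges[k] <= distance < edges[k + 1].
    idx = np.digitize(dist, edges) - 1
    centres, gammas, counts = [], [], []
    for k in range(len(edges) - 1):
        in_bin = idx == k
        if in_bin.any():
            centres.append(0.5 * (edges[k] + edges[k + 1]))
            gammas.append(0.5 * sqdiff[in_bin].mean())
            counts.append(int(in_bin.sum()))
    return np.array(centres), np.array(gammas), np.array(counts)

# Hypothetical toy data with 5 km intervals up to 15 km.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
values = [1.0, 3.0, 7.0]
centres, gammas, counts = empirical_semivariogram(coords, values,
                                                  width=5.0, max_dist=15.0)
print(centres, gammas, counts)
```

Returning the number of pairs per interval alongside the averages is a design choice of this sketch; intervals supported by very few pairs tend to be noisy and may deserve less weight when a model is fitted.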

In the above example, distance intervals in any direction were considered (omnidirectional). In practice, it may also be interesting to study the semivariance in certain directions because the spatial dependency could change with direction. Thus for a two-dimensional grid, the distances are calculated in a certain direction from each data point, e.g., north–south or east–west. If the grid is irregular, a tolerance angle for the defined direction is needed to find enough points in the selected direction. If a small angle is chosen, only a few points may fall into the segment at small distances. This will cause a noisy semivariogram. If the angle is chosen to be very large, directional differences will be averaged and thus no longer be visible in the semivariogram.

An often-used default angle to define a segment is π/8 radians (22.5 degrees). For calculating the semivariogram function for a segment, the average squared differences of the element concentrations for all sample pairs within that angular relationship to one another are used.
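Selecting the pairs that fall into a directional segment can be sketched as follows. The function name, the toy data, and the angle convention (radians measured counter-clockwise from east, so east–west is 0 and north–south is π/2) are assumptions made for this illustration only; the default tolerance is the π/8 mentioned above.

```python
import numpy as np

def directional_pairs(coords, values, direction, tol=np.pi / 8):
    """Distances and semivariances of pairs aligned with `direction`."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)
    delta = coords[i] - coords[j]
    dist = np.hypot(delta[:, 0], delta[:, 1])
    angle = np.arctan2(delta[:, 1], delta[:, 0])
    # Directions are axial (N-S is the same as S-N), so angles are
    # compared modulo pi; diff lies between 0 and pi/2.
    diff = np.abs((angle - direction + np.pi / 2) % np.pi - np.pi / 2)
    keep = diff <= tol
    semivar = 0.5 * (values[i] - values[j]) ** 2
    return dist[keep], semivar[keep]

# Hypothetical toy data: one pair lies east-west, one north-south,
# and one on the diagonal (outside both segments).
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
values = [1.0, 2.0, 4.0]
ew_dist, ew_semivar = directional_pairs(coords, values, direction=0.0)
ns_dist, ns_semivar = directional_pairs(coords, values, direction=np.pi / 2)
print(ew_semivar, ns_semivar)
```

The selected pairs can then be averaged within distance intervals exactly as in the omnidirectional case. Widening `tol` towards π/2 admits every pair, which reproduces the omnidirectional result and illustrates why a very large angle hides directional differences.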

Figure 5.8 (left) shows the semivariances for the variable As in the Kola C-horizon data for four different directions in addition to the omnidirectional semivariogram. Although all curves start at about the same point, they flatten out at different distances and show a different total variance. For kriging, a single model has to be fitted to the semivariances. In the example plot (Figure 5.8, left) the omnidirectional semivariogram may be the best compromise for fitting the data (Figure 5.8, right).

[Figure 5.8 panels, both titled "As in C-horizon": x-axis Distance [km], 0 to 300; y-axis Semivariogram, 0.0 to 1.5. Left panel legend: omnidirectional, N-S, E-W, NW-SE, NE-SW. Right panel annotations: Nugget variance = 0.59, Sill = 0.7, Range = 187]

Figure 5.8 Directional semivariograms (left) for the variable As in the Kola Project C-horizon samples and spherical semivariogram model for the omnidirectional semivariogram (right)

For fitting a model to the semivariogram there are different options. The curve can be approximated by a spherical model (the most common case), as shown in Figure 5.8 (right). Linear, exponential, Gaussian, and several other models can also be used to fit the semivariogram function. The final fitting should be based on a visual inspection of the semivariogram.
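For orientation, the standard spherical model can be written as a short function. This is a sketch rather than the fitting routine used for Figure 5.8: the function and parameter names are hypothetical, and it is an assumption here that the value labelled "Sill" in the figure (0.7) is the part of the variance added on top of the nugget (0.59), so that the curve levels off at 1.29 beyond the range of 187 km.

```python
import numpy as np

def spherical_model(h, nugget, partial_sill, range_):
    """Spherical semivariogram: rises from the nugget, flat beyond range_."""
    h = np.asarray(h, dtype=float)
    # Scaled distance, capped at 1 so the curve stays flat beyond the range.
    r = np.minimum(h / range_, 1.0)
    gamma = nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)
    # By definition the semivariance at distance zero is zero.
    return np.where(h > 0, gamma, 0.0)

# Parameters as annotated in Figure 5.8 (right); see the assumption above.
h = np.array([0.0, 50.0, 100.0, 187.0, 300.0])
fitted = spherical_model(h, nugget=0.59, partial_sill=0.7, range_=187.0)
print(fitted)
```

Plotting such a curve over the averaged semivariances is one way to carry out the visual inspection recommended above; the exponential or Gaussian alternatives would replace only the expression inside the function.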

In document Statistical Data Analysis Explained (Page 100-103)