

5.2.2 The data

We use the World Soil Information System (WoSIS) dataset provided by Batjes et al. (2016), supplemented with the Australian NSSC dataset described in Chapter 2. In this chapter we analyse percentage clay content.

In the North-South direction, the dataset is concentrated between -50° and 50° latitude (a separation distance of around 11,000 km). The maximum latitudinal (North-South) separation is around 16,000 km (Figure 5.1 and Figure 5.2 left panel). Very few data were collected in the Arctic or the Antarctic. In the East-West direction, there is a region of very sparse data collection from around 150° to -150° longitude (Figure 5.1 and Figure 5.3 left panel). This corresponds with the North and South Pacific Oceans, where there is limited land mass and where data collection is sparse over the land masses that do occur at these longitudes. Other notable (but shorter) gaps in the East-West data occur at around -30° to -20° and 50° to 70°. The density of sampling is by far the greatest at longitudes of -120° to -80°, which corresponds with North America (Figure 5.1 and Figure 5.4 left panel).

5.2.2.1. Data Cleaning

The raw WoSIS dataset contains around 80,000 points with observations for percentage clay fraction. Before harmonising the data by depth, we needed to clean them. In addition to removing negative values and observations where the sum of the sand and clay fractions was greater than 100%, we also removed observations where:

● the topsoil observation did not start at depth 0 cm;

● there was a gap between the layers; and

● there was an overlap between the layers.
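The cleaning rules above can be sketched as a single validity check on a profile. This is a minimal illustration only, not the code used in this work: the function name `profile_is_valid`, the layer fields (`top`, `bottom`, `clay`, `sand`), and the choice to reject a whole profile when any rule fails are all assumptions made for the example.

```python
def profile_is_valid(layers):
    """Return True if a soil profile passes the cleaning rules.

    `layers` is a list of dicts with hypothetical keys:
    'top' and 'bottom' (depths in cm), 'clay' (%), and optionally 'sand' (%).
    """
    if not layers:
        return False
    # Work from the surface downwards.
    ordered = sorted(layers, key=lambda layer: layer["top"])

    # Rule: the topsoil observation must start at depth 0 cm.
    if ordered[0]["top"] != 0:
        return False

    for layer in ordered:
        clay = layer["clay"]
        sand = layer.get("sand")
        # Rule: no negative values.
        if clay < 0 or (sand is not None and sand < 0):
            return False
        # Rule: sand + clay must not exceed 100 %.
        if sand is not None and clay + sand > 100:
            return False

    # Rules: no gap (next top > previous bottom) and
    # no overlap (next top < previous bottom) between layers.
    for upper, lower in zip(ordered, ordered[1:]):
        if lower["top"] != upper["bottom"]:
            return False
    return True
```

A profile with layers at 0-10 cm and 10-30 cm passes, whereas one with layers at 0-10 cm and 15-30 cm (a gap) or 0-10 cm and 5-30 cm (an overlap) is rejected.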


We harmonised the depth data using the equal-area spline, implemented in the ea_spline function of the ithir package (Malone, Minasny, & McBratney, 2017), with the same depth intervals as in Chapter 2. This left us with 55,562 profiles: 35,988 from the US, 2,101 from Mexico, 89 from Canada, and the rest from other parts of the world (including 151 from Australia). The North American continent therefore dominates the WoSIS dataset. The cleaned and splined NSSC dataset described in Chapter 2 was then added to these observations, contributing another 13,830 observations from Australia (Figure 5.1). Summary statistics for the splined clay content data are given in Table 5.1.

Table 5.1. Summary statistics of the splined composite WoSIS and NSSC dataset

Depth        Count    Average (% clay)   s.d. (% clay)   Skewness   Kurtosis
0-5 cm       69415    20.49              15.28           1.27        1.50
5-15 cm      68953    21.78              15.50           1.15        1.12
15-30 cm     64520    24.91              16.55           0.88        0.43
30-60 cm     62156    29.22              17.88           0.60       -0.04
60-200 cm    56141    29.81              18.33           0.60       -0.09


5.2.2.2. Global Non-Stationarity

We find evidence of non-stationarity in both the mean and the variance of the global soil texture data in the North-South direction (Figure 5.2). We do not find evidence of non-stationarity in the East-West direction: there is a region of lower variability and lower mean values from around -140° to 100° longitude, but this does not represent a strong trend (Figure 5.3).
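The per-degree statistics used to assess stationarity can be sketched as follows. This is a minimal example under stated assumptions: the function name `stats_per_degree` is hypothetical, observations are binned by the integer floor of their coordinate, and the population variance is used (the text does not state which variance estimator was applied).

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def stats_per_degree(coords, values):
    """Mean and variance of `values` within each one-degree bin of `coords`.

    `coords` holds latitudes (or longitudes) in decimal degrees; each value
    is assigned to the bin given by the floor of its coordinate.
    Returns {degree: (mean, variance)}, ordered by degree.
    """
    bins = defaultdict(list)
    for coord, value in zip(coords, values):
        bins[math.floor(coord)].append(value)
    return {deg: (mean(vals), pvariance(vals))
            for deg, vals in sorted(bins.items())}
```

Plotting the two components of the result against degree gives a left (mean) and right (variance) panel of the kind shown in Figures 5.2 and 5.3. A systematic trend in either series along the axis suggests non-stationarity in that direction.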

Figure 5.2. Percentage clay fraction statistics by latitude (North-South). Average per degree of latitude in left panel, variance per degree of latitude in right panel.

Figure 5.3. Percentage clay fraction statistics by longitude (East-West). Average per degree of longitude in left panel, variance per degree of longitude in right panel.

A common approach for dealing with non-stationarity about the mean is to remove the trend, and we try this approach here. The N-S mean trend appears to be well described by a quadratic function (Figure 5.2 left panel), so we fit a quadratic function of latitude to the raw data (Figure 5.4 left panel).


It is difficult to visualize the fit on the raw data, but it appears to be a good fit against the mean (Figure 5.4 right panel). From this fitted curve we calculate the residuals (Figure 5.5).
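The detrending step can be illustrated with an ordinary least-squares quadratic fit followed by subtraction of the fitted values. This is a sketch only: the text does not state how the quadratic was fitted, so ordinary least squares is an assumption, and the function names `fit_quadratic` and `residuals` are hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c).

    Solves the 3x3 normal equations by Gauss-Jordan elimination
    with partial pivoting.
    """
    # Power sums of x (orders 0..4) and moments of y against x (orders 0..2).
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented normal-equation matrix [X'X | X'y].
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        # Move the row with the largest entry in this column to the pivot.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def residuals(x, y, coef):
    """Observed values minus the fitted quadratic trend."""
    a, b, c = coef
    return [yi - (a + b * xi + c * xi * xi) for xi, yi in zip(x, y)]
```

Here `x` would hold the latitude of each observation and `y` its percentage clay fraction. The residuals then replace the raw values when the per-degree statistics and variograms are recomputed.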

Figure 5.4. Percentage clay fraction against latitude. On the left panel each red point represents an observation. On the right panel each red point represents the average of the % clay fraction observations collected in that degree of latitude. The black line is the same on both panels.

Figure 5.5. Percentage clay fraction statistics calculated from residual data by latitude (North-South). Average per degree of latitude in left panel, variance per degree of latitude in right panel.

Removal of the mean trend does not affect the non-stationarity of the variance (Figure 5.5 right panel), or the shape of the empirical semivariograms (Figure 5.6 and Figure 5.7).
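For readers unfamiliar with directional empirical semivariograms, the calculation can be sketched with the classical (Matheron) estimator, in which the semivariance for a lag bin is half the mean squared difference between paired values. The example below is illustrative only and makes several assumptions not taken from the text: planar (x, y) coordinates rather than great-circle distances, equal-width lag bins, and an angular tolerance of 22.5° around the chosen direction. The function name `directional_semivariogram` is hypothetical.

```python
import math


def directional_semivariogram(points, lag_width, n_lags,
                              direction_deg, tol_deg=22.5):
    """Empirical semivariogram along one direction (Matheron estimator).

    `points` is a list of (x, y, z) tuples in planar coordinates.
    `direction_deg` is an azimuth measured clockwise from north
    (0 = North-South, 90 = East-West).
    Returns a list of (lag centre, semivariance, number of pairs).
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    target = direction_deg % 180.0
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            dx, dy = xj - xi, yj - yi
            h = math.hypot(dx, dy)
            if h == 0:
                continue  # co-located points carry no directional information
            # Azimuth of the pair, folded so opposite directions coincide.
            azimuth = math.degrees(math.atan2(dx, dy)) % 180.0
            diff = abs(azimuth - target)
            diff = min(diff, 180.0 - diff)
            if diff > tol_deg:
                continue
            k = int(h // lag_width)
            if k < n_lags:
                sums[k] += (zj - zi) ** 2
                counts[k] += 1
    return [((k + 0.5) * lag_width, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k]]
```

Calling the function once on the original values and once on the residuals, for each of the two directions, gives paired curves of the kind compared in Figures 5.6 and 5.7.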


Figure 5.6. Global variogram, North-South, calculated from original data (blue points) and residual data (red points).

Figure 5.7. Global variogram, East-West, calculated from original data (blue points) and residual data (red points).

Dealing with non-stationarity about the variance is a more difficult problem than dealing with non-stationarity about the mean. The pronounced trend in the variance violates the second assumption of weak stationarity (i.e. the variance in this case depends not only on the separation distance between two observations, but also on the direction and location of these observations). Methods exist for modelling variograms when the variance is non-stationary (e.g. Lark, 2009; Meul & Van Meirvenne, 2003), but these approaches are not compatible with empirical variogram modelling at a global scale. In the results and discussion below, we proceed with the original data and discuss potential drivers for the non-stationarity and associated anisotropy. It is necessary to deal with non-stationarity about the variance to create accurate estimates of the prediction error when using variograms for kriging and mapping (Lark, 2009), but the underlying non-stationarity does not prevent us from drawing inferences about global trends in variability from the global variograms.

5.3. Results and Discussion