Evolution from Traditional to Digital Soil Mapping for Land Evaluation: Research Perspective and Literature Review
1.4 Digital Soil Mapping (DSM)
1.4.1 DSM Development
The precise genesis of DSM is difficult to determine. It was the culmination of several factors converging at an opportune time, including geostatistics, GIS, computer processing, and growing demand for more quantitative soil information (Minasny and McBratney, 2015). Stemming from the early adoption of geostatistics for mineral exploration in the 1960s and 1970s (Journel and Huijbregts, 1978), predictive techniques based on “regionalised variable theory”, which identify the spatial dependence of environmental data, were introduced into the predictive mapping of soil attributes. Webster and Burgess (1980) demonstrated a ‘kriging’ approach for the prediction of three soil properties (sodium content, stoniness, and topsoil (loam) thickness), whereby semivariograms were computed to show the spatial dependence of each property, and the semivariances were used to determine the weights applied in the kriging equations. A semivariogram depicts the level of spatial autocorrelation, that is, how the dissimilarity between observations changes with separation distance (Olea, 2006). In this way, the equations generated spatial predictions of each soil property, in particular at locations with no recorded values, as an alternative to existing conventional soil mapping, and showed an improvement over those maps in representing short-range variation. In a similar approach, McBratney and Webster (1983) used the theory of semivariance and spatial dependence to determine the number of observations needed to adequately estimate the regional distribution of soil attributes, and showed that classical sampling theory (which disregards spatial dependence) had greatly overestimated the required sample numbers in the past. Consequently, due to the high costs implied by the overestimated sample numbers, many investigations into the spatial estimation of soil properties had been avoided. McBratney (1984) coined the term “Geostatistical Soil Survey” to encompass the use of semivariance analysis and spatial dependence to produce spatial estimates of soil properties (with known variance) and to develop optimal sampling designs for generating these predictions. He also identified its predictions as superior to those of commonly used conventional techniques, owing to the ability to capture the continuous nature of soil attributes. McBratney concluded that “Geostatistical Soil Survey is largely complementary to Conventional Soil Survey”, and is most useful for “medium to large-scale surveys with specific aims”.
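As a minimal sketch of the semivariance calculation that underlies these studies, the snippet below computes an empirical semivariogram with the standard method-of-moments estimator (half the mean squared difference between all pairs of observations falling in each lag-distance bin). The function name, the lag bins and the five-point transect are illustrative assumptions; they are not data or code from Webster and Burgess (1980) or McBratney (1984).

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Method-of-moments semivariance for each lag-distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)      # each unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)   # NaN where a bin has no pairs
    for b in range(len(bin_edges) - 1):
        in_bin = (dist > bin_edges[b]) & (dist <= bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = 0.5 * sq_diff[in_bin].mean()
    return gamma

# Hypothetical transect: five sites 1 unit apart with a steadily increasing value
coords = [[0, 0], [1, 0], [2, 0], [3, 0], [4, 0]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
gamma = empirical_semivariogram(coords, values, bin_edges=[0.5, 1.5, 2.5, 3.5])
print(gamma)  # semivariance for lags of about 1, 2 and 3 units
```

Semivariance that keeps rising with lag distance, as in this toy transect, indicates that nearby sites are more alike than distant ones; in practice a model (for example spherical or exponential) is fitted to these points before kriging.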
Later, Laslett et al. (1987) tested the performance of several two-dimensional (2D) spatial prediction methods for topsoil pH. The results were mixed, with all methods tested showing “some deficiencies”, especially direct interpolation methods, which caused ‘spikes’, or unnaturally high values near data points. Non-interpolating methods, such as splines or universal kriging (UK), were shown to produce better predictive surfaces of short-range correlations between neighbouring soil sample sites. Odeh et al. (1995) used terrain predictor variables in co-kriging (using the spatial correlation between two variables to predict the variable of interest (Kozar et al., 2002)) and regression-kriging (determining the spatial autocorrelation of the modelling residuals, kriging these residuals, and adding the residual surfaces to the original modelled attribute values to account for the unexplained variability due to the spatial dependency of each soil sample site (Hengl et al., 2007)). In this example of spatially predicting soil properties, the authors showed that the latter prediction methods, which incorporate environmental covariables, performed better than methods (ordinary kriging (OK) and UK) that did not consider any co-variation of predictor variables. Stein et al. (1988) used a correlation approach incorporating existing soil-mapping polygons as an explanatory variable for the co-kriging of collected soil point data on soil moisture deficits (determined from hydraulic conductivity and associated moisture retention curves) in a study area of the Eastern Netherlands. Stein et al. also demonstrated the advantages of co-kriging in producing more precise results than OK alone on the training dataset, showing general improvements in Mean Variance of Prediction Error (MVPE) and Mean Squared Error of Prediction (MSEP). MVPE comparisons showed up to a 25 % increase in precision when using the co-kriging approach, which also demonstrated the value of incorporating existing conventional soil mapping as a form of soil-expert modelling input.
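The regression-kriging steps described above (fit a model on covariates, krige the residuals, add the two surfaces) can be sketched as follows. This is an illustrative outline only, not the procedure of Odeh et al. (1995) or Hengl et al. (2007): it assumes a linear trend on a single covariate, ordinary kriging of the residuals, and an exponential semivariogram whose parameters are simply assumed rather than fitted. All names and sample values are hypothetical.

```python
import numpy as np

def exponential_variogram(h, nugget=0.0, partial_sill=1.0, range_=10.0):
    """Exponential semivariogram model; parameters are assumed, not fitted."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + partial_sill * (1.0 - np.exp(-h / range_)), 0.0)

def ordinary_kriging(coords, values, targets, variogram):
    """Solve the ordinary kriging system (semivariogram form) for each target."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0                                  # Lagrange multiplier row/column
    preds = np.empty(len(targets))
    for k, t in enumerate(targets):
        b = np.ones(n + 1)
        b[:n] = variogram(np.linalg.norm(coords - t, axis=1))
        weights = np.linalg.solve(A, b)[:n]        # weights sum to 1
        preds[k] = weights @ values
    return preds

def regression_kriging(coords, covariate, values, targets, target_covariate, variogram):
    """Linear trend on the covariate plus kriged regression residuals."""
    X = np.column_stack([np.ones(len(values)), covariate])
    beta, *_ = np.linalg.lstsq(X, values, rcond=None)
    residuals = values - X @ beta
    Xt = np.column_stack([np.ones(len(targets)), target_covariate])
    return Xt @ beta + ordinary_kriging(coords, residuals, targets, variogram)

# Hypothetical sample sites: coordinates (m), elevation covariate (m), clay (%)
coords = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 5], [2, 7]], dtype=float)
elevation = np.array([100.0, 120.0, 110.0, 150.0, 125.0, 105.0])
clay = np.array([20.0, 26.0, 22.0, 35.0, 28.0, 21.0])

# Prediction at an unsampled location with a known covariate value
pred_new = regression_kriging(coords, elevation, clay,
                              np.array([[7.0, 3.0]]), np.array([118.0]),
                              exponential_variogram)
# Predicting back at the sample sites reproduces the observations
pred_at_samples = regression_kriging(coords, elevation, clay, coords, elevation,
                                     exponential_variogram)
```

Because the semivariogram is zero at zero distance, the kriged residual at each sample site equals its regression residual, so the combined prediction honours the observed values there; between sites the covariate trend carries most of the prediction.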
1.4.1.1 DSM and Environmental Correlation
By the 1990s, DSM was routinely combining an empirical environmental correlation approach with the geostatistical developments encompassing spatial dependencies of soil properties. Most present-day DSM is based on the scorpan environmental correlation approach (McBratney et al., 2003), which uses environmental covariates to build spatial models of soil attributes, commonly incorporating regression-kriging. This was facilitated both by advances in available computing power and by improvements in the quality and affordability of remotely-sensed spatial data, such as multi-spectral (and hyper-spectral) satellite imagery, digital elevation models, and gamma-radiometric mapping, together with global positioning systems (GPS) for accurate placement of soil sample training data (Cook et al., 1996; Dobos et al., 2000; Taylor et al., 2002; McBratney et al., 2003; Minasny and Hartemink, 2011). McKenzie and Gallant (2006) further identified the advantages of an environmental correlation approach in DSM: explicit relationships exist between environmental variables and soil attributes; the predictor variables allow spatial prediction across entire areas of interest; and the DSM models can be dynamic, that is, updated as new environmental covariates (or samples) are obtained.
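For reference, the scorpan model of McBratney et al. (2003) is usually written in the general form below, where a soil attribute (or class) at a location is an empirical function of seven factors. The additive residual term is shown as it is commonly written when the model is combined with kriging of residuals; the function f itself is left unspecified because it depends on the chosen modelling technique.

```latex
S = f(s, c, o, r, p, a, n) + e
% s: soil (other known soil properties at the location)
% c: climate
% o: organisms (vegetation, fauna, human activity)
% r: relief (topography, terrain attributes)
% p: parent material
% a: age (time)
% n: spatial position
% e: spatially correlated residual
```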
Gessler et al. (1995) considered soil-landscape modelling and its application to spatially predicting certain soil attributes. Citing Moore et al. (1993) and the hypothesis of “catenary soil development” (the geomorphological slope and gravity processes of erosion, deposition and soil formation (Edwards, 1940)), along with an earlier study into predicting soil attributes using terrain analysis, Gessler et al. identified the need for spatial predictions of individual soil attributes as “explicit, quantitative and spatially realistic” representations of the soil-landscape “continuum”. They identified that catenary soil-landscape processes ultimately influence the spatial variability of soil attributes, which is largely controlled by water movement through the landscape, which in turn is controlled by the terrain geometry. Gessler et al., following Moore et al. (1991), deduced that terrain geometry could be used to predict the movement of water through the soil matrix, and therefore some soil attributes. Pointing to the vast areas of Australia that at the time had very limited detailed soil information, the importance of such approaches was emphasised, given the limited ability to quantify environmental soil-landscape function and associated land management in many areas (McKenzie, 1991).
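To illustrate how terrain geometry is turned into predictors of water movement, the sketch below derives slope from a gridded DEM by finite differences and then computes the widely used topographic wetness index, ln(a / tan β), where a is the specific catchment area and β the slope angle. It is a generic example rather than the workflow of Gessler et al. (1995): the function names, the 25 m tilted-plane DEM and the constant catchment area are assumptions, and a real application would obtain catchment area from a flow-routing algorithm.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope angle per cell from finite-difference elevation gradients."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)    # axis 0 = rows, axis 1 = columns
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def wetness_index(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index ln(a / tan(beta)); slope is floored to avoid
    division by zero on flat cells (the floor value is an assumption)."""
    beta = np.radians(np.maximum(slope_deg, min_slope_deg))
    return np.log(specific_catchment_area / np.tan(beta))

# Hypothetical DEM: a plane rising 2.5 m per 25 m cell from west to east
cell_size = 25.0
dem = np.tile(200.0 + 0.1 * cell_size * np.arange(5), (4, 1))

slope = slope_degrees(dem, cell_size)
# Assumed specific catchment area of 100 m for every cell (normally from flow routing)
twi = wetness_index(np.full(dem.shape, 100.0), slope)
```

Higher index values mark cells that collect water from a large upslope area and drain slowly, which is why such derivatives are commonly used as covariates for moisture-related soil attributes.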
Identifying the requirements for spatial products to adequately inform precision agriculture (the technological ability to vary field management based on spatial knowledge of soil properties (Robert, 1993)), Kozar et al. (2002) showed that a co-kriging approach to predicting soil potassium, used to control variable-rate application of fertiliser, performed better than kriging the potassium field sample sites alone. In this example, the particular relationship between terrain indices (slope) and the translocation of profile potassium was shown to produce good correlations. However, the authors stated that including several additional covariates, such as terrain derivatives and parent material, would further enhance the ability to spatially predict soil nutrients. Such predictions are important for enabling future development in precision agriculture (McBratney et al., 2005).
An early Australian example of environmental correlation (DSM) using several environmental predictors was investigated by McKenzie and Ryan (CSIRO) in 1999, in which a 50,000 ha area in south-eastern Australia was used to test a stratified soil sampling method and digital geology, landform and climate covariates for producing spatial predictions of soil properties at 25 m resolution. The authors recognised that conventional medium to low-intensity soil surveys gain efficiencies by using the relationships between soil and environmental variables; however, they perceived these relationships as largely qualitative and rarely recorded in a form suitable for transferring knowledge to end users (McKenzie and Ryan, 1999). The environmental correlation models produced promising results, explaining between 42 and 78 % of the variance of the sampled soil properties, a level of spatial agreement described as ‘unmatched’ by traditional soil survey methods at the time.
Among further studies involving environmental spatial correlation of soil properties, Park and Vlek (2002) identified the need to consider environmental correlation of soil spatial variability in three dimensions (3D). Park and Vlek used 502 samples from 64 profiles, along with existing soil type maps, terrain derivatives, and sample depths as spatial covariates, within an area of Somerset in the United Kingdom. The better predictions were found for those properties (specifically exchangeable cations, manganese oxides, and pH) that were determined to be influenced by catenary hydrological slope processes, which demonstrated the strong influence of terrain on the spatial variation of some soil properties.
1.4.1.2 Soil Landscape Clustering
Another area of DSM development was progressing in automated methods for segmenting landscapes for soil mapping, particularly fuzzy classification theory (Zadeh, 1968; Burrough et al., 1992), where membership functions are generated to group soil and terrain while capturing their continuous nature. To reduce the inherent subjectivity of API, Irvin et al. (1997) described a numerical classification approach to delineate landforms in a 50 ha study area in Wisconsin, USA, combining continuous fuzzy and unsupervised classification methods. The authors investigated whether several DEM terrain derivatives, representing soil-forming processes, correlated with observed soil properties, as a method of delineating soil mapping units. Both the unsupervised and fuzzy approaches were compared against traditional API to determine whether manual API could be replaced by more automated methods to reduce time and subjectivity. Various numbers of landform classes were tested to determine which number best represented the expert-based landform knowledge of the area. Irvin et al. found that the automated (unsupervised) statistical clustering approach produced more detail than the traditional API, and was better able to differentiate features based on aspect and within forested areas. With the continuous (fuzzy) classification approach, which clustered the terrain derivatives and produced partial membership values for each pixel, the authors found it difficult to decide upon the optimum number of classes to adequately represent the known landform patterns in the area without needlessly delineating landforms. This method was also able to delineate more detail than the API, and had the advantage of membership analysis to assess variations in terrain contributions to each landform class. Both automated methods were shown to generate products that would aid soil surveyors in producing more detailed and less subjective mapping; however, soil-landscape knowledge was identified as crucial in manipulating the clustering to generate the most meaningful landforms in terms of soil formation processes.
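The idea of partial membership can be illustrated with a basic fuzzy k-means (fuzzy c-means) iteration, in which each pixel receives a membership value in every class rather than a single hard label. The sketch below is a generic textbook form of the algorithm under assumed settings (fuzziness exponent of 2, random initial memberships, Euclidean distance); it is not the specific procedure or parameterisation used by Irvin et al. (1997), and the six two-attribute "pixels" are made up for illustration.

```python
import numpy as np

def fuzzy_k_means(X, n_classes, fuzziness=2.0, n_iter=100, seed=0):
    """Return (memberships, centroids) after a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_classes))
    U /= U.sum(axis=1, keepdims=True)              # memberships sum to 1 per pixel
    for _ in range(n_iter):
        W = U ** fuzziness
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                   # guard against zero distance
        inv = d ** (-2.0 / (fuzziness - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # closer centroid, higher membership
    return U, centroids

# Hypothetical pixels described by two standardised terrain derivatives
X = np.array([[0.00, 0.10], [0.10, 0.00], [0.05, 0.05],
              [5.00, 5.10], [5.10, 5.00], [5.05, 5.05]])
U, centroids = fuzzy_k_means(X, n_classes=2)
labels = U.argmax(axis=1)                          # hard label = class of highest membership
```

The number of classes and the fuzziness exponent must be chosen by the analyst, which mirrors the difficulty the authors reported in settling on an optimum number of landform classes.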
To overcome the limited capacity of DEMs to explain soil variability in flat, featureless alluvial plains, Triantafilis et al. (2013) described the use of remotely-sensed gamma-radiometrics to improve the capture of variation, specifically by clustering the radiometrics using fuzzy k-means (FKM) (McBratney and de Gruijter, 1992). The authors were able to effectively delineate known agricultural features in most areas, with a few exceptions where the gamma characteristics were not ‘unique’ enough to distinguish them from neighbouring features. For these cases they suggested integrating proximal sensing, such as electromagnetic induction, or using LiDAR-derived DEMs. In addition to clustering landforms into delineated mapping units that reflect soil-formation processes, fuzzy classification can also be used to produce covariates for scorpan DSM, or to provide landform stratification for sampling purposes, which is discussed in more detail in Chapter 2 (Kidd et al., 2015a).