3.6 Chapter summary
4.1.1 Ordinary kriging parameter estimation
The empirical variogram of the sample proportions was estimated and fit to a spherical covariance structure for each of the simulations considered in the previous chapter. Simulation conditions varied across six beta distributions, three spatial ranges, and three sets of sample sizes. The choice of how many spatial lag classes to use can in some cases dramatically affect parameter estimation [20]. To guarantee a fair comparison, the spatial lag classes were chosen to match those used in evaluating the beta-binomial kriging model.
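The estimation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the equal-width lag classes, and the grid search over candidate ranges are hypothetical choices, since the fitting software and criterion are not specified here. The sketch computes the classical half-mean-squared-difference estimator per lag class and then fits a spherical model by weighted least squares, using the fact that for a fixed range the model is linear in the nugget and partial sill.

```python
import math
from itertools import combinations

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Half the mean squared difference of values within each lag class."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i, j in combinations(range(len(values)), 2):
        k = int(math.dist(coords[i], coords[j]) // lag_width)  # lag class index
        if k < n_lags:
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    lags = [(k + 0.5) * lag_width for k in range(n_lags)]  # class midpoints
    gamma = [s / (2 * c) if c else float("nan") for s, c in zip(sums, counts)]
    return lags, gamma, counts

def spherical(h, nugget, psill, rng):
    """Spherical semivariogram; equals nugget + psill for h >= rng."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return nugget + psill
    r = h / rng
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)

def fit_spherical(lags, gamma, counts, candidate_ranges):
    """Weighted least squares (weights = pair counts), profiled over ranges."""
    pts = [(h, g, c) for h, g, c in zip(lags, gamma, counts) if c > 0]
    best = None
    for rng in candidate_ranges:
        s = [spherical(h, 0.0, 1.0, rng) for h, _, _ in pts]  # unit-sill shape
        sw = sum(c for _, _, c in pts)
        sbar = sum(c * si for (_, _, c), si in zip(pts, s)) / sw
        gbar = sum(c * g for _, g, c in pts) / sw
        sxx = sum(c * (si - sbar) ** 2 for (_, _, c), si in zip(pts, s))
        if sxx == 0:
            continue  # shape is constant over the lags; nothing to fit
        sxy = sum(c * (si - sbar) * (g - gbar) for (_, g, c), si in zip(pts, s))
        psill = sxy / sxx
        nugget = gbar - psill * sbar
        if nugget < 0:  # keep the nugget non-negative: refit with nugget = 0
            nugget = 0.0
            psill = (sum(c * si * g for (_, g, c), si in zip(pts, s))
                     / sum(c * si * si for (_, _, c), si in zip(pts, s)))
        sse = sum(c * (g - nugget - psill * si) ** 2
                  for (_, g, c), si in zip(pts, s))
        if best is None or sse < best[0]:
            best = (sse, nugget, psill, rng)
    if best is None:
        raise ValueError("no candidate range produced a usable fit")
    return best[1:]  # (nugget, partial sill, range)
```

Passing the same `lag_width` and `n_lags` to both models is what keeps the comparison fair in the sense described above; changing them changes which pairs are averaged together and can shift all three fitted parameters.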
Table 4.2 provides summary statistics for the estimated nugget parameter, $\hat{\tau}^2$. In the underlying beta random field, the nugget effect is exactly zero; that is, the latent spatial probability field is completely smooth. The empirical variogram that would be used in ordinary kriging tended to drastically overestimate the nugget effect at small sample sizes. As the number of samples increased, the estimated nugget effect trended closer to zero: for relatively large sample sizes, between 50 and 100, the sample proportions are generally very close to the underlying probabilities.
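A small Monte Carlo check makes the sample-size effect concrete. It is a sketch under simplifying assumptions: the Beta(2, 2) parameters are hypothetical (not one of the six beta cases used in the simulations), the sites are drawn independently with no spatial correlation, and the function names are illustrative. The quantity printed is the variance of the sample proportions in excess of the latent beta variance, which is the kind of extra variability an empirical variogram of proportions would absorb as an apparent nugget.

```python
import random
import statistics

def sample_proportion_variance(a, b, n, n_sites=2000, seed=1):
    """Variance across sites of Y/n, with p ~ Beta(a, b) and Y | p ~ Binomial(n, p)."""
    rng = random.Random(seed)
    props = []
    for _ in range(n_sites):
        p = rng.betavariate(a, b)                       # latent probability at a site
        y = sum(rng.random() < p for _ in range(n))     # binomial count from n trials
        props.append(y / n)                             # observed sample proportion
    return statistics.pvariance(props)

def latent_variance(a, b):
    """Variance of a Beta(a, b) random variable."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

# Hypothetical beta case; excess variance should shrink as n grows.
for n in (5, 20, 100):
    excess = sample_proportion_variance(2.0, 2.0, n) - latent_variance(2.0, 2.0)
    print(n, round(excess, 4))
```

The excess shrinks as the per-site sample size grows, which matches the pattern reported for the estimated nugget.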
When compared to the previous estimates from the beta-binomial model, it is clear that ordinary kriging estimated a much larger nugget effect. The largest nugget effects were estimated at small sample sizes and for symmetric distributions. The additional variability at the latent beta level of the symmetric beta distributions is evident in both the sample proportions and the estimated nugget. The skewed beta distributions still show some overestimation of the nugget. Based on these parameter estimates, the ordinary kriging model should be used with caution at small sample sizes.
The spatial sill for each simulation condition also shows some bias. As the spatial range increases, the estimated spatial sill also tends to increase. This pattern was observed in the beta-binomial simulations as well: larger spatial ranges tend to occur with larger spatial sills. As in the beta-binomial variogram estimates, at least one simulation in every scenario produced completely unreasonable values, shown by the very large maximum sill estimates. In about half of the simulation conditions, the median spatial sill is over- or underestimated by at least 0.01. This bias may seem small, but on the proportion scale, where a variance cannot exceed 0.25, it is a large misestimation. The poor estimation tends to occur for beta distributions with relatively high variability: beta distributions 1, 2, 3, and 5. The higher the variability in the underlying spatial probability field, the higher the variability in the sample proportions and the worse the empirical variogram performs.
The sill estimates are nearly identical for the ordinary kriging model and the beta-binomial kriging model. The majority of ordinary kriging cases slightly overestimate the spatial sill, again reflecting the additional variability, not accounted for in this model, that comes from using sample proportions to estimate a beta random field. Both models show about the same “failure rate”, that is, the approximate percentage of the time the model produced nonsensical results. This failure rate is indicative of non-convergence in the model estimates, or perhaps a lack of defined spatial structure in the simulated data. Neither model appears to be a clear winner at this point when it comes to parameter estimation.
The range is estimated relatively well by both the ordinary kriging model and the beta-binomial kriging model. Both models show a tendency to overestimate the range, which is not unusual for spatially correlated data. However, the median range using the empirical semivariogram is actually underestimated in the majority of simulation conditions. In a few cases the ordinary kriging model substantially underestimates the range (outliers at the bottom of the boxplot), but this only happened at small spatial ranges.
Overall, the empirical variogram estimates the spatial range well. Where it misses the mark is in the nugget effect and the spatial sill. The empirical variogram has a different estimation target than the beta-binomial variogram: the variability of the sample proportions. The nugget and sill estimates are inflated because they represent the variability of the sample proportions themselves and not that of the underlying spatial probabilities. Care must be taken to choose the model best suited to the modeling structure of primary interest. At large sample sizes, the difference between the empirical variogram and the beta-binomial variogram is nearly negligible, as both methods estimate the variability of the beta random field quite well. For small sample sizes ($n_i < 30$), the choice of which variogram to use is critical. The next section will consider what effect using the empirical variogram has on the predicted values using ordinary kriging.
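The difference in estimation targets can be made explicit with a short derivation. It is a sketch that assumes the counts $Y_i$ are conditionally independent binomial draws given the latent probabilities $p_i$, with $\hat{p}_i = Y_i / n_i$ the sample proportion at site $i$; the symbols $\gamma_{\hat{p}}$ and $\gamma_p$ (semivariograms of the sample proportions and of the latent field) are notation introduced here for illustration. Conditioning on the latent field and applying the law of total expectation gives, for two distinct sites separated by distance $h$:

```latex
\begin{align*}
\mathrm{E}\!\left[(\hat{p}_i - \hat{p}_j)^2\right]
  &= \mathrm{E}\!\left[(p_i - p_j)^2\right]
   + \frac{\mathrm{E}[p_i(1-p_i)]}{n_i}
   + \frac{\mathrm{E}[p_j(1-p_j)]}{n_j}, \\[4pt]
\gamma_{\hat{p}}(h)
  &= \gamma_p(h)
   + \frac{1}{2}\left(
       \frac{\mathrm{E}[p_i(1-p_i)]}{n_i}
     + \frac{\mathrm{E}[p_j(1-p_j)]}{n_j}
     \right), \qquad i \neq j.
\end{align*}
```

If the marginal distribution of $p_i$ and the sample size $n$ are common to all sites, the added term reduces to $\mathrm{E}[p(1-p)]/n$, which does not depend on $h$. Under these assumptions it appears as an apparent nugget, raises the plateau of the fitted variogram by the same amount, and vanishes as $n$ grows, which is consistent with the near agreement of the two variograms at large sample sizes.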
Figure 4.1: Parameter estimates for nugget effect using ordinary kriging and beta-binomial kriging.
Simulation conditions 1-18 have a spatial range of 5, simulation conditions 19-36 have a spatial range of 10, and simulation conditions 37-54 have a spatial range of 15. The simulations on the left within each subpanel have the smallest sample sizes, simulations in the middle of each subpanel have medium sample sizes, and simulations on the right of each subpanel have the largest sample sizes. The lightest plot color represents beta case 1, the darkest plot color represents beta case 6. Estimates made using the ordinary kriging model are in shades of blue (top panel), and estimates made using the beta-binomial kriging model are in shades of red (bottom panel).
Figure 4.2: Parameter estimates for spatial sill using ordinary kriging and beta-binomial kriging.
Simulation conditions 1-18 have a spatial range of 5, simulation conditions 19-36 have a spatial range of 10, and simulation conditions 37-54 have a spatial range of 15. The simulations on the left within each subpanel have the smallest sample sizes, simulations in the middle of each subpanel have medium sample sizes, and simulations on the right of each subpanel have the largest sample sizes. The lightest plot color represents beta case 1, the darkest plot color represents beta case 6. Estimates made using the ordinary kriging model are in shades of blue (top panel), and estimates made using the beta-binomial kriging model are in shades of red (bottom panel).
Figure 4.3: Parameter estimates for spatial range using ordinary kriging and beta-binomial kriging.
Simulation conditions 1-18 have a spatial range of 5, simulation conditions 19-36 have a spatial range of 10, and simulation conditions 37-54 have a spatial range of 15. The simulations on the left within each subpanel have the smallest sample sizes, simulations in the middle of each subpanel have medium sample sizes, and simulations on the right of each subpanel have the largest sample sizes. The lightest plot color represents beta case 1, the darkest plot color represents beta case 6. Estimates made using the ordinary kriging model are in shades of blue (top panel), and estimates made using the beta-binomial kriging model are in shades of red (bottom panel).