
6.3 Identifying and quantifying uncertainty in invasive spread forecasts

6.3.1 Model-based uncertainty

6.3.1.1 Error analysis (EA)

EA aims to quantify the influence of different sources of input estimation error on a given model’s dynamics and output (Haefner, 2005; Jager & King, 2004; Matott et al., 2009). EA also helps to understand how errors combine within the system (error amplification or compensation). Parysow et al. (2000) and McGwire & Fisher (2001) describe error analysis and, in particular, spatial error budget analysis, as a method to systematically partition the contribution of different sources of error introduced by each parameter in the spread model. A clear account of the technique from the point of view of tracking input errors can be found in the Joint Committee for Guides in Metrology report, JCGM (2008).

| Category | Method | Description | Selected references |
|---|---|---|---|
| Model-based uncertainty | Error analysis (EA) | Identification of the sources of error that cause the largest variation in model forecast | Haefner (2005); Hartig et al. (2011); Evans (2011); Jager & King (2004); Matott et al. (2009) |
| Model-based uncertainty | Sensitivity analysis (SA) | Identification of model output components most sensitive to local (spatially distributed) input variables | Crosetto et al. (2000); Haefner (2005); Jager & King (2004); Matott et al. (2009) |
| Model-based uncertainty | Uncertainty analysis (UA) | Identification of how uncertainty in multiple (interacting) parameters and their representation influence uncertainty in model forecast | Hartig et al. (2011); Jager & King (2004); Matott et al. (2009) |
| Model-based uncertainty | Bayesian networks (BNs) | Combine prior distributions of uncertainty to yield an updated (posterior) set of distributions | O’Sullivan & Perry (2013); Matott et al. (2009); Railsback & Grimm (2011) |
| Model-based uncertainty | Spatial data analysis (SDA) | Detection and quantification of characteristics of geographical data, specifically spatial autocorrelation, spatial heterogeneity and scale-dependence structure | Dormann et al. (2013); Evans (2011); Jager & King (2004) |
| Model-based uncertainty | Robustness analysis (RA) | Analysis of the extent to which different representational decisions influence model dynamics | Evans (2011); Matott et al. (2009) |
| Confrontational methods | Visualization and difference measure | Visual comparison of empirical observations and model predictions (non-spatial and spatial measures) | Fox & Hendler (2011); Grimm (2002); Spiegelhalter et al. (2011) |
| Confrontational methods | Statistical methods | Quantitative comparison and analysis of predictions and observations (via linear regression models, correlation, etc.) | Mayer & Butler (1993) |
| Exploratory/heuristics | Pattern oriented modelling (POM) | Use of multiple observed patterns to evaluate and refine models and select between alternative representations | Grimm & Railsback (2012b); Railsback & Grimm (2011) |
| Exploratory/heuristics | Multi-Model Analysis (MMA) | Generate ensemble predictions via consideration of multiple plausible models | Burnham et al. (2011); Burnham & Anderson (2002) |
| Exploratory/heuristics | Participatory modelling (PM) | Methods incorporating expert opinion into model design and evaluation | Krueger et al. (2012); Martin et al. (2012); Millington et al. (2011) |

Table 6.1: Selected approaches and tools for the evaluation of spatially-explicit models (modified from O’Sullivan & Perry, 2013)
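As a rough illustration of an error budget in the sense of section 6.3.1.1, the sketch below applies a first-order propagation of uncertainty of the kind set out in JCGM (2008). It is a minimal sketch rather than the procedure of any cited study: it assumes uncorrelated input errors, uses the classical reaction-diffusion wave speed c = 2*sqrt(r*D) as a stand-in spread model, and all parameter values and standard errors are hypothetical.

```python
import math


def spread_speed(params):
    """Toy spread model (assumption): asymptotic wave speed c = 2*sqrt(r*D)."""
    return 2.0 * math.sqrt(params["r"] * params["D"])


def error_budget(model, estimates, std_errors, rel_step=1e-6):
    """First-order error budget, assuming uncorrelated input errors.

    Returns the combined output variance and each input's share of it.
    Estimates must be non-zero because the step is relative to the estimate.
    """
    contributions = {}
    for name, value in estimates.items():
        h = rel_step * abs(value)
        up = dict(estimates, **{name: value + h})
        down = dict(estimates, **{name: value - h})
        # Central finite-difference estimate of d(model)/d(name).
        slope = (model(up) - model(down)) / (2.0 * h)
        contributions[name] = (slope * std_errors[name]) ** 2
    total = sum(contributions.values())
    shares = {name: c / total for name, c in contributions.items()}
    return total, shares


# Hypothetical estimates and standard errors, for illustration only.
estimates = {"r": 0.5, "D": 2.0}
std_errors = {"r": 0.05, "D": 0.4}
variance, shares = error_budget(spread_speed, estimates, std_errors)
```

With these illustrative numbers, the error in D accounts for about 80% of the first-order output variance, so it would be the first input to target for better estimation. Correlated input errors would require additional covariance terms that this sketch omits.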

6.3.1.2 Sensitivity analysis (SA)

SA is among the most widely used methodologies for assessing uncertainty. SA seeks to rank input parameters by their relative influence on variation and uncertainty in the target output variable. It involves systematically altering model parameter values and evaluating the effect on model outputs. In the case of a spread model, this might involve the population rate of increase, the Allee threshold, the mean long-distance dispersal and/or the classification of habitat suitability. In its traditional form, SA is often conducted using a ‘local’ approach in which the parameters of interest are systematically varied one at a time (no interaction considered) by some small amount. Fassò & Perri (2006), Saltelli et al. (2008) and Zajac (2010), among others, reviewed a large variety of methods and tools available for sensitivity testing, but only a few are well suited to spatial models. Some headway has been made in developing tools for evaluating uncertainties in the spatial context (see, for example, Congalton & Green, 1993; Crosetto et al., 2000; Kocabas & Dragicevic, 2006; Pontius, 2002), but rigorous evaluation of spatially explicit models remains a real challenge owing to the large number of factors and the interactions between model components at different spatial scales (Jager & King, 2004; O’Sullivan & Perry, 2013).
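The ‘local’ one-at-a-time approach can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_spread` is a hypothetical non-spatial model (logistic growth with a strong Allee effect), the baseline values are invented, and the index reported is a simple elasticity (relative change in output per relative change in input).

```python
def toy_spread(params, years=10):
    """Hypothetical toy model: logistic growth with a strong Allee effect."""
    n = params["n0"]
    for _ in range(years):
        growth = (params["r"] * n * (1.0 - n / params["K"])
                  * (n / params["allee"] - 1.0))
        n = max(n + growth, 0.0)
    return n


def oat_sensitivity(model, baseline, rel_step=0.05):
    """One-at-a-time local sensitivity.

    Each parameter is moved by +/- rel_step (a fraction of its baseline
    value) while all others stay fixed, so interactions are ignored.
    """
    y0 = model(baseline)
    indices = {}
    for name, value in baseline.items():
        up = dict(baseline, **{name: value * (1.0 + rel_step)})
        down = dict(baseline, **{name: value * (1.0 - rel_step)})
        # Relative change in output divided by relative change in input.
        indices[name] = (model(up) - model(down)) / (2.0 * rel_step * y0)
    return indices


# Hypothetical baseline values, for illustration only.
baseline = {"r": 0.1, "K": 100.0, "allee": 10.0, "n0": 20.0}
indices = oat_sensitivity(toy_spread, baseline)
# Parameters ordered from most to least influential near the baseline.
ranking = sorted(indices, key=lambda name: abs(indices[name]), reverse=True)
```

The ranking is only valid near the chosen baseline; a different baseline, or parameters that interact strongly, can change the order.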

6.3.1.3 Uncertainty analysis (UA)

UA is a more general approach that seeks to quantify the variation in model prediction caused by uncertainty in multiple, potentially interacting input parameters (Jager & King, 2004; Matott et al., 2009; Railsback & Grimm, 2011). UA involves generating a probability density function for each parameter of interest and quantifying the impact of input uncertainties on the empirical distribution of the model output. Many different approaches for conducting UA have been developed and are reviewed in Matott et al. (2009) and Hartig et al. (2011). Among them, the Monte Carlo approach, which does not require assumptions about model structure, has been the most widely applied to spatially-explicit data (Crosetto et al., 2000). A good introduction to Monte Carlo techniques in a spatial context can be found in Walker et al. (2003), along with a wide range of references to studies of sensitivity testing. A more generic study of uncertainty testing, concentrating on statistical summaries, can be found in Bobashev & Morris (2010). However, as emphasized in Railsback & Grimm (2011) and O’Sullivan & Perry (2013), the computational cost of covering the parameter space of complex models, such as most spread models, rapidly becomes impractical.
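A Monte Carlo UA of this kind can be sketched in a few lines. The example is a sketch under stated assumptions, not a method taken from the cited studies: the model (a circular invasion front advancing at speed 2*sqrt(r*D)) and the lognormal input distributions are hypothetical choices made only to keep the example short.

```python
import math
import random


def invaded_area(r, d, years):
    """Toy model (assumption): circular front moving at speed 2*sqrt(r*D)."""
    radius = 2.0 * math.sqrt(r * d) * years
    return math.pi * radius ** 2


def monte_carlo_ua(n_runs, years, rng):
    """Propagate input uncertainty by repeated sampling and model runs."""
    outputs = []
    for _ in range(n_runs):
        # Hypothetical input distributions; lognormals keep values positive.
        r = rng.lognormvariate(math.log(0.5), 0.2)
        d = rng.lognormvariate(math.log(2.0), 0.4)
        outputs.append(invaded_area(r, d, years))
    return sorted(outputs)


def quantile(sorted_values, q):
    """Simple empirical quantile read from an already sorted list."""
    index = min(int(q * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
areas = monte_carlo_ua(n_runs=5000, years=10, rng=rng)
lower, median, upper = (quantile(areas, q) for q in (0.025, 0.5, 0.975))
```

The interval from `lower` to `upper` summarizes the empirical output distribution. For a real spread model each run is far more expensive than this toy function, which is the cost problem noted above.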

To reduce the computational cost of global UA, two different approaches have been developed: approximation methods and sampling methods (Evans, 2011; Matott et al., 2009). Approximation methods characterize model output uncertainty by propagating one or more statistical moments (e.g., mean, variance, skewness and kurtosis) of the various input distributions through the modelling system. Examples include error propagation equations (Gertner, 1987), point estimate methods (Tsai & Franceschini, 2005), and various reliability methods (Hamed et al., 1996; Portielje et al., 2000; Skaggs & Barry, 1996). Sampling methods, on the other hand, select a structured sample of the parameter space so that a large share of the output uncertainty can be captured with a relatively small input sample size. Helton et al. (2006) and Helton (2008) provided a thorough survey of sampling-based methods for uncertainty and sensitivity analysis. Among them, Latin hypercube sampling and quasi-random sampling have been the most widely used. With respect to estimating risks for emerging invasive species threats, probability models may be inadequately formulated because of the very high importance of rare events (i.e., events associated with the extreme tails of the distribution), which most probability models do not describe well (Kriticos et al., 2013).
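Latin hypercube sampling, mentioned above, can be sketched with the standard library alone. This is a minimal illustration: the function names, the parameter ranges and the choice of two dimensions are assumptions made for the example, and production work would more likely rely on an established sampling library.

```python
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims, one per stratum per dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of n_samples equal-width strata.
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(column)  # break the pairing between dimensions
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def scale(points, bounds):
    """Map unit-cube points onto (low, high) ranges, one range per dimension."""
    return [tuple(lo + u * (hi - lo) for u, (lo, hi) in zip(p, bounds))
            for p in points]


rng = random.Random(7)  # fixed seed so the sketch is repeatable
unit_points = latin_hypercube(n_samples=10, n_dims=2, rng=rng)
# Hypothetical ranges for a growth rate and a diffusion coefficient.
design = scale(unit_points, bounds=[(0.1, 1.0), (0.5, 5.0)])
```

Each of the ten strata of every parameter is sampled exactly once, so ten model runs cover the full range of both inputs, which simple random sampling does not guarantee at that sample size.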

6.3.1.4 Bayesian networks (BNs)

BNs are probabilistic graphical models that combine prior distributions of input errors with general knowledge and site-specific data to yield an updated (posterior) set of distributions. BNs can simultaneously represent uncertainty in input data and response data, as well as in model parameter distributions, model code, structure and resolution (Clark, 2005; Clark & Gelfand, 2006). Developing a BN involves 1) defining a directed acyclic graph that specifies the conditional probability dependencies in the data, 2) defining prior probability distributions for all graph nodes (i.e. sources of uncertainty), and 3) defining a likelihood function and a sampling strategy (e.g. Markov chain Monte Carlo, MCMC) for deriving a posterior distribution from the prior distributions. Credal networks are regarded as an extension of BNs in which credal sets replace probability mass functions in the specification of the network variables (Cozman, 2000). A credal set is a set of probability distributions that represents uncertainty about the probability model that should be used. Credal networks therefore allow the representation and manipulation of uncertainty in graphical models where probability values may be imprecise or indeterminate.
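The first two steps of building a BN, and the update from prior to posterior, can be illustrated on a very small discrete network. The sketch below is an assumption-laden toy: the three nodes (habitat suitable, species established, species detected), their chain structure and all probabilities are hypothetical. Because the network is tiny, the posterior is obtained by exact enumeration rather than by the MCMC sampling a realistic model would need.

```python
from itertools import product

# Step 1: directed acyclic graph, given as each node's tuple of parents.
parents = {"suitable": (), "established": ("suitable",), "detected": ("established",)}

# Step 2: P(node is True | parent values); all numbers are hypothetical.
cpt = {
    "suitable": {(): 0.3},
    "established": {(True,): 0.6, (False,): 0.05},
    "detected": {(True,): 0.8, (False,): 0.1},
}


def joint(assign):
    """Joint probability of one complete True/False assignment."""
    p = 1.0
    for node, pars in parents.items():
        p_true = cpt[node][tuple(assign[q] for q in pars)]
        p *= p_true if assign[node] else 1.0 - p_true
    return p


def posterior(query, evidence):
    """P(query is True | evidence) by summing the joint over all assignments."""
    nodes = list(parents)
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(nodes)):
        assign = dict(zip(nodes, values))
        if any(assign[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the observed evidence
        p = joint(assign)
        denominator += p
        if assign[query]:
            numerator += p
    return numerator / denominator


p_suitable = posterior("suitable", {"detected": True})
```

With these illustrative numbers, a detection raises the probability that the habitat is suitable from the prior 0.3 to roughly 0.62.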

6.3.1.5 Spatial data analysis (SDA)

SDA refers to analytical, statistical and graphical procedures for evaluating and summarizing spatial input data. It typically comprises characterization of the spatial and temporal structure of input data. A key difficulty with spatial data is the presence of scale-dependent spatio-temporal correlation structures. Spatial and temporal autocorrelation can have a significant effect on the effective sample size by introducing redundancy (Getis, 2007). These issues become particularly important when datasets for validating models are drawn from the same area by sample splitting (Araújo et al., 2005), resulting in, for example, positive autocorrelation between sample units that can falsely reduce error levels. An overview of challenges arising from cross-scale analysis is provided by Fekete et al. (2010), and implications for the prioritization of intervention areas in the context of climate change can be found in Hagenlocher et al. (2014). The development of neutral landscape models by Gardner et al. (1987) and With et al. (1997) has also provided a new framework for generating replicated landscape patterns with partially controlled spatial properties. Neutral models allow hypothesis testing about how variation in spatial structure can affect model forecasts. As in error analysis, neutral models are used to generate alternative categorical landscapes, except that the generated spatial patterns do not represent deviations from a reference map.
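Spatial autocorrelation in a gridded input layer is commonly summarized with Moran's I. The sketch below is a minimal illustration, assuming a rectangular grid, binary rook (4-neighbour) weights and two hypothetical 4 x 4 habitat maps; it is not tied to any dataset or software used in the studies cited above.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid with binary rook (4-neighbour) weights.

    The grid must not be constant, otherwise the variance term is zero.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    cross = 0.0         # sum over neighbour links of (x_i - mean) * (x_j - mean)
    weight_total = 0.0  # number of directed neighbour links
    for i in range(n_rows):
        for j in range(n_cols):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                if 0 <= a < n_rows and 0 <= b < n_cols:
                    cross += (grid[i][j] - mean) * (grid[a][b] - mean)
                    weight_total += 1.0
    variance_sum = sum((v - mean) ** 2 for v in values)
    return (len(values) / weight_total) * (cross / variance_sum)


# Two hypothetical 4 x 4 habitat maps (1 = suitable, 0 = unsuitable).
clustered = [[0, 0, 1, 1] for _ in range(4)]
checkerboard = [[(i + j) % 2 for j in range(4)] for i in range(4)]
i_clustered = morans_i(clustered)
i_checkerboard = morans_i(checkerboard)
```

The clustered map gives a positive value (about 0.67) and the checkerboard gives -1, the two ends of the range a validation dataset might fall between. Strongly positive values are a warning that neighbouring sample units carry partly redundant information.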

6.3.1.6 Robustness analysis (RA)

As well as errors and uncertainty associated with input data, there are also epistemic uncertainties associated with model structure, in particular with respect to the choice of the functional structure of the model and the choice of variables. RA replaces the entire model, or individual submodel components, with a different representation (or construct) to identify how the model behaves under different functional forms (Beven & Binley, 1992; Weisberg, 2006). Assessing model uncertainty has become the subject of considerable attention within statistics (Burnham & Anderson, 2002; Johnson & Omland, 2004; Link & Barker, 2009; Lukacs et al., 2010) and is currently an area of rapid development for assessing stochastic spatial simulation models (Grimm et al., 2005; Hartig et al., 2011; Schurr et al., 2012; Wood, 2010; Thiele & Grimm, 2015).
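The idea of swapping one submodel for an alternative representation can be sketched as follows. This is a hypothetical example: it compares two standard density-dependent growth forms (discrete logistic and Ricker) inside the same simulation loop, with invented parameter values, and asks whether a summary output (the year in which half of carrying capacity is reached) is robust to that choice.

```python
import math


def logistic_step(n, r, k):
    """Discrete logistic growth submodel."""
    return n + r * n * (1.0 - n / k)


def ricker_step(n, r, k):
    """Ricker growth submodel, an alternative functional form."""
    return n * math.exp(r * (1.0 - n / k))


def run(step, n0, r, k, years):
    """Run the same simulation loop with whichever growth submodel is supplied."""
    trajectory = [n0]
    for _ in range(years):
        trajectory.append(step(trajectory[-1], r, k))
    return trajectory


def time_to_fraction(trajectory, k, fraction=0.5):
    """First year in which abundance reaches the given fraction of k."""
    for year, n in enumerate(trajectory):
        if n >= fraction * k:
            return year
    return None


# Hypothetical parameter values, for illustration only.
settings = dict(n0=1.0, r=0.5, k=1000.0, years=40)
logistic_traj = run(logistic_step, **settings)
ricker_traj = run(ricker_step, **settings)
t_logistic = time_to_fraction(logistic_traj, settings["k"])
t_ricker = time_to_fraction(ricker_traj, settings["k"])
```

Both forms converge to the same carrying capacity here, but they differ in how fast the population gets there. A forecast of final abundance would be robust to this representational choice, while a forecast of timing would not.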

Robustness analysis of stochastic models is particularly challenging, in part because their likelihood functions cannot usually be calculated explicitly. It is therefore difficult to couple such models to well-established statistical theory such as maximum likelihood or Bayesian statistics. A number of new methods, among them genetic programming, approximate Bayesian computation (ABC), Metropolis-Hastings Markov chain Monte Carlo (MCMC), pattern-oriented modelling (POM) and synthetic likelihood, bypass that limitation (Clark, 2005; Clark & Gelfand, 2006; Matott et al., 2009; Poli et al., 2008). These methods share three main principles: 1) aggregation of simulated and observed data via summary statistics, 2) likelihood approximation based on the summary statistics, and 3) efficient sampling. Bolker et al. (2009) and Hartig et al. (2011) provide thorough overviews of RA techniques.
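The three principles can be seen in the simplest member of this family, rejection ABC. The sketch below is a minimal illustration under stated assumptions: the stochastic model (exponentially distributed dispersal distances with mean `theta`), the uniform prior, the tolerance and the use of the sample mean as the only summary statistic are all hypothetical choices. Plain sampling from the prior stands in for the more efficient samplers used in practice.

```python
import random


def simulate(theta, n, rng):
    """Hypothetical stochastic model: n dispersal distances with mean theta."""
    return [rng.expovariate(1.0 / theta) for _ in range(n)]


def summary(data):
    """Principle 1: reduce a dataset to a summary statistic (here the mean)."""
    return sum(data) / len(data)


def abc_rejection(observed, prior_low, prior_high, n_draws, tol, rng):
    """Keep prior draws whose simulated summary lies within tol of the data."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)  # principle 3: sampling
        s_sim = summary(simulate(theta, len(observed), rng))
        # Principle 2: closeness of summaries replaces an explicit likelihood.
        if abs(s_sim - s_obs) <= tol:
            accepted.append(theta)
    return accepted


rng = random.Random(1)  # fixed seed so the sketch is repeatable
observed = simulate(2.0, 200, rng)  # synthetic "data" with a known mean of 2.0
posterior_sample = abc_rejection(observed, 0.5, 5.0, 2000, 0.2, rng)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The accepted values approximate the posterior distribution of `theta` without the likelihood ever being written down. A smaller tolerance gives a better approximation at the price of fewer accepted draws.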
