Complementing PB data with planned survey data data

6.2 Incorporating imperfect detection

6.2.4 Complementing PB data with planned survey data data

effects of imperfect detection in spatially correlated data, in a manner similar to how we dealt with these difficulties for the IPP model in Chapter 4. Here we investigate the possibility of improving the estimates of the model with imperfect detection by adding a small sample of the data having perfect detection. We will discuss the possibility of adding first PA data and then SO data, to the PB data. We first attempt to fit the Cox data of Fig. 6.1 (right panel) complemented with a smaller dataset with perfect detection using an integrated IPP model and then discuss fitting the data using an integrated LGCP model.

This section examines a simplified integrated model, that assumes there is small independent area (approximately 1/9th of the total study area, shown in Fig. 6.4), for which the survey detected all individuals present (i.e., detection was perfect). In our simulation this data provides the occupancy status (presence or absence) of each of the cells in this area (PA data).

The species occupancy is assumed to have the same intensity distribution λ(s) in the PB area (Fig. 6.1) (left panel) and in the perfect detection area (Fig. 6.4). The species presence points are then simulated in the perfect detection area as a Poisson process with intensity λ(s) from eqn.6.1. The detected presences in the PB data are distributed with the intensity λ(s)b(s) (right panel of Fig. 6.1).

The hierarchical model for the PA data with perfect detection can then be written out as:

CHAPTER 6: MOVING TOWARDS AN INTEGRATED MODEL WITH SPATIAL CORRELATION YP A ∼ P oisson(λ) log(λ) = β0x + ξ ξ ∼ N (0, K(l)) (6.9)

Using an integrated IPP model to fit LGCP simulated data

As the first step of our investigation, we fit an integrated IPP model that combines PB and PA data, as proposed inFithian et al.(2015). This model assumes that the occupancy is driven by a Poisson process with a log linear intensity λ(s).

YP A ∼ P oisson(λ)

YP B ∼ P oisson(λb)

log(λ(s)) = β0x log(b) = α0w

(6.10)

The results of 100 parameter estimation from fitting an integrated IPP model are presented in Fig. 6.5. The right panel of Fig. 6.5, which shows the estimated intensity function λ(s) obtained from the IPP model, clearly demonstrates the main downfall of using IPP modelling without explicitly accounting for spatial correlation actually present in the data. IPP is not capable of capturing the spatial clustering and can only recreate λ(s) that varies with the observed covariate x(s). The spatially correlated process ξ(s), and the clustering it generates is totally ignored.

This result confirms the need for the integrated LGCP model where eqns. 6.3 and 6.9 are fit simultaneously. The fitting of this model would be the next step towards developing an integrated model that is able to account for imperfect detection and spatial correlation at the same time. There are a range of significant issues that need to be solved to be able to successfully fit such a model in practice.

6.3 Discussion

When creating an integrated LGCP model, SO data is a good candidate for complementing the PB data. (In the last section we complemented the PB data with PA data.) With SO data the presence or absence of the species is recorded in multiple repeated surveys (see Chapter2). This data can be successfully fitted using the SO model presented inMacKenzie et al.(2002) andTyre et al.(2003), and is discussed in more detail in Chapter 4where it was used for creating an integrated IPP model. The model has been widely used to analyse SO data collected in repeated surveys of the same set of sites. It’s main advantage is that it allows explicit modelling

SECTION 6.3: DISCUSSION

Figure 6.5: Estimated coefficients β0 and β1 obtained from fitting an integrated

IPP model. Right panel: map of the predicted occupancy intensity λ(s) from the estimated coefficients β0 and β1.

of the detection process during SO data collection. It is assumed that the site i is occupied by the species with the probability ψi (i = 1, . . . , M ), and the occupancy

is then modelled with a Bernoulli distribution with probability ψi.

To integrate the LGCP and the SO models we first have to ‘translate’ the occupancy intensity λ(s) in the LGCP, that is defined in the continuous space to the discrete probability of occupancy ψi at each site that is used in the SO model.

Since the LGCP is a Poisson process conditional on the random field λ(s), it is possible to exploit the properties of the Poisson Process to connect the SO model. The number of individuals present at site i, N (Ci), has a Poisson distribution that

depends on the intensity function λ(s) as follows: N (Ci) ∼ P oisson(µ(Ci)), where

µ(Ci) =

Ciλ(s) ds. This distribution for N (Ci) provides the basis for defining the

probability of occupancy for site Ci as follows:

ψi= P r(N (Ci) > 0) = 1 − exp(−

λ(s)ds), (6.11)

where the occurrence probability ψi increases with the area of site Ci.

Fitting this model has proven computationally difficult due to the limitations in statistical R packages. Future work will require overcoming this serious difficulty. The next step towards developing an integrated model with spatial correlation would be to use greta to fit both of the hierarchical models eqns. 6.3 and 6.9 together. This model will be complex as it will depend on a several random factors, for example, the locations of the survey areas and their relative distance to the presence-only observations. The model requires extensive investigation, and will be a promising direction for future research. This model fitting requires further investigation and requires finding an appropriate computational solution for incorporating SO model into LGCP fitting framework, that can incorporate spatial correlation while modelling imperfect detection directly.

CHAPTER 6: MOVING TOWARDS AN INTEGRATED MODEL WITH SPATIAL CORRELATION

One of the main assumptions of SDMs, is that the species distribution remains static during the data collection, meaning that the species do not move from area to area during the data collection process. This assumption is present in the LGCP implementation of SDMs, and while this assumption will likely hold for low-mobility species such as plants, and fauna with limited home ranges and/or dispersal capa- bility, it is violated for species, such as large mammals or birds with large home ranges and long dispersal distances.

In this chapter we investigated the possibility of accounting for imperfect detection in spatially correlated data by utilising a thinned LGCP model. I then demonstrated that an integrated IPP model cannot be used for modelling this type of spatially correlated data. Finally I discussed the possible options for devising an integrated LGCP model.

Part II

CHAPTER

7

Combining multiple data sources to

model species extinction

7.1 Overview of methods for inferring species extinction

from sighting data

In biology, the term extinction means the termination of a group of organisms, usually a species. The moment extinction occurs may be defined as the time when the last individual of the species has died. In ecology, extinction is often used to refer to local extinction which means that the species ceased to exist in a specific area while still existing somewhere else. This leads to the creation of the term extinct in the wild, when the species does not have any living members in the wild while some individuals are alive in zoos or other artificial habitats. Knowing whether a species is extinct is necessary for prioritisation of conservation strategies and allocating resources for environmental monitoring. Observing extinction directly is almost impossible (Diamond 1987). According to the definitions ofIUCN(2001), a taxon is presumed extinct when an exhaustive surveys of the known or expected habitat, that was conducted at appropriate times(diurnal, seasonal, annual), have failed to record an individual. It is important to keep in mind that such surveys can be expensive and take a very long to complete. There are, however, multiple methods for estimating the probability of species extinction from prior information. These estimations are also used to measure biodiversity conservation effectiveness, for reporting on the state of the environment and determining whether an invasive species eradication program is successful (Rout et al. 2009;Fuller et al. 2013;Pimm et al. 2014;Tittensor et al. 2014).

CHAPTER 7: COMBINING MULTIPLE DATA SOURCES TO MODEL SPECIES EXTINCTION

torical records of the species and targeted surveys. Records (or observations) usually include time and place when the species was observed. These records can have different certainty. For example, if a specimen was collected during the sighting and an expert confirmed the taxon, the record has an extremely high certainty, while if the observation was based on a blurry photograph, or the bird sound heard in a noisy location, it is considered uncertain. Ignoring the distinction between certain and uncertain observations can lead to severe errors in conservation practice and waste limited resources (McKelvey et al. 2008). Careful assessments of observations are even more critical for endangered species that are rarely observed. Each sighting is a rare event and can have a significant influence on how conservation measures are applied.

The second source of information is targeted surveys when the species was not found. In this case, the effort put into the survey is recorded usually in the form of the area that was surveyed.

There has always been a need for a quantitative methodologies that estimate whether a taxon has gone extinct or whether it remains extant. The idea of using sighting records to estimate extinction probability within the field of conservation biology was introduced bySolow(1993). In this method, the records of the species in some observation period are assumed to represent a realisation of a Poisson Process. This is used as a basis for a formal test of the hypothesis of species extinction. The main drawback of this model is the assumption that both the population size and the probability of observation are stable over the observation period. Others have subsequently adapted these ideas (Burgman et al. 1995). McCarthy (1998) modified Solow’s (1993) model accounting for changes in the range and abundance of the species. Rout et al. (2009) used the number of surveys as a record unit in Solow’s (1993) model. The basic idea of all these methods is that the species is less likely to be extinct the more recently it has been observed. The use of null-hypothesis testing assumes the understanding of the sampling distribution and rate at which records are made during the observation period. However, this rate can change, for example, because of changes in species population size or changes in survey effort.

Solow and Roberts(2003) proposed a non-parametric approach that makes minimal assumptions about the rate allowing varying search effort. Roberts et al. (2010) model included uncertain records, without distinguishing them from certain, and conducted a sensitivity analysis showing that uncertainty has a significant impact on the estimates of extinction probability. Solow et al. (2012) divided the sighting period into segments with certain and uncertain observations. The model proposed by Lee et al.(2014) assumes that certain sightings and uncertain sightings occur at different rates. The model proposed by Jari´c and Roberts (2014) accounts for the

SECTION 7.1: OVERVIEW OF METHODS FOR INFERRING SPECIES EXTINCTION FROM SIGHTING DATA

reliability of individual observations by assigning each record a probability of being correct.

However, these approaches do not incorporate information on the adequacy of surveys and searches for the taxon (as specified by IUCN (2001)). There is a need to account for detectability, accessibility of habitat, timing, duration, extensiveness, sampling intensity, survey methods and observer skill (Clements et al. 2013;2014).

Thompson et al.(2013) developed a single framework that accommodates certain and uncertain observations, together with information from targeted surveys, but the approach is theoretical and unfortunately not easy to apply in practice.

Lee (2014) published a simplified version of the model, but this requires parameter estimates that are not readily available in practice, and it makes assumptions that are not in accord with many field data , asThompson et al.(2013) points out. Also, in some instances that can arise in practice, Lee’s (2014) assumptions and model design contain some inconsistencies that can lead to model probabilities that exceed unity. Caley and Barry (2014), presented a model that utilises a Bayesian framework and accounts for some of the uncertainty by specifying prior distributions. The model also introduces varying and density-dependent detection probabilities. How- ever, these authors found that the specification of priors can be difficult especially when the data is sparse and not particularly informative.

The next section presents a paper ‘Inferring extinctions II: A practical, iterative model based on records and surveys.’ which was published in Biological Conservation in 2017.

CHAPTER 7: COMBINING MULTIPLE DATA SOURCES TO MODEL SPECIES EXTINCTION

7.2 Inferring extinctions II: A practical, iterative model

based on records and surveys

This section has been published, and has the following citation:

Thompson, C. J., Koshkina, V., Burgman, M. A., Butchart, S. H., and Stone, L. (2017). Inferring extinctions II: A practical, iterative model based on records and surveys. Biological Conservation, 214:328–335 DOI: 10.1016/j.biocon. 2017.07.029

Abstract

Extinctions are difficult to observe. Estimating the probability that a taxon has gone extinct using data from the field aids prioritisation of conservation interventions and environmental monitoring. There have been recent advances in approaches to estimating this probability from records. However, complete assessment requires consideration of the type, timing and certainty of records, the timing, scope and severity of threats, and the timing, extent and reliability of surveys.

Until recently, no single method could integrate these different sources and qualities of data into a single measure. Here we describe a new, acces- sible method for estimating the probability that a taxon is extinct based on different kinds of both record and survey data, and accounting for data qual- ity. The model takes into account uncertainties in input parameter estimates and provides bounds on estimates of the extinction probability. We illustrate application of the model using information for the Alaotra Grebe (Tachybap- tus rufolavatus). Application of this approach should facilitate more efficient allocation of conservation resources by enabling scenario analyses that inform investments in searches and management interventions for potentially extinct taxa. It should also provide more reliable estimates of recent extinction rates.

In document Stochastic ecological models for predicting species distribution and extinction (Page 93-102)