Probability theory and simulation concepts

appropriateness of specific definitions for particular aims.

2.4 Probability theory and simulation concepts

Two key probability theory ideas are applied several times in the thesis, and these are now introduced. The first of these is the Gillespie algorithm; the second is the quasi-stationary distribution.

2.4.1 Gillespie algorithm

Much of the work in this thesis is concerned with stochastic simulation in continuous time. These simulations rely on the Gillespie algorithm and variants of it, as applied to spatially explicit systems. More specifically, they all use variants of the Direct Method described in Gillespie (1977), used to generate a statistically correct trajectory of a stochastic equation. Based in probability theory, this algorithm was originally made popular in its use for the simulation of chemical and biochemical reactions, and there remains a strong interest in its use and improvements in the biochemical and chemical literature (Sanassy et al., 2014); its use is also common in the ecological literature (Black and McKane, 2012).

The simulation algorithm employs the average rates of reactions or events that can take place in the system. In the context of an ecological system, these might be birth or death events; in the context of metapopulations, they are colonisation and extinction events. The algorithm uses properties of the exponential distribution which allow us to sample separately the time until the next event and the event that occurs.

We denote the events that could occur (e.g. births and deaths) by $E_m$, $m = 1, \dots, M$, where $M$ is the total number of events. If organisms have the same event rates, the number of organisms that could undergo each event is denoted $h_m$, while the rate at which each organism undergoes this event is denoted $c_m$. Note that if organisms have different event rates, either because they have different inherent rates or because of ecological effects such as overcrowding, then we typically allocate a separate birth and death rate to each organism and $h_m = 1$ for all events.

We sum over the $M$ events to calculate the total event rate $E_0 = \sum_{m=1}^{M} c_m h_m$. In the algorithms below, $E_m$ also denotes the rate $c_m h_m$ of event $m$.
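As a small worked example with hypothetical numbers (not taken from the thesis), suppose 10 organisms share a per-organism birth rate of 0.5 and a per-organism death rate of 0.2, so that $M = 2$, $h_1 = h_2 = 10$, $c_1 = 0.5$ and $c_2 = 0.2$. The total event rate is then:

```latex
E_0 = \sum_{m=1}^{2} c_m h_m = (0.5)(10) + (0.2)(10) = 5 + 2 = 7
```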

The algorithm is described procedurally in Algorithm 2.1. The first step is to sample the time until the next event, $\tau$. This follows an exponential distribution with rate parameter $E_0$, and can be sampled directly from this distribution⁵. The next step is to decide which event should occur. For this, we need to work out the probabilities of the different events, which we do by dividing each event rate by the total rate. We then select an event probabilistically, in proportion to its probability. We can think of the event probabilities as covering the unit interval: we generate a random number $r_2$ on the interval $[0, 1)$ and use it to choose the event $\nu$ that spans the section of the interval containing $r_2$.

⁵ This can be achieved by drawing a number $r_1$ from the uniform distribution on the interval $(0, 1)$ and calculating $\tau = (1/E_0)\ln(1/r_1)$.
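The two sampling steps described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the thesis; the function name `sample_next_event` and its arguments are assumptions made for the example, and only the standard library is used.

```python
import math
import random


def sample_next_event(rates, rng):
    """Sample (tau, index) for one step of the direct method.

    rates: list of non-negative event rates E_m (hypothetical input).
    rng:   a random.Random instance.
    """
    total = sum(rates)  # total event rate E_0
    if total <= 0:
        raise ValueError("no event can occur: total rate is zero")

    # Time to next event: tau = (1/E_0) * ln(1/r_1).
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and the logarithm is always defined.
    r1 = 1.0 - rng.random()
    tau = math.log(1.0 / r1) / total

    # Event choice: find the sub-interval of [0, 1) that contains r_2.
    # Comparing r_2 * E_0 with cumulative rates avoids dividing each rate.
    r2 = rng.random()
    threshold = r2 * total
    cumulative = 0.0
    for index, rate in enumerate(rates):
        cumulative += rate
        if threshold < cumulative:
            return tau, index

    # Guard against floating-point round-off in the cumulative sum:
    # fall back to the last event with a positive rate.
    for index in range(len(rates) - 1, -1, -1):
        if rates[index] > 0:
            return tau, index
```

Because the comparison is strict, an event with rate zero occupies an empty sub-interval and is never selected.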

Algorithm 2.1 Gillespie algorithm, direct method
1: while $t < t_{\max}$ do
2:     Compute event rates $E_m$ and total event rate $E_0$
3:     Generate $r_1, r_2 \sim U(0, 1)$
4:     Compute time to next event $\tau = (1/E_0)\ln(1/r_1)$
5:     Update time: $t \leftarrow t + \tau$
6:     Choose event $E_\nu$ for which $\sum_{m=1}^{\nu-1} E_m/E_0 < r_2 \leq \sum_{m=1}^{\nu} E_m/E_0$
7:     Update system state: perform event $E_\nu$
8: end while
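As an illustration of the direct method, the sketch below applies it to a linear birth-death process. The model, the function name `gillespie_direct` and the parameter names are assumptions chosen for the example and are not the models simulated in the thesis.

```python
import math
import random


def gillespie_direct(n0, birth, death, t_max, seed=0):
    """Direct-method simulation of a linear birth-death process.

    n0:    initial number of organisms (plays the role of h_m)
    birth: per-organism birth rate (c_1, hypothetical)
    death: per-organism death rate (c_2, hypothetical)
    Returns the list of (time, population) pairs visited.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    path = [(t, n)]
    while t < t_max:
        rates = [birth * n, death * n]   # event rates E_1, E_2
        total = sum(rates)               # total event rate E_0
        if total == 0.0:
            break                        # extinction: no further events
        r1 = 1.0 - rng.random()          # in (0, 1], so log(1/r1) is defined
        r2 = rng.random()                # in [0, 1)
        t += math.log(1.0 / r1) / total  # update time by tau
        # Choose the event whose sub-interval contains r2, then perform it.
        n += 1 if r2 * total < rates[0] else -1
        path.append((t, n))
    return path


if __name__ == "__main__":
    for time, size in gillespie_direct(10, 1.0, 1.0, 2.0, seed=42):
        print(f"{time:.3f} {size}")
```

As in the pseudocode, the loop condition is tested before each event, so the final recorded event may fall slightly after `t_max`. Rates are recomputed after every event because both depend on the current population size.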

More computationally efficient algorithms have been developed; these are reviewed in Sanassy et al. (2014). For example, although the original algorithm used a linear search to select the next event (line 6), improvements use more efficient search algorithms with better scaling properties. In addition to these improvements to the exact algorithm, approximate methods have been developed. The most commonly used approximation is tau-leaping (or $\tau$-leaping; Gillespie, 2001). In this approximation, instead of simulating each event individually and updating rates after each event, we simulate together all of the events expected to take place within the time interval $[t, t + \tau)$. The process is shown in Algorithm 2.2. Because the number $K_m$ is unbounded, it is necessary to check that unrealistic (or negative) values are not reached before conducting the updating step. The approximation is appropriate when the state of the system does not change too much during the time interval $\tau$. A method for efficient step size selection is explained in Cao et al. (2006).

Algorithm 2.2 Gillespie algorithm, tau-leaping approximation
1: while $t < t_{\max}$ do
2:     Compute event rates $E_m$
3:     Choose a time step $\tau$
4:     For each event $E_m$, generate an event count $K_m \sim \mathrm{Poisson}(E_m \tau)$
5:     Update system state: perform each event $E_m$ a number $K_m$ times
6:     Update time: $t \leftarrow t + \tau$
7: end while
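A minimal sketch of tau-leaping for a linear birth-death process is given below. It assumes a fixed step size rather than the adaptive selection of Cao et al. (2006), and it handles unrealistic values by truncating the population at zero; that is one simple choice made for this example, and rejecting the leap and retrying with a smaller step is a common alternative. The helper `poisson` uses Knuth's multiplication method, which is adequate only for small means.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate (Knuth's method; fine for small lam)."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def tau_leap(n0, birth, death, tau, t_max, seed=0):
    """Fixed-step tau-leaping for a linear birth-death process.

    All parameter names are hypothetical. Returns (time, population) pairs.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    path = [(t, n)]
    while t < t_max:
        rates = [birth * n, death * n]          # event rates E_1, E_2
        k_birth = poisson(rates[0] * tau, rng)  # K_1 ~ Poisson(E_1 * tau)
        k_death = poisson(rates[1] * tau, rng)  # K_2 ~ Poisson(E_2 * tau)
        # Check before updating: K_2 is unbounded, so the leap could take
        # the population below zero. Here it is truncated at zero.
        n = max(n + k_birth - k_death, 0)
        t += tau
        path.append((t, n))
    return path


if __name__ == "__main__":
    print(tau_leap(50, 1.0, 1.0, 0.25, 2.0, seed=1))
```

Unlike the direct method, the rates are held fixed during each leap, which is why the approximation degrades when the state changes substantially within one step.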


2.4.2 The quasi-stationary distribution

The concept of the quasi-stationary distribution is usually invoked in systems modelled as Markov chains with at least one absorbing state. In the context of population dynamics, a Markov chain might be used to model the number of organisms in a system, and the absorbing state would correspond to the extinction state of the species. Intuitively, the quasi-stationary distribution describes the long-run distribution of system states (e.g. patch occupancy states or number of individuals present), prior to extinction.

Pollett (2012) explains that the idea of the quasi-stationary distribution (or conditional limiting distribution) can be traced back to Yaglom (1947). However, he notes that the idea has deeper roots in ecology and evolution: Wright (1931) had already referred to a limiting conditional distribution of gene frequencies. A very early use of the term quasi-stationarity can also be found in the epidemiological literature in work by Bartlett (1956, 1957b), who referred to ‘an effective or quasi-stationarity’ (Bartlett, 1957b, p.38) and who, according to Pollett (2012), coined the term quasi-stationary distribution in his 1960 work (Bartlett, 1960). A formal definition and general theory became available in the early 1960s (Pollett, 2012), with a continuous-time version of the theory developed in Darroch and Seneta (1967).

We now define the QSD more formally, employing a combination of the notation used in Artalejo (2012) and de Oliveira and Dickman (2005). Specifically, we define a regular time-homogeneous Markov process $X = \{X(t); t \geq 0\}$ on a countable state space $S$ (in the finite case, this is of dimension $m$). The state space $S$ consists of a set $S_T$ of transient states among which the process evolves until it hits a set of one or more absorbing states $S_A$; thus $S = S_T \cup S_A$. Without loss of generality, the states can be labelled such that $S_A = \{0\}$, and in the finite state case, the transient states are labelled $S_T = \{1, 2, \dots, m\}$. In other words, the Markov process takes the values $\sigma = 0, 1, 2, \dots, m$, with the state $\sigma = 0$ absorbing.

In a first definition, we allow time to tend to infinity and consider the probability of the chain being in a particular state, conditioned on non-extinction. Note that the distribution defined in this way is also referred to as the quasi-limiting distribution (QLD) (see e.g. Méléard and Villemonais, 2012). We use $p_\sigma(t)$ to denote the probability that $X(t) = \sigma$ for some $\sigma$ in the transient set, $\sigma \in S_T$, given some particular initial state $X(0)$, i.e. $p_\sigma(t) = P\{X(t) = \sigma \in S_T \mid X(0) = j \in S_T\}$. The survival probability $P_T(t) = \sum_{\sigma \geq 1} p_\sigma(t) = 1 - p_0(t)$ is the probability that the process has not become trapped in the absorbing state by time $t$ (i.e. is still in the transient set). Allowing $t$ to tend to infinity, the probability distribution given by the vector $\pi$ with components $\pi_\sigma = \lim_{t \to \infty} p_\sigma(t) / P_T(t)$, if it exists, is a QSD.
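The limit in the first definition can be illustrated numerically. The sketch below integrates the forward equation $dp/dt = pQ$ with a simple forward Euler scheme for a hypothetical four-state chain (state 0 absorbing), then conditions on non-extinction. The chain, the rates in `Q_EXAMPLE` and the function name are assumptions for the example, and forward Euler is used only because it needs no external libraries.

```python
def conditional_distribution(Q, p0, t_end, dt=1e-3):
    """Integrate dp/dt = p Q by forward Euler and condition on survival.

    Q:  generator matrix as a list of rows, with state 0 absorbing
    p0: initial distribution over all states (row vector)
    Returns (P_T(t_end), [p_sigma(t_end) / P_T(t_end) for sigma >= 1]).
    """
    p = list(p0)
    n = len(p)
    for _ in range(round(t_end / dt)):
        flow = [sum(p[r] * Q[r][s] for r in range(n)) for s in range(n)]
        p = [p[s] + dt * flow[s] for s in range(n)]
    survival = sum(p[1:])  # P_T(t) = 1 - p_0(t)
    return survival, [x / survival for x in p[1:]]


# Hypothetical four-state chain: state 0 is absorbing, states 1-3 are transient.
Q_EXAMPLE = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, -3.0, 2.0, 0.0],
    [0.0, 2.0, -6.0, 4.0],
    [0.0, 0.0, 3.0, -3.0],
]

if __name__ == "__main__":
    for t_end in (1.0, 5.0, 10.0):
        survival, dist = conditional_distribution(Q_EXAMPLE, [0, 1, 0, 0], t_end)
        print(t_end, round(survival, 4), [round(x, 4) for x in dist])
```

In this example the survival probability keeps decaying towards zero while the conditioned distribution settles to a fixed vector that does not depend on the initial transient state.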

In a second definition, we start with a probability distribution for the states of the Markov chain and run the chain forwards. The initial probability distribution is a quasi-stationary distribution if, for all future time until extinction, the probability distribution of states in which the chain finds itself is unchanged (i.e. is time independent). Formally, we denote the time to extinction by $T = \inf\{t > 0 \mid X(t) \in S_A, X(0) \in S_T\}$, corresponding to the earliest time at which the process enters the absorbing state. Suppose that $a = \{a_\sigma; \sigma \in S_T\}$ denotes a probability distribution over the transient states defined by $a_\sigma = P\{X(t) = \sigma\}$. Now, if there exists an initial distribution $\pi$ defined as $\pi_\sigma = P\{X(0) = \sigma\}$, such that for all transient states $\sigma \in S_T$ and for all future times $t \geq 0$ until extinction at time $T$, $P\{X(t) = \sigma \mid T > t\} = \pi_\sigma$, then $\pi$ is called a quasi-stationary distribution.
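For a finite chain in continuous time, the second definition can be connected to an eigenvector condition by a short derivation. The sketch below is the standard argument, not a result quoted from the thesis. The symbols $A$ (the generator restricted to the transient states), $\mathbf{1}$ (a column vector of ones) and $\theta$ are notation assumed for this example, and $q_{\sigma 0}$ denotes the transition rate from state $\sigma$ to the absorbing state.

```latex
% Started from \pi, the distribution over transient states at time t is \pi e^{At},
% and the survival probability is \pi e^{At}\mathbf{1}. Quasi-stationarity requires
\frac{\pi e^{At}}{\pi e^{At}\mathbf{1}} = \pi \quad \text{for all } t \geq 0 .
% Differentiating at t = 0, using \pi\mathbf{1} = 1:
\pi A - (\pi A \mathbf{1})\,\pi = 0 .
% Rows of the full generator sum to zero, so (A\mathbf{1})_\sigma = -q_{\sigma 0}, and
\pi A = -\theta\,\pi , \qquad \theta = \sum_{\sigma \in S_T} \pi_\sigma\, q_{\sigma 0} \geq 0 .
% Conversely, \pi A = -\theta\pi gives \pi e^{At} = e^{-\theta t}\pi, hence
P\{T > t\} = e^{-\theta t}, \qquad P\{X(t) = \sigma \mid T > t\} = \pi_\sigma .
```

Under these assumptions, a quasi-stationary distribution is a left eigenvector of the transient part of the generator, and the time to extinction started from it is exponentially distributed with rate $\theta$.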

In the case where the state space is finite, the two definitions are equivalent (Vere-Jones, 1969; Méléard and Villemonais, 2012). Furthermore, when the chain is finite and irreducible (i.e. every transient state can be reached from every other transient state), the existence of a unique QSD is guaranteed (Darroch and Seneta, 1967). These conditions are fulfilled by the models considered in this thesis.

Computing the QSD

The standard approach for obtaining the QSD is based on eigenvector analysis. This approach is provably exact up to the limits of numerical approximation in obtaining eigenvalues. The QSD is found numerically from the appropriate generator matrix (see Nasell, 2001b, for a clear explanation of why this is the case). For the full chain (including extinction) in the discrete-time case, the matrix $Q$ is the matrix of state transition probabilities, in which $q_{rs}$ is the transition probability from state $r$ to state $s$. In the continuous-time case, the transition rate matrix or (infinitesimal) generator matrix $Q$ is based upon the transition rates: off-diagonal elements $q_{rs}$ are given by the transition rates from $r$ to $s$, while diagonal elements are defined such that row sums are zero, i.e. $q_{rr} = -\sum_{s \neq r} q_{rs}$.
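The construction of a continuous-time generator matrix, and one way of extracting a QSD from it, can be sketched as follows. This is an illustrative example under stated assumptions. The rates are hypothetical. State 0 is taken to be the only absorbing state. The QSD is taken to be the normalised left eigenvector, for the eigenvalue with largest real part, of the block of $Q$ restricted to the transient states. Power iteration on a uniformised matrix is used here in place of a full eigen-solver, purely to avoid external libraries.

```python
def build_generator(rates, n_states):
    """Build a continuous-time generator matrix Q as a list of rows.

    rates: dict mapping (r, s), r != s, to the transition rate q_rs.
    Diagonal entries are set so that every row sums to zero.
    """
    Q = [[0.0] * n_states for _ in range(n_states)]
    for (r, s), rate in rates.items():
        if r == s or rate < 0:
            raise ValueError("rates must be non-negative and off-diagonal")
        Q[r][s] = float(rate)
    for r in range(n_states):
        Q[r][r] = -sum(Q[r][s] for s in range(n_states) if s != r)
    return Q


def qsd_power_iteration(Q, tol=1e-12, max_iter=100_000):
    """Approximate the QSD of a chain whose only absorbing state is 0."""
    A = [row[1:] for row in Q[1:]]                   # transient block of Q
    m = len(A)
    scale = 1.1 * max(-A[i][i] for i in range(m))    # uniformisation rate
    pi = [1.0 / m] * m
    for _ in range(max_iter):
        # One step of pi <- pi (I + A / scale), followed by renormalisation.
        new = [pi[s] + sum(pi[r] * A[r][s] for r in range(m)) / scale
               for s in range(m)]
        total = sum(new)
        new = [x / total for x in new]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("power iteration did not converge")


# Hypothetical rates for a chain with states 0 (absorbing), 1, 2 and 3.
EXAMPLE_RATES = {(1, 0): 1.0, (1, 2): 2.0, (2, 1): 2.0, (2, 3): 4.0, (3, 2): 3.0}

if __name__ == "__main__":
    Q = build_generator(EXAMPLE_RATES, 4)
    print([round(x, 4) for x in qsd_power_iteration(Q)])
```

Choosing the uniformisation rate slightly above the largest exit rate keeps every diagonal entry of the iterated matrix positive, which avoids periodic behaviour in the iteration.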

For the SRLM, defining $\nu = 2^n$ for notational simplicity, $Q$ has dimension $\nu \times \nu$. The generator matrix for the continuous case where the zero state is absorbing (rates out of this state are zero) is

$$
Q = \begin{pmatrix}
-\sum_{s \neq 0} q_{0s} & 0 & 0 & \cdots & 0 \\
q_{10} & -\sum_{s \neq 1} q_{1s} & q_{12} & \cdots & q_{1\nu} \\
q_{20} & q_{21} & -\sum_{s \neq 2} q_{2s} & \cdots & q_{2\nu} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
q_{\nu 0} & q_{\nu 1} & q_{\nu 2} & \cdots & -\sum_{s \neq \nu} q_{\nu s}
\end{pmatrix}
$$

In order to obtain the quasi-stationary distribution, we employ the sub-matrix of the generator matrix corresponding to the transient states $S_T$. The sub-generator matrix $Q_{S_T}$ is obtained from the full generator matrix by removing the row and column corresponding to transitions

In document Modelling persistence in spatially-explicit ecological and epidemiological systems (Page 43-47)