Obtaining future population estimates - Bayesian modelling of integrated data and its applicati

4.2 Methods

4.2.3 Obtaining future population estimates

We describe two implementational approaches to obtaining posterior samples of future population sizes. In the first, we generate future survival and productivity rates and update the predicted population states within the MCMC algorithm (within-chain predictions). The second method initially obtains a posterior sample of the parameter values, excluding the future states; then, given the set of posterior values, we impute future parameter values for each individual posterior sample independently (independent post hoc predictions).

Within-chain predictions

At each iteration of the chain, we first update all historical parameter values and states for times $t = 1, \dots, T$ or $T-1$ (depending on the parameter) using a Metropolis-Hastings random walk algorithm with a uniform proposal density, as described in Section 1.3.3. Using the same algorithm we then propose updates to, for example, the year-specific random effects on productivity for $t = T+1, \dots, T+P$, where $P = 10$ is the number of years to be predicted. Because there are no data for this period, the acceptance probabilities for proposed moves depend only on the process likelihood (equation (3.6)) and the normal 'prior' on the random effects, with variance $\sigma^2_\rho$ (equation (4.2)). With the updated random effects and the current value of $\mu_\rho$, we calculate future productivity rates $\rho_t$ by the inverse-logit of equation (4.1). Similarly, we generate future first-year and adult survival rates for $t = T, \dots, T+P$ by updating the corresponding first-year and adult survival random effects for those years.

Given the future demographic rates, we update the corresponding future population sizes of female breeders $Y_t$ and $Z_t$ (respectively, continuing female breeders, equation (3.2), and new female recruits, equation (3.3); total breeders $X_t = Y_t + Z_t$) and female prebreeders $J_t$ (equation (3.4)). Only the process model is involved in formulating future $J$ and $X$ values.
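As a minimal sketch of one within-chain update, the Python below performs random-walk Metropolis-Hastings steps on a single future productivity random effect, using a symmetric uniform proposal. It is an illustration under stated assumptions rather than the thesis code: the names (`eps`, `half_width`, `log_target`), the binomial prebreeder term standing in for the process likelihood, and all numerical values are hypothetical, and the population states `j_t` and `x_t` are held fixed here although the full algorithm updates them too.

```python
import math
import random


def expit(x):
    """Inverse-logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of k successes (assumes 0 < p < 1)."""
    if not 0 <= k <= n:
        return -math.inf
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def log_target(eps, mu_rho, sigma_rho, j_t, x_t, phi_star):
    """Normal 'prior' on the random effect plus a binomial process term."""
    log_prior = -0.5 * (eps / sigma_rho) ** 2  # N(0, sigma_rho^2) up to a constant
    rho_t = expit(mu_rho + eps)                # productivity implied by the random effect
    return log_prior + log_binom_pmf(j_t, x_t, rho_t * phi_star / 2.0)


def mh_step(eps, half_width, rng, **model):
    """One random-walk Metropolis-Hastings step with a uniform proposal."""
    proposal = eps + rng.uniform(-half_width, half_width)
    log_ratio = log_target(proposal, **model) - log_target(eps, **model)
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return proposal, True
    return eps, False


rng = random.Random(1)
# Hypothetical current values of the other quantities in the chain.
model = dict(mu_rho=0.4, sigma_rho=0.5, j_t=150, x_t=1000, phi_star=0.5)
eps, accepted = 0.0, 0
for _ in range(2000):
    eps, moved = mh_step(eps, 0.3, rng, **model)
    accepted += moved
print(f"acceptance rate: {accepted / 2000:.2f}, final eps: {eps:.3f}")
```

Because no count data enter `log_target`, the accepted values are driven entirely by the prior and the current population states, which is why the future rates and states become so tightly coupled in this scheme.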

The within-chain method is the more intuitive and elegant means of obtaining posterior samples of predicted population sizes. However, because the acceptance probability of any proposed update of future population sizes depends on the values of the future demographic parameters at the current iteration, and vice versa, the imputed predicted values are necessarily highly correlated. Correspondingly, proposal distributions need to be narrow and mixing of the MCMC chain is poor. Given the large degree of uncertainty in population sizes towards the end of the prediction period, this method proved impractical for obtaining suitable samples from the posterior: even after 10 million iterations (two weeks of computing time) the chain did not appear to have converged and the Monte Carlo error was unacceptably large.

Independent post hoc predictions

As an alternative to the above approach, we initially run the Bayesian integrated model to obtain a posterior sample of parameter values and population sizes for the historical period $t = 1, \dots, T$ only. Then, for each posterior sample $i$ from the MCMC chain ($i = 1, \dots, S$, where $S$ is the number of (thinned) post-burn-in samples) we first simulate new future productivity, first-year survival and adult survival rates by sampling from

\begin{align}
\operatorname{logit}\bigl(\rho_t^{(i)}\bigr) &\sim N\bigl(\mu_\rho^{(i)}, \sigma_\rho^{2(i)}\bigr), & t &= T+1, \dots, T+P, \tag{4.3a} \\
\operatorname{logit}\bigl(\phi_{0,t}^{(i)}\bigr) &\sim N\bigl(\mu_0^{(i)}, \sigma_0^{2(i)}\bigr), & t &= T, \dots, T+P, \tag{4.3b} \\
\operatorname{logit}\bigl(\phi_{a,t}^{(i)}\bigr) &\sim N\bigl(\mu_a^{(i)}, \sigma_a^{2(i)}\bigr), & t &= T, \dots, T+P. \tag{4.3c}
\end{align}

The normal distribution parameters in equations (4.3) are the underlying means and variances from the random effects models, with superscript '$(i)$' denoting the $i$th sample from the posterior distribution. Future values for the time-constant survival rates $\phi_1$, $\phi_2$ and $\phi_3$, and the fidelity rate $\psi$, simply take their values as at the $i$th iteration.
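The sampling step in equations (4.3) can be sketched in a few lines of Python for a single posterior draw. The function name `sample_future_rates` and the numerical means and standard deviations are hypothetical placeholders, not values from the study. Note that the function takes a standard deviation, whereas the equations are written in terms of the variance.

```python
import math
import random


def expit(x):
    """Inverse-logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_future_rates(mu, sigma, n_years, rng):
    """Draw n_years rates whose logits are N(mu, sigma^2); sigma is a standard deviation."""
    return [expit(rng.gauss(mu, sigma)) for _ in range(n_years)]


rng = random.Random(42)
P = 10  # prediction horizon, as in the text

# Hypothetical (mean, standard deviation) pairs on the logit scale,
# standing in for one posterior draw i of the random-effects parameters.
draw = {"rho": (0.4, 0.5), "phi0": (0.0, 0.6), "phia": (2.0, 0.3)}

rho_future = sample_future_rates(*draw["rho"], P, rng)        # t = T+1, ..., T+P
phi0_future = sample_future_rates(*draw["phi0"], P + 1, rng)  # t = T, ..., T+P
phia_future = sample_future_rates(*draw["phia"], P + 1, rng)  # t = T, ..., T+P
```

Repeating this for every posterior draw propagates parameter uncertainty (through the draw-specific means and variances) as well as year-to-year environmental variation (through the normal draws).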

Given the demographic rates and the $i$th sample of the current population sizes $X_T$ and $J_{T-4}$, we generate future breeding population trajectories for $t = T+1, \dots, T+P$ and obtain the corresponding posterior distribution by sampling from

\begin{equation}
Y_t^{(i)} \sim \mathrm{Bin}\bigl(X_{t-1}^{(i)}, \phi_{a,t-1}^{(i)}\bigr) \tag{4.4}
\end{equation}

and

\begin{equation}
Z_t^{(i)} \sim \mathrm{Bin}\bigl(J_{t-5}^{(i)}, \psi^{(i)} \phi_{a,t-1}^{(i)}\bigr), \tag{4.5}
\end{equation}

with $X_t^{(i)} = Y_t^{(i)} + Z_t^{(i)}$ representing the total predicted population size of female breeders in year $t$. Similarly, for $t = T-3, \dots, T+P$ we sample from

\begin{equation}
J_t^{(i)} \sim \mathrm{Bin}\bigl(X_t^{(i)}, \rho_t^{(i)} \phi_t^{*(i)} / 2\bigr) \tag{4.6}
\end{equation}

to generate predicted population sizes of female prebreeders, where $\phi_t^{*(i)} = \phi_{0,t}^{(i)} \phi_1^{(i)} \phi_2^{(i)} \phi_3^{(i)}$.
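The projection in equations (4.4) to (4.6) can be sketched as follows for a single posterior draw. This is an illustrative reading of the equations, not the author's implementation: years are indexed by the offset `k = t - T`, inputs are passed as dictionaries keyed by `k`, `phi123` denotes the product of the three time-constant survival rates, and every numerical value in the usage example is made up.

```python
import random


def rbinom(n, p, rng):
    """Binomial(n, p) draw as a sum of Bernoulli trials (adequate for a small sketch)."""
    return sum(rng.random() < p for _ in range(n))


def project(X, J, rho, phi0, phia, phi123, psi, P, rng):
    """Simulate one future trajectory; all dictionaries are keyed by k = t - T.

    X needs keys -3..0 (historical breeders) and J needs key -4.
    """
    X, J = dict(X), dict(J)  # copy so the caller's inputs are not modified

    def draw_prebreeders(k):
        phi_star = phi0[k] * phi123
        return rbinom(X[k], rho[k] * phi_star / 2.0, rng)  # equation (4.6)

    for k in range(-3, 1):       # years T-3..T: breeders are already known
        J[k] = draw_prebreeders(k)
    for k in range(1, P + 1):    # years T+1..T+P
        Y = rbinom(X[k - 1], phia[k - 1], rng)        # continuing breeders, (4.4)
        Z = rbinom(J[k - 5], psi * phia[k - 1], rng)  # new recruits, (4.5)
        X[k] = Y + Z
        J[k] = draw_prebreeders(k)
    return X, J


rng = random.Random(0)
P = 10
years = range(-3, P + 1)
X, J = project(
    X={k: 500 for k in range(-3, 1)},
    J={-4: 100},
    rho={k: 0.6 for k in years},
    phi0={k: 0.5 for k in years},
    phia={k: 0.9 for k in range(0, P)},
    phi123=0.7, psi=0.8, P=P, rng=rng,
)
print([X[k] for k in range(1, P + 1)])
```

Each call depends only on its own posterior draw and random numbers, so trajectories for different draws can be generated independently of one another.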

The posterior distribution for the predicted states obtained using this approach will be the same as for the previous method. Both approaches take full account of all sources of uncertainty liable to affect the estimates of future population sizes: uncertainty in the demographic parameter estimates; uncertainty in the counts, and hence the values of the true underlying population sizes during the study period; and uncertainty caused by future environmental stochasticity (although in our approach this is limited to the range of historic interannual variation and does not account for potentially more extreme conditions in the future). However, the post hoc approach substantially reduces the dependence among the predicted states, and hence increases the effective sample size (the number of effectively independent draws from the posterior distribution) of predicted values.
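To illustrate what effective sample size measures, the sketch below compares independent draws with a strongly autocorrelated series of the same length. The estimator is a crude truncated-autocorrelation version chosen for brevity; the source does not state which estimator was used, and the AR(1) series is only a stand-in for a poorly mixing chain.

```python
import random


def effective_sample_size(x):
    """Crude ESS: n / (1 + 2 * sum of the leading positive autocorrelations)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n
    acf_sum = 0.0
    for lag in range(1, n):
        acf = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n * var)
        if acf <= 0.0:  # truncate at the first non-positive autocorrelation
            break
        acf_sum += acf
    return n / (1.0 + 2.0 * acf_sum)


rng = random.Random(3)
n = 2000
independent = [rng.gauss(0, 1) for _ in range(n)]

# AR(1) series with autoregressive coefficient 0.9 (hypothetical example).
correlated = [0.0]
for _ in range(n - 1):
    correlated.append(0.9 * correlated[-1] + rng.gauss(0, 1))

ess_independent = effective_sample_size(independent)
ess_correlated = effective_sample_size(correlated)
print(round(ess_independent), round(ess_correlated))
```

The correlated series yields a far smaller effective sample size than its nominal length, which is the penalty the within-chain predictions incur.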

We ran the MCMC chain for 2 million iterations, discarding the first 1 million as burn-in, which convergence diagnostics indicated was ample for this model (see Section 3.3.3). To save storage space, and to reduce the autocorrelation among consecutive samples, we thinned the output to every one-hundredth iteration, thus providing 10,000 posterior samples. To boost the sample size of projected population sizes, we simulated 10 projections from each realisation of the posterior, resulting in 100,000 posterior samples for each predicted state. The MCMC simulation took approximately 40 hours, and the subsequent projections took less than 1 minute!
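The sample counts quoted above follow directly from the run settings, as this short check shows. The variable names are illustrative, and the choice of which iteration within each block of 100 is retained is an assumption that does not affect the counts.

```python
n_iter, burn_in, thin, reps = 2_000_000, 1_000_000, 100, 10

# Iteration indices retained after discarding burn-in and thinning.
kept = range(burn_in, n_iter, thin)
n_posterior = len(kept)
n_projected = n_posterior * reps  # 10 projections per retained posterior sample

print(n_posterior, n_projected)  # 10000 100000
```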

In document Bayesian modelling of integrated data and its application to seabird populations (Page 93-96)