Here, Equation (2.1) is used as a density estimator. However, as $i$ now indexes cues (rather than individuals), $D$ is assumed to refer to cue (rather than animal) density. Common forms of auxiliary data that can be employed in this scenario include signal strengths and TOAs.

A simple random sample of $n_r$ individuals from the population is monitored independently of the main survey, but at the same time and location. A cue rate (in cues per unit time) is observed from each. Let $r_i$ be the cue rate observed from the $i$th monitored individual, $r$ hold all observed cue rates, and $\hat{\mu}_r = \sum_{i=1}^{n_r} r_i / n_r$ provide an estimator for the mean population cue rate, $\mu_r$.

An estimator of animal density is therefore

$$\hat{D}_a = \frac{\hat{D}}{\hat{\mu}_r}. \tag{3.1}$$

The unit of the numerator is cues per unit time per unit area, while that of the denominator is cues per individual per unit time. The quotient of the two thus has the unit individuals per unit area.
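As a minimal sketch of Equation (3.1), the conversion from cue density to animal density can be written in a few lines of Python. The function name `animal_density` and all numerical values are hypothetical and chosen only to illustrate the unit bookkeeping.

```python
def animal_density(cue_density, cue_rates):
    """Convert a cue-density estimate into an animal-density estimate (Eq. 3.1)."""
    if not cue_rates:
        raise ValueError("at least one monitored individual is required")
    mu_r_hat = sum(cue_rates) / len(cue_rates)  # estimated mean cue rate
    return cue_density / mu_r_hat

# Hypothetical inputs: cue density in cues per minute per hectare, and cue
# rates (cues per individual per minute) from n_r = 5 monitored individuals.
cue_rates = [5.0, 7.0, 6.0, 4.0, 8.0]
d_a_hat = animal_density(120.0, cue_rates)  # individuals per hectare
```

The time unit of the cue rates must match that of the cue-density estimate for the quotient to be in individuals per unit area.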

While this estimator may be intuitive, its properties are not immediately clear. For example, if $\hat{D}$ and $\hat{\mu}_r$ are unbiased estimators of $D$ and $\mu_r$, respectively, then it does not follow that $\hat{D}_a$ is also unbiased; the expectation of a quotient of random variables is not equivalent to the quotient of the respective expectations, and so, in general, $E(\hat{D}_a) = E(\hat{D}/\hat{\mu}_r) \neq E(\hat{D})/E(\hat{\mu}_r)$.
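The inequality between the expectation of a quotient and the quotient of expectations can be illustrated with a small Monte Carlo experiment. The sketch below is not part of the method itself: it assumes, purely for illustration, a normally distributed $\hat{D}$ that is unbiased for $D = 100$, independent gamma-distributed cue rates with mean $\mu_r = 6$, and a small monitored sample of $n_r = 5$.

```python
import random

def ratio_bias_demo(n_sims=20000, n_r=5, seed=1):
    """Compare the mean of D-hat / mu-hat with mean(D-hat) / mean(mu-hat)."""
    rng = random.Random(seed)
    d_hats, mu_hats, ratios = [], [], []
    for _ in range(n_sims):
        d_hat = rng.gauss(100.0, 10.0)  # assumed: unbiased for D = 100
        rates = [rng.gammavariate(4.0, 1.5) for _ in range(n_r)]  # mean 6
        mu_hat = sum(rates) / n_r       # unbiased for mu_r = 6
        d_hats.append(d_hat)
        mu_hats.append(mu_hat)
        ratios.append(d_hat / mu_hat)
    mean_of_ratios = sum(ratios) / n_sims
    ratio_of_means = (sum(d_hats) / n_sims) / (sum(mu_hats) / n_sims)
    return mean_of_ratios, ratio_of_means

mean_of_ratios, ratio_of_means = ratio_bias_demo()
```

Under these assumptions the mean of the simulated quotients exceeds $D/\mu_r$, because $1/\hat{\mu}_r$ is a convex function of a positive random variable, even though both component estimators are unbiased.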

Here, the estimator shown in Equation (3.1) is given further theoretical justification. It is shown that, if the estimator $\hat{D}$ has asymptotic unbiasedness and normality as $t \to \infty$ (thus as $n \to \infty$), then the properties of asymptotic unbiasedness and normality also hold for $\hat{D}_a$; that is, if $\hat{D} \xrightarrow{d} \hat{D}^*$, where $\hat{D}^*$ is a random variable with the asymptotic distribution of $\hat{D}$, then the estimator $\hat{D}_a = \hat{D}/\hat{\mu}_r$ has asymptotic unbiasedness and normality.

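The estimator $\hat{\mu}_r$ is the mean of a simple random sample from the population of animals; thus, by the weak law of large numbers, $\hat{\mu}_r \xrightarrow{p} \mu_r$ as $n_r \to \infty$. Therefore, as $t \to \infty$ and $n_r \to \infty$,

$$\hat{D}_a = \frac{\hat{D}}{\hat{\mu}_r} \xrightarrow{d} \frac{\hat{D}^*}{\mu_r},$$

due to Slutsky’s theorem. From the condition of asymptotic unbiasedness and normality of $\hat{D}$ above,

$$\hat{D}_a \xrightarrow{d} \frac{\hat{D}^*}{\mu_r} \sim N\!\left(\frac{D}{\mu_r}, \frac{\mathrm{Var}(\hat{D})}{\mu_r^2}\right). \tag{3.2}$$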

Thus, asymptotically (as both $t \to \infty$ and $n_r \to \infty$),

$$E(\hat{D}_a) = \frac{D}{\mu_r} = D_a.$$
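The limiting moments in Equation (3.2) can be checked numerically in a simplified setting. The following sketch holds the distribution of $\hat{D}$ fixed at a normal distribution (as if $t$ were already large) and lets only $n_r$ grow. It also assumes gamma-distributed cue rates, so that $\hat{\mu}_r$ can be drawn directly from its exact sampling distribution. All numerical values are hypothetical.

```python
import random
from statistics import mean, pvariance

D, VAR_D = 100.0, 100.0   # hypothetical cue density and Var(D-hat)
MU_R = 6.0                # hypothetical mean cue rate
SHAPE, SCALE = 4.0, 1.5   # assumed gamma cue rates; SHAPE * SCALE equals MU_R

def simulate_ratio(n_r, n_sims=20000, seed=2):
    """Return the simulated mean and variance of D-hat / mu-hat."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_sims):
        d_hat = rng.gauss(D, VAR_D ** 0.5)
        # The mean of n_r IID gamma(SHAPE, SCALE) variables is itself
        # gamma distributed with shape n_r * SHAPE and scale SCALE / n_r.
        mu_hat = rng.gammavariate(n_r * SHAPE, SCALE / n_r)
        ratios.append(d_hat / mu_hat)
    return mean(ratios), pvariance(ratios)

limit_mean = D / MU_R           # mean in Equation (3.2)
limit_var = VAR_D / MU_R ** 2   # variance in Equation (3.2)
small = simulate_ratio(n_r=5)
large = simulate_ratio(n_r=2000)
```

With few monitored individuals the simulated mean sits noticeably above $D/\mu_r$, while for large $n_r$ both simulated moments approach the values given by Equation (3.2).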

If $\hat{D}$ were an ML estimator, then the condition above (i.e., its asymptotic unbiasedness and normality) would be met directly from standard ML theory. However, the objective function being maximised (Equation (2.3)) for its estimation (Equation (2.1)) is the likelihood of a model that does not match the data-generating process, because it assumes independence between cue locations. Therefore, one cannot directly project ML estimator properties onto $\hat{D}$. The appropriateness of the above condition is instead assessed via simulation in Section 3.4.2.

For the same reason, ML variance estimators should not be used for model parameters in this context. Typically, in cases where data violate a model’s independence assumption due to positive correlation across sampling units, variances are underestimated. This causes CIs to have true coverage lower than their nominal levels.

Here an alternative is proposed. It is a simulation method that uses parameter estimates ($\hat{D}$, $\hat{D}_a$, $\hat{\gamma}$, $\hat{\psi}$, and $\hat{\mu}_r$) to generate numerous data sets with the appropriate cue-location dependence. The parameter estimates obtained from each allow inference of the estimators’ properties under cue-location dependence, thus giving rise to appropriate variance estimates. This approach is similar to a parametric bootstrap procedure, except that the statistical model used to derive estimators differs from that used for data simulation.

For this approach it is also necessary to estimate the distribution of population cue rates. This can be done either parametrically (for example, by assuming $r$ has CDF $F(r; \zeta)$ and estimating $\zeta$ in some way) or nonparametrically (for example, by using the empirical distribution function (EDF) to estimate $F(r; \zeta)$).
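The two options for estimating the cue-rate distribution can be sketched as interchangeable samplers. In the sketch below, the gamma family and its method-of-moments fit are assumptions made for illustration only; the text does not prescribe a parametric family or a way of estimating $\zeta$. The function names and cue-rate values are likewise hypothetical.

```python
import random
from statistics import mean, variance

def make_edf_sampler(cue_rates, rng):
    """Nonparametric option: draw from the EDF by resampling observed rates."""
    data = list(cue_rates)
    return lambda: rng.choice(data)

def make_gamma_sampler(cue_rates, rng):
    """Parametric option (assumed gamma family, fitted by method of moments)."""
    m, v = mean(cue_rates), variance(cue_rates)
    shape, scale = m * m / v, v / m  # matches the sample mean and variance
    return lambda: rng.gammavariate(shape, scale)

rng = random.Random(7)
cue_rates = [5.0, 7.0, 6.0, 4.0, 8.0]  # hypothetical observed cue rates
draw_edf = make_edf_sampler(cue_rates, rng)
draw_gamma = make_gamma_sampler(cue_rates, rng)
edf_draws = [draw_edf() for _ in range(1000)]
gamma_draws = [draw_gamma() for _ in range(5000)]
```

Returning a zero-argument sampler from each function lets the rest of a simulation stay agnostic about which option was chosen.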

Here the superscript ∗ is used to denote either simulated data, or estimates obtained from simulated data. The following procedure is proposed:

1. Simulate animal locations within $A$ as a realisation of a Poisson point process with homogeneous intensity $\hat{D}_a$.

2. Determine the number of cues made by each individual by simulating from the estimated distribution of $r$ (e.g., either some parametrically estimated distribution or the EDF).

3. Construct $X^*$. The location of each cue is given by the location of the individual that produced it.

4. Generate $\Omega^*$ via simulation of detections, with detection probabilities given by $g(d_j(x); \hat{\gamma})$. Remove entries in $X^*$ and $\Omega^*$ that are associated with cues that were not detected by any detector.

5. Generate $Y^*$ (for any auxiliary data collected during the survey) via simulation from the PDF $f(Y_i \mid \omega_i, x_i; \psi)$ (Equation (2.4)).

6. Calculate $\hat{D}^*$, $\hat{\gamma}^*$, and $\hat{\psi}^*$ from $\Omega^*$ and $Y^*$ (Equation (2.1)).

7. Generate $r^*$ via simulation of $n_r$ independent and identically distributed (IID) random variables from the estimated distribution of $r$ (e.g., either from $F(r; \hat{\zeta})$ or its EDF).

8. Calculate $\hat{\mu}_r^* = \sum_{i=1}^{n_r} r_i^* / n_r$.

9. Calculate $\hat{D}_a^* = \hat{D}^* / \hat{\mu}_r^*$.

10. Repeat the above steps $n_b$ times, saving $\hat{\theta}^*$ on each occasion.
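The procedure above can be sketched in outline as follows. This is a simplified illustration under stated assumptions, not the implementation used here: the survey region is taken to be a rectangle, the detection function is assumed to be halfnormal with parameters `g0` and `sigma`, cue counts are drawn from an EDF of hypothetical integer counts, Step 5 (auxiliary data) is omitted, and the `fit` argument is a placeholder for the estimator of Equation (2.1), which is not reproduced. All function names and values are hypothetical.

```python
import math
import random

def poisson(rng, lam):
    """Poisson draw: count unit-rate exponential arrivals before time lam."""
    count, total = 0, rng.expovariate(1.0)
    while total < lam:
        count += 1
        total += rng.expovariate(1.0)
    return count

def halfnormal(d, g0, sigma):
    """Assumed detection function g(d; gamma), with gamma = (g0, sigma)."""
    return g0 * math.exp(-d * d / (2.0 * sigma * sigma))

def simulate_survey(rng, d_a_hat, region, detectors, draw_n_cues, g0, sigma):
    """Steps 1 to 4: return (X*, Omega*) for cues detected at least once."""
    (x0, x1), (y0, y1) = region
    area = (x1 - x0) * (y1 - y0)
    x_star, omega_star = [], []
    for _animal in range(poisson(rng, d_a_hat * area)):        # Step 1
        loc = (rng.uniform(x0, x1), rng.uniform(y0, y1))
        for _cue in range(draw_n_cues()):                      # Steps 2 and 3
            history = [
                int(rng.random() < halfnormal(math.dist(loc, det), g0, sigma))
                for det in detectors
            ]                                                  # Step 4
            if any(history):                                   # drop undetected cues
                x_star.append(loc)
                omega_star.append(history)
    return x_star, omega_star

def simulate_estimates(rng, n_b, fit, draw_rate, n_r, **survey_args):
    """Steps 6 to 10 in outline; Step 5 (auxiliary data) is omitted."""
    saved = []
    for _rep in range(n_b):                                    # Step 10
        x_star, omega_star = simulate_survey(rng, **survey_args)
        d_star = fit(omega_star)                               # Step 6 (placeholder)
        mu_star = sum(draw_rate() for _i in range(n_r)) / n_r  # Steps 7 and 8
        saved.append((d_star, mu_star, d_star / mu_star))      # Step 9
    return saved

rng = random.Random(42)
observed_counts = [5, 7, 6, 4, 8]  # hypothetical cue counts per individual
survey_args = dict(
    d_a_hat=0.5,
    region=((0.0, 10.0), (0.0, 10.0)),
    detectors=[(4.0, 4.0), (4.0, 6.0), (6.0, 4.0), (6.0, 6.0)],
    draw_n_cues=lambda: rng.choice(observed_counts),
    g0=0.9,
    sigma=2.0,
)
# Stand-in only: detected cues per unit area is NOT the Equation (2.1) estimator.
naive_fit = lambda omega: len(omega) / 100.0
saved = simulate_estimates(
    rng, n_b=20, fit=naive_fit,
    draw_rate=lambda: rng.choice(observed_counts), n_r=5, **survey_args
)
```

Because every cue inherits the location of the animal that produced it, the simulated data carry the cue-location dependence that the fitted model ignores. In practice `fit` would be replaced by a routine that maximises Equation (2.3).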

One particular point to note is that, in Step 2, the simulated cue-rate data may provide non-integer numbers of cues emitted by individuals. In this case, let $r^{*\prime}$ be the non-integer value generated. The resultant number of cues simulated is then given by

$$r^* = \begin{cases} \lceil r^{*\prime} \rceil & \text{with probability } r^{*\prime} - \lfloor r^{*\prime} \rfloor, \\ \lfloor r^{*\prime} \rfloor & \text{with probability } 1 - (r^{*\prime} - \lfloor r^{*\prime} \rfloor). \end{cases}$$
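This randomised rounding rule preserves the expected number of cues, since the fractional part is exactly the probability of rounding up. A minimal sketch follows, in which the function name `stochastic_round` is hypothetical.

```python
import math
import random

def stochastic_round(x, rng=random):
    """Round x up with probability equal to its fractional part, else down."""
    lower = math.floor(x)
    frac = x - lower  # zero when x is already an integer
    return lower + 1 if rng.random() < frac else lower

rng = random.Random(0)
draws = [stochastic_round(6.3, rng) for _ in range(20000)]
average = sum(draws) / len(draws)  # close to 6.3 in expectation
```

An integer input has a fractional part of zero and is therefore returned unchanged.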

The $n_b$ sets of saved parameter estimates can then be treated as though they were generated from a bootstrap procedure; see Davison and Hinkley (1997) for a detailed account of what can be achieved. In particular, standard deviations of estimates obtained from simulated data provide standard errors, and biases can be estimated by subtracting the original parameter estimate from the mean of the estimates from simulated data; a bias adjustment can then be carried out by subtracting this bias from the original estimate. A variety of bootstrap CI methods exist, some of which are summarised in Section 2.3.3.
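The summaries described above can be computed from the saved estimates of a single parameter as in the following sketch. The percentile interval is used here only as one simple example of a bootstrap CI and is not necessarily the method preferred in Section 2.3.3. The function name and the numerical values are hypothetical.

```python
import math
from statistics import mean, stdev

def bootstrap_summary(estimate, simulated, level=0.95):
    """Standard error, bias, bias-adjusted estimate, and a percentile CI."""
    se = stdev(simulated)              # SD of estimates from simulated data
    bias = mean(simulated) - estimate  # mean of simulated minus original
    adjusted = estimate - bias         # bias-adjusted estimate
    ordered = sorted(simulated)
    n, alpha = len(ordered), 1.0 - level
    lower = ordered[math.floor((alpha / 2) * (n - 1))]
    upper = ordered[math.ceil((1 - alpha / 2) * (n - 1))]
    return {"se": se, "bias": bias, "adjusted": adjusted, "ci": (lower, upper)}

# Hypothetical original estimate and a deliberately tiny set of simulated ones.
summary = bootstrap_summary(19.0, [18.0, 20.0, 22.0, 21.0, 19.0])
```

In practice $n_b$ would be far larger than the five values used here, particularly if interval endpoints are of interest.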

One final point to note is that the objective function maximised for point estimation is not the ‘true’ likelihood for the model being fitted; thus likelihood-based information criteria (such as AIC) should no longer be used for model selection. See Section 3.6.2 for discussion on this point, and a potential alternative.