Simulation studies: a general overview

There are many percentage estimates that can be used to draw conclusions about apparent anthelmintic efficacy/resistance, such as the percentage estimates (2.2), (2.3), (2.4), (2.5), (2.6) and (2.7) mentioned in Section 2.3 (with i = 14). As a result, the only sensible way to determine which type of percentage estimate offers robust statistical analyses is to conduct a simulation study based on real cattle FEC data (Burton et al. 2006). In this case, we are interested in the performance of the confidence intervals associated with the various percentage estimates, as measured by their coverage probabilities. This performance is of interest because these intervals are used in the decision-making process of classifying apparent efficacy/resistance in cattle herds.

According to Burton et al. (2006), simulation studies are computer-intensive procedures that assess the performance of a variety of statistical methods in relation to a known truth/target value; such an evaluation cannot be achieved with studies of real data alone. A simulation study therefore provides an empirical estimate of the sampling distribution of the parameters of interest, which could not be obtained from a single study, and allows accuracy measures to be compared with the known truth/target value.

For the purposes of this study, simulations involve repeated random sampling from probability distributions. These are referred to as Monte Carlo simulations and can produce a vast number of scenarios, i.e. iterations (Diaz-Emparanza 2002; Burton et al. 2006). These types of simulations are considered to be a relatively new area of Mathematics, with a variety of applications and specialist packages being developed for their implementation. For example, Monte Carlo simulation is easily implemented in R/RStudio using built-in functions that generate random values from probability distributions. These functions can then be called within user-defined functions that code the simulation.

The design of simulation studies, however, is generally considered to be a complex process with numerous issues to consider. For example, all decisions made prior to the simulation study have to be justified (these would be detailed in a study protocol), and the researcher must specify the simulation procedures, the statistical methods to be evaluated, how data and estimates are stored, the number of simulations to be performed, the particular type of accuracy measures to be used, etc. (Burton et al. 2006).

The remainder of this section addresses two of these issues through the questions: "How many simulations do I carry out?" and "What are the different types of performance measures available for me to use in my simulation study?"

4.4.1 Number of simulations to carry out as part of a study

The number of simulations a researcher is required to carry out is an important consideration, because the quality of the results is directly influenced by the number of simulations carried out (Diaz-Emparanza 2002).

Burton et al. (2006) tell us that the number of simulations, B say, can be determined by using a generic formula such as

$$B = \left(\frac{Z_{1-\alpha/2}\,\sigma}{\delta}\right)^{2} \tag{4.5}$$

where $\delta$ is the specified level of accuracy of the estimate of interest that the researcher is willing to accept, i.e. the permissible difference from the "true"/target value, $Z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution, $\alpha$ is the significance level set by the researcher and $\sigma^{2}$ is the variance of the parameter of interest.

Alternatively, B could be determined based on the power $1-\beta$ of a study, i.e. the probability of detecting a significant difference from the "true"/target value, where the following formula is considered:

$$B = \left(\frac{\left(Z_{1-\alpha/2} + Z_{1-\beta}\right)\sigma}{\delta}\right)^{2} \tag{4.6}$$

In fact, (4.6) is equivalent to (4.5) if the power is assumed to be 50%, since the 0.5 quantile of the standard normal distribution is zero. These formulae require a realistic estimate of the variance, obtained from data. If the variance is unknown or cannot be estimated reliably, it may be possible to perform an identical pilot/test simulation study to obtain such an estimate.
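As a rough illustration of formulae (4.5) and (4.6), the following Python sketch computes B for given inputs. It is not part of the original study, which used R; the function name `n_simulations` and the values of `sigma` and `delta` are hypothetical and chosen only for demonstration.

```python
import math
from statistics import NormalDist


def n_simulations(sigma, delta, alpha=0.05, power=None):
    """Number of simulations B from formula (4.5), or (4.6) if power is given.

    sigma : standard deviation of the parameter of interest (assumed known)
    delta : permissible difference from the "true"/target value
    alpha : significance level
    power : 1 - beta; if None, formula (4.5) is used
    """
    z = NormalDist().inv_cdf           # standard normal quantile function
    z_total = z(1 - alpha / 2)
    if power is not None:
        z_total += z(power)            # adds Z_{1-beta} as in (4.6)
    # Round up, since B must be a whole number of simulations.
    return math.ceil((z_total * sigma / delta) ** 2)


# Illustrative (made-up) inputs: sigma = 0.2, delta = 0.01.
b_accuracy = n_simulations(sigma=0.2, delta=0.01)            # formula (4.5)
b_power_50 = n_simulations(sigma=0.2, delta=0.01, power=0.5)  # (4.6), 50% power
b_power_90 = n_simulations(sigma=0.2, delta=0.01, power=0.9)  # (4.6), 90% power
print(b_accuracy, b_power_50, b_power_90)
```

With 50% power the two formulae return the same B, and requiring higher power increases B, which is the trade-off against computing time discussed in this section.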

Given the relatively large FEC data set available in this project, estimating the variability of the positive treatment and negative control FEC data for both Days 0 and 14 would require extensive time, a cost which is often taken into account when carrying out these types of studies (Burton et al. 2006). If individual values of B were examined for each data set, the maximum of these values would be taken as the number of simulations to carry out, and there is no guarantee that this number would be manageable within the time constraints of this project, especially as the variability of these data would have to be estimated for the different types of treatment groups, diagnostic sensitivities and farms involved. As a result, the number of simulations carried out was based on what could be achieved given the time constraints of the project, rather than on generic formulae such as those mentioned above.

4.4.2 Performance measures

According to Burton et al. (2006), it is necessary to consider the criteria for evaluating the performance of results from the different scenarios/statistical approaches being examined. The comparison of simulated results with the "true"/target values used to simulate the data provides a measure of the performance and associated precision of the simulated process. Performance measures often considered are bias, accuracy and coverage. In our case, we are primarily interested in assessing the performance of confidence intervals and the estimation of percentage estimates in a bootstrapping framework, and so coverage will be adopted as our primary performance measure. In addition, if a percentage estimate consistently provides adequate coverage, we will then consider the standardised bias present for the relevant percentage estimates, which is described below.

Burton et al. (2006) tell us that the coverage is the proportion of times that the obtained confidence interval contains the true specified parameter value, and that it should be approximately equal to the advertised coverage rate. Over-coverage occurs when the coverage rates are above the advertised coverage rates. This can suggest that the results are too conservative, since more simulations will fail to find a significant result when there is a true effect, leading to a loss of statistical power with too many type II errors. On the other hand, under-coverage occurs when the coverage rates are lower than the advertised coverage rates and is considered unacceptable according to Burton et al. (2006), as it indicates over-confidence in the estimates.
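The definition of coverage above can be sketched with a small Monte Carlo simulation in Python. This is a deliberately simplified stand-in, not the bootstrap procedure or the percentage estimates used in this study: it assumes normally distributed data with a known standard deviation and a 95% interval for the mean, and the function name `estimate_coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def estimate_coverage(n_sims=2000, n=30, mu=5.0, sigma=2.0, level=0.95, seed=1):
    """Proportion of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)                       # seeded for reproducibility
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. about 1.96 for 95%
    half_width = z * sigma / n ** 0.5               # known-sigma interval half-width
    hits = 0
    for _ in range(n_sims):
        # One simulated data set drawn from the known "truth".
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        centre = fmean(sample)
        # Count the interval if it contains the true parameter value.
        if centre - half_width <= mu <= centre + half_width:
            hits += 1
    return hits / n_sims


coverage = estimate_coverage()
print(f"Empirical coverage: {coverage:.3f}")
```

Because the interval used here is exact for this data-generating model, the empirical coverage should sit close to the advertised 95%; an interval that over- or under-covers would show up as a proportion clearly above or below that rate.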

One possible criterion for acceptability of the coverage is that it should not fall outside of approximately two standard errors (SEs) of the nominal coverage probability, p say, where $\mathrm{SE}(p) = \sqrt{p(1-p)/B}$ (Burton et al. 2006). For example, if 95% confidence intervals are calculated using 1000 independent simulations then $\mathrm{SE}(p) = \sqrt{0.95 \times 0.05/1000} \approx 0.0069$, and so the coverage should fall between 93.6% and 96.4%. This criterion shall be utilised as part of our simulation study in order to identify those percentage estimates that offer suitable coverage and shall be referred to as Burton's criterion for the remainder of this Chapter.
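The acceptance range implied by Burton's criterion can be computed directly. The short Python sketch below reproduces the worked example above; the function names `coverage_limits` and `meets_burton_criterion` are hypothetical and are introduced only for illustration.

```python
import math


def coverage_limits(nominal=0.95, n_sims=1000, n_se=2.0):
    """Limits of nominal coverage +/- n_se standard errors, SE = sqrt(p(1-p)/B)."""
    se = math.sqrt(nominal * (1 - nominal) / n_sims)
    return nominal - n_se * se, nominal + n_se * se


def meets_burton_criterion(coverage, nominal=0.95, n_sims=1000):
    """True if an observed coverage lies within two SEs of the nominal rate."""
    lower, upper = coverage_limits(nominal, n_sims)
    return lower <= coverage <= upper


lower, upper = coverage_limits()
print(f"{lower:.1%} to {upper:.1%}")   # 93.6% to 96.4%
print(meets_burton_criterion(0.942))   # True: inside the range
print(meets_burton_criterion(0.925))   # False: under-coverage
```

Note that the range narrows as the number of simulations B increases, so the same observed coverage may pass the criterion for a small B and fail it for a large one.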

Recall from Section 2.4.2 that the bias (Equation (2.15)) of an estimator $\hat{\theta}$ for a parameter $\theta$ say, is defined as

$$\text{Bias} = E[\hat{\theta}] - \theta,$$

where $E[\hat{\theta}]$ is the average value of the distribution for the estimate $\hat{\theta}$ in a set of simulations. According to Burton et al. (2006), the amount of bias that is considered troublesome has varied from $\tfrac{1}{2}\,\mathrm{se}[\hat{\theta}]$ to $2\,\mathrm{se}[\hat{\theta}]$, where $\mathrm{se}[\hat{\theta}]$ is the empirical standard error of the estimate $\hat{\theta}$ (estimated as the standard deviation) over all simulations.

An alternative concept is that of the standardised bias:

$$\text{Standardised Bias} = \frac{E[\hat{\theta}] - \theta}{\mathrm{se}[\hat{\theta}]}.$$

This can be more informative, as the consequence of the bias depends on the size of the uncertainty in the parameter estimate. For example, when expressed as a percentage, a value of -50% means that the estimate, on average, falls one half of a standard error below the parameter (Collins et al. 2001). It is also worth noting that a standardised bias within ±40% has been shown to have no noticeable adverse impact on the coverage probabilities of confidence intervals, the efficiency of estimators or error rates (Collins et al. 2001; Burton et al. 2006). As a result, this criterion shall be utilised when observing the standardised biases for percentage estimates, if required, and shall be referred to as Collins' criterion for the remainder of this Chapter.
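The bias and standardised bias can be computed from a vector of simulated estimates as in the Python sketch below. The function names and the four estimates are made up purely for illustration, and treating the ±40% limit as inclusive is an assumption, since the source does not say how the boundary is handled.

```python
from statistics import fmean, stdev


def bias_summary(estimates, theta):
    """Return (bias, standardised bias as a percentage) for simulated estimates."""
    bias = fmean(estimates) - theta   # E[theta_hat] - theta
    se = stdev(estimates)             # empirical SE: SD of estimates over simulations
    return bias, 100 * bias / se


def meets_collins_criterion(std_bias_pct, limit=40.0):
    """True if the standardised bias lies within +/- limit percent (inclusive, assumed)."""
    return abs(std_bias_pct) <= limit


estimates = [9.0, 10.0, 11.0, 10.0]   # hypothetical simulated estimates
bias, std_bias_pct = bias_summary(estimates, theta=10.5)
print(f"bias = {bias:.2f}, standardised bias = {std_bias_pct:.1f}%")
print(meets_collins_criterion(std_bias_pct))
```

In this toy example the estimates fall, on average, about 0.6 of a standard error below the target, which lies outside the ±40% range and so would not satisfy Collins' criterion.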

In document Experimental Design and Mathematical Modelling Methods for the Study of Anthelmintic Resistance in UK Livestock Populations (Page 144-149)