
CHAPTER 2. NONLINEAR MEASUREMENT ERROR MODEL

2.5 A bootstrap bias-correction

The Monte Carlo simulations in Section 2.3.2 demonstrated small-sample bias in the parameter estimates for the nonlinear model in (2.12)-(2.15). In this section, we outline a bootstrap bias-correction for the maximum likelihood estimates.

2.5.1 Methodology

We are interested in the properties of the maximum likelihood estimator. For a sample of size n, a non-parametric bootstrap sample is a sample of size n drawn with replacement from the original Monte Carlo sample. By resampling from the data a number of times and calculating the maximum likelihood estimates for each bootstrap sample, we can approximate the sampling distribution of the estimator. Instead of generating a large number of bootstrap samples for each Monte Carlo sample, we generate a small number, n_B (with a minimum of n_B = 2), of bootstrap samples, indexed by b = 1, ..., n_B, for each of the k = 1, ..., M original Monte Carlo samples.
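The resampling step described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `bootstrap_estimates` and the `estimator` callable are hypothetical, and `estimator` stands in for whatever routine returns the maximum likelihood estimate for a dataset.

```python
import numpy as np

def bootstrap_estimates(data, estimator, n_boot=2, rng=None):
    """Return n_boot estimates, each computed on a resample of size n
    drawn with replacement from the rows of `data`.

    `estimator` is a hypothetical callable standing in for the
    maximum likelihood fit; it maps a dataset to an estimate.
    """
    rng = np.random.default_rng(rng)
    data = np.asarray(data)
    n = data.shape[0]
    estimates = []
    for _ in range(n_boot):
        # Draw n row indices with replacement (non-parametric bootstrap).
        idx = rng.integers(0, n, size=n)
        estimates.append(estimator(data[idx]))
    return np.asarray(estimates)
```

For example, `bootstrap_estimates(sample, np.mean, n_boot=2, rng=0)` returns two bootstrap estimates of a mean; in the setting of this section the sample mean would be replaced by the maximum likelihood fit of the nonlinear model.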

For each sample and each parameter, we estimate the following quantities: the original maximum likelihood estimate of the parameter, $\hat{\theta}_k$, the mean bootstrap estimate,

$$\bar{\theta}^*_k = \frac{1}{n_B} \sum_{b=1}^{n_B} \theta^*_{kb}, \tag{2.55}$$

the bootstrap bias-corrected estimate,

$$\tilde{\theta}_k = \hat{\theta}_k - \hat{B}_k = 2\hat{\theta}_k - \bar{\theta}^*_k, \tag{2.56}$$

and the variance of the n_B bootstrap estimates,

$$\hat{\sigma}^2_{kw} = \frac{1}{n_B - 1} \sum_{b=1}^{n_B} \left(\theta^*_{kb} - \bar{\theta}^*_k\right)^2, \tag{2.57}$$

where $\hat{B}_k = \bar{\theta}^*_k - \hat{\theta}_k$ is the estimated bias. Estimates computed using (2.56) can have large variances because the estimated bias is itself highly variable, so the adjusted estimate can have a larger standard error than the original estimate, $\hat{\theta}$. When the estimated bias is large relative to the standard error, the correction can be effective (Efron and Tibshirani, 1994). We note that the t-ratios for the bias calculated in the Monte Carlo simulations in Section 2.3.2 indicate that the bias of the maximum likelihood estimates is large relative to the estimated standard errors.
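The quantities in (2.55)-(2.57) for a single Monte Carlo sample and a single parameter can be computed as in the sketch below. The function name `bias_correct` and its argument names are illustrative choices, not taken from the original work.

```python
import numpy as np

def bias_correct(theta_hat, theta_star):
    """Bootstrap bias-correction for one Monte Carlo sample k.

    theta_hat  : original ML estimate for sample k (scalar)
    theta_star : the n_B bootstrap estimates for sample k (n_B >= 2)
    """
    theta_star = np.asarray(theta_star, dtype=float)
    theta_bar_star = theta_star.mean()      # mean bootstrap estimate, (2.55)
    bias_hat = theta_bar_star - theta_hat   # estimated bias
    theta_tilde = theta_hat - bias_hat      # (2.56): 2*theta_hat - theta_bar_star
    sigma2_kw = theta_star.var(ddof=1)      # within-sample variance, (2.57)
    return theta_tilde, bias_hat, sigma2_kw
```

With `theta_hat = 1.0` and bootstrap estimates `[1.2, 1.4]`, the mean bootstrap estimate is 1.3, the estimated bias is 0.3, and the corrected estimate is 0.7. With only n_B = 2 bootstrap estimates the bias estimate for any single sample is noisy, which is the variability discussed above.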

The mean of the bootstrap bias-corrected estimator is the same regardless of the number of bootstrap samples that are used in the adjustment. In the Monte Carlo simulation study, we choose a small number, n_B, of bootstrap samples and estimate the variance of the bootstrap bias-corrected estimate, $\tilde{\theta}_k$. The variance estimate is then rescaled so that its within-bootstrap-sample variance component reflects the B bootstrap samples we would have generated in practice rather than the small number, n_B, used in the simulation. The estimator for the variance of the bootstrap bias-corrected estimator $\tilde{\theta}$ is:

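$$\widehat{\mathrm{Var}}_B(\tilde{\theta}) = \widehat{\mathrm{Var}}(2\hat{\theta} - \bar{\theta}^*) + \left[\frac{1}{B}\hat{\Sigma}_w - \frac{1}{n_B}\hat{\Sigma}_w\right], \tag{2.58}$$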

where we average over the k = 1, ..., M original samples to get an estimate of the within-bootstrap sample variance:

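$$\hat{\Sigma}_w = \frac{1}{M} \sum_{k=1}^{M} \left[\frac{1}{n_B - 1} \sum_{b=1}^{n_B} \left(\theta^*_{kb} - \bar{\theta}^*_k\right)^2\right] = \frac{1}{M} \sum_{k=1}^{M} \hat{\sigma}^2_{kw}. \tag{2.59}$$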

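The estimate of $\mathrm{Var}(2\hat{\theta} - \bar{\theta}^*)$ using the n_B bootstrap samples is:

$$\widehat{\mathrm{Var}}(2\hat{\theta} - \bar{\theta}^*) = \frac{1}{M - 1} \sum_{k=1}^{M} \left[\left(2\hat{\theta}_k - \bar{\theta}^*_k\right) - \left(2\bar{\theta}_{ML} - \bar{\theta}_{BS}\right)\right]^2, \tag{2.60}$$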

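where $\bar{\theta}^*_k$ is estimated from the n_B bootstrap samples and $\bar{\theta}_{ML}$ and $\bar{\theta}_{BS}$ are averaged over the k = 1, ..., M original samples:

$$\bar{\theta}_{ML} = \frac{1}{M} \sum_{k=1}^{M} \hat{\theta}_k, \tag{2.61}$$

and

$$\bar{\theta}_{BS} = \frac{1}{M} \sum_{k=1}^{M} \bar{\theta}^*_k. \tag{2.62}$$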
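The rescaling in (2.58)-(2.62) can be sketched for one parameter as below. One way to read (2.58) is that the variance of the corrected estimates computed with n_B bootstrap samples contains a within-bootstrap component of roughly $\hat{\Sigma}_w / n_B$, which is swapped for $\hat{\Sigma}_w / B$; this reading is an interpretation, not a statement from the text. The function name `scaled_variance` and the array layout are assumptions made for illustration.

```python
import numpy as np

def scaled_variance(theta_hat, theta_star, B=1000):
    """Variance of the bias-corrected estimator, rescaled to B bootstrap samples.

    theta_hat  : shape (M,), ML estimates from the M Monte Carlo samples
    theta_star : shape (M, n_B), bootstrap estimates for each sample (n_B >= 2)
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    theta_star = np.asarray(theta_star, dtype=float)
    M, n_B = theta_star.shape
    theta_bar_star = theta_star.mean(axis=1)          # (2.55), one value per k
    theta_tilde = 2.0 * theta_hat - theta_bar_star    # (2.56)
    # Average within-bootstrap-sample variance over the M samples, (2.59).
    sigma_w = theta_star.var(axis=1, ddof=1).mean()
    # Sample variance of the corrected estimates, (2.60); their mean is
    # 2*theta_bar_ML - theta_bar_BS, from (2.61) and (2.62).
    var_nb = theta_tilde.var(ddof=1)
    return var_nb + sigma_w / B - sigma_w / n_B       # (2.58)
```

Because the correction term is negative whenever B > n_B, the result can in principle be negative when $\hat{\Sigma}_w / n_B$ exceeds the variance in (2.60); a caller may want to check for this before taking a square root.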

2.5.2 Simulations with a bias-correction

In this section we perform a non-parametric bootstrap adjustment of the parameter estimates, with n_B = 2 bootstrap samples for each of the k = 1, ..., 500 original samples from the Monte Carlo simulations. We compute maximum likelihood estimates for the 500 n_B = 1000 bootstrap samples, each of size n = 1000. Estimation is done using maximum likelihood both with and without the Bayes adjustment, since the t-ratios for the bias were significant for both estimation methods in the simulation study (Tables 2.5-2.8). We focus on Scenario 4 (see Table 2.8) with $\sigma_{ee} = 0.5$ and $\sigma_{uu} = 0.25$; results for other scenarios are in Appendix G.2.

The new mean estimates in Table 2.10 are the bootstrap bias-corrected parameter estimates $\tilde{\theta}$ in (2.56); the original maximum likelihood (ML) estimates are the estimates from the Monte Carlo simulations in Section 2.3.2. The bias, variance, standard error, and t-ratio for the bootstrap bias-corrected estimates are calculated as described in (2.36)-(2.38). The estimate of the variance for the bootstrap bias-corrected parameter estimate, scaled to represent B = 1000 bootstrap samples, and the ratio $SD_B(\tilde{\theta})[SD(\hat{\theta})]^{-1}$ comprise the final two rows of each table. This ratio compares the variability of the bootstrap bias-corrected estimate to the variability of the original maximum likelihood estimate. The estimates of $SD_B(\tilde{\theta})$ are the standard deviations of the bootstrap bias-corrected estimates $\tilde{\theta}$ scaled to represent B = 1000 bootstrap samples (the square root of the variance in (2.58)). The estimates of $SD(\hat{\theta})$ are the standard deviations of the maximum likelihood estimates $\hat{\theta}$ from the Monte Carlo simulations in Section 2.3.2.
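The standard deviation ratio reported in the tables can be computed as in this short sketch. The function name `sd_ratio` is hypothetical, and the use of the sample standard deviation (divisor M - 1) for the maximum likelihood estimates is an assumption, since the defining equations (2.36)-(2.38) are not reproduced in this section.

```python
import numpy as np

def sd_ratio(var_b_tilde, theta_hat):
    """Ratio SD_B(theta_tilde) / SD(theta_hat) for one parameter.

    var_b_tilde : scaled variance of the corrected estimator, as in (2.58)
    theta_hat   : shape (M,), ML estimates from the M Monte Carlo samples
    """
    sd_b = np.sqrt(var_b_tilde)                 # square root of (2.58)
    # Assumption: sample standard deviation with divisor M - 1.
    sd_ml = np.std(np.asarray(theta_hat, dtype=float), ddof=1)
    return sd_b / sd_ml
```

A value near one indicates that the correction, scaled to B bootstrap samples, adds little variability relative to the original estimator.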

When averaging over the 500 original Monte Carlo samples, the bootstrap bias-corrected parameter estimates for a are over-adjusted. After examining individual bootstrap samples, we note that a handful of bootstrap samples result in extreme estimates of a. In Scenario 4, where data are simulated with $\sigma_{ee} = 0.5$ and $\sigma_{uu} = 0.25$, the bootstrap sample estimates for a range from −69.32 to −1.375. When we include a prior on a, the over-adjustment is greatly reduced. Bootstrap sample prior means are estimated from the original Monte Carlo sample (i.e., the prior mean estimate is the same for both bootstrap samples for a given original Monte Carlo sample). The estimates for a in the bootstrap samples range from −17.63 to −1.411 once a prior is used in the maximum likelihood estimation.

The estimated biases of the bootstrap bias-corrected estimates are smaller than, or at least roughly similar to, those of the original maximum likelihood estimates. Overall, the bootstrap bias-corrected estimates appear to have small biases except for the parameter a. The t-ratio for the bias in a, however, is not significant once estimation is done with a prior on a. In terms of variability, the variance estimates of the original maximum likelihood estimates are comparable to the variance estimates of the bootstrap bias-corrected estimates (scaled as if we had taken B = 1000 bootstrap samples), since the ratios of the standard deviations, $SD_B(\tilde{\theta})[SD(\hat{\theta})]^{-1}$, are all close to one.

In document Nonlinear models with measurement error: Application to vitamin D (Page 65-68)