GENERALIZED LINEAR MODELS WHEN THE EXPOSURE IS UNTRANSFORMED
4 MEASUREMENT ERROR CORRECTION FOR A QUADRATIC TRANSFORMATION OF THE ERROR
4.4 CORRECTION METHODS FOR A QUADRATIC MODEL
In this section, I describe the modifications required to adapt the correction methods presented in Chapter 2, where the error-prone variable appeared in linear form in the substantive model, to the quadratic substantive model presented in Equation 4.1. As in previous chapters, it is assumed either that a validation study has been performed on some fraction of the study population or that a replicate study has provided at least two measurements of the error-prone variable on all or part of the study population. In all cases, a classical error model is assumed unless specifically stated otherwise.
Regression calibration (RC)
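RC operates by replacing $X$ in the substantive model with the expectation of $X_i$ given the error-prone measure(s), $\boldsymbol{W}_i$, and any adjustment variables, $\boldsymbol{Z}_i$. When the substantive model includes transformations of $X$, e.g. $X^p$, RC is extended by replacing $X^p$ with $E[X^p \mid \boldsymbol{W}, \boldsymbol{Z}]$. For a quadratic model with a continuous outcome, RC therefore works by regressing $Y_i$ on $E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ and $E[X_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$:
$$E[Y_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i] = \beta_0 + \beta_{X1} E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i] + \beta_{X2} E[X_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i] + \boldsymbol{\beta}_Z^T \boldsymbol{Z}_i. \tag{4.6}$$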
The expectation $E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ may be estimated as described in Chapter 2, by regressing the true exposure, $X$, on the error-prone measure, $W_1$, if a validation study is present or by regressing $W_2$ on $W_1$ if a replicate study is present (Equations 2.2 and 2.3). If a validation study is present, the variance of $X$ given the error-prone measure and any accurately measured covariates, $\sigma^2_{X \mid W,Z}$, may be directly estimated. Alternatively, in a replicate study, $\sigma^2_{X \mid W,Z}$ may be estimated as the covariance of the error-prone measures conditional on $\boldsymbol{Z}_i$.
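The definition of variance, $\mathrm{var}(X) = E[X^2] - E[X]^2$, can be used to rearrange $E[X_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ [1,49]. Therefore, we replace $E[X_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ in Equation 4.6 with $\sigma^2_{X \mid W,Z} + E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]^2$:
$$E[Y_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i] = \beta_0 + \beta_{X1} E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i] + \beta_{X2} E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]^2 + \beta_{X2} \sigma^2_{X \mid W,Z} + \boldsymbol{\beta}_Z^T \boldsymbol{Z}_i. \tag{4.7}$$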
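Equation 4.7 can be rearranged to put all constant terms together:
$$E[Y_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i] = \left(\beta_0 + \beta_{X2} \sigma^2_{X \mid W,Z}\right) + \beta_{X1} E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i] + \beta_{X2} E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]^2 + \boldsymbol{\beta}_Z^T \boldsymbol{Z}_i. \tag{4.8}$$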
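It follows from Equation 4.8 that if $E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ is used in place of $X$ and $E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]^2$ in place of $X^2$ in the quadratic substantive model, the desired $\hat{\beta}_{X1}$ and $\hat{\beta}_{X2}$ are equivalent to the observed linear and quadratic parameters and the desired $\hat{\beta}_0$ is equivalent to $\hat{\beta}_0^{*} - \hat{\beta}_{X2} \hat{\sigma}^2_{X \mid W,Z}$, where $\hat{\beta}_0^{*}$ denotes the observed intercept.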
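The steps leading to Equation 4.8 can be sketched in code. The following is a minimal illustration only, not the implementation used in this thesis. It assumes a validation study, a normally distributed exposure, classical error and no adjustment variables; the helper `ols`, the variable names and all numeric values are hypothetical choices made for the example.

```python
import random

def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given columns.

    Solves the normal equations by Gauss-Jordan elimination and returns
    [intercept, coefficient for columns[0], coefficient for columns[1], ...].
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[p] * r[q] for r in rows) for q in range(k)]
         + [sum(r[p] * yi for r, yi in zip(rows, y))] for p in range(k)]
    for p in range(k):
        # Partial pivoting, then clear column p from every other row.
        pivot = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[pivot] = a[pivot], a[p]
        for r in range(k):
            if r != p:
                f = a[r][p] / a[p][p]
                a[r] = [u - f * v for u, v in zip(a[r], a[p])]
    return [a[r][k] / a[r][r] for r in range(k)]

# Simulated data (hypothetical values): X ~ N(0, 1), W = X + U, U ~ N(0, 0.5^2).
rng = random.Random(1)
n, n_val = 20000, 2000
beta = (1.0, 0.5, 0.3)  # assumed true (beta_0, beta_X1, beta_X2)
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
w = [xi + rng.gauss(0.0, 0.5) for xi in x]
y = [beta[0] + beta[1] * xi + beta[2] * xi ** 2 + rng.gauss(0.0, 1.0) for xi in x]

# Calibration model fitted in the validation subset, where X is observed.
a0, lam = ols([w[:n_val]], x[:n_val])
resid = [x[i] - (a0 + lam * w[i]) for i in range(n_val)]
s2 = sum(r * r for r in resid) / (n_val - 2)  # estimate of var(X | W)

# Substitute E[X | W] and its square into the quadratic model (Equation 4.8).
ex = [a0 + lam * wi for wi in w]
ex_sq = [e * e for e in ex]
b0_star, b1_hat, b2_hat = ols([ex, ex_sq], y)

# Recover the desired intercept from the observed intercept.
b0_hat = b0_star - b2_hat * s2
```

In this sketch only the intercept needs adjusting after the fit; the linear and quadratic coefficients are read off directly, as the rearrangement in Equation 4.8 implies.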
This may be extended to logistic regression with the same caveats regarding approximation as for the untransformed model (Section 2.2). The Cox proportional hazards model is a special case of this method as the term $\beta_{X2}\,\mathrm{var}(X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i)$ is subsumed by the baseline hazard [15].
Bootstrapping [1] or an extension of the delta method can be used to obtain SEs [6].
Bayesian analysis using MCMC
A three-part conditional independence structure was introduced in Section 2.3 and applied in the context of an untransformed predictor in Chapter 3. This approach may be extended easily for use when the substantive model also includes an $X^2$ term, i.e. the quadratic model in Equation 4.1.
In this chapter, the substantive model specified, $p(Y_i \mid X_i, \boldsymbol{Z}_i; \boldsymbol{\beta})$, is updated to be the quadratic model given in Equation 4.1. The measurement error model for $\boldsymbol{W}_i$ given $X_i$ remains the classical error model (Equation 1.7) and the exposure model, $p(X_i \mid \boldsymbol{Z}_i; \boldsymbol{\alpha})$, remains the distribution of $X$ dependent on any accurately measured covariates, $\boldsymbol{Z}$, here assumed to be normal.
Scaling of the variables may be necessary to specify plausible prior distributions [28,85] and centering the exposure and its squared term will reduce correlation between the terms and improve MCMC convergence [28]. Scaling and centering must be performed after the squared term has been computed. The mean and standard deviation of the latent $X$ and $X^2$ can be estimated from $\boldsymbol{W}$ when no validation study is available.
INLA
It was discussed in Section 2.5 that the joint distribution of the latent Gaussian parameters, including the latent $X$, is assumed to have the attributes of a GMRF. Whether the simplified Laplace approximation used for the latent Gaussian parameters is reliable depends on the accuracy of this assumption. In Chapter 2, the latent Gaussian parameters included the regression parameters from the substantive model, the exposure model, and the latent $X$, i.e. $\beta_0$, $\boldsymbol{\beta}_Z$, $\alpha_0$, $\boldsymbol{\alpha}_Z$, and $X_i$. When the substantive model is the quadratic model, the latent Gaussian parameters include $\beta_0$, $\boldsymbol{\beta}_Z$, $\alpha_0$, $\boldsymbol{\alpha}_Z$, $X_i$, and $X_i^2$. The method cannot treat both $X$ and $X^2$ as approximately normally distributed latent variables, as the square of a normally distributed variable cannot also be normally distributed. Therefore, a significant extension to the INLA method would be required to accommodate transformations of a latent variable.
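A short worked calculation illustrates why the square of a normal variable cannot itself be normal. These are standard results for a normal distribution rather than results from this thesis, and the symbols $\mu$ and $\sigma^2$ are introduced here only for the illustration. Suppose $X \sim N(\mu, \sigma^2)$. Then:

```latex
\begin{align*}
E[X^2] &= \mu^2 + \sigma^2, \\
E[X^4] &= \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4, \\
\mathrm{var}(X^2) &= E[X^4] - E[X^2]^2 = 4\mu^2\sigma^2 + 2\sigma^4, \\
X^2 &\ge 0 \quad \text{with probability } 1 .
\end{align*}
```

Because $X^2$ is bounded below by zero (and $X^2/\sigma^2$ follows a non-central chi-squared distribution with one degree of freedom), it is necessarily skewed, whereas a normal variable has support on the whole real line.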
Furthermore, the software for applying INLA cannot accommodate a transformation of a latent Gaussian variable within the substantive model; therefore, the impact of this violation of principle cannot be easily assessed.
Given these limitations, in this work I will no longer pursue INLA as previously described as a method of measurement error correction when the error-prone measure has been transformed within the substantive model. However, in the next section, a hybrid method using MCMC or INLA and attributes of RC is described.
Bayesian regression calibration
In this chapter and the previous, correction methods are applied in the context of relatively simple models (i.e. the classical error model and the use of a specified functional form of the error-prone exposure). In this relatively straightforward setting, I propose a novel correction method which is a hybrid between Bayesian methods and RC. This method is expected to adapt easily to settings with a complex error model and an unknown functional form of the error-prone predictor, i.e. model selection. MCMC solutions are powerful and flexible, but slow to converge, particularly when model selection is incorporated (Chapters 5 and 6). While the time involved to run standard RC is negligible, maximum likelihood estimation of $E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ for more complex error models can be cumbersome or even prohibitive where the likelihood cannot be expressed in closed form [110]. When assuming the classical error model, this is not a limiting problem with linear and quadratic substantive models but becomes so for more complex substantive models such as the full set of models required for the fractional polynomial method, the topic of Chapter 6.
Standard RC relies on the estimation of $E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ and $E[X_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ via maximum likelihood estimation. An alternative means of obtaining $E[X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ and $E[X_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ is via the posterior mean of $p(X_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i)$, which depends on the parameters of the exposure model, $\boldsymbol{\alpha}$, and of the error model (Section 2.3). Posterior samples of the latent $X$, denoted $\hat{X}_i$, are drawn using MCMC after the chains have reached convergence. By squaring all samples of $\hat{X}_i$, one can estimate $E[\hat{X}_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ directly. $E[\hat{X}_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ will be a good estimate of $E[X_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ as long as the error and exposure models are not misspecified and enough samples have been drawn to approximate the distribution. Each estimated expectation, $E[\hat{X}_i \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$ and $E[\hat{X}_i^2 \mid \boldsymbol{W}_i, \boldsymbol{Z}_i]$, can then be inserted directly into Equation 4.6 to fit the quadratic model. Inference can then be made according to frequentist principles.
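The moment calculation described above can be sketched as follows. This is a hedged illustration, not the R code of Appendix B: the posterior draws are simulated directly from normal distributions as a stand-in for MCMC (or INLA) output, and the function name `posterior_moments`, the posterior means and the posterior variance are all hypothetical.

```python
import random
from statistics import fmean

def posterior_moments(draws_by_individual):
    """Estimate the first and second posterior moments of the latent exposure.

    draws_by_individual[i] holds the posterior draws for individual i.
    Returns two lists: the mean of the draws and the mean of the squared draws.
    """
    first = [fmean(d) for d in draws_by_individual]
    second = [fmean([v * v for v in d]) for d in draws_by_individual]
    return first, second

# Stand-in for sampler output: three individuals, 5000 draws each.
rng = random.Random(11)
post_means = [-1.0, 0.4, 2.0]   # assumed posterior means
post_var = 0.2                  # assumed common posterior variance
draws = [[rng.gauss(m, post_var ** 0.5) for _ in range(5000)]
         for m in post_means]

ex_hat, ex2_hat = posterior_moments(draws)
```

The two returned lists would then serve as the two exposure regressors in Equation 4.6. Because the second moment enters the model directly, the mean of the squared draws already includes the conditional variance.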
The MCMC chains would be expected to converge more quickly using this simpler model than the fully Bayesian model which incorporates the substantive model. This is particularly true for non-linear outcome models such as logistic regression.
Either MCMC or INLA may be used to generate the posterior samples of $\hat{X}_i$. While sampling is not inherent to the INLA method, samples may be obtained from the estimated posterior distribution. This operation is still much faster than MCMC because there is no need to wait for convergence and no concern about autocorrelation, i.e. the effective sample size (ESS) equals the number of samples drawn. In this work, the two approaches will be referred to as MCMC-RC and INLA-RC, respectively. R code demonstrating the implementation of each is provided in Appendix B. Any other Bayesian method of analysis, such as Hamiltonian Monte Carlo algorithms, may be used similarly.
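The remark about autocorrelation and ESS can be illustrated with a small sketch. The estimator below is a deliberately simple one, assumed for illustration only: it truncates the autocorrelation sum at the first non-positive lag, which is cruder than the estimators used in standard MCMC software. The function name, the AR(1) coefficient of 0.9 and the chain length are hypothetical choices.

```python
import random

def effective_sample_size(draws, max_lag=200):
    """ESS = n / (1 + 2 * sum of lagged autocorrelations).

    The sum is truncated at the first non-positive estimated
    autocorrelation (a simple rule chosen for this sketch).
    """
    n = len(draws)
    mean = sum(draws) / n
    centred = [d - mean for d in draws]
    var = sum(c * c for c in centred) / n
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = sum(centred[t] * centred[t + lag]
                  for t in range(n - lag)) / (n * var)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

rng = random.Random(7)
n = 4000

# Independent draws, as when sampling from an estimated posterior.
independent = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Autocorrelated draws: an AR(1) chain standing in for MCMC output.
chain = [0.0]
for _ in range(n - 1):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

ess_independent = effective_sample_size(independent)
ess_chain = effective_sample_size(chain)
```

For the independent draws the estimated ESS should be close to the number of draws, while for the autocorrelated chain it should be far smaller.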
Bayesian RC would be expected to underestimate variance in the regression parameters of the substantive model as it does not propagate the uncertainty due to measurement error from the MCMC model to the substantive model (Section 2.2.1). In theory, bootstrapping may be used for better estimates of the SEs; however, this could only be done for simple models and small data sets for MCMC-RC. For INLA-RC, bootstrapping is more feasible but was not used in simulation studies in this thesis.
Multiple imputation
Multiple imputation of squared terms in the missing data context
The desirability of compatibility of the substantive model and the imputation model when performing MI was discussed in Section 2.6. However, several authors have considered MI methods for imputing covariates that appear as transformed terms in the substantive model, using imputation models that are not compatible with the substantive model [111–113]. The simplest method is to impute the missing variable $X$ assuming a linear relationship to $Y$, then transform it to $X^2$ for the substantive model. This preserves the relationship between $X$ and $X^2$ but violates the theoretical properties of the joint model for the substantive model and imputation model underlying the multiple imputation. This method, sometimes referred to as the “passive approach”, has been shown to result in biased regression estimates [111,112]. Alternatively, in what has been called the “Just Another Variable” (JAV) approach, one can impute both $X$ and $X^2$ separately using chained equations as if they were different variables [111]. Unlike the “passive approach”, JAV does not preserve the relationship between $X$ and $X^2$. In some settings, JAV may improve estimates over the “passive approach”, but in many common settings it still results in bias [111–113].
In a method called polynomial combination, Vink and van Buuren proposed to impute not $X$, but $X + X^2$ [112]. While this method results in less bias than JAV and preserves the relationship between $X$ and $X^2$, it is not easily extendable to other transformations of the latent exposure.
SMC-FCS, which uses rejection sampling to ensure compatibility (Section 2.6), was demonstrated in its original publication for use with a quadratic substantive model (Equation 4.1) [46]. For this transformation of the latent exposure as well as others, SMC-FCS was demonstrated to be effective at minimizing bias due to missing data.
Multiple imputation of squared terms in the measurement error context
None of the above MI approaches have, to my knowledge, been applied to exposure measurement error in the published literature.
In Chapter 3, I explored the use of SMC-FCS for measurement error correction with either a validation study or a replicate study present. When a validation study has been performed, SMC-FCS is an effective tool for minimizing bias without any alteration to the method as used for missing data. However, for replicate studies, it was necessary to alter the method to incorporate the measurement error model and to stipulate proper priors for both the $\sigma_X^2$ and $\sigma_U^2$ variances to ensure reliable posterior inference [27]. From the simulation studies performed (Section 3.5.3), it was further shown that, without stipulation of somewhat informed priors for the substantive model regression coefficients, estimated posterior distributions of the regression parameters are uninformative when the likelihood contains little information, i.e. due to high measurement error variance and/or small sample size. Therefore, use of a fully Bayesian model from which to draw samples of the latent $X$ may be required for reliability across scenarios.
The same MCMC model as used for the fully Bayesian analysis may be used to obtain samples of $\hat{X}_i$ and $\hat{X}_i^2$ from $p(X_i \mid \boldsymbol{W}_i, Y_i, \boldsymbol{Z}_i; \boldsymbol{\theta})$, which could then be imputed into data sets to be used in the standard MI fashion. That is, the quadratic model would be fit to each imputed data set and pooled regression estimates obtained by applying Rubin's Rules. In this way, estimates with frequentist properties may be obtained in lieu of Bayesian posterior means.
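The pooling step can be sketched as follows for a single regression coefficient. This is a minimal illustration of Rubin's Rules as commonly stated (pooled estimate, within-imputation variance, between-imputation variance), not code from this thesis; the function name `pool_rubin` and the input values are hypothetical.

```python
from statistics import fmean, variance

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets.

    estimates: the coefficient estimate from each imputed data set.
    variances: the squared standard error from each imputed data set.
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = fmean(estimates)        # mean of the m estimates
    within = fmean(variances)        # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical estimates of the quadratic coefficient from three imputed data sets.
pooled_est, total_var = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
pooled_se = total_var ** 0.5
```

The same function would be applied separately to each coefficient of the quadratic model.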
Given the limitations of MI as demonstrated in Chapter 3 and as explored by others in the field of missing data, in this thesis I will no longer pursue MI as a method of measurement error correction where the error-prone measure has been transformed.