Statistical Methods - Keil_unc_0153D

CHAPTER II. METHODOLOGY

2.4 Statistical Methods

The proposed analyses involve two estimation procedures for longitudinal data developed by Robins (1989, 1997), which are part of a larger class of statistical methods for causal inference known as g-methods.

Both of these proposed methods are considered generally here and in more specific detail in §3.1 and 3.2.

2.4.1 G-estimation of structural nested accelerated failure time models

The simple SNAFT model shown in § 1.5.1 can be generalized to continuous radon exposures, as observed in the Colorado Plateau uranium miners. We propose to first fit a basic SNAFT model

$$T_{\bar{0}} = \int_0^T \exp\left(\psi \bar{X}_{k-5}\right)\,dk$$

where $\bar{X}_{k-5}$ is the cumulative radon exposure in working level months (WLM) up to time $k$ with a lag of 5 years to allow for latency between lung cancer induction and mortality $T$. Because many time-related factors mediate the radon-lung cancer association, we will also fit models that allow the time ratio to vary over time-varying modifiers of the relationship between radon and lung cancer (time since exposure, age at exposure, smoking, exposure rate). For example, to assess effects of time since exposure in correspondence with BEIR VI models, we will fit the model

$$T_{\bar{0}} = \int_0^T \exp\left(\psi_1 \bar{X}_{k-(5:14)} + \psi_2 \bar{X}_{k-(15:24)} + \psi_3 \bar{X}_{k-(25+)}\right)\,dk$$

where $\bar{X}_{k-(a:b)}$ corresponds to the cumulative exposure accrued from $a$ to $b$ years prior to time $k$. We will estimate these models using the g-estimation procedure described in § 1.5.1. The structural model can also be used to explore lag functions, since the lag functions used or estimated under standard regression models (e.g., Langholz et al. (1999); Hauptmann et al. (2001)) will not necessarily hold for causal models. The lag can either be a fixed number, as expressed in the structural model here, or it can be treated as a stochastic variable (Richardson (2009b); Richardson et al. (2011)).
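The windowed exposures and the transformation from observed time $T$ to $T_{\bar{0}}$ can be illustrated with a minimal sketch. It assumes exposure is recorded as annual WLM totals, approximates the integral by a yearly sum, and counts the exposure received in year $k-a$ as part of the window ending $a$ years before $k$. The function names, the toy exposure history, and the $\psi$ values are hypothetical and are not estimates from the cohort.

```python
import numpy as np


def windowed_cumulative_exposure(annual_wlm, k, a, b=None):
    """Exposure (WLM) accrued from a to b years before year k, inclusive.

    annual_wlm[j] is the exposure received in year j of follow-up
    (an assumed data layout). b=None gives an open-ended window,
    i.e. everything received a or more years before k.
    """
    annual_wlm = np.asarray(annual_wlm, dtype=float)
    hi = k - a                      # most recent year inside the window
    if hi < 0:
        return 0.0                  # window lies entirely before follow-up
    lo = 0 if b is None else max(k - b, 0)   # oldest year inside the window
    return float(annual_wlm[lo:hi + 1].sum())


def baseline_time(annual_wlm, T, psi, windows):
    """Yearly-sum approximation to the integral of exp(sum_i psi_i * Xbar_i(k)) over [0, T]."""
    total = 0.0
    for k in range(int(T)):
        linear = sum(p * windowed_cumulative_exposure(annual_wlm, k, a, b)
                     for p, (a, b) in zip(psi, windows))
        total += float(np.exp(linear))
    return total


# Toy history: 10 WLM per year for 10 years, then unexposed for 20 years.
annual_wlm = [10.0] * 10 + [0.0] * 20

# Time-since-exposure windows from the second model: 5-14, 15-24, 25+ years.
beir_windows = [(5, 14), (15, 24), (25, None)]

t0_null = baseline_time(annual_wlm, 30, [0.0, 0.0, 0.0], beir_windows)
t0_harm = baseline_time(annual_wlm, 30, [0.002, 0.001, 0.0005], beir_windows)
```

With every $\psi$ set to zero the transformed time equals the observed time, and with positive $\psi$ the transformed time exceeds it, which under this model form corresponds to exposure shortening survival. The basic model is the special case of a single open-ended window, `[(5, None)]`.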

Joffe et al. (2012) described a different set of structural models useful for allowing the effects of a continuous exposure to vary over time-varying factors. However, practical issues related to implementation of estimation algorithms limited exploration of effect measure modification in that analysis.

G-estimation of a SNAFT model in occupational data has previously been shown to be feasible by Chevrier et al. (2012). Additionally, the investigators showed several ways to generate other effect measures using $\psi$ estimates from g-estimation. Because SNAFT models are well suited to overcoming the non-positivity that characterizes healthy worker survivor bias, we will estimate other effect measures for comparability, especially the hazard ratio, as a way of evaluating the MSM described in § 2.4.2. An alternative approach is that taken by Naimi et al. (2014a), who compared SNAFT model parameters to parameters from a parametric accelerated failure time model.

2.4.1.1 Inference under measurement error of exposures and covariates

G-estimation can be used in any subset of the data for which conditional exchangeability holds, and Joffe et al. (2010) note that this can include a subset of the data in which measurement error is smaller, thus reducing the impact of exposure misclassification. Based on the variation in measurement sources across time shown in figure C.2, the estimating equation could be limited to a subset of the data in which measurement error was least (e.g., after 1950), while full inference for the data remains possible. This feature of g-estimation could also allow estimation in a subset with complete smoking information, which opens a number of possibilities for focused analyses of single modifiers or for sensitivity analyses.

2.4.2 Marginal structural models using inverse probability weighting

The second proposed analysis of the Colorado Plateau uranium miners data involves an MSM of the form

$$\lambda_X(k) = \lambda_0(k)\exp\left(\beta \bar{X}_{k-5}\right)$$

where $\bar{X}_{k-5}$ is again the 5-year lagged cumulative radon exposure in WLM and $T$ is the time to lung cancer. We will estimate this model using a weighted Cox proportional hazards model with estimated weights $\hat{W}(k)$ as described in § 1.5.2, but using a continuous exposure:

$$\hat{W}(k) = \prod_{j=0}^{k} \frac{f[X_j \mid \bar{X}_{j-1}, V]}{f[X_j \mid \bar{L}_j, \bar{X}_{j-1}, V]}$$
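As a rough illustration of how weights of this form might be computed for a continuous exposure, the sketch below fits a normal linear model for the numerator and denominator densities and takes the cumulative product of their ratio within each person. The normal density, the reading of $\bar{L}_j$ as time-varying covariates and $V$ as baseline covariates, the function names, and the simulated data are all assumptions made for the example. This section does not specify the density model, and $V$ is omitted from the toy design matrices for brevity.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def fitted_density(x, design):
    """Fit x ~ Normal(design @ coef, sd^2) by least squares and
    return the fitted density evaluated at each observed x."""
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    fitted = design @ coef
    sd = (x - fitted).std(ddof=design.shape[1])
    return normal_pdf(x, fitted, sd)


def stabilized_weights(person_id, x, numerator_design, denominator_design):
    """Cumulative product, within person, of numerator/denominator density ratios.

    Rows are assumed to be sorted by time within each person.
    """
    ratio = fitted_density(x, numerator_design) / fitted_density(x, denominator_design)
    weights = np.empty_like(ratio)
    for pid in np.unique(person_id):
        rows = person_id == pid
        weights[rows] = np.cumprod(ratio[rows])
    return weights


# Simulated person-period data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_people, n_periods = 200, 5
n_rows = n_people * n_periods
person_id = np.repeat(np.arange(n_people), n_periods)
L = rng.normal(size=n_rows)        # time-varying confounder L_j
x_prev = rng.normal(size=n_rows)   # stand-in for a summary of prior exposure
x = 0.5 * x_prev + 0.3 * L + rng.normal(size=n_rows)   # e.g. log exposure in period j

intercept = np.ones(n_rows)
numerator_design = np.column_stack([intercept, x_prev])       # f[X_j | Xbar_{j-1}]
denominator_design = np.column_stack([intercept, x_prev, L])  # f[X_j | Lbar_j, Xbar_{j-1}]

w_hat = stabilized_weights(person_id, x, numerator_design, denominator_design)
```

The resulting `w_hat` would then be supplied as time-varying weights to a weighted Cox model. When the numerator and denominator models coincide, every weight equals one, which is a convenient check on an implementation.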

In principle, SNAFT models should more naturally allow for exploration of effect measure modification. However, as noted in the previous section, practical issues have limited the use of SNAFT models. In that case, the MSM may be the appropriate model for exploring effect measure modification. While MSMs are noted to have shortcomings when addressing effect measure modification under certain circumstances (e.g., Hernán et al. (2001)), several authors have shown how to assess effect measure modification or to estimate joint effects using inverse probability weighting (Hernán et al. (2001); Petersen et al. (2007); Chiba et al. (2009); VanderWeele and Vansteelandt (2011)). As in the SNAFT model, the MSM is potentially useful for estimating a latency function, particularly when considering a weighted exposure history where weights are proportional to the expected contribution to disease at a given time. These latency model weights work naturally with the inverse probability weights (Langholz et al. (1999); Richardson (2009b)). Further, inverse probability weights have not been applied previously to a linear excess relative risk model, and the model is relatively straightforward to estimate once weights are estimated. This model may provide a useful comparison to previous analyses.

Because inverse probability weighting is sensitive to violations of the positivity assumption, non-positivity due to the lack of off-work exposures must be addressed in this analysis. We propose estimating the effect of total radon exposure (occupational + residential) using existing background radon estimates from Price et al. (2011). As an example, residential monitoring data for the state of Colorado (which holds part of the Colorado Plateau) are shown in figure 2.1. By combining residential with occupational exposure, the conditional probability density of exposure $f[X_k \mid \cdot]$ will be $> 0$ for all participants. These data are approximately log-normally distributed, and adding them to the Colorado Plateau uranium miners exposure estimates will likely create difficulties estimating the exposure density, given that the true density will be that of the sum of two approximately log-normally distributed variables.
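A small simulation can make both points concrete: adding a strictly positive residential component removes the structural zeros that arise off work, and the resulting total need not be log-normal. The sketch below uses entirely hypothetical parameters (the share of person-years at work and the log-scale means and standard deviations are invented for illustration, not taken from Price et al. (2011) or the cohort) and uses the skewness of the log total as a crude check, since a log-normal variable would have log-scale skewness of zero.

```python
import numpy as np


def sample_skewness(values):
    """Moment-based sample skewness (third central moment over variance^1.5)."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return float((centered ** 3).mean() / (centered ** 2).mean() ** 1.5)


rng = np.random.default_rng(1)
n_person_years = 10_000

# Hypothetical: 60% of person-years are spent employed in the mines.
employed = rng.random(n_person_years) < 0.6

# Occupational exposure is log-normal while employed and exactly zero otherwise.
occupational = np.where(
    employed,
    rng.lognormal(mean=2.0, sigma=1.0, size=n_person_years),
    0.0,
)

# Residential exposure is log-normal and strictly positive for everyone.
residential = rng.lognormal(mean=-1.0, sigma=0.5, size=n_person_years)

total = occupational + residential

# Log-scale skewness of total exposure among employed person-years;
# values away from zero indicate departure from a log-normal shape.
log_skew_at_work = sample_skewness(np.log(total[employed]))
```

In this toy setting `occupational` contains exact zeros while `total` is positive throughout. How far `log_skew_at_work` departs from zero depends on the invented parameters, so it indicates the kind of diagnostic one might run rather than a result.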
