Many methods can be used to estimate causal effects with epidemiologic data, provided the identifiability assumptions outlined in Section 7.2 hold. In certain settings, such as when there is time-varying confounding, non-standard methods are required to make these assumptions more plausible. In our case, under conditions of time-varying confounding with positivity violations, we use g-estimation of a structural nested model, first introduced by Robins (1989a). The “g-estimator” outlined in Robins (1989a) and elsewhere (Robins & Tsiatis 1991; Robins 1992; Robins et al. 1992; Robins & Tsiatis 1992; Robins 1993, 1998; Robins and Hernán 2009) is a generalization of the “e-estimator” introduced by Robins, Mark & Newey (1992), which models the expectation of the exposure conditional on confounders to estimate the exposure’s effect on the outcome. Both methods involve specifying a model for the exchangeability assumption defined in (7.3). E-estimators are practically identical to the standard ordinary least squares estimator $\hat{\beta}_{OLS} = (Z'Z)^{-1}Z'y$, except that the design matrix $Z$ containing the exposure $X$ and vector of confounders $C$ is replaced with a function of the exposure conditional on the confounders (see Robins et al. 1992, for details). With e-estimators, the assumption about the relation between the exposure and the potential outcomes (i.e., exchangeability) is made implicitly. G-estimators, on the other hand, take a different approach.
Robins’ g-estimator requires that the expectation of the exposure be modeled as a function of the potential outcomes. In effect, the expectation defined in (7.3) is explicitly specified using a statistical model. However, as explained in Section 7.1, the potential outcomes are unobserved (or latent) variables, for which we do not have any data. At the very most, under counterfactual consistency, we only observe the potential outcome under the observed exposure. As a way forward, structural nested models are used to impute the remaining set of potential outcomes for a given individual under a specific set of assumptions. Here, we illustrate how structural nested accelerated failure time models can be specified to generate the potential outcomes for use in the g-estimator.
Structural nested failure time models are a mapping $h(\cdot)$ between the failure time that would have been observed under no exposure, $T_i^{0}$, the failure time that would have been observed under some arbitrary exposure, $T_i^{x}$, and some unknown parameter $\psi$:
$$T_i^{0} = h(T_i^{x}, \psi)$$
(Robins 1998). This mapping is most often of the form
$$T_i^{x} = T_i^{0} \exp\{-\psi x\},$$
which says that the failure time under no exposure is accelerated by the factor $\exp\{-\psi x\}$ to give the failure time under $x$ (Hernán, Cole, Margolick, Cohen & Robins 2005). With this formulation, $\exp\{-\psi\}$ corresponds to the ratio of survival times associated with a single unit increase in the exposure, which is our causal estimand of interest defined in 7.1. This equation is typically rearranged as
$$T_i^{0} = T_i^{x} \exp\{\psi x\}.$$
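As a minimal sketch of the point-exposure mapping above, the following Python fragment converts between $T_i^{x}$ and $T_i^{0}$ for a single individual. The function names and the numeric inputs (a failure time of 10, a unit exposure, and $\psi = 0.6$) are hypothetical and chosen only for illustration; they are not taken from the cohort data.

```python
import math


def unexposed_failure_time(t_x: float, x: float, psi: float) -> float:
    """Rearranged mapping: T0 = Tx * exp(psi * x)."""
    return t_x * math.exp(psi * x)


def exposed_failure_time(t_0: float, x: float, psi: float) -> float:
    """Original mapping: Tx = T0 * exp(-psi * x)."""
    return t_0 * math.exp(-psi * x)


# Hypothetical inputs (assumptions for illustration only).
psi = 0.6
t0 = unexposed_failure_time(t_x=10.0, x=1.0, psi=psi)

# Ratio of survival times for a one-unit increase in exposure (x = 1 vs. x = 0).
ratio = exposed_failure_time(t0, 1.0, psi) / exposed_failure_time(t0, 0.0, psi)
print(t0, ratio)
```

Under these assumed inputs, `ratio` equals $\exp\{-\psi\}$ regardless of the value of `t0`, which is the sense in which $\exp\{-\psi\}$ is a ratio of survival times.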
For a time-varying exposure $x_{it}$, we can “break up” the survival time $T_i^{x}$ on the right-hand side of the equation and let $x_{it}$ act on the “pieces”:
$$T_i^{\bar{0}} = \int_0^{T_i^{\bar{x}}} \exp\{\psi x_{it}\}\,dt \qquad (7.4)$$
where, as before, overbars denote variable histories, and where $T_i^{\bar{0}}$ is the potential outcome under no exposure. Finally, we can replace the continuous exposure value at time $t$, $x_t$, with any desired exposure metric, such as cumulative exposure up to time $t$, $a_t$, as employed in Chapter 7:
$$T_i^{\bar{0}} = \int_0^{T_i^{\bar{a}}} \exp\{\psi a_{it}\}\,dt \qquad (7.5)$$
This model was first introduced by Cox (1984) as the “strong version” of the accelerated failure time model, which is a sub-class of models known more generally as structural nested accelerated failure time models (Robins 1992; Robins et al. 1992; Lok, Gill, Van Der Vaart & Robins 2004). Equation 7.5 has four unknowns and cannot be estimated using the observed data alone. As a way forward, we let the value of $x_{it}$ be the observed value (denoted by a capital letter) for individual $i$ at time $t$. We can then link this equation to our data by the consistency assumption:
$$T_i^{\bar{0}} = \int_0^{T_i} \exp\{\psi A_{it}\}\,dt \qquad (7.6)$$
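To make the integral in Equation 7.6 concrete, the sketch below evaluates it for one individual at a fixed value of $\psi$. It assumes, purely for illustration, that the observed exposure metric $A_{it}$ is recorded as a piecewise-constant function over consecutive follow-up intervals, so the integral reduces to a sum. The function name, the interval lengths, and the exposure values are hypothetical.

```python
import math


def unexposed_time(interval_lengths, exposures, psi):
    """Evaluate integral_0^T exp(psi * A_t) dt for piecewise-constant A_t.

    interval_lengths: duration of each consecutive follow-up interval
    exposures: value of the exposure metric A_t within each interval
    """
    if len(interval_lengths) != len(exposures):
        raise ValueError("one exposure value is required per interval")
    return sum(d * math.exp(psi * a) for d, a in zip(interval_lengths, exposures))


# Hypothetical person followed for 3.5 years (assumed values, not cohort data).
lengths = [1.0, 1.0, 1.0, 0.5]
exposure = [0.0, 0.5, 1.0, 1.0]
observed_time = sum(lengths)

t0 = unexposed_time(lengths, exposure, psi=0.6)
print(observed_time, t0)
```

Two checks follow directly from the equation: with $\psi = 0$, or with exposure equal to zero throughout, the integrand is 1 and the computed value equals the observed time $T_i$.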
Equation 7.6 has two unknowns: $T_i^{\bar{0}}$ and $\psi$. If we let $\tilde{\psi}$ denote the set of plausible $\psi$ values (e.g., 0 to 3 by 0.05), we can solve for the potential outcome under no exposure for each value in the set of plausible $\psi$ values, which we denote $T_i^{\bar{0}}(\tilde{\psi})$ to make the $\tilde{\psi}$-dependence clear:
$$T_i^{\bar{0}}(\tilde{\psi}) = \int_0^{T_i} \exp\{\tilde{\psi} A_{it}\}\,dt \qquad (7.7)$$
Equation 7.7 now has only one unknown: the value of $\tilde{\psi}$ that is an estimate of our causal parameter $\psi$. We can obtain an estimate of this parameter using g-estimation and the assumption of no unmeasured confounding, as detailed in Chapter 9.
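The grid of candidate values described for Equation 7.7 can be sketched as follows. This fragment only computes $T_i^{\bar{0}}(\tilde{\psi})$ for each candidate value for one hypothetical individual; it does not perform the g-estimation step that selects among them. The piecewise-constant exposure history, the helper names, and the data are assumptions made for illustration.

```python
import math


def unexposed_time(interval_lengths, exposures, psi):
    """integral_0^T exp(psi * A_t) dt for piecewise-constant A_t."""
    return sum(d * math.exp(psi * a) for d, a in zip(interval_lengths, exposures))


def psi_grid(start, stop, step):
    """Evenly spaced candidate values, inclusive of both endpoints."""
    n_steps = round((stop - start) / step)
    # Rounding avoids floating-point drift such as 0.15000000000000002.
    return [round(start + k * step, 10) for k in range(n_steps + 1)]


# Hypothetical exposure history (assumed values, not cohort data).
lengths = [1.0, 1.0, 1.0, 0.5]
exposure = [0.0, 0.5, 1.0, 1.0]

grid = psi_grid(0.0, 3.0, 0.05)
t0_by_psi = {psi: unexposed_time(lengths, exposure, psi) for psi in grid}

for psi in grid[:3]:
    print(psi, t0_by_psi[psi])
```

In an analysis with many individuals, the same computation would be repeated per person, giving one column of candidate potential outcomes for each value in the grid.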
Structural nested AFT models such as the one in Equation 7.7 are essentially mathematical equations that relate survival time outcomes that would have been observed under different exposure histories. This relation is depicted in Figure 7.1, which was generated using the data from an individual in the South Carolina Chrysotile Asbestos Cohort and a fixed value of $\psi = 0.6$. In this figure, $T^{A} = T$ represents the observed outcome. The area under this curve, computed using Equation 7.7, is equivalent to the area under the straight horizontal line at $\exp(a) = 1$, which represents the potential outcome that would have been observed under no exposure. The third line ends at $T^{a}$, which represents the time that would have been observed under a unit increase in the observed exposure over all $t$. Note that the area under each of the three curves is equivalent.
Figure 7.1: Illustration of the relation between exposure and potential outcomes defined using structural nested models.
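The equal-area relation described for Figure 7.1 can be illustrated numerically. The sketch below assumes a piecewise-constant exposure history and a fixed $\psi = 0.6$, computes the area under the observed-exposure curve, and then finds the time at which the curve for a unit increase in exposure accumulates the same area. The exposure values and function names are hypothetical; the cohort data behind the figure are not reproduced here.

```python
import math


def area(lengths, exposures, psi, shift=0.0):
    """Area under exp(psi * (A_t + shift)) over the whole exposure history."""
    return sum(d * math.exp(psi * (a + shift)) for d, a in zip(lengths, exposures))


def time_reaching_area(lengths, exposures, psi, shift, target):
    """Time at which the curve exp(psi * (A_t + shift)) has accumulated `target` area."""
    elapsed, remaining = 0.0, target
    for d, a in zip(lengths, exposures):
        rate = math.exp(psi * (a + shift))
        # Small tolerance guards against floating-point error at the final interval.
        if rate * d >= remaining - 1e-12:
            return elapsed + remaining / rate
        remaining -= rate * d
        elapsed += d
    raise ValueError("exposure history is too short to reach the target area")


# Hypothetical exposure history (assumed values, not cohort data).
lengths = [1.0, 1.0, 1.0, 0.5]
exposure = [0.0, 0.5, 1.0, 1.0]
psi = 0.6

t_obs = sum(lengths)                    # observed failure time
t_0 = area(lengths, exposure, psi)      # potential time under no exposure
t_a = time_reaching_area(lengths, exposure, psi, shift=1.0, target=t_0)
print(t_a, t_obs, t_0)
```

With a positive $\psi$ and non-negative exposure, the three times are ordered as `t_a < t_obs < t_0`, while the area under each corresponding curve is the same by construction.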