2.1 Notation and Models
2.1.2 Subgroup Effects in an Observational Setting
We now consider the models in the context of an observational study. As noted before, we use Wi to denote the treatment or exposure variable for subject i in an observational
setting, instead of Xi. We make the distinction between the two variables because their
probability distributions are different. In an observational setting, we assume that the treatment variable, Wi, is not independent of auxiliary variables Z1i and Z2i, as treatment
(or exposure) selection is often dependent upon subject characteristics; for simplicity, we continue to assume that the auxiliary variables are scalars. We make the assumption that Wi is independent of subgroup variable Si conditional on Z1i and Z2i. This assumption is
appropriate in settings where, for example, the subgroup variable Si denotes the presence
or absence of some genetic factor, which is information that may not be readily available to the treating physician and hence could not directly affect treatment decisions.
Suppose we make the same conditional independence assumptions as in the randomized setting, with the exception of the assumption for the treatment variable:
B.1 Yi ⊥ Z2i| (Wi, Si, Z1i)
B.2 Wi ⊥ Si| (Z1i, Z2i) (observational setting)
B.3 Z1i ⊥ Z2i .
See Figure 2.2 for a causal DAG of the variables in the observational study setting. The assumptions above allow us to factor the joint probability of Yi, Wi, Si, Z1i, Z2i for
subject i as
P (Yi, Wi, Si, Z1i, Z2i)
= P (Yi|Wi, Si, Z1i, Z2i)P (Wi|Si, Z1i, Z2i)P (Si|Z1i, Z2i)P (Z1i, Z2i)
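As a worked step, the chain-rule factorization above can be reduced term by term: B.1 removes Z2i from the first factor, B.2 removes Si from the second, and B.3 splits the last factor. The following is a sketch that uses only assumptions B.1 to B.3 as stated.

```latex
\begin{align*}
P(Y_i, W_i, S_i, Z_{1i}, Z_{2i})
  &= P(Y_i \mid W_i, S_i, Z_{1i}, Z_{2i})\, P(W_i \mid S_i, Z_{1i}, Z_{2i})\,
     P(S_i \mid Z_{1i}, Z_{2i})\, P(Z_{1i}, Z_{2i}) \\
  &= P(Y_i \mid W_i, S_i, Z_{1i})\, P(W_i \mid Z_{1i}, Z_{2i})\,
     P(S_i \mid Z_{1i}, Z_{2i})\, P(Z_{1i})\, P(Z_{2i}) .
\end{align*}
```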
Figure 2.2: Simple directed acyclic graph for treatment variable (or exposure variable) W , response Y , and auxiliary variables Z1 and Z2, in the context of an observational study.
Variable Z1 is directly associated with response, whereas Z2 is not directly associated with
response.
We can re-write the logistic regression model for the response using the notation for the treatment variable in an observational setting: µi(ϑ) = E(Yi|Wi, Si, Z1i) = P (Yi =
1|Wi, Si, Z1i; ϑ). As before, we consider a logistic regression model for the response whereby
logit(µi(ϑ)) = ϑ0+ ϑ1Wi+ ϑ2Si+ ϑ3WiSi+ ϑ4WiZ1i , (2.7)
where ϑ = (ϑ0, ϑ1, ϑ2, ϑ3, ϑ4)T is a vector of regression coefficients. We further assume that
the probability distribution of the response stays the same if we condition on the same observed values of exposure, subgroup variable and potential confounder. In other words, the conditional odds ratios of treatment are the same whether we are in a randomized setting or in an observational setting, as discussed in Chapter 1. This can be written as
P (Yi = 1|Xi = w, Si, Z1i; ϑ) = P (Yi = 1|Wi = w, Si, Z1i; ϑ) . (2.8)
The parametric model forms of Si, Z1i and Z2i are the same as in the randomized
setting. Since the treatment selection process for subject i depends on auxiliary variables Z1i and Z2i, we assume
πi(ξ1) = P (Wi = 1|Z1i, Z2i; ξ1) = expit(ξ10+ ξ11Z1i+ ξ12Z2i)
where ξ1 = (ξ10, ξ11, ξ12)T is a vector of regression parameters. Here, πi(ξ1) is the propensity score.
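To make the treatment selection model concrete, the following minimal sketch evaluates the propensity score and draws a treatment indicator from it. The parameter values in `xi_example`, the seed, and the helper names `propensity` and `draw_treatment` are hypothetical choices for illustration, not quantities from this chapter.

```python
import math
import random


def expit(x):
    """Inverse logit: expit(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def propensity(z1, z2, xi):
    """pi_i(xi_1) = P(W_i = 1 | Z1i, Z2i) = expit(xi10 + xi11*Z1i + xi12*Z2i)."""
    xi10, xi11, xi12 = xi
    return expit(xi10 + xi11 * z1 + xi12 * z2)


def draw_treatment(z1, z2, xi, rng):
    """Draw W_i from a Bernoulli distribution with success probability pi_i."""
    return 1 if rng.random() < propensity(z1, z2, xi) else 0


# Hypothetical parameter values, chosen only for illustration.
xi_example = (-0.5, 1.0, 0.8)
rng = random.Random(2024)

for z1, z2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    pi = propensity(z1, z2, xi_example)
    w = draw_treatment(z1, z2, xi_example, rng)
    print(f"Z1={z1} Z2={z2} propensity={pi:.3f} drawn W={w}")
```

With positive ξ11 and ξ12, as assumed here, subjects with larger values of Z1i or Z2i are more likely to receive treatment, which is the source of confounding through Z1i.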
Next, we compute the marginal conditional distribution for response Y given only the treatment, subgroup variable, and their interaction, which is the model that we are interested in fitting. In the observational setting, we have
\[
P(Y_i = 1 \mid W_i, S_i) = E_{Z_{1i} \mid W_i, S_i}\{P(Y_i = 1 \mid W_i, S_i, Z_{1i}; \vartheta)\} = \sum_{z_1=0}^{1} P(Y_i = 1 \mid W_i, S_i, Z_{1i} = z_1; \vartheta)\, P(Z_{1i} = z_1 \mid W_i, S_i) . \tag{2.9}
\]
Comparing the forms of P (Y |W, S) from (2.9) and P (Y |X, S) in (2.5), we see that they are not equal in general.
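The difference can be checked numerically by enumeration. The sketch below takes model (2.7) as written and uses hypothetical parameter values. It further assumes that Z2i is binary and that P(Si = 1|Z1i, Z2i) is logistic, since the models for Si, Z1i and Z2i are not restated in this section. Under randomization the treatment is independent of (Si, Z1i, Z2i), so the randomized analogue averages over P(Z1i|Si) rather than P(Z1i|Wi, Si).

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical parameter values, chosen only for illustration.
theta = (-1.0, 0.5, 0.3, 0.4, 1.2)  # (theta0, ..., theta4) in model (2.7)
xi = (-0.5, 1.0, 0.8)               # propensity score parameters
p_z1, p_z2 = 0.4, 0.5               # P(Z1 = 1), P(Z2 = 1); binary Z2 is an assumption


def p_s(z1, z2):
    """Assumed subgroup model P(S = 1 | Z1, Z2); not specified in this section."""
    return expit(-0.3 + 0.7 * z1 + 0.5 * z2)


def p_w(z1, z2):
    """Propensity score P(W = 1 | Z1, Z2)."""
    return expit(xi[0] + xi[1] * z1 + xi[2] * z2)


def p_y(w, s, z1):
    """P(Y = 1 | W, S, Z1) from model (2.7) as written."""
    t0, t1, t2, t3, t4 = theta
    return expit(t0 + t1 * w + t2 * s + t3 * w * s + t4 * w * z1)


def bern(p, v):
    """Bernoulli probability mass at v in {0, 1}."""
    return p if v == 1 else 1.0 - p


def marginal_observational(w, s):
    """Equation (2.9): average over P(Z1 = z1 | W = w, S = s)."""
    mass = {
        z1: sum(
            bern(p_w(z1, z2), w) * bern(p_s(z1, z2), s)
            * bern(p_z1, z1) * bern(p_z2, z2)
            for z2 in (0, 1)
        )
        for z1 in (0, 1)
    }
    total = mass[0] + mass[1]
    return sum(p_y(w, s, z1) * mass[z1] / total for z1 in (0, 1))


def marginal_randomized(w, s):
    """Randomized analogue: average over P(Z1 = z1 | S = s)."""
    mass = {
        z1: sum(
            bern(p_s(z1, z2), s) * bern(p_z1, z1) * bern(p_z2, z2)
            for z2 in (0, 1)
        )
        for z1 in (0, 1)
    }
    total = mass[0] + mass[1]
    return sum(p_y(w, s, z1) * mass[z1] / total for z1 in (0, 1))


for w in (0, 1):
    for s in (0, 1):
        print(w, s, marginal_observational(w, s), marginal_randomized(w, s))
```

In this configuration the two marginal probabilities differ among the treated, because treated subjects are more likely to have Z1i = 1. They coincide among the untreated only because (2.7), as written, has no Z1i term when Wi = 0.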
Marginal regression parameters β from (2.4) can be estimated in an observational setting using the following weighted estimating equation
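\[
\tilde{U}_1(\beta; \xi_1) = \sum_{i=1}^{n} \tilde{U}_{1i}(\beta; \xi_1) \tag{2.10}
\]
(Robins et al., 2000), where
\[
\tilde{U}_{1i}(\beta; \xi_1) = \sum_{l=0}^{1} \frac{I(W_i = l)}{\pi_i(\xi_1)^{l}\,(1 - \pi_i(\xi_1))^{1-l}}\, D_i(\beta)\,[V_i(\beta)]^{-1}\,(Y_i - \mu_i(\beta)) \tag{2.11}
\]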
with Di(β) = ∂µi(β)/∂β and Vi(β) = var(Yi|Wi, Si) = µi(β)[1−µi(β)]. The tilde indicates that the estimating function is weighted, with weights that adjust for confounding. Let β̃ denote the solution to Ũ1(β; ξ1) = 0 for fixed ξ1.
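For intuition, the weighted estimating equation has a closed-form solution in one special case. The sketch below assumes that (2.4) is the saturated logistic model logit(µi(β)) = β0 + β1Wi + β2Si + β3WiSi with binary Wi and Si; the form of (2.4) is not restated in this section, so this is an assumption. For a logistic model, Di(β)[Vi(β)]−1 reduces to the covariate vector (1, Wi, Si, WiSi)T, and solving the weighted equation then amounts to matching inverse-probability-weighted proportions within each (Wi, Si) cell. The toy data and propensity values are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def ipw_weight(w, pi):
    """Weight in (2.11): 1/pi_i if W_i = 1 and 1/(1 - pi_i) if W_i = 0."""
    return 1.0 / pi if w == 1 else 1.0 / (1.0 - pi)


def solve_ipw_saturated(y, w, s, pi):
    """Solve the weighted estimating equation for beta in the assumed saturated model."""
    num = {(a, b): 0.0 for a in (0, 1) for b in (0, 1)}
    den = {(a, b): 0.0 for a in (0, 1) for b in (0, 1)}
    for yi, wi, si, p in zip(y, w, s, pi):
        wt = ipw_weight(wi, p)
        num[(wi, si)] += wt * yi
        den[(wi, si)] += wt
    # Logit of the weighted response proportion in each (W, S) cell.
    m = {cell: logit(num[cell] / den[cell]) for cell in num}
    b0 = m[(0, 0)]
    b1 = m[(1, 0)] - m[(0, 0)]
    b2 = m[(0, 1)] - m[(0, 0)]
    b3 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]
    return (b0, b1, b2, b3)


def estimating_function(beta, y, w, s, pi):
    """Sum over i of weight_i * x_i * (y_i - mu_i(beta)), with x_i = (1, W, S, WS)."""
    total = [0.0, 0.0, 0.0, 0.0]
    for yi, wi, si, p in zip(y, w, s, pi):
        x = (1.0, wi, si, wi * si)
        mu = expit(sum(b * xj for b, xj in zip(beta, x)))
        wt = ipw_weight(wi, p)
        for j in range(4):
            total[j] += wt * x[j] * (yi - mu)
    return total


# Hypothetical toy data: four subjects in each (W, S) cell.
y = [1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
w = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
s = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
pi = [0.3, 0.3, 0.6, 0.6, 0.3, 0.3, 0.6, 0.6,
      0.4, 0.4, 0.7, 0.7, 0.4, 0.4, 0.7, 0.7]

beta_hat = solve_ipw_saturated(y, w, s, pi)
print(beta_hat)
print(estimating_function(beta_hat, y, w, s, pi))
```

When the marginal model is not saturated, no such closed form is available and the equation would typically be solved iteratively, for example by Newton-Raphson.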
An auxiliary estimating function is required to estimate ξ1, and we specify this as
\[
U_2(\xi_1) = \sum_{i=1}^{n} U_{2i}(\xi_1), \qquad U_{2i}(\xi_1) = D_i(\xi_1)\,[V_i(\xi_1)]^{-1}\,(W_i - \pi_i(\xi_1)) ,
\]
where Di(ξ1) = ∂πi(ξ1)/∂ξ1 and Vi(ξ1) = var(Wi|Z1i, Z2i) = πi(ξ1)(1 − πi(ξ1)). Then let
\[
\tilde{U}_i(\gamma) = \begin{pmatrix} \tilde{U}_{1i}(\beta; \xi_1) \\ U_{2i}(\xi_1) \end{pmatrix} ,
\]
where γ = (βT, ξ1T)T. In practice, we use the data to obtain an estimate for ξ1, denoted by ξ̂1, and replace the weight in (2.11) with its estimated counterpart πi(ξ̂1) = P (Wi = 1|Z1i, Z2i; ξ̂1). Let γ̃ = (β̃T, ξ̂1T)T.
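The first stage of this procedure can be sketched as follows. For the logistic propensity model, U2i(ξ1) reduces to (1, Z1i, Z2i)T(Wi − πi(ξ1)), and the code solves U2(ξ1) = 0 by Newton-Raphson using only the standard library. The toy data, the iteration count and the helper names are hypothetical; in practice a standard logistic regression routine would serve the same purpose.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_propensity(w, z1, z2, iters=25):
    """Solve U_2(xi_1) = sum_i z_i (W_i - pi_i(xi_1)) = 0 by Newton-Raphson."""
    xi = [0.0, 0.0, 0.0]
    for _ in range(iters):
        score = [0.0] * 3
        info = [[0.0] * 3 for _ in range(3)]
        for wi, a, b in zip(w, z1, z2):
            z = (1.0, a, b)
            p = expit(sum(c * zj for c, zj in zip(xi, z)))
            for j in range(3):
                score[j] += z[j] * (wi - p)
                for k in range(3):
                    info[j][k] += p * (1.0 - p) * z[j] * z[k]
        step = solve_linear(info, score)
        xi = [c + d for c, d in zip(xi, step)]
    return xi


# Hypothetical toy data: (number of subjects, number treated) per (Z1, Z2) cell.
cells = {(0, 0): (10, 3), (0, 1): (10, 5), (1, 0): (10, 6), (1, 1): (10, 8)}
w, z1, z2 = [], [], []
for (a, b), (n, treated) in cells.items():
    for i in range(n):
        w.append(1 if i < treated else 0)
        z1.append(a)
        z2.append(b)

xi_hat = fit_propensity(w, z1, z2)
# Estimated propensity scores pi_i(xi_hat), used in place of pi_i(xi_1) in (2.11).
pi_hat = [expit(xi_hat[0] + xi_hat[1] * a + xi_hat[2] * b) for a, b in zip(z1, z2)]
print(xi_hat)
```

The estimated scores in `pi_hat` would then replace the true propensity scores when solving the weighted estimating equation for β.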
Under mild regularity conditions, and provided the propensity score model is not misspecified, γ̃ is consistent and asymptotically normal because the estimating functions are unbiased, i.e.
\[
\sqrt{n}\,(\tilde{\gamma} - \gamma) \xrightarrow{D} \mathrm{MVN}(0, \Sigma(\gamma)) .
\]
The asymptotic covariance matrix Σ(γ) takes the form
\[
\Sigma(\gamma) = I^{-1}(\gamma)\, C(\gamma)\, [I^{-1}(\gamma)]^{T} ,
\]
where I(γ) = E[−∂Ũi(γ)/∂γT] and C(γ) = E[Ũi(γ)Ũi(γ)T].
The above covariance matrix takes into account the estimation of ξ1. See Section 2.3.2 for details on estimation of Σ(γ) in a more general setting.
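As an illustration of how such a covariance matrix is commonly estimated, one generic plug-in (empirical sandwich) approach replaces the expectations with sample averages evaluated at γ̃. This is a sketch of a standard construction and is not necessarily the estimator developed in Section 2.3.2. The block form follows because U2i(ξ1) does not involve β; the block labels I11, I12 and I22 are notation introduced here.

```latex
\begin{align*}
\hat{I}(\tilde{\gamma}) &= -\frac{1}{n} \sum_{i=1}^{n}
  \left. \frac{\partial \tilde{U}_i(\gamma)}{\partial \gamma^{T}} \right|_{\gamma = \tilde{\gamma}} ,
\qquad
\hat{C}(\tilde{\gamma}) = \frac{1}{n} \sum_{i=1}^{n}
  \tilde{U}_i(\tilde{\gamma})\, \tilde{U}_i(\tilde{\gamma})^{T} , \\
\hat{\Sigma}(\tilde{\gamma}) &= \hat{I}^{-1}(\tilde{\gamma})\, \hat{C}(\tilde{\gamma})\,
  [\hat{I}^{-1}(\tilde{\gamma})]^{T} ,
\qquad
I(\gamma) = \begin{pmatrix} I_{11} & I_{12} \\ 0 & I_{22} \end{pmatrix} , \\
I_{11} &= E\!\left[ -\frac{\partial \tilde{U}_{1i}}{\partial \beta^{T}} \right] , \quad
I_{12} = E\!\left[ -\frac{\partial \tilde{U}_{1i}}{\partial \xi_1^{T}} \right] , \quad
I_{22} = E\!\left[ -\frac{\partial U_{2i}}{\partial \xi_1^{T}} \right] .
\end{align*}
```

The off-diagonal block I12 is the term through which the uncertainty in ξ̂1 enters the variance of β̃.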
The subgroup effects are estimated in the same way as in the randomized setting, which is described at the end of Section 2.1.1. We have discussed estimation of treatment effects in an observational setting with an important subgroup variable using a weighted estimating function approach. In the next section, we discuss estimation of β in the setting where the subgroup variable is not observed for some subjects.