3.6 Discussion
4.2.1 Univariable Mendelian randomization
In a univariable Mendelian randomization analysis, each genetic variant must satisfy the IV assumptions outlined in Section 1.4. Under linearity assumptions, the association between the genetic variant and the outcome can be decomposed into an indirect effect via the risk factor and a direct effect:
βY j = αj + θβXj,
where θ is the causal effect of the risk factor on the outcome. Genetic variant j is pleiotropic if αj ̸= 0, and αj is the direct effect of the genetic variant on the outcome.
Figure 4.1 contains a direct effect αj via an independent pathway, which violates the
IV3 assumption.
The causal effect of the risk factor on the outcome can be estimated using a weighted linear regression of the genetic association estimates [45], with the inverse-variance as weights (se( ˆβYj)
−2) and the intercept set to zero (the IVW method):
ˆβYj = θU IˆβXj + ϵU Ij, ϵU Ij ∼ N(0, φ
2
U Ise( ˆβYj)
2) , (4.1)
where θU I is the IVW estimate, ϵU Ij represents the error term, and φU I represents the
residual standard error. Under a multiplicative random-effects model (see Section 2.3.1 for more detail), the residual standard error (φU I) can be greater than one, allowing
𝐺" 𝛽 𝑋 𝑌
&' 𝜃
𝑈
𝛼"
Fig. 4.1Directed acyclic graph illustrating univariable Mendelian randomization assumptions
with potential violation of IV3 by a pleiotropic effect indicated by a dotted line. The genetic effect of Gj on X is βXj, the direct (pleiotropic) effect of Gj on Y via an independent pathway
is αj (representing the potential violation of the IV3 assumption), and the causal effect of
the risk factor X on the outcome Y is θ. U represents the set of unmeasured variables that confound the association between X and Y .
random-effect models will be the same, but the standard error of the causal effect from the multiplicative random-effects model will be larger if there is heterogeneity between the causal estimates. Throughout this Chapter, we apply a multiplicative random-effects model to all the analyses.
The MR-Egger estimate is obtained using the same regression model as Equation 4.1, but allowing the intercept to be estimated [29]:
ˆβYj = θ0UE+ θU EˆβXj+ ϵU Ej, ϵU Ej ∼ N(0, φ
2
U Ese( ˆβYj)
2) ,
where θU E is the MR-Egger estimate, ϵU Ej represents the error term, and φU E represents
the residual standard error under the univariable MR-Egger model. If the genetic variants are not pleiotropic, then the intercept term should tend to zero as the sample size increases, and the MR-Egger estimate (ˆθU E) and the IVW estimate (ˆθU I) are both
consistent estimates of the causal effect. If the genetic variants are pleiotropic, and the InSIDE assumption holds, then the MR-Egger estimate will be a consistent estimate of θ [29, 64].
Under the InSIDE assumption, the intercept term ˆθ0UE can be interpreted as
an estimate of the average direct effect of the genetic variants [52]. If the average direct effect is zero (balanced pleiotropy), and the InSIDE assumption is satisfied, the intercept term should tend to zero as the sample size increases, and the MR-Egger estimate (ˆθU E) and the IVW estimate (ˆθU I) are both consistent estimates of the causal
4.2 Methods 93
violated or the average direct effect differs from zero (directional pleiotropy); this is a test of the validity of the IV assumptions (the MR-Egger intercept test).
4.2.2
Multivariable Mendelian randomization
In a multivariable Mendelian randomization analysis, each genetic variant must satisfy the IV assumptions outlined in Section 2.6.3. Now, the association of the genetic variants with the outcome can be decomposed into indirect effects via each of the risk factors and a residual direct effect α′
j. Assuming there are three risk factors and all
relationships are linear:
βYj = α
′
j + θ1βX1j + θ2βX2j + θ3βX3j, (4.2)
where θk is the causal effect of the risk factor k on the outcome (Figure 4.2). We
assume that the risk factors do not have causal effects on each other; we later relax this assumption and allow for causal effects between the risk factors.
𝑋" 𝐺$ 𝑋% 𝑌 𝑋' 𝛽)*+ 𝛽),+ 𝜃" 𝜃% 𝜃' 𝑈" 𝑈% 𝑈' 𝛽)/+ 𝛼′$
Fig. 4.2 Directed acyclic graph illustrating multivariable Mendelian randomization assump-
tions for a set of genetic variants Gj, three risk factors X1, X2 and X3, and outcome Y.
The genetic effect of Gj on Xk is βXkj, the direct (pleiotropic) effect of Gj on Y is α
′
j, and
the causal effect of the risk factor Xk on the outcome Y is θk. Uk represents the set of
unmeasured variables that confound the associations between Xk and Y .
Causal estimates of the effect of each risk factor on the outcome can be obtained using multivariable weighted linear regression of the genetic association estimates, with the intercept set to zero (multivariable IVW method) [115]:
ˆβYj = θ1MIˆβX1j + θ2MIˆβX2j + θ3MIˆβX3j + ϵM Ij, ϵM Ij ∼ N(0, φ
2
M Ise( ˆβYj)
where θM I are the multivariable IVW estimates, ϵM Ij represents the error term, and
φM I represents the residual standard error under the multivariable IVW model.
We propose the natural extension to multivariable MR-Egger using the same regression model, but allowing the intercept to be estimated:
ˆβYj = θ0ME+ θ1MEˆβX1j+ θ2MEˆβX2j+ θ3MEˆβX3j+ ϵM Ej, ϵM Ej ∼ N(0, φ
2
M Ese( ˆβYj)
4.2 Methods 95