Factorial framework - Robust methods in Mendelian randomization

Burgess and Thompson [28] suggested that assessing the causal effects of multiple risk factors in a single Mendelian randomization study, referred to as ‘multivariable Mendelian randomization’ [28], is analogous to performing a factorial RCT (Figure 5.1). Since multivariable Mendelian randomization assumes that there are no interactions between the risk factors, it would be more accurate to compare this study design to a factorial RCT when the analysis is performed ‘at the margins’ (Section 5.2.1).

Fig. 5.1 Figure taken from the paper by Burgess and Thompson [28] comparing a factorial randomized clinical trial to a multivariable Mendelian randomization study.

5.3.2 Genetic variants acting as proxies for pharmacological interventions

The concept of using genetic variants as proxies for pharmacological interventions to identify interaction effects between drug treatments was first introduced by Ference et al. [33], and the method has been used in additional studies to assess interactions between drug treatments [34, 35]. Ference et al. [33] refer to this type of study design as a ‘2 × 2 factorial Mendelian randomization study’. The authors make no reference to the concept of ‘factorial Mendelian randomization’ as described by Davey Smith and Ebrahim [15].

Ference et al. [33] outlined their approach of performing a 2 × 2 factorial Mendelian randomization study by comparing the effect of lowering low density lipoprotein cholesterol (LDL-C) levels on the risk of coronary heart disease (CHD) by inhibiting the NPC1L1 gene with ezetimibe, or inhibiting the HMGCR gene with statins, or through a combination of both. Genetic variants associated with LDL-C levels in either gene region were identified, and two externally weighted gene scores were calculated, one for each gene region, where the reference allele for each variant was the LDL-C lowering allele. To mimic a 2 × 2 factorial RCT, the two gene scores were dichotomized to create a 2 × 2 contingency table (Table 5.2). The gene scores were dichotomized at their median values to ensure the numbers of participants were balanced across the four groups in Table 5.2, where:

• n00 represents the reference group, which was considered to be equivalent to receiving no treatment,
• n10 is the group of participants with lower LDL-C mediated by NPC1L1, which was considered to be equivalent to receiving the ezetimibe treatment only,
• n01 is the group of participants with lower LDL-C mediated by HMGCR, which was considered to be equivalent to receiving the statin treatment only, and
• n11 is the group of participants with lower LDL-C mediated by NPC1L1 and HMGCR, which was considered to be equivalent to receiving both ezetimibe and statin treatments.
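The construction of the four groups can be sketched as follows. This is a minimal illustration rather than the code used by Ference et al. [33]: the function names, genotypes and weights are all hypothetical, and scores equal to the median are placed in the ‘≤ median’ group to match the column and row labels of Table 5.2.

```python
from statistics import median


def weighted_gene_score(allele_counts, weights):
    """Sum of LDL-C lowering allele counts, each multiplied by its external weight."""
    return sum(w * g for w, g in zip(weights, allele_counts))


def dichotomize_at_median(scores):
    """Return 1 where a score is above the sample median, otherwise 0."""
    med = median(scores)
    return [1 if s > med else 0 for s in scores]


def factorial_groups(score_a, score_b):
    """Label each participant '00', '10', '01' or '11' (first digit: NPC1L1 score)."""
    high_a = dichotomize_at_median(score_a)
    high_b = dichotomize_at_median(score_b)
    return [f"{a}{b}" for a, b in zip(high_a, high_b)]


# Hypothetical toy data: 6 participants, 2 variants in each gene region.
npc1l1_genotypes = [(0, 1), (2, 1), (1, 0), (2, 2), (0, 0), (1, 2)]
hmgcr_genotypes = [(1, 1), (0, 0), (2, 1), (2, 2), (0, 1), (1, 0)]
npc1l1_weights = [0.5, 1.0]  # made-up per-allele effects on LDL-C
hmgcr_weights = [1.0, 2.0]   # made-up per-allele effects on LDL-C

gs_a = [weighted_gene_score(g, npc1l1_weights) for g in npc1l1_genotypes]
gs_b = [weighted_gene_score(g, hmgcr_weights) for g in hmgcr_genotypes]
groups = factorial_groups(gs_a, gs_b)
print(groups)
```

With an even number of participants and no ties at the median, each score splits the sample exactly in half, although the four cells are only equally sized if the two scores are independent.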

The authors performed an ‘inside the table’ analysis (Section 5.2.1) by fitting separate logistic regression models to each subgroup, with the n00 participants used as the reference group for each model. By comparing the three OR estimates from the separate logistic regression models, the authors concluded that there was no evidence of an interaction effect between ezetimibe and statins, arguing that the effect of lowering LDL-C on the risk of CHD mediated by variants in NPC1L1, HMGCR or both, was approximately the same. No formal statistical testing was applied to the three OR estimates, and the authors did not attempt to estimate the interaction effect of lowering LDL-C levels by inhibiting the NPC1L1 gene and inhibiting the HMGCR gene on the risk of CHD.
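To illustrate the ‘inside the table’ comparison, the sketch below computes the three ORs relative to the n00 group. It assumes unadjusted models: a logistic regression with a single binary group indicator gives the same OR as the cross-product ratio of the corresponding 2 × 2 table of group by outcome, so no model-fitting routine is needed. The case and control counts are hypothetical, and any covariate adjustment in the original analysis is not reproduced.

```python
import math


def odds_ratio(cases, controls, ref_cases, ref_controls):
    """Odds ratio versus the reference group, with a Wald 95% confidence interval."""
    log_or = math.log((cases * ref_controls) / (controls * ref_cases))
    se = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(log_or - 1.96 * se)
    upper = math.exp(log_or + 1.96 * se)
    return math.exp(log_or), lower, upper


# Hypothetical (cases, controls) counts for the four cells of the 2 x 2 table.
table = {"00": (200, 800), "10": (180, 820), "01": (185, 815), "11": (165, 835)}

ref_cases, ref_controls = table["00"]
results = {
    group: odds_ratio(*table[group], ref_cases, ref_controls)
    for group in ("10", "01", "11")
}
for group, (estimate, lower, upper) in results.items():
    print(f"n{group} vs n00: OR = {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Comparing these three intervals by eye is what the approach described above amounts to; it yields neither an estimate of the interaction nor a test for it.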

Table 5.2 Contingency table created by Ference et al. [33] to compare the effect of lowering low density lipoprotein levels on the risk of coronary heart disease by inhibiting the NPC1L1 gene with ezetimibe, either alone, or in combination with a statin that inhibits the HMGCR gene.

                             Gene score for NPC1L1, GSA
                             ≤ med(GSA)     > med(GSA)
Gene score for  ≤ med(GSB)   n00            n10
HMGCR, GSB      > med(GSB)   n01            n11


5.3.3 Effect of obesity and alcohol consumption on the risk of liver disease

Although the method proposed by Ference et al. [33] (Section 5.3.2) is specifically designed for considering drug interactions, Carter et al. [136] have applied the method to investigate the interaction effect of obesity and alcohol consumption on liver disease using data from the Copenhagen General Population Study. As highlighted in Section 5.3.1, the idea of using Mendelian randomization to consider this research question was initially proposed by Davey Smith and Ebrahim [15]. Since Carter et al. [136] used the method proposed by Ference et al. [33], they were unable to estimate the interaction effect of obesity and alcohol consumption on liver disease. Carter et al. [136] may have followed the approach outlined by Ference et al. [33] as the current Mendelian randomization literature does not cover the estimation of interaction effects between risk factors.

In their study, Carter et al. [136] created a weighted gene score for BMI using five genetic variants, and classified participants as having a ‘low BMI’ if their gene score was less than or equal to the median value of the weighted gene score, or a ‘high BMI’ if it was greater than the median. The rs1229984 variant in the ADH1B gene region was used as a proxy for alcohol consumption. Participants were classified as having a ‘low alcohol consumption’ if they were homozygous or heterozygous for the alcohol-decreasing allele, or a ‘high alcohol consumption’ if they were homozygous for the alcohol-increasing allele. Following the method proposed by Ference et al. [33], these classifications were used to create four subgroups of participants: 1) low BMI, low alcohol consumption; 2) low BMI, high alcohol consumption; 3) high BMI, low alcohol consumption; and 4) high BMI, high alcohol consumption.

Carter et al. [136] used two plasma biomarkers of liver injury and incident cases of liver disease from hospital records as the outcome measurements. Using the participants with a high BMI and high alcohol consumption as the reference group, three separate regression models were fitted to each outcome measurement to estimate the mean differences (two plasma measurements) or ORs (incident liver disease) for the three remaining groups of participants. The estimates and 95% confidence intervals from these three models were compared to consider the direction and overall patterns of association for each outcome measurement. The authors did not estimate the interaction effect between BMI and alcohol consumption on liver disease.

5.3.4 Requirement for further methodological research

We now review the literature on estimating interaction effects in Mendelian randomization, and outline the new material considered in this Chapter. Since methodological developments in Mendelian randomization are heavily linked to the IV literature, relevant IV methods that estimate interaction effects will be reviewed first.

Instrumental variable analyses

There has been little method development in the IV literature on estimating interaction effects using observational data. However, interactions can be estimated from observational data using TSLS regression as described in Section 5.2.2.
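Section 5.2.2 is not reproduced here, so the following is only a sketch of one standard way to set up TSLS with an interaction term, not necessarily the formulation used there. It assumes a linear model with two continuous exposures, one instrument per exposure, and the product of the instruments as the instrument for the product of the exposures. All function names, variable names and simulation parameters are hypothetical, and the example is written in plain Python to stay self-contained.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients of y on the given columns (intercept added first)."""
    design = [[1.0] * len(y)] + [list(c) for c in columns]
    xtx = [[sum(p * q for p, q in zip(u, v)) for v in design] for u in design]
    xty = [sum(p * q for p, q in zip(u, y)) for u in design]
    return solve(xtx, xty)


def fitted(columns, coef):
    """Fitted values from coefficients returned by ols()."""
    n = len(columns[0])
    return [coef[0] + sum(b * c[i] for b, c in zip(coef[1:], columns)) for i in range(n)]


def tsls_interaction(z1, z2, x1, x2, y):
    """TSLS for y ~ x1 + x2 + x1*x2 using z1, z2 and z1*z2 as instruments."""
    instruments = [z1, z2, [a * b for a, b in zip(z1, z2)]]
    exposures = [x1, x2, [a * b for a, b in zip(x1, x2)]]
    # First stage: regress each exposure term, including the product, on all instruments.
    first_stage = [fitted(instruments, ols(instruments, x)) for x in exposures]
    # Second stage: regress the outcome on the first-stage fitted values.
    return ols(first_stage, y)


# Simulated data with an unmeasured confounder u (all parameter values are made up).
rng = random.Random(1)
n = 2000
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x1 = [1.0 + 0.8 * a + c + rng.gauss(0, 1) for a, c in zip(z1, u)]
x2 = [1.0 + 0.8 * b + c + rng.gauss(0, 1) for b, c in zip(z2, u)]
y = [0.5 * p + 0.3 * q + 0.2 * p * q + c + rng.gauss(0, 1) for p, q, c in zip(x1, x2, u)]

intercept, b1, b2, b3 = tsls_interaction(z1, z2, x1, x2, y)
print(f"main effects: {b1:.3f}, {b2:.3f}; interaction: {b3:.3f}")
```

The second-stage standard errors from a manual two-step fit like this one are not valid as they stand, so a dedicated IV routine would be needed for inference; the sketch is intended only to show how the product terms enter each stage.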

Whilst there has been a substantial amount of method development in the IV literature on non-compliance in RCTs with a single treatment, there is little guidance on how IVs can be used when there is non-compliance in studies where the interaction between treatments is of primary concern [137]. Since non-compliance to randomization in a 2 × 2 factorial RCT can lead to artificial interaction effects, reduced statistical power, and biased estimates under an ITT analysis [138], developing methodology that can be used in a 2 × 2 factorial RCT would be beneficial. Blackwell [137], recognising this gap in the literature, has used a compliance framework to provide non-parametric estimators of the local average interaction effect (LAIE) and the local average conditional effects (LACEs) for a 2 × 2 factorial study design. The estimators of the LAIE and LACEs (local average treatment effect for one treatment with the other treatment fixed) require estimates of the compliance probabilities for the different treatment groups.

The application of Blackwell’s method to Mendelian randomization is limited. To estimate the LAIE under Blackwell’s [137] method, estimates of the compliance probabilities are required. Whilst it would be possible for these probabilities to be estimated in a 2 × 2 factorial RCT, it is difficult to conceive how these probabilities could be obtained in a 2×2 factorial Mendelian randomization study. The applicability of Blackwell’s method to a Mendelian randomization study is further restricted by the method only considering binary treatments, with each treatment having its own binary IV. Most Mendelian randomization studies use multiple variants as IVs, and the risk factors are typically continuous [139].

Mendelian randomization

As highlighted in Section 5.3.1, the first paper to introduce the idea of estimating interaction effects between risk factors in Mendelian randomization was written by Davey Smith and Ebrahim in 2003. There have been no methodological developments in the Mendelian randomization literature on estimating interaction effects since this paper was published. We will address this gap in the literature by extending the multivariable Mendelian randomization method to the factorial setting by estimating the interaction term between two risk factors. This extension will be considered for both individual and summary level data. We will also investigate whether the definition of a ‘strong instrument’ for multivariable Mendelian randomization (Section 2.6.3) is applicable to the factorial setting. Finally, the suitability of using the method proposed by Ference et al. [33] to investigate the interaction effect between two risk factors, as done by Carter et al. [136], will be investigated.

The application of a factorial framework to a Mendelian randomization study when the genetic variants are used as proxies for drug treatments has only been considered in the context of applied projects [33–35]. There are several methodological issues relating to the implementation of this method that will be addressed in this Chapter. Rather than performing an ‘inside the table’ analysis and comparing the estimates from the separate regression models with no statistical testing for an interaction effect, we will fit a single multivariable regression model with an interaction term between the gene scores. Although the interpretability of the interaction effect from this model would be limited, as it represents the effect of the genetic variants on the outcome [33], rather than the effect of the treatments on the outcome, the estimate could be used to assess whether there is evidence of an interaction effect. Instead of dichotomizing the gene scores at their median values, the impact of treating the gene scores as continuous variables to increase statistical power will also be investigated. Whether the gene scores are dichotomized or not, the distribution of the gene scores will affect the power of the study to detect the interaction term, and we will consider this in detail. For example, if the genetic variants are rare, we would expect the distribution of the gene scores to be skewed, and the numbers of participants in the 2 × 2 contingency table to be unbalanced, resulting in reduced statistical power.
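As a sketch of what such a single model could look like for a binary outcome such as CHD, consider a logistic regression on the dichotomized gene scores. The notation is an assumption introduced for illustration: D_A and D_B are indicators that GSA and GSB exceed their medians, and the coefficients β0 to β3 are not symbols used elsewhere in this Chapter.

```latex
\begin{align*}
\operatorname{logit} P(Y_i = 1)
  &= \beta_0 + \beta_1 D_{A,i} + \beta_2 D_{B,i} + \beta_3 D_{A,i} D_{B,i}, \\
\mathrm{OR}_{10} &= e^{\beta_1}, \qquad
\mathrm{OR}_{01} = e^{\beta_2}, \qquad
\mathrm{OR}_{11} = e^{\beta_1 + \beta_2 + \beta_3}, \\
e^{\beta_3} &= \frac{\mathrm{OR}_{11}}{\mathrm{OR}_{10}\,\mathrm{OR}_{01}}.
\end{align*}
```

Under these assumptions the three ORs from the ‘inside the table’ analysis are recovered from the same model, and a test of the null hypothesis β3 = 0 gives a formal test for interaction on the multiplicative (log-odds) scale. Treating the gene scores as continuous would correspond to replacing D_A and D_B with GSA and GSB in the first line.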

Throughout this Chapter, we will continue to make a distinction between a factorial Mendelian randomization study that uses genetic variants as either: a) predictors of the risk factors to estimate the interaction effect between two risk factors on an outcome; or b) proxies for pharmacological interventions to detect interaction effects between drug treatments. For scenario a), we will use the genetic variants as IVs to estimate the interaction effect between two risk factors, and for scenario b), we will use the genetic variants as proxies for drug treatments to test for an interaction effect between the variants on the outcome. We class these two study types as ‘factorial Mendelian randomization’ as they both consider the detection and/or estimation of an interaction effect. Since the primary aim of the analysis and role of the genetic variants differ between the two scenarios, the methodological developments for these study types will be considered separately.

5.3.5 Summary

In this Section, we have reviewed the literature on factorial Mendelian randomization, and have found that the study design has been considered under two broad scenarios: a) the genetic variants are used as predictors of the risk factors; and b) the genetic variants are used as proxies for pharmacological interventions. However, there remain significant gaps in the literature on this topic. The work presented in this Chapter will contribute to the factorial Mendelian randomization literature by investigating how interaction effects can be estimated under scenario a), and providing a more formal framework for detecting interaction effects under scenario b).
