Robust regression (MM–estimation) - Robust methods in Mendelian randomization

3.3 Methods

3.3.1 Robust regression (MM–estimation)

Before we introduce the robust regression method used in this Chapter, we first consider M– and S–estimators in relation to estimating the regression coefficients β in the linear regression model:

$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_m x_{im} + \epsilon_i = x_i^T \beta + \epsilon_i ,$$

where $i = 1, \ldots, N$, and the error term $\epsilon_i$ has an expected value of zero and scale $\sigma^2$.

An M–estimator minimises:

$$\sum_{i=1}^{N} \rho\left(\frac{y_i - x_i^T \beta}{\hat{\sigma}}\right) , \tag{3.2}$$

where $\rho$ is an objective function and $\hat{\sigma}$ is a scale estimate for the error term. M-estimates for $\beta$ are obtained by taking the partial derivatives of the objective function in Equation 3.2 with respect to $\beta$, setting them to zero, and solving the resulting system of equations:

$$\sum_{i=1}^{N} \psi\left(\frac{r_i}{\hat{\sigma}}\right) x_i = \sum_{i=1}^{N} \psi(u_i)\, x_i = 0 , \tag{3.3}$$

where $\psi$ is proportional to the derivative of $\rho$, $r_i = y_i - x_i^T \hat{\beta}$ is the residual, and $u_i = r_i / \hat{\sigma}$ is the scaled residual. By substituting the weighting function:

$$w(u) = \frac{\psi(u)}{u} , \tag{3.4}$$

into Equation 3.3, and applying an iteratively reweighted least squares (IRLS) algorithm, M-estimates $\hat{\beta}_M$ can be obtained. To reduce the impact outliers have on the scale estimate, the median absolute deviation of the residuals can be considered:

$$\hat{\sigma} = \frac{\mathrm{median}(|r_i|)}{0.6745} , \tag{3.5}$$

where $\mathrm{median}(|r_i|)$ is multiplied by $1/0.6745$ because the expected value of $\mathrm{median}(|r_i|)$ is approximately $0.6745\sigma$ if the residuals are normally distributed [90]. Note that $\hat{\sigma}$ is re-estimated at each iteration until the estimates $\hat{\beta}_M$ converge.
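As a rough illustration of the IRLS procedure described above, the following Python sketch fits a straight line $y = b_0 + b_1 x$ by M-estimation, re-estimating the scale with Equation 3.5 at each iteration. The function names (`tukey_weight`, `wls_line`, `m_estimate_line`), the OLS starting values, the convergence tolerance, the example data, and the choice of Tukey's bisquare weight (Equation 3.7, with $c = 4.685$) are all assumptions made for the example; it is not the implementation used in this Chapter.

```python
from statistics import median


def tukey_weight(u, c=4.685):
    """Tukey's bisquare weight: (1 - (u/c)^2)^2 for |u| < c, otherwise 0."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def wls_line(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def m_estimate_line(x, y, weight=tukey_weight, tol=1e-10, max_iter=200):
    """IRLS M-estimate of (b0, b1); the MAD scale is recomputed every iteration."""
    b0, b1 = wls_line(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(max_iter):
        r = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        sigma = median(abs(ri) for ri in r) / 0.6745  # Equation 3.5
        if sigma == 0.0:  # at least half the points fit exactly; stop here
            break
        w = [weight(ri / sigma) for ri in r]  # weights from scaled residuals
        new_b0, new_b1 = wls_line(x, y, w)
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Hypothetical data: y = 1 + 2x with small perturbations and one gross outlier.
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
x = [float(i) for i in range(1, 11)]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, noise)]
y[4] = 40.0  # outlier in the y direction at x = 5
print("OLS:", wls_line(x, y, [1.0] * len(x)))
print("M-estimate:", m_estimate_line(x, y))
```

In this example the single outlying y value pulls the OLS intercept well away from 1, whereas the M-estimate stays close to the line that generated the remaining points.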

If we assume that the error term is independently and normally distributed, $\epsilon_i \sim N(0, 1)$, and Equation 3.2 is set to $\sum_{i=1}^{N} r_i^2$, then the M-estimator is equivalent to the OLS estimator. However, we may want to use an objective function that is less sensitive to outliers, such as Tukey’s bisquare objective function:

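$$\rho(u_i) = \begin{cases} \dfrac{c^2}{6}\left(1 - \left(1 - \left(\dfrac{u_i}{c}\right)^2\right)^3\right) & \text{if } |u_i| < c \\[2ex] \dfrac{c^2}{6} & \text{if } |u_i| \geq c \end{cases} , \tag{3.6}$$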

with its weighting function:

$$w(u_i) = \begin{cases} \left(1 - \left(\dfrac{u_i}{c}\right)^2\right)^2 & \text{if } |u_i| < c \\[2ex] 0 & \text{if } |u_i| \geq c \end{cases} . \tag{3.7}$$
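The link between Equations 3.4, 3.6 and 3.7 can be made explicit with a short derivation. Taking $\psi$ to be exactly the derivative of $\rho$ (the constant of proportionality is assumed to be one here) and differentiating Equation 3.6 for $|u| < c$ gives:

$$\psi(u) = \frac{d\rho}{du} = \frac{c^2}{6} \cdot 3\left(1 - \left(\frac{u}{c}\right)^2\right)^2 \cdot \frac{2u}{c^2} = u\left(1 - \left(\frac{u}{c}\right)^2\right)^2 ,$$

so that, from Equation 3.4,

$$w(u) = \frac{\psi(u)}{u} = \left(1 - \left(\frac{u}{c}\right)^2\right)^2 .$$

For $|u| \geq c$ the objective function is constant, so $\psi(u) = 0$ and $w(u) = 0$, which recovers both branches of Equation 3.7.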

From Equation 3.7, the weight of an observation decreases as $u_i$ tends away from zero, and when $|u_i| \geq c$ the observation will have zero weight. The value of the tuning parameter $c$ determines the relative efficiency of the M–estimator. For Tukey’s bisquare weighting function, a standard value of $c = 4.685$ is used to ensure the M-estimator has 95% asymptotic efficiency relative to the OLS estimate if the error term is normally distributed with an expected value of zero and constant variance [89]. Whilst this estimator may be less sensitive to outlying data points with respect to the y observations, it can be sensitive to leverage points (data points that are outlying with respect to the x observations).

An M–estimate of the scale of the error term is the value $\hat{\sigma}$ that solves:

$$\frac{1}{N} \sum_{i=1}^{N} \rho\left(\frac{y_i - x_i^T \beta}{\sigma}\right) = K ,$$

where $K$ is a tuning parameter, and $\rho$ is an objective function. S–estimates $\hat{\beta}_S$ are the values that minimise the M–estimate of scale $\hat{\sigma}_S$. Estimates of $\hat{\beta}_S$ and $\hat{\sigma}_S$ can be obtained by using the weighting function in Equation 3.4 and an IRLS algorithm. If Tukey’s bisquare objective function is used, the S-estimator will have an asymptotic breakdown point of 50% if $c = 1.548$ and $K = 0.5$ [91], where $\rho$ is taken to be rescaled to have a maximum of one (for $\rho$ as written in Equation 3.6, whose maximum is $c^2/6$, this corresponds to $K = c^2/12 \approx 0.2$). Whilst the S-estimator may be highly robust to outliers and leverage points, it can lack efficiency.
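To make the M–estimate of scale concrete, the sketch below solves the scale equation above for a fixed set of residuals. It is only an illustration under stated assumptions: it uses bisection rather than the IRLS approach mentioned in the text, the function names `tukey_rho` and `m_scale` are hypothetical, and $K$ is set to half the maximum of $\rho$ in Equation 3.6 (that is, $c^2/12$), which is the value that corresponds to $K = 0.5$ when $\rho$ is rescaled to have a maximum of one.

```python
def tukey_rho(u, c):
    """Tukey's bisquare objective function (Equation 3.6)."""
    if abs(u) >= c:
        return c ** 2 / 6.0
    return (c ** 2 / 6.0) * (1.0 - (1.0 - (u / c) ** 2) ** 3)


def m_scale(residuals, c=1.548, n_iter=200):
    """Return sigma solving mean(rho(r_i / sigma)) = K, found by bisection."""
    k = 0.5 * c ** 2 / 6.0  # assumption: K is half the maximum of rho
    n = len(residuals)

    def excess(sigma):
        # Non-increasing in sigma, because rho is non-decreasing in |u|.
        return sum(tukey_rho(r / sigma, c) for r in residuals) / n - k

    biggest = max(abs(r) for r in residuals)
    if biggest == 0.0:
        return 0.0
    lo, hi = 1e-12 * biggest, 10.0 * biggest / c  # excess(hi) is negative
    if excess(lo) <= 0.0:  # at least half the residuals are (near) zero
        return 0.0
    for _ in range(n_iter):
        mid = (lo * hi) ** 0.5  # bisect on the log scale
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5


# Hypothetical residuals, one of which is a gross outlier.
residuals = [0.5, -1.2, 0.3, 2.0, -0.7, 0.1, -0.4, 9.0]
print("M-estimate of scale:", m_scale(residuals))
```

Because $\rho$ is bounded, the outlying residual contributes at most $c^2/6$ to the average, however large it is, which is what limits its influence on the scale estimate. An S–estimator would then search over $\beta$ for the residuals that make this scale as small as possible.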

To overcome some of the disadvantages of using the M– and S–estimators, MM–estimation was proposed by Yohai [92], and consists of the following three stages:

1. The initial estimates $\hat{\beta}_S$ are obtained from an S-estimator with a high breakdown point and objective function $\rho_1$.

2. Using the residuals $r_i = y_i - x_i^T \hat{\beta}_S$ from the stage above, an M-estimate of scale $\hat{\sigma}_S$ is calculated.

3. M–estimates $\hat{\beta}$ are obtained using the M-estimate of scale $\hat{\sigma}_S$ from the second stage and the objective function $\rho_2$. Note that the M-estimate of scale $\hat{\sigma}_S$ is fixed for each iteration, i.e. it is not re-estimated using Equation 3.5. The estimates $\hat{\beta}$ from this stage represent the MM-estimates.

By using an S-estimator in the first stage, and an M-estimator in the third stage, the MM-estimator should be efficient and have a high breakdown point.
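The three stages can be sketched for the simplest case of a regression through the origin, $y_i = \beta x_i + \epsilon_i$. This is a minimal illustration rather than the algorithm of Yohai [92]: the first stage is replaced by a crude search over the slopes that pass through each data point, the scale equation is solved by bisection with $K$ set to half the maximum of $\rho$, and all function names, tolerances and data are hypothetical.

```python
def rho(u, c):
    """Tukey's bisquare objective function (Equation 3.6)."""
    if abs(u) >= c:
        return c ** 2 / 6.0
    return (c ** 2 / 6.0) * (1.0 - (1.0 - (u / c) ** 2) ** 3)


def weight(u, c):
    """Tukey's bisquare weight (Equation 3.7)."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def m_scale(res, c, n_iter=200):
    """M-estimate of scale by bisection, with K set to half the maximum of rho."""
    k = 0.5 * c ** 2 / 6.0
    biggest = max(abs(r) for r in res)
    if biggest == 0.0:
        return 0.0

    def excess(sigma):
        return sum(rho(r / sigma, c) for r in res) / len(res) - k

    lo, hi = 1e-12 * biggest, 10.0 * biggest / c
    if excess(lo) <= 0.0:
        return 0.0
    for _ in range(n_iter):
        mid = (lo * hi) ** 0.5
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5


def mm_slope(x, y, c1=1.548, c2=4.685, tol=1e-12, max_iter=500):
    """Three-stage MM-estimate of beta in y = beta * x; returns (beta, sigma_s)."""

    def scale_of(b):
        return m_scale([yi - b * xi for xi, yi in zip(x, y)], c1)

    # Stage 1: crude S-estimate, searching only the slopes through each point.
    candidates = [yi / xi for xi, yi in zip(x, y) if xi != 0.0]
    beta_s = min(candidates, key=scale_of)
    # Stage 2: M-estimate of scale from the stage 1 residuals.
    sigma_s = scale_of(beta_s)
    if sigma_s == 0.0:  # degenerate exact fit; nothing left to refine
        return beta_s, sigma_s
    # Stage 3: IRLS with rho_2 (tuning constant c2), keeping sigma_s fixed.
    beta = beta_s
    for _ in range(max_iter):
        w = [weight((yi - beta * xi) / sigma_s, c2) for xi, yi in zip(x, y)]
        new_beta = (sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
                    / sum(wi * xi * xi for wi, xi in zip(w, x)))
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta, sigma_s


# Hypothetical data: y = 2x with small perturbations and two leverage outliers.
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
x = [float(i) for i in range(1, 11)]
y = [2.0 * xi + ei for xi, ei in zip(x, noise)]
y[8] = y[9] = 0.5  # outlying responses at the two largest x values
print("MM-estimate and scale:", mm_slope(x, y))
```

The small tuning constant in the first two stages is what protects the fit against the two leverage points, while the larger constant in the third stage gives nearly full weight to the remaining observations.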

In this Chapter, we consider the MM–estimation approach (referred to as ‘robust regression’ in this dissertation) described by Yohai [92] and Koller and Stahel [88] that is used by the lmrob command in the R package robustbase [93]. lmrob uses Tukey’s bisquare objective function (Equation 3.6) for $\rho_1$ and $\rho_2$, with $c = 1.548$ in Equation 3.7 at the S–estimation step to maintain a high breakdown point, and $c = 4.685$ at the M–estimation step to provide efficiency.

Instead of using weighted least squares to obtain the estimates for the IVW and MR-Egger methods (as done in Equations 2.7 and 2.9), we propose using robust regression (the MM–estimation approach used by lmrob). Since the lmrob command allows the user to specify a vector of weights to be used in conjunction with Tukey’s weighting function, we also account for $\mathrm{se}(\hat{\beta}_{Yj})^{-2}$ in the weights of the observations. The estimates from the MM-estimator using Tukey’s weighting function $w(u_j)$ and $\mathrm{se}(\hat{\beta}_{Yj})^{-2}$ are equivalent to the estimates obtained from weighted least squares where the weights are the product of $w(u_j)$ from the final iteration of the MM-estimator and $\mathrm{se}(\hat{\beta}_{Yj})^{-2}$.
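The equivalence stated above can be checked numerically with a small sketch of an IVW-style regression through the origin. Several things here are assumptions rather than a description of lmrob: the prior weights are taken to enter by standardising each residual by $\mathrm{se}(\hat{\beta}_{Yj})$ before Tukey's weight is applied, the scale is fixed at one, the starting value is the median of the ratio estimates instead of an S-estimate, and the function names and data are hypothetical.

```python
from statistics import median


def tukey_weight(u, c=4.685):
    """Tukey's bisquare weight (Equation 3.7)."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def wls_slope(x, y, w):
    """Weighted least squares estimate of beta in y = beta * x."""
    return (sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
            / sum(wi * xi * xi for wi, xi in zip(w, x)))


def robust_weighted_slope(bx, by, se, sigma=1.0, c=4.685, tol=1e-12, max_iter=500):
    """IRLS M-step for by = beta * bx with prior weights se^-2 and a fixed scale.

    Returns (beta, final_weights), where final_weights[j] = w(u_j) * se_j^-2.
    """
    beta = median(yj / xj for xj, yj in zip(bx, by))  # hypothetical robust start
    for _ in range(max_iter):
        # Residuals are standardised by se_j before Tukey's weight is applied.
        u = [(yj - beta * xj) / (sj * sigma) for xj, yj, sj in zip(bx, by, se)]
        final_weights = [tukey_weight(uj, c) / sj ** 2 for uj, sj in zip(u, se)]
        new_beta = wls_slope(bx, by, final_weights)
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta, final_weights


# Hypothetical genetic associations; the last variant is a pleiotropic outlier.
bx = [0.10, 0.20, 0.15, 0.30, 0.25, 0.12]
by = [0.052, 0.098, 0.077, 0.149, 0.126, 0.200]
se = [0.010, 0.012, 0.011, 0.015, 0.013, 0.010]

beta, final_weights = robust_weighted_slope(bx, by, se)
print("Robust estimate:", beta)
print("WLS with product weights:", wls_slope(bx, by, final_weights))
```

The two printed values coincide because the last IRLS update is itself a weighted least squares fit with weights $w(u_j)\,\mathrm{se}(\hat{\beta}_{Yj})^{-2}$. In this example the outlying variant receives a Tukey weight of zero and so drops out of the final fit.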
