imputation
Rubin first recognised the value of MI for sensitivity analysis in 1987 [19]. The imputation model and substantive model are separate and do not have to be the same. They can reflect different structures, so departures from a MAR response mechanism can be readily accommodated by altering the imputation model while retaining the substantive model of interest.
Standard MI, under the assumption of MAR, uses the conditional distribution of partially observed response data given the observed response data for imputation across all missing data patterns (1.2). By altering the values of the parameters of this distribution by missingness pattern, we can explore departures from MAR via MI. This approach corresponds directly to the pattern-mixture modelling sensitivity analysis approach outlined in Section 1.3. MI provides an accessible route to explore the impact of such models, without the need for complex model fitting or the specification of formulae.
For MNAR imputation, for each deviating individual with missing data pattern $m_i$ we require the distribution of their missing outcomes given their observed data, denoted
$$[Y_{mi} \mid Y_{oi}, m_i, \eta_m]. \qquad (1.12)$$
ηm denotes the parameters of this distribution, whose values differ across missingness patterns and must first be estimated before we can impute missing data from (1.12) using the standard MI procedure. That is, for each missing data pattern m, we create K complete datasets: for each one we take a draw of ηm from the appropriate Bayesian posterior distribution, [ηm|Yo], and then draw the missing data from (1.12) given Yo, using the current draw of ηm. Each imputed dataset is analysed using the substantive model of interest. Results are then combined using Rubin’s combination rules.
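As a minimal sketch of the final combination step, assuming a scalar estimate (for example a treatment effect) and its squared standard error are available from each of the K imputed datasets, Rubin's rules can be written as below. The function name `rubin_combine` and the numerical inputs are hypothetical and for illustration only.

```python
import numpy as np

def rubin_combine(estimates, variances):
    """Pool K point estimates and their within-imputation variances using Rubin's rules."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    k = q.size
    q_bar = q.mean()               # pooled point estimate
    w = u.mean()                   # average within-imputation variance
    b = q.var(ddof=1)              # between-imputation variance
    t = w + (1.0 + 1.0 / k) * b    # total variance of the pooled estimate
    return q_bar, t

# Hypothetical estimates and squared standard errors from K = 5 imputed datasets
est, total_var = rubin_combine(
    [0.20, 0.25, 0.22, 0.18, 0.25],
    [0.010, 0.012, 0.011, 0.009, 0.013],
)
```

The sketch returns only the pooled estimate and its total variance; the reference t-distribution degrees of freedom used for inference are omitted for brevity.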
When exploring departures from MAR, for each missing data pattern m we choose a form for constructing ηm from η, the parameters of our imputation model under MAR. This is based on a pre-specified rule that reflects a specific assumption. This enables us to assess the impact of alternative missing data assumptions on the trial results in a principled manner, with MAR providing a natural starting point for MNAR exploration.
For longitudinal trials with a number of different missing data patterns, this can require the specification of many parameters, as with all pattern-mixture approaches (and indeed with less interpretable selection modelling approaches). Two alternative techniques for forming ηm are now discussed for longitudinal clinical trials with continuous outcomes. We will see, however, that explicit parameter specification is not always required.
1.5.1 The ‘δ-method’
Quite often it is of interest to explore the impact of deviators having a poorer response post-deviation than those observed. In other cases it might even be of interest to assess the impact of a better response post-deviation. The ‘δ-method’ provides a useful and accessible route to explore such pertinent departures from MAR via MI in the clinical trial setting.
For each missing data pattern m, the parameters of the conditional data distribution used for imputation, ηm, are constructed using the parameters of the MAR-implied conditional distribution, η, and numerical information which would ideally be elicited from experts. That is, a consensus is reached on the extent to which the parameters of the distribution of missing data are likely to
Carpenter and Kenward [7, 44] outline how this pattern-mixture MI approach can be used to accommodate an anticipated change in the rate of improvement or decline post-deviation from that predicted under MAR in a longitudinal trial setting. They introduce a randomised placebo-controlled trial in chronic asthma. The primary outcome was FEV1 (Forced Expiratory Volume in 1 second, measured in litres), recorded at baseline and at 2, 4, 8 and 12 weeks; however, a number of patients in both arms failed to complete the study. Further, those who dropped out were observed to have a poorer response over the times at which they were observed. Initially, K imputations are generated under MAR. For each deviating patient, the first MAR-imputed FEV1 observation is reduced by a postulated amount, δ, which itself has an appropriate prior distribution. The second MAR-imputed FEV1 observation is reduced by twice this postulated amount, 2δ, and so on. This reflects a worsening response over time.
Figure 1.1 illustrates the ‘δ-method’ using data from a similar asthma trial with FEV1 outcome data (described in full detail in Section 1.7.2). The first MAR-imputed FEV1 observation is reduced by δ = 0.05, and the second by 2δ = 0.10.
The δ-adjusted imputed datasets are analysed using the substantive model of interest. Results are combined across imputed datasets using Rubin’s rules. In the present example this provides the treatment estimate where active deviators are assumed to have a change of δ = 0.05 in the rate of decline post-deviation, i.e. a gradual worsening in response over time. Full technical details of the ‘δ-method’ are presented in Chapter 5.
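The adjustment step described above can be sketched as follows, under the simplifying assumption that δ is a fixed constant rather than a draw from a prior distribution. The function name `delta_adjust` and the MAR-imputed values are hypothetical.

```python
import numpy as np

def delta_adjust(mar_imputed, delta):
    """Reduce successive MAR-imputed post-deviation values by delta, 2*delta, and so on."""
    mar_imputed = np.asarray(mar_imputed, dtype=float)
    # 1 for the first missing visit, 2 for the second, ...
    steps = np.arange(1, mar_imputed.size + 1)
    return mar_imputed - steps * delta

# Hypothetical MAR-imputed FEV1 values (litres) at the two visits after deviation
adjusted = delta_adjust([2.10, 2.12], delta=0.05)
```

In this illustration the first imputed value falls by 0.05 and the second by 0.10, matching the pattern of reductions described for Figure 1.1. Applying the same function to each of the K MAR-imputed datasets gives the δ-adjusted datasets that are then analysed and combined.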
Figure 1.1: Illustration of the ‘δ-method’. [Plot of FEV1 against time (weeks), showing placebo MAR means, active MAR means, observed active data, imputed active data under MAR and imputed active data under the δ adjustment, with the reductions δ and 2δ marked.]
Such formal pre-specification of quantitative sensitivity parameters, as required in the regulated trial environment, is however notoriously difficult [28], and even more so when additional baseline covariates are included in the analysis. A different approach can be employed, whereby δ adjustments of increasing size are applied until the conclusions of the trial change. The plausibility of the assumption underlying the analysis at which the conclusions change can then be evaluated, that is, the particular size of δ representing the difference in outcome between the observed and the unobserved. This is the tipping point analysis approach [30]. However, this approach is open to criticism since it often relies on post-hoc justification.
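The tipping point search can be sketched as a simple loop over a grid of δ values. This is an illustrative outline only: `p_value_for` is a hypothetical stand-in for the full pipeline (δ-adjust the MAR imputations, fit the substantive model to each dataset, pool with Rubin's rules and return the p-value for the treatment effect), and the 0.05 significance threshold is an assumption.

```python
def tipping_point(deltas, p_value_for, alpha=0.05):
    """Return the first delta, in the order given, at which significance is lost, else None."""
    for delta in deltas:
        if p_value_for(delta) >= alpha:
            return delta
    return None

# Hypothetical stand-in for "delta-adjust, analyse, pool, return the p-value"
def toy_p_value(delta):
    return 0.01 + 0.3 * delta

tip = tipping_point([0.0, 0.05, 0.10, 0.15, 0.20], toy_p_value)
```

The value returned is the smallest δ on the grid at which the conclusion changes; its clinical plausibility would then be judged separately, as described above.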
1.5.2 Reference based sensitivity analysis
Another option is to make statements about post-deviation data, by reference to other groups of individuals in the trial (typically individuals in different treatment arms). For each missing data pattern we can construct the parameters for our missing data, ηm, using within trial information. The ‘δ-method’ can require pre-specification of a large number of sensitivity parameters especially when the data are longitudinal and the number of missing data patterns (m) is large. Reference based approaches avoid the need for explicit parameter specification, which can often be hard to justify. In-study data is used to make qualitative rather than quantitative missing data assumptions based on plausible clinical scenarios.
An early version of reference based MI was introduced by Little and Yau in 1996 [42], who used sequential regression and MI to impute for patients as actually treated after dropout. Building on these ideas, in 2013 Carpenter, Roger and Kenward [13] formalised the approach and presented a novel collection of MI procedures for reference based sensitivity analysis of trials with protocol deviation.
The technique revolves around a MAR model fitted to the observed data for each treatment arm. The parameters of the required conditional distributions used to impute data, ηm, are pieced together from the MAR parameters with qualitative reference to the trial arms. For continuous data, the mean vector and variance-covariance matrix of the required conditional distributions are constructed using the information from other groups in the trial. The primary analysis model, based on a comparison of the randomised groups, is retained in the sensitivity analysis, in keeping with the ITT principle. This allows for the assessment of the impact of alternative sampling behaviour only on the primary analysis as originally planned.
For example, the future statistical behaviour of deviating patients on an active treatment can be assumed to be similar to that of observed control subjects. In the chronic asthma trial comparing an active treatment against placebo, it may be of interest to assess the robustness of inferences from the primary analysis if we assume deviating patients in the active arm stopped taking their treatment post-deviation, i.e. the active patients jump to placebo behaviour. Under this de-facto assumption, the unobserved data for deviating active patients can be imputed from the appropriate conditional distribution, formed using the mean and covariance matrix from the active arm at pre-deviation times and the mean and covariance matrix from the placebo arm at post-deviation times. Figure 1.2 is a schematic illustration of so-called jump to reference (J2R) imputation for the same active deviator shown in Figure 1.1. In comparison to imputation under MAR (Figure 1.1), we see that the imputed data are much lower under the J2R assumption.
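To make the idea concrete, the following is a simplified sketch of the conditional distribution from which a deviator's post-deviation outcomes might be drawn under a J2R-style assumption with multivariate normal data. It assumes the mean vector takes active-arm means before deviation and reference-arm means afterwards, and that the regression on the observed outcomes and the residual covariance both come from the reference-arm covariance matrix. The function name and all numerical inputs are hypothetical, and the construction actually used, given in Chapter 2, may differ in detail.

```python
import numpy as np

def j2r_conditional(y_obs, mu_active, mu_ref, sigma_ref):
    """Conditional mean and covariance of post-deviation outcomes given observed ones."""
    y_obs = np.asarray(y_obs, dtype=float)
    t = y_obs.size                                     # number of observed pre-deviation visits
    mu_pre = np.asarray(mu_active, dtype=float)[:t]    # active-arm means before deviation
    mu_post = np.asarray(mu_ref, dtype=float)[t:]      # reference-arm means after deviation
    s = np.asarray(sigma_ref, dtype=float)
    s11, s12, s22 = s[:t, :t], s[:t, t:], s[t:, t:]
    # Regression coefficients S21 S11^{-1}; S11 is symmetric, so solve and transpose
    beta = np.linalg.solve(s11, s12).T
    cond_mean = mu_post + beta @ (y_obs - mu_pre)
    cond_cov = s22 - beta @ s12
    return cond_mean, cond_cov

# Hypothetical three-visit example: one observed visit, two visits to impute
sigma = 0.04 * np.array([[1.0, 0.5, 0.5],
                         [0.5, 1.0, 0.5],
                         [0.5, 0.5, 1.0]])
mean, cov = j2r_conditional([2.1], [2.0, 2.2, 2.3], [2.0, 2.0, 2.0], sigma)
```

A single imputation for this patient could then be drawn with `np.random.default_rng().multivariate_normal(mean, cov)`. In a full MI procedure the means and covariance matrix would themselves be posterior draws rather than fixed values.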
A full description of the reference based MI procedures, with technical details, is given in Chapter 2.
The above example illustrates how MI provides an accessible route for open and interpretable sensitivity analysis that all trial personnel can understand. No direct estimation of a MNAR model is required. Both de-jure and de-facto estimands can be assessed via this route. It is a useful tool to assess both the sensitivity of the primary estimand to alternative missing data assumptions and the sensitivity of the primary estimand to alternative target estimands.
Figure 1.2: Illustration of jump to reference imputation. [Plot of FEV1 against time (weeks), showing placebo MAR means, active MAR means, observed FEV1 and imputed FEV1 under J2R.]