When modelling missing data, some form of sensitivity analysis is essential because we are forced to make assumptions which are untestable from the data. As discussed, there are many possible options, and at least to some extent the choice will be determined by the problem at hand. However, based on our experience, we propose a general strategy for use when a joint model with a selection model parameterisation is used for analysis of a dataset with suspected non-ignorable missingness.
This strategy, outlined in Figure 8.6, assumes that a reasonable base model has already been formed. Two types of sensitivity analysis should then be carried out: an assumption sensitivity and a parameter sensitivity. For the assumption sensitivity, a number of alternative models should be run, formed from the base model by changing key assumptions. These should include, but not be limited to, changes in the error distribution of the model of interest, the transformation of the model of interest response and the functional form of the model of missingness. The parameter sensitivity involves running the base model with the parameters controlling the extent of the departure from MAR fixed to values in a plausible range.
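As a minimal sketch of the parameter sensitivity, the following Python example fixes δ to each value in a grid and recomputes a quantity of interest. It is not the joint model described here. It assumes a deliberately simplified setting: a single response y, a model of missingness of the form logit(p) = θ0 + δy with no covariates, and the overall mean of y as the quantity of interest. Under that assumption the distribution of the missing responses is the observed distribution reweighted by exp(δy), so no MCMC is needed. The function names, the symbol θ0, the data and the grid of δ values are all hypothetical.

```python
import math


def tilted_mean(observed, delta):
    """Mean of the missing responses implied by a fixed delta.

    Assumes logit P(missing | y) = theta0 + delta * y, so that
    p(y | missing) is proportional to p(y | observed) * exp(delta * y).
    theta0 cancels when the weights are normalised.
    """
    # Subtract the largest exponent to avoid overflow in exp().
    shift = max(delta * y for y in observed)
    weights = [math.exp(delta * y - shift) for y in observed]
    return sum(w * y for w, y in zip(weights, observed)) / sum(weights)


def overall_mean(observed, n_missing, delta):
    """Combine the observed mean with the implied mean of the missing values."""
    n_obs = len(observed)
    mean_obs = sum(observed) / n_obs
    mean_mis = tilted_mean(observed, delta)
    return (n_obs * mean_obs + n_missing * mean_mis) / (n_obs + n_missing)


def parameter_sensitivity(observed, n_missing, deltas):
    """Recompute the quantity of interest for each fixed value of delta."""
    return {d: overall_mean(observed, n_missing, d) for d in deltas}


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0]       # hypothetical observed responses
    grid = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical plausible range for delta
    for d, est in parameter_sensitivity(observed, n_missing=3, deltas=grid).items():
        print(f"delta = {d:+.1f}  estimated mean = {est:.3f}")
```

With δ = 0 the estimate reduces to the observed mean, which corresponds to ignorable missingness in this toy setting. How far the estimates move as δ varies over the grid indicates how sensitive the conclusion is to departures from that assumption.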
The results of both sets of sensitivities should then be analysed to establish how much the quantities of interest vary. A range of plots providing complementary views of the analysis is recommended. Plots like those in Figure 8.2 are particularly useful. If the conclusions are robust, this should be reported. Otherwise a range of diagnostics (e.g. fit of a hold-out sample and model of missingness DICW) should be used to determine a region of high plausibility, and the uncertainty recognised. The
Figure 8.6: Proposed strategy for sensitivity analyses for models with missingness
[Flowchart. BASE MODEL leads to two branches. ASSUMPTION SENSITIVITY: run alternative models with key assumptions changed, including the MoI error distribution, the MoI response transform and the MoM functional form. PARAMETER SENSITIVITY: run model with the MoM parameters associated with the informative missingness (δ) fixed to a range of plausible values. Both branches lead to the question "Are conclusions robust?" YES: report robustness. NO: determine region of high plausibility and recognise uncertainty.]
Summary: designing sensitivity analyses
Sensitivity analysis is crucial when modelling missing data, to allow the robustness of conclusions about the questions of interest to be investigated. We propose a dual approach in which (1) a base model is compared to a number of alternatives (assumption sensitivity) and (2) the base model is rerun a number of times with the parameters controlling the extent of the departure from MAR fixed to a range of plausible values (parameter sensitivity). An outline of our proposed strategy is shown diagrammatically in Figure 8.6.
We have demonstrated this strategy using our MCS income example, and find that our conclusions regarding the substantive questions about educational level and ethnicity are robust. Higher levels of education are associated with higher hourly pay, but ethnicity makes little difference. However, there is considerable uncertainty about the effect of change in partnership status on income. An investigation of the fit of different aspects of our joint models, using the hold-out sample of re-issued individuals and the model of missingness DICW, provides greatest support for JMB and suggests that gaining a partner is associated with lower pay.
Chapter 9
Bayesian modelling of missing data: conclusions and extensions
Each chapter has already been summarised, so here we draw together our findings and suggest avenues of future research.
9.1 Strengths and limitations of Bayesian joint models
Bayesian full probability modelling (hereafter BJM, for Bayesian Joint Modelling) provides a flexible and ‘statistically principled’ way of modelling non-random missing data, using a selection model factorisation of a joint model. This type of joint model comprises a model of interest and model(s) of missingness. By estimating the unknown parameters and the missing data simultaneously, this modelling approach ensures that their estimation is coherent. Since the required joint models are built in a modular way, they are easy to adapt, facilitating sensitivity analysis, which is crucial when the missing data mechanism is unknown.
Provided the different parts of the joint model are correctly specified, BJM performs much better than simplistic alternative methods, such as complete case analysis, in terms of bias in the model of interest parameter estimation and general model fit. However, the results are not robust to incorrect specification of the different parts of the joint model. In particular, the performance of BJM can be adversely affected if the error distribution of the model of interest is misspecified or the functional form of the model of missingness is incorrectly specified. Unfortunately, we can never be sure about the correct choices, so sensitivity analysis should always be carried out on these key assumptions. We recommend that expert knowledge or other external information is used to help select a range of plausible options.
If no knowledge about the shape of the missingness is available from external sources or expert opinion, the safest strategy is to use a linear logit for the model of missingness. If the missingness is linear or close to linear, we have shown that there are potential benefits, and if not, its effect appears to be reasonably benign at least in the examples we have studied. Other functional forms should only be used if supported by external evidence, and informative priors can be derived to help with the estimation of the model of missingness parameters.
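To make the term concrete, a linear logit model of missingness can be sketched as below. This is an illustration only, assuming a single fully observed covariate x and a single response y that may be missing. The names theta0 and theta_x are hypothetical. delta plays the role of δ, the parameter associated with informative missingness.

```python
import math


def expit(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def prob_missing_linear_logit(y, x, theta0, theta_x, delta):
    """P(response missing) under a linear logit model of missingness.

    logit(p) = theta0 + theta_x * x + delta * y
    The dependence on the (possibly unobserved) response y is linear on
    the logit scale; delta = 0 removes that dependence altogether.
    """
    return expit(theta0 + theta_x * x + delta * y)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape.
    for y in (0.0, 1.0, 2.0, 3.0):
        p = prob_missing_linear_logit(y, x=1.0, theta0=-1.0, theta_x=0.2, delta=0.5)
        print(f"y = {y:.1f}  P(missing) = {p:.3f}")
```

Alternative functional forms, such as a piecewise linear logit, would replace the single term delta * y with a more flexible function of y.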
A drawback to this approach is that these types of complex statistical models do not run quickly for large datasets in WinBUGS, the readily available software typically used for Bayesian analysis. Additionally, we found that the current capability of the WinBUGS software limits the scope for easily implementing complex joint models which incorporate combinations of certain types of correlated covariates with missingness, as these may require more specialised MCMC sampling algorithms. An alternative approach to modelling missing data, multiple imputation, avoids some of the speed and computational issues associated with BJM because it is a two-stage approach: it imputes the missing data and analyses the model of interest separately, and so divides the problem, making it more tractable. However, the trade-off is the need to ensure that the imputation and analysis models are compatible, which is not trivial. Additionally, unless one is prepared to generate imputations using a series of conditional univariate models, the problems associated with specifying a multivariate imputation model for correlated variables remain. Also, while multiple imputation can be applied if data are MNAR, this is not entirely straightforward.
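In the second stage of multiple imputation, the results from the separately analysed imputed datasets are commonly combined using Rubin's rules. The text above does not spell this step out, so the following is a minimal sketch under stated assumptions: a single scalar parameter, with the function name and input values hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results for one scalar parameter (Rubin's rules).

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching inputs")
    pooled = mean(estimates)
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Hypothetical results from m = 3 imputed datasets.
    est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(f"pooled estimate = {est:.3f}, total variance = {var:.3f}")
```

The between-imputation term is what carries the extra uncertainty due to the missing data into the final standard error.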
One of the advantages of the Bayesian approach is that it allows expert knowledge to be incorporated through informative priors. This is particularly useful for parameters in the model of missingness that are difficult to estimate. From our experience, we recommend that the elicitation process focusses on weakly identified variables, particularly those allowing informative missingness. It should allow for correlation between these variables and pay particular attention to their functional form.