2.4 Bayesian full probability modelling

2.4.1 Formulation of models

In contrast to multiple imputation, where the imputation of the missing data and the analysis of the model of interest are performed separately, Bayesian full probability modelling performs these two stages simultaneously. To see how suitable Bayesian models are formulated, we will consider the case where our model of interest is a simple linear regression with a univariate Normal outcome y_i and a vector of covariates x_1i, ..., x_pi, for i = 1, ..., n individuals, as specified by Equation 1.8 and discussed in Section 1.6.1. We start by considering the case where the covariates are fully observed and only the response has missing values, as shown in Figure 1.1.

If we assume that the missing data mechanism is ignorable, we do not need a missingness model, as imputation of y_mis is unnecessary for valid inference about β and σ. Values for y_mis can be generated from the posterior predictive distribution f(y_mis | y_obs, β, σ) if required, but this will not alter the posterior inference about β or σ. This is reflected in Figure 1.1, which only shows the model of interest. However, we could redraw this diagram to include a model of missingness, typically

$$
\begin{aligned}
m_i &\sim \text{Bernoulli}(p_i), \\
\text{link}(p_i) &= f(y_i, \theta), \\
\theta &\sim \text{Prior distribution},
\end{aligned}
\tag{2.1}
$$

where m_i is a binary missing value indicator for y_i, θ are the parameters of the missingness model and 'link' would typically be taken to be the logit or probit function. A representation of the joint model is shown in Figure 2.1, with y drawn as a double node, such that the observed values, y_obs, and missing values, y_mis, are shown separately. It is clear from this diagram that the probability of missingness is dependent on the observed values but not the missing values, as required for ignorable missingness. Note that a different ignorable missingness diagram could be drawn in which the probability of missingness is also dependent on the covariates, by adding an arrow from x to p. For ignorable missingness, the model of interest and model of missingness can be fitted separately.
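The ignorable case can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The data are simulated, the missingness probability depends only on the fully observed covariate through a logit link, and the prior p(β, σ²) ∝ 1/σ² is an assumption chosen so that the posterior is available in closed form. All names and numerical values (`posterior_draws`, `theta`, `beta_true` and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def expit(t):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-t))

# Simulated data (hypothetical values): intercept plus one covariate.
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])
sigma_true = 0.5
y = X @ beta_true + rng.normal(scale=sigma_true, size=n)

# Ignorable model of missingness: logit(p_i) depends on x_i only, not on y_i.
theta = np.array([-1.0, 1.0])
missing = rng.random(n) < expit(theta[0] + theta[1] * x)
X_obs, y_obs, X_mis = X[~missing], y[~missing], X[missing]

def posterior_draws(X_obs, y_obs, n_draws, rng):
    """Draw (beta, sigma) given the observed cases, assuming p(beta, sigma^2) ∝ 1/sigma^2."""
    n_obs, k = X_obs.shape
    xtx_inv = np.linalg.inv(X_obs.T @ X_obs)
    beta_hat = xtx_inv @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat
    s2 = resid @ resid / (n_obs - k)
    # sigma^2 | y_obs follows a scaled inverse chi-square(n_obs - k, s2) distribution.
    sigma2 = (n_obs - k) * s2 / rng.chisquare(n_obs - k, size=n_draws)
    # beta | sigma^2, y_obs ~ Normal(beta_hat, sigma^2 (X'X)^{-1}).
    chol = np.linalg.cholesky(xtx_inv)
    z = rng.normal(size=(n_draws, k))
    beta = beta_hat + np.sqrt(sigma2)[:, None] * (z @ chol.T)
    return beta, np.sqrt(sigma2)

n_draws = 2000
beta_draws, sigma_draws = posterior_draws(X_obs, y_obs, n_draws, rng)

# Posterior predictive draws of y_mis: Normal(X_mis beta, sigma^2) for each posterior draw.
noise = rng.normal(size=(n_draws, X_mis.shape[0]))
y_mis_draws = beta_draws @ X_mis.T + sigma_draws[:, None] * noise
```

In this sketch the draws of β and σ use the observed cases only, and the values of y_mis are generated afterwards from those draws, so imputing them does not feed back into the inference about β or σ. The parameters θ never enter the computation.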

A major advantage of Bayesian modelling is that these models can be adapted relatively easily to allow for the possibility that the missing data mechanism is non-ignorable. To do this, we need to add a link from the missing values of y to the probability of missingness, as shown in Figure 2.2. The two parts of the model must now be fitted together. If we collapse the double node back to a single node representing both observed and missing values of y, we obtain Figure 2.3. This will be our representation from now on, but it should be interpreted as Figure 2.2. Carpenter et al. (2002) use clinical trial data to illustrate how a drop-out at random model can be adapted to become a non-random drop-out model.
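To see why the two parts of the model can no longer be separated, the following Python simulation may help. It is a hedged illustration only: the data are simulated, the logit link and all parameter values are assumptions, and the least-squares fit on the observed cases stands in for an analysis that ignores the missingness mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)

def expit(t):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-t))

# Simulated model of interest (hypothetical values).
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Non-ignorable model of missingness: logit(p_i) = theta_0 + theta_1 * y_i,
# so larger responses are more likely to be missing.
theta = np.array([-1.0, 1.0])
p = expit(theta[0] + theta[1] * y)
m = rng.random(n) < p  # True means y_i is missing

# Summaries that ignore the mechanism use only the observed cases.
mean_all = y.mean()
mean_obs = y[~m].mean()
beta_cc, *_ = np.linalg.lstsq(X[~m], y[~m], rcond=None)

print("mean of all y:", mean_all)
print("mean of observed y:", mean_obs)
print("complete-case estimate of beta:", beta_cc)
```

Because the probability of missingness depends on y itself, the observed responses are a selected subset, and the mean of the observed values falls well below the mean of all values in this simulation. Correcting for this requires information about θ, which is why the model of interest and the model of missingness are fitted jointly.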

Figure 2.1: Graphical representation for ignorable missingness in the response variable

[Figure not reproduced. Nodes: x, µ, y_obs, y_mis, p, m, σ, β, θ. Plate: individual i. Panels: Model of Interest; Response Model of Missingness.]

Figure 2.2: Graphical representation for non-ignorable missingness in the response variable

[Figure not reproduced. Nodes: x, µ, y_obs, y_mis, p, m, σ, β, θ. Plate: individual i. Panels: Model of Interest; Response Model of Missingness.]

If the covariates have missing values then a covariate model of missingness is also required (this is the subject of Chapter 6). As we will see, the joint model can become quite complicated when there are missing values in both the response and the covariates, and assumptions of non-ignorable missingness are made for some or all of these.

A further advantage of Bayesian modelling is that it provides an ideal framework for incorporating data from external sources and informative prior information (Best et al., 1996; White et al., 2004). These issues are explored in Chapter 7.

Figure 2.3: Simplified graphical representation for non-ignorable missingness in the response variable

[Figure not reproduced. Nodes: x, µ, y, p, m, σ, β, θ. Plate: individual i. Panels: Model of Interest; Response Model of Missingness.]

[...] following examples. Pettitt et al. (2006) fit Bayesian hierarchical models to categorical longitudinal data, with missing response and covariates, assuming MAR. Clayton et al. (1998) evaluate Bayesian methods for using auxiliary information in the analysis of incidence data arising from a longitudinal study, where data are missing by design. Kadane (1993) provides an early example of how to conduct a Bayesian analysis of survey data with non-response, which explores sensitivity to different beliefs about the missing data. Bayesian methods have been used by Forster and Smith (1998) in modelling non-ignorable non-response for categorical survey data and by Huang et al. (2005) in estimating the parameters of generalised linear models with non-ignorably missing covariates. Wood et al. (2006) compare a Bayesian approach for using the number of failed contact attempts to adjust for non-ignorable non-response with a modified conditional likelihood method and an EM procedure.

In document Bayesian methods for modelling non-random missing data mechanisms in longitudinal studies (Page 34-36)