Review of hierarchical models and provider profiling methods
4.1 Hierarchical models
4.1.3 Extensions to the basic model
Individuals can be similar at more than one level (e.g. surgery patient i within surgeon j within hospital k), with similarity now also occurring at the middle level(s) (e.g.
similarity of surgeons within the same hospital in the previous example). This structure, illustrated in figure 4.1(b), can be modelled by adding another random effect to the
model specification:
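\[
\begin{aligned}
\operatorname{logit}(p_{ijk}) &= \alpha + \beta x_{ijk} + u_{jk} + v_k \\
u_{jk} &\sim N(0, \sigma_u^2) \\
v_k &\sim N(0, \sigma_v^2)
\end{aligned}
\]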
Note that we are now using the index ijk to accommodate the third level. The assumption of independence from the covariates $x_{ijk}$ now applies to both the $u_{jk}$ and the $v_k$, and we further assume that the $u_{jk}$ and the $v_k$ are uncorrelated. We can reparametrise the model to consider each level separately:
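\[
\begin{aligned}
\operatorname{logit}(p_{ijk}) &= \beta x_{ijk} + \mu_{jk} && \text{(Level 1)} \\
\mu_{jk} &= \nu_k + u_{jk} && \text{(Level 2)} \\
\nu_k &= \alpha + v_k && \text{(Level 3)} \\
u_{jk} &\sim N(0, \sigma_u^2), \quad v_k \sim N(0, \sigma_v^2)
\end{aligned}
\tag{4.3}
\]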
This parametrisation, known as hierarchical centring, emphasises the hierarchical structure of the data. There is a shift of focus: rather than having one equation for the individual, we have a set of equations, one for each level of the hierarchy. This approach is intuitive if we wish to add cluster-level variables to the model. If we have some factor Z measured at level 2 of the hierarchy (say an area-level measure of air pollution if examining lung cancer rates) then the log odds of the outcome for level-2 unit j within level-3 unit k is given by $\mu_{jk} = \nu_k + \gamma z_{jk} + u_{jk}$, i.e. it is a function of the cluster-level covariate.
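The level-by-level form of equation 4.3 maps directly onto a nested simulation, which may help to make the structure concrete. The following is a minimal sketch only: the function name, the parameter values and the standard Normal patient covariate are all illustrative assumptions, not quantities taken from this thesis.

```python
import math
import random

def simulate_three_level(n_hospitals, surgeons_per_hospital, patients_per_surgeon,
                         alpha, beta, sigma_u, sigma_v, seed=0):
    """Draw binary outcomes from a three-level logistic model, one level at a time."""
    rng = random.Random(seed)
    records = []
    for k in range(n_hospitals):
        nu_k = alpha + rng.gauss(0.0, sigma_v)          # Level 3: nu_k = alpha + v_k
        for j in range(surgeons_per_hospital):
            mu_jk = nu_k + rng.gauss(0.0, sigma_u)      # Level 2: mu_jk = nu_k + u_jk
            for i in range(patients_per_surgeon):
                x = rng.gauss(0.0, 1.0)                 # hypothetical patient covariate
                eta = beta * x + mu_jk                  # Level 1: logit(p_ijk)
                p = 1.0 / (1.0 + math.exp(-eta))
                y = 1 if rng.random() < p else 0
                records.append((k, j, i, x, y))
    return records

# Illustrative values only (assumed, not estimated from any data).
data = simulate_three_level(20, 5, 30, alpha=-2.0, beta=0.5, sigma_u=0.3, sigma_v=0.5)
event_rate = sum(rec[4] for rec in data) / len(data)
```

A level-2 covariate would enter this sketch on the line that computes `mu_jk`, as an extra term `gamma * z_jk`.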
In model 4.2 all cluster effects arise from the same distribution with a single variance parameter $\sigma_u^2$. It is possible to have different variances (known as complex variation, see Goldstein (2002)) if we expect the influence of the cluster on the outcome to be heterogeneous. For example, in an educational context, small and large classes could be assigned a different variance. Another common extension is to allow covariate effects ($\beta$ in equation 4.2) to vary with cluster as well (for example we might expect the effects of low income on lung cancer rates to vary by area). This is illustrated in equation 4.4 below.
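\[
\operatorname{logit}(p_{ij}) = \alpha + \beta x_{ij} + u^{(1)}_j x_{ij} + u^{(0)}_j \tag{4.4}
\]
\[
u_j = \begin{pmatrix} u^{(0)}_j \\ u^{(1)}_j \end{pmatrix} \sim N(0, \Psi), \qquad
\Psi = \begin{pmatrix} \sigma^2_{u(0)} & \sigma_{u(0)u(1)} \\ \sigma_{u(1)u(0)} & \sigma^2_{u(1)} \end{pmatrix}
\]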
Here, $\Psi$ is the 2 × 2 variance-covariance matrix for the random effects. The diagonal terms of the matrix contain the variances of the random intercepts and random coefficients ($\sigma^2_{u(0)}$ and $\sigma^2_{u(1)}$ respectively), while the off-diagonal terms give the covariance $\sigma_{u(0)u(1)} = \sigma_{u(1)u(0)}$, which describes how the random terms are correlated. For example, areas with higher than expected rates of lung cancer may also have stronger associations between income and lung cancer, leading to a positive correlation between intercept and slope terms. These models are sometimes referred to as random coefficient models.
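One consequence of the random coefficient formulation is that the between-cluster variance on the log odds scale depends on the covariate value, since the cluster-specific deviation is the random intercept plus the random slope multiplied by x. The sketch below computes this variance and the implied intercept-slope correlation from the elements of Ψ. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def cluster_deviation_variance(x, var_u0, var_u1, cov_u01):
    """Var(u0_j + u1_j * x) when (u0_j, u1_j) has covariance matrix Psi."""
    return var_u0 + 2.0 * x * cov_u01 + x * x * var_u1

def intercept_slope_correlation(var_u0, var_u1, cov_u01):
    """Correlation implied by the off-diagonal element of Psi."""
    return cov_u01 / (var_u0 * var_u1) ** 0.5

# Hypothetical Psi = [[0.25, 0.05], [0.05, 0.04]] (assumed values).
var_u0, var_u1, cov_u01 = 0.25, 0.04, 0.05
corr = intercept_slope_correlation(var_u0, var_u1, cov_u01)
variances = {x: cluster_deviation_variance(x, var_u0, var_u1, cov_u01)
             for x in (-1.0, 0.0, 1.0)}
```

With a positive covariance, as in this example, the between-cluster variance is larger at x = 1 than at x = -1, so the choice of centring for x affects how the intercept variance is interpreted.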
4.1.4 Hierarchical models for provider profiling
The hierarchical models described in the previous section are widely used in public health (Diez-Roux, 2000; Leyland and Goldstein, 2001) and have long been advocated for provider profiling purposes (DeLong et al., 1997; Goldstein and Spiegelhalter, 1996;
Normand et al., 1997). In this context, the random effects described above are called provider effects, terminology which I shall use in this thesis. These cannot necessarily be interpreted as an effect that is wholly caused by the provider (such as would be obtained if patients were randomised to providers) because the interpretation depends on the set of factors we have adjusted for. This set may not encompass all variables beyond provider control, so some of the between-provider variation may be due to these unmodelled variables, as discussed in section 3.5.2.
Early applications include Gatsonis et al. (1993, 1995) who described variation in angiography rates across states, and Thomas et al. (1994) who estimated hospital-specific log odds of mortality. Gatsonis et al. (1995) also allowed the provider effects to vary with patient characteristics using the random coefficient formulation previously described.
These authors recommended the hierarchical analysis over the fixed effects approach because it produces more reliable and precise provider-specific estimates (particularly for smaller providers) through shrinkage, addresses problems of multiple comparisons, accounts for uncertainty in a statistically principled way, and enables attribution of variation to different sources, including provider-level variables (DeLong et al., 1997;
Gatsonis et al., 1993, 1995; Goldstein and Spiegelhalter, 1996; Normand et al., 1997;
Thomas et al., 1994).
Normand et al. (1997) presented a framework for hierarchical methods for provider profiling, incorporating complex variation to allow different hospital types to have different between-hospital variation. They used Bayesian methods to present provider effects as posterior tail probabilities of different measures of poor performance, arguing that these are more interpretable than the confidence intervals obtained using frequentist methods. Goldstein and Spiegelhalter (1996) also exploited the flexibility of Bayesian models to illuminate flaws in simple league tables arising from high levels of uncertainty in the ranks. They highlighted that extension of their methods to three-level models is straightforward, enabling profiling of providers within providers.
Three-level models have been implemented in several studies, for example to examine healthcare-associated infections in hospitals within regions in France (Chen et al., 2013), and adherence to osteoporosis guidelines for physicians within clinics (Brookhart et al., 2006).
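Both posterior tail probabilities and uncertainty in ranks can be computed directly from posterior draws of the provider effects. The sketch below shows one way this might be done. It assumes the draws are available as a list of lists (one inner list of provider effects per draw), and the draws used here are simulated independently from hypothetical Normal distributions rather than taken from any fitted model or from the cited papers.

```python
import random

def tail_probabilities(draws, threshold):
    """Share of posterior draws in which each provider effect exceeds the threshold."""
    n_draws, n_prov = len(draws), len(draws[0])
    return [sum(d[j] > threshold for d in draws) / n_draws for j in range(n_prov)]

def rank_intervals(draws, lower=0.025, upper=0.975):
    """(lower, median, upper) posterior rank for each provider; rank 1 = lowest effect."""
    n_prov = len(draws[0])
    ranks = [[] for _ in range(n_prov)]
    for d in draws:
        order = sorted(range(n_prov), key=lambda j: d[j])
        for position, j in enumerate(order, start=1):
            ranks[j].append(position)
    out = []
    for r in ranks:
        r.sort()
        n = len(r)
        out.append((r[int(lower * (n - 1))], r[n // 2], r[int(upper * (n - 1))]))
    return out

# Hypothetical stand-in for MCMC output: four providers, 2000 draws each.
rng = random.Random(42)
assumed_means = [-0.4, 0.0, 0.1, 0.6]   # illustrative log odds scale values
draws = [[rng.gauss(m, 0.25) for m in assumed_means] for _ in range(2000)]
probs = tail_probabilities(draws, threshold=0.2)
intervals = rank_intervals(draws)
```

Wide rank intervals for providers with similar effects are the pattern that undermines simple league tables.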
In line with the focus of public interest described in chapter 1, subsequent research has concentrated on developing methods for detecting unusual performance using techniques such as funnel plots (Jones et al., 2008; Spiegelhalter, 2005b). Similarly, the Standardised Mortality Ratio (SMR, which compares the observed number of deaths with the number expected) is an intuitive, if much-criticised, measure of hospital performance (Lilford and Pronovost, 2010), but it is less clear within the hierarchical framework how the numerator and denominator should be calculated. Several different definitions have been proposed (Glance et al., 2003; Health and Social Care Information Centre Clinical Indicators Team, 2016; Shahian et al., 2005; Yang et al., 2013) but these do not always yield consistent results (Mohammed et al., 2012). Whether we wish to quantify provider effects using odds ratios, SMRs, ranks or tail probabilities, in the context of hierarchical models all of these approaches involve estimating the random effects themselves in some way. As previously described, shrunken estimates of provider effects can be calculated using empirical or full Bayes methods. The former was used by Thomas et al. (1994), amongst others (DeLong et al., 1997; Dimick et al., 2010), and is sometimes called a reliability adjustment. Gatsonis et al. (1993) found that both approaches gave similar results for their purpose.
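The idea behind a reliability adjustment can be illustrated with a Normal approximation on the log odds scale, in which each provider's unshrunken estimate is pulled towards the overall mean by a factor that depends on its precision. This is a simplified sketch, not the method of any of the cited papers: it treats the overall mean and the between-provider standard deviation as known, whereas empirical Bayes would estimate them from the data, and all numerical values are hypothetical.

```python
def shrink(estimates, std_errors, overall_mean, sigma_u):
    """Shrunken provider effects under an assumed Normal-Normal approximation.

    reliability = sigma_u^2 / (sigma_u^2 + se_j^2): close to 1 for precisely
    estimated (typically large) providers, close to 0 for imprecise ones.
    """
    shrunken = []
    for est, se in zip(estimates, std_errors):
        reliability = sigma_u ** 2 / (sigma_u ** 2 + se ** 2)
        shrunken.append(overall_mean + reliability * (est - overall_mean))
    return shrunken

# Two hypothetical providers with the same raw estimate but different precision.
raw = [0.8, 0.8]
se = [0.6, 0.15]          # small provider, large provider (assumed values)
shrunken = shrink(raw, se, overall_mean=0.0, sigma_u=0.3)
```

In this example the two providers have the same raw estimate, but the less precisely estimated one is shrunk much further towards the overall mean.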
Provider profiling differs from many standard applications of hierarchical models (e.g. repeated measures, growth curves, cluster randomised trials) in that the random effects themselves are the focus of interest, not a nuisance source of variation to be accounted for in order to correctly estimate other parameters (Ohlssen et al., 2007b).
Therefore the random effects should be carefully specified with regard to the inferences we wish to make, with sensitivity analysis to check for robustness to variation in the assumptions. Several studies have explored alternatives to the Normal distribution for the random effects. For provider profiling analyses a t distribution with low degrees of freedom has been suggested (Normand et al., 1997; Ohlssen et al., 2007b) but there are no specific recommendations about exact values for the degrees of freedom ν. A t distribution with ν = 1 or 2 does not have a finite variance, so these are not appropriate choices. Within a Bayesian framework we can attempt to estimate ν from the data (setting a suitable prior) but unless the data (or prior) are very informative this does not yield a useful posterior and is similar to specifying a Normal distribution (Daniels and Gatsonis, 1999). Austin (2005) demonstrated using simulation that assuming a Normal model to estimate provider effects arising from a t distribution with ν = 3 and ν = 5 resulted in over-shrinkage, but there was no basis for choosing ν. In some situations a skewed distribution may be suitable, for example if there are a few providers who are particularly high performing. Lee and Thompson
(2008) present an example concerning a clinical trial comparing teleconsultations with standard appointments, where treatment effects were expected to differ by consultant.
They used flexible Normal and t distributions for the random effects, estimating both ν and a skewness parameter from the data, though in this case there was little difference in model fit compared with the symmetric distributions.
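The restriction on low values of ν follows from the variance of the standard t distribution, which is ν/(ν − 2) for ν > 2 and is not finite otherwise. The short sketch below encodes this, together with the scale factor needed for a t-distributed random effect to have a chosen standard deviation. The function names are illustrative, and the rescaling is one possible convention rather than one taken from the cited papers.

```python
import math

def t_variance(nu):
    """Variance of a standard Student t distribution with nu degrees of freedom."""
    if nu > 2:
        return nu / (nu - 2.0)
    if nu > 1:
        return math.inf      # mean exists but the variance is infinite
    return math.nan          # variance undefined for nu <= 1

def t_scale_for_sd(target_sd, nu):
    """Scale s such that s * t_nu has standard deviation target_sd (requires nu > 2)."""
    if nu <= 2:
        raise ValueError("no finite variance for nu <= 2")
    return target_sd / math.sqrt(t_variance(nu))

variances = {nu: t_variance(nu) for nu in (3, 5, 30)}
```

For ν = 3 and ν = 5 the variances are 3 and 5/3 respectively, and the value approaches 1 (the standard Normal variance) as ν increases.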
There are some limitations to using hierarchical models for provider profiling. While shrinkage of providers is desired for some purposes, it can lead to a false sense of complacency for smaller providers (Walker et al., 2013). In such cases the tail probabilities advocated by Normand et al. (1997) may be useful to indicate the extent to which unusual performance can be ruled out. Another consequence of shrinkage is that smaller providers will be shrunk more towards the mean than larger providers. Direct comparison of posterior means between providers may therefore yield incorrect inferences (Shen and Louis, 1998). Aylin et al. (2003) argue that if an aim of the analysis is to detect outliers then it may be preferable to prioritise minimising the risk of type II error (not detecting a genuine outlier) over the risk of type I error (erroneously identifying a provider as an outlier); in such cases a fixed effects model may be preferable to a hierarchical model. A similar argument was made by Austin et al. (2003), who demonstrated that hierarchical models in general had greater specificity for correctly identifying outliers while fixed effects models had greater sensitivity, even when all providers were high-volume centres. Notably, the authors found that fixed effects models outperformed hierarchical models if the underlying distribution of provider effects was trimodal but a unimodal random effects distribution was assumed. The requirement to assume a distribution for the random effects can therefore be a limitation, one which is not present for fixed effects models.
Hierarchical models also require the assumption that the provider effects are uncorrelated with covariates. This can be violated if there are unmeasured confounders (i.e. variables which are associated with both the provider and the outcome that are not in the model) that are also associated with included covariates, because the effects of the omitted variables are absorbed into the provider effects. Sociodemographic variables such as ethnicity and economic deprivation can be confounders because they have spatial variability and have effects on health outcomes, but provider profiling analyses are often not adjusted for them to avoid masking the effects of interest. This was discussed in chapter 3, along with the effect on the interpretation that some of the between-provider variation could be due to unmeasured confounding. A further effect of adopting this approach occurs if the unmeasured sociodemographic variables, and hence the provider effects, are correlated with any of the clinical risk-adjustment factors. Then the assumption of independence of provider effects and covariates is violated, resulting in biased estimates of the covariate effects (Rabe-Hesketh and Skrondal, 2005). An advantage of fixed effects models is that they do not require the provider effects to be independent of the covariates.
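The direction of this bias can be seen in a simple linear analogue (not the logistic models used in this thesis). The sketch below simulates a continuous outcome in which the patient covariate is correlated with the provider effect, then compares a pooled slope that ignores providers with the within-provider (fixed effects) slope. All names and parameter values are hypothetical. In the linear random-intercept case the random effects estimator is a weighted average of the within- and between-provider estimators, so it would be expected to inherit part of the bias shown by the pooled estimate.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def pooled_and_within_slopes(n_providers=200, n_patients=50, beta=1.0, rho=0.8, seed=1):
    """Simulate y = beta * x + u + e with x correlated with the provider effect u."""
    rng = random.Random(seed)
    x_all, y_all, x_within, y_within = [], [], [], []
    for _ in range(n_providers):
        u = rng.gauss(0.0, 1.0)                                    # provider effect
        xs = [rho * u + rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        ys = [beta * x + u + rng.gauss(0.0, 1.0) for x in xs]
        mx, my = sum(xs) / n_patients, sum(ys) / n_patients
        x_all += xs
        y_all += ys
        x_within += [x - mx for x in xs]                           # centre within provider
        y_within += [y - my for y in ys]
    return ols_slope(x_all, y_all), ols_slope(x_within, y_within)

pooled, within = pooled_and_within_slopes()
```

Under these assumed values the within-provider slope recovers the true coefficient of 1 closely, whereas the pooled slope is inflated because part of the provider effect is attributed to the covariate.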
The concept of hierarchical modelling (and caveats associated with the results) can be difficult to explain to a non-statistical audience. Sometimes these difficulties relate to aspects which are otherwise considered advantageous. The Centers for Medicare and Medicaid Services (CMS) in the US received several criticisms from consumers regarding the use of hierarchical models (Ash et al., 2012): the methods did not show any variation because of shrinkage and were therefore not useful; the performance of small hospitals is masked; concepts surrounding hierarchical models are difficult to convey. Nevertheless, hierarchical models have now gone firmly beyond the statistical literature and have been put into practice by the CMS (Shahian et al., 2005) and many others, such as the UK Healthcare Commission (now the Care Quality Commission) (Spiegelhalter, 2005a) and the UK Health and Social Care Information Centre (Health and Social Care Information Centre Clinical Indicators Team, 2016).
In summary, hierarchical models are an established approach for provider profiling, and this thesis builds on these existing approaches. Some aspects of the structure of the neonatal data can be modelled using techniques described thus far. As with all regression models, continuous and categorical patient risk factors can be adjusted for. Hierarchical models allow for variation at both the NNU and network level to be estimated using a three-level model. Level of NNU can be incorporated into the model as a provider-level covariate, and heteroscedasticity by level of NNU investigated by including separate variance terms. These methods will be described in detail in section 5.2. A limitation of these established methods is that they only allow for attribution to a single provider.
The problems posed by transfers for provider profiling analyses (see chapter 1) and analysis of transfer patterns (chapter 2) indicated that attribution to a single provider is not a reasonable approach for these data, nor is excluding transfers altogether. In the next section I will review the literature for how transfers have been handled in existing provider profiling studies.