experiences of junior doctors at discharge
Chapter 5: Theory of Discrete Choice Experiments
5.2 Methodological considerations
5.2.6 Phase 5: Model estimation and data analysis
5.2.6.1 Experiment design
The total number of possible combinations of levels and attributes is the number of levels allocated to each attribute (nL) raised to the power of the number of attributes (nA), i.e. nL^nA. A DCE questionnaire which offers choices for all of these possible combinations, known as a full factorial design, is usually impractical to analyse and unmanageable for respondents to complete, due to high cognitive complexity. In light of this, a design that presents only a fraction of the possible combinations (a fractional factorial design) can be generated using choice modelling software. Such designs present a smaller, more manageable number of combinations of attributes at different levels, also known as profiles, which can be used as choices for the survey respondent, while retaining the statistical properties needed to elicit preferences between attributes with adequate precision.
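As a minimal sketch of the counting argument above, the following enumerates a full factorial design. The attribute names and levels are hypothetical and chosen only for illustration; they are not the attributes used in this study.

```python
from itertools import product

# Hypothetical attributes and levels, for illustration only.
attributes = {
    "waiting_time": ["1 week", "4 weeks", "8 weeks"],
    "contact_mode": ["phone", "face-to-face", "letter"],
    "follow_up": ["none", "pharmacist", "doctor"],
}

n_levels = 3                      # nL: levels per attribute
n_attributes = len(attributes)    # nA: number of attributes

# Full factorial design: one level chosen for every attribute,
# in every possible combination.
profiles = list(product(*attributes.values()))

# The number of profiles equals nL raised to the power nA.
assert len(profiles) == n_levels ** n_attributes
print(len(profiles))  # 27
```

Even this small example produces 27 profiles. Where attributes have different numbers of levels, the total is the product of the level counts rather than a single power.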
Orthogonal designs attempt to minimise the correlation between the levels of the attributes. Efficient designs, as well as being orthogonal, aim to generate data that yield parameter estimates with standard errors that are as small as possible. Creating the design in choice modelling software, using the chosen attributes and levels, allows the most efficient design available to be identified.
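The orthogonality property can be illustrated with a small sketch. It assumes three hypothetical two-level attributes coded as -1 and +1, and builds a half fraction of the full factorial by setting the third attribute equal to the product of the first two. This is a textbook construction used here for illustration, not the procedure followed by any particular choice modelling package, which would typically optimise an efficiency criterion as well.

```python
from itertools import product


def correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


# Full factorial for three two-level attributes: 2^3 = 8 profiles.
full_design = list(product([-1, 1], repeat=3))

# Half fraction: the third attribute is set to the product of the
# first two, which leaves 4 profiles instead of 8.
half_design = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]

# One column per attribute, then the correlation of every pair.
columns = list(zip(*half_design))
pairwise = {
    (i, j): correlation(columns[i], columns[j])
    for i in range(3)
    for j in range(i + 1, 3)
}
print(len(full_design), len(half_design))
print(pairwise)
```

All three pairwise correlations are zero, so the four-profile fraction is orthogonal in the attribute main effects while halving the number of profiles.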
5.2.6.2 Utility theory
The theoretical framework which underpins DCEs is known as random utility theory, which was developed by McFadden et al. in 1974 (149). In a DCE question, it is assumed that respondents will choose the alternative which provides them with the highest level of utility.
The principle of random utility theory is that it is impossible to observe all of the factors which may affect an individual’s preferences, and that elements of their decision-making behaviour are ‘random’ in nature.
Utility scores are calculated using the equations in Figures 5.2 and 5.3. The utility theory equation contains two components: an explainable (systematic) component and an unexplainable (random) component. The systematic component represents a function of the attributes of the alternatives, which can be observed. The random component represents the unobserved differences in preferences which exist between individuals.
\[
U_{alt,n} = V(Att_{alt,n}, \beta) + \varepsilon_{alt,n}
\]
where:
$U$: utility
$alt$: an alternative
$n$: the individual respondent
$U_{alt,n}$: the utility of respondent n choosing alternative alt
$\beta$: regression coefficient
$V(Att_{alt,n}, \beta)$: explainable component of utility (function of the attributes and alternatives of individual n for alternative alt)
$\varepsilon_{alt,n}$: unexplainable component of utility (unobserved variations in preferences of individual n for alternative alt)
Figure 5.2: Utility theory equation
The systematic component is a representative utility function, which relates the observed attributes of the alternatives to the utility derived from alternative alt (Ualtn). An attribute specific constant (ASC) captures the mean effect of the unobserved factors in the error terms for each of the alternatives (εaltn).
\[
V(Att_{alt,n}, \beta) = \mathrm{ASC} + \beta_1 Att_{alt,1} + \dots + \beta_k Att_{alt,k}
\]
where:
$V(Att_{alt,n}, \beta)$: explainable component (function of the attributes and alternatives of individual n for alternative alt)
$\beta$: regression coefficient
$k$: number of attributes, indexed 1, 2, ..., k
$\mathrm{ASC}$: attribute specific constant for alternative alt
$Att$: attribute
Figure 5.3: Breakdown of explainable utility function
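The explainable component in Figure 5.3 is a linear sum, which can be sketched as follows. The function name, coefficient values and attribute levels are hypothetical and serve only to show the arithmetic; they are not estimates from this study.

```python
def systematic_utility(asc, betas, attribute_levels):
    """Return V = ASC + beta_1 * Att_1 + ... + beta_k * Att_k."""
    if len(betas) != len(attribute_levels):
        raise ValueError("one coefficient is needed per attribute")
    return asc + sum(b * x for b, x in zip(betas, attribute_levels))


# Hypothetical example with k = 2 attributes:
# a waiting time of 4 weeks and a binary indicator set to 1.
asc = 0.5
betas = [-0.25, 1.0]
attribute_levels = [4, 1]

v = systematic_utility(asc, betas, attribute_levels)
print(v)  # 0.5 + (-0.25 * 4) + (1.0 * 1) = 0.5
```

The random component of Figure 5.2 is not computed here, because by definition it is unobserved; it enters only through the distributional assumptions of the chosen model.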
The regression coefficients (β) represent the impact of a unit change in an attribute on utility. The greater the magnitude of a coefficient, the greater the impact of a unit change in that attribute on utility, and therefore the stronger the preference for that attribute; the sign of the coefficient indicates whether the change increases or decreases utility. The statistical significance of the β value for an attribute indicates whether that attribute has a detectable influence on respondents' choices.
5.2.6.3 Choice modelling
When evaluating healthcare, it is recognised that real-life decisions are not binary in nature (140), with three or more alternative options often available to an individual. Consequently, the majority of existing DCEs have adopted a simple multinomial (conditional) logit model (MNL).
There are three key assumptions associated with the use of the MNL. These are:
1. The ratio of choice probabilities of any two alternatives is unaffected by other alternatives. This implies that choice probabilities would all change proportionally if an alternative were to be added in or one removed. This is known as the independence of irrelevant alternatives (IIA) property. The IIA property can be inappropriate in situations where two services presented may be more similar to each other than the opt-out option presented, and therefore will compete with each other more intensively than they do with opting out.
2. The MNL cannot represent unobserved heterogeneity or any other unobserved variability between individuals (εin). It is recognised that individuals make choices based on explainable factors (such as income or education) and unexplainable (random) factors, which cannot be related to observed characteristics. Systematic (explainable) heterogeneity can be incorporated into the MNL, but random (unexplainable) heterogeneity cannot.
3. The unobserved heterogeneity error terms (εin) are independent and identically distributed across all observations. This is known as the IID property.
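The IIA property in the first assumption can be illustrated numerically. The sketch below uses the standard logit choice probability, in which the probability of an alternative is the exponential of its systematic utility divided by the sum of exponentials over all alternatives in the choice set. This formula is not stated in the text above, and the alternative names and utility values are hypothetical.

```python
import math


def mnl_probabilities(utilities):
    """Logit choice probabilities from a dict of systematic utilities."""
    # Subtracting the maximum keeps the exponentials numerically stable
    # and does not change the resulting probabilities.
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}


# Choice set with one service and an opt-out option.
two = mnl_probabilities({"service_a": 1.0, "opt_out": 0.0})

# The same choice set after adding a second, very similar service.
three = mnl_probabilities(
    {"service_a": 1.0, "service_b": 1.0, "opt_out": 0.0}
)

ratio_two = two["service_a"] / two["opt_out"]
ratio_three = three["service_a"] / three["opt_out"]
print(ratio_two, ratio_three)
```

The ratio of the probability of service A to that of opting out is the same in both choice sets, even though service B would be expected to draw respondents mainly from service A. This is the restrictive substitution pattern that the IIA property imposes.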
5.2.6.4 Alternative models
Although the MNL is widely recognised as the simplest model to employ, and is therefore the recommended starting point for any DCE being designed (140), other models have developed the MNL further in an attempt to relax some of its restrictions.
Nested logit models (a type of Generalised Extreme Value model) can partially relax the IIA property by grouping (or nesting) subsets of alternatives which are similar to each other with respect to unobserved characteristics. Creation of these mutually exclusive groups allows for more flexible substitution patterns (150). Multinomial probit (MNP) models can fully relax the IIA property, as well as the other restrictions of the MNL, but as a result are complicated to estimate and consequently not widespread in the literature. Random parameters or mixed logit models are appropriate where considerable variation across the population of respondents is anticipated (151, 152), and Latent Class models are recommended where two or more groups of respondents with similar preferences are anticipated (153). Binary choice models are appropriate where dichotomous choice sets are employed (154, 155).