Chapter 1. Introduction
1.4 Evaluating strain typing as a tool for TB control
Evaluations explore whether interventions and policies do what they intend to do, and what impact they have. This is necessary because interventions are costly, funds are limited and for every intervention implemented, there is an opportunity cost – what alternative interventions could have been implemented with the same resources? Evaluations, therefore, enable decision-makers to set priorities. There is also the possibility that interventions may have unexpected, adverse effects. Evaluations can be useful for the design of future interventions and policies. There are five main evaluation paradigms: experimentalist, pragmatist, constructivist, pluralist and realist. These approaches are briefly summarised here.
Experimental evaluation is based on the experimental research design whereby an intervention is introduced into one of two groups that are matched (either by randomisation or matching on potential confounders) and the groups are measured before and after the intervention.144 This relies on a theory of causation that removes all other possible causative agents so that there is just one possible causal link: the intervention. This approach is not easily applied to interventions that exist in the real world and introduces the ‘black box’ problem, whereby it only produces a description of the outputs, rather than explaining why something does or does not work.
Pragmatist evaluation calls for evaluations to be useful to those that they are intended for; they are utilisation-focussed. There are four features of a pragmatic evaluation: utility, feasibility, propriety and accuracy.145 This focus raises the problem of the policy-maker or the funder having too much influence over the evaluation, removing the equipoise and the objectivity. This approach also led the way for a text-book approach to evaluation, with a prescriptive step-by-step method.146,147
Constructivist evaluation turns the focus away from the outputs of an intervention and the policy makers, towards the processes and meaning – engaging all possible stakeholders to establish why an intervention does or does not work.148 This paradigm falls short because it does not allow for the objective assessment of an intervention, nor does it account for the asymmetry of power across the stakeholders.
Pluralist evaluation is an attempt to combine the experimental, the pragmatist and the constructivist paradigms. One form of this is the comprehensive evaluation, which has three main activities:149
1. Analysis of the intervention design
2. Monitoring of the program implementation
3. Assessing program utility.
This approach is criticised for trying to encompass too much; being too broad and requiring too many resources to conduct properly.150
The final paradigm summarised here attempts to make the pluralist approach more realistic and applicable to the real world. Realist evaluation is based on a generative model of causation: that an action is causal only if its outcome is triggered by a mechanism acting in context (outcome=mechanism+context).150 Pawson and Tilley (1997) argue that an evaluation should demonstrate if the program works, what it is about the program that works for whom and in what conditions. The realist evaluation is based on Wallace’s (1971) wheel of science and argues for an evaluation cycle that includes:151
Theories (based on outcome=mechanism+context);
Hypotheses (that hypothesise what might work for whom and in what circumstances);
Observations (multi-method data collection and analysis of the mechanisms, context and outcomes); and
Figure 12 – Realist evaluation cycle
Adapted from: Pawson and Tilley (1997) p.85 150
Existing evaluations of TB strain typing services
TB surveillance is carried out in most countries to varying degrees of scope and quality.1 However, molecular surveillance is a relatively new and expensive form of surveillance currently reserved for richer countries with lower TB burdens, and for research purposes. Therefore, molecular surveillance for TB occurs in very few countries; in 2014 the only countries with a universal molecular surveillance system for TB were the Netherlands,100,127 the USA,64,136 Denmark,124,152 Norway130,132 and Slovenia.133,134 The public health value of such surveillance systems has been demonstrated through the retrospective evaluation of strain typing data.87,102,116,127 A universal service in England was anticipated to add value to the TB control strategy.153
Evaluations of existing strain typing services have focussed on single elements of the service such as the discrimination of the typing method,69 effectiveness of cluster investigations,97,152 and the ability to identify cross contamination.117 No evaluation to date has looked at the entirety of the service and analysed its complexity, taking into account the resources and infrastructure, the processes involved, the multiple outputs and the long term outcomes.
Complex interventions
A national strain typing service is a complex intervention. Complex interventions are often defined as interventions with several interacting components.154 The MRC’s definition of a complex intervention includes:
Number of, and interactions between, components within the experimental and control interventions
Number and difficulty of behaviours required by those delivering or receiving the intervention
Number of groups or organisational levels targeted by the intervention
Number and variability of outcomes
Degree of flexibility or tailoring of the intervention permitted
Complex interventions present difficult problems for evaluators due to the nature of their complexity: the design and delivery of the intervention are complex; the effect of the local context on the intervention is variable; and the length and complexity of the pathway between the intervention and its outcomes may be difficult to unravel. In addition, there are practical difficulties in applying experimental research methods to service evaluation. A strain typing service is a complex intervention because it has many different parts that are organised and delivered by different groups (such as laboratories, public health teams and clinical teams); these groups accept and implement the service differently in different settings and across different parts of the country; and the causal chains between a strain typing service and its potential outcomes are complicated and difficult to capture.
To increase the number and improve the quality of evaluations, frameworks for evaluating complex interventions have been developed. The MRC published a ‘Framework for the Development and Evaluation of RCTs for Complex Interventions to Improve Health’ in 2000.155
This framework was subsequently updated based on the experience collected by the scientific community, the need to include non-experimental research methods and to make it applicable to interventions outside of the health service.154 The overall framework covers the development and piloting of an intervention, the evaluation and dissemination of findings. The evaluation framework has three main components: assessing effectiveness, process evaluation and assessing cost-effectiveness. It is acknowledged that an experimental research design is not always possible in the evaluation of a complex intervention; instead, quasi-experimental or observational studies may be adequate alternatives. The framework advocates for a good theoretical understanding of the intervention in order to identify appropriate outcome measures, and suggests the use of surrogate outcome measures.
The MRC evaluation framework has been criticised by ‘realists’ for not including an explanation of the mechanisms of change that might link the intervention with the outcomes and not examining how the intervention might interact with the context. Realist evaluation is based on a generative model of causation: that an action is causal only if its outcome is triggered by a mechanism acting in context (outcome=mechanism+context).150 Pawson and Tilley (1997) argue that an evaluation should demonstrate if the programme works, what it is about the programme that works for whom and under what conditions.
Whilst the MRC framework advocates for the development of the intervention to be based on theory, it does not include the role of a theory in the evaluation process, which is crucial to other theory-driven frameworks.156,157 Theory-driven frameworks are based on the thesis that understanding the theory underlying an intervention is necessary for understanding whether it works and how it works. Theory of Change is one such framework in which a hypothesised theory of how the intervention effects change is developed with stakeholders and can be tested empirically.158 An advantage of Theory of Change is that it can better represent the complexity of an intervention, as it makes the causal pathways explicit but does not impose a structure, allowing for interactions, feedback loops and multiple pathways. Compared to the MRC framework, the disadvantages of Theory of Change are that there are multiple, prescriptive steps that need to be followed, which may not be appropriate for the evaluation context, and that it does not include a framework for the dissemination of results.159,160
These frameworks incorporate the processes involved in the development of the intervention as well as its evaluation. There is an assumption that the evaluation team have been a part of the intervention design team and that the design, implementation and evaluation of the intervention are (or can be) an iterative process. In situations where this is not the case, and the evaluation is either retrospective or an add-on to the implementation of the intervention, these frameworks may not be the most appropriate choice.
Donabedian’s formative framework for evaluating the quality of medical care161 provides a framework around which an evaluation can be designed, without prescribing any particular steps, nor assuming that the evaluators are involved in the design of the intervention from its initiation. The framework divides a service up by its Structures, Processes, Outputs and Outcomes. ‘Structures’ refers to the resources and inputs of the service; ‘Processes’ refers to the activity of the resources and the processes involved in the service; ‘Outputs’ refers to the products of the service; and ‘Outcomes’ refers to changes resulting from such outputs. An advantage of this framework is that it ensures that the whole of the service is considered in the evaluation, rather than just focussing on the Outcomes of the service and ignoring the Structures and Processes that produce those outcomes. In effect, by evaluating the Processes one is attempting to avoid so-called Type III errors – where one ends up evaluating an intervention that is not implemented properly.162 Process evaluation can provide the insight and context necessary to interpret the outputs of a service, help to explain differences between the observed and expected outcomes, and identify ways of intervening to improve the service.154 Subsequently, this enables the researchers to develop constructive recommendations based on the Structures and Processes of the service – elements of the service that can be directly influenced – as recommended in other evaluation guidelines.154
Evaluation frameworks tend to agree in many ways: the evaluation design should take into account who the evaluation is for and what kind of decisions it might influence; hypotheses about how the intervention will effect change should be made explicit; and the appropriate methods available (of which there are likely to be multiple) should be used to understand if the intervention works, how it works, and in what contexts.