Composite endpoints combine a number of individual outcomes in order to assess the effectiveness or efficacy of a treatment. They are typically used when it is difficult to identify a single relevant endpoint that sufficiently captures the change in disease status induced by the treatment; however, they may also be employed for other purposes [8–11].
The construction of the composite endpoint differs depending on the disease. For instance, it is common in randomised trials of cardiovascular conditions to combine a number of binary outcomes such as death, myocardial infarction, stroke or ischemia-driven target vessel revascularization, as in [12]. These composite endpoints are typically analysed using time-to-event methods. Composite endpoints in other diseases combine outcomes on different scales, such as continuous and discrete measures. Our focus in this thesis is on a subset of these outcomes known as composite responder endpoints. These endpoints classify patients as either ‘responders’ or ‘non-responders’ based on whether they cross predefined thresholds in the individual outcomes, and are typically treated as a single binary endpoint. It is both theoretically and pragmatically important to distinguish between composites that require patients to experience an event in all components and those that require an event in at least one of the components. We introduce the general characteristics of both below.
1.1.1 Events in at Least One Component
To be classed as a responder in some diseases, patients may have to meet one of multiple criteria, which may be defined on different scales. Alternatively, patients may have to respond in a subset of the components in order to be responders overall, such as in rheumatoid arthritis where response in five of seven components equates to response overall. Properties related to composite endpoints requiring events in at least one component have been discussed at length in the literature, e.g. [11]. The considerations in the construction of composite endpoints are summarised as follows.
1. Coherence
Coherence in this context means that components should measure the same underlying pathophysiologic process, as well as the same disease process.
2. Coincidence
Although composite endpoints should be coherent, coincidence ensures that components are not so closely related that patients experience all of them. In this case it is considered that the composite endpoint has become redundant and the effects of treatment can be captured in a single component.
3. Therapy homogeneity
From an investigator’s perspective it is important that the composite endpoint is sensitive to the treatment being evaluated and it is desirable that effect sizes are similar on each component.
A desirable aspect of these endpoints is that their application in trials may result in an increase in power, provided everything else remains constant. This is due to an increase in the number of events, where events are defined as any occurrence of response. Moyé [11] frames the possible power gains in terms of probability by assuming a two-dimensional composite endpoint with outcomes A and B, as shown in (1.1).
\begin{align}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) \tag{1.1} \\
            &= P(A) + P(B) - P(A \mid B)\,P(B) \notag \\
            &= P(A) + P(B)\,\bigl(1 - P(A \mid B)\bigr) \notag
\end{align}
From this we can see that the event rate P(A ∪ B) is at its maximum when P(A|B) = 0, implying that mutual exclusivity of events in the composite is desirable. However, achieving mutual exclusivity in practice often requires linking together events that physicians are unaccustomed to combining, which makes the endpoint challenging to interpret [9, 10]. Furthermore, this argument is simplistic, as the magnitude of any power gain also depends on the treatment effect in each component as well as on the occurrence of events; therapy homogeneity across components is considered optimal in terms of power [11, 13].
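The relationship in (1.1) can be checked numerically. The following is a minimal Python sketch and is not part of the original derivation; the function name `union_event_rate` and the example probabilities are hypothetical and chosen only for illustration.

```python
def union_event_rate(p_a: float, p_b: float, p_a_given_b: float) -> float:
    """Event rate P(A or B) computed from equation (1.1).

    The function name and arguments are illustrative assumptions.
    """
    for p in (p_a, p_b, p_a_given_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    joint = p_a_given_b * p_b  # P(A and B) = P(A|B) P(B)
    tol = 1e-12
    # The joint probability must be compatible with both marginals.
    if joint > min(p_a, p_b) + tol or joint < max(0.0, p_a + p_b - 1.0) - tol:
        raise ValueError("P(A|B) is incompatible with the marginal probabilities")
    return p_a + p_b * (1.0 - p_a_given_b)


if __name__ == "__main__":
    p_a, p_b = 0.20, 0.10  # assumed marginal event rates
    for p_a_given_b in (0.0, 0.2, 0.5, 1.0):
        rate = union_event_rate(p_a, p_b, p_a_given_b)
        print(f"P(A|B) = {p_a_given_b:.1f} -> P(A or B) = {rate:.2f}")
```

With these assumed marginals, the composite event rate falls from 0.30 under mutual exclusivity to 0.20 when every patient with event B also has event A, at which point adding B contributes no further events.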
1.1.2 Events in All Components
Alternatively, patients may be classed as responders only if they meet specific criteria in all components of the outcome. As before, these endpoints must be both coherent and homogeneous in response to therapy. However, they are not designed to avoid coincidence. Examples of these endpoints arise in solid tumour cancers, where a patient is only classed as a responder if they have experienced a predefined reduction in tumour size and have not developed new lesions [14].
Generally, these endpoints do not have the advantage of increasing power in a given study, as a larger number of events is required. Instead, they are useful in multisystem diseases in which an intervention must treat a range of symptoms in order to be considered truly effective. Considering the probability of the events in the composite occurring, we are now concerned with P(A ∩ B), which, for fixed P(A) and P(B), is largest when P(A ∪ B) is minimised, as shown in (1.2). However, as before, other considerations such as treatment effect homogeneity are also relevant.
\begin{equation}
P(A \cap B) = P(A) + P(B) - P(A \cup B) \tag{1.2}
\end{equation}
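For fixed marginal probabilities, the joint probability in (1.2) is constrained by the standard Fréchet bounds. These are stated here as a supplementary note rather than as part of the original argument:

```latex
\[
\max\{0,\; P(A) + P(B) - 1\} \;\le\; P(A \cap B) \;\le\; \min\{P(A),\; P(B)\}
\]
```

The upper bound is attained when one event implies the other, in which case $P(A \cup B) = \max\{P(A), P(B)\}$. This corresponds to the coincidence that endpoints requiring events in at least one component are constructed to avoid.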
The methods in this thesis will be developed for composite responder endpoints in general, which may be applied where events are required in all components or in at least one component.
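Both constructions can be viewed as special cases of a rule requiring events in at least k of m components, with k = 1 for 'at least one' and k = m for 'all'. The sketch below illustrates this under stated assumptions: the function names, the thresholds and the convention that larger values indicate improvement are hypothetical and do not come from any specific disease definition.

```python
from typing import List, Sequence


def dichotomise(values: Sequence[float], thresholds: Sequence[float]) -> List[bool]:
    """Flag an event in each component whose value reaches its threshold.

    Assumes, for illustration only, that larger values indicate improvement.
    """
    if len(values) != len(thresholds):
        raise ValueError("values and thresholds must have the same length")
    return [v >= t for v, t in zip(values, thresholds)]


def is_responder(component_events: Sequence[bool], min_events: int) -> bool:
    """Return True if at least `min_events` components show an event."""
    if not 1 <= min_events <= len(component_events):
        raise ValueError("min_events must be between 1 and the number of components")
    return sum(bool(e) for e in component_events) >= min_events


if __name__ == "__main__":
    # Hypothetical patient measured on three components.
    events = dichotomise([0.35, 12.0, 1.0], [0.30, 15.0, 1.0])
    print(events)                                  # [True, False, True]
    print(is_responder(events, min_events=1))      # at least one component: True
    print(is_responder(events, min_events=len(events)))  # all components: False
```

The same hypothetical patient is a responder under the 'at least one' rule but not under the 'all components' rule, which is why the choice of rule must be fixed before the composite is analysed as a single binary endpoint.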
1.1.3 Opportunities and Limitations
There are many potential benefits to conducting a clinical trial using composite endpoints. As discussed, in the case of requiring response in at least one component, composite endpoints have the advantage that they increase the number of events in the trial [8, 15, 16]. In the likely case that this leads to a reduction in sample size, trials may be shorter and less expensive, resulting in effective drugs being brought to market earlier [17–20]. Composite endpoints are particularly useful when a number of outcomes are equally relevant. In particular, in the case of diseases with large variation in symptoms, employing a composite endpoint will avoid an arbitrary choice of a single outcome [9, 15, 21, 22]. Furthermore, combining equally relevant outcomes and analysing them as a composite endpoint negates the requirement for a multiple comparison adjustment [22–25]. In addition, proponents of composite endpoints believe that they are appropriate as they estimate the net clinical benefit of intervention by accounting for the multiple factors of interest in a given disease [26–29].
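The link between event rate and sample size can be illustrated with the usual normal-approximation formula for comparing two proportions. This is a rough sketch under stated assumptions, not the method developed in this thesis: the function name `n_per_arm`, the event rates and the 25% relative reduction are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control: float, p_treated: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-arm sample size for a two-sided comparison of two
    proportions (normal approximation, unpooled variance)."""
    for p in (p_control, p_treated):
        if not 0.0 < p < 1.0:
            raise ValueError("event rates must lie strictly between 0 and 1")
    if p_control == p_treated:
        raise ValueError("event rates must differ")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1.0 - p_control) + p_treated * (1.0 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


if __name__ == "__main__":
    # Assumed scenario: the treatment reduces the event rate by 25% in relative terms.
    single = n_per_arm(0.10, 0.075)     # single component, control rate 0.10
    composite = n_per_arm(0.28, 0.21)   # composite, control rate 0.28, same relative effect
    diluted = n_per_arm(0.28, 0.255)    # composite where only the first component responds
    print(single, composite, diluted)
```

Under these assumptions the composite requires fewer patients than the single component only when the relative effect carries over to the added components. If the added components do not respond, the same absolute reduction is spread over a higher event rate and the required sample size increases.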
However, there are limitations in the application of composite endpoints. In practice, composites may be inconsistently defined and provide opportunities for post-hoc changes [30]. Composite endpoints may be driven by less important or subjective components, meaning that a promising treatment effect may not translate to the expected benefit for patients [9, 22, 31]. If ‘quantitative heterogeneity’ occurs and treatment effects observed on the components are in different directions, this will make interpreting the overall effect challenging [10, 11, 15]. Furthermore, treatment effects on the overall composite may be diminished or harmful effects may be masked if unresponsive components are included [8, 10, 24]. Although composite endpoints have the capacity to capture multiple aspects of a disease, ‘qualitative heterogeneity’ means that not all patients will attach similar importance to each component [10, 18, 27–29]. Finally, it may not always be possible to avoid multiple testing corrections as many applications of composite endpoints require that the treatment effects on each individual component should also be reported [11, 15].
1.1.4 Recommendations for Use
When employing composite endpoints, guidance must be followed in order to ensure valid and meaningful implementation in clinical trials [8]. The overall treatment effect reported on a composite endpoint depends on the correlation between components and on the direction of the treatment effect in each component, and hence on the patient responder rates. It is therefore important for interpretation that effects on the individual components are reported as secondary results. In order to reduce ambiguity in application, several sets of guidelines have been issued, including those from the European Network for Health Technology Assessment (EUnetHTA) for application in pharmaceuticals [32]. We summarise below the recommendations from the literature for the construction, reporting and interpretation of composite endpoints [13, 30].
A. Construction
• Composite endpoints should generally not be used if a suitable single endpoint is available, except when it can be justified to be more suitable (e.g. rare disease/event) [32]
• Composites and components should be clearly prespecified before starting the trial [9, 21]
• Prior evidence should exist for each component to avoid including clinically unimportant outcomes [19, 22]
• Including outcomes that are unlikely to experience an effect of the intervention should be avoided [11, 27]
• A mix of objective and subjective outcomes should be avoided [13, 31, 32]
B. Reporting
• Components should be separately defined as secondary endpoints and effects reported with the primary analysis results to determine if one component has dominated the composite [13, 18, 25]
• Separate components can be reported according to severity level and the ‘worst’ outcome experienced should be reported according to a predefined ranking system [25, 32]
• Relevant combinations of the components relating to subgroups or special patient populations at risk should be reported [24]
• The number of patients with partially missing values on some components should be reported in detail [32]
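The reporting recommendations above can be supported by a simple per-component summary. The following is a minimal sketch assuming a hypothetical data layout in which each patient is a dictionary mapping component names to an event flag, with `None` marking a missing value; the function names and component names are illustrative only.

```python
from typing import Dict, List, Optional, Sequence

Patient = Dict[str, Optional[bool]]  # component name -> event flag, None if missing


def component_summary(patients: List[Patient],
                      component_names: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Number observed and event rate for each component, among observed values."""
    summary: Dict[str, Dict[str, float]] = {}
    for name in component_names:
        observed = [p[name] for p in patients if p.get(name) is not None]
        rate = sum(observed) / len(observed) if observed else float("nan")
        summary[name] = {"n_observed": len(observed), "event_rate": rate}
    return summary


def n_partially_missing(patients: List[Patient],
                        component_names: Sequence[str]) -> int:
    """Count patients with some, but not all, components missing."""
    count = 0
    for p in patients:
        n_missing = sum(p.get(name) is None for name in component_names)
        if 0 < n_missing < len(component_names):
            count += 1
    return count


if __name__ == "__main__":
    names = ["pain", "function"]  # hypothetical component names
    arm = [
        {"pain": True, "function": False},
        {"pain": None, "function": True},
        {"pain": False, "function": False},
        {"pain": None, "function": None},
    ]
    print(component_summary(arm, names))
    print(n_partially_missing(arm, names))
```

In practice such a summary would be produced separately for each treatment arm and reported alongside the composite result. How missing components should enter the composite itself is a separate question that this sketch does not address.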
C. Interpretation
• Treatment effects should be interpreted based on the composite endpoint (any effect of the components should be interpreted together rather than concluding efficacy of individual components) [9, 13]
• Clinically important components should be checked to ensure that they have not been affected negatively by the treatment [22, 24]
• If comparable composite endpoints are available from several studies, basing the overall conclusion on a meta-analysis should be considered [32]