Population Selection with a Short-term Endpoint: Problems and Solutions
Gernot Wassmer, PhD
Institute for Medical Statistics
University of Cologne, Germany
Introduction
• Consider a situation where the design adaptation, e.g., selecting a population in an enrichment design, is not based on the primary endpoint.
• For example, a biomarker or a surrogate endpoint may provide valuable
information. This might be a laboratory measure which is available more quickly than the primary endpoint, or a preliminary measurement of the primary
endpoint.
• In survival trials, when the endpoint is time to event, this becomes even more important, as usually the interim analyses are conducted after a specified
Introduction
• The selection process that is based on a surrogate requires a specific statistical methodology.
• The problem is that at least some of the patients contribute to both analyses: to the selection decision with their surrogate value, and to the final confirmatory analysis with their primary endpoint value.
• Therefore, to control the Type I error rate, such trials require a different statistical analysis method than trials where selection is based on the primary endpoint alone.
• A solution has been provided by Jenkins et al. (2011) and Friede et al. (2011). See also Irle and Schäfer (2012), Magirr et al. (2014), and Mehta et al. (2014).
The Enrichment Procedure
• For simplicity, consider the two-sample comparison case, although an extension to the multi-armed case is straightforward.
• Consider prespecified subpopulation(s) S1,…,SG and a full population F.
• At an interim stage, it is decided which subpopulation is selected for further inference (possibly all subpopulations, i.e., the full population).
• In general, not only selection procedures, but also other adaptive strategies (e.g., sample size reassessment) can be performed.
Methodology
• Sources for alpha inflation
  • Interim analyses
  • Sample size reassessment
  • Multiple sub-populations
• The proposed adaptive procedure strongly controls the pre-specified family-wise Type I error rate. It is based on the closed test procedure together with combination tests (Bauer and Kieser, 1999).
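The closed test principle named above can be sketched in a few lines. This is only an illustration: the function `closed_test`, the hypothesis labels, the test statistics, and the critical value `u2 = 1.96` are hypothetical and not taken from the slides. An elementary hypothesis is rejected only if every intersection hypothesis containing it is rejected by its own level-alpha test.

```python
from itertools import combinations


def closed_test(hypotheses, rejects_intersection):
    """Return the elementary hypotheses rejected by the closed test procedure.

    hypotheses: labels of the elementary null hypotheses, e.g. ["S1", "S2", "F"].
    rejects_intersection: callable that takes a frozenset of labels and returns
        True if the level-alpha test of that intersection hypothesis rejects.
    """
    # Enumerate all non-empty intersections of the elementary hypotheses.
    intersections = [
        frozenset(c)
        for r in range(1, len(hypotheses) + 1)
        for c in combinations(hypotheses, r)
    ]
    rejected = []
    for h in hypotheses:
        # h is rejected only if every intersection containing h is rejected.
        if all(rejects_intersection(i) for i in intersections if h in i):
            rejected.append(h)
    return rejected


# Illustrative (made-up) combination test statistics for each intersection,
# compared with an assumed critical value u2.
u2 = 1.96
stat = {
    frozenset({"S1", "S2", "F"}): 2.4,
    frozenset({"S1", "F"}): 1.1,
    frozenset({"S2", "F"}): 2.5,
    frozenset({"S1", "S2"}): 2.3,
    frozenset({"F"}): 1.5,
    frozenset({"S1"}): 0.9,
    frozenset({"S2"}): 2.6,
}
rejected = closed_test(["S1", "S2", "F"], lambda i: stat[i] > u2)
print(rejected)
```

With these made-up numbers only the hypothesis for S2 is rejected, because every intersection containing it exceeds `u2`, while the intersection of S1 and F does not.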
Example: 2 stages, S = S2
Stage I:
$H_0^{S_1} \cap H_0^{S_2} \cap H_0^{F}$
$H_0^{S_1} \cap H_0^{F}$, $H_0^{S_2} \cap H_0^{F}$, $H_0^{S_1} \cap H_0^{S_2}$
$H_0^{F}$, $H_0^{S_1}$, $H_0^{S_2}$
Stage II:
$H_0^{S_2}$
$H_0^{S_2}$ can be rejected if all combination tests exceed the critical value $u_2$.
Closed Testing Procedure
• The choice of combination tests is free. E.g., you might use the inverse normal or Fisher's combination test.
• The choice of tests for intersection hypotheses is free. E.g., you might use Bonferroni, Simes, or Šidák tests.
• For one subgroup, Dunnett's test can also be applied. In this case, you might also use the conditional rejection probability (CRP) principle, i.e., perform a conditional Dunnett test (Friede et al., 2012).
• The calculation of repeated confidence intervals (RCIs) and overall p-values is straightforward.
• Methodology and design parameters are provided in Wassmer and Dragalin (2014).
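To make the options above concrete, here is a minimal sketch of the three intersection tests and of Fisher's combination test for two stages. The function names and the example p-values are assumptions for illustration only. Fisher's test uses the closed form of the chi-square survival function with 4 degrees of freedom, so only the standard library is needed.

```python
import math


def bonferroni(pvals):
    """Bonferroni p-value for the intersection of the given hypotheses."""
    return min(1.0, len(pvals) * min(pvals))


def sidak(pvals):
    """Sidak p-value for the intersection (valid under independence)."""
    return 1.0 - (1.0 - min(pvals)) ** len(pvals)


def simes(pvals):
    """Simes p-value: min over i of m * p_(i) / i for the ordered p-values."""
    m = len(pvals)
    return min(m * p / (i + 1) for i, p in enumerate(sorted(pvals)))


def fisher_combination(p1, p2):
    """Fisher's combination p-value for two independent stage-wise p-values.

    -2 * ln(p1 * p2) is chi-square with 4 degrees of freedom under H0, whose
    survival function at x is exp(-x / 2) * (1 + x / 2).
    """
    prod = p1 * p2
    return prod * (1.0 - math.log(prod))


# Illustrative stage-one p-values for S and F (made-up numbers).
stage1 = [0.01, 0.04]
print(bonferroni(stage1), sidak(stage1), simes(stage1))
print(fisher_combination(0.02, 0.03))
```

Simes is never more conservative than Bonferroni, and Šidák lies slightly below Bonferroni; which one is appropriate depends on the dependence structure of the p-values.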
The Problem with the Surrogate
• For survival designs, in general, only the test statistic from the confirmatory phase may be used for subsequent planning; no other information from patients at risk may be used (Bauer and Posch, 2004).
• So, in theory, the surrogate information must not be used for the adaptation. Otherwise, if the primary endpoint and the surrogate are correlated, the occurrence of an event can be predicted for the patients at risk, and control of the Type I error rate can no longer be guaranteed.
• In particular, using a surrogate to select a sub-population or to change the sample size is not allowed.
The Solution
$w \, \Phi^{-1}(1 - \tilde{p}_k^{G_1}) + \sqrt{1 - w^2} \, \Phi^{-1}(1 - \tilde{p}_k^{G_2})$
Issues
• Asymptotic normality and the independent increments structure of the test statistic in both the G1 and the G2 population.
Simulations show that the Type I error rate is controlled, which is partly due to the conservatism of the test procedures for the intersection hypotheses.
• Solution: discard early stops for efficacy at interim.
• A number of patients who have been randomized in a deselected subpopulation are not used for further analysis.
• Furthermore, patients in G1 from deselected subpopulations usually have discontinued follow-up (Friede et al., 2011). For these treatment populations, the test statistic is set equal to −∞ (or, equivalently, the p-value is set equal to 1).
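The effect of setting a p-value to 1 can be illustrated with a small sketch. Everything here is an assumption for illustration: the helper names, the example p-values, the Bonferroni intersection test for the G1 data, and a weighted inverse normal combination of the G1 and G2 p-values with weights w and sqrt(1 − w²).

```python
import math
from statistics import NormalDist


def bonferroni(pvals):
    """Bonferroni p-value for an intersection hypothesis."""
    return min(1.0, len(pvals) * min(pvals))


def inverse_normal_z(p_g1, p_g2, w):
    """Weighted inverse normal statistic w * z1 + sqrt(1 - w^2) * z2.

    Both p-values must lie strictly between 0 and 1, and 0 < w < 1.
    """
    nd = NormalDist()
    z1 = nd.inv_cdf(1.0 - p_g1)
    z2 = nd.inv_cdf(1.0 - p_g2)
    return w * z1 + math.sqrt(1.0 - w * w) * z2


# Made-up G1 p-values: S was selected, F was deselected, so its p-value is 1.
p_s_g1 = 0.04
p_f_g1 = 1.0
# The deselected population contributes nothing except the multiplicity
# penalty: the intersection p-value is simply 2 * p_s_g1.
p_int_g1 = bonferroni([p_s_g1, p_f_g1])

# Combine with a made-up G2 p-value for S, using an assumed weight w = 0.5.
z = inverse_normal_z(p_int_g1, 0.01, w=0.5)
print(p_int_g1, round(z, 3))
```

The convention is conservative: the intersection test still pays for two hypotheses although only one population carries information.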
Implementation
$\rho = 2 \cdot \sin(\pi \tilde{\rho} / 6)$
Example
Predictive Value of a Surrogate and Its Impact on the Power of a Seamless Phase II/III Design
• Consider a three-stage design with O’Brien & Fleming boundaries
• One subgroup S (with prevalence 40%), selecting either S or F in the first interim analysis, based on the better hazard ratio (i.e., assume a “positive” event)
• In the primary endpoint, we assume
  • a control event rate at 12 months of 53% for both S and Sc
  • a treatment group event rate in S of 70%
  • a treatment group event rate in Sc of 59%
• In the surrogate, we assume
  • a control event rate at 12 months of 10% for both S and Sc
  • a treatment group event rate in S of 50%
  • a treatment group event rate in Sc of 30%
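As a rough orientation, the 12-month event rates for the primary endpoint can be translated into hazard ratios. This sketch assumes proportional hazards, under which the treatment survival function equals the control survival function raised to the power of the hazard ratio. The function name `hazard_ratio` is hypothetical, and the slides do not state which survival model was used.

```python
import math


def hazard_ratio(event_rate_treatment, event_rate_control):
    """Hazard ratio implied by event probabilities at a common time point.

    Assumes proportional hazards: S_trt(t) = S_ctl(t) ** HR, hence
    HR = ln(1 - rate_trt) / ln(1 - rate_ctl).
    """
    return math.log(1.0 - event_rate_treatment) / math.log(1.0 - event_rate_control)


# Primary endpoint, 12-month event rates from the assumptions above.
hr_primary_s = hazard_ratio(0.70, 0.53)    # subgroup S
hr_primary_sc = hazard_ratio(0.59, 0.53)   # complement Sc

# Treatment event rate in the full population F, with 40% prevalence of S.
rate_f_treatment = 0.40 * 0.70 + 0.60 * 0.59

print(round(hr_primary_s, 3), round(hr_primary_sc, 3), round(rate_f_treatment, 3))
```

Because the event is "positive", a hazard ratio above 1 favors treatment here. Under this assumption the ratio is about 1.59 in S and about 1.18 in Sc.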
Example
Predictive Value of a Surrogate and Its Impact on the Power of a Seamless Phase II/III Design
• Does this fit a predictive value of, say, 80%?

Control:
              Surrogate
Primary     No      Yes     Total
No          0.45    0.02    0.47
Yes         0.45    0.08    0.53
Total       0.90    0.10

Active S:
              Surrogate
Primary     No      Yes     Total
No          0.20    0.10    0.30
Yes         0.30    0.40    0.70
Total       0.50    0.50

Active Sc:
              Surrogate
Primary     No      Yes     Total
No          0.35    0.06    0.41
Yes         0.35    0.24    0.59
Total       0.70    0.30
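The cell probabilities in these tables follow from the two marginal event rates once a predictive value is fixed. The sketch below reads "predictive value" as P(primary event | surrogate event); that reading is an assumption, although it matches the tables (e.g., 0.08 / 0.10 = 0.80 for control). The function name `joint_table` and the key names are hypothetical.

```python
def joint_table(p_primary, p_surrogate, predictive_value):
    """Joint probabilities of (primary, surrogate) from margins and predictive value.

    predictive_value is taken to be P(primary event | surrogate event).
    Raises ValueError if the inputs imply a negative cell probability.
    """
    yes_yes = predictive_value * p_surrogate   # both events occur
    yes_no = p_primary - yes_yes               # primary event only
    no_yes = p_surrogate - yes_yes             # surrogate event only
    no_no = 1.0 - p_primary - no_yes           # neither event
    cells = {
        "primary_no_surrogate_no": no_no,
        "primary_no_surrogate_yes": no_yes,
        "primary_yes_surrogate_no": yes_no,
        "primary_yes_surrogate_yes": yes_yes,
    }
    if min(cells.values()) < -1e-12:
        raise ValueError("margins and predictive value are incompatible")
    return cells


control = joint_table(0.53, 0.10, 0.80)
active_s = joint_table(0.70, 0.50, 0.80)
active_sc = joint_table(0.59, 0.30, 0.80)
for name, table in [("Control", control), ("Active S", active_s), ("Active Sc", active_sc)]:
    print(name, {k: round(v, 2) for k, v in table.items()})
```

With a predictive value of 80%, this reproduces the three tables, so the stated margins do fit that value. Not every combination is feasible: a high predictive value with a large surrogate event rate can exceed the primary event rate, in which case the function raises an error.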
Example
Predictive Value of a Surrogate and Its Impact on the Power of a Seamless Phase II/III Design
• Consider Bonferroni correction for G1 data
• 50 patients for G1
• Maximum total number of 400 patients
• Weight w = 0.125
• Range of effect sizes in S, predictive values of 60%, 80%, 95%
• Simulations performed with ADDPLAN 6.1
Example
Predictive Value of a Surrogate and Its Impact on the Power of a Seamless Phase II/III Design
• The Type I error rate is controlled
• Especially if a range of effect sizes is assumed, the predictive value or sensitivity must be carefully selected
• The power to reject the null hypothesis for S is bounded by the selection probability, which is based on the surrogate effect size and depends on the sample size in G1
Open Problems and Issues
• No theoretical foundation for the ad hoc solutions; they are justified only by simulation
• Situations where the adaptive design performs poorly (worse than a separate Phase II/III design)
• Essentially due to low probabilities of correct selection or implicit futility rules
• Crucial role of simulations to assess the impact of basing patient population selection on surrogates instead of the primary endpoint and to decide which design should be used.
Bauer, P., Kieser, M. (1999). Combining different phases in the development of medical treatments within a single trial. Statistics in Medicine 18: 1833-1848.
Bauer, P., Posch, M. (2004). Letter to the Editor: Modification of the sample size and the schedule of interim analyses in survival trials based on data inspections. Statistics in Medicine 23: 1333-1335.
Brannath, W., Zuber, E., Branson, M., Bretz, F., Gallo, P., Posch, M., Racine-Poon, A. (2009). Confirmatory adaptive designs with Bayesian decision tools for a targeted therapy in oncology. Statistics in Medicine 28: 1445-1463.
Fleming, T. R., DeMets, D. L. (1996). Surrogate end points in clinical trials: are we being misled? Ann Intern Med 125: 605-613.
Friede, T., Parsons, N., Stallard, N., Todd, S., Valdes Marquez, E., Chataway, J., Nicholas, R. (2011). Designing a seamless phase II/III clinical trial using early outcomes for treatment selection: an application in multiple sclerosis. Statistics in Medicine 30: 1528-1540.
Friede, T., Parsons, N., Stallard, N. (2012). A conditional error function approach for subgroup selection in adaptive clinical trials. Statistics in Medicine 31: 4309-4320 (correction in Statistics in Medicine 32).
Irle, S., Schäfer, H. (2012). Interim design modifications in time-to-event studies. Journal of the American Statistical Association 107: 341-348.
Jenkins, M., Stone, A., Jennison, C. (2011). An adaptive seamless phase II/III design for oncology trials with subpopulation selection using correlated survival endpoints. Pharmaceutical Statistics 10: 347-356.
Magirr, D., Jaki, T., König, F., Posch, M. (2014). Adaptive designs for time-to-event trials. Submitted.