Missing Data in Survival Analysis and Results from the MESS Trial
J. K. Rogers
J. L. Hutton
K. Hemming
Department of Statistics
University of Warwick
Outline
Background
Survival Analysis
Missing Data
MESS Trial
Background
MRC Multicentre Trial for Early Epilepsy and Single Seizures
Initial Analysis
Suitable Models
Survival Analysis
Modelling Survival Data
- Time to event
- Censoring: actual survival time not observed for an individual
- Right censoring: the observed, censored survival time is less than the actual, but unknown, survival time
- Two functions are of central interest:
  - Survivor function: $S(t) = P(T \geq t)$
  - Hazard function: $h(t) = \lim_{\delta t \to 0} \left\{ \frac{P(t \leq T \leq t + \delta t \mid T \geq t)}{\delta t} \right\}$
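The two functions are linked, which is why either one determines the other. As a short worked sketch, assuming $T$ is continuous with density $f(t)$ (the density $f$ and the cumulative hazard integral are notation introduced here, not taken from the slides):

```latex
\begin{align*}
h(t) &= \lim_{\delta t \to 0} \frac{P(t \leq T \leq t + \delta t \mid T \geq t)}{\delta t}
      = \frac{f(t)}{S(t)}
      = -\frac{d}{dt} \log S(t), \\
S(t) &= \exp\left\{ -\int_0^t h(u)\, du \right\}.
\end{align*}
```

For example, a constant hazard $h(t) = \lambda$ gives $S(t) = e^{-\lambda t}$, the exponential survival model.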
Missing Data
Let $Y = \{y_{ij}\}$ denote an $(n \times k)$ complete-data rectangular data set, with $n$ cases over $k$ variables, partitioned into observed and missing parts as $Y = (Y_{obs}, Y_{mis})$.
- MCAR (missing completely at random): missingness independent of $Y$
- MAR (missing at random): missingness depends only on $Y_{obs}$
- MNAR (missing not at random): neither MCAR nor MAR
Missing data methods include complete case analysis, imputation techniques and model based approaches.
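The three mechanisms can be illustrated with a small simulation. The following is a minimal sketch, not part of the trial analysis: the variables `x` and `y`, the logistic missingness model and the function names are all assumptions made for illustration. It generates a fully observed `x` and a partially missing `y`, then computes the complete case mean of `y` under each mechanism.

```python
import numpy as np


def simulate_missingness(n=20000, seed=0):
    """Simulate (x, y) and missingness indicators for y under three mechanisms.

    Hypothetical setup: x is always observed, y = x + noise has true mean 0.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = x + rng.normal(size=n)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    missing = {
        # MCAR: every y has the same 30% chance of being missing.
        "MCAR": rng.random(n) < 0.3,
        # MAR: chance of missingness depends only on the observed x.
        "MAR": rng.random(n) < sigmoid(x),
        # MNAR: chance of missingness depends on the unobserved y itself.
        "MNAR": rng.random(n) < sigmoid(y),
    }
    return x, y, missing


def complete_case_mean(y, missing):
    """Mean of y using only the cases where y is observed."""
    return float(np.mean(y[~missing]))


if __name__ == "__main__":
    x, y, missing = simulate_missingness()
    for mechanism, indicator in missing.items():
        print(mechanism, round(complete_case_mean(y, indicator), 3))
```

In this toy setup the complete case mean stays close to the true value of 0 under MCAR, but is pulled below 0 under MAR and MNAR because larger values are more likely to be missing. Under MAR the bias can in principle be corrected using the observed `x`; under MNAR it cannot be corrected from the observed data alone.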
Background
Early Epilepsy and Single Seizures
- On average 50% of people do not experience a recurrence after a single seizure
- Around 20-30% of people will never achieve long-term remission
- Risk of future seizures increases with the number of previous seizures
MRC Multicentre Trial for Early Epilepsy and Single Seizures
Aim of Trial
- When should treatment with antiepileptic drugs commence?
- Antiepileptic drugs come with unpleasant side effects
- Comparison of policies: immediate versus deferred treatment in those patients where uncertainty about starting treatment remained
Outcomes Measured
- Assessed the effects of the two policies on short-term recurrence and long-term remission
- Time to first seizure
- Time to second seizure
- Time to fifth seizure
- Time to one year remission
- Time to two year remission
Kaplan-Meier Plots
[Figure: Kaplan-Meier plots of S(t) against t, one panel each for time to first seizure, time to second seizure, time to fifth seizure, time to one year remission and time to two year remission; separate curves for patients allocated to START and allocated to DELAY.]
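Curves of this kind come from the Kaplan-Meier (product-limit) estimator. The following is a minimal sketch of that estimator for a single group; the function name `kaplan_meier` and the toy data are hypothetical and are not taken from the MESS trial.

```python
import numpy as np


def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate for one group.

    times  : observed times (event or censoring time) for each individual
    events : 1/True if the event was observed, 0/False if right censored

    Returns the distinct event times and the estimated survivor function
    just after each of those times.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)

    event_times = np.unique(times[events])  # sorted distinct event times
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                  # still under observation at t
        n_events = np.sum((times == t) & events)      # events occurring at t
        s *= 1.0 - n_events / at_risk                 # product-limit update
        survival.append(s)
    return event_times, np.array(survival)


if __name__ == "__main__":
    # Toy data: five individuals, two of them right censored (event = 0).
    t, s = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
    print(t)  # event times 1, 2, 3
    print(s)  # approximately 0.8, 0.6, 0.3
```

To compare two policies, the same function would be applied separately to the individuals in each arm and the two step functions plotted together.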
Suitable Models
[Figure: plots of S(t) for time to one year remission and time to two year remission.]
The Missing Data Problem
Randomisation Issues
- Two randomisation forms used during the trial:
  1. Randomisation → Drug (approx. 1/3)
  2. Drug → Randomisation (approx. 2/3)
- Second randomisation strategy allows comparisons between specific drugs
- Adopt missing data methods to overcome problem of missing covariates
The Missing Data Problem