Analysis of Clinical Trials with Time-to-Event Endpoints
5.4 Statistical Methods for Interval-Censored Data
In oncology clinical trials, but also in many other clinical trials, time-to-event data are generated by subjecting the patient to questioning, physical examination or other diagnostic methods at scheduled clinic follow-up visits.
In such trials, the exact time of the event (e.g. cancer or remission) is not known. Rather, if the event is present at a particular clinic visit, one knows only that it occurred between the previous visit and the current visit. Such times-to-event are therefore interval-censored. Interval-censored data are commonly produced in clinical trials with a non-lethal endpoint, such as progression-free survival (PFS), time to no evidence of disease, or time to remission in oncology trials.
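As a minimal sketch of how a visit schedule gives rise to an interval-censored observation, the following Python function derives the interval from a patient's assessment history. The function name, the input layout, and the use of 0 as the time origin are assumptions made for illustration only.

```python
import math


def interval_from_visits(visit_times, event_detected):
    """Return the censoring interval (lower, upper] implied by a visit history.

    visit_times    : increasing assessment times (hypothetical input layout)
    event_detected : booleans, True if the event was present at that visit
    """
    last_negative = 0.0  # assumed time origin, e.g. randomization
    for time, detected in zip(visit_times, event_detected):
        if detected:
            # The event occurred after the last negative visit and by this visit.
            return (last_negative, time)
        last_negative = time
    # The event was never observed: right-censored at the last visit.
    return (last_negative, math.inf)
```

For example, a patient assessed at weeks 8, 16 and 24 whose event is first seen at week 24 contributes the interval (16, 24], whereas a patient with no event at any visit contributes (24, ∞), which is an ordinary right-censored observation.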
Owing to lack of knowledge of more appropriate statistical methods or inaccessibility of appropriate statistical software, the common ad hoc practice is to approximate each interval-censored time by the left endpoint, the right endpoint, or the midpoint of the interval. Under this convention, well-known statistical methods developed for exact or right-censored failure time data may be used. However, force-fitting exact time-to-event methods to interval-censored data may introduce bias and may render the resulting inferences invalid.
Since the seminal Cox proportional hazards regression is not directly applicable to interval-censored data, we introduce the proportional hazards model for interval-censored data fitted with an iterative convex minorant (ICM) algorithm, as proposed by Pan (1999). This approach is implemented in the R package intcox. In addition, we introduce Turnbull’s nonparametric estimator for interval-censored data as well as some parametric models for fitting interval-censored data. A comprehensive discussion can be found in Sun (2006).
5.4.1 Turnbull’s Nonparametric Estimator
For interval-censored data, Turnbull (1976) proposed an analog to the Kaplan–Meier product-limit estimator. This is based on an iterative procedure to estimate the survival function S(t) corresponding to the interval-censored data such as those presented in Table 5.2.
To construct Turnbull’s estimator, the observed times-to-event are ordered in the same manner as for the Kaplan–Meier estimator. Let 0 = τ0 < τ1 < τ2 < · · · < τm be the ordered distinct time points comprising all left endpoints tLi and right endpoints tUi of the intervals (tLi, tUi], i = 1, 2, · · · , n, from the n patients. Notice that m is usually larger than n because each patient's interval contributes up to two time points.
Then, for the ith patient, define an indicator Iij to keep track of whether the interval (τj−1, τj) lies completely within the interval (tLi, tUi]:

$$I_{ij} = \begin{cases} 1 & \text{if } (\tau_{j-1}, \tau_j) \subseteq (t_{Li}, t_{Ui}] \\ 0 & \text{otherwise} \end{cases} \qquad (5.29)$$

where Iij also indicates whether the event that occurred in (tLi, tUi] could have occurred at τj. Based on this indicator, Turnbull’s estimator is obtained from the following iterative steps:
1. Make an initial guess at S(τj) and compute

$$p_j = S(\tau_{j-1}) - S(\tau_j), \qquad j = 1, 2, \ldots, m$$

2. Compute the expected number of events occurring at τj using

$$e_j = \sum_{i=1}^{n} \frac{I_{ij}\, p_j}{\sum_{k=1}^{m} I_{ik}\, p_k}$$

3. Compute the estimated number at risk at time τj using

$$r_j = \sum_{k=j}^{m} e_k$$

4. Compute the updated product-limit estimator S(τj) using the constructed pseudo data from Steps 2 and 3.
5. Repeat Steps 1 to 4 with the updated estimate Snew(τj) in place of the previous one. Stop the iterative process when Snew(τj) is close to the previous estimate for all τj. The convergence of the iterative approach depends on the initial guess of S(τj), which is typically obtained from the Kaplan–Meier estimator.
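The iterative steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the implementation used by any R package. It assumes every interval is finite with tLi < tUi, and it takes the expected number of events in Step 2 to be the standard self-consistency expression, the sum over patients of Iij pj divided by the sum over k of Iik pk. For simplicity it starts from equal mass on every grid interval rather than from a Kaplan–Meier estimate. The function name and the arguments tol and max_iter are hypothetical.

```python
def turnbull(intervals, tol=1e-8, max_iter=10000):
    """Sketch of Turnbull's iteration for finite intervals (lower, upper]."""
    # Ordered grid 0 = tau_0 < tau_1 < ... < tau_m of all interval endpoints.
    tau = sorted({0.0} | {float(t) for pair in intervals for t in pair})
    m = len(tau) - 1
    # indicator[i][j-1] is 1 if (tau_{j-1}, tau_j) lies inside (lower_i, upper_i].
    indicator = [[1 if lower <= tau[j - 1] and tau[j] <= upper else 0
                  for j in range(1, m + 1)]
                 for lower, upper in intervals]
    # Step 1: initial guess (assumption: equal mass on each grid interval).
    p = [1.0 / m] * m
    surv = []
    for _ in range(max_iter):
        # Step 2: expected number of events assigned to each tau_j.
        e = [0.0] * m
        for row in indicator:
            denom = sum(i_ij * p_j for i_ij, p_j in zip(row, p))
            for j in range(m):
                e[j] += row[j] * p[j] / denom
        # Step 3: expected number at risk, r_j = sum of e_k for k >= j.
        r, running = [0.0] * m, 0.0
        for j in reversed(range(m)):
            running += e[j]
            r[j] = running
        # Step 4: product-limit update from the pseudo data (e_j, r_j).
        surv, new_p, s_prev = [], [], 1.0
        for j in range(m):
            s_new = s_prev * (1.0 - e[j] / r[j]) if r[j] > 0 else s_prev
            surv.append(s_new)
            new_p.append(s_prev - s_new)
            s_prev = s_new
        # Step 5: stop once the estimate no longer changes appreciably.
        converged = max(abs(a - b) for a, b in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return tau[1:], surv
```

Calling turnbull([(0, 2), (1, 3), (2, 4)]) returns the grid points τ1, ..., τm together with the estimated S(τj). In this small example the three intervals overlap only on (1, 2] and (2, 3], so the estimated probability mass concentrates on those two grid intervals.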
5.4.2 Parametric Likelihood Estimation with Covariates
The usual likelihood approach starts with the proportional hazards assumption as in Equation (5.26), which combines the covariates X and the vector of regression coefficients β via a linear predictor with the baseline hazard h0(t).
98 Clinical Trial Data Analysis Using R
We are interested in the effect the covariates have on the probability of the occurrence of events as formulated by the survival function
S(t|X) = 1 − F (t|X) (5.30)
where F is the cumulative distribution function. By mathematical manipulation similar to that in Equation (5.28), we have

$$S(t|X) = S_0(t)^{\exp(\beta' X)} \qquad (5.31)$$

where S0(t) is the baseline survival function, which is independent of the covariates. Therefore,

$$1 - F(t|X) = S(t|X) = S_0(t)^{\exp(\beta' X)} = [1 - F_0(t)]^{\exp(\beta' X)} \qquad (5.32)$$

Therefore, for n patients with observed interval-censored data (tLi, tUi], i = 1, · · · , n, the log-likelihood function with regression parameter vector β and the parameters θ from the baseline distribution can be constructed as follows:

$$L(\beta, \theta) = \sum_{i=1}^{n} \log\left\{ [1 - F_0(t_{Li}|\theta)]^{\exp(\beta' X_i)} - [1 - F_0(t_{Ui}|\theta)]^{\exp(\beta' X_i)} \right\} \qquad (5.33)$$

Commonly used baseline distribution functions F0 are defined in Section 5.2.2. Statistical estimation and inference are then based on maximum likelihood methods applied to Equation (5.33), which have been implemented in the R package survival.
The advantage of this likelihood approach is that we can estimate the regression parameter vector β and the baseline parameters θ simultaneously (see Peace and Flora (1978)). The disadvantage is that we need to specify the baseline distribution F0, which is contrary to the essence of Cox regression if interest is only in estimates of and inferences on the regression parameters.
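To make the construction concrete, the following Python sketch evaluates an interval-censored log-likelihood under the proportional hazards relation in Equation (5.32), where each patient contributes log{S(tLi|Xi) − S(tUi|Xi)}. The Weibull baseline with parameters shape and scale is an assumption chosen for illustration, and its parameterization may differ from the one in Section 5.2.2. The function names are hypothetical. A right-censored patient is represented by an infinite upper endpoint.

```python
import math


def weibull_baseline_survival(t, shape, scale):
    """Baseline survival S0(t) = exp(-(t / scale) ** shape) (assumed form)."""
    if math.isinf(t):
        return 0.0
    return math.exp(-((t / scale) ** shape))


def interval_loglik(beta, shape, scale, intervals, covariates):
    """Sum over patients of log{ S0(lower)^risk - S0(upper)^risk }."""
    total = 0.0
    for (lower, upper), x in zip(intervals, covariates):
        # risk = exp(beta'x), the multiplier acting on the baseline hazard.
        risk = math.exp(sum(b * xk for b, xk in zip(beta, x)))
        s_lower = weibull_baseline_survival(lower, shape, scale) ** risk
        s_upper = weibull_baseline_survival(upper, shape, scale) ** risk
        total += math.log(s_lower - s_upper)
    return total
```

Passing this function (with its sign reversed) to a general-purpose numerical optimizer would yield joint estimates of β and the baseline parameters, which is the simultaneous estimation referred to above.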
5.4.3 Semiparametric Estimation: The IntCox
Building on Section 5.4.2, Pan’s (1999) semiparametric method estimates the regression parameters β as the parametric part and represents the baseline cumulative distribution function F0(t) in the likelihood function of Equation (5.33) by a nonparametric piecewise constant function, which is estimated using the iterative convex minorant (ICM) algorithm. Since the parameter vector θ associated with F0(t) is eliminated, the log-likelihood function in Equation (5.33) now becomes

$$L(F_0, \beta) = \sum_{i=1}^{n} \log\left\{ [1 - F_0(t_{Li})]^{\exp(\beta' X_i)} - [1 - F_0(t_{Ui})]^{\exp(\beta' X_i)} \right\} \qquad (5.34)$$

Henschel, Heiss and Mansmann implemented Pan’s ICM in the R package intcox, which can be obtained from the following link:
http://cran.r-project.org/web/packages/intcox/
In this section, we briefly describe this implementation. The reader may use the above link to access more information. The implementation maximizes the log-likelihood in Equation (5.34) by a modified Newton-Raphson algorithm, assuming that the baseline function F0(t) is a piecewise constant function represented by a finite-dimensional vector, which is estimated together with the regression parameter β.
With the log-likelihood function in Equation (5.34), the gradients are

$$\nabla_1 L(F_0, \beta) = \frac{\partial L(F_0, \beta)}{\partial F_0} \quad \text{and} \quad \nabla_2 L(F_0, \beta) = \frac{\partial L(F_0, \beta)}{\partial \beta}.$$

The full Hessian matrix in the original Newton-Raphson algorithm is replaced by the diagonal matrices of the negative second partial derivatives G1(F0, β) and G2(F0, β).
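The Newton-Raphson algorithm iteratively updates the estimates at iteration m to those at iteration m + 1, using a step size α with starting value α = 1, as follows:

$$F_0^{(m+1)} = \mathrm{Proj}\left[ F_0^{(m)} + \alpha \left(G_1^{(m)}\right)^{-1} \nabla_1 L^{(m)},\; G_1^{(m)},\; R \right]$$

$$\beta^{(m+1)} = \beta^{(m)} + \alpha \left(G_2^{(m)}\right)^{-1} \nabla_2 L^{(m)}$$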
To ensure that the updated F0 is a distribution function, the authors used a projection into the restricted range R, weighted by G, as

$$\mathrm{Proj}[y, G, R] = \arg\min_{x} \left\{ \sum_{i=1}^{k} (y_i - x_i)^2 G_{ii} : 0 \le x_1 \le \cdots \le x_k \le 1 \right\}$$
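The projection is a weighted least-squares fit under a monotonicity constraint with bounds 0 and 1. One common way to compute such a fit is the pool-adjacent-violators algorithm followed by clipping to [0, 1]. The Python sketch below takes that route as an illustration and is not claimed to be the routine used inside intcox. The function name is hypothetical, and the weights (the diagonal entries Gii) are assumed to be strictly positive.

```python
def project_monotone(y, weights):
    """Weighted projection of y onto 0 <= x_1 <= ... <= x_k <= 1."""
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for value, weight in zip(y, weights):
        blocks.append([value, weight, 1])
        # Pool adjacent blocks while they violate the increasing order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, w_b, n_b = blocks.pop()
            mean_a, w_a, n_a = blocks.pop()
            w = w_a + w_b
            blocks.append([(mean_a * w_a + mean_b * w_b) / w, w, n_a + n_b])
    fitted = []
    for mean, _, length in blocks:
        # Clipping the monotone fit enforces the bounds 0 and 1.
        fitted.extend([min(1.0, max(0.0, mean))] * length)
    return fitted
```

For instance, with equal weights the vector (0.2, 0.1, 0.5) violates the ordering in its first two entries, so they are pooled to their mean while the third entry is left unchanged.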
If the log-likelihood decreases, that is, if $L^{(m+1)} < L^{(m)}$, α is halved and the step is repeated. To expedite convergence, starting values are computed by treating the data as right-censored and using the classical proportional hazards model.
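The step-halving rule can be illustrated with a small generic Python sketch. It is not the intcox code: the function name, the optional project argument (standing in for the projection applied to F0), and the cap max_halvings are assumptions introduced here. The argument step plays the role of the scaled gradient, that is, the inverse of the diagonal matrix G applied to the gradient of L.

```python
def halving_update(loglik, current, step, project=lambda v: v, max_halvings=30):
    """Move from current along step, halving alpha while loglik decreases."""
    alpha = 1.0  # starting step size, as in the text
    base = loglik(current)
    for _ in range(max_halvings):
        candidate = project([c + alpha * s for c, s in zip(current, step)])
        if loglik(candidate) >= base:
            # The log-likelihood did not decrease: accept this step.
            return candidate, alpha
        alpha /= 2.0  # otherwise halve alpha and try again
    # No acceptable step found within the cap: keep the current value.
    return list(current), 0.0
```

With the toy objective L(v) = −(v − 1)², a current value of 0 and a proposed step of 3, the full step overshoots and lowers the objective, so α is halved once and the update lands at 1.5.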