5.5 Step-by-Step Implementations in R
5.5.3 Fitting the Semiparametric Estimation: IntCox
The “IntCox” method described in Section 5.4.3 is implemented in the R package intcox. We load the package as
> library(intcox)
and fit the breast cancer data by calling the intcox function as:
> fit.IntCox = intcox(Surv(tL,tU,type="interval2")~TRT,data=dat)
no improvement of likelihood possible, iteration = 1
> # print the model fit
> fit.IntCox
Call:
intcox(formula = Surv(tL, tU, type = "interval2") ~ TRT, data = dat)
      coef exp(coef) se(coef)  z  p
TRT -0.776      0.46       NA NA NA
Likelihood ratio test=NA  on 1 df, p=NA  n= 94
It may be seen from the output of fit.IntCox that the estimated treatment coefficient is −0.776. With this fit, we can extract the estimated piecewise baseline cumulative hazard function and use it to calculate the estimated survival functions for both treatments. We can then display them graphically, as in Figure 5.5, using the following code. The reader may compare this figure with figures from Kaplan–Meier and other methods.
> # the baseline survival is exp(-baseline cumulative hazard)
> surv.base = exp(-fit.IntCox$lambda0)
> # plot the survival function for TRT=0
> plot(fit.IntCox$time.point,surv.base,type="s",
    xlab="Time in Months",ylab="S(t)",lty=4, lwd=3)
> # add the survival function for TRT=1
> lines(fit.IntCox$time.point,surv.base^exp(fit.IntCox$coef),
    type="s",lty=1, lwd=3)
> # add the legend
> legend("bottomleft",title="Line Types",lty=c(1,4),lwd=3,
    c("Radiation Only","Radiation+Chemotherapy"))
FIGURE 5.5: Estimated Survival Functions from IntCox.
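The plotting code above relies on the proportional hazards relation S(t | TRT) = S0(t)^exp(β·TRT), with S0(t) = exp(−Λ0(t)). The following is a minimal sketch of that calculation on made-up numbers: the time points and cumulative hazard values are hypothetical and are not taken from fit.IntCox; only the coefficient −0.776 comes from the output above.

```r
# Hypothetical step-function baseline cumulative hazard (NOT from fit.IntCox)
time.point <- c(0, 10, 20, 30, 40)
lambda0    <- c(0, 0.10, 0.35, 0.80, 1.50)
beta       <- -0.776   # estimated TRT coefficient reported above

# Baseline survival (TRT = 0) and survival under TRT = 1
surv0 <- exp(-lambda0)
surv1 <- surv0^exp(beta)

# Equivalent form: scale the cumulative hazard by the hazard ratio exp(beta)
surv1.alt <- exp(-lambda0 * exp(beta))

print(round(cbind(time.point, surv0, surv1), 3))
```

Because β is negative here, the hazard ratio exp(β) is below 1, so the TRT=1 curve lies on or above the baseline curve at every time point.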
Furthermore, we note that the output from intcox (i.e., fit.IntCox) is like that from the Cox regression function coxph, except that no standard errors of the regression parameters are available at the time of writing this chapter. Standard errors for the regression parameters may be estimated using standard bootstrap methods. We draw random samples of the observed data with replacement a large number of times and fit intcox to each resulting bootstrap sample, which can be implemented easily in R as
> set.seed(12345678)
> # number of bootstrapping=1000
> num.boot = 1000
> boot.intcox = numeric(num.boot)
> # the for-loop
> for(b in 1:num.boot){
    # sample with replacement
    boot.ID = sample(1:dim(dat)[1],replace=TRUE)
    # fit intcox for the bootstrap sample
    boot.fit = intcox(Surv(tL,tU,type="interval2")~TRT,
        dat[boot.ID,],no.warnings = TRUE)
    # keep track the coefficient
    boot.intcox[b] = coef(boot.fit)
  } # end of b-loop
The 95% confidence interval for the treatment effect can be obtained using the R function quantile as
> Boot.CI = quantile(boot.intcox, c(0.025,0.975))
> Boot.CI
  2.5%  97.5%
-1.412 -0.237
Therefore, from this bootstrapping sample, we see that the 95% confidence interval for the treatment effect is (−1.412, −0.237), with estimated regression parameter β̂ = −0.776. Because this interval excludes zero, it again confirms the statistical significance of the treatment effect.
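The same resampling scheme also yields a bootstrap standard error, namely the standard deviation of the bootstrap estimates. The sketch below illustrates the idea in base R only. It is not the chapter's analysis: the data are simulated, and a linear-model coefficient (the hypothetical helper trt.coef) stands in for the coefficient of an intcox fit.

```r
# Simulated two-arm data (hypothetical; stands in for the trial data)
set.seed(2024)
n   <- 94
sim <- data.frame(TRT = rep(0:1, length.out = n))
sim$y <- 1 - 0.8 * sim$TRT + rnorm(n)

# Statistic of interest: the TRT coefficient from a linear model.
# In the chapter this role is played by coef() of an intcox fit.
trt.coef <- function(d) unname(coef(lm(y ~ TRT, data = d))["TRT"])

num.boot <- 500
boot.est <- numeric(num.boot)
for (b in seq_len(num.boot)) {
  # resample rows with replacement
  boot.ID <- sample(nrow(sim), replace = TRUE)
  boot.est[b] <- trt.coef(sim[boot.ID, ])
}

boot.se <- sd(boot.est)                          # bootstrap standard error
boot.ci <- quantile(boot.est, c(0.025, 0.975))   # percentile interval
print(boot.se)
print(boot.ci)
```

Replacing trt.coef with a function that fits intcox to the resampled rows and returns its coefficient would give the corresponding quantities for the interval-censored model.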
In addition, we can use this bootstrapping sample to evaluate the bias between Pan’s ICM estimate and the mean/median of the bootstrapping samples as:
> bias.IntCox = c(mean.bias=coef(fit.IntCox)-mean(boot.intcox),
    median.bias=coef(fit.IntCox)-median(boot.intcox))
> bias.IntCox
  mean.bias.TRT median.bias.TRT
         0.0248          0.0202
This shows that the bias is negligible. The bootstrapping distribution, the confidence interval, and the biases are depicted in Figure 5.6, using the following R code chunk:
> # Histogram from bootstrap sample
> hist(boot.intcox,prob=TRUE,las=1,
    xlab="Treatment Difference",ylab="Prob", main="")
> # put vertical lines for
> abline(v=c(Boot.CI[1],fit.IntCox$coef,mean(boot.intcox),
    median(boot.intcox),Boot.CI[2]), lwd=c(2,3,3,3,2),
    lty=c(4,1,2,3,4))
FIGURE 5.6: Bootstrapping Distribution for Treatment Difference. Overlaid on this bootstrapping distribution are the lower (left-most dashed vertical line) and upper (right-most dashed vertical line) limits of the 95% confidence interval. The three vertical lines in the middle depict the estimated treatment effect from “intcox” and the mean and the median of the bootstrapping sample. Because they are so close, it is difficult to distinguish them.
120 Clinical Trial Data Analysis Using R
5.6 Concluding Remarks
In this chapter, we presented a variety of methods and models for analyzing time-to-event data in clinical trials with step-by-step implementation in the R system. Readers may use the R code and explanations provided in this chapter to analyze their own clinical trial data.
For further reading, we recommend the book by Peace (2009), which deals specifically with the design and analysis of a variety of clinical trials with time-to-event endpoints. Other general texts on survival analysis are Lawless (1982), Kalbfleisch and Prentice (2002) and Collett (2003).
For interval-censored data, Lindsey and Ryan (1998) is an excellent article with which to begin, as it gives a broad review of statistical methods for interval-censored data. In that paper, the same breast cancer data given in Table 5.2 were used for illustration, and the reader may compare the results from the article to those in this chapter (they are exactly the same!). For a comprehensive understanding of the theory and analysis of interval-censored data, we recommend the book by Sun (2006), which collects and unifies statistical models and methods for analyzing interval-censored data. As a further extension of interval-censoring, progressive type-I interval-censoring is commonly seen. Readers may refer to Chen and Lio (2010) and the references cited therein.
Chapter 6
Analysis of Data from Longitudinal Clinical Trials
6.1 Clinical Trials
  6.1.1 Diastolic Blood Pressure Data
  6.1.2 Clinical Trial on Duodenal Ulcer Healing
6.2 Statistical Models
  6.2.1 Linear Mixed Models
  6.2.2 Generalized Linear Mixed Models
  6.2.3 Generalized Estimating Equation
6.3 Analysis of Data from Longitudinal Clinical Trials
  6.3.1 Analysis of Diastolic Blood Pressure Data
    6.3.1.1 Preliminary Data Analysis
    6.3.1.2 Longitudinal Modeling
  6.3.2 Analysis of Cimetidine Duodenal Ulcer Trial
    6.3.2.1 Preliminary Analysis
    6.3.2.2 Fit Logistic Regression to Binomial Data
    6.3.2.3 Fit Generalized Linear Mixed Model
    6.3.2.4 Fit GEE
6.4 Concluding Remarks
In this chapter, we analyze response data from longitudinal clinical trials using the R system. The primary feature of response data from longitudinal clinical trials is that it is measured over time on each clinical trial participant along with covariates. Therefore an objective in the analysis of such data is to model its change over time along with the effects of treatment and covariates.
We present two real clinical trial datasets in Section 6.1 of this chapter. The statistical models used to analyze these data appear in Section 6.2. Step-by-step implementation of the models in R is illustrated in Section 6.3. Concluding remarks follow in Section 6.4.
Note: to run the R programs in this chapter, the analyst should first install the following R packages: RODBC, nlme, lme4, gee, MASS, multcomp, mvtnorm and lattice.
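One way to check which of these packages still need to be installed is sketched below. The package list is transcribed from the note above, with the mixed-model package taken to be lme4, which is an assumption on our part; the install.packages call is left commented out because it requires an internet connection and a CRAN mirror.

```r
# Packages named in the note above (lme4 is assumed for the mixed-model package)
pkgs <- c("RODBC", "nlme", "lme4", "gee", "MASS",
          "multcomp", "mvtnorm", "lattice")

# Packages from the list that are not found in the current library paths
missing.pkgs <- setdiff(pkgs, rownames(installed.packages()))
print(missing.pkgs)

# Uncomment to install any that are missing (requires a CRAN mirror)
# if (length(missing.pkgs) > 0) install.packages(missing.pkgs)
```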