
Computing Software Packages for Survival Analysis

In document Survival Analysis - Guo (Page 138-142)

All commercial software packages offer procedures for survival analysis. Issues related to running some procedures have been discussed in relevant places in the book. In this chapter I provide an overview to highlight key issues in programming with SAS, SPSS, and Stata. Syntax files using the SAS and Stata packages to generate the examples in this book are available on the book’s companion Web page. Readers may find them useful.

SAS

All survival procedures offered by SAS require the user to specify the value indicating censoring, not event. This feature is categorically different from SPSS and Stata. As a consequence, the user needs to be cautious in syntax specification when running the same data with different packages.
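The coding difference can be illustrated with a small sketch. Python is used here purely for illustration; the status coding (1 = event, 0 = censored) and the function name are assumptions made for this example and are not taken from the book's syntax files.

```python
# Hypothetical status variable: 1 = event occurred, 0 = censored.
STATUS_EVENT = 1
STATUS_CENSORED = 0

def value_to_declare(package):
    """Return the status value the user must declare in a given package."""
    declares_censoring = {"SAS"}         # the user names the censoring value
    declares_event = {"SPSS", "Stata"}   # the user names the event value
    if package in declares_censoring:
        return STATUS_CENSORED
    if package in declares_event:
        return STATUS_EVENT
    raise ValueError(f"unknown package: {package}")

declared = {p: value_to_declare(p) for p in ("SAS", "SPSS", "Stata")}
print(declared)
```

With the same data file, the value declared in SAS is therefore the complement of the value declared in SPSS or Stata.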

Proc Lifetest is the procedure to generate a life table including estimated hazard and survivor functions, the Kaplan-Meier estimation of survivor function, bivariate tests (i.e., log-rank and Wilcoxon tests) on differences of survivor curves between groups, and graphics including the hazard plot, survivor plot, log-survivor plot, and log-log survivor plot (also known as log-cumulative-hazard plot). Additional graphic tools, such as Proc gplot and the macro SMOOTH written by Allison (1995), may be used to make the estimated curves smoother and more suitable for presentation.
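To make the Kaplan-Meier estimation concrete, the following is a minimal sketch of the product-limit calculation in plain Python. It is not how Proc Lifetest is implemented; the function name and the toy data are hypothetical, and ties between events and censored cases at the same time are handled by treating the censored cases as still at risk at that time.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator)."""
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)   # all cases leaving the risk set at t
    at_risk = len(durations)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve

# Toy data: event = 1 means the event occurred, 0 means censored.
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(durations, events)
for t, s in km_curve:
    print(t, round(s, 3))
```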

Proc Logistic estimates a binary logistic regression, and Proc Catmod estimates a multinomial logit model. Hence, they are the procedures for discrete-time models. Before running Proc Logistic or Proc Catmod, the user needs to use programming commands to convert the person data into person-time data. For examples of data conversion, see Allison (1995) or the syntax files available on the companion Web page for this book.
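The person-to-person-time conversion can be sketched as follows. This is an illustrative Python version under simple assumptions (integer durations, one event type); the field names `id`, `duration`, `event`, `period`, and `y` are hypothetical, and the book's own conversion code is in the SAS and Stata syntax files.

```python
def to_person_period(persons):
    """Expand one record per person into one record per person per period at risk."""
    rows = []
    for p in persons:
        for period in range(1, p["duration"] + 1):
            is_last = period == p["duration"]
            rows.append({
                "id": p["id"],
                "period": period,
                # The outcome is 1 only in the final period, and only if the event occurred.
                "y": 1 if (is_last and p["event"] == 1) else 0,
            })
    return rows

persons = [
    {"id": 1, "duration": 3, "event": 1},   # event in period 3
    {"id": 2, "duration": 2, "event": 0},   # censored after period 2
]
person_period = to_person_period(persons)
for row in person_period:
    print(row)
```

The binary outcome `y` in the expanded file is what a logistic regression would then model, with `period` entering as a predictor.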

Proc Phreg is the procedure to estimate the Cox proportional hazards model. To run the model with time-varying covariates, the user needs to specify which variables indicate the time-varying information following the Proc Phreg statement and use a series of ‘‘if...then’’ commands. The BASELINE key word in Proc Phreg may be used to generate the model-predicted survivor curves. COVS(AGGREGATE) is the key word to request the LWA or WLW marginal models; but for the WLW model, additional programming using Proc IML is required. The WLW macro created by Allison (1995) can also be used to run the model.

Proc Lifereg is the procedure to estimate the parametric models. The DISTRIBUTION key word allows the user to specify different types of parametric models; the choices are EXPONENTIAL for the exponential model, GAMMA for the generalized or standard gamma model, LOGISTIC for the logistic model, LLOGISTIC for the log-logistic model, LNORMAL for the log-normal model, and WEIBULL for the Weibull model. To run the piecewise exponential model, the user specifies DISTRIBUTION=EXPONENTIAL, but it is necessary to run the procedure using the person-time data.
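The reason person-time data are needed for the piecewise exponential model can be seen in a small sketch: each spell is split at the interval cut points, and the hazard is treated as constant within each interval. The Python code below is only an illustration for the case without covariates, where the maximum-likelihood hazard in an interval equals the number of events divided by the total exposure time in that interval. The function names, the cut point, and the toy spells are hypothetical.

```python
def split_episodes(duration, event, cuts):
    """Split one spell into (interval_index, exposure, event) pieces at the cut points."""
    pieces = []
    bounds = [0] + list(cuts) + [float("inf")]
    for j in range(len(bounds) - 1):
        lo, hi = bounds[j], bounds[j + 1]
        if duration <= lo:
            break
        exposure = min(duration, hi) - lo
        ends_here = duration <= hi
        # The event flag is carried only by the piece in which the spell ends.
        pieces.append((j, exposure, event if ends_here else 0))
    return pieces

def piecewise_hazards(spells, cuts):
    """Events divided by exposure within each interval (no covariates)."""
    n = len(cuts) + 1
    events = [0] * n
    exposure = [0.0] * n
    for duration, event in spells:
        for j, t, d in split_episodes(duration, event, cuts):
            events[j] += d
            exposure[j] += t
    return [d / t if t > 0 else None for d, t in zip(events, exposure)]

# Toy spells: (duration, event), with one cut point at time 5.
spells = [(4, 1), (10, 1), (12, 0), (6, 0)]
hazards = piecewise_hazards(spells, cuts=[5])
print(hazards)
```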

SPSS

SURVIVAL is the procedure to generate the life table and the hazard plot, survivor plot, log-survivor plot, and density (PDF) plot. KM produces the Kaplan-Meier estimate of the survivor function and generates similar plots. Both procedures offer bivariate tests (i.e., log-rank and Wilcoxon tests).

LOGISTIC REGRESSION estimates the binary logistic regression (i.e., for the discrete-time model analyzing a single event), and NOMREG estimates the multinomial logit model (i.e., for the discrete-time model analyzing multiple events).

COXREG is the procedure to run the Cox proportional hazards model. To specify time-varying covariates in the Cox regression, the user needs to use TIME PROGRAM before running COXREG to inform the program which variables contain time-varying information and how to create the time-varying covariates for the Cox model; typically this is done through a series of ‘‘if...’’ commands. To obtain model-predicted survivor curves, the user specifies PLOT SURVIVAL. This procedure only produces the curve using sample mean values of all independent variables, or curves defined by a categorical independent variable.

Currently there is no procedure available to run multilevel analysis.

There is no procedure available to run the parametric models.

STATA

To run any procedure of survival analysis in Stata, the user needs to run stset first to inform the program of key variables and their roles in the analysis. Variables measuring the study time and the event code are defined at this stage. If the data file is saved after stset has been run, the user does not need to run stset again in later sessions. Stata distinguishes between single-record data (also known as a wide or multivariate file) and multiple-record data (also known as a long or univariate file), and this is a key feature Stata uses to run models with time-varying covariates. To run time-varying models, the user needs to organize the file in a multiple-record format.

Using stset, the user informs the program of the ID variable so the program recognizes that, within the same value of ID, records are for the same individual but at different times.
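The multiple-record layout can be sketched as follows: one individual with a time-varying covariate becomes several records sharing the same ID, each covering a time interval during which the covariate is constant. This Python sketch only illustrates the data structure; it does not reproduce any Stata command, and the field names (`start`, `stop`, `event`, `x`) are hypothetical.

```python
def to_multiple_records(pid, duration, event, changes, initial):
    """Build (start, stop] records for one individual.

    changes: list of (time, new_value) pairs, assumed to lie strictly
    between 0 and duration; initial is the covariate value at time 0.
    """
    records = []
    start, value = 0, initial
    for t, new_value in sorted(changes):
        if t >= duration:
            break
        # Intermediate records never carry the event.
        records.append({"id": pid, "start": start, "stop": t, "event": 0, "x": value})
        start, value = t, new_value
    # Only the last record carries the individual's final status.
    records.append({"id": pid, "start": start, "stop": duration, "event": event, "x": value})
    return records

# One individual observed for 10 time units; the covariate switches from 0 to 1 at time 4.
records = to_multiple_records(pid=7, duration=10, event=1, changes=[(4, 1)], initial=0)
for r in records:
    print(r)
```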

Several procedures can be used to conduct univariate, bivariate, and graphic analysis. sts is the procedure to generate, graph, list, and test the survivor functions (via the Kaplan-Meier estimator) and the Nelson-Aalen cumulative hazard function. stci computes means and percentiles of study time, and their standard errors and confidence intervals. ltable displays and graphs life tables for individual-level or aggregate data and provides the likelihood-ratio and log-rank tests to discern group differences.
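For readers unfamiliar with the Nelson-Aalen estimator, a minimal sketch in plain Python follows. It accumulates, at each event time, the number of events divided by the number at risk. This is an illustration of the formula only, not of how sts computes it; the function name and toy data are hypothetical.

```python
from collections import Counter

def nelson_aalen(durations, events):
    """Return [(t, H(t))] at each distinct event time (cumulative hazard)."""
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)   # all cases leaving the risk set at t
    at_risk = len(durations)
    cum_hazard = 0.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Add the increment d / n for this event time.
            cum_hazard += d / at_risk
            curve.append((t, cum_hazard))
        at_risk -= exits[t]
    return curve

# Toy data: event = 1 means the event occurred, 0 means censored.
na_curve = nelson_aalen([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, h in na_curve:
    print(t, round(h, 3))
```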

The user needs to convert the person data into person-time data by using Stata programming commands before running a discrete-time model. Useful commands to fulfill this task include expand, stsplit, and a user-developed program called prsnperd. After creating the person-time data, logistic (for the binary logistic regression) or mlogit (for the multinomial logit model) may be employed to conduct the discrete-time analysis.

stcox is the procedure to conduct the Cox regression. As mentioned earlier, time-varying information and the structure of a multiple-record data file must be specified in stset before running stcox. Once this is done, the time-varying variables can be specified like other independent variables in the Cox regression without additional effort. stcoxkm plots Kaplan-Meier observed survivor curves and compares them with Cox predicted curves. Thus, this is a procedure users can employ to check the proportionality assumption. In the stcoxkm curves, the closer the observed values are to the predicted values, the less likely it is that the proportional-hazards assumption has been violated. stcurve plots the survivor, hazard, or cumulative hazard functions based on an estimated Cox regression (i.e., the user runs it after running stcox). Note that stcurve provides all three types of curves, not just survivor curves.

stpower cox is the procedure to conduct power analysis for the Cox regression and compute the needed sample size, power, and effect size for a Cox model. A set of stcox postestimation commands are of special interest after one runs stcox. The vce(robust) option can be used to request the robust variance estimator for a Cox regression, that is, to run the LWA model for multilevel analysis.

streg fits parametric survival models. distribution() is the key word to specify the parametric model of interest, and the choices are (exponential) for the exponential model, (gompertz) for the Gompertz model, (loglogistic) or (llogistic) for the log-logistic model, (weibull) for the Weibull model, (lognormal) or (lnormal) for the log-normal model, and (gamma) for the generalized gamma model. Like the procedure for SAS, the user specifies (exponential) based on person-time data to run the piecewise exponential model. stcurve following streg plots the model-based survivor, hazard, or cumulative hazard curves. A set of streg postestimation commands are of special interest after one runs streg.
