
Chapter 3 Data and methods

3.5 Statistical Methods

3.5.4 Competing risk regression models

A competing risk in survival analysis is defined as an event that prevents the occurrence of the primary event of interest [268-271]. In the analyses of hospital admissions and SMNs (as measures of late effects) presented in this thesis, death was considered a competing risk, since a patient who died was no longer at risk of hospitalisation or SMN.

3.5.4.1 Cumulative incidence function (CIF)

If competing risks are not present, the complement of the Kaplan-Meier (KM) function can be used to estimate the incidence of an outcome over time. In the presence of competing risks, estimating the cumulative incidence using the KM method, by treating those who experience a competing event as censored, is not appropriate because it produces biased estimates [270]. The complement of the KM survival function overestimates the incidence in the presence of competing risks, and it is recommended that the cumulative incidence function, which describes the absolute risk of the event of interest over time, should be used instead [270-272]. The cumulative incidence function (CIF) was calculated using the stcompet command in Stata [258] to estimate the probability of late effects over time, with death included as a competing risk.
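The overestimation described above can be illustrated with a minimal Python sketch. It is not the stcompet implementation used in this thesis: the function name, the event coding (0 = censored, 1 = event of interest, 2 = death) and the six-subject data set are all hypothetical and chosen only for illustration. The CIF is accumulated as the all-cause survival just before each event time multiplied by the cause-specific hazard at that time, while the naive estimate is the complement of a KM curve in which competing events are treated as censored.

```python
def cumulative_incidence(times, events, cause=1):
    """Return (naive, cif) at the last observed event time.

    events: 0 = censored, `cause` = event of interest,
            any other value = competing event (hypothetical coding).
    naive: 1 - KM, treating competing events as censored.
    cif:   non-parametric cumulative incidence of `cause`.
    """
    surv_all = 1.0   # all-cause KM survival just before the current time
    km_cause = 1.0   # KM survival with competing events treated as censored
    cif = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e != 0})
    for t in event_times:
        n_at_risk = sum(1 for ti in times if ti >= t)
        d_cause = sum(1 for ti, ei in zip(times, events)
                      if ti == t and ei == cause)
        d_any = sum(1 for ti, ei in zip(times, events)
                    if ti == t and ei != 0)
        cif += surv_all * d_cause / n_at_risk
        surv_all *= 1.0 - d_any / n_at_risk
        km_cause *= 1.0 - d_cause / n_at_risk
    return 1.0 - km_cause, cif


# Hypothetical data: three events of interest, two deaths, one censored subject.
times = [1, 2, 3, 4, 5, 6]
events = [1, 2, 1, 2, 1, 0]
naive, cif = cumulative_incidence(times, events)
print(f"1 - KM = {naive:.4f}, CIF = {cif:.4f}")
```

In this toy data set three of the six subjects experience the event of interest, and the CIF recovers that proportion (0.5), whereas the complement of the KM estimate is larger because subjects who died are implicitly assumed to remain at risk.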

In standard survival analysis (with no competing risks), Cox proportional hazards models can be used to estimate the relative effect of covariates on the hazard function. There is a direct correspondence between the effect of a covariate on the hazard of the outcome and its effect on the incidence of the outcome: if a covariate increases the hazard of the outcome, it will also increase the incidence of the outcome [273]. In the presence of competing risks there is no longer a direct relationship between the hazard and the risk. The way in which covariates are associated with the cause-specific hazards may not be the same as the way in which they are associated with the cumulative incidence. Two different hazard-based regression models have been described and used to deal with competing risks: 1) estimating the effect of covariates on the cause-specific hazard function and 2) estimating the effect of covariates on the subdistribution hazard function (or the CIF) [269-273].

These two methods differ in their use and interpretation and the method chosen should depend on the specific research question. Details of the two approaches are given below.

3.5.4.3 Cause-specific regression models

Cause-specific models can be used to estimate the association between covariates and the rate of occurrence of the event of interest (the hazard). In these models subjects who experience a competing event are treated as censored subjects and removed from the risk set for calculation of the hazard [269, 271, 272]. This model can be implemented using, for example, a Cox model. The cause-specific hazard ratio provides a summary of the relationship between a covariate and the rate of occurrence in subjects who are currently event-free without considering the effect of the competing risk. These models are best suited to address aetiological research questions [269, 271, 272].
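For reference, the cause-specific hazard and its Cox-type regression model can be written formally as below. The notation is an assumption introduced here for illustration and is not taken from the cited sources: T denotes the time to the first event, D the type of that event (D = k for the event of interest), x the covariate vector and \lambda_{k,0}^{cs}(t) an unspecified baseline hazard.

```latex
\lambda_k^{cs}(t) = \lim_{\Delta t \to 0}
  \frac{P(t \le T < t + \Delta t,\; D = k \mid T \ge t)}{\Delta t},
\qquad
\lambda_k^{cs}(t \mid x) = \lambda_{k,0}^{cs}(t)\,\exp(\beta_k^{\top} x)
```

The conditioning on T \ge t reflects the risk set described above: only subjects who have experienced no event of any type by time t contribute.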

3.5.4.4 Fine-Gray subdistribution hazard model

Fine and Gray [274] defined a regression model to directly estimate the relationship between covariates and the cumulative incidence function, or the probability of the occurrence of the event of interest. These models are known as Fine-Gray models, subdistribution hazard models or CIF regression models.

The subdistribution hazard is the instantaneous rate of failure due to the event of interest at a given moment in time, given that this event has not already occurred. The risk set includes all subjects who have not yet experienced the outcome of interest, and so includes both those who are event-free and those who have experienced a competing event [269, 273]. Subjects who experienced the competing event are kept in the risk set so that they are counted in the proportion of the population that cannot have the event of interest. These models are recommended if the research question is focussed on estimating incidence and predicting prognosis [269, 271, 272]. Therefore, this model was chosen and used to investigate the relationship between patient risk factors and the incidence of respiratory late effects, where death was considered a competing risk.
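The difference between the two risk sets can be sketched in a few lines of Python. This is an illustration only: the function name, the event coding (0 = censored, 1 = event of interest, 2 = death) and the data are hypothetical, and the sketch ignores the censoring weights that the Fine-Gray estimator applies to subjects who remain in the risk set after a competing event.

```python
def risk_sets(times, events, t, cause=1):
    """Return (cause_specific, subdistribution) risk sets at time t.

    Each risk set is a list of subject indices.
    events: 0 = censored, `cause` = event of interest,
            any other value = competing event (hypothetical coding).
    """
    subjects = list(enumerate(zip(times, events)))
    # Cause-specific: only subjects still under observation and event-free.
    cause_specific = [i for i, (ti, ei) in subjects if ti >= t]
    # Subdistribution: additionally keep subjects who already had a
    # competing event before t (unweighted in this simplified sketch).
    subdistribution = [
        i for i, (ti, ei) in subjects
        if ti >= t or (ti < t and ei not in (0, cause))
    ]
    return cause_specific, subdistribution


# Hypothetical data: subject 1 dies at time 2, before the evaluation time t = 3.
times = [1, 2, 3, 4, 5, 6]
events = [1, 2, 1, 2, 1, 0]
cs, sd = risk_sets(times, events, t=3)
print("cause-specific:", cs)
print("subdistribution:", sd)
```

Subject 1, who died at time 2, has left the cause-specific risk set by time 3 but is still counted in the subdistribution risk set, which is what allows the model to describe the incidence in the whole population.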

The interpretation of the coefficients from the Fine-Gray model is not straightforward [272, 273]. Exponentiated regression coefficients denote the subdistribution hazard ratio (sHR) and can be used to describe the direction of the observed association, but cannot be used to directly quantify its magnitude, since the magnitude of the relative effect of the covariate on the subdistribution hazard function is different from the magnitude of the effect of the covariate on the CIF [273]. An sHR=1 implies no association between the covariate and the CIF. An sHR>1 implies that a 1-unit increase in the covariate is associated with an increased incidence of the event of interest, and an sHR<1 implies that it is associated with a decreased incidence [272].
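The reason the sHR gives the direction but not the size of the effect on the CIF can be seen from the form of the model. The notation below is an assumption introduced for illustration: F_1(t \mid x) is the CIF of the event of interest for covariate vector x, and F_{1,0}(t) is the baseline CIF.

```latex
\lambda_1^{sd}(t \mid x)
  = -\frac{d}{dt}\log\{1 - F_1(t \mid x)\}
  = \lambda_{1,0}^{sd}(t)\,\exp(\beta^{\top} x)
\quad\Longrightarrow\quad
1 - F_1(t \mid x) = \{1 - F_{1,0}(t)\}^{\exp(\beta^{\top} x)}
```

Because the sHR acts as an exponent on 1 - F_{1,0}(t), an sHR above 1 always raises the CIF and an sHR below 1 always lowers it, but the resulting ratio or difference in CIFs depends on the baseline CIF and therefore changes over time.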

The magnitude of a regression coefficient does not provide information on the magnitude of the effect of the covariate on the incidence; however, the magnitudes of coefficients from the same model may be compared [273]. For example, if one covariate has a larger regression coefficient than a second covariate, then the effect of the first covariate on the incidence of the outcome will be greater than the effect of the second covariate. A limitation of these models is that sHRs cannot be directly compared between models with different outcomes, or between different studies, since the CIF will not be the same for different types of events [269, 273].
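A short numerical sketch makes the point about magnitude concrete. It assumes the standard Fine-Gray relationship 1 - CIF(t | x) = (1 - CIF_0(t))^sHR for a 1-unit increase in a single covariate. The function name and the baseline CIF values are hypothetical and are not results from this thesis.

```python
def cif_under_fine_gray(baseline_cif, shr):
    """CIF implied by a subdistribution hazard ratio `shr`, given the
    baseline CIF at the same time point (proportional subhazards assumed)."""
    return 1.0 - (1.0 - baseline_cif) ** shr


# Hypothetical baseline incidences for a rare and a common outcome.
for baseline in (0.05, 0.30):
    exposed = cif_under_fine_gray(baseline, shr=2.0)
    print(f"baseline CIF {baseline:.2f} -> exposed CIF {exposed:.4f} "
          f"(ratio {exposed / baseline:.2f})")
```

With the same sHR of 2, the ratio of incidences is close to 2 when the baseline incidence is low but clearly smaller when the baseline incidence is high, which is why the same sHR can correspond to different effects on the CIF for outcomes with different incidences.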

The Fine-Gray model is a semi-parametric model. Similar to the Cox model, the baseline subhazard does not need to be specified and the model assumes that the subdistribution hazards are proportional [274]. This assumption can be checked graphically, by plotting non-parametric cumulative incidence functions by covariate level and looking for crossing incidence curves, or by including time-varying coefficients in the model, in which case the assumption is violated if there is a significant interaction between the covariate and time [95, 270].
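One way to see what the graphical check is looking for is to write the proportionality assumption on the complementary log-log scale. The notation is an assumption introduced for illustration (F_1 is the CIF of the event of interest and F_{1,0} the baseline CIF), and the expression follows directly from the proportional subdistribution hazards form of the model rather than from the cited sources.

```latex
\log\bigl[-\log\{1 - F_1(t \mid x)\}\bigr]
  = \log\bigl[-\log\{1 - F_{1,0}(t)\}\bigr] + \beta^{\top} x
```

Under the assumption, the transformed CIF curves for different covariate levels are therefore separated by a constant vertical distance at all times, so curves that cross or clearly converge or diverge on this scale suggest non-proportionality.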