Bayesian Piecewise Constant Hazard Model


1.1 Bayesian survival modelling using the piecewise constant hazard model

Bayesian inference can be applied to the piecewise constant hazard model with time-varying covariate effects. Much work has been done on piecewise constant hazard models where the hazard parameters are independent between intervals; see, for example, Genest & Kalbfleisch (1998) and Ibrahim et al. (2001). Kim et al. (2007) proposed a class of semi-parametric models for survival data in which they constructed a dynamic model for piecewise constant hazard functions over a finite partition of the time axis. Gamerman (1991) introduced the dynamic generalised linear model into survival analysis, applying it to the piecewise constant hazard model.

Hence, the cut points for the piecewise constant hazard model need to be chosen.

1.1.1 Choosing the cut points of time intervals for the piecewise constant hazard model

The piecewise constant hazard model requires the discretisation of the time axis into intervals and thereby choosing the number and locations of the cut points. Some authors have suggested defining the intervals as beginning and ending at the observed failure times while Kalbfleisch & Prentice (1973) suggested selecting intervals independently of the data.

West (1992), in an attempt to define intervals, suggested shorter intervals over the first few years since deaths in cancer data are common in early stages and longer intervals in the later years since there are fewer deaths. One possibility is to choose our cut points deliberately so that each interval has the same number of deaths. A researcher would


time interval using prior judgements.

Suppose the time axis is discretised into ten time intervals so that it can be assumed that 10% of the events (e.g. deaths) occur in each interval. The event times are assumed to follow approximately an exponential distribution. The survival function of an exponential distribution with parameter λ is given by exp{−λt}.

Given the cut points $\tau_1 < \tau_2 < \dots < \tau_{J-1}$, the probability of surviving until $\tau_j$ will be

$$\exp\{-\lambda \tau_j\} = 1 - 0.1j,$$

so $-\lambda \tau_j = \log(1 - 0.1j)$ and

$$\tau_j = -\frac{1}{\lambda} \log(1 - 0.1j).$$

Let the mean of the exponential distribution, $1/\lambda$, be $\nu$ and let $\zeta = 1/J$. So,

$$\tau_j = -\nu \log(1 - j\zeta). \tag{1.1}$$
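The cut-point rule above can be sketched in a few lines of Python. This is only an illustration: the function name `cut_points` and its arguments `nu` (the prior mean lifetime ν) and `n_intervals` (the number of intervals J) are names chosen here, not part of the original text.

```python
import math


def cut_points(nu, n_intervals):
    """Return tau_1, ..., tau_{J-1} using tau_j = -nu * log(1 - j / J).

    nu          -- assumed prior mean of the exponential lifetime (nu = 1 / lambda)
    n_intervals -- number of time intervals J, so that zeta = 1 / J
    """
    if nu <= 0:
        raise ValueError("nu must be positive")
    if n_intervals < 2:
        raise ValueError("at least two intervals are needed")
    zeta = 1.0 / n_intervals
    # j runs over the J - 1 interior cut points; the last interval is open-ended.
    return [-nu * math.log(1.0 - j * zeta) for j in range(1, n_intervals)]


if __name__ == "__main__":
    for tau in cut_points(nu=1.0, n_intervals=10):
        print(round(tau, 3))
```

Because ν enters only as a multiplier, changing the prior mean lifetime rescales every cut point by the same factor.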

1.1.2 Construction of the likelihood for piecewise constant hazard model

Suppose that associated with every patient is a time which could either be a death or censoring time. Every interval is associated with three different groups of patients: patients who died during the interval, patients who were censored during the interval and patients who survived the interval. The contribution to the likelihood from the patients in every interval will depend on the three different groups of patients. The likelihood contribution L of the patients is given as

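$$L = \prod_{i=1}^{n} \prod_{j=1}^{J} L_{i,j}$$

where

$$L_{i,j} = \begin{cases} 1 & \text{if } t_i < \tau_{j-1}, \\ \lambda_{i,j}^{\delta_{i,j}} \exp\{-\lambda_{i,j}(t_i - \tau_{j-1})\} & \text{if } \tau_{j-1} \le t_i < \tau_j, \\ \exp\{-\lambda_{i,j}(\tau_j - \tau_{j-1})\} & \text{if } t_i \ge \tau_j, \end{cases}$$

and where

$n$ is the number of patients,

$J$ is the number of time intervals,

$t_i$ is the event or censoring time of the $i$-th patient,

$\lambda_{i,j}$ is the hazard of the $i$-th patient in the $j$-th interval,

$\delta_{i,j}$ is the indicator of death or censoring of the $i$-th patient in the $j$-th interval (1 for a death, 0 for censoring).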
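As a minimal sketch of the three-case contribution defined above, the following Python code evaluates one patient's contribution in one interval and multiplies the contributions together. The function names, the patient-level death indicator `death` (used as δ in the interval containing the patient's time), and the convention that `cuts` starts at 0 and ends at infinity are assumptions made for this illustration.

```python
import math


def contribution(t, death, lam, start, end):
    """Likelihood contribution of one patient in the interval [start, end).

    t     -- event or censoring time of the patient
    death -- 1 if t is a death time, 0 if it is a censoring time
    lam   -- hazard of this patient in this interval
    """
    if t < start:
        # The patient left the study before this interval began.
        return 1.0
    if t < end:
        # The patient died or was censored inside the interval.
        return (lam ** death) * math.exp(-lam * (t - start))
    # The patient survived the whole interval.
    return math.exp(-lam * (end - start))


def likelihood(times, deaths, hazards, cuts):
    """Product of contributions over patients i and intervals j.

    hazards[i][j] is the hazard of patient i in interval j;
    cuts = [0, tau_1, ..., tau_{J-1}, math.inf] (an assumed layout).
    """
    total = 1.0
    for i, (t, d) in enumerate(zip(times, deaths)):
        for j in range(len(cuts) - 1):
            total *= contribution(t, d, hazards[i][j], cuts[j], cuts[j + 1])
    return total
```

When the hazard is the same in every interval, the product collapses to the ordinary exponential likelihood, which gives a quick sanity check on the piecewise construction.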

Within each interval, the conditional survival distribution, given that the patient is alive and uncensored at time $\tau_{j-1}$, is exponential since the hazard is constant. With time measured from the start of the interval, this is an exponential distribution with probability density function $f_i(t_i \mid \lambda_{i,j})$ given by

$$f_i(t_i \mid \lambda_{i,j}) = \lambda_{i,j} \exp\{-\lambda_{i,j} t_i\}$$

and survival function $S_i(t_i \mid \lambda_{i,j})$ given by

$$S_i(t_i \mid \lambda_{i,j}) = \exp\{-\lambda_{i,j} t_i\}.$$

The likelihood contribution $L$ of the patients for the $j$-th interval, which runs from $\tau_{j-1}$ to $\tau_j$, is given by

$$L = \prod_{i \in D_j} \lambda_{i,j} \exp\{-\lambda_{i,j}(t_i - \tau_{j-1})\} \prod_{i \in C_j} \exp\{-\lambda_{i,j}(t_i - \tau_{j-1})\} \prod_{i \in F_j} \exp\{-\lambda_{i,j}(\tau_j - \tau_{j-1})\}$$

where $D_j$ is the set of patients who died in the $j$-th interval; each contributes a time equal to the difference between the patient's death time and the time of the start of the interval.

$C_j$ is the set of patients who were censored in the $j$-th interval; each contributes a time equal to the difference between the patient's censoring time and the time of the start of the interval.

$F_j$ is the set of patients who neither died nor were censored in the $j$-th interval but survived until the $(j+1)$-th interval; each contributes a time equal to the length of the interval.

The logarithm of the likelihood, $\ell$, for the $j$-th interval is given by

$$\ell = \sum_{i \in D_j} \log \lambda_{i,j} - \sum_{i \in D_j \cup C_j} \lambda_{i,j}(t_i - \tau_{j-1}) - \sum_{i \in F_j} \lambda_{i,j}(\tau_j - \tau_{j-1}).$$
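The interval log-likelihood can be sketched directly from the three groups of patients (died, censored, survived the interval). The Python below is illustrative only: the function name, the arguments `start` and `end` for the interval boundaries, and the use of a single death indicator per patient are assumptions introduced here.

```python
import math


def interval_log_likelihood(times, deaths, hazards, start, end):
    """Log-likelihood of one interval [start, end).

    times   -- event or censoring time of each patient
    deaths  -- 1 for a death, 0 for a censoring, per patient
    hazards -- hazard of each patient in this interval
    """
    log_lik = 0.0
    for t, d, lam in zip(times, deaths, hazards):
        if t < start:
            # Not at risk in this interval: contributes nothing.
            continue
        if t < end:
            # Died (D_j) or censored (C_j): exposure is t - start.
            if d:
                log_lik += math.log(lam)
            log_lik -= lam * (t - start)
        else:
            # Survived the interval (F_j): exposure is the interval length.
            log_lik -= lam * (end - start)
    return log_lik
```

Each patient contributes a log-hazard term only if they died in the interval, minus the hazard multiplied by the time they spent at risk inside it.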

1.1.3 Construction of prior distribution for the parameters of piecewise constant hazard model

In a Bayesian context, there is an advantage in constructing a prior under which the hazard parameters in neighbouring intervals are correlated. In practice, a piecewise constant hazard function with a prior distribution in which the parameters are correlated over time could be used. Genest & Kalbfleisch (1978) and Ibrahim et al. (2001) made the prior distribution of the parameters independent between time intervals. However, it would be reasonable to think that the hazards in intervals which are close together are likely to be similar. McKeague & Tighiouart (2000) gave a dependent prior using a Markov random field.

The prior used in the illustration of the piecewise constant hazard model is assumed to take the form of a realisation of a stochastic process, which could be either stationary or non-stationary. If the prior is made stationary, then each parameter gets the same variance in each time period. For example, an autoregressive process might be used, with an autoregressive parameter ρ which governs how strong the correlation between time periods will be. The autoregressive parameter ρ > 0 might be selected to give positive autocorrelation. Then, a first order autoregressive process is chosen so that, for any given value of the autoregressive parameter ρ, the process is given by

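$$\beta_s - \mu = \rho(\beta_{s-1} - \mu) + \varepsilon_s$$

where $\varepsilon_s$ is normally distributed with mean zero and variance $\sigma^2$ (Chatfield, 2004).

This can be rearranged as

$$\beta_s = \mu + \rho(\beta_{s-1} - \mu) + \varepsilon_s.$$

By repeated substitution, the following is obtained:

$$\beta_s = \mu + \varepsilon_s + \rho\,\varepsilon_{s-1} + \rho^2 \varepsilon_{s-2} + \dots = \mu + \sum_{p=0}^{\infty} \rho^p \varepsilon_{s-p}.$$

The expectation of $\beta_s$, $E(\beta_s)$, is $\mu$. For $|\rho| < 1$, the variance of the process is

$$\mathrm{Var}(\beta_s) = \sigma^2 \sum_{p=0}^{\infty} \rho^{2p} = \frac{\sigma^2}{1 - \rho^2}.$$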
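To make the stationary autoregressive prior concrete, the sketch below computes the common prior variance and the implied prior covariance matrix of the parameters across intervals. The covariance formula σ² ρ^|s−t| / (1 − ρ²) is the standard result for a stationary first order autoregressive process and is not stated in the text above; the function names are chosen here for illustration.

```python
def stationary_variance(sigma2, rho):
    """Variance sigma^2 / (1 - rho^2) of a stationary AR(1) process."""
    if not -1.0 < rho < 1.0:
        raise ValueError("the process is stationary only for |rho| < 1")
    return sigma2 / (1.0 - rho ** 2)


def ar1_prior_covariance(n_intervals, sigma2, rho):
    """Prior covariance matrix of the parameters over n_intervals intervals.

    Entry (s, t) is Var * rho^|s - t|, so neighbouring intervals are the
    most strongly correlated and the diagonal is constant.
    """
    var = stationary_variance(sigma2, rho)
    return [
        [var * rho ** abs(s - t) for t in range(n_intervals)]
        for s in range(n_intervals)
    ]


if __name__ == "__main__":
    for row in ar1_prior_covariance(n_intervals=4, sigma2=1.0, rho=0.5):
        print([round(x, 3) for x in row])
```

The constant diagonal reflects the point that stationarity gives every interval the same prior variance, while ρ controls how quickly the correlation decays with the distance between intervals.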

Making the process stationary makes the prior variance the same in all intervals. The value of ρ can also be chosen following the recommendations by Revie et al. (2003) on the construction of correlations. If ρ = 1, the process will be non-stationary. In some other cases, it is feasible to assume that more knowledge is possessed at the starting time and less prior experience as time goes on, and for this reason the variance is allowed to increase or decrease as time increases. The most flexible approach is to have a different hazard for each possible combination of the categories of covariates. For instance, the continuous covariates may be converted to categorical or ordered categories.

1.1.4 Construction of prior distribution for the frailty variance

Suppose that the logarithm of the frailty has a normal distribution with zero mean and variance σ_f². A prior distribution for the frailty variance σ_f² can then be constructed. It is feasible to suppose that the precision τ_f = σ_f⁻² has a gamma prior distribution.

If the parameters remained constant over all time intervals then an individual’s lifetime would have an exponential distribution since the hazard λ of the individual is constant.

The individual's mean lifetime will be $1/\lambda$. Doubling $\lambda$ would halve the mean lifetime and correspond to a log-frailty of $\log 2$. Suppose, for example, that $\sigma_f = \log 2$; this corresponds to $\tau_f = 1/(\log 2)^2$. Thus, the prior mean for $\tau_f$ is set to be $1/(\log 2)^2$. So, if $\tau_f \sim \mathrm{Ga}(a_f, b_f)$, then

$$\frac{a_f}{b_f} = \frac{1}{(\log 2)^2}. \tag{1.2}$$

The prior variance of $\tau_f$ is $a_f / b_f^2$ and the coefficient of variation is $1/\sqrt{a_f}$. The value of $a_f$ is chosen to reflect our prior uncertainty in $\tau_f$. Since little prior information is known on $\tau_f$, a small value for $a_f$ is chosen. However, so that the prior density of $\tau_f$ is zero at $\tau_f = 0$, a value $a_f > 1$, say 1.1, is chosen.

Combining this with Equation (1.2) gives $b_f = 0.53$.
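The arithmetic behind these hyperparameters can be checked with a short calculation. The sketch assumes the shape-rate parameterisation of the gamma distribution (mean a_f / b_f, variance a_f / b_f²), which matches the moments quoted in the text; the function name and the returned dictionary keys are illustrative choices.

```python
import math


def frailty_precision_prior(a_f, sigma_f=math.log(2)):
    """Gamma(a_f, b_f) prior for the frailty precision tau_f.

    The prior mean a_f / b_f is matched to 1 / sigma_f^2, where sigma_f is
    a prior guess at the standard deviation of the log-frailty.
    """
    if a_f <= 0 or sigma_f <= 0:
        raise ValueError("a_f and sigma_f must be positive")
    prior_mean = 1.0 / sigma_f ** 2
    b_f = a_f / prior_mean  # solve a_f / b_f = prior_mean for b_f
    return {
        "b_f": b_f,
        "mean": prior_mean,
        "variance": a_f / b_f ** 2,
        "cv": 1.0 / math.sqrt(a_f),
    }


if __name__ == "__main__":
    prior = frailty_precision_prior(a_f=1.1)
    print(round(prior["b_f"], 2))
```

With a_f = 1.1 the coefficient of variation is close to 1, so the prior remains diffuse while its mean stays fixed at 1/(log 2)².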

1.2 Application: Bayesian survival modelling using the piecewise constant hazard model

Bayesian survival modelling using the piecewise constant hazard model is applied to the two example data sets described in the previous chapter. Equation (1.1) is applied to the two example data sets to choose the cut points. Based on a prior assessment of the mean survival lifetime of a patient, ν is set as 3 years in the SNLG data set, and hence the cut points are

0.105, 0.223, 0.357, 0.511, 0.693, 0.916, 1.204, 1.609, 2.303.