3.2 Different approaches to estimating the correlation between progression-
3.2.5 Generalized model-based methods
In this section, we generalize the illness-death model-based approach of Li
and Zhang$^{48}$ to achieve more flexibility and to allow estimation of Kendall’s τ
rather than the Pearson correlation coefficient.
The modified approach continues to use a multi-state illness-death model,
but allows any parametric formulation for the transition intensities between states.
In particular, we can allow post-progression survival to depend on both time-
to-progression, denoted by t0, and time since progression, denoted by s. We
assume π01(t), π02(t) and π12(s; t0) are parametrized by a vector of parameters θ which can be consistently estimated from data with a finite follow-up period through maximum likelihood estimation.
From the definition of Kendall’s τ in (3.1), and under the assumption that the bivariate lifetime random variables $(S_n, T_n)_{n \in \mathbb{N}}$ representing the PFS and OS times of the patients are independent and identically distributed, the Kendall’s τ implied by the general illness-death model is

\begin{align}
\tau_{\mathrm{mod}} ={}& 4\int_0^\infty \pi_{02}(s)\exp\{-2\Pi_0(s)\}\,ds \nonumber\\
&+ 4\int_0^\infty\!\int_0^{s_1}\!\int_0^\infty \pi_{01}(s_1)\,\pi_{01}(s_2)\exp\{-\Pi_0(s_1)-\Pi_0(s_2)\}\,\pi_{12}(s_3; s_1) \nonumber\\
&\qquad \times \exp\{-\Pi_{12}(s_3; s_1)\}\,\bigl[1-\exp\{-\Pi_{12}(s_1+s_3-s_2; s_2)\}\bigr]\,ds_3\,ds_2\,ds_1 \nonumber\\
&+ 4\int_0^\infty\!\int_0^{s_1} \pi_{02}(s_1)\,\pi_{01}(s_2)\exp\{-\Pi_0(s_1)-\Pi_0(s_2)\} \nonumber\\
&\qquad \times \bigl(1-\exp\{-\Pi_{12}(s_1-s_2; s_2)\}\bigr)\,ds_2\,ds_1 - 1, \tag{3.10}
\end{align}

where $\Pi_0(t) = \int_0^t \{\pi_{01}(u) + \pi_{02}(u)\}\,du$ and $\Pi_{12}(s; v) = \int_0^s \pi_{12}(u; v)\,du$. The first term in (3.10) refers to the case where one patient dies before progression, before the other has died or progressed. The second term refers to the case where patients 1 and 2 progress at times $s_1$ and $s_2$, respectively, where $s_1 > s_2$, and subsequently patient 1 survives an additional $s_3$ whereas patient 2 dies within $s_1 + s_3 - s_2$ of progression. The third term refers to the case where one patient progresses (at time $s_2$) and dies before the other patient, who dies without progression (at time $s_1 > s_2$). A full derivation of (3.10) is given in Appendix A.1.
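In the model of Fleischer et al.,$^{28}$ where exponentially distributed transition times are considered, the expression in (3.10) can be simplified. By substituting $\pi_{01}(t) = 1/\lambda_{01}$, $\pi_{02}(t) = 1/\lambda_{02}$ and $\pi_{12}(t; t_0) = 1/\lambda_{12}$ into (3.10) and directly integrating, after some algebraic manipulation we obtain
\[
\tau_{\mathrm{mod}} = \frac{(\lambda_{02}\lambda_{12})^2 + 2\left(\lambda_{01}\lambda_{02}\lambda_{12}^2 + \lambda_{01}\lambda_{02}^2\lambda_{12} + \lambda_{01}^2\lambda_{12}^2 + \lambda_{01}^2\lambda_{02}\lambda_{12}\right)}{(\lambda_{02}\lambda_{12} + \lambda_{01}\lambda_{12})(\lambda_{02}\lambda_{12} + \lambda_{01}\lambda_{12} + \lambda_{01}\lambda_{02})} - 1.
\]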
Rather than the exponential rate parameter used in Fleischer et al., we use the scale
parameter, the inverse of the rate parameter, denoted by λ01, λ02 and λ12, in order to
be consistent with the definition of the scale parameter for the Weibull hazards
in the thesis.
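As an informal check on the closed-form expression for exponential transitions, the following Python sketch evaluates it and compares it with a Monte-Carlo estimate of 4P(S1 > S2, T1 > T2) − 1. The function names `tau_exponential` and `tau_exponential_mc` are illustrative and not part of the thesis. The four summands inside the bracket are written so that each has total degree four in the scale parameters. The simulation draws independent latent exponential times for the 0→1 and 0→2 transitions, which is one simple way to generate the competing-risks step and is an assumption of this sketch rather than the algorithm of any cited reference.

```python
import numpy as np


def tau_exponential(lam01: float, lam02: float, lam12: float) -> float:
    """Closed-form model-based Kendall's tau for exponential transitions.

    Arguments are scale parameters (inverse rates) and must be positive.
    """
    if min(lam01, lam02, lam12) <= 0:
        raise ValueError("scale parameters must be positive")
    numerator = (lam02 * lam12) ** 2 + 2.0 * (
        lam01 * lam02 * lam12**2
        + lam01 * lam02**2 * lam12
        + lam01**2 * lam12**2
        + lam01**2 * lam02 * lam12
    )
    denominator = (lam02 * lam12 + lam01 * lam12) * (
        lam02 * lam12 + lam01 * lam12 + lam01 * lam02
    )
    return numerator / denominator - 1.0


def tau_exponential_mc(lam01, lam02, lam12, m=200_000, seed=0):
    """Monte-Carlo estimate of tau from 2*m simulated (PFS, OS) pairs."""
    rng = np.random.default_rng(seed)
    n = 2 * m
    t01 = rng.exponential(scale=lam01, size=n)   # latent time to progression
    t02 = rng.exponential(scale=lam02, size=n)   # latent time to death without progression
    pfs = np.minimum(t01, t02)
    progressed = t01 < t02
    post = rng.exponential(scale=lam12, size=n)  # post-progression survival
    os_ = pfs + np.where(progressed, post, 0.0)
    # Compare patient i with patient i + m, as in a paired concordance estimate.
    p_hat = np.mean((pfs[:m] > pfs[m:]) & (os_[:m] > os_[m:]))
    return 4.0 * p_hat - 1.0
```

With all three scale parameters equal to 1 the closed form gives 9/6 − 1 = 0.5, and the simulated value should agree up to Monte-Carlo error.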
However, the integrals in (3.10) are analytically intractable for the model of Li and Zhang.$^{48}$ This is not a
hindrance, since τmod can be obtained quite easily and with arbitrary accuracy via
numerical or Monte-Carlo methods. Moreover, making the underlying model
more complex, for instance by allowing separate Weibull shape parameters for
each transition intensity, has little or no bearing on the computational difficulty
of calculating τmod.
Monte-Carlo methods provide a particularly convenient way of evaluating
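the model-based Kendall’s τ. We can use the fact that, for a model where S and T are continuous,
\begin{align}
\tau &= 2P(S_1 > S_2, T_1 > T_2) - 2P(S_1 < S_2, T_1 > T_2) \nonumber\\
&= 2P(S_1 > S_2, T_1 > T_2) - \{1 - 2P(S_1 > S_2, T_1 > T_2)\} \nonumber\\
&= 4P(S_1 > S_2, T_1 > T_2) - 1, \tag{3.11}
\end{align}
where the second equality uses $P(S_1 < S_2, T_1 > T_2) + P(S_1 > S_2, T_1 > T_2) = P(T_1 > T_2) = 1/2$.
It is therefore only necessary to evaluate $P(S_1 > S_2, T_1 > T_2)$, which can be achieved by simulating $2M$ independent pairs $(S_i, T_i)$, $i = 1, \dots, 2M$, and then taking
\[
\hat{P}(S_1 > S_2, T_1 > T_2) = M^{-1}\sum_{i=1}^{M} I(S_i > S_{i+M},\, T_i > T_{i+M}). \tag{3.12}
\]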
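The estimator in (3.12), combined with (3.11), can be sketched in a few lines of Python. The function name `kendall_tau_mc` and the use of NumPy are assumptions made for illustration; the inputs are arrays of 2M simulated PFS and OS times, and observation i is compared with observation i + M.

```python
import numpy as np


def kendall_tau_mc(s, t):
    """Estimate tau = 4 P(S1 > S2, T1 > T2) - 1 from 2M simulated pairs.

    s, t: one-dimensional arrays of equal, even length 2M holding the
    simulated PFS and OS times; pair i is compared with pair i + M.
    """
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    if s.ndim != 1 or s.shape != t.shape or s.size % 2 != 0:
        raise ValueError("s and t must be 1-D arrays of the same even length")
    m = s.size // 2
    # Indicator I(S_i > S_{i+M}, T_i > T_{i+M}) averaged over i = 1, ..., M.
    p_hat = np.mean((s[:m] > s[m:]) & (t[:m] > t[m:]))
    return 4.0 * p_hat - 1.0
```

Because each of the M comparisons uses two fresh observations, the M indicators are independent, which is what makes a simple binomial bound on the Monte-Carlo error available.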
Simulation for general illness-death models can be achieved using the methods in Beyersmann et al.$^{7}$ The Monte-Carlo standard error associated with the approximation (3.12) is at most $1/(2\sqrt{M})$, and hence at most $2/\sqrt{M}$ for the resulting estimate of τ. Typically, $M = 1 \times 10^6$ or $1 \times 10^7$ samples can be generated using very little computation time, meaning the Monte-Carlo standard error is negligible. A point estimate for τmod can be obtained by simulating 2M independent pairs of PFS and OS times from the illness-death model with parameter estimates $\hat{\theta} := (\hat{\lambda}_{01}, \hat{\lambda}_{02}, \hat{\lambda}_{12}, \hat{\alpha}_{01}, \hat{\alpha}_{02}, \hat{\alpha}_{12})$. The parameters of the parametric illness-death model are estimated as in Li and Zhang$^{48}$ via maximum likelihood (see Appendix A.4 for more details). The only difference is that we used distinct shape parameters α01, α02 and α12 corresponding to each transition intensity instead of a common shape parameter α for all transitions.
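A minimal Python sketch of the point estimate is given below. It rests on several assumptions that are not taken from the thesis: each transition time has Weibull survival function exp{−(t/λ)^α}, the post-progression intensity depends only on time since progression, and the competing 0→1 and 0→2 transitions are generated from independent latent Weibull times (a simple device that reproduces the cause-specific hazards, not necessarily the algorithm of Beyersmann et al.). The names `simulate_illness_death` and `tau_mod_mc` are illustrative.

```python
import numpy as np


def simulate_illness_death(n, lam, alpha, rng):
    """Simulate n (PFS, OS) pairs from a Weibull illness-death model.

    lam, alpha: dicts with keys "01", "02", "12" holding the scale and
    shape of each transition (assumed parametrization, see lead-in).
    """
    t01 = lam["01"] * rng.weibull(alpha["01"], size=n)  # latent progression time
    t02 = lam["02"] * rng.weibull(alpha["02"], size=n)  # latent death-without-progression time
    pfs = np.minimum(t01, t02)
    progressed = t01 < t02
    post = lam["12"] * rng.weibull(alpha["12"], size=n)  # time from progression to death
    os_ = pfs + np.where(progressed, post, 0.0)
    return pfs, os_


def tau_mod_mc(m, lam, alpha, seed=0):
    """Return the Monte-Carlo estimate of tau_mod and a bound on its standard error."""
    rng = np.random.default_rng(seed)
    pfs, os_ = simulate_illness_death(2 * m, lam, alpha, rng)
    p_hat = np.mean((pfs[:m] > pfs[m:]) & (os_[:m] > os_[m:]))
    tau_hat = 4.0 * p_hat - 1.0
    # Var(p_hat) = p(1 - p)/m <= 1/(4m), so SE(tau_hat) = 4 SE(p_hat) <= 2/sqrt(m).
    se_bound = 2.0 / np.sqrt(m)
    return tau_hat, se_bound
```

For example, `tau_mod_mc(1_000_000, {"01": 1.0, "02": 2.0, "12": 1.5}, {"01": 1.2, "02": 0.9, "12": 1.0})` uses hypothetical parameter values and returns an estimate whose standard error is bounded by 0.002.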
A simulation-based approach may also be used to obtain confidence intervals for τmod, by generating B samples
\[
\theta^*_1, \dots, \theta^*_B \sim N\bigl(\hat{\theta},\, I(\hat{\theta})^{-1}\bigr),
\]
where $I(\hat{\theta})$ is the observed Fisher information of the log-likelihood. For each of the B samples, 2M pairs (S, T) are simulated from the illness-death process with parameters $\theta^*_b$. The next step is to compute the corresponding estimate of τmod, denoted by $\tau^{*b}_{\mathrm{mod}}$, for every $b \in \{1, \dots, B\}$ using (3.11) and (3.12). Confidence intervals can then be constructed based either upon the sample standard deviation or sample quantiles of $\tau^{*1}_{\mathrm{mod}}, \dots, \tau^{*B}_{\mathrm{mod}}$. A non-parametric bootstrap variant of this algorithm is also possible, where B bootstrap samples are generated by repeatedly resampling from the original data and the maximum likelihood estimates are recomputed to generate each $\theta^*_b$.
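The parametric version of this procedure can be sketched as follows, using sample quantiles for the interval. This is an illustration under stated assumptions, not the implementation used in the thesis: the Weibull survival function is taken to be exp{−(t/λ)^α}, the post-progression intensity depends only on time since progression, the parameter vector is ordered as (λ01, λ02, λ12, α01, α02, α12), and the covariance matrix passed in stands for the inverse observed information. The names `simulate_tau` and `tau_confidence_interval` are hypothetical. Draws with non-positive components are rejected with an error here; a reparametrization (for instance on the log scale) would be one way to avoid them.

```python
import numpy as np


def simulate_tau(theta, m, rng):
    """Monte-Carlo estimate of tau_mod for one parameter vector theta."""
    lam01, lam02, lam12, a01, a02, a12 = theta
    n = 2 * m
    t01 = lam01 * rng.weibull(a01, size=n)   # latent progression time
    t02 = lam02 * rng.weibull(a02, size=n)   # latent death-without-progression time
    pfs = np.minimum(t01, t02)
    post = lam12 * rng.weibull(a12, size=n)  # post-progression survival
    os_ = pfs + np.where(t01 < t02, post, 0.0)
    p_hat = np.mean((pfs[:m] > pfs[m:]) & (os_[:m] > os_[m:]))
    return 4.0 * p_hat - 1.0


def tau_confidence_interval(theta_hat, cov, b=200, m=20_000, level=0.95, seed=0):
    """Quantile-based interval for tau_mod from B normal draws of theta.

    cov is assumed to be the inverse observed Fisher information at theta_hat.
    """
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(np.asarray(theta_hat, dtype=float), cov, size=b)
    if np.any(draws <= 0):
        raise ValueError("a sampled parameter was non-positive; consider reparametrizing")
    taus = np.array([simulate_tau(theta, m, rng) for theta in draws])
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(taus, [tail, 1.0 - tail])
    return lower, upper, taus
```

The returned array of simulated values can equally be summarized by its sample standard deviation to form a Wald-type interval around the point estimate.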