Simultaneous variable selection for joint models of longitudinal and survival outcomes


Web-based Supplemental Materials for "Simultaneous Variable Selection for Joint Models of Longitudinal and Survival Outcomes"

by Zangdong He, Wanzhu Tu, Sijian Wang, Haoda Fu and Zhangsheng Yu

1 Expectation conditional maximization procedures to optimize the penalized likelihood

Let $\Theta = (\theta, \zeta_{1m}, \eta_{2l})$, where $\theta = (\beta_1, \beta_2, \Gamma_1, \Gamma_2, \phi)$ is defined in Section 2.2.

The expectation conditional maximization procedure to optimize the penalized likelihood is as follows:

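1. Initialize $(\beta_1^{(0)}, \beta_2^{(0)}, \gamma_{1m}^{(0)}, \zeta_{1m}^{(0)}, \gamma_{2l}^{(0)}, \eta_{2l}^{(0)}, \phi^{(0)})$ with some plausible values.

2. For iteration $s$, update $\beta_1, \beta_2$ by adaptive LASSO:

$$
\begin{aligned}
(\hat\beta_1^{(s)}, \hat\beta_2^{(s)}) = \arg\max_{\beta_1, \beta_2}\;
& \tilde{Q}\big(\beta_1, \beta_2, \hat\Gamma_1^{(s-1)}, \hat\Gamma_2^{(s-1)}, \hat\phi^{(s-1)} \mid \hat\beta_1^{(s-1)}, \hat\beta_2^{(s-1)}, \hat\Gamma_1^{(s-1)}, \hat\Gamma_2^{(s-1)}, \hat\phi^{(s-1)}\big) \\
& - \lambda_1 \sum_{j=1}^{p} \omega_{\beta_{1j}} |\beta_{1j}| - \lambda_2 \sum_{k=1}^{p} \omega_{\beta_{2k}} |\beta_{2k}|.
\end{aligned}
$$

3. Update $\gamma_{1m}, \gamma_{2l}$:

$$
\begin{aligned}
(\hat\gamma_{1m}^{(s)}, \hat\gamma_{2l}^{(s)}) = \arg\max_{\gamma_{1m}, \gamma_{2l}}\;
& \tilde{Q}\big(\hat\beta_1^{(s)}, \hat\beta_2^{(s)}, \Gamma_1, \Gamma_2, \hat\phi^{(s-1)} \mid \hat\beta_1^{(s)}, \hat\beta_2^{(s)}, \hat\Gamma_1^{(s-1)}, \hat\Gamma_2^{(s-1)}, \hat\phi^{(s-1)}\big) \\
& - \frac{1}{4} \sum_{m=2}^{q} \frac{(\lambda_3 \omega_{\gamma_{1m}})^2}{(\zeta_{1m}^{(s-1)})^2} \|\gamma_{1m}\|^2
  - \frac{1}{4} \sum_{l=2}^{q} \frac{(\lambda_4 \omega_{\gamma_{2l}})^2}{(\eta_{2l}^{(s-1)})^2} \|\gamma_{2l}\|^2.
\end{aligned}
$$

4. Update $\zeta_{1m}, \eta_{2l}$:

$$
\zeta_{1m}^{(s)} = \sqrt{\frac{\lambda_3 \omega_{\gamma_{1m}}}{2} \|\hat\gamma_{1m}^{(s)}\|}, \qquad
\eta_{2l}^{(s)} = \sqrt{\frac{\lambda_4 \omega_{\gamma_{2l}}}{2} \|\hat\gamma_{2l}^{(s)}\|}.
$$

5. Update $\phi$:

$$
\hat\phi^{(s)} = \arg\max_{\phi}\; \tilde{Q}\big(\hat\beta_1^{(s)}, \hat\beta_2^{(s)}, \hat\Gamma_1^{(s)}, \hat\Gamma_2^{(s)}, \phi \mid \hat\beta_1^{(s)}, \hat\beta_2^{(s)}, \hat\Gamma_1^{(s)}, \hat\Gamma_2^{(s)}, \hat\phi^{(s-1)}\big).
$$

6. Terminate the iteration when $\max|\Theta^{(s)} - \Theta^{(s-1)}|$ is sufficiently small. Otherwise, set $s = s + 1$ and return to step 2.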
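The link between the quadratic penalty in step 3 and the closed-form update in step 4 can be checked numerically. The sketch below assumes that the underlying group penalty on a coefficient group γ has the form λω‖γ‖, and uses the identity λω‖γ‖ = min over ζ of ζ² + (λω)²‖γ‖²/(4ζ²), whose minimizer is ζ = sqrt(λω‖γ‖/2). The function names and the numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def group_penalty(gamma, lam, omega):
    """Weighted group penalty lam * omega * ||gamma||_2 (assumed form)."""
    return lam * omega * np.linalg.norm(gamma)

def surrogate(gamma, zeta, lam, omega):
    """Quadratic surrogate zeta^2 + (lam*omega)^2 * ||gamma||^2 / (4*zeta^2)."""
    return zeta**2 + (lam * omega) ** 2 * np.dot(gamma, gamma) / (4.0 * zeta**2)

def zeta_update(gamma, lam, omega):
    """Closed-form minimizer over zeta, same form as the step 4 update."""
    return np.sqrt(lam * omega * np.linalg.norm(gamma) / 2.0)

gamma = np.array([0.6, -0.3, 0.2])  # hypothetical coefficient group
lam, omega = 1.5, 2.0               # hypothetical tuning parameter and weight

zeta_star = zeta_update(gamma, lam, omega)
value_at_star = surrogate(gamma, zeta_star, lam, omega)

# A grid search over zeta should never fall below the group penalty.
zeta_grid = np.linspace(0.05, 5.0, 2000)
grid_min = min(surrogate(gamma, z, lam, omega) for z in zeta_grid)
```

At the minimizing ζ the surrogate equals the group penalty, so alternating between a ridge-type update of γ with ζ fixed and the closed-form update of ζ targets the same penalized objective.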

Before the parameters are updated in each step, the corresponding $\tilde{Q}$ function is approximated by Gaussian quadrature in the E-step. To improve computational stability, smaller subsets of $(\beta_1, \beta_2, \Gamma_1, \Gamma_2, \phi)$ can be updated iteratively: update $\beta_1$ with $(\beta_2, \Gamma_1, \Gamma_2, \phi)$ fixed, then update $\beta_2$ with $(\beta_1, \Gamma_1, \Gamma_2, \phi)$ fixed, and then update $\Gamma_1$, $\Gamma_2$, and $\phi$ in turn with the other parameters fixed. This comes at the price of more iterations.
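As a minimal illustration of the quadrature idea, the sketch below approximates an expectation over a single standard normal random effect. It assumes a Gauss-Hermite rule, which is one common choice of Gaussian quadrature for normal random effects; the text does not state which rule is used, and the function name and the number of nodes are hypothetical.

```python
import numpy as np

def gauss_hermite_expectation(g, n_nodes=15):
    """Approximate E[g(b)] for b ~ N(0, 1) with an n_nodes-point rule."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # hermgauss integrates against exp(-x^2); substitute b = sqrt(2) * x.
    return np.sum(weights * g(np.sqrt(2.0) * nodes)) / np.sqrt(np.pi)

# Two quantities with known closed forms: E[b^2] = 1 and E[exp(b)] = exp(1/2).
second_moment = gauss_hermite_expectation(lambda b: b**2)
mgf_at_one = gauss_hermite_expectation(np.exp)
```

In a joint model the integrand would be the complete-data log-likelihood weighted by the conditional density of the random effects, and the rule would be applied in each random-effect dimension.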

2 Data generation for simulation study: Scenario 5

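In Scenario 5, we generate the longitudinal outcome $Y_{ij}$ from the following model:

$$
\begin{aligned}
Y_{ij} = {} & 1 + 1.5X_{1ij,1} + 2X_{1ij,2} + 0X_{1ij,3} + 0X_{1ij,4} + b_{li,0} \\
& + b_{li,1}Z_{1ij,1} + b_{li,2}Z_{1ij,2} + b_{li,3}Z_{1ij,3} + b_{li,4}Z_{1ij,4} + \epsilon_{ij},
\end{aligned}
$$

and the failure time from a Weibull distribution with the hazard function:

$$
\begin{aligned}
\lambda_i(t) = \lambda_0(t) \exp(& 1.5x_{2i,1} + 2x_{2i,2} + 0x_{2i,3} + 0x_{2i,4} \\
& + b_{si,0} + b_{si,1}z_{2i,1} + b_{si,2}z_{2i,2} + b_{si,3}z_{2i,3} + b_{si,4}z_{2i,4}),
\end{aligned}
$$

for $i = 1, \ldots, 800$, $j = 1, \ldots, 5$, where $\lambda_0(t) = \alpha\lambda t^{\alpha-1}$ with $\alpha = 2$ and $\lambda = \exp(1) = 2.718$.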

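The random effect $b_i$ is independently generated from $N(0, I_5)$. The vector $b_{li} = (b_{li,0}, b_{li,1}, b_{li,2}, b_{li,3}, b_{li,4})$ is obtained by $b_{li} = \Gamma_1 b_i$, and $b_{si} = (b_{si,0}, b_{si,1}, b_{si,2}, b_{si,3}, b_{si,4})$ is obtained by $b_{si} = \Gamma_2 b_i$, where

$$
\Gamma_1 = \Gamma_2 = \sigma_D
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
\frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 \\
\frac{1}{3} & \frac{1}{3} & \frac{1}{3} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}^{1/2}
$$

and $\sigma_D = \sqrt{0.5}$. The covariates $X_{1ij,1} = Z_{1ij,1}$, $X_{1ij,2} = Z_{1ij,2}$, $X_{1ij,3} = Z_{1ij,3}$, $X_{1ij,4} = Z_{1ij,4}$ and $x_{2i,1} = z_{2i,1}$, $x_{2i,2} = z_{2i,2}$, $x_{2i,3} = z_{2i,3}$, $x_{2i,4} = z_{2i,4}$ are generated as independent $N(0,1)$ variables. The measurement errors $\epsilon_{ij}$ are i.i.d. $N(0,1)$. The censoring time is independently generated from an exponential distribution to achieve a 60% censoring percentage.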
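The Scenario 5 design can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' simulation program: the exponent 1/2 on the matrix is read as an elementwise square root (so that the non-zero diagonal entries of ΓΓᵀ equal 0.5, in line with the true value 0.707 reported in Web Table 3), failure times are drawn by inverting the Weibull survival function, and `censor_rate` is a hypothetical knob that would still have to be tuned to reach 60% censoring. All function and variable names are hypothetical.

```python
import numpy as np

def scenario5_gamma():
    """Gamma1 = Gamma2 = sigma_D * M^(1/2), with the power taken elementwise."""
    m = np.zeros((5, 5))
    m[0, :1] = 1.0
    m[1, :2] = 1.0 / 2.0
    m[2, :3] = 1.0 / 3.0
    return np.sqrt(0.5) * np.sqrt(m)

def simulate_scenario5(n=800, n_obs=5, censor_rate=1.0, seed=0):
    """Draw one data set; censor_rate is an untuned, hypothetical value."""
    rng = np.random.default_rng(seed)
    gamma = scenario5_gamma()
    b = rng.standard_normal((n, 5))   # b_i ~ N(0, I_5)
    b_l = b @ gamma.T                 # b_li = Gamma1 b_i
    b_s = b @ gamma.T                 # b_si = Gamma2 b_i

    # Longitudinal part: X1 = Z1, intercept plus four covariates per visit.
    z1 = rng.standard_normal((n, n_obs, 4))
    design1 = np.concatenate([np.ones((n, n_obs, 1)), z1], axis=2)
    beta1 = np.array([1.0, 1.5, 2.0, 0.0, 0.0])
    eps = rng.standard_normal((n, n_obs))
    y = design1 @ beta1 + np.einsum("ijk,ik->ij", design1, b_l) + eps

    # Survival part: x2 = z2, Weibull baseline with alpha = 2, lambda = exp(1).
    z2 = rng.standard_normal((n, 4))
    beta2 = np.array([1.5, 2.0, 0.0, 0.0])
    lin_pred = z2 @ beta2 + b_s[:, 0] + np.sum(b_s[:, 1:] * z2, axis=1)
    alpha, lam = 2.0, np.exp(1.0)
    # Cumulative hazard lam * t^alpha * exp(lin_pred) set equal to an Exp(1) draw.
    unit_exp = rng.exponential(size=n)
    failure = (unit_exp / (lam * np.exp(lin_pred))) ** (1.0 / alpha)

    censor = rng.exponential(scale=1.0 / censor_rate, size=n)
    time = np.minimum(failure, censor)
    event = (failure <= censor).astype(int)
    return {"y": y, "z1": z1, "z2": z2, "time": time, "event": event}

gamma5 = scenario5_gamma()
sd_random_effects = np.sqrt(np.diag(gamma5 @ gamma5.T))
data = simulate_scenario5(n=200, seed=1)
```

Under the elementwise reading, `sd_random_effects` is about 0.707 for the intercept and the first two random slopes and zero for the last two.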

3 Data generation for simulation study: Scenario 6

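In Scenario 6, we generate the longitudinal outcome $Y_{ij}$ from the following model:

$$
\begin{aligned}
Y_{ij} = {} & 1 + 1.5X_{1ij,1} + 2X_{1ij,2} + 2.5X_{1ij,3} + 0X_{1ij,4} + 0X_{1ij,5} + 0X_{1ij,6} + 0X_{1ij,7} \\
& + b_{li,0} + b_{li,1}Z_{1ij,1} + b_{li,2}Z_{1ij,2} + b_{li,3}Z_{1ij,3} + b_{li,4}Z_{1ij,4} + b_{li,5}Z_{1ij,5} + b_{li,6}Z_{1ij,6} \\
& + b_{li,7}Z_{1ij,7} + \epsilon_{ij},
\end{aligned}
$$

and the failure time from a Weibull distribution with the hazard function:

$$
\begin{aligned}
\lambda_i(t) = \lambda_0(t) \exp(& 1.5x_{2i,1} + 2x_{2i,2} + 2.5x_{2i,3} + 0x_{2i,4} + 0x_{2i,5} + 0x_{2i,6} + 0x_{2i,7} \\
& + b_{si,0} + b_{si,1}z_{2i,1} + b_{si,2}z_{2i,2} + b_{si,3}z_{2i,3} + b_{si,4}z_{2i,4} + b_{si,5}z_{2i,5} \\
& + b_{si,6}z_{2i,6} + b_{si,7}z_{2i,7}),
\end{aligned}
$$

for $i = 1, \ldots, 250$, $j = 1, \ldots, 5$, where $\lambda_0(t) = \alpha\lambda t^{\alpha-1}$ with $\alpha = 2$ and $\lambda = \exp(1) = 2.718$.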

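The random effect $b_i$ is independently generated from $N(0, I_8)$. The vector $b_{li} = (b_{li,0}, b_{li,1}, b_{li,2}, b_{li,3}, b_{li,4}, b_{li,5}, b_{li,6}, b_{li,7})$ is obtained by $b_{li} = \Gamma_1 b_i$, and $b_{si} = (b_{si,0}, b_{si,1}, b_{si,2}, b_{si,3}, b_{si,4}, b_{si,5}, b_{si,6}, b_{si,7})$ is obtained by $b_{si} = \Gamma_2 b_i$, where

$$
\Gamma_1 = \Gamma_2 = \sigma_D
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac{1}{3} & \frac{1}{3} & \frac{1}{3} & 0 & 0 & 0 & 0 & 0 \\
\frac{1}{4} & \frac{1}{4} & \frac{1}{4} & \frac{1}{4} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}^{1/2}
$$

and $\sigma_D = \sqrt{0.5}$. The covariates $X_{1ij,1} = Z_{1ij,1}$, $X_{1ij,2} = Z_{1ij,2}$, $X_{1ij,3} = Z_{1ij,3}$, $X_{1ij,4} = Z_{1ij,4}$, $X_{1ij,5} = Z_{1ij,5}$, $X_{1ij,6} = Z_{1ij,6}$, $X_{1ij,7} = Z_{1ij,7}$ and $x_{2i,1} = z_{2i,1}$, $x_{2i,2} = z_{2i,2}$, $x_{2i,3} = z_{2i,3}$, $x_{2i,4} = z_{2i,4}$, $x_{2i,5} = z_{2i,5}$, $x_{2i,6} = z_{2i,6}$, $x_{2i,7} = z_{2i,7}$ are generated as independent $N(0,1)$ variables. The measurement errors $\epsilon_{ij}$ are i.i.d. $N(0,1)$. The censoring time is independently generated from an exponential distribution to achieve a 30% censoring percentage.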
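The text does not say how the exponential censoring rate is chosen to reach the target percentage. One possible approach, sketched below as an assumption, is to calibrate the rate by bisection on simulated failure times, reusing the same unit-exponential draws at every step so that the censored fraction is monotone in the rate. The function name, the search bounds, and the stand-in failure times (plain Weibull draws rather than the Scenario 6 model) are hypothetical.

```python
import numpy as np

def calibrate_censoring_rate(failure, target, seed=1, n_iter=60):
    """Find an exponential rate whose censored fraction is close to target."""
    rng = np.random.default_rng(seed)
    base = rng.exponential(size=failure.shape[0])  # Exp(1) draws, reused below
    lo, hi = 1e-6, 1e6                             # hypothetical search bounds
    for _ in range(n_iter):
        mid = np.sqrt(lo * hi)                     # bisection on the log scale
        censored_fraction = np.mean(base / mid < failure)
        if censored_fraction < target:
            lo = mid   # too little censoring: a larger rate is needed
        else:
            hi = mid
    rate = np.sqrt(lo * hi)
    return rate, base / rate

rng0 = np.random.default_rng(0)
failure = rng0.weibull(2.0, size=5000)  # hypothetical stand-in failure times
rate, censor = calibrate_censoring_rate(failure, target=0.30)
achieved = np.mean(censor < failure)
```

In a simulation study the rate would typically be calibrated once on a large pilot sample and then held fixed across the replicated data sets.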


Web Table 1: Selection frequency of mixed effects in longitudinal and survival components for Scenario 5

Fixed effect selection

Longitudinal component   X1,1      X1,2      X1,3      X1,4
True effect              Non-Zero  Non-Zero  Zero      Zero
Sel. Freq. (%)           100       100       0         0

Survival component       X2,1      X2,2      X2,3      X2,4
True effect              Non-Zero  Non-Zero  Zero      Zero
Sel. Freq. (%)           100       100       0         0

Random effect selection

Longitudinal component   Z1,1      Z1,2      Z1,3      Z1,4
True effect              Non-Zero  Non-Zero  Zero      Zero
Sel. Freq. (%)           100       100       0         0

Survival component       Z2,1      Z2,2      Z2,3      Z2,4
True effect              Non-Zero  Non-Zero  Zero      Zero
Sel. Freq. (%)           99        99        1         0


Web Table 2: Estimation of fixed effects $\beta_{1,j}$ and $\beta_{2,j}$ in longitudinal and survival components for Scenario 5

$\hat\beta_{1,j}$ ± SE (coverage probability) for the longitudinal component (a)

Term       True value  W/O selection        1st stage            2nd stage
Intercept  1           0.995 ± 0.033 (94%)  0.993 ± 0.033 (92%)  0.999 ± 0.029 (95%)
X1,1       1.5         1.500 ± 0.036 (95%)  1.487 ± 0.036 (90%)  1.505 ± 0.034 (95%)
X1,2       2           1.998 ± 0.039 (94%)  1.986 ± 0.039 (87%)  2.001 ± 0.036 (91%)
X1,3       0           0.002                0.000                0.000
X1,4       0           0.001                0.000                0.000

$\hat\beta_{2,j}$ ± SE (coverage probability) for the survival component (a)

Term       True value  W/O selection        1st stage            2nd stage
Intercept  -           -                    -                    -
X2,1       1.5         1.355 ± 0.126 (74%)  0.989 ± 0.125 (0%)   1.348 ± 0.130 (67%)
X2,2       2           1.844 ± 0.149 (75%)  1.381 ± 0.145 (1%)   1.823 ± 0.152 (73%)
X2,3       0           0.008                0.000                0.000
X2,4       0           0.019                0.000                0.000

(a) The $\hat\beta$ values are the averages of the estimates over the 100 data sets; SE is the empirical standard error of the 100 $\hat\beta$ values. For each data set, the 95% confidence interval based on the parameter and standard error estimates is calculated, and the corresponding coverage probability for the true value over the 100 data sets is given in parentheses. SE and coverage probability are reported only for non-zero variables.
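The summaries described in footnote (a) can be computed along the following lines. This sketch assumes a Wald-type 95% interval (estimate ± 1.96 × estimated standard error) and a sample standard deviation with denominator n − 1 for the empirical SE; neither detail is stated in the text. The function name and the toy inputs are hypothetical and are not taken from the simulation results.

```python
import numpy as np

def summarize_replicates(estimates, std_errors, true_value, z=1.96):
    """Average estimate, empirical SE, and coverage over replicated data sets."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    lower = estimates - z * std_errors   # per-data-set interval bounds
    upper = estimates + z * std_errors
    covered = (lower <= true_value) & (true_value <= upper)
    return {
        "mean": estimates.mean(),
        "empirical_se": estimates.std(ddof=1),
        "coverage": covered.mean(),
    }

# Toy replicate estimates and standard errors for a true value of 1.5.
summary = summarize_replicates(
    estimates=[1.40, 1.50, 1.60, 1.90],
    std_errors=[0.10, 0.10, 0.10, 0.10],
    true_value=1.5,
)
```

In this toy input, three of the four intervals contain 1.5, so the coverage is 0.75 and the average estimate is 1.6.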


Web Table 3: Estimation of random effects $\sqrt{D_{1kk}}$ and $\sqrt{D_{2kk}}$ in longitudinal and survival components for Scenario 5

$\sqrt{\hat{D}_{1kk}}$ for the longitudinal component (a)

Term        True value  W/O selection  1st stage  2nd stage
Intercept1  0.707       0.791          0.787      0.682
Z1,1        0.707       0.817          0.773      0.692
Z1,2        0.707       0.817          0.763      0.696
Z1,3        0           0.052          0.000      0.000
Z1,4        0           0.050          0.000      0.000

$\sqrt{\hat{D}_{2kk}}$ for the survival component (a)

Term        True value  W/O selection  1st stage  2nd stage
Intercept2  0.707       0.776          0.407      0.638
Z2,1        0.707       0.820          0.368      0.665
Z2,2        0.707       0.825          0.319      0.674
Z2,3        0           0.205          0.000      0.004
Z2,4        0           0.202          0.000      0.000

(a) $\sqrt{\hat{D}_{1kk}}$ and $\sqrt{\hat{D}_{2kk}}$ are the averages of the estimates over the 100 data sets.


Web Table 4: Selection frequency of mixed effects in longitudinal and survival components for Scenario 6

Fixed effect selection

Longitudinal component   X1,1      X1,2      X1,3      X1,4      X1,5      X1,6      X1,7
True effect              Non-Zero  Non-Zero  Non-Zero  Zero      Zero      Zero      Zero
Sel. Freq. (%)           100       100       100       0         0         0         0

Survival component       X2,1      X2,2      X2,3      X2,4      X2,5      X2,6      X2,7
True effect              Non-Zero  Non-Zero  Non-Zero  Zero      Zero      Zero      Zero
Sel. Freq. (%)           100       100       100       0         0         0         0

Random effect selection

Longitudinal component   Z1,1      Z1,2      Z1,3      Z1,4      Z1,5      Z1,6      Z1,7
True effect              Non-Zero  Non-Zero  Non-Zero  Zero      Zero      Zero      Zero
Sel. Freq. (%)           100       100       100       0         0         0         0

Survival component       Z2,1      Z2,2      Z2,3      Z2,4      Z2,5      Z2,6      Z2,7
True effect              Non-Zero  Non-Zero  Non-Zero  Zero      Zero      Zero      Zero
Sel. Freq. (%)           97        93        94        6         4         1         9


Web Table 5: Estimation of fixed effects $\beta_{1,j}$ and $\beta_{2,j}$ in longitudinal and survival components for Scenario 6

$\hat\beta_{1,j}$ ± SE (coverage probability) for the longitudinal component (a)

Term       True value  W/O selection        1st stage            2nd stage
Intercept  1           0.994 ± 0.068 (85%)  0.987 ± 0.068 (89%)  0.994 ± 0.064 (87%)
X1,1       1.5         1.498 ± 0.081 (75%)  1.454 ± 0.079 (82%)  1.497 ± 0.076 (82%)
X1,2       2           1.999 ± 0.072 (79%)  1.960 ± 0.072 (87%)  1.995 ± 0.074 (85%)
X1,3       2.5         2.496 ± 0.072 (81%)  2.462 ± 0.072 (87%)  2.496 ± 0.073 (86%)
X1,4       0           0.001                0.000                0.000
X1,5       0           -0.004               0.000                0.000
X1,6       0           0.000                0.000                0.000
X1,7       0           -0.003               0.000                0.000

$\hat\beta_{2,j}$ ± SE (coverage probability) for the survival component (a)

Term       True value  W/O selection        1st stage            2nd stage
X2,1       1.5         1.966 ± 0.286 (63%)  1.039 ± 0.249 (20%)  1.549 ± 0.358 (86%)
X2,2       2           2.667 ± 0.377 (49%)  1.495 ± 0.331 (28%)  2.112 ± 0.593 (82%)
X2,3       2.5         3.313 ± 0.429 (49%)  1.897 ± 0.370 (30%)  2.625 ± 0.712 (84%)
X2,4       0           0.014                0.000                0.000
X2,5       0           -0.025               0.000                0.000
X2,6       0           0.011                0.000                0.000
X2,7       0           0.035                0.000                0.000

(a) The $\hat\beta$ values are the averages of the estimates over the 100 data sets; SE is the empirical standard error of the 100 $\hat\beta$ values. For each data set, the 95% confidence interval based on the parameter and standard error estimates is calculated, and the corresponding coverage probability for the true value over the 100 data sets is given in parentheses. SE and coverage probability are reported only for non-zero variables.


Web Table 6: Estimation of random effects $\sqrt{D_{1kk}}$ and $\sqrt{D_{2kk}}$ in longitudinal and survival components for Scenario 6

$\sqrt{\hat{D}_{1kk}}$ for the longitudinal component (a)

Term       True value  W/O selection  1st stage  2nd stage
Intercept  0.707       0.785          0.768      0.628
Z1,1       0.707       0.821          0.677      0.669
Z1,2       0.707       0.831          0.657      0.693
Z1,3       0.707       0.829          0.633      0.699
Z1,4       0           0.165          0.000      0.000
Z1,5       0           0.174          0.000      0.000
Z1,6       0           0.159          0.000      0.000
Z1,7       0           0.158          0.000      0.000

$\sqrt{\hat{D}_{2kk}}$ for the survival component (a)

Term       True value  W/O selection  1st stage  2nd stage
Intercept  0.707       1.037          0.511      0.652
Z2,1       0.707       1.074          0.431      0.698
Z2,2       0.707       1.155          0.412      0.725
Z2,3       0.707       1.167          0.382      0.782
Z2,4       0           0.574          0.002      0.047
Z2,5       0           0.552          0.004      0.018
Z2,6       0           0.535          0.000      0.005
Z2,7       0           0.685          0.025      0.091

(a) $\sqrt{\hat{D}_{1kk}}$ and $\sqrt{\hat{D}_{2kk}}$ are the averages of the estimates over the 100 data sets.


Web Figure 1: Residual plots for data application diagnostics. The circles are the standardized residuals. The black lines are the LOESS estimates.