Simulation Studies - Choi_unc_0153D

In this section, we present some results from our simulation studies. Two sets of simulations with different generalized linear mixed models for the longitudinal outcomes

are performed. Binary and Poisson data are considered for longitudinal process in the first and second sets of simulations, respectively.

3.6.1 Binary longitudinal outcomes and survival time

In this first set of simulations, we assumeYij to be a binary outcome following

P(Yij =yij∣bi) =exp{yijηij −log(1+exp{ηij})}, yij =0, 1,

with ηij =Xijβ+bi =β0+β1X1i+β2X2i+β3X3ij+bi forj =1, . . . , ni, and

h(t∣bi) =λ(t)exp{ψbi+Zi(t)γ} =λ(t)exp{ψbi+γ1Z1i+γ2Z2i},

wherebi∼N(0, σb2),X1i ≡Z1i are simulated from a Bernoulli distribution with success

probability being 0.5, andX2i≡Z2iare simulated from the uniform distribution between

0 and 1. The longitudinal data are generated for every 0.3 unit of time, and thus X3ij,

the time at measurement, has the value of every 0.3 unit ranging over 0 through 2.4. We consider differentψvalues of -0.1, 0, and 0.1 for negative, zero, and positive dependency between longitudinal process and survival time model, respectively. The parameters in the two models are chosen as β0 = −1, β1 = 1, β2 = −0.5, β3 = −0.2, σb2 = 0.5, ψ

= −0.1/ 0/ 0.1, γ1 = −0.1, γ2 = 0.1, and λ(t) = 1. Censoring time is generated from

the uniform distribution between 0.4 and 2.4, and the censoring proportion is around 25∼35%. We consider different sample sizes (n=200, 400) with 1000 replications. The average number of longitudinal observations (ni) is 3 with the range of 1 to 8. For the

comparison of the estimated baseline cumulative hazards over simulations, we consider the three time points of 0.9, 1.4, 1.9 which are three quartiles of the observed survival time. The results of the maximum likelihood estimates for θ and baseline cumulative hazards at the given three time points and their respective standard error estimates

are reported in Table 3.3. The simulation study is conducted using R.

In Table 3.3, “True” gives the true values of parameters; the averages of the maximum likelihood estimates from the EM algorithm are in “Est.”; the sample standard deviations from 1000 simulations are reported in “SSD”; “ESE” is the average of 1000 standard error estimates based on the observed information matrix; “CP” is the coverage proportion of 95% nominal confidence intervals based on the estimated standard error “ESE”. Satterthwaite method is used for the coverage probability of σ2

From Table 3.3, we can see that even for the smaller sample size (n=200), the bias of the estimates from EM algorithm is negligible for most cases. The estimated standard errors calculated from the observed information matrix are close to the sample standard deviations from the 1000 estimates, and the 95% confidence interval coverage rates are close to 0.95 except those for ψ. The parameter ψ tends to be underestimated with higher than the nominal level coverage rates, but the coverage rate is improved for larger sample size. Thus, with small sample size, the test for ψ is conservative, which strengthens the test results when rejecting the null(ψ =0), and the type I error becomes closer to the nominal level as sample size increases. In addition, the simulations show that the variances of the estimators decrease as the sample size (n) increases. We also can see that the estimates are fairly robust and close to the true values for all different

ψ values.

3.6.2 Poisson longitudinal outcomes and survival time

In the second set of simulations, we assume Yij to follow a Poisson distribution,

Table 3.3: Summary of simulation results of maximum likelihood estimation for binary longitudinal outcomes and survival time.

n=200 n=400

ψ Par. True Est. SSD ESE CP Est. SSD ESE CP - .1 β0 -1.0 -1.008 .270 .272 .951 -1.006 .194 .191 .945 β1 1.0 1.015 .241 .232 .942 .997 .162 .162 .949 β2 - .5 - .522 .405 .392 .939 - .500 .282 .274 .945 β3 - .2 - .173 .252 .245 .947 - .187 .181 .172 .952 σ2 b .5 .502 .231 .288 .968 .493 .168 .200 .974 ψ - .1 - .083 .353 .390 .990 - .102 .235 .245 .978 γ1 - .1 - .102 .174 .174 .949 - .104 .121 .121 .955 γ2 .1 .099 .296 .303 .953 .097 .214 .210 .950 Λ( .9) .9 .910 .181 .184 .955 .906 .132 .128 .943 Λ(1.4) 1.4 1.444 .294 .299 .962 1.421 .211 .204 .943 Λ(1.9) 1.9 1.980 .446 .449 .955 1.945 .312 .302 .951 0 β0 -1.0 -1.012 .281 .273 .944 -1.008 .192 .192 .948 β1 1.0 1.014 .235 .233 .954 1.006 .164 .163 .948 β2 - .5 - .501 .414 .393 .934 - .503 .278 .276 .949 β3 - .2 - .190 .263 .246 .936 - .194 .175 .173 .950 σ2 b .5 .505 .237 .290 .960 .503 .176 .203 .963 ψ .0 .008 .375 .388 .994 .003 .236 .241 .980 γ1 - .1 - .108 .180 .174 .942 - .103 .113 .121 .968 γ2 .1 .098 .309 .303 .949 .101 .209 .210 .951 Λ( .9) .9 .920 .188 .186 .952 .905 .131 .127 .944 Λ(1.4) 1.4 1.463 .306 .303 .948 1.415 .206 .202 .952 Λ(1.9) 1.9 2.006 .462 .457 .953 1.937 .306 .299 .948 .1 β0 -1.0 -1.009 .285 .274 .948 -1.000 .192 .192 .945 β1 1.0 1.004 .224 .234 .964 1.004 .166 .163 .952 β2 - .5 - .510 .414 .395 .945 - .512 .284 .276 .943 β3 - .2 - .186 .260 .249 .948 - .185 .189 .175 .929 σ2 b .5 .519 .249 .295 .946 .498 .174 .203 .965 ψ .1 .117 .354 .386 .990 .129 .246 .247 .986 γ1 - .1 - .096 .175 .175 .946 - .101 .117 .122 .966 γ2 .1 .086 .312 .304 .944 .101 .212 .211 .948 Λ( .9) .9 .915 .184 .185 .957 .904 .129 .127 .946 Λ(1.4) 1.4 1.455 .305 .303 .955 1.413 .203 .203 .954 Λ(1.9) 1.9 2.010 .481 .463 .952 1.938 .303 .302 .959

with ηij defined as in Section 3.6.1. We also consider the same hazards model and

simulation setting as those used in Section 3.6.1 exceptσ2

b =0.2. The simulated Poisson

longitudinal outcomes range over 0 to 7 with the average 0.5.

Table 3.4 shows that overall the estimates perform well even for the smaller sample size n = 200 with small biases of the estimates except ψ. We conducted additional simulations with sample sizes of 800 and 1000, and the bias of ψ existing for the small sample size decreases as sample size increases over 200, 400, 800 and 1000. The estimated standard errors using the observed information matrix are close to the sample standard deviations, and the 95% confidence interval coverage rates are close to 0.95 except for σ2

b and ψ.

From Table 3.4,ψ is seemingly underestimated with higher than the nominal coverage rates, but the coverage rate decreases to close to 95% nominal level as sample size increases. Additional simulations we conducted show that, with sample sizes of 800, the 95% confidence interval coverage rates for ψ =-0.1, 0 and 0.1 were 95.5%, 95.9% and 95.9%, respectively. σ2

b also appears to have high coverage rates, which may be

due to numerical problem since its coverage rates are still high for larger sample sizes. This implies that variance of σ2

b may not be estimated well for Poisson longitudinal

distribution. In the meantime, the test for σ2

b is conservative, which strengthens the

test result for rejecting the null(σ2

b =0). On the other hand, profile likelihood may be

an alternative estimation approach for σ2

b. It is also shown that the variances of the

estimators decrease for larger sample size, and the estimates are fairly robust and close to the true values for all three differentψ values.

In document Choi_unc_0153D_12105.pdf (Page 105-109)