PSA Data - Florida State University Libraries

6.2.1 Data Description

We use the PSA readings of subjects with more than three readings as the longitudinal measure, and the time to prostate cancer diagnosis as the survival endpoint. Of the 1210 subjects, 647 had more than three PSA readings, and 52 of these 647 subjects were diagnosed with prostate cancer. The PSA readings were transformed by log(PSA + 1) because the data on the original scale are skewed and some of the PSA readings are equal to zero.

Figure 6.9: Observed trajectories of PSA readings for the 647 subjects who have more than three PSA readings.

6.2.2 Results Using Joint Models

Let Y_ij denote the transformed PSA reading log(PSA + 1) for the i-th subject at time point j, for i = 1, . . . , n and j = 1, . . . , n_i. Of the 647 people included in this analysis, 92 had 4 longitudinal observations, 80 had 5, 87 had 6, 69 had 7, 56 had 8, 43 had 9, 37 had 10, 33 had 11, 58 had 12, 37 had 13, 19 had 14, 11 had 15, and 25 had more than 15 longitudinal observations.

Of these, 52 people were diagnosed with prostate cancer before the end of the study. Figure 6.9 displays the PSA readings over time on the original scale and on the transformed scale. Each line represents the longitudinal trajectory of one person. We do not include covariates in the Cox proportional hazards model in this dissertation.

We fit the Gaussian joint model reviewed in Chapter 3 and the log-gamma joint model proposed in Chapter 4 to the PSA data, with linear trajectory ψ_{β_i}(t_ij) = β_0i + β_1i t_ij. We assume that the baseline hazard λ(t) is piecewise constant with L = 15 intervals. The time points defining the intervals were arbitrarily taken to be t = 0, 2, 4, 6, 8, 10, 12, 14.
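To make the piecewise-constant baseline hazard concrete, the following Python sketch evaluates the cumulative baseline hazard and the corresponding baseline survival probability for a given set of cut points. The function names and the hazard values are hypothetical and chosen only for illustration; they are not estimates from the PSA analysis, and the link to the longitudinal trajectory through γ is left out.

```python
import math

def cumulative_hazard(t, cuts, rates):
    """Integrate a piecewise-constant hazard from 0 to t.

    cuts  : increasing cut points with cuts[0] = 0 (e.g. 0, 2, ..., 14)
    rates : hazard on [cuts[l], cuts[l+1]); the last rate is assumed to
            extend beyond the final cut point (an illustrative choice)
    """
    if len(rates) != len(cuts) - 1:
        raise ValueError("need exactly one rate per interval")
    total = 0.0
    for l, rate in enumerate(rates):
        lo = cuts[l]
        hi = cuts[l + 1] if l < len(rates) - 1 else math.inf
        if t <= lo:
            break
        # Add the hazard accumulated over the part of this interval before t
        total += rate * (min(t, hi) - lo)
    return total

def baseline_survival(t, cuts, rates):
    """Baseline survival probability S0(t) = exp(-cumulative hazard)."""
    return math.exp(-cumulative_hazard(t, cuts, rates))

cuts = [0, 2, 4, 6, 8, 10, 12, 14]
rates = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05]  # hypothetical values
print(baseline_survival(3, cuts, rates))
```

Because the hazard is constant within each interval, the cumulative hazard is a sum of rate-times-length terms, which is what keeps the survival likelihood tractable in this kind of model.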

In the Gaussian joint model, the priors and hyperpriors for the parameters in (3.6), (3.4) and (3.10) were taken as

b_0 ∼ N_2((3, −1)′, I_2), V_0⁻¹ ∼ Wishart(20, 10 × I_2),

M ∼ Gamma(1, 1), σ² ∼ IG(10, 1),

γ ∼ N (1, 0.05),

λ_l ∼ Gamma(10, 10), l = 1, 2, . . . , 7,

where I represents the identity matrix, and Wishart and Gamma denote the Wishart and gamma distributions. In the log-gamma joint model, the priors and hyperpriors for the parameters in (4.2), (4.8) and (4.6) were taken as

κ0 ∼ IG(10, 10), κ1 ∼ IG(10, 10), κ3 ∼ IG(10, 10), M ∼ Gamma(1, 1),

α ∼ Gamma(100, 10), γ ∼ LG(50, 20),

λ_l ∼ Gamma(10, 10), l = 1, 2, . . . , 5,

where “IG”, “LG”, and “Gamma” stand for the inverse gamma, log-gamma, and gamma distributions, respectively. We ran 5,000 iterations with a burn-in of 2,000. We used the posterior mean of the last 3,000 draws as the point estimate for each parameter, and the 2.5% and 97.5% quantiles of the last 3,000 draws as the 95% pointwise credible interval.
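The posterior summaries described above (the posterior mean and the 2.5% and 97.5% quantiles of the retained draws) can be sketched in Python as follows. The chain here is simulated only to have something to summarize; it is not output from either joint model, and the function name is hypothetical.

```python
import random

def posterior_summary(draws, burn_in):
    """Posterior mean and equal-tailed 95% interval after discarding burn-in."""
    kept = sorted(draws[burn_in:])
    n = len(kept)
    mean = sum(kept) / n

    def quantile(p):
        # Linear interpolation between neighbouring order statistics
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return kept[lo] + (pos - lo) * (kept[hi] - kept[lo])

    return mean, (quantile(0.025), quantile(0.975))

# Stand-in chain of 5,000 simulated draws (hypothetical, for illustration)
rng = random.Random(1)
chain = [rng.gauss(0.04, 0.01) for _ in range(5000)]
mean, (lower, upper) = posterior_summary(chain, burn_in=2000)
print(mean, lower, upper)
```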

Parameter estimates are shown in Table 6.2. The estimated slope of the linear trajectory is 0.07 with a 95% credible interval of (−0.002, 0.15) under the Gaussian joint model and 0.04 with a 95% credible interval of (0.01, 0.06) under the log-gamma joint model, suggesting a slight increase in PSA over the study period. The estimated value of γ is 1.17 under the Gaussian joint model and 0.81 under the log-gamma joint model, and neither 95% credible interval covers zero. This suggests that when the PSA readings increase, the hazard of being diagnosed with prostate cancer also increases. The non-zero estimate of γ also suggests that a joint model is needed in this problem.

Table 6.2: Parameter estimates for the PSA data using the Gaussian joint model with slice sampler and the log-gamma joint model.

              Gaussian Joint Model                Log-gamma Joint Model
Parameter     Estimate  95% Credible Interval     Estimate  95% Credible Interval
β_01          0.96      (0.64, 1.27)              1.11      (0.81, 1.36)
β_11          0.07      (−0.002, 0.15)            0.04      (0.01, 0.06)
Var(error)    0.05      (0.04, 0.05)              0.04      (0.04, 0.05)
γ             1.17      (1.13, 1.17)              0.81      (0.53, 1.07)

The MCMC trace plots and the posterior density estimates of β_01, β_11, γ, and α are given in Figures 6.10 and 6.11. Figure 6.12 shows the observed PSA readings versus the estimated longitudinal trajectories of 4 randomly selected patients under the Gaussian joint model and the log-gamma joint model. Both joint models captured the trajectories well. The MSRE of the PSA readings on the original scale is 1.79 for the Gaussian joint model, which is smaller than the 4.94 obtained for the log-gamma joint model. The estimated association between the longitudinal observations and the hazard of prostate cancer diagnosis is positive in both the Gaussian and log-gamma joint models, so a higher PSA implies a higher hazard of prostate cancer diagnosis.

At the cluster level, we used Dahl’s method to select the “best” iteration from the last 3,000 iterations and used the values of the draws in that iteration as the point estimates for those parameters.
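As a rough illustration of the selection step, the sketch below implements the usual least-squares form of Dahl's criterion: each retained iteration is turned into a co-clustering (association) matrix, these matrices are averaged, and the iteration whose matrix is closest to the average in squared error is chosen. This is a minimal sketch under that assumption, not the code used for the analysis; the function name and the toy labels are hypothetical, and storing every matrix would be too memory-hungry for 647 subjects and 3,000 draws without further care.

```python
def dahl_best_iteration(label_draws):
    """Index of the draw whose co-clustering matrix is closest, in squared
    error, to the average co-clustering matrix over all draws."""
    n_draws = len(label_draws)
    n = len(label_draws[0])

    def association(labels):
        # Entry (i, j) is 1 when subjects i and j share a cluster
        return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
                for i in range(n)]

    mats = [association(labels) for labels in label_draws]
    avg = [[sum(m[i][j] for m in mats) / n_draws for j in range(n)]
           for i in range(n)]

    def loss(m):
        return sum((m[i][j] - avg[i][j]) ** 2
                   for i in range(n) for j in range(n))

    losses = [loss(m) for m in mats]
    return min(range(n_draws), key=losses.__getitem__)

# Toy cluster labels for 4 subjects over 4 retained iterations (hypothetical)
draws = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as above with the label names swapped
    [0, 1, 1, 1],
]
best = dahl_best_iteration(draws)
print(best, draws[best])
```

Working with co-clustering matrices rather than raw labels makes the criterion invariant to label switching, which is why the third toy draw counts as the same partition as the first two.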

Figure 6.10: Trace plots and density curves using the Gaussian joint model with slice sampler for the PSA data. (a) MCMC trace plot of trajectory intercept β_0. (b) Posterior density estimates of trajectory intercept β_0. (c) MCMC trace plot of trajectory slope β_1. (d) Posterior density estimates of trajectory slope β_1. (e) MCMC trace plot of the link parameter γ. (f) Posterior density estimates of the link parameter γ.

Figure 6.11: Trace plots and density curves using the log-gamma joint model for the PSA data. (a) MCMC trace plot of trajectory intercept β_0. (b) Posterior density estimates of trajectory intercept β_0. (c) MCMC trace plot of trajectory slope β_1. (d) Posterior density estimates of trajectory slope β_1. (e) MCMC trace plot of the link parameter γ. (f) Posterior density estimates of the link parameter γ.

Figure 6.12: Observed PSA readings (black circles) vs. estimated PSA longitudinal trajectories of 4 randomly selected subjects using the Gaussian joint model with slice sampler (blue dashed line) and the log-gamma joint model (red dashed line).

Figure 6.14 shows the predicted PSA trajectories, predicted hazard rate, and predicted survival probability for each cluster. Different colors represent different clusters. In the bottom left panel we truncate the trajectory of the cluster labeled in yellow at the largest observed time of PSA readings. We do this because a high percentage of individuals in this cluster were diagnosed with prostate cancer early, and hence we are less confident in making estimates after this time point. The decreasing trend of PSA readings in this cluster is consistent with Lin et al. [24], who found that individuals with similar PSA readings could exhibit a similar decreasing trend. The top middle panel of Figure 6.14, which shows the hazard rate, was restricted in range for better visualization, because most of the cluster-specific hazard rates were smaller than 10. The log-gamma model leads to an estimated 7 clusters, while the Gaussian model leads to an estimated 54 clusters. This corresponds well with the results of the simulation study in Chapter 5, in which the Gaussian joint model tended to overestimate the number of clusters and to estimate more clusters than the log-gamma joint model, given that the prior on the concentration parameter M of the Dirichlet process prior is the same in both joint models. We also compared the predicted survival probabilities from the joint models with the Kaplan-Meier estimator, which is shown in the survival probability plots in Figure 6.14. The Kaplan-Meier estimate for all patients lay between the estimated cluster-specific survival probabilities.
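For reference, the Kaplan-Meier estimator used as the comparison can be computed as in the short Python sketch below. The follow-up times and event indicators are toy values, not the PSA cohort, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each subject
    events : 1 if the event (here, diagnosis) was observed, 0 if censored
    Returns a list of (event_time, survival_probability) pairs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects that share the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t
        at_risk -= n_removed
    return curve

# Toy follow-up data (hypothetical)
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 1, 0, 1, 0, 1, 0]
km = kaplan_meier(times, events)
print(km)
```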

6.2.3 Diagnostics

As with the AIDS data in Section 6.1, we provide diagnostics for the PSA data in addition to interpreting the model fit. It may be the case that both the Gaussian and log-gamma versions of the joint model provide poor fits to the data. The posterior predictive p-value (Meng [25]) is a viable approach to assess model fit (see Section 2.8). For this PSA data set, the posterior predictive p-values for the log-gamma model are 1 and 0.027 (for the χ² and squared error loss functions, respectively), and the posterior predictive p-values for the Gaussian model are approximately 1 and 0.749 (for the χ² and squared error loss functions, respectively). This suggests that the Gaussian model may overfit the data, while the result for the log-gamma model is inconclusive. To obtain a sense of how much we are smoothing in each case, we consider other types of diagnostics. For example, the DIC (see Section 2.9) for the log-gamma model is 1920113 and the DIC for the Gaussian model is 6425589. This suggests better predictive performance of the log-gamma model.

Residual diagnostics are also helpful for determining the quality of the model fit. In Figure 6.13, we show the QQ-plot of the residuals from the Gaussian joint model and the log-gamma joint model. Specifically, Figure 6.13 plots the sample quantiles of the residuals, y_ij − (β̂_0 + β̂_1 t_ij), versus the theoretical quantiles from a Gaussian distribution. Here, the normal QQ-plot suggests left skewness in both cases. This points to a better fit under the log-gamma model, with residuals distributed more like a left-skewed distribution (possibly a log-gamma distribution).

Figure 6.13: QQ-plot of residuals using the Gaussian joint model and the log-gamma joint model for the PSA data.
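The posterior predictive p-values reported in this section follow the general recipe of comparing a discrepancy computed on replicated data with the same discrepancy computed on the observed data, draw by draw. The Python sketch below shows that recipe under strong simplifying assumptions that are not taken from the dissertation: replicated data are Gaussian around a fitted mean, and the two discrepancy functions are generic stand-ins for the χ² and squared error losses of Section 2.8. All names and values are hypothetical.

```python
import random

def posterior_predictive_pvalue(y, mean_draws, sd_draws, discrepancy, rng):
    """Monte Carlo estimate of P(D(y_rep, theta) >= D(y, theta) | y)."""
    exceed = 0
    for mu, sd in zip(mean_draws, sd_draws):
        # Assumption: replicated data are Gaussian around the fitted mean
        y_rep = [rng.gauss(m, sd) for m in mu]
        if discrepancy(y_rep, mu, sd) >= discrepancy(y, mu, sd):
            exceed += 1
    return exceed / len(mean_draws)

def squared_error(y, mu, sd):
    return sum((yi - mi) ** 2 for yi, mi in zip(y, mu))

def chi_square(y, mu, sd):
    return sum(((yi - mi) / sd) ** 2 for yi, mi in zip(y, mu))

# Toy observations and stand-in posterior draws (hypothetical)
rng = random.Random(0)
y_obs = [1.0, 1.2, 0.9, 1.1]
mean_draws = [[1.05 + rng.gauss(0, 0.02)] * 4 for _ in range(500)]
sd_draws = [0.1] * 500
p_value = posterior_predictive_pvalue(y_obs, mean_draws, sd_draws,
                                      squared_error, rng)
print(p_value)
```

Values near 0 or 1 indicate that the observed data sit in the tail of what the fitted model would generate, which is the sense in which the p-values above are read.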

6.3 Discussion

In this chapter, we applied the Gaussian joint model and the log-gamma joint model to two real data sets, the HIV data and the PSA data. Both data sets had repeated observations as the longitudinal measure and a time-to-event as the survival endpoint. We jointly analyzed their longitudinal and survival outcomes using semiparametric Bayesian joint models implemented via the Markov chain Monte Carlo (MCMC) approach. The focus of this analysis is on recovering the mean longitudinal trajectory and using the trajectory to predict survival prognosis.

A Dirichlet process prior was placed on the parameters of the trajectory function to relax distributional assumptions on those parameters, which can improve estimates in cases where parametric distributional assumptions are inappropriate. Both the Gaussian joint model and the log-gamma joint model were coded in R. Comparisons between these two joint models were given in Sections 6.1 and 6.2. The Gaussian joint model was slightly better at estimating the mean trajectories at the individual level, while the log-gamma model performed better in clustering, and its estimates for each cluster were more interpretable.

Figure 6.14: Predicted longitudinal trajectories, predicted hazard rate, and predicted survival probability for each cluster using the Gaussian joint model with slice sampler and the log-gamma joint model for the PSA data. In the bottom left panel we truncate the trajectory of the cluster labeled in yellow at the largest observed time of PSA readings. We do this because a high percentage of individuals in this cluster were diagnosed with prostate cancer early, and hence we are less confident in making estimates after this time point.

CHAPTER 7 CONCLUSION AND FUTURE WORK

In this dissertation, we generalized a Bayesian semiparametric joint model proposed by Brown and Ibrahim [6] to jointly analyze longitudinal and survival data. In Chapter 3, we reviewed the Gaussian joint model proposed by Brown and Ibrahim [6] and provided the full conditional distributions for the parameters in their joint model. The samplers considered there were Metropolis-Hastings and the slice sampler.

In Chapter 4, we challenged the Gaussian assumption in the Gaussian joint model and assumed a log-gamma distribution for the longitudinal measures. We then proposed a computationally efficient method for inference that uses the multivariate log-gamma distributions and the conditional multivariate log-gamma distributions as the priors and hyperpriors, which yields more conjugacy.

Full conditional distributions of the parameters in the log-gamma joint model were provided in Chapter 4. In Chapter 5, simulation studies were given to illustrate both the Gaussian joint model and the log-gamma joint model. We fit these two types of joint models to both Gaussian distributed data and log-gamma distributed data and compared their performance. Comparison results were given at the individual level and the cluster level. At the individual level, the Gaussian joint model did slightly better than the log-gamma joint model in terms of the MSE of the longitudinal measures. The log-gamma joint model did much better than the Gaussian joint model in estimating the hazard for each individual. In clustering, the log-gamma joint model was better than the Gaussian joint model in terms of cluster assignments, as measured by the adjusted Rand index. The log-gamma joint model was more accurate in estimating the number of clusters, while the Gaussian joint model tended to estimate more clusters than the truth. The log-gamma joint model was also more accurate in the estimation of the cluster-specific longitudinal mean trajectory and the cluster-specific hazard.

We only considered a linear trajectory for the longitudinal measures in this dissertation. However, the trajectory can take other forms of interest, e.g., a quadratic trajectory, which is a possible direction for future work. Moreover, we assumed the longitudinal errors to be independent and identically distributed with either the Gaussian distribution or the log-gamma distribution. However, the longitudinal errors within each individual can be correlated. A joint model with time-dependent longitudinal errors is another possible direction for future work.

In Chapter 6, we analyzed the HIV data to capture the trend of the CD4 cell count and to learn the association between the CD4 count trend and the hazard of death. We also analyzed the PSA data to capture the trend of PSA readings and to learn the association between the PSA trend and the hazard of prostate cancer diagnosis. There are more possibilities for real data analysis. For example, we will implement the Gaussian joint model and the log-gamma joint model on a Behavior Samples data set, which has repeated measures of the Behavior Samples scores for children. This study aims to contribute to the understanding of early communication delays and will analyze the correlation between the longitudinal observations and the risk of having a language development disorder.

APPENDIX A

ADDITIONAL DETAILS ON THE MULTIVARIATE LOG-GAMMA DISTRIBUTION

A.1 Marginal Multivariate Log-Gamma Distribution

A particular class of the marginal multivariate log-gamma (mMLG) distribution will be useful for the Gibbs sampler. In Bradley et al. [5], Theorem 2 defines a particular class of MLG random vectors. Specifically, let $q^* \sim \mathrm{MLG}(0_m, V, \alpha, \kappa)$, and partition this random vector so that $q^* = (q_1^{*\prime}, q_2^{*\prime})'$, where $q_1^*$ is a $g$-dimensional random vector and $q_2^*$ is an $(m-g)$-dimensional random vector. Additionally, we partition $V^{-1} = [H \;\; B]$ into an $m \times g$ matrix $H$ and an $m \times (m-g)$ matrix $B$ as done in (2.7). Consider the class of MLG random vectors that satisfy the following:

\[
V^{-1} = \begin{pmatrix} Q_1 & Q_2 \end{pmatrix}
\begin{pmatrix} R_1 & 0_{g,m-g} \\ 0_{m-g,g} & \frac{1}{\sigma} I_{m-g} \end{pmatrix},
\]

where $0_{g,m-g}$ is a $g \times (m-g)$ matrix of zeros; $0_{m-g,g}$ is an $(m-g) \times g$ matrix of zeros; $I_{m-g}$ is an $(m-g) \times (m-g)$ identity matrix;

\[
H = \begin{pmatrix} Q_1 & Q_2 \end{pmatrix}
\begin{pmatrix} R_1 \\ 0_{m-g,g} \end{pmatrix}
\]

is the QR decomposition of the $m \times g$ matrix $H$; the $m \times g$ matrix $Q_1$ satisfies $Q_1'Q_1 = I_g$; the $m \times (m-g)$ matrix $Q_2$ satisfies $Q_2'Q_2 = I_{m-g}$ and $Q_2'Q_1 = 0_{m-g,g}$; $B = \frac{1}{\sigma} Q_2$, where $\sigma > 0$; and $R_1$ is a $g \times g$ upper triangular matrix. Notice that

\[
V = \begin{pmatrix} (H'H)^{-1}H' \\ \sigma Q_2' \end{pmatrix}.
\]

It follows from (2.5) that

\[
q^* = \begin{pmatrix} q_1^* \\ q_2^* \end{pmatrix}
= \begin{pmatrix} (H'H)^{-1}H'w \\ \sigma Q_2' w \end{pmatrix}, \tag{A.1}
\]

where $w \sim \mathrm{MLG}(0_m, I_m, \alpha, \kappa)$. The pdf of the joint distribution of $q^* = (q_1^{*\prime}, q_2^{*\prime})'$ is given by

The marginal distribution of $q_1^*$ in this particular class of MLG distributions is defined as the marginal multivariate log-gamma distribution (mMLG). Multiplying both sides of (A.1) by $(I_g, 0_{g,m-g})$ we have

\[
q_1^* = (H'H)^{-1}H'w. \tag{A.2}
\]

Thus, we can simulate $q_1^*$ according to (A.2).
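A minimal Python sketch of the simulation step in (A.2) is given below. It draws the components of w as logarithms of independent gamma variables and then applies the least-squares projection (H'H)⁻¹H'w. Two points are assumptions made only for this sketch: the gamma variables are taken to have shape α_i and rate κ_i (the convention in (2.5) may instead treat κ_i as a scale), and the 3 × 2 matrix H is a hypothetical example. The small linear solver is included only to keep the example free of external libraries.

```python
import math
import random

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def project(h, w):
    """Return (H'H)^{-1} H'w for an m x g matrix H given as a list of rows."""
    m, g = len(h), len(h[0])
    hth = [[sum(h[k][i] * h[k][j] for k in range(m)) for j in range(g)]
           for i in range(g)]
    htw = [sum(h[k][i] * w[k] for k in range(m)) for i in range(g)]
    return solve(hth, htw)

def sample_mmlg(h, alpha, kappa, rng):
    """Draw q1* = (H'H)^{-1} H'w with w_i = log(gamma_i).

    Assumption: gamma_i has shape alpha[i] and rate kappa[i]; use scale
    kappa[i] instead if that is the convention being followed.
    """
    w = [math.log(rng.gammavariate(a, 1.0 / k)) for a, k in zip(alpha, kappa)]
    return project(h, w)

rng = random.Random(42)
H = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # hypothetical 3 x 2 matrix
q1 = sample_mmlg(H, alpha=[2.0, 2.0, 2.0], kappa=[1.0, 1.0, 1.0], rng=rng)
print(q1)
```

Solving the normal equations directly is adequate for a small, well-conditioned H; for larger problems the QR factors Q_1 and R_1 defined above give the same projection as R_1⁻¹Q_1'w with better numerical stability.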
