CHAPTER 7: FUTURE WORK
A.4 A Simulation Study Comparing Exact Bayesian Inference Through
This appendix is divided into two sections. In the first section we describe our procedure for analysis of the PWC-PH model using MCMC methods. Our approach attempts to make the model fitting step as efficient as possible for scenarios where using MCMC might be preferred in design. Our MCMC implementation also serves as a tool to fit a model to historical data in order to obtain samples from the posterior that can be used to create discrete approximations of the sampling priors. In the second section we provide results from our simulation studies comparing analyses using MCMC to analyses using the proposed normal approximation to the marginal posterior of the hazard ratio parameters.
A.4.1 High Throughput Model Fitting with MCMC
In this section we discuss a reformulation of the marginal posterior for the hazard ratio parameters that is beneficial for MCMC model fitting when the design variables (treatment indicator and covariates) are binary and few in number. This will commonly be the case during design of a clinical trial. In this case, one can write the posterior distribution as a function of a relatively small set of sufficient statistics to which the simulated subject level data can be reduced prior to model fitting.
Let $d$ index the set of $D$ distinct values of the design variables observed in the combined historical study and new study datasets. Write $\phi_d = \exp(\gamma z_d^* + \boldsymbol{\beta}^T \mathbf{x}_d^*)$, where $(z_d^*, \mathbf{x}_d^*)$ are the particular values of the design variables identified by $d$. Let $\nu_{skd}$ and $r_{skd}$ be the number of events and total time at risk in interval $k$ for the set of new study subjects in stratum $s$ that had design variable values identified by $d$. Let $\nu_{skd,0}$ and $r_{skd,0}$ be the analogous quantities for the historical study. The marginal posterior (3.3.4) reduces to
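$$\pi(\gamma, \boldsymbol{\beta} \mid \mathbf{D}, \mathbf{D}_0, a_0) \propto \frac{\prod_{d=1}^{D} \phi_d^{\nu_d^*}}{\prod_{s=1}^{S} \prod_{k=1}^{K_s} \left(\beta_{sk}^*\right)^{\alpha_{sk}^*}} \tag{A.4.1}$$
where
$$\nu_d^* = \sum_{s=1}^{S} \sum_{k=1}^{K_s} \left(\nu_{skd} + a_0 \nu_{skd,0}\right), \quad \alpha_{sk}^* = \sum_{d=1}^{D} \left(\nu_{skd} + a_0 \nu_{skd,0}\right), \quad \text{and} \quad \beta_{sk}^* = \sum_{d=1}^{D} \phi_d \left(r_{skd} + a_0 r_{skd,0}\right).$$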
Thus, one only needs the set of sufficient statistics $\{(\nu_{skd}, r_{skd}, \nu_{skd,0}, r_{skd,0}) : \forall s, k, d\}$ and $\{(z_d^*, \mathbf{x}_d^*) : \forall d\}$ in order to evaluate the marginal posterior for MCMC sampling. When $D \ll n + n_0$, an approach using the sufficient statistics is far superior to the straightforward approach using subject level data, because the cost of each posterior evaluation scales with the number of $(s, k, d)$ cells rather than with the number of subjects. In our implementation we first reduce each simulated dataset to the sufficient statistics and then perform MCMC model fitting using (A.4.1). Our implementation supports an arbitrary number of binary covariates.
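The reduction to sufficient statistics and the evaluation of (A.4.1) on the log scale can be sketched in Python as follows. This is a minimal illustration, not the implementation described above. It makes several assumptions: (A.4.1) is read as a ratio, with the product over $\phi_d$ in the numerator and the product over $\beta_{sk}^*$ in the denominator; the subject level records have already been split by interval so that each row carries a stratum index, an interval index, a design cell index, an event indicator, and a time at risk; all strata share the same number of intervals `K`; and every $(s, k)$ cell has positive time at risk. The function names and the array layout are hypothetical.

```python
import numpy as np


def reduce_to_sufficient_stats(stratum, interval, cell, event, time_at_risk, S, K, D):
    """Collapse interval-split records into (S, K, D) arrays of events and time at risk."""
    idx = (np.asarray(stratum), np.asarray(interval), np.asarray(cell))
    nu = np.zeros((S, K, D))
    r = np.zeros((S, K, D))
    # np.add.at accumulates correctly when the same (s, k, d) index repeats.
    np.add.at(nu, idx, np.asarray(event, dtype=float))
    np.add.at(r, idx, np.asarray(time_at_risk, dtype=float))
    return nu, r


def log_marginal_posterior(gamma, beta, z, x, nu, r, nu0, r0, a0):
    """Log of (A.4.1), up to an additive constant.

    z has shape (D,), x has shape (D, p), beta has shape (p,);
    nu, r, nu0, r0 have shape (S, K, D).
    """
    log_phi = gamma * z + x @ beta                     # log(phi_d), shape (D,)
    nu_w = nu + a0 * nu0                               # weighted event counts
    r_w = r + a0 * r0                                  # weighted time at risk
    nu_star = nu_w.sum(axis=(0, 1))                    # nu*_d, shape (D,)
    alpha_star = nu_w.sum(axis=2)                      # alpha*_sk, shape (S, K)
    beta_star = (np.exp(log_phi) * r_w).sum(axis=2)    # beta*_sk, shape (S, K)
    return nu_star @ log_phi - np.sum(alpha_star * np.log(beta_star))
```

Once the arrays are built, each posterior evaluation touches only $S \times K \times D$ numbers, regardless of how many subjects were simulated.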
When the covariates are restricted to be binary it is straightforward to show that the full conditional distribution for each of the hazard ratio parameters is log-concave. Thus, one can use rejection sampling or adaptive rejection sampling (W. R. Gilks, 1992) to sample from the full conditionals. In our implementation we utilize a simple Newton-Raphson algorithm to locate the mode of the full conditional and then construct a three-part envelope function centered about the mode for rejection sampling. We use the optimal envelope under the assumption of normality for the full conditional. Since we only need to draw one sample at each Gibbs step and since so few samples actually get rejected, we find that it is not necessary to adapt the envelope.
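One way such a sampler could look is sketched below in Python. The text does not spell out the form of the three-part envelope, so the construction here is an assumption: a flat cap at the height of the mode joined to two exponential tails, each a tangent line to the log density. For a log-concave density both the constant and the tangent lines are upper bounds, so their pointwise minimum is a valid envelope. If the density is exactly normal with standard deviation $\sigma$ and the tangent points are placed at the mode $\pm \delta\sigma$, the envelope area is proportional to $\delta + 2/\delta$, which is minimized at $\delta = \sqrt{2}$ and gives an acceptance probability of $\sqrt{2\pi}/(2\sqrt{2}) \approx 0.886$. All function names and the toy full conditional are hypothetical.

```python
import numpy as np


def newton_mode(d1, d2, x0=0.0, tol=1e-10, max_iter=100):
    """Newton-Raphson search for the root of the first derivative d1."""
    x = x0
    for _ in range(max_iter):
        step = d1(x) / d2(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def sample_log_concave(h, d1, d2, rng, x0=0.0):
    """Draw one value from the density proportional to exp(h), h concave.

    h, d1, d2 are the log density and its first two derivatives.
    """
    m = newton_mode(d1, d2, x0)
    sigma = 1.0 / np.sqrt(-d2(m))          # normal-approximation scale at the mode
    delta = np.sqrt(2.0) * sigma           # optimal tangent offset if exp(h) is normal
    xl, xr = m - delta, m + delta
    hm, sl, sr = h(m), d1(xl), d1(xr)      # sl > 0 > sr by concavity
    a = xl + (hm - h(xl)) / sl             # left tangent reaches height h(m) here
    b = xr + (hm - h(xr)) / sr             # right tangent reaches height h(m) here
    # Envelope areas divided by exp(hm): left tail, flat cap, right tail.
    w = np.array([1.0 / sl, b - a, -1.0 / sr])
    while True:
        u = rng.uniform(0.0, w.sum())
        if u < w[0]:
            x = a - rng.exponential(1.0 / sl)
            log_env = hm + sl * (x - a)
        elif u < w[0] + w[1]:
            x = rng.uniform(a, b)
            log_env = hm
        else:
            x = b + rng.exponential(-1.0 / sr)
            log_env = hm + sr * (x - b)
        # -Exponential(1) has the same distribution as log(Uniform(0, 1)).
        if -rng.exponential() <= h(x) - log_env:
            return x


# Toy full conditional for gamma with a single (s, k) cell (hypothetical numbers):
# h(g) = n1 * g - alpha * log(A + B * exp(g)), which is concave in g.
n1, alpha, A, B = 10.0, 30.0, 1.0, 1.0
toy_h = lambda g: n1 * g - alpha * np.log(A + B * np.exp(g))
toy_d1 = lambda g: n1 - alpha * B * np.exp(g) / (A + B * np.exp(g))
toy_d2 = lambda g: -alpha * A * B * np.exp(g) / (A + B * np.exp(g)) ** 2

gamma_draw = sample_log_concave(toy_h, toy_d1, toy_d2, np.random.default_rng(1))
```

Because the envelope is rebuilt from the current mode on every call, the same routine can be reused unchanged at each Gibbs step, where the full conditional changes with the other parameters. Plain Newton-Raphson is not guaranteed to converge from a poor starting value, so a practical version might start from the previous draw.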
A.4.2 A Comparison of MCMC Analysis Results with Results based on the Laplace Approximation
For the simulation studies in this section we used the same historical data set as was used in the example application in Section 3.5. The number of baseline hazard components and the change points for the new study model were taken from the best historical study model determined by DIC. The parameter values for the new study model were fixed at the posterior means reported in Table 3.2. The following parameters were varied in the simulation studies:
1. Number of new events $\nu$ = 40, 80, and 120 with $n = 3\nu$ enrolled subjects
2. $K_s$ = 3, 10, and 25 for both $s = 1$ and $s = 2$
3. $a_0$ = 0.0, 0.5, and 1.0

Table A.2: Summary of inference agreement using MCMC and the Laplace approximation

               K = 3                  K = 10                 K = 25
  ν    a0    R2      % Concordant   R2      % Concordant   R2      % Concordant
  40   0.0   0.9996  99.4           0.9995  99.1           0.9995  99.2
  40   0.5   0.9999  98.4           0.9999  98.6           0.9999  99.6
  40   1.0   0.9999  98.5           0.9998  99.2           0.9998  99.4
  80   0.0   0.9999  99.3           0.9998  99.3           0.9999  99.1
  80   0.5   0.9999  99.5           0.9999  99.4           0.9999  98.9
  80   1.0   0.9999  99.2           0.9999  99.4           0.9999  99.2
  120  0.0   0.9999  99.5           0.9999  99.7           0.9999  99.3
  120  0.5   0.9999  99.3           0.9999  99.6           0.9999  99.1
  120  1.0   0.9999  99.3           0.9999  99.6           0.9999  99.4

Enrollment was assumed to be uniform over a 2 year period. Administrative censoring occurred at the time when the targeted number of events had been reached. No other censoring mechanism was simulated. We performed $B = 1{,}000$ simulation studies at each of the 27 combinations of the parameters being varied and computed two summaries of agreement. First, we computed the $R^2$ value based on regressing the $\log\left[P\left(\gamma < 0 \mid \mathbf{D}^{(b)}, \mathbf{D}_0, a_0\right)\right]$ values computed using the normal approximation onto the corresponding quantities computed using MCMC (based on 100,000 MCMC samples). Second, we computed the percentage of study decisions that were concordant between the two approaches. Study decisions were concordant if $1\left\{P\left(\gamma < 0 \mid \mathbf{D}^{(b)}, \mathbf{D}_0, a_0\right) \ge \psi\right\}$ was equal for both approaches.

Appendix Table A.2 presents the results of our simulation studies. For $a_0 = 0$, the analysis does not use any of the historical data, and so the case where $\nu = 40$ and $a_0 = 0$ gives good insight into small sample performance. It appears the normal approximation is valid even when the sample size is much smaller than what will be encountered in an adequately powered clinical trial. For all cases the $R^2$ value was at least 0.9995 and the concordance between the methods was never below 98%.
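The two agreement summaries can be computed along the following lines. This is a sketch under stated assumptions rather than the code used for Table A.2: `p_normal` and `p_mcmc` are hypothetical arrays holding the $B$ posterior probabilities of $\gamma < 0$ from the two approaches, `psi` is the decision threshold, and all probabilities are assumed strictly positive so that their logarithms are finite. For a simple linear regression with an intercept, $R^2$ equals the squared Pearson correlation, which is what the function uses.

```python
import numpy as np


def agreement_summary(p_normal, p_mcmc, psi):
    """Return (R^2 on the log scale, percent of concordant decisions)."""
    p_normal = np.asarray(p_normal, dtype=float)
    p_mcmc = np.asarray(p_mcmc, dtype=float)
    # R^2 from regressing log-probabilities of one method on the other.
    r = np.corrcoef(np.log(p_normal), np.log(p_mcmc))[0, 1]
    # A decision is concordant when both methods fall on the same side of psi.
    concordant = (p_normal >= psi) == (p_mcmc >= psi)
    return r ** 2, 100.0 * concordant.mean()
```

With an MCMC estimate based on a finite number of draws, a posterior probability can come out as exactly zero; such replicates would need separate handling before taking logarithms.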
APPENDIX B: CHAPTER 4 SUPPLEMENTAL MATERIALS