2. COMPARING A METHOD FOR INDEPENDENT DATA TO A METHOD FOR
2.3. Monte Carlo Simulation Study Methods
2.3.2. Statistical analyses
For each simulated data set, we performed the following non-inferiority hypothesis test on the propensity-score matched sample:
\[ H_0: p_C - p_E \le \delta_0 \quad \text{vs.} \quad H_1: p_C - p_E > \delta_0 \]
where $p_C$ and $p_E$ are the true event probabilities under the control and experimental treatments, respectively, for subjects in whom the ATT is defined, and $\delta_0 < 0$ is the non-inferiority margin. Without loss of generality, we assume that a smaller success probability denotes greater safety. In the propensity-score matched sample, we used two statistical methods to perform the non-inferiority test. First, we used a method proposed by Farrington and Manning (Farrington & Manning, 1990) that assumes independent treatment groups and is based on restricted maximum likelihood estimation (MLE):
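\[ Z_{FM} = \frac{\hat{p}_C - \hat{p}_E - \delta_0}{\sqrt{\widehat{\mathrm{Var}}_{FM}(\hat{p}_C - \hat{p}_E - \delta_0)}} \tag{1} \]
where
\[ \widehat{\mathrm{Var}}_{FM}(\hat{p}_C - \hat{p}_E - \delta_0) = \frac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C} + \frac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E}. \]
$\hat{p}_C$ is the proportion of control subjects in the matched sample with the event of interest, $\hat{p}_E$ is the proportion of experimental subjects in the matched sample with the event of interest, and $N_C$ and $N_E$ are the numbers of control and experimental subjects, respectively, in the matched sample.
\[ \tilde{p}_C = 2u_{fm}\cos(w_{fm}) - \frac{b_{fm}}{3a_{fm}}, \qquad \tilde{p}_E = \tilde{p}_C - \delta_0, \]
where
\[ \begin{aligned}
\theta &= N_E / N_C, \\
a_{fm} &= 1 + \theta, \\
b_{fm} &= -\left(1 + \theta + \hat{p}_C + \theta\hat{p}_E + \delta_0(\theta + 2)\right), \\
c_{fm} &= \delta_0^2 + \delta_0(2\hat{p}_C + \theta + 1) + \hat{p}_C + \theta\hat{p}_E, \\
d_{fm} &= -\delta_0(1 + \delta_0)\hat{p}_C, \\
u_{fm} &= \mathrm{sign}(v_{fm})\sqrt{\frac{b_{fm}^2}{(3a_{fm})^2} - \frac{c_{fm}}{3a_{fm}}}, \\
v_{fm} &= \frac{b_{fm}^3}{(3a_{fm})^3} - \frac{b_{fm}c_{fm}}{6a_{fm}^2} + \frac{d_{fm}}{2a_{fm}}, \\
w_{fm} &= \frac{\pi + \cos^{-1}\left(v_{fm}/u_{fm}^3\right)}{3}.
\end{aligned} \]
$\tilde{p}_C$ and $\tilde{p}_E$ are the restricted MLEs of $p_C$ and $p_E$ under the null hypothesis that the risk difference equals $\delta_0$. The test statistic $Z_{FM}$ is asymptotically standard normal under the null hypothesis.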
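The closed-form restricted MLEs and the statistic in (1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names and the example inputs are hypothetical, and the clamp on the arccosine argument is an added numerical safeguard that is not part of the published formula.

```python
import math

def fm_restricted_mle(p_c_hat, p_e_hat, n_c, n_e, delta0):
    """Restricted MLEs of p_C and p_E under H0: p_C - p_E = delta0."""
    theta = n_e / n_c
    a = 1 + theta
    b = -(1 + theta + p_c_hat + theta * p_e_hat + delta0 * (theta + 2))
    c = delta0**2 + delta0 * (2 * p_c_hat + theta + 1) + p_c_hat + theta * p_e_hat
    d = -delta0 * (1 + delta0) * p_c_hat
    v = b**3 / (3 * a)**3 - b * c / (6 * a**2) + d / (2 * a)
    u = math.copysign(1.0, v) * math.sqrt(b**2 / (3 * a)**2 - c / (3 * a))
    # Assumption: clamp to [-1, 1] to guard against floating-point overshoot.
    ratio = max(-1.0, min(1.0, v / u**3))
    w = (math.pi + math.acos(ratio)) / 3
    p_c_tilde = 2 * u * math.cos(w) - b / (3 * a)
    p_e_tilde = p_c_tilde - delta0
    return p_c_tilde, p_e_tilde

def fm_z(p_c_hat, p_e_hat, n_c, n_e, delta0):
    """Farrington-Manning Z statistic for the risk difference, as in (1)."""
    p_c_tilde, p_e_tilde = fm_restricted_mle(p_c_hat, p_e_hat, n_c, n_e, delta0)
    variance = (p_c_tilde * (1 - p_c_tilde) / n_c
                + p_e_tilde * (1 - p_e_tilde) / n_e)
    return (p_c_hat - p_e_hat - delta0) / math.sqrt(variance)

# Hypothetical inputs: 20% and 25% observed event rates, 100 subjects per group.
p_c_tilde, p_e_tilde = fm_restricted_mle(0.20, 0.25, 100, 100, -0.1)
z = fm_z(0.20, 0.25, 100, 100, -0.1)
```

By construction the two restricted estimates differ by exactly $\delta_0$, and the control estimate is a root of the cubic $a_{fm}p^3 + b_{fm}p^2 + c_{fm}p + d_{fm} = 0$, which gives a simple way to check the computation.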
Second, we used a method proposed in chapter one of this dissertation which adjusts the Farrington-Manning test statistic (1) to account for the potential correlation within matched subjects:
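\[ Z_{AFM} = \frac{\hat{p}_C^{*} - \hat{p}_E - \delta_0}{\sqrt{\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0)}} \tag{2} \]
where
\[ \widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0) = \frac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C}\, VIF_C + \frac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E} - 2\,\mathrm{Cov}(\tilde{p}_C, \tilde{p}_E). \]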
$\hat{p}_C^{*}$ is the weighted event probability for the control group in the matched sample, where each control receives a weight that is the inverse of the number of controls matched to the same experimental subject (Stuart, Matching Methods for Causal Inference: A Review and a Look Forward, 2010) (Austin, Statistical Criteria for Selecting the Optimal Number of Untreated Subjects Matched to Each Treated Subject When Using Many-to-One Matching on the Propensity Score, 2010) (Austin, Assessing balance in measured baseline covariates when using many-to-one matching on the propensity-score, 2008). The proposed statistic uses Eldridge et al.'s (Eldridge, Ashby, & Kerry, 2006) formula to calculate the variance inflation factor (VIF), which is used to adjust the variance of the control group's event rate to account for the correlation among controls in the same matched set. The formula takes into account the possibility of unequal set sizes (i.e., variable-ratio matching, where the number of controls matched to each treated subject is allowed to vary from one matched set to another) by using the coefficient of variation of the set size,
\[ VIF_C = 1 + \left(\left[\left\{cv_C^2\,\frac{K - 1}{K} + 1\right\}\bar{n}_C\right] - 1\right) ICC_C, \]
where $cv_C = s_{n_C}/\bar{n}_C$ is the coefficient of variation of the set size in the control group. The variance of the set size in the control group is
\[ s_{n_C}^2 = \frac{\sum_{k=1}^{K}(n_{Ck} - \bar{n}_C)^2}{K - 1}. \]
$K$ is the number of matched sets, $\bar{n}_C$ is the average number of controls matched to each treated subject, and $n_{Ck}$ is the number of controls in matched set $k$. Wu et al. (Wu, Crespi, & Wong, 2012) presented several methods for estimating the intra-class correlation coefficient for binary responses. The proposed statistic uses a modified version of the Fleiss-Cuzick estimator for the intra-class correlation coefficient ($ICC_C$) of the control group. The modified Fleiss-Cuzick estimator is based on Farrington-Manning's restricted MLE of the event rate in the control group instead of the unrestricted MLE. The formula for $ICC_C$ is
\[ ICC_C = 1 - \frac{\sum_{k=1}^{K} x_{Ck}(n_{Ck} - x_{Ck})/n_{Ck}}{(N_C - K)\,\tilde{p}_C(1 - \tilde{p}_C)} \]
where $x_{Ck}$ is the number of events in the control group in matched set $k$. To account for the possible correlation between matched subjects in the control and experimental groups, the proposed statistic uses a modified version of Obuchowski's method to estimate the covariance between the event rates of the control group and the experimental group. The proposed statistic uses Farrington-Manning's RMLEs ($\tilde{p}_C$ and $\tilde{p}_E$) in estimating the covariance:
πΆππ£(πΜ, ππΆ Μ) =πΈ πΎ
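The weighted control event probability, the variance inflation factor, and the modified Fleiss-Cuzick intra-class correlation described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function names and toy inputs are hypothetical, the weighted proportion is assumed to be normalised by the sum of the weights, and the restricted MLE of the control event rate is passed in as a number rather than computed.

```python
import math

def weighted_control_proportion(events, set_sizes):
    """Weighted event probability: each control has weight 1 / n_Ck.

    Assumption: weights are normalised by their sum, so the result is the
    average of the within-set control event proportions.
    """
    k = len(set_sizes)
    return sum(x / n for x, n in zip(events, set_sizes)) / k

def variance_inflation_factor(set_sizes, icc_c):
    """VIF_C with the coefficient of variation of the matched-set size."""
    k = len(set_sizes)
    mean_n = sum(set_sizes) / k
    var_n = sum((n - mean_n) ** 2 for n in set_sizes) / (k - 1)
    cv = math.sqrt(var_n) / mean_n
    return 1 + ((cv**2 * (k - 1) / k + 1) * mean_n - 1) * icc_c

def modified_fleiss_cuzick_icc(events, set_sizes, p_c_tilde):
    """ICC_C evaluated at the restricted MLE p_c_tilde of the control rate."""
    k = len(set_sizes)
    n_c = sum(set_sizes)
    within = sum(x * (n - x) / n for x, n in zip(events, set_sizes))
    return 1 - within / ((n_c - k) * p_c_tilde * (1 - p_c_tilde))

# Hypothetical matched sets: controls per set and control events per set.
set_sizes = [1, 2, 3]
events = [1, 0, 2]
p_star = weighted_control_proportion(events, set_sizes)
vif = variance_inflation_factor(set_sizes, icc_c=0.3)
```

With fixed-ratio matching every set has the same size, the coefficient of variation is zero, and the factor reduces to $1 + (\bar{n}_C - 1)ICC_C$.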
A continuity correction may improve the normal approximation of the above proposed statistic for data with very small sample sizes or very highly correlated matched subjects. The proposed test statistic with the continuity correction is
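\[ Z_{AFMCC} = \frac{\hat{p}_C^{*} - \hat{p}_E - \delta_0 - CC}{\sqrt{\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0)}} \tag{3} \]
where
\[ \begin{aligned}
CC &= \frac{1}{2}\left(\frac{1}{N_{\mathrm{eff}}}\right), \\
N_{\mathrm{eff}} &= K\rho + (N_E + N_{C,\mathrm{eff}})(1 - \rho), \\
N_{C,\mathrm{eff}} &= \frac{N_C}{VIF_C}, \\
\rho &= \rho(\tilde{p}_C, \tilde{p}_E) = \frac{\mathrm{Cov}(\tilde{p}_C, \tilde{p}_E)}{\sqrt{\dfrac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C}\, VIF_C}\;\sqrt{\dfrac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E}}}.
\end{aligned} \]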
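The adjusted statistic, with and without the continuity correction, can be sketched in Python as follows. This is a hedged illustration rather than the study's implementation: the function name, argument names, and the variable name `n_effective` are hypothetical, and the covariance, the variance inflation factor, and the restricted MLEs are supplied as inputs instead of being estimated from matched data.

```python
import math

def afm_statistic(p_c_star_hat, p_e_hat, p_c_tilde, p_e_tilde,
                  n_c, n_e, k, vif_c, cov, delta0, continuity=False):
    """Adjusted Farrington-Manning Z statistic.

    p_c_star_hat : weighted control event proportion
    p_e_hat      : experimental event proportion
    p_c_tilde, p_e_tilde : restricted MLEs under H0
    k            : number of matched sets
    cov          : covariance between the two event rates (assumed given)
    """
    var_c = p_c_tilde * (1 - p_c_tilde) / n_c * vif_c
    var_e = p_e_tilde * (1 - p_e_tilde) / n_e
    variance = var_c + var_e - 2 * cov

    cc = 0.0
    if continuity:
        rho = cov / (math.sqrt(var_c) * math.sqrt(var_e))
        n_c_eff = n_c / vif_c
        # Equals k when rho = 1 and n_e + n_c_eff when rho = 0.
        n_effective = k * rho + (n_e + n_c_eff) * (1 - rho)
        cc = 0.5 / n_effective

    return (p_c_star_hat - p_e_hat - delta0 - cc) / math.sqrt(variance)

# Hypothetical inputs: independent groups (cov = 0, VIF = 1), 100 per group.
z_plain = afm_statistic(0.20, 0.25, 0.20, 0.30, 100, 100, 100, 1.0, 0.0, -0.1)
z_cc = afm_statistic(0.20, 0.25, 0.20, 0.30, 100, 100, 100, 1.0, 0.0, -0.1,
                     continuity=True)
```

With zero covariance and a variance inflation factor of one, the uncorrected statistic has the same form as (1); a positive covariance shrinks the variance and therefore increases the statistic.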
We used the estimated risk difference and standard error of the risk difference based on each of the aforementioned methods to also estimate 95% confidence intervals of the risk difference.
For each scenario, we simulated 25,000 data sets. When the true risk difference was equal to the non-inferiority margin $\delta_0 = -0.1$ (i.e., under the null hypothesis), we estimated the empirical type I error rate as the proportion of simulated data sets in which the null hypothesis was rejected at a significance level of 0.025. Owing to our use of 25,000 simulated data sets, an empirical type I error rate that was less than 0.023 or greater than 0.027 would be classified as being significantly different from 0.025. When the true risk difference was equal to 0 (i.e., under the alternative hypothesis), with non-inferiority margin $\delta_0 = -0.1$, we estimated empirical power as the proportion of simulated data sets in which the null hypothesis was rejected at a significance level of 0.025. For each scenario, we also estimated the empirical coverage of the 95% confidence intervals of the risk difference as the proportion of estimated 95% confidence intervals that contained the true risk difference. We also determined the mean width of the estimated 95% confidence intervals across the 25,000 simulated data sets. In addition, the bias (i.e., $\text{Estimated RD} - \text{True RD}$) of the unweighted (used in the FM method) and weighted (used in the AFM method) estimates of the risk difference was computed across the 25,000 simulated data sets. Bias < 0 indicates bias towards the null.
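The 0.023 and 0.027 limits for the empirical type I error rate are consistent with a normal-approximation interval around the nominal rate. The sketch below assumes that this is how the limits were obtained (a binomial standard error with a 1.96 multiplier); the function name is hypothetical.

```python
import math

def monte_carlo_limits(alpha=0.025, n_sims=25000, z_crit=1.96):
    """Limits outside which an empirical rate differs significantly from alpha.

    Assumption: normal approximation to the binomial proportion.
    """
    se = math.sqrt(alpha * (1 - alpha) / n_sims)
    return alpha - z_crit * se, alpha + z_crit * se

lower, upper = monte_carlo_limits()
```

Under these assumptions the limits round to 0.023 and 0.027 at three decimal places.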
Finally, we compared the standard deviation of the empirical sampling distribution of the estimated risk difference (i.e. the standard deviation of the 25,000 estimated risk differences across the simulated data sets) with the mean of the estimated standard errors of the estimated risk difference.
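The performance measures described in this section can be computed from the per-data-set results along the following lines. This is a minimal sketch: the function name, argument names, and toy inputs are hypothetical, and the 95% confidence intervals are assumed to be Wald-type intervals (estimate plus or minus 1.96 standard errors).

```python
import statistics

def summarize_simulation(estimates, std_errors, rejected, true_rd, z_crit=1.96):
    """Summarise one simulation scenario.

    estimates  : estimated risk differences, one per simulated data set
    std_errors : estimated standard errors of the risk difference
    rejected   : booleans, True when H0 was rejected at the 0.025 level
    true_rd    : true risk difference used to generate the data
    """
    lowers = [e - z_crit * s for e, s in zip(estimates, std_errors)]
    uppers = [e + z_crit * s for e, s in zip(estimates, std_errors)]
    covered = [lo <= true_rd <= hi for lo, hi in zip(lowers, uppers)]
    return {
        # Type I error when true_rd equals the margin, power when it is 0.
        "rejection_rate": sum(rejected) / len(rejected),
        "coverage": sum(covered) / len(covered),
        "mean_ci_width": statistics.fmean(hi - lo for lo, hi in zip(lowers, uppers)),
        "bias": statistics.fmean(estimates) - true_rd,
        "empirical_sd": statistics.stdev(estimates),
        "mean_se": statistics.fmean(std_errors),
    }

# Hypothetical toy results for four simulated data sets.
summary = summarize_simulation(
    estimates=[0.0, 0.1, -0.1, 0.05],
    std_errors=[0.05, 0.05, 0.05, 0.05],
    rejected=[True, False, False, True],
    true_rd=0.0,
)
```

Comparing `empirical_sd` with `mean_se` mirrors the final check in this section: if the standard error estimator is accurate, the two values should be close.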