Statistical analyses - Monte Carlo Simulation Study Methods

2. COMPARING A METHOD FOR INDEPENDENT DATA TO A METHOD FOR

2.3. Monte Carlo Simulation Study Methods

2.3.2. Statistical analyses

For each simulated data set, we performed the following non-inferiority hypothesis test on the propensity-score matched sample:

$$H_0: p_C - p_E \le \delta_0 \quad \text{vs.} \quad H_1: p_C - p_E > \delta_0$$

where $p_C$ and $p_E$ are the true event probabilities under the control and experimental treatments, respectively, for subjects in whom the ATT is defined, and $\delta_0 < 0$ is the non-inferiority margin. Without loss of generality, we assume that a smaller event probability denotes greater safety. In the propensity-score matched sample, we used two statistical methods to perform the non-inferiority test. First, we used the method proposed by Farrington and Manning (Farrington & Manning, 1990), which assumes independent treatment groups and is based on restricted maximum likelihood estimation:

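$$Z_{FM} = \frac{\hat{p}_C - \hat{p}_E - \delta_0}{\sqrt{\widehat{\mathrm{Var}}_{FM}(\hat{p}_C - \hat{p}_E - \delta_0)}} \qquad (1)$$

where

$$\widehat{\mathrm{Var}}_{FM}(\hat{p}_C - \hat{p}_E - \delta_0) = \frac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C} + \frac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E}.$$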

Here $\hat{p}_C$ is the proportion of control subjects in the matched sample with the event of interest, $\hat{p}_E$ is the proportion of experimental subjects in the matched sample with the event of interest, and $N_C$ and $N_E$ are the numbers of control and experimental subjects, respectively, in the matched sample.

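The restricted estimates are

$$\tilde{p}_C = 2u_{fm}\cos(w_{fm}) - \frac{b_{fm}}{3a_{fm}}, \qquad \tilde{p}_E = \tilde{p}_C - \delta_0,$$

where

$$\theta = \frac{N_E}{N_C}, \qquad a_{fm} = 1 + \theta,$$

$$b_{fm} = -\left(1 + \theta + \hat{p}_C + \theta\hat{p}_E + \delta_0(\theta + 2)\right),$$

$$c_{fm} = \delta_0^2 + \delta_0(2\hat{p}_C + \theta + 1) + \hat{p}_C + \theta\hat{p}_E,$$

$$d_{fm} = -\delta_0(1 + \delta_0)\hat{p}_C,$$

$$v_{fm} = \frac{b_{fm}^3}{(3a_{fm})^3} - \frac{b_{fm}c_{fm}}{6a_{fm}^2} + \frac{d_{fm}}{2a_{fm}},$$

$$u_{fm} = \mathrm{sign}(v_{fm})\sqrt{\frac{b_{fm}^2}{(3a_{fm})^2} - \frac{c_{fm}}{3a_{fm}}},$$

$$w_{fm} = \frac{\pi + \cos^{-1}\left(v_{fm}/u_{fm}^3\right)}{3}.$$

$\tilde{p}_C$ and $\tilde{p}_E$ are the restricted MLEs of $p_C$ and $p_E$ under the null hypothesis that the risk difference equals $\delta_0$. The test statistic $Z_{FM}$ is asymptotically standard normal under the null hypothesis.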
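The closed-form restricted MLEs and the statistic in (1) can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function names `fm_restricted_mle` and `z_fm` are hypothetical, the clipping step before `acos` is an added numerical safeguard, and degenerate inputs (for example, observed proportions for which $u_{fm} = 0$) are not handled.

```python
import math


def fm_restricted_mle(p_hat_c, p_hat_e, n_c, n_e, delta0):
    """Restricted MLEs (p_tilde_c, p_tilde_e) under H0: p_C - p_E = delta0.

    Hypothetical helper; follows the closed-form cubic solution in the text.
    """
    theta = n_e / n_c
    a = 1.0 + theta
    b = -(1.0 + theta + p_hat_c + theta * p_hat_e + delta0 * (theta + 2.0))
    c = delta0 ** 2 + delta0 * (2.0 * p_hat_c + theta + 1.0) + p_hat_c + theta * p_hat_e
    d = -delta0 * (1.0 + delta0) * p_hat_c
    v = b ** 3 / (3.0 * a) ** 3 - b * c / (6.0 * a ** 2) + d / (2.0 * a)
    # copysign treats v == 0 as positive (an assumption; sign(0) is 0 in the formula).
    u = math.copysign(1.0, v) * math.sqrt(b ** 2 / (3.0 * a) ** 2 - c / (3.0 * a))
    # Clip to [-1, 1] so rounding error cannot push acos out of its domain.
    ratio = max(-1.0, min(1.0, v / u ** 3))
    w = (math.pi + math.acos(ratio)) / 3.0
    p_tilde_c = 2.0 * u * math.cos(w) - b / (3.0 * a)
    return p_tilde_c, p_tilde_c - delta0


def z_fm(x_c, n_c, x_e, n_e, delta0):
    """Farrington-Manning statistic (1) from event counts and group sizes."""
    p_hat_c, p_hat_e = x_c / n_c, x_e / n_e
    pt_c, pt_e = fm_restricted_mle(p_hat_c, p_hat_e, n_c, n_e, delta0)
    var_fm = pt_c * (1.0 - pt_c) / n_c + pt_e * (1.0 - pt_e) / n_e
    return (p_hat_c - p_hat_e - delta0) / math.sqrt(var_fm)


if __name__ == "__main__":
    # Illustrative counts only (not data from the study).
    print(fm_restricted_mle(0.30, 0.35, 200, 200, -0.1))
    print(z_fm(60, 200, 70, 200, -0.1))
```

Because $H_1$ is $p_C - p_E > \delta_0$, large positive values of the statistic favor non-inferiority.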

Second, we used a method proposed in chapter one of this dissertation which adjusts the Farrington-Manning test statistic (1) to account for the potential correlation within matched subjects:

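$$Z_{AFM} = \frac{\hat{p}_C^{*} - \hat{p}_E - \delta_0}{\sqrt{\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0)}} \qquad (2)$$

where

$$\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0) = \frac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C}\,VIF_C + \frac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E} - 2\,\mathrm{Cov}(\hat{p}_C, \hat{p}_E).$$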

$\hat{p}_C^{*}$ is the weighted event probability for the control group in the matched sample, where each control receives a weight equal to the inverse of the number of controls matched to the same experimental subject (Stuart, Matching Methods for Causal Inference: A Review and a Look Forward, 2010) (Austin, Statistical Criteria for Selecting the Optimal Number of Untreated Subjects Matched to Each Treated Subject When Using Many-to-One Matching on the Propensity Score, 2010) (Austin, Assessing balance in measured baseline covariates when using many-to-one matching on the propensity-score, 2008). The proposed statistic uses the formula of Eldridge et al. (Eldridge, Ashby, & Kerry, 2006) to calculate the variance inflation factor (VIF), which adjusts the variance of the control group's event rate to account for the correlation among controls in the same matched set. The formula accommodates unequal set sizes (i.e., variable-ratio matching, in which the number of controls matched to each treated subject may vary from one matched set to another) by using the coefficient of variation of the set size,

$$VIF_C = 1 + \left(\left[\left\{cv_C^2\,\frac{K - 1}{K} + 1\right\}\bar{n}_C\right] - 1\right) ICC_C,$$

where $cv_C = s_{n_C}/\bar{n}_C$ is the coefficient of variation of the set size in the control group. The variance of the set size in the control group is

$$s_{n_C}^2 = \frac{\sum_{k=1}^{K}(n_{Ck} - \bar{n}_C)^2}{K - 1}.$$

$K$ is the number of matched sets, $\bar{n}_C$ is the average number of controls matched to each treated subject, and $n_{Ck}$ is the number of controls in matched set $k$. Wu et al. (Wu, Crespi, & Wong, 2012) presented several methods for estimating the intra-class correlation coefficient for binary responses. The proposed statistic uses a modified version of the Fleiss-Cuzick estimator for the intra-class correlation coefficient ($ICC_C$) of the control group. The modified Fleiss-Cuzick estimator is based on the Farrington-Manning restricted MLE of the event rate in the control group instead of the unrestricted MLE. The formula for $ICC_C$ is

$$ICC_C = 1 - \frac{\sum_{k=1}^{K} x_{Ck}(n_{Ck} - x_{Ck})/n_{Ck}}{(N_C - K)\,\tilde{p}_C(1 - \tilde{p}_C)},$$

where $x_{Ck}$ is the number of events in the control group in matched set $k$. To account for the possible correlation between matched subjects in the control and experimental groups, the proposed statistic uses a modified version of Obuchowski's method to estimate the covariance between the event rates of the control and experimental groups. The proposed statistic uses the Farrington-Manning RMLEs ($\tilde{p}_C$ and $\tilde{p}_E$) in estimating the covariance:

$\mathrm{Cov}(\hat{p}_C, \hat{p}_E) = K \cdots$
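The weighted control event probability, the modified Fleiss-Cuzick ICC, and the variance inflation factor described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the matched sets are represented by two parallel lists (`x_ck` for control events and `n_ck` for control counts per set), and the weighted proportion is assumed to be normalized by the total weight, which equals $K$ when each control in set $k$ has weight $1/n_{Ck}$. The covariance term is not part of this sketch.

```python
import math


def weighted_control_rate(x_ck, n_ck):
    """Weighted control event probability with weights 1/n_Ck.

    Assumption: the weighted mean is normalized by the sum of weights (= K).
    """
    k_sets = len(n_ck)
    return sum(x / n for x, n in zip(x_ck, n_ck)) / k_sets


def icc_modified_fleiss_cuzick(x_ck, n_ck, p_tilde_c):
    """Modified Fleiss-Cuzick ICC using the restricted MLE p_tilde_c.

    Requires N_C > K, i.e. at least one set with more than one control.
    """
    k_sets = len(n_ck)
    n_total = sum(n_ck)
    within = sum(x * (n - x) / n for x, n in zip(x_ck, n_ck))
    return 1.0 - within / ((n_total - k_sets) * p_tilde_c * (1.0 - p_tilde_c))


def vif_control(n_ck, icc_c):
    """Variance inflation factor allowing for unequal matched-set sizes."""
    k_sets = len(n_ck)
    n_bar = sum(n_ck) / k_sets
    s2 = sum((n - n_bar) ** 2 for n in n_ck) / (k_sets - 1)
    cv = math.sqrt(s2) / n_bar
    return 1.0 + ((cv ** 2 * (k_sets - 1) / k_sets + 1.0) * n_bar - 1.0) * icc_c


if __name__ == "__main__":
    # Toy matched sets (illustrative only): events and controls per set.
    x_ck, n_ck = [1, 0, 2, 1], [2, 1, 3, 2]
    icc = icc_modified_fleiss_cuzick(x_ck, n_ck, p_tilde_c=0.45)
    print(weighted_control_rate(x_ck, n_ck), icc, vif_control(n_ck, icc))
```

With equal set sizes the coefficient of variation is zero and the factor reduces to $1 + (\bar{n}_C - 1)\,ICC_C$.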

A continuity correction may improve the normal approximation of the above proposed statistic for data with very small sample sizes or very highly correlated matched subjects. The proposed test statistic with the continuity correction is

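$$Z_{AFM\_CC} = \frac{\hat{p}_C^{*} - \hat{p}_E - \delta_0 - CC}{\sqrt{\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0)}} \qquad (3)$$

where

$$CC = \frac{1}{2}\left(\frac{1}{N_{Total}^{eff}}\right), \qquad N_{Total}^{eff} = K\rho + (N_E + N_C^{eff})(1 - \rho), \qquad N_C^{eff} = \frac{N_C}{VIF_C},$$

$$\rho = \rho(\hat{p}_C, \hat{p}_E) = \frac{\mathrm{Cov}(\hat{p}_C, \hat{p}_E)}{\sqrt{\dfrac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C}\,VIF_C}\;\sqrt{\dfrac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E}}}.$$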
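Statistics (2) and (3) can be assembled from their components as in the sketch below. It is a hedged illustration: the function name `z_afm` and all argument names are hypothetical, and the covariance estimate is passed in as an input (`cov_ce`) rather than computed here.

```python
import math


def z_afm(p_star_c, p_hat_e, p_tilde_c, p_tilde_e, n_c, n_e, k_sets,
          vif_c, cov_ce, delta0, continuity=False):
    """Adjusted Farrington-Manning statistic (2), or (3) if continuity=True.

    cov_ce is the estimated covariance between the control and experimental
    event rates; it must be supplied by the caller.
    """
    var_c = p_tilde_c * (1.0 - p_tilde_c) / n_c * vif_c  # inflated control variance
    var_e = p_tilde_e * (1.0 - p_tilde_e) / n_e
    var_afm = var_c + var_e - 2.0 * cov_ce
    numerator = p_star_c - p_hat_e - delta0
    if continuity:
        rho = cov_ce / (math.sqrt(var_c) * math.sqrt(var_e))
        n_c_eff = n_c / vif_c
        n_total_eff = k_sets * rho + (n_e + n_c_eff) * (1.0 - rho)
        numerator -= 0.5 / n_total_eff  # CC = (1/2) * (1 / N_total_eff)
    return numerator / math.sqrt(var_afm)


if __name__ == "__main__":
    # Illustrative inputs only (not results from the study).
    args = dict(p_star_c=0.30, p_hat_e=0.35, p_tilde_c=0.277, p_tilde_e=0.377,
                n_c=400, n_e=200, k_sets=200, vif_c=1.3, cov_ce=0.0002,
                delta0=-0.1)
    print(z_afm(**args), z_afm(**args, continuity=True))
```

When `vif_c` is 1 and `cov_ce` is 0, the uncorrected statistic has the same form as (1), which gives a quick consistency check on an implementation.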

We used the estimated risk difference and standard error of the risk difference based on each of the aforementioned methods to also estimate 95% confidence intervals of the risk difference.

For each of the different scenarios, we simulated 25,000 data sets. When the true risk difference was equal to the non-inferiority margin $\delta_0 = -0.1$ (i.e., under the null hypothesis), we estimated the empirical type I error rate as the proportion of simulated data sets in which the null hypothesis was rejected at the 0.025 significance level. Owing to our use of 25,000 simulated data sets, an empirical type I error rate that was less than 0.023 or greater than 0.027 would be classified as significantly different from 0.025. When the true risk difference was equal to 0 (i.e., under the alternative hypothesis), with non-inferiority margin $\delta_0 = -0.1$, we estimated empirical power as the proportion of simulated data sets in which the null hypothesis was rejected at the 0.025 significance level. For each of the scenarios, we also estimated the empirical coverage of the 95% confidence intervals of the risk difference as the proportion of estimated 95% confidence intervals that contained the true risk difference. We also determined the mean width of the estimated 95% confidence intervals across the 25,000 simulated data sets. In addition, the bias (i.e., estimated RD − true RD) of the unweighted (used in the FM method) and weighted (used in the AFM method) estimates of the risk difference was computed across the 25,000 simulated data sets. Bias < 0 indicates bias towards the null.
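The 0.023 and 0.027 thresholds quoted above are consistent with a normal-approximation interval around the nominal rate. The sketch below assumes that this is how the thresholds were obtained (the text does not state the derivation); the function name is hypothetical.

```python
import math


def type1_error_bounds(alpha=0.025, n_sims=25_000, z_crit=1.96):
    """Normal-approximation limits for an empirical rejection rate.

    Assumption: the number of rejections is binomial(n_sims, alpha), so the
    empirical rate has standard error sqrt(alpha * (1 - alpha) / n_sims).
    """
    se = math.sqrt(alpha * (1.0 - alpha) / n_sims)
    return alpha - z_crit * se, alpha + z_crit * se


if __name__ == "__main__":
    lower, upper = type1_error_bounds()
    print(round(lower, 3), round(upper, 3))
```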

Finally, we compared the standard deviation of the empirical sampling distribution of the estimated risk difference (i.e. the standard deviation of the 25,000 estimated risk differences across the simulated data sets) with the mean of the estimated standard errors of the estimated risk difference.
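The performance measures described in this section can be computed from per-data-set results along the following lines. This is a sketch under assumptions that the text does not spell out: the confidence intervals are taken to be Wald-type (estimate ± critical value × standard error), the null hypothesis is taken to be rejected when the test statistic exceeds the 97.5th standard normal percentile, and the function and key names are hypothetical.

```python
import statistics


def summarize_simulations(estimates, std_errors, z_stats, true_rd):
    """Summarize one scenario from per-data-set results (equal-length lists)."""
    n = len(estimates)
    z_crit = statistics.NormalDist().inv_cdf(0.975)
    # Rejection rate: type I error under H0, power under H1.
    rejection_rate = sum(z > z_crit for z in z_stats) / n
    # Assumed Wald-type 95% confidence limits.
    limits = [(e - z_crit * s, e + z_crit * s) for e, s in zip(estimates, std_errors)]
    coverage = sum(lo <= true_rd <= hi for lo, hi in limits) / n
    mean_width = statistics.mean(hi - lo for lo, hi in limits)
    return {
        "rejection_rate": rejection_rate,
        "coverage": coverage,
        "mean_ci_width": mean_width,
        "bias": statistics.mean(estimates) - true_rd,
        "empirical_sd": statistics.stdev(estimates),  # SD of the estimated RDs
        "mean_se": statistics.mean(std_errors),       # mean estimated standard error
    }


if __name__ == "__main__":
    # Toy inputs (illustrative only, not simulation results).
    print(summarize_simulations([0.0, 0.02, -0.02, 0.04], [0.05] * 4,
                                [2.0, 2.4, 1.6, 2.8], true_rd=0.0))
```

A mean estimated standard error close to the empirical standard deviation suggests that the variance estimator is approximately unbiased for the sampling variability of the estimated risk difference.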

In document Assessing non-inferiority via risk difference in one-to-many propensity-score matched studies (Page 83-88)