
2. COMPARING A METHOD FOR INDEPENDENT DATA TO A METHOD FOR

2.3. Monte Carlo Simulation Study Methods

2.3.2. Statistical analyses

For each simulated data set, we performed the following non-inferiority hypothesis test on the propensity-score matched sample:

$$H_0: p_C - p_E \le \delta_0 \quad \text{vs.} \quad H_1: p_C - p_E > \delta_0$$

where $p_C$ and $p_E$ are the true event probabilities under the control and experimental treatments, respectively, for subjects in whom the ATT is defined, and $\delta_0 < 0$ is the non-inferiority margin. Without loss of generality, we assume that a smaller event probability denotes greater safety. In the propensity-score matched sample, we used two statistical methods to perform the non-inferiority test. First, we used the method proposed by Farrington and Manning (Farrington & Manning, 1990), which assumes independent treatment groups and is based on restricted maximum likelihood estimation (restricted MLE):

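$$Z_{FM} = \frac{\hat{p}_C - \hat{p}_E - \delta_0}{\sqrt{\widehat{\mathrm{Var}}_{FM}(\hat{p}_C - \hat{p}_E - \delta_0)}} \qquad (1)$$

where

$$\widehat{\mathrm{Var}}_{FM}(\hat{p}_C - \hat{p}_E - \delta_0) = \frac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C} + \frac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E}.$$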

$\hat{p}_C$ is the proportion of control subjects in the matched sample with the event of interest, $\hat{p}_E$ is the proportion of experimental subjects in the matched sample with the event of interest, and $N_C$ and $N_E$ are the numbers of control and experimental subjects, respectively, in the matched sample.

$$\tilde{p}_C = 2u_{fm}\cos(w_{fm}) - \frac{b_{fm}}{3a_{fm}}, \qquad \tilde{p}_E = \tilde{p}_C - \delta_0,$$

where

$$\begin{aligned}
\theta &= N_E / N_C, \\
a_{fm} &= 1 + \theta, \\
b_{fm} &= -\left(1 + \theta + \hat{p}_C + \theta\hat{p}_E + \delta_0(\theta + 2)\right), \\
c_{fm} &= \delta_0^2 + \delta_0(2\hat{p}_C + \theta + 1) + \hat{p}_C + \theta\hat{p}_E, \\
d_{fm} &= -\delta_0(1 + \delta_0)\hat{p}_C, \\
u_{fm} &= \mathrm{sign}(v_{fm})\sqrt{\frac{b_{fm}^2}{(3a_{fm})^2} - \frac{c_{fm}}{3a_{fm}}}, \\
v_{fm} &= \frac{b_{fm}^3}{(3a_{fm})^3} - \frac{b_{fm}c_{fm}}{6a_{fm}^2} + \frac{d_{fm}}{2a_{fm}}, \\
w_{fm} &= \frac{\pi + \cos^{-1}\left(v_{fm}/u_{fm}^3\right)}{3}.
\end{aligned}$$

$\tilde{p}_C$ and $\tilde{p}_E$ are the restricted MLEs of $p_C$ and $p_E$ under the null hypothesis that the risk difference equals $\delta_0$. The test statistic $Z_{FM}$ is asymptotically standard normal under the null hypothesis.
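The closed-form restricted MLEs and the statistic in (1) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the simulation study; the function names `fm_restricted_mle` and `fm_z` are hypothetical, and degenerate inputs (for example $u_{fm} = 0$) are not handled.

```python
import math

def fm_restricted_mle(p_c_hat, p_e_hat, n_c, n_e, delta0):
    """Restricted MLEs (p_C~, p_E~) under the constraint p_C - p_E = delta0."""
    theta = n_e / n_c
    a = 1 + theta
    b = -(1 + theta + p_c_hat + theta * p_e_hat + delta0 * (theta + 2))
    c = delta0**2 + delta0 * (2 * p_c_hat + theta + 1) + p_c_hat + theta * p_e_hat
    d = -delta0 * (1 + delta0) * p_c_hat
    v = b**3 / (3 * a)**3 - b * c / (6 * a**2) + d / (2 * a)
    # u carries the sign of v, as in the formula for u_fm
    u = math.copysign(math.sqrt(b**2 / (3 * a)**2 - c / (3 * a)), v)
    # Clamp the ratio to [-1, 1] to guard against floating-point overshoot
    ratio = max(-1.0, min(1.0, v / u**3))
    w = (math.pi + math.acos(ratio)) / 3
    p_c_tilde = 2 * u * math.cos(w) - b / (3 * a)
    return p_c_tilde, p_c_tilde - delta0

def fm_z(p_c_hat, p_e_hat, n_c, n_e, delta0):
    """Farrington-Manning statistic (1) with the restricted-MLE variance."""
    p_c_t, p_e_t = fm_restricted_mle(p_c_hat, p_e_hat, n_c, n_e, delta0)
    var = p_c_t * (1 - p_c_t) / n_c + p_e_t * (1 - p_e_t) / n_e
    return (p_c_hat - p_e_hat - delta0) / math.sqrt(var)
```

A quick sanity check on the sketch: when the observed difference already equals $\delta_0$ (for example $\hat{p}_C = 0.2$, $\hat{p}_E = 0.3$, $\delta_0 = -0.1$), the restricted MLEs coincide with the observed proportions and the statistic is zero.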

Second, we used a method proposed in chapter one of this dissertation which adjusts the Farrington-Manning test statistic (1) to account for the potential correlation within matched subjects:

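$$Z_{AFM} = \frac{\hat{p}_C^{*} - \hat{p}_E - \delta_0}{\sqrt{\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0)}} \qquad (2)$$

where

$$\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0) = \frac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C}\,VIF_C + \frac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E} - 2\,\mathrm{Cov}(\hat{p}_C, \hat{p}_E).$$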

π‘Μ‚πΆβˆ— is the weighted event probability for the control group in the matched sample, where each control receives a weight that is the inverse of the number of controls matched to the same experimental subject (Stuart, Matching Methods for Causal Inference: A Review and a Look Forward, 2010) (Austin, Statistical Criteria for Selecting the Optimal Number of Untreated Subjects Matched to Each Treated Subject When Using Many-to-One Matching on the Propensity Score, 2010) (Austin, Assessing balance in measured baseline covariates when using many-to-one matching on the propensity-score, 2008). The proposed statistic uses Eldridge et al’s (Eldridge, Ashby, & Kerry, 2006) formula to calculate the variance inflation factor (VIF) which is used to adjust the variance of the control group’s event rate to account for the correlation among controls in the same matched set. The formula takes into account the possibility of unequal set sizes (i.e. variable-ratio matching, where the number of controls matched to each treated subject is allowed to vary from one matched set to another) by using the coefficient of variation of the set size,

$$VIF_C = 1 + \left(\left[\left\{cv_C^2\,\frac{K - 1}{K} + 1\right\}\bar{n}_C\right] - 1\right) ICC_C,$$

where $cv_C = s_{n_C}/\bar{n}_C$ is the coefficient of variation of the set size in the control group. The variance of the set size in the control group is

$$s_{n_C}^2 = \frac{\sum_{k=1}^{K}(n_{Ck} - \bar{n}_C)^2}{K - 1}.$$

$K$ is the number of matched sets, $\bar{n}_C$ is the average number of controls matched to each treated subject, and $n_{Ck}$ is the number of controls in matched set $k$. Wu et al. (Wu, Crespi, & Wong, 2012) presented several methods for estimating the intra-class correlation coefficient for binary responses. The proposed statistic uses a modified version of the Fleiss-Cuzick estimator for the intra-class correlation coefficient ($ICC_C$) of the control group. The modified Fleiss-Cuzick estimator is based on the Farrington-Manning restricted MLE of the event rate in the control group instead of the unrestricted MLE. The formula for the $ICC_C$ is

$$ICC_C = 1 - \frac{\sum_{k=1}^{K} \dfrac{x_{Ck}(n_{Ck} - x_{Ck})}{n_{Ck}}}{(N_C - K)\,\tilde{p}_C(1 - \tilde{p}_C)}$$

where $x_{Ck}$ is the number of events in the control group in matched set $k$. To account for the possible correlation between matched subjects in the control and experimental groups, the proposed statistic uses a modified version of Obuchowski's method to estimate the covariance between the event rates of the control group and the experimental group. The proposed statistic uses the Farrington-Manning restricted MLEs ($\tilde{p}_C$ and $\tilde{p}_E$) in estimating the covariance:

πΆπ‘œπ‘£(𝑝̂, 𝑝𝐢 Μ‚) =𝐸 𝐾

A continuity correction may improve the normal approximation of the above proposed statistic for data with very small sample sizes or very highly correlated matched subjects. The proposed test statistic with the continuity correction is

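$$Z_{AFMCC} = \frac{\hat{p}_C^{*} - \hat{p}_E - \delta_0 - CC}{\sqrt{\widehat{\mathrm{Var}}_{AFM}(\hat{p}_C^{*} - \hat{p}_E - \delta_0)}} \qquad (3)$$

where

$$\begin{aligned}
CC &= \frac{1}{2}\left(\frac{1}{N_{Total}^{eff}}\right), \\
N_{Total}^{eff} &= K\rho + \left(N_E + N_C^{eff}\right)(1 - \rho), \\
N_C^{eff} &= \frac{N_C}{VIF_C}, \\
\rho = \rho(\hat{p}_C, \hat{p}_E) &= \frac{\mathrm{Cov}(\hat{p}_C, \hat{p}_E)}{\sqrt{\dfrac{\tilde{p}_C(1 - \tilde{p}_C)}{N_C}\,VIF_C}\;\sqrt{\dfrac{\tilde{p}_E(1 - \tilde{p}_E)}{N_E}}}.
\end{aligned}$$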
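The continuity-corrected statistic can be sketched as below, taking the covariance estimate, the VIF, and the restricted MLEs as already-computed inputs. The function names are hypothetical, and the reading of $N_C^{eff}$ as $N_C$ divided by $VIF_C$ (an effective sample size) is an interpretation of the formula above rather than something the text states explicitly.

```python
import math

def continuity_correction(cov, p_c_tilde, p_e_tilde, n_c, n_e, vif_c, k):
    """CC = 1 / (2 * N_total_eff), built from rho and the effective sizes."""
    sd_c = math.sqrt(p_c_tilde * (1 - p_c_tilde) / n_c * vif_c)
    sd_e = math.sqrt(p_e_tilde * (1 - p_e_tilde) / n_e)
    rho = cov / (sd_c * sd_e)
    n_c_eff = n_c / vif_c  # assumed: effective number of controls
    n_total_eff = k * rho + (n_e + n_c_eff) * (1 - rho)
    return 0.5 / n_total_eff

def afm_cc_z(p_c_star, p_e_hat, delta0, cov,
             p_c_tilde, p_e_tilde, n_c, n_e, vif_c, k):
    """Adjusted Farrington-Manning statistic (3) with continuity correction."""
    var = (p_c_tilde * (1 - p_c_tilde) / n_c * vif_c
           + p_e_tilde * (1 - p_e_tilde) / n_e
           - 2 * cov)
    cc = continuity_correction(cov, p_c_tilde, p_e_tilde, n_c, n_e, vif_c, k)
    return (p_c_star - p_e_hat - delta0 - cc) / math.sqrt(var)
```

When the covariance is zero and $VIF_C = 1$, the effective total sample size is simply $N_E + N_C$, so the correction behaves like a conventional $1/(2N)$ adjustment.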

We used the estimated risk difference and standard error of the risk difference based on each of the aforementioned methods to also estimate 95% confidence intervals of the risk difference.
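The text does not state the form of the interval. Assuming a Wald-type interval (estimate plus or minus a standard normal quantile times the standard error), a minimal sketch looks like this; `wald_ci` is a hypothetical name.

```python
from statistics import NormalDist

def wald_ci(rd_estimate, se, level=0.95):
    """Wald-type interval: estimate +/- z * SE (assumed form, see lead-in)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    return rd_estimate - z * se, rd_estimate + z * se
```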

For each of the different scenarios, we simulated 25,000 data sets. When the true risk difference was equal to the non-inferiority margin $\delta_0 = -0.1$ (i.e., under the null hypothesis), we estimated the empirical type I error rate as the proportion of simulated data sets in which the null hypothesis was rejected at the 0.025 significance level. Owing to our use of 25,000 simulated data sets, an empirical type I error rate that was less than 0.023 or greater than 0.027 would be classified as being significantly different from 0.025. When the true risk difference was equal to 0 (i.e., under the alternative hypothesis), with non-inferiority margin $\delta_0 = -0.1$, we estimated empirical power as the proportion of simulated data sets in which the null hypothesis was rejected at the 0.025 significance level. For each of the scenarios, we also estimated the empirical coverage of the 95% confidence intervals of the risk difference as the proportion of estimated 95% confidence intervals that contained the true risk difference. We also determined the mean width of the estimated 95% confidence intervals across the 25,000 simulated data sets. In addition, the bias (i.e., estimated RD − true RD) of the unweighted (used in the FM method) and weighted (used in the AFM method) estimates of the risk difference was computed across the 25,000 simulated data sets. Bias < 0 indicates bias towards the null.
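The 0.023 and 0.027 thresholds are consistent with a normal-approximation 95% interval for a binomial proportion of 0.025 over 25,000 replicates; that derivation is an assumption, since the text gives only the thresholds. The sketch below reproduces the band and computes the per-scenario summaries described above from a list of simulation results. The names `mc_band` and `summarize`, and the tuple layout of each result, are hypothetical.

```python
import math
from statistics import mean

def mc_band(alpha=0.025, n_sims=25000, z=1.96):
    """Band outside which an empirical rate differs significantly from alpha."""
    half_width = z * math.sqrt(alpha * (1 - alpha) / n_sims)
    return alpha - half_width, alpha + half_width

def summarize(results, true_rd):
    """results: list of (rd_estimate, ci_lower, ci_upper, rejected) tuples."""
    return {
        # type I error under the null scenario, power under the alternative
        "rejection_rate": mean(1.0 if r[3] else 0.0 for r in results),
        "coverage": mean(1.0 if r[1] <= true_rd <= r[2] else 0.0
                         for r in results),
        "mean_width": mean(r[2] - r[1] for r in results),
        "bias": mean(r[0] for r in results) - true_rd,
    }
```

For example, `mc_band()` returns roughly (0.0231, 0.0269), which rounds to the thresholds quoted in the text.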

Finally, we compared the standard deviation of the empirical sampling distribution of the estimated risk difference (i.e. the standard deviation of the 25,000 estimated risk differences across the simulated data sets) with the mean of the estimated standard errors of the estimated risk difference.
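One way to make this comparison concrete is the ratio of the mean estimated standard error to the empirical standard deviation, where a value near 1 suggests the standard error estimator is well calibrated. Using a ratio is an assumption for illustration; the text says only that the two quantities were compared, and `se_calibration_ratio` is a hypothetical name.

```python
from statistics import mean, stdev

def se_calibration_ratio(rd_estimates, se_estimates):
    """Mean estimated SE divided by the empirical SD of the RD estimates."""
    return mean(se_estimates) / stdev(rd_estimates)
```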
