The evaluation measures used to compare the performance of the proposed approaches were computed as following:
1) Average value of estimates;
2) Relative bias computed as the deviation of the average observed value from the true value (parameter) divided by the true value . Positive relative bias represents an overestimate of the parameter and negative bias represents an underestimate of the parameter. The relative bias is not applicable for the ICC estimate when its true value is set as zero.
3) The standard errors of estimated regression coefficients (log odds ratio) were computed as the empirical standard error of 1000 estimate values. Note that the standard error of estimated regression coefficients (log odds ratio) from GEE extensions of ordinal logistic regression is obtained from the sandwich estimator.
4) The type I error rate was calculated as the proportion of simulation samples generated under the null hypothesis which have p-values less than or equal to the nominal 5% significance level.
5) The statistical power was calculated as the proportion of simulation samples generated under the alternative hypothesis which have p-values less than or equal to
the nominal 5% significance level, given that the corresponding test statistic provides a valid type I error rate.
Since 1000 iterations were used, the approximate 95% confidence interval for a five percent rejection rate is (0.031, 0.069). Therefore, the statistical test is overly conservative for type I error rates less than 0.03, and overly liberal for type I error rates greater than 0.07 (Bradley, 1978). Power comparisons were limited to those test statistics with valid type I error rates.
6.4.1
Investigation of the ICC estimators
In this objective, we evaluated properties of the ANOVA and kappa-type ICC estimator. Both spaced scores (i.e.,1,2,3) and midrank scores were considered in calculating the ICC estimators, as summarized in Table 6.3. The subscript ‘M’ denotes the ICC estimator calculated by using midranks. In addition, since Cohen’s regular kappa κˆ (Cohen, 1960) is also a frequently used statistic measuring the agreement of ordinal outcomes, we compared properties of ρˆ and κ ρˆ to A κˆ as well. The ICC estimators, defined in Chapter 2, are compared were listed in Table 6.3 in terms of the average, relative bias, range, and standard error.
Since negative ICCs are generally considered implausible in the context of cluster randomization trials, negative ICC estimators are usually set to zero. In the present study, we also followed this practice.
Table 6.3: ICC estimators evaluated in simulation study
6.4.2
Evaluation of test statistics
Adjusted Cochran-Armitage tests
ICC estimator With scores 1,2,3 With midrank scores ANOVA ICC estimator
A
ρˆ ρˆA(M)
Kappa-type ICC estimator
κ
We evaluated the performance of three direct-adjusted Cochran-Armitage test statistics for clustered ordinal outcome data. Both ρˆ and κ ρˆ were used as estimates of the ICC in A
the test statistics. In addition, since both equally-spaced score and midrank scores were considered for ρˆ and κ ρˆ , adjusted Cochran-Armitage test statistics were also calculated A
using these two scoring schemes respectively. The twelve test statistics evaluated under this objective are listed in Table 6.4, and compared in terms of Type I errors and statistical power.
As reviewed in section 1.5.1, Jung and Kang (2001) proposed another simple adjustment to the Cochran-Armitage test for clustered ordinal outcomes. We also compared adjusted Cochran-Armitage tests to Jung and Kang (2001)’s approach (i.e., χJ2 and
χ
J2(M)).Table 6.4: Adjusted Cochran-Armitage test statistics evaluated in simulation With ρˆA With ρˆκ With ρˆA(M) With ρˆk(M)
Adjusted Cochran-Armitage test (I) 2
1 A
χ
2 1 κχ
χ
A21(M) 2 ) ( 1M κχ
Adjusted Cochran-Armitage test (II)χ
A22χ
κ22χ
A22(M)χ
κ22(M) WLS Cochran-Armitage test χA23 χκ23χ
A23(M)χ
κ23(M)Comparisons of model-based approaches
Test statistics from marginal extensions of proportional odds regression models and mixed-effect proportional odds regression models (i.e., cluster-specific models) were compared in terms of type I error rates and power. In particular, marginal models were fitted by the GEE approach using an independent working correlation. Since mixed effects linear models are commonly used to fit clustered ordinal outcomes, we also compared marginal and cluster-specific models to random effects linear models, where the test statistic is the t-statistic expressed as the ratio of the parameter estimate to its standard error. The SAS procedures used in the above marginal, cluster-specific, and random effects linear models were PROC GENMOD, PROC NLMIXED, and PROC GLIMMIX respectively. The test statistics under this objective are listed as the first four
statistics in Table 6.5, denoted as WM , WR,χCS2 , and TLinear correspondingly. They were
previously defined in Chapter 4.
Table 6.5: Model-based test statistics evaluated in simulation study Test statistic Test name
M
W Model-based Wald test statistic
R
W Robust Wald test statistics
R
S Robust score test statistic
2 CS
χ Chi-square test statistic from cluster-specific models
Linear
T T-test statistic from random effects linear model
1 BC
W Bias-corrected robust Wald test: Approach 1
2 BC
W Bias-corrected robust Wald test: Approach 2
3 BC
W Bias-corrected robust Wald test: Approach 3
4 BC
W Bias-corrected robust Wald test: Approach 4
5 BC
W Bias-corrected robust Wald test: Approach 5
1 df
W Degrees-of-freedom-adjusted Wald test: Approach 1
2 df
W Degrees-of-freedom-adjusted Wald test: Approach 2
3 df
W Degrees-of-freedom-adjusted Wald test: Approach 3
4 df
W Degrees-of-freedom-adjusted Wald test: Approach 4
5 df
W Degrees-of-freedom-adjusted Wald test: Approach 5
BC
S Modified robust score test
In particular, we give special attention to the magnitudes of the regression coefficient estimate βˆ and its standard error in the marginal and cluster-specific models (Ten Have
et al., 1996), as discussed in section 4.4. Note that the standard error of estimated regression coefficients (i.e., log odds ratios) from GEE extensions of ordinal logistic regression is obtained from the sandwich estimator. The estimates are listed in Table 6.6.
Table 6.6: Regression coefficient estimates and their standard errors from marginal and cluster-specific models
Approaches Parameter estimate Standard errors
Marginal models GEE
β
ˆ SE(β
ˆGEE) Cluster-specific models CSβ
ˆ SE(β
ˆCS)Evaluation of small-sample adjustments to GEE
Five bias-corrected and four degrees-of-freedom-adjusted approaches for the robust Wald test and one correction approach for the robust score test discussed in Chapter 5 were investigated. Therefore, including four model-based test statistics discussed previously (i.e., WM, WR,WCS, and TLinear), a total of 16 model-based test statistics were compared
and summarized in Table 6.5. They were previously defined in Chapter 5.