The systematic review of previous simulation studies in the last chapter found conflicting recommendations. Four reasons for this were suggested, which we address in this new study. (1) There was a conflict of interest in most studies because they compared existing methods with newly proposed ones. To address this, we only compare pre-existing methods in our study. (2) Most studies compared only a small subset of the methods available, so we include a comprehensive list. (3) Simulations were often not representative of real meta-analyses, so we define parameter values for simulations based on meta-analyses seen in practice. (4) Studies did not address the fact that all methods are very imprecise in typical meta-analyses. They failed to address this issue because their results focused on the relative performance of methods. We consider both relative and absolute performance in this simulation study.
Meta-analyses are simulated with odds ratio and standardised mean difference study effects to capture properties of heterogeneity variance estimators for a representative range of outcome measures. Novianti et al. [78] was the only study identified in my systematic review that simulated both binary and continuous outcomes. Participant-level data are simulated to ensure that the simulated data are representative of real meta-analyses. Generating participant-level data also ensures that issues with heterogeneity estimation specific to certain types of outcome measure are captured. One such issue is that estimated odds ratio and standardised mean difference study effects are correlated with their variances [3, 8]. This is a particularly large issue in binary outcome meta-analyses with rare events [3].
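The correlation between an estimated odds ratio and its variance can be seen by generating participant-level data for a single study. The following is a minimal sketch, not the simulation code used in this thesis: the function names are hypothetical, and adding 0.5 to every cell when any cell is zero is an assumed continuity correction chosen only for illustration.

```python
import math
import random

def simulate_binary_study(n_per_group, p_control, p_treatment, rng):
    """Simulate participant-level binary outcomes and count events per group."""
    events_c = sum(rng.random() < p_control for _ in range(n_per_group))
    events_t = sum(rng.random() < p_treatment for _ in range(n_per_group))
    return events_t, events_c

def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Return (log odds ratio, estimated variance) from a 2x2 table.

    Assumption for illustration: 0.5 is added to every cell if any cell is zero.
    """
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    estimate = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return estimate, variance

# Usage: one simulated study with a rare event (5% in both groups).
rng = random.Random(2024)
events_t, events_c = simulate_binary_study(50, 0.05, 0.05, rng)
estimate, variance = log_odds_ratio(events_t, 50, events_c, 50)
```

Both the estimate and its variance are computed from the same four cell counts, which is why they are correlated; with rare events the counts are small and the dependence is stronger.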
Our methods for simulating meta-analysis data differ from most previous simulation studies in two key ways. First, we define underlying τ² parameter values that correspond to a consistent range of underlying I² values. We define a range of I² between 0% and 95% to ensure the corresponding range of τ² represents zero, low, moderate and high inconsistency in study effects for all scenarios. Only Kontopantelis et al. [64] have previously taken a similar approach. No guidelines exist for interpreting τ² estimates because the measure cannot be compared between meta-analyses, but the Cochrane Collaboration have issued rough guidelines on interpreting I² values [51]. Second, all previous studies defined the event probability of the control group when simulating binary outcome meta-analyses. In contrast, we define the average event probability across both study groups. In doing so, the rarity of the event depends less on the study effect sizes.
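These two design choices can be illustrated with a short sketch. It is not the thesis's implementation. It assumes the common relationship I² = τ² / (τ² + σ²), where σ² is a "typical" within-study variance supplied by the user, and it solves numerically for the two group event probabilities given an average event probability and a log odds ratio. The function names and the bisection approach are hypothetical choices made for illustration.

```python
import math

def tau2_from_i2(i2, typical_within_var):
    """Return the tau^2 implied by a target I^2.

    Assumes I^2 = tau^2 / (tau^2 + sigma^2), with sigma^2 a typical
    within-study variance (an assumed input, not taken from the thesis).
    """
    if not 0.0 <= i2 < 1.0:
        raise ValueError("i2 must lie in [0, 1)")
    return typical_within_var * i2 / (1.0 - i2)

def group_probs_from_average(p_avg, log_or, tol=1e-12):
    """Find (p_control, p_treatment) whose mean is p_avg and whose
    log odds ratio is log_or, by bisection on p_control."""
    if not 0.0 < p_avg < 1.0:
        raise ValueError("p_avg must lie in (0, 1)")

    def logit(p):
        return math.log(p / (1.0 - p))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    def p_treatment(p_control):
        return expit(logit(p_control) + log_or)

    # The group average is increasing in p_control, so bisection applies.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if (mid + p_treatment(mid)) / 2.0 < p_avg:
            lo = mid
        else:
            hi = mid
    p_control = (lo + hi) / 2.0
    return p_control, p_treatment(p_control)

# Usage: tau^2 values for a grid of I^2, and group probabilities for a
# rare event (average probability 0.05) with an odds ratio of 2.
tau2_grid = [tau2_from_i2(i2, 0.1) for i2 in (0.0, 0.3, 0.6, 0.95)]
p_c, p_t = group_probs_from_average(0.05, math.log(2.0))
```

Fixing the average rather than the control-group probability means that a larger effect moves the two group probabilities apart around the same centre, instead of moving only the treatment group away from a fixed control rate.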
Results from these simulated meta-analyses are presented in the following two chapters. In the next chapter, we comprehensively explore the performance of all included heterogeneity variance estimators. Scenarios are identified where all estimators perform poorly and where they perform well and, in the latter case, which estimators perform better than others. I then investigate how the findings from this analysis apply to real meta-analyses in chapter 8 by combining them with empirical data. Methods for analysing these simulated meta-analysis data are detailed in the two chapters that follow.
Chapter 7
7.1 Introduction
The last chapter detailed the design of a new simulation study to compare heterogeneity variance estimators in random-effects meta-analysis. Details included the methods for simulating meta-analysis data, the heterogeneity variance estimators compared and the performance measures used for comparisons. The study design is based on findings from a systematic review of previous simulation studies in chapter 5 and input from collaborators. In this chapter, the results of this study are presented.
A number of heterogeneity variance estimators are excluded from the main results because they are clearly inferior to other estimators; section 7.2 explains the reasons for these exclusions. Also, given the scale of this study, it was only possible to present a subset of all simulated scenarios and performance measures. Reasons for choosing this subset are given in sections 7.3 and 7.4. These exclusions of estimators and results were based on a preliminary exploration of all study results, which are presented more fully in volume II of this thesis.
The main results are given in section 7.5 and split into three parts. First, results that compare estimators in terms of performance measures relating to point estimates of the heterogeneity parameter are presented in section 7.5.1. Mean bias and mean squared error in this section are plotted on a scale proportional to the heterogeneity variance parameter whenever τ² > 0. In other words, mean bias is plotted as a proportion of the true parameter value rather than as an absolute difference from the truth. Similarly, for a proportional mean squared error of (for example) 100%, the average squared error is equal to τ². This allows results to be compared more easily between scenarios with different τ² and helps interpretation. Raw mean bias and mean squared error are presented whenever τ² = 0. After results from the primary performance measures, those relating to estimation of the summary effect are presented in section 7.5.3 and finally, those relating to the confidence interval for the summary effect are in section 7.5.4. Within each section, selected results are presented to give a representative picture of all simulated scenarios and a summary explains how they can be generalised to all scenarios. Results are interpreted from two viewpoints: (1) as a relative comparison of the performance of heterogeneity variance estimators, to reveal those that perform best, and (2) as a general comparison of performance between scenarios, to summarise where all estimators perform well or poorly.
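The proportional and raw scales used for mean bias and mean squared error in section 7.5.1 can be sketched as follows. This is an illustrative reading of the description above, not the code used to produce the results: the function name and return structure are hypothetical, and mean squared error is divided by τ² (not τ⁴) because the text states that a proportional mean squared error of 100% corresponds to an average squared error equal to τ².

```python
def performance_summary(estimates, tau2):
    """Summarise mean bias and mean squared error of tau^2 estimates.

    When tau2 > 0, both measures are divided by tau2 (proportional scale);
    when tau2 == 0, the raw values are returned.
    """
    n = len(estimates)
    mean_bias = sum(est - tau2 for est in estimates) / n
    mean_sq_error = sum((est - tau2) ** 2 for est in estimates) / n
    if tau2 > 0:
        return {
            "scale": "proportional",
            "bias": mean_bias / tau2,
            "mse": mean_sq_error / tau2,
        }
    return {"scale": "raw", "bias": mean_bias, "mse": mean_sq_error}

# Usage: hypothetical estimates from three simulated meta-analyses.
summary = performance_summary([0.05, 0.10, 0.30], tau2=0.10)
```

Reporting on the proportional scale lets a bias of, say, 0.5 be read as "50% of the true heterogeneity variance" regardless of the scenario's τ².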