(or the α (type I) error level); (iii) the statistical power (or the β (type II) error level); and (iv) for continuous outcomes, the assumed standard deviation of the outcome. For cluster randomised trials, the 2004 CONSORT extension additionally recommends reporting two further elements: (v) the number of clusters or the cluster size(s), and (vi) the intracluster correlation coefficient (ICC) or coefficient of variation (k), along with a measure of its uncertainty. The 2012 revision additionally recommends specifying whether equal or unequal cluster sizes are assumed.
1.4
Sample size calculations for cluster randomised trials
For cluster randomised trials there may be several combinations of the number of clusters and the cluster size that produce designs with equivalent power. In these situations, to determine the optimal design, one may additionally consider the efficiency of each design in terms of the costs of recruiting and measuring clusters and individuals within clusters. These scenarios are briefly considered in Chapter Six, but the focus of this thesis is scenarios where the cluster size or the number of clusters is fixed, or constrained, at the point of sample size calculation and the trial is designed to achieve a specified level of power.
In this section I briefly describe the most common approach to sample size calculation for cluster randomised trials, together with the developments that preceded my research and the issues that were then unresolved, to place my research in context. In Chapter Six I return to look in more detail at these developments and at the issues that remain unresolved at the end of my research.
1.4.1
Early work
Cornfield31 recognised that randomisation by cluster results in a less efficient design, so the sample size calculated assuming individual randomisation must be inflated to achieve adequate power under cluster randomisation. Cornfield’s work was followed in 1981 by Donner,52 who quantified this inflation factor and described it as the design effect. Despite being over 30 years old, these two papers remain highly cited, and use of the design effect remains the most common approach to sample size calculation.
1.4.2
The Design Effect (DE)
For continuous and binary outcomes, the sample size calculated assuming individual randomisation (Equations 1.1 or 1.2) is multiplied by the design effect to account for randomisation by cluster. This design effect is given by
DE = 1 + (n − 1)ρ    (1.12)
where n is the number of individuals per cluster (assumed constant) and ρ is the intracluster correlation coefficient. When we conduct a CRT we may sample an entire cluster, such as all participants registered at a General Practice, or take a sub-sample for inclusion in the trial. Throughout this thesis, when I refer to cluster size, I am specifically referring to the sample from the cluster that is to be included in the analysis, which may or may not be the entire cluster.
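As an illustration of Equation 1.12, the following minimal Python sketch inflates an individually randomised sample size by the design effect and converts it to a number of clusters. The function names and all numerical inputs (a per-arm sample size of 128 and ρ = 0.05) are hypothetical and chosen only for illustration; they are not taken from any trial discussed in this thesis.

```python
import math


def design_effect(cluster_size: int, icc: float) -> float:
    """Design effect DE = 1 + (n - 1) * rho for a constant cluster size n."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1) * icc


def clusters_per_arm(n_individual: float, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm after inflating an individually randomised sample size."""
    inflated = n_individual * design_effect(cluster_size, icc)
    # Round up so that the design does not fall short of the inflated sample size.
    return math.ceil(inflated / cluster_size)


# Hypothetical inputs, assumed for illustration only.
n_individual = 128  # per-arm sample size assuming individual randomisation
icc = 0.05          # assumed intracluster correlation coefficient

for n in (10, 20, 50):
    de = design_effect(n, icc)
    k = clusters_per_arm(n_individual, n, icc)
    print(f"n = {n:2d}: DE = {de:.2f}, clusters per arm = {k}, individuals per arm = {k * n}")
```

Under these assumed inputs, larger clusters require fewer clusters per arm but more individuals in total, which is one way in which several combinations of cluster number and cluster size can target the same power.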
1.4.3
Recent developments
The design effect proposed by Donner was derived for continuous or binary data analysed at the cluster level, assuming a fixed cluster size. In some trials, such as those conducted in ophthalmology where a subject is the cluster and measurements are taken on each eye, a fixed cluster size may be a reasonable assumption. However, variable cluster sizes are more common. In trials where the cluster size is highly variable, use of the average cluster size in the design effect will likely underestimate the required sample size. For cluster-level analysis, simple methods are available that provide an appropriate sample size using the harmonic mean of the sample size in each cluster.7 Recent reviews have shown developments in sample size calculations, including methods allowing for: variable cluster sizes, matched designs, re-estimation using internal pilots, attrition, incorporation of covariates or multiple time points, time-to-event outcomes, and incorporation of imprecision in the ICC.8,53–55
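To illustrate why the average cluster size can understate the required sample size, the following Python sketch compares the arithmetic and harmonic means of a set of cluster sizes. It assumes an unweighted cluster-level analysis and a common ICC, in which case the variance of a cluster mean for a cluster of size m, relative to the total outcome variance, is ρ + (1 − ρ)/m, and the average of this quantity over clusters equals its value at the harmonic mean cluster size. This is a sketch under those stated assumptions rather than a reproduction of any specific published method; the cluster sizes, the ICC and the function name are hypothetical.

```python
from statistics import harmonic_mean, mean


def cluster_mean_variance_factor(size: float, icc: float) -> float:
    """Variance of a cluster mean relative to total outcome variance: rho + (1 - rho) / m."""
    return icc + (1.0 - icc) / size


# Hypothetical cluster sizes and ICC, assumed for illustration only.
cluster_sizes = [5, 10, 20, 40, 75]
icc = 0.05

m_arith = mean(cluster_sizes)           # arithmetic mean cluster size
m_harm = harmonic_mean(cluster_sizes)   # harmonic mean cluster size

f_arith = cluster_mean_variance_factor(m_arith, icc)
f_harm = cluster_mean_variance_factor(m_harm, icc)

print(f"arithmetic mean size = {m_arith:.2f}, variance factor = {f_arith:.4f}")
print(f"harmonic mean size   = {m_harm:.2f}, variance factor = {f_harm:.4f}")
print(f"ratio (harmonic / arithmetic) = {f_harm / f_arith:.2f}")
```

Because the harmonic mean never exceeds the arithmetic mean, the variance factor based on the harmonic mean is the larger of the two whenever cluster sizes vary. Under the stated assumptions, the required number of clusters scales with this factor, so planning with the arithmetic mean would give too few clusters.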
1.4.4
Unresolved issues
The methodological reviews indicate that, although many sample size methods are being developed to deal with variations and complexities in trial design, the majority are still only applicable to binary
or continuous outcomes. Methodology for alternative outcomes such as ordinal, count or time-to-event is in the minority, especially for variations on the standard parallel group trial. However, these reviews present only selected methods and do not provide a comprehensive review of all available sample size methods. In Chapter Two I conduct a systematic review to provide a comprehensive description of published sample size methods for cluster randomised trials.
1.4.5
Quality of methodology and reporting
Despite the simplicity of the design effect, many trialists are still unaware of the need to adjust sample size calculations to account for clustering in cluster randomised trials, or are perhaps unaware that the design they are using induces such clustering. Many of the reviews in Table 1.1 examined the quality of both the trial methodology and the reporting. The proportion of trials that reported a sample size calculation and the proportion that reported an appropriate calculation can be seen in Table 1.2. Despite the introduction of the CONSORT Statement, many of the reviews showed that reporting was inadequate. In the largest review, by Ivers et al., whose sample is perhaps the most representative of the health research field, just over half of the trials reported a sample size calculation.30
Although the CONSORT extension for cluster randomised trials was first published in 2004, it is clear from Table 1.2 that sample size calculations that appropriately account for clustering are still reported infrequently. Much improvement in reporting is needed, because poor reporting can make it difficult for those designing trials to obtain the estimates they need for sample size calculations.
In this section I have described the use of the design effect as the most common method for sample size calculation in cluster randomised trials. However, as trial designs become more varied, for example through variable cluster sizes, attrition or repeated measurements, the simple design effect may not always be appropriate. The focus of my research is sample size calculations for ordinal outcomes. In the following section I provide the definition of an ordinal outcome used throughout this research, describe some simple and intuitive approaches to sample size calculation, and explain why they may not always be adequate.