Ordinal outcomes in CRTs

3.2.3 Justification of data source

I chose trial reports as the data source for this review as I considered them to provide a more representative sample of clinical trials than other sources, such as published protocols. However, the level of detail and general reporting quality may be less than ideal given the strict length restrictions often imposed by journals at the time these trials were reported.

I did consider published trial protocols as a data source. The biggest advantage of the trial protocol over the trial report is that the level of detail is likely to be greater, as protocols are not subject to the same length restrictions. The trial protocol may contain detailed information about decisions made during the design stage, for example the justification for dichotomising an ordinal outcome, treating it as continuous, or choosing an alternative outcome. The protocol may also provide insight into whether these decisions were driven by clinical relevance or by the availability and complexity of statistical methods for ordinal outcomes. However, although the publication of trial protocols appears to have become more common in recent years, trials that publish their protocols are more likely to be of high quality and so are unlikely to be a representative sample of all cluster randomised trials. Their use was not considered further.

In Chapter one, twenty-three reviews of cluster randomised trials were summarised in Table 1.1. The number of trials included in each review ranged from 15 to 300. With the availability of these existing reviews, and due to the time limitations imposed upon my research, I chose to use an existing review as my data source rather than conduct a new one. I opted to use the review by Ivers et al [30] for two reasons. Firstly, it was the largest review to have been conducted. Secondly, unlike some of the other reviews, it was less restrictive in the databases searched and the health areas included, and so was more likely to provide a representative sample of published cluster randomised trials across all areas of health research.

3.2. METHODS

3.2.4 Inclusion and exclusion criteria

The search strategy implemented in the review by Ivers et al produced a sample of CRTs that the authors describe as representative of all Medline publications. A publication was included if it was the main study report and was published in an English language journal between the years 2000 and 2008. Trial protocols, pilot studies, and papers that reported only baseline results or secondary analyses were excluded.

For my research I initially considered all 300 trials and then refined this to the subset reporting ordinal outcomes. The collaborative work with the Canadian group was restricted to those trials that reported a sample size calculation.

3.2.5 Data collection

In the original review by Ivers et al, 47% of trials did not identify a primary outcome. In these cases, an outcome was designated as primary for the purpose of data extraction. Where multiple outcomes were specified, the primary outcome was taken to be the one reported first in the abstract or analysis. All data extraction in the original review related to the primary outcome. The same approach was taken in my review, except for trials that reported a sample size calculation. For these trials, data extraction was based upon the outcome used in the sample size calculation if this differed from the originally defined primary outcome. This was not a frequent occurrence and was done in order to maximise the information gained about sample size calculations. Where several follow-up time points were given, data extraction was based on the final time point, unless an earlier time point was identified as primary.
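The selection rules above can be summarised in a short sketch. This is only an illustration of the decision logic as described in the text, not code that was used in the review; the function and argument names are hypothetical, and the candidate outcomes are assumed to be listed in the order in which they appear in the abstract or analysis.

```python
from typing import Optional, Sequence


def choose_outcome(outcomes_in_reported_order: Sequence[str],
                   sample_size_outcome: Optional[str] = None) -> str:
    """Pick the outcome on which data extraction is based.

    If the trial reported a sample size calculation, the outcome used in
    that calculation takes precedence. Otherwise the first outcome
    reported in the abstract or analysis is used.
    """
    if sample_size_outcome is not None:
        return sample_size_outcome
    return outcomes_in_reported_order[0]


def choose_time_point(time_points: Sequence[str],
                      primary_time_point: Optional[str] = None) -> str:
    """Pick the follow-up time point on which data extraction is based.

    time_points is assumed to be in chronological order. The final time
    point is used unless an earlier one was identified as primary.
    """
    if primary_time_point is not None:
        return primary_time_point
    return time_points[-1]
```

For example, a trial listing "pain score" and then "mortality", with a sample size calculation based on "mortality", would have "mortality" extracted under this sketch.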

Descriptive information collected in the original review was made available to me by the authors. It included information on year of publication, impact factor, trial design (parallel trial, factorial, cross-over, other), method of randomisation (completely randomised, stratified, pair-matched, other), a description of the primary outcome and whether a sample size calculation had been reported. The following sections describe the details of what information was additionally extracted and by whom.

Information extracted collaboratively (by CR, MT, and SDX)

For papers that reported a sample size calculation the following information was collected collaboratively (the full data extraction form can be found in appendix iii):

Study design:

• The outcome for which the sample size calculation was performed

Sample size:

• The method, or citation, of sample size calculation

• Data type of the primary outcome (binary, categorical, ordinal, continuous, count, time-to-event data, other, unclear)

• Type I and Type II error rates, and whether a one- or two-sided test was assumed

• The estimator used to describe the correlation within clusters, its value and any justification for its value

• Additional aspects accounted for in the calculation, e.g. attrition, variable cluster sizes

• The value of the expected response in the control and treatment arms with a measure of the effect size and any justification provided for these values

• Target sample size: total number of clusters and individuals

Analysis:

• The achieved sample size used in the analysis: total number of clusters and individuals

• The method of analysis used

• Data type at the level (cluster or individual) corresponding to the analysis (binary, categorical, ordinal, continuous, count, time-to-event data, other, unclear)

• Data type as used in the analysis (binary, categorical, ordinal, continuous, count, time-to-event data, other, unclear)

• Observed values of: the measure of correlation; the response in the control and treatment arms; and unadjusted effect size

Information extracted independently (by CR)

The following information was extracted independently on all articles that did not report a sample size calculation:

Study design:

• Data type of the primary outcome (binary, categorical, ordinal, continuous, count, time-to-event data, other, unclear)

For the subset of trials identified with an ordinal primary outcome the following information was extracted:

Study design:

• Description of the outcome

• Number of ordinal categories, with category description

• Description of the intervention

• Description of the cluster

• Description of the sub-units within a cluster

If a sample size calculation was present, the following information had already been collected collaboratively; otherwise I extracted it independently:

Analysis:

• The achieved sample size used in the analysis: total number of clusters and sub-units

• The method of analysis used

• Data type as used in the analysis (binary, categorical, ordinal, continuous, count, time-to-event data, other, unclear)

• Observed values of: the measure of correlation; the response in the control and treatment arms; and unadjusted effect size

3.2.6 Data management

Collaborative extraction (by CR, MT, and SDX)

I drafted a data extraction form, which was discussed and finalised with MT, SDX, SE, and AC. The extracted data were transcribed from the data extraction form and stored in an Access database, into which the data collected in the original review were also imported. I designed and tested the database (screen shots available in appendix iv).

CR, MT, and SDX extracted the data in pairs for each article that reported a sample size calculation. The trials were divided into batches of approximately ten trials, and each batch was assigned to an extracting pair. MT was responsible for the allocation and rotation of extraction pairs. Each member of the team had a copy of the database and was responsible for storing the paper and electronic versions of the trials to which they were assigned. After each batch had been extracted, the electronic databases from the extracting pair were emailed to me. I imported each set of data into Stata, where I used the cf2 command to compare the two datasets and list the discrepancies. The list of discrepancies was then sent to the extracting pair, who reviewed and resolved them by consensus.
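The comparison step can be illustrated with a minimal sketch. It is not the Stata cf2 command and does not reproduce its output; it is a Python analogue of the same idea, assuming each extractor's data are held as a dictionary keyed by a trial identifier. The trial identifiers, field names and values below are hypothetical.

```python
def list_discrepancies(first, second):
    """Compare two extractions field by field.

    first and second map a trial identifier to a dict of
    field name -> extracted value. Returns a list of
    (trial_id, field, value_in_first, value_in_second) tuples for
    every field on which the two extractions disagree, including
    fields or trials present in only one extraction.
    """
    discrepancies = []
    for trial_id in sorted(set(first) | set(second)):
        a = first.get(trial_id, {})
        b = second.get(trial_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((trial_id, field, a.get(field), b.get(field)))
    return discrepancies


# Hypothetical batch extracted by the two members of a pair.
extractor_a = {
    "T001": {"icc": 0.05, "target_clusters": 20},
    "T002": {"icc": None, "target_clusters": 12},
}
extractor_b = {
    "T001": {"icc": 0.05, "target_clusters": 22},
    "T002": {"icc": None, "target_clusters": 12},
}

for row in list_discrepancies(extractor_a, extractor_b):
    print(row)
```

Listing the disagreements rather than overwriting either copy keeps the resolution with the extracting pair, which matches the consensus step described above.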

After the discussion of any discrepancies in the data extraction one member of each extracting pair was responsible for updating the database. This was MT for all her batches and CR for the batches marked with SDX. At the end of the project these two datasets were merged to create the final dataset on which data checking took place, described in the next section.

Individual extraction (by CR)

The majority of the information I extracted independently was for those trials that had an ordinal primary outcome. As the number of trials for which this information was extracted was small an Excel spreadsheet was deemed adequate for its storage.

3.2.7 Validation and data checking

Collaborative extraction

most familiar with the trials included in the sample, with the aim of testing the form on a variety of trials. No changes to the form were deemed necessary after piloting.

After the collection of all the data a data checking plan was agreed to ensure consistent recording between both extraction pairs and extractions over time. This data checking included some of the following aspects:

• Agreeing the categorisation of free text responses such as the method of analysis used

• Ensuring that percentages were consistently recorded as decimals rather than whole numbers, for example 0.85 rather than 85%

• Double checking papers where a large discrepancy was seen between target and actual sample size

• Double checking papers where a large discrepancy was seen between target and actual sample size parameter estimates

• Checking that absolute and relative differences had been calculated correctly

• Part way through data extraction it was agreed that the target total number of clusters required should be left blank if not explicitly reported in the sample size calculation. All papers where this question was not missing were double checked to ensure it had been explicitly reported and not inferred.

• Logical checks, for example, if the sample size does not account for the ICC then no value should be given for the ICC
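Two of the checks listed above, the decimal-versus-percentage check and the ICC logical check, lend themselves to automation. The following is a minimal sketch under stated assumptions, not the checking procedure actually used: the record structure and the field names (accounts_for_icc, icc, type_i_error, type_ii_error) are hypothetical, and records are flagged for manual review rather than corrected automatically.

```python
def check_record(record):
    """Return a list of problems found in one extracted record.

    record is a dict of field name -> value. Field names here are
    illustrative assumptions, not those of the actual database.
    """
    problems = []

    # Logical check: no ICC value should be recorded if the sample
    # size calculation did not account for the ICC.
    if record.get("accounts_for_icc") is False and record.get("icc") is not None:
        problems.append("icc recorded but calculation does not account for ICC")

    # Consistency check: error rates should be stored as decimals
    # (e.g. 0.05), not as whole-number percentages (e.g. 5).
    for field in ("type_i_error", "type_ii_error"):
        value = record.get(field)
        if value is not None and not 0 < value < 1:
            problems.append(f"{field} not recorded as a decimal: {value}")

    return problems


# Hypothetical record with both kinds of problem.
flagged = check_record(
    {"accounts_for_icc": False, "icc": 0.02, "type_i_error": 5, "type_ii_error": 0.2}
)
```

Flagging rather than converting avoids guessing whether a value such as 1 was meant as 1% or as a proportion.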

Individual extraction

All of the information on trials with ordinal outcomes was extracted twice, a month apart, in order to provide a double check of the extraction. A second data extraction was feasible due to the small number of trials included. As both extractions were performed by me, the validation this provides is limited. However, the information to be extracted was expected to be a key part of the trial report and therefore easily identified. Any ambiguity was discussed with SE and AC.

3.3 Results

In document Sample size calculations for cluster randomised trials, with a focus on ordinal outcomes. (Page 89-95)