A sample size of 360 participants would provide 80% power to detect a minimum clinical between-group difference of 2.45 points [baseline standard deviation (SD) of 7.8 points based on 66 participants] on the ADAS-Cog at 12 months with a 5% level of significance and 2 : 1 randomisation in favour of the intervention. This equated to a standardised effect size of 0.31. An overall difference of 2–2.5 change points on the ADAS-Cog is considered to be a worthwhile target.83 To account for therapist effects, we inflated the sample size using a design effect of 1.04 (intracluster correlation = 0.01), assuming five participants per group (recognising that it may not be possible to achieve and retain eight recruits to each group), which gave a sample size of 375. The sample size was further inflated to account for 20% loss to follow-up, of which 10% was estimated to be attributable to death. Thus, a final minimum sample size of 468 participants was required, with 312 participants to be randomly allocated to the intervention arm and 156 participants to the control arm.
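The arithmetic behind these figures can be retraced with a short sketch. It assumes a normal-approximation formula for comparing two means with unequal allocation, and the design effect formula 1 + (m − 1) × ICC. The report does not state which software or rounding rules were used, so the function name and the rounding steps below are illustrative assumptions. Note that applying the 20% inflation to the unrounded clustered total (374.4) gives exactly 468, whereas applying it to the rounded figure of 375 would give 468.75.

```python
from fractions import Fraction
from math import ceil
from statistics import NormalDist


def control_arm_size(delta, sd, ratio, alpha=0.05, power=0.80):
    """Normal-approximation control-arm size for a two-sided test of a
    difference in means, with `ratio` intervention participants per control."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    d = delta / sd  # standardised effect size (about 0.31 here)
    return (1 + 1 / ratio) * (z_alpha + z_beta) ** 2 / d ** 2


# Values taken from the text: 2.45-point difference, SD 7.8, 2 : 1 allocation.
n_control = ceil(control_arm_size(2.45, 7.8, ratio=2))
n_total = n_control * 3  # one control plus two intervention participants

# Design effect for five participants per group and ICC = 0.01.
design_effect = 1 + (5 - 1) * Fraction(1, 100)
n_clustered = n_total * design_effect  # exact rational arithmetic

# Inflate for 20% loss to follow-up (assumption: applied before rounding).
n_final = n_clustered / (1 - Fraction(20, 100))
n_intervention = n_final * Fraction(2, 3)
n_control_final = n_final - n_intervention

print(n_total, float(design_effect), ceil(n_clustered), n_final)
print(n_intervention, n_control_final)
```

Exact fractions are used for the inflation steps so that floating-point error cannot push a whole-number result over a rounding boundary.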

Note that the sample size initially calculated was 728 participants based on the sMMSE score as the primary outcome. The use of ADAS-Cog score rather than sMMSE score as the primary outcome allowed the sample size to be reduced to 468. This change in sample size occurred before we started the trial and was approved as a formal protocol amendment.

Primary analyses

Data were summarised and reported in accordance with Consolidated Standards of Reporting Trials (CONSORT) guidelines for RCTs,84 and we used intention-to-treat (ITT) analyses as the primary analysis.

For the primary analyses, multilevel models with a random effect for region were used to estimate the treatment effects and 95% confidence intervals (CIs). The clustering effects of therapist and group were assessed by measuring the intracluster correlation. Therapist and/or group effects were not included in the multilevel model if the clustering effects were found to be negligible. The models were adjusted for important covariates (age, gender, region, sMMSE score and baseline ADAS-Cog score).

Owing to the nature of the population, certain items on the ADAS-Cog were missing and, hence, the missingness was classed as not being random. Thus, for the primary analyses, participants with a missing ADAS-Cog outcome at each time point in the observed data had their item-level responses reviewed on an individual basis. If any items were missing because the participant was cognitively unable, too distressed or refused to answer, then we estimated the treatment effects using multiple imputation (MI) methods. This approach was taken because the missing item-level responses for the ADAS-Cog are non-ignorable, so a complete-case analysis (as carried out in the first sensitivity analysis) could give highly biased results. However, baseline variables, such as baseline ADAS-Cog score and sMMSE score, are quite likely to be good predictors of missing items. We therefore applied MI methods, which are considered a plausible approach to addressing non-ignorable missing data, to impute missing item-level data, enabling us to compute and analyse the ADAS-Cog scores.85
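After MI, the treatment effect is estimated in each imputed data set and the results are combined. The report does not say how estimates were pooled. The sketch below assumes Rubin's rules, the conventional choice. The function name and the input numbers are hypothetical and are not trial results.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their squared standard errors
    using Rubin's rules; returns (pooled estimate, total variance)."""
    m = len(estimates)
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical treatment-effect estimates from three imputed data sets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what carries the extra uncertainty caused by the missing items into the final confidence interval.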

Sensitivity analyses

In addition to the primary analysis, three sensitivity analyses were carried out, each using its own data set. For the first sensitivity analysis, the treatment effect was estimated using the observed data, with missing ADAS-Cog values left as missing; this data set therefore consisted of participants who provided complete primary outcome data. For the second sensitivity analysis, participants with a missing ADAS-Cog outcome in the observed data had the worst score assigned at the item level, provided that the item was missing because the participant was cognitively unable, too distressed or refused to answer. For the third sensitivity analysis, an item response theory (IRT) approach was used. Between the estimation of the sample size and the finalisation of the statistical analysis plan, the literature had moved to suggest that the ADAS-Cog does not measure a single patient trait (cognitive impairment) but rather measures cognitive impairment in multiple cognitive domains.86 Therefore, IRT was used to assess treatment effects in each of the cognitive domains, namely language, memory and praxis86 (see Appendix 5).

Secondary analyses

For secondary analyses, we estimated treatment effects over the 12-month time period using longitudinal models adjusting for the same variables used in the primary analysis.

Subgroup analyses

Prespecified subgroup analyses looking at severity of cognitive impairment (sMMSE score of ≥ 20 for mild and < 20 for moderate), type of dementia (Alzheimer’s vs. other), physical performance (no problems walking vs. some problems/confined to bed, taken from the EQ-5D-3L) and gender (female vs. male) were conducted using formal tests of interaction.87

Complier-average causal effect analysis

We measured compliance with the intervention by the number of sessions attended. This information was collected by the therapist providing the treatment. Complier-average causal effect (CACE) analysis was used to assess the effect of compliance with the intervention on the primary outcome.88
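The idea behind a CACE estimate can be illustrated with the simple instrumental-variable form, in which the ITT effect is divided by the proportion of compliers in the intervention arm. This sketch assumes that control participants had no access to the intervention. The report does not state the estimator or the attendance threshold that defined compliance, so `min_sessions`, the function names and all numbers below are hypothetical.

```python
def complier_proportion(sessions_attended, min_sessions):
    """Share of intervention-arm participants attending at least
    `min_sessions` sessions (the threshold is a hypothetical choice)."""
    compliers = sum(1 for s in sessions_attended if s >= min_sessions)
    return compliers / len(sessions_attended)


def cace_estimate(itt_effect, proportion_compliers):
    """Instrumental-variable CACE: ITT effect scaled by the complier share."""
    if proportion_compliers <= 0:
        raise ValueError("CACE is undefined when there are no compliers")
    return itt_effect / proportion_compliers


# Hypothetical attendance counts for five intervention-arm participants.
attended = [30, 12, 25, 0, 28]
p_comply = complier_proportion(attended, min_sessions=20)

# Hypothetical ITT effect, not a trial result.
cace = cace_estimate(itt_effect=1.5, proportion_compliers=p_comply)
print(p_comply, cace)
```

Because the ITT effect is divided by a proportion no greater than 1, the CACE estimate is never smaller in magnitude than the ITT estimate under these assumptions.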

Data set access

The final data set was accessible to all study members after data lock. The chief investigator assumed overall responsibility for the data report and had full access to the trial data set. There were no contractual agreements that limited access for investigators.

Serious adverse event and adverse event reporting

An adverse event (AE) was defined as any untoward medical occurrence in a participant, which did not necessarily have a causal relationship with the trial treatment. AEs were most likely to be identified by the physiotherapist during the exercise sessions, from information at the sign-in, or after completion of the exercise sessions during support telephone calls or the face-to-face meeting.

Each participant had a pre-exercise assessment, which provided information on comorbidities. The trial population included many participants aged > 70 years, who therefore had many of the common chronic diseases of older age, for example osteoarthritis. It was expected that participants would experience some uncomfortable effects of participation in the intervention, for example muscle or joint soreness in response to exercise. Provided that these followed an expected pattern (e.g. as for delayed-onset muscle soreness), needed only simple modifications to the exercise activity (e.g. changes to the bicycle seat height) or were non-serious exacerbations of existing medical conditions, they were not considered to be AEs. A serious adverse event (SAE) was an AE that fulfilled one or more of the following criteria:

• resulted in death
• was immediately life-threatening
• required hospitalisation or prolongation of existing hospitalisation
• resulted in persistent or significant disability or incapacity.

The SAEs to be reported were defined as those that occurred within 2 hours of completing the exercise sessions or follow-on physical activities. SAEs were reported to the Trial Co-ordinating Centre within 24 hours of the physiotherapist becoming aware of them. The Trial Co-ordinating Centre was responsible for reporting AEs to the sponsor and ethics committee within required timelines.

The relationship of SAEs to trial treatment was assessed by the chief investigator and recorded on each SAE form. All SAEs were recorded in the trial database and, when appropriate, reported to and reviewed by the Data Monitoring and Ethics Committee (DMEC) throughout the trial; all were followed up to resolution.

Monitoring and approval