Data collection and management - Statistical analysis


4.2.2 Data collection and management

A prespecified analysis plan was provided to the researcher by the trialists in advance of collecting the individual patient data (IPD). This was used to establish the variables and outcome measures common to each dataset and to assess the data that could be combined (Table 4-1).

Table 4-1 Shared outcomes used in AVERT phase II and VERITAS

| Shared outcome | Outcome measure: VERITAS | Outcome measure: AVERT phase II |
| --- | --- | --- |
| Week 1 | | |
| Stroke severity | National Institutes of Health Stroke Scale | Scandinavian Stroke Scale |
| Complications | Measured as per protocol | Measured as per protocol |
| Level of fatigue | Borg Perceived Exertion Scale | Borg Perceived Exertion Scale |
| Time to first mobilisation | Measured as per protocol | Measured as per protocol |
| Dose of mobilisation∗ | Time spent upright (accelerometry) | Time spent in therapy (therapist report) |
| Month 3 | | |
| Independence | Modified Rankin Scale | Modified Rankin Scale |
| Discharge date | Measured as per protocol | Measured as per protocol |
| Discharge destination | Measured as per protocol | Measured as per protocol |
| Complications | Measured as per protocol | Measured as per protocol |
| Activities of daily living | Barthel Index | Barthel Index |
| Level of mobility | Rivermead Mobility Index | Rivermead Mobility Index |
| Health-related quality of life | | Assessment of Quality of Life Questionnaire |
| Resource use∗ | Measured as per protocol | Measured as per protocol |

∗ Data were not aggregated

Anonymised data for the relevant baseline factors and prespecified outcomes were extracted from the AVERT phase II data set, and this subset was sent to the researcher by email as an Excel file. The VERITAS data were sent to the researcher by email in their entirety as a statistical package file. The data were cleaned, re-coded as required and cross-checked with reports and publications.

The trial protocols were used to provide information about the intervention provided and the outcome measures (and versions) used in each of the trials.

Data queries were raised and managed in collaboration with the relevant trialist.

Primary outcome and secondary outcomes at one week

The choice of outcomes was based on those used in the previous Cochrane review.¹⁴² The primary outcome was the proportion of patients independent at three months, defined as a Modified Rankin Scale (mRS) score of ≤ 2. The secondary outcomes measured at one week included stroke severity (National Institutes of Health Stroke Scale [NIHSS] or Scandinavian Stroke Scale [SSS]), complications of immobility in hospital and level of fatigue (Borg Exertion Scale [BORG]). The original Rankin scale consisted of five categories based on the ability to perform certain activities, taking account of the level of assistance required. The modified version, as it is widely used today, is considered to be a measure of global disability and adds two categories, ‘no symptoms’ and ‘dead’, giving a seven-point scale.

The NIHSS is a 15-item neurological impairment scale with a maximum deficit score of 42 points.¹⁴³ The key aspects measured are eye movement, motor and sensory impairment and level of consciousness. The SSS is also a neurological impairment scale, which incorporates an initial prognostic score and a long-term functional score. The key aspects measured include consciousness, eye movement, motor power, speech and facial palsy. The level of impairment for a patient is measured as a value between 0 and 58, with lower scores indicating greater impairment.¹⁴⁴ As stroke severity was measured using different outcome measures, the SSS scores obtained for patients in AVERT phase II were converted to NIHSS scores using the following interconversion equation:¹⁴⁵ NIHSS score = 22.99 − (0.39 × SSS score).

Complications were defined as stroke related, immobility related, co-morbidity related or any others. Complications of immobility included falls, pneumonia, chest infection, deep venous thrombosis, pulmonary embolism and pressure sores. Complications were collected from medical records by a blinded assessor. The BORG is a self-rating scale used to measure perceived exertion during physical activity; it ranges from 6 to 20, where 6 equals “no exertion at all” and 20 equals “maximal exertion”.¹⁴⁶ Excessive fatigue was defined as a score of > 13, where 13 corresponds to “somewhat hard”.
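The score conversion and the one-week and three-month thresholds described above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical and are not taken from the trial analysis code; the coefficients and cut-offs are those quoted in the text.

```python
def sss_to_nihss(sss: float) -> float:
    """Convert an SSS score to the NIHSS scale.

    Uses the interconversion equation quoted in the text:
    NIHSS = 22.99 - (0.39 x SSS).
    """
    return 22.99 - 0.39 * sss


def is_independent(mrs: int) -> bool:
    """Primary outcome: independence, defined as an mRS score of 2 or less."""
    return mrs <= 2


def has_excessive_fatigue(borg: int) -> bool:
    """Excessive fatigue, defined as a Borg score above 13."""
    return borg > 13


if __name__ == "__main__":
    # Hypothetical SSS values spanning the 0 to 58 range of the scale.
    for sss in (0, 30, 58):
        print(sss, round(sss_to_nihss(sss), 2))
```

Because lower SSS scores indicate greater impairment while higher NIHSS scores indicate greater deficit, the negative coefficient reverses the direction of the scale as well as rescaling it.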

Secondary outcomes at three months

The further prespecified secondary outcomes measured at three months included mobility (Rivermead Mobility Index [RMI]), place of discharge, death, activities of daily living (Barthel Index [BI]), health-related quality of life (Assessment of Quality of Life [AQoL]) and resource use. The RMI comprises 14 items with activities ranging from turning over to running.¹⁴⁷ Each question is answered either yes (score = 1) or no (score = 0), with a maximum score of 14. A lower score indicates greater mobility disability. Non-impaired mobility was defined as an RMI score of 10 to 13. Discharge destination was categorised as home, rehabilitation unit/ward, acute hospitalisation, sheltered housing or a nursing home. Return home was defined as patients who had previously been living in a private residence and had returned to this living arrangement by month three.

The BI measures performance of activities of daily living (ADL). It is a 10-item scale which ranges from 0 to 100, where lower scores indicate greater dependency. This score is often re-scaled to range from 0 to 20, with each item score divided by five. Independence was defined as a BI score ≥ 18 on the re-scaled version. Patients who had died were assigned a score of zero.
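As a small worked illustration of the re-scaling and the independence threshold, the following sketch assumes the total BI is recorded in steps of five on the 0 to 100 version; the function names and the `died` flag are hypothetical.

```python
def rescale_barthel(bi_100: int) -> int:
    """Re-scale a 0-100 Barthel Index total to the 0-20 version (divide by five)."""
    if not 0 <= bi_100 <= 100 or bi_100 % 5 != 0:
        raise ValueError("expected a 0-100 Barthel total in steps of five")
    return bi_100 // 5


def barthel_independent(bi_20: int, died: bool = False) -> bool:
    """Independence defined as a re-scaled BI of 18 or more.

    Patients who died are assigned a score of zero, as described in the text.
    """
    score = 0 if died else bi_20
    return score >= 18


if __name__ == "__main__":
    print(rescale_barthel(90), barthel_independent(rescale_barthel(90)))
```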

The AQoL is a utility instrument which measures health-related quality of life (HRQoL).¹⁴⁸ It comprises 15 items across five domains: illness, independent living, social relationships, physical senses and psychological wellbeing. Responses to the AQoL items were summed within each domain to provide value profiles of illness, independent living, social relationships, physical senses and psychological wellbeing. Patients who died were assigned a score of zero. This coding created scores for each scale ranging from 0 to 9, where 0 represents ‘normal’ or ‘good’ HRQoL and 9 the worst possible HRQoL for each dimension. These scores were then summed to provide an overall unweighted HRQoL index ranging from 0 to 45, where 0 represents ‘normal’ or ‘good’ HRQoL and 45 the worst possible HRQoL score. The AQoL score is then used to compute an overall utility score weighted by preference in order to calculate quality-adjusted life years (QALYs) for use in economic evaluation. The conversion of the unweighted HRQoL scores to utilities for use in economic evaluation is presented in Chapter 6.

Resource use was determined by a blinded assessor during a face-to-face interview with the patient or nearest relative and by retrieving information about hospital re-admissions from medical records at three months post-stroke.

The specific resource items varied between the trials; in both trials, resource use information was gathered on initial acute hospital length of stay (LOS), hospital re-admission LOS and some aspects of care provided in the community.

Generally, there is no consensus regarding the methods to pool multinational resource data, or resource data from different hospitals, for meta-analysis or economic evaluation. Resource use is highly variable between countries due to differences in healthcare systems. Combining resource data for a meta-analysis is controversial and may limit the generalisability of estimates of cost and, by implication, estimates of cost-effectiveness across settings.¹⁴⁹ Considering this and the variation in resource use that existed between the two studies, pooling resource use data for the purpose of this IPD MA was not considered appropriate.

Therefore, the summary data available from the published sources were extracted, tabulated and described. A planned economic evaluation alongside AVERT phase II has already been conducted and has since been published.¹⁵⁰

Process indicators are markers defined to assess the quality of care and benchmark the implementation of guidelines.¹⁵¹ Two process indicators were used in this analysis: time to first mobilisation after stroke onset and the amount of mobilisation activity. To measure time spent in mobilisation activity in AVERT phase II, trial staff recorded time with a therapist doing mobilisation (VEM) and time spent in SC (control). This was measured for the intervention period of 14 days, or until discharge if the patient was discharged earlier. In AVERT phase II the total dose of mobilisation for each treatment group (in minutes) across the length of stay was calculated. In VERITAS an AC was used to measure the time (in minutes) that patients spent sitting/lying, standing and stepping. This was measured on days three, four and five, with recordings on the first day considered most reliable due to the lower levels of missing data on that day. In VERITAS time spent upright, defined as the time spent standing or stepping, was calculated. As the methods of measuring mobilisation activity differed between the trials, the data for this process indicator were not combined.

4.2.3 Statistical analysis

Analyses included all patients and used an intention-to-treat approach. Baseline patient characteristics were described for each trial and summarised in the two treatment groups. Univariate analysis was used to compare patient characteristics at baseline and the time to first mobilisation between the two individual trials and between treatment groups in the IPD MA. Where data were not normally distributed (stroke severity scores, time to first mobilisation and length of stay), the Wilcoxon-Mann-Whitney test, a nonparametric equality of medians test, was used.¹⁵² Time spent mobilising was compared between the treatment groups using summary data from the trials. The conventional level of significance (p ≤ 0.05) was used.

The methods for meta-analysis of aggregate data are well developed, with a number of approaches available depending on the assumptions made regarding a common treatment effect between the included studies. For IPD MA two main types of analysis are recognised: the one-stage analysis and the two-stage analysis. The one-stage analysis combines all the IPD from the studies and models the treatment effect simultaneously. In the alternative two-stage approach, a summary estimate is calculated for each individual trial and these estimates are synthesised using traditional meta-analysis. For the analysis in this chapter the two-stage approach was used to assess the treatment effect.

The outcomes were analysed for each of the trials and these individual summaries were then used to provide an overall measure of effect. A common treatment effect was assumed, so it was appropriate to use a fixed effect model. Analyses were also run using a random effects model (DerSimonian and Laird, 1986¹⁵³) to examine the robustness of this assumption. The treatment effect was calculated using the Mantel-Haenszel method, which combines the odds ratios (ORs) for each trial on the log scale using a weighting scheme based on the inverse of their variance. In assessing uncertainty, the random effects model incorporates an additional measure of between-study variation. For continuous factors (stroke severity and HRQoL scores) a weighted mean difference was calculated. For stroke severity, patients who died were excluded from the analysis. The amount of heterogeneity was assessed visually using forest plots and quantified using the I² statistic, which describes the percentage of variation between studies due to heterogeneity rather than chance.¹⁵⁴ An available guide for the approximate interpretation of the I² statistic was used: 25% = low, 50% = moderate and 75% = high.¹⁵⁴
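The second stage of such an analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the trials: the 2x2 counts are hypothetical, the function names are invented for the example, and Cochran's Q, I² and the DerSimonian-Laird between-trial variance are computed here from inverse-variance weighted log ORs, alongside the standard Mantel-Haenszel pooled OR.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log OR and its variance for one trial's 2x2 table.

    a, b = intervention patients with / without the outcome;
    c, d = control patients with / without the outcome.
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled OR across trials (fixed effect)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def heterogeneity(tables):
    """Cochran's Q and I-squared (%) from inverse-variance weighted log ORs."""
    stats = [log_odds_ratio(*t) for t in tables]
    weights = [1 / var for _, var in stats]
    pooled = sum(w * y for w, (y, _) in zip(weights, stats)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, stats))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


def dersimonian_laird_tau2(tables):
    """Between-trial variance (tau^2), DerSimonian-Laird moment estimator.

    Requires at least two trials.
    """
    weights = [1 / log_odds_ratio(*t)[1] for t in tables]
    q, _ = heterogeneity(tables)
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - (len(tables) - 1)) / c)


if __name__ == "__main__":
    # Hypothetical counts for two trials (not data from AVERT phase II or VERITAS).
    trials = [(12, 8, 9, 11), (20, 15, 14, 18)]
    print("MH pooled OR:", mantel_haenszel_or(trials))
    print("Q, I2 (%):", heterogeneity(trials))
    print("tau^2:", dersimonian_laird_tau2(trials))
```

When the estimated between-trial variance is zero, the random effects weights reduce to the fixed effect weights, so the two models give the same pooled estimate.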

Multivariate logistic regression was used to assess the effect of VEM on independence at three months, adjusting for patient and stroke characteristics known to affect outcome. The identity of each trial was upheld within the model so as to preserve clustering of patients within studies and to allow inferences to be based on the randomisation of patients within each trial.¹⁵⁵ In multivariate analysis, adjustments were made for known confounders including age, baseline stroke severity and pre-morbid disability, which have been identified as factors affecting recovery.¹⁵⁶,¹⁵⁷ The effect of including additional factors, as informed by the univariate analysis (p < 0.10), was also explored in separate models. As the number of patients was small, the most parsimonious model was selected. A similar method of univariate and multivariate analysis was carried out for the secondary outcomes.
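One common way to write such a model, offered here as an illustrative sketch rather than the specification that was actually fitted, is a logistic regression with a separate intercept for each trial. The symbols below are notation assumed for this example and do not appear in the source.

```latex
\operatorname{logit}\,\Pr(Y_{ij} = 1)
  = \alpha_j
  + \beta\,\mathrm{VEM}_{ij}
  + \gamma_1\,\mathrm{age}_{ij}
  + \gamma_2\,\mathrm{NIHSS}_{ij}
  + \gamma_3\,\mathrm{mRS}^{\mathrm{pre}}_{ij}
```

Here Y_ij = 1 if patient i in trial j is independent (mRS ≤ 2) at three months, α_j is the trial-specific intercept that keeps patients clustered within their own trial, VEM_ij indicates allocation to very early mobilisation, and exp(β) is the adjusted odds ratio for the intervention.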

Similarly, the same approaches (either one-stage or two-stage) can be used to examine the effect of covariates. A two-stage approach was used in this analysis to conduct a subgroup analysis, in which patients within each trial were grouped into prespecified categories and the treatment effect was estimated across the trials for each of the subgroups. This subgroup analysis allowed exploration of whether groups of patients with similar characteristics from two separate trials respond in the same manner to the intervention. Subgroup analysis was restricted to the primary outcome and based on prespecified groups that were identified in each trial as important patient characteristics for adjustment in the final analysis of outcome; these included age, stroke severity at baseline and pre-morbid disability. The patient groups were defined as (i) patients with a mild stroke (NIHSS ≤ 7) or a moderate to severe stroke (NIHSS ≥ 8), with the moderate and severe categories collapsed due to the low number of events in the severe category; (ii) patients aged < 75 years or ≥ 75 years; and (iii) patients with no or mild previous symptoms (premorbid mRS 0 to 1) or patients with moderate previous disability (premorbid mRS 2 to 3). Subgroup interaction was tested between patient groups using the chi-squared statistic.¹⁴⁹
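The subgroup definitions and a chi-squared test of interaction between two subgroups can be sketched as below. This is an illustration under assumptions: the function names and example numbers are hypothetical, any NIHSS score above 7 is treated as moderate to severe, and the interaction test shown is the usual comparison of two subgroup log ORs with one degree of freedom, which may differ in detail from the test that was run.

```python
import math


def severity_group(nihss: float) -> str:
    """Mild stroke if NIHSS is 7 or less, otherwise moderate to severe (assumption)."""
    return "mild" if nihss <= 7 else "moderate to severe"


def age_group(age_years: float) -> str:
    """Patients aged under 75 years versus 75 years or over."""
    return "<75" if age_years < 75 else ">=75"


def premorbid_group(premorbid_mrs: int) -> str:
    """Premorbid mRS 0 to 1 versus 2 to 3."""
    return "none or mild" if premorbid_mrs <= 1 else "moderate"


def subgroup_interaction_test(effect_1, var_1, effect_2, var_2):
    """Chi-squared test (1 df) for a difference between two subgroup effects.

    Effects are pooled log ORs for each subgroup; variances are their squared
    standard errors. Returns the chi-squared statistic and its p-value.
    """
    w1, w2 = 1 / var_1, 1 / var_2
    pooled = (w1 * effect_1 + w2 * effect_2) / (w1 + w2)
    q = w1 * (effect_1 - pooled) ** 2 + w2 * (effect_2 - pooled) ** 2
    # Upper tail of a chi-squared distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(q / 2))
    return q, p_value


if __name__ == "__main__":
    # Hypothetical subgroup log ORs and variances.
    print(subgroup_interaction_test(0.5, 0.1, 0.1, 0.1))
```

With two subgroups the statistic simplifies to the squared difference between the two log ORs divided by the sum of their variances.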

In document Developing and evaluating a complex intervention in stroke: using very early mobilisation as an example (Page 109-115)