Types of intervention and control
Two general classes of early intensive interventions based on ABA emerged from the literature:
1. EIBI, including the University of California, Los Angeles (UCLA)/Lovaas model²⁶ or intensive ABA-based adult-led interventions (e.g. discrete trial training or separate learning units with a clear beginning and end).
2. EIBI with NDBI, incorporating some aspects of the developmental social pragmatic model that seeks to reinforce social communication and interaction by engaging with child-initiated activities. The ESDM emerged as a prominent form of EIBI with NDBI.
The two key comparator interventions were:
1. ‘eclectic’ treatment, which may include a range of school-, clinic- or home-based interventions, sometimes incorporating lower-intensity ABA-based approaches
2. TAU, which consisted of standard local provision or waiting list controls.
Some studies compared early intensive ABA-based interventions of differing intensity.
The main meta-analyses combined all types of early intensive ABA-based interventions and both types of control arm, assuming equivalence between intervention types, to obtain overall estimates of the effectiveness of early intensive ABA-based interventions. Clinical expertise within the SCABARD team and Advisory Group suggested that the various classes of early intensive ABA-based interventions may be essentially equivalent in methodology and efficacy, so this was considered a reasonable primary approach.
Whether or not the interventions were truly equivalent was investigated by considering each intervention and comparator separately, and conducting pairwise meta-analyses for each intervention/comparator combination. This was followed by analyses that combined interventions and comparators as follows:
• combined comparators (eclectic and TAU), keeping interventions separate
• combined early intensive ABA-based interventions (EIBI and EIBI with NDBI), keeping comparators separate.
These meta-analyses were performed for each outcome measure using both one- and two-stage meta-analysis methods (see Statistical details of individual participant data meta-analyses). The different meta-analyses were compared to identify whether or not there was any evidence of differences between interventions and comparators. This informed the decision on whether and how to combine interventions and comparators in all subsequent analyses. Statistical significance (at the 5% level) was not the sole driver of this decision; the observed size of the effect estimates was also considered, so groups could potentially be kept separate even if there was no statistically significant evidence of a difference.
Outcome domains
The meta-analyses focused on key domains of development in autistic children, which might be measured on a range of different scales.
These domains were:
• adaptive behaviour
• cognitive ability
• language development
• autism symptom severity
• presence of behaviours that challenge
• placement into mainstream or specialist schools.
The intention was to analyse these domains at 6 months, 1 and 2 years after randomisation or intervention initiation but, given the data received, this was amended to 1 and 2 years, with limited analyses at 3, 4 and 7 years for some domains. Mean differences (MDs) [i.e. not standardised mean differences (SMDs)] between early intensive ABA and comparator arms were used as the main outcome measure because, generally, all studies used equivalent measurement scales [e.g. Vineland Adaptive Behaviour Scale (VABS)].⁵⁹ Analyses using SMDs were performed as a sensitivity analysis for each outcome domain.
SYSTEMATIC REVIEW AND META-ANALYSIS OF EFFECTIVENESS: METHODS
NIHR Journals Library www.journalslibrary.nihr.ac.uk
Specific outcome measures
The outcomes analysed were the individual measurement scales from each of the main outcome domains. The included studies collected a large number of outcome measures, but most were collected in only one study or with insufficient data to assess effectiveness (i.e. collected for only one study arm or no baseline data reported).
The following outcome measures were assessed by at least one study, with sufficient data to estimate effectiveness of early intensive ABA-based interventions:
• adaptive behaviour:
  ◦ composite VABS⁵⁹
  ◦ each component of the VABS composite score –
    ◦ communication
    ◦ daily living skills
    ◦ socialisation
    ◦ motor skills
    ◦ maladaptive behaviour (not always recorded)
• cognitive ability (IQ):
  ◦ as assessed in the study (regardless of exact test used)
  ◦ based on specified test –
    ◦ Bayley Scales of Infant Development (BSID) (I, II or III)⁶⁰⁻⁶²
    ◦ Wechsler Intelligence Scale for Children – Revised (WISC-R)⁶³⁻⁶⁵
    ◦ Wechsler Preschool and Primary Scale of Intelligence – Revised (WPPSI-R)⁶⁶⁻⁶⁸
    ◦ Stanford–Binet Intelligence Scale (S–B)⁶⁹
• non-verbal IQ:
  ◦ Merrill–Palmer Scale of Mental Tests (MPSMT)⁷⁰
• language development:
  ◦ expressive, receptive, comprehension and overall using scales –
    ◦ Expressive One-Word Picture Vocabulary Test⁷¹,⁷²
    ◦ British Picture Vocabulary Scale⁷³
    ◦ Reynell Developmental Language Scales (RDLS)⁷⁴,⁷⁵
    ◦ Mullen Scales of Early Learning (MSEL) (expressive and receptive language subscales)⁷⁶
    ◦ Social Communication Questionnaire (SCQ)⁷⁷
• autism symptom severity:
  ◦ Autism Diagnostic Observation Schedule (ADOS)⁷⁸
  ◦ Autism Diagnostic Interview – Revised (ADI-R)⁷⁹
• presence of behaviours that challenge:
• additional outcomes:
  ◦ other components of MSEL:
    ◦ composite score
    ◦ fine motor
    ◦ visual reception.
When these outcome measures were available in two or more studies at consistent time points, they were combined in one- and two-stage meta-analyses. When available from only one study, results were tabulated. Other outcome measures were described in the protocol, but not collected by any eligible studies, so are not listed here.
Covariates modifying applied behaviour analysis effectiveness
The following potential effect modifiers were investigated to explore whether or not they altered the effectiveness of early intensive ABA-based interventions, and were specified in the protocol. These were only considered when suitable data were recorded in the IPD provided, or in publications, protocols or otherwise provided by triallists (for intervention characteristics).
Study-level intervention characteristics
• Allocation method (parental choice, location based, cohort).
• Delivery setting (home, school, specialist centre).
• Parental involvement in ABA (none, encouraged, some).
• Use of ABA methods in control intervention (none, partial).
Participant-level characteristics
• Age at enrolment.
• Sex.
• Baseline IQ.
• Baseline composite VABS score.
Other characteristics were listed in the protocol, but there were insufficient data to analyse them. The impact that these covariates have on early intensive ABA-based intervention effectiveness was assessed by using one-stage meta-analyses with the covariate included as a treatment–covariate interaction in a regression model (see Impact of covariates on treatment effect).
A separate model was fitted for each main outcome and covariate combination, when sufficient data were available.
Statistical details of individual participant data meta-analyses
Both one- and two-stage meta-analyses were performed for each outcome, provided that data were available for at least two studies. When these thresholds were not reached, a narrative summary of study results was produced.
Two-stage meta-analysis
In a two-stage meta-analysis, the intervention effect (MD, SMD or relative risk) is estimated separately for each study, and the study estimates are then pooled to calculate a summary effect estimate.
The main within-study analysis for continuously distributed outcomes (e.g. IQ, VABS composite score) was the analysis of covariance (ANCOVA) model,⁸¹ which adjusts the outcome at time of analysis for the baseline value. This model was used to estimate the MD in outcome between intervention and control arms [with its standard error (SE)] for use in the meta-analysis.
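As an illustrative sketch, the within-study ANCOVA model for child i can be written as a regression of the follow-up outcome on the baseline value and treatment arm. The notation is introduced here for illustration and is not taken from the report.

```latex
Y_{i} = \alpha + \beta\, Y_{0i} + \theta\, T_{i} + \varepsilon_{i},
\qquad \varepsilon_{i} \sim N(0, \sigma^{2})
```

Here Y_i is the outcome at the analysis time point, Y_0i is the baseline value of the same measure, and T_i is 1 for the early intensive ABA arm and 0 for the comparator arm. Under this formulation, the estimate of θ is the baseline-adjusted MD, and its SE is the quantity carried forward to the pooling stage.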
Results were then combined across studies using both fixed-effect and DerSimonian–Laird random-effects meta-analysis, to account for possible heterogeneity. Forest plots were produced for each meta-analysis. Effect estimates (SMDs, MDs or relative risks) and 95% confidence intervals (CIs) were calculated for each study and for the combined result. Heterogeneity was also assessed using I². If only two studies presented data, a fixed-effect meta-analysis was used, as heterogeneity cannot be reliably estimated from only two studies.
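The pooling step described above can be sketched as follows. This is a minimal Python illustration, not the review's own code (the analyses were run in R with the meta and metafor libraries). The function name `pool_two_stage`, its arguments and the returned keys are hypothetical, and the 1.96 multiplier assumes normal-approximation 95% CIs.

```python
import math


def pool_two_stage(effects, ses):
    """Pool per-study effect estimates (e.g. baseline-adjusted MDs) and their SEs.

    Hypothetical helper: returns fixed-effect and DerSimonian-Laird
    random-effects summaries, plus Cochran's Q, tau-squared and I-squared.
    """
    k = len(effects)
    if k < 2 or k != len(ses):
        raise ValueError("need matching effects and SEs from at least two studies")

    # Fixed-effect pooling with inverse-variance weights.
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    fixed_se = math.sqrt(1.0 / sw)

    # Cochran's Q and I-squared (as a percentage).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird moment estimate of the between-study variance.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # Random-effects pooling: weights include tau-squared.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    sw_re = sum(w_re)
    pooled_re = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    pooled_re_se = math.sqrt(1.0 / sw_re)

    def ci(est, se):
        # Normal-approximation 95% confidence interval (an assumption here).
        return (est - 1.96 * se, est + 1.96 * se)

    return {
        "fixed": fixed, "fixed_se": fixed_se, "fixed_ci": ci(fixed, fixed_se),
        "random": pooled_re, "random_se": pooled_re_se,
        "random_ci": ci(pooled_re, pooled_re_se),
        "q": q, "tau2": tau2, "i2": i2,
        # With only two studies the text reports the fixed-effect result.
        "reported_model": "fixed-effect" if k == 2 else "random-effects",
    }
```

Calling `pool_two_stage([2.0, 4.0], [1.0, 1.0])` returns both summaries but flags the fixed-effect result as the one to report, mirroring the two-study rule in the text.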
One-stage meta-analysis
A one-stage meta-analysis takes advantage of the availability of IPD by including all data from all studies in a single regression analysis (while taking account of/stratifying by study). This enables greater flexibility in the modelling structure.
A linear regression analysis was used for continuous outcomes (e.g. VABS composite score), and proportional odds regression for categorical outcomes (school placement). As for the two-stage models, the ANCOVA approach was used to estimate MDs. The models regressed final outcome against treatment and baseline value, with random intercept and intervention effects (to account for heterogeneity).
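One way to write the one-stage ANCOVA model for child i in study j is sketched below. The notation and the distributional form of the random effects are assumptions made for illustration; the report does not give the model formula.

```latex
Y_{ij} = \alpha_{j} + \beta\, Y_{0ij} + \theta_{j}\, T_{ij} + \varepsilon_{ij},
\qquad \alpha_{j} \sim N(\alpha, \sigma_{\alpha}^{2}),
\quad \theta_{j} \sim N(\theta, \tau^{2}),
\quad \varepsilon_{ij} \sim N(0, \sigma^{2})
```

Here Y_ij is the final outcome, Y_0ij the baseline value and T_ij the treatment indicator (1 for early intensive ABA, 0 for comparator). The summary MD is θ, and τ² is the between-study variance of the intervention effect. In lme4 syntax, a model of this general shape might be specified as `outcome ~ baseline + treat + (1 + treat | study)`, although that formula is a guess at the structure and would also estimate a correlation between the two random effects.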
There are currently no well-tested methods available for one-stage analyses of SMD, so only two-stage analyses of SMD were performed.
All available data from all studies were included in a regression analysis; studies were excluded when they did not include data for the outcome measure of interest. As for two-stage analyses, meta-analyses were performed provided that at least two studies, with a minimum of 50 participants, provided data for the specified outcome. If only two studies provided data, a fixed-effect regression was used.
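The data thresholds above amount to a small decision rule, sketched here in Python. The function name `choose_analysis` is hypothetical, and the sketch assumes that the 50-participant minimum applies to the total across contributing studies, which the text does not state explicitly.

```python
def choose_analysis(participants_per_study, min_studies=2, min_participants=50):
    """Decide how an outcome is analysed, given who reported it.

    participants_per_study maps a study label to the number of children
    with data for the outcome of interest (hypothetical input format).
    """
    # Studies with no data for this outcome are excluded from the analysis.
    contributing = {s: n for s, n in participants_per_study.items() if n > 0}
    n_studies = len(contributing)
    n_total = sum(contributing.values())

    # Assumption: the participant minimum is a total across studies.
    if n_studies < min_studies or n_total < min_participants:
        return "narrative summary"
    # With exactly two studies, heterogeneity is not estimated.
    if n_studies == 2:
        return "fixed-effect"
    return "random-effects"
```

For example, `choose_analysis({"A": 30, "B": 25})` selects a fixed-effect model, whereas adding a third contributing study would switch the rule to random effects.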
Heterogeneity assessment
All one-stage models were fitted using mixed-effects regression, with random effects, varying by study, applied to the treatment parameter. Heterogeneity was quantified in terms of the observed statistical heterogeneity in the model (τ² estimate).
When available, results of one- and two-stage analyses were compared.
Impact of covariates on treatment effect
Access to IPD means that the analysis can potentially go beyond looking only at whether or not early intensive ABA-based interventions are effective, to consider whether or not child-level characteristics (including parental and intervention factors specific to each child) might alter how effective the intervention is. For example, the analysis can ask whether IQ at the time of recruitment alters how effective EIBI is in changing outcomes. The impact that a covariate has on effectiveness is called the intervention–covariate interaction.
Two-stage analyses
For study-level characteristics (such as parental involvement in ABA provision, setting and duration), subgroup analyses were used to investigate the impact of covariates. Studies were placed into groups according to the value of the characteristic (e.g. some parental involvement, involvement encouraged or no involvement, with exact groupings decided once it was known what data were available) and meta-analyses performed, as described above, within each group. Subgroups were then compared to identify any differences in effect.
One-stage analyses
For individual-level characteristics, the one-stage regression analyses described earlier were extended to include a parameter for the covariate of interest and one for the intervention–covariate interaction. To ensure model convergence, these parameters were assumed common to all studies (i.e. a fixed effect), but models with random effects for these parameters were tested to ensure the validity of making a fixed-effect assumption. A statistically significant intervention–covariate interaction parameter in these models indicates that the covariate alters the effect of the early intensive ABA-based interventions. These models were fitted for each possible combination of outcomes and covariate to assess the associations between intervention and covariates, provided sufficient data were available (at least two studies and 50 participants reporting both outcome and covariate).
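As a sketch of the extended model, with notation introduced here for illustration rather than taken from the report, a one-stage ANCOVA with a covariate X_ij for child i in study j could be written as:

```latex
Y_{ij} = \alpha_{j} + \beta\, Y_{0ij} + \theta_{j}\, T_{ij}
       + \gamma\, X_{ij} + \delta\, X_{ij}\, T_{ij} + \varepsilon_{ij},
\qquad \theta_{j} \sim N(\theta, \tau^{2}),
\quad \varepsilon_{ij} \sim N(0, \sigma^{2})
```

Here Y_ij is the final outcome, Y_0ij the baseline value and T_ij the treatment indicator. The parameters γ (covariate main effect) and δ (intervention–covariate interaction) are common to all studies, matching the fixed-effect assumption described above. A value of δ different from zero means that the intervention effect changes by δ for each unit increase in the covariate.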
Time of measurement
Analyses were performed at 1 and 2 years after recruitment for each outcome. A tolerance of ± 6 months was used for each analysis. This means that, for example, measurements made between 18 and 30 months could contribute to analyses at 2 years.
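The tolerance rule can be expressed as a short Python sketch. The function name `eligible_time_points` is hypothetical, and the window boundaries are assumed to be inclusive. Because the 1-year and 2-year windows meet at 18 months, and the text does not say how such a measurement was assigned, the sketch returns every time point a measurement could contribute to instead of picking one.

```python
def eligible_time_points(months, targets_years=(1, 2), tolerance_months=6):
    """Return the analysis time points (in years) that a measurement taken
    `months` after recruitment could contribute to, given the tolerance.

    Boundaries are treated as inclusive, which is an assumption.
    """
    return [
        year for year in targets_years
        if abs(months - 12 * year) <= tolerance_months
    ]
```

A measurement at 20 months maps only to the 2-year analysis, while one at 31 months falls outside both windows and would be left to the repeated measures models.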
In a few studies, IPD were provided at times other than 1 or 2 years. To incorporate those additional data captured at other times, repeated measures analyses were performed. Repeated measures models analyse all time points simultaneously, so there is a single model estimating effects for all reported years. They also account for the fact that each child may have repeated measurements of the same outcome over time, which are likely to be correlated, by including a correlation term for each child. When the data permitted, exploratory analyses were performed, including an assessment of whether outcomes varied linearly or log-linearly over time (i.e. assuming a trend over time rather than separate analyses). The choice of these models depended on the results of the analyses at each specific time point.
Studies not supplying individual participant data
When studies identified as eligible for inclusion in the meta-analysis did not supply IPD to the SCABARD team, relevant outcome data were extracted from study publications. Data were extracted as means and standard deviations (SDs) in each study arm, as 2 × 2 tables (numbers of events and participants by arm) or as relative risks, odds ratios or MDs if full data were unavailable.
Mean differences or SMDs for each outcome measure were calculated from extracted data. These were combined with the results for each study estimated from the IPD in exploratory two-stage meta-analyses, following the same process as described in Statistical details of individual participant data meta-analyses. Meta-analyses combining IPD with published data from studies not supplying IPD were treated as sensitivity analyses and used to assess whether or not there were any differences between studies that did not supply IPD and those that did.
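The calculation of an MD or SMD from published arm-level means and SDs can be sketched in Python as below. The report does not state which SMD estimator was used, so this sketch assumes Cohen's d with a pooled SD and the usual large-sample approximation for its SE. The function name `md_and_smd` and the returned keys are hypothetical.

```python
import math


def md_and_smd(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl):
    """Compute the MD and an SMD (Cohen's d, an assumed choice) with SEs
    from arm-level summary statistics extracted from a publication."""
    # Unadjusted mean difference and its standard error.
    md = mean_trt - mean_ctl
    se_md = math.sqrt(sd_trt ** 2 / n_trt + sd_ctl ** 2 / n_ctl)

    # Pooled standard deviation across the two arms.
    pooled_sd = math.sqrt(
        ((n_trt - 1) * sd_trt ** 2 + (n_ctl - 1) * sd_ctl ** 2)
        / (n_trt + n_ctl - 2)
    )

    # Standardised mean difference and its approximate standard error.
    smd = md / pooled_sd
    se_smd = math.sqrt(
        (n_trt + n_ctl) / (n_trt * n_ctl) + smd ** 2 / (2 * (n_trt + n_ctl))
    )
    return {"md": md, "se_md": se_md, "smd": smd, "se_smd": se_smd}
```

Unlike the ANCOVA estimates from the IPD, an MD computed this way is not adjusted for baseline, which is one reason to treat the combined analyses as exploratory.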
Missing data
When a study did not examine or record an outcome measure or a covariate, the study was excluded from all relevant analyses.
If > 20% of participants in the IPD had no record for an outcome measure, a best- and worst-case analysis was planned as a sensitivity analysis. All included studies had < 20% of participants with missing outcome data (when the outcome was collected), so this analysis was not required.
Complete-case analysis (excluding all participants with missing covariate data) was used for all analyses. Imputation analyses were considered in the protocol as a way of handling missing covariates, but were not performed, given the limited number of covariate analyses that were feasible and because data were largely complete for the analyses performed.
Sensitivity analysis
Although a number of sensitivity analyses were identified in the statistical analysis plan, the limitations of the IPD meant that the only one performed was an analysis restricted to UK-based studies.
Network meta-analysis
Network meta-analyses (NMAs) analysed all types of intervention and control simultaneously. The one-stage repeated measures meta-analysis models described above were extended to include multiple arms and incorporated random effects to account for heterogeneity. Potential network inconsistency was investigated by comparing NMA results with results from direct pairwise meta-analyses.
Multivariate meta-analysis
The analysis included many outcomes that are likely to be highly correlated both within domains (e.g. different IQ scoring methods) and between domains (e.g. VABS score and autism symptom severity). Multivariate analysis of these correlated outcomes may improve estimation, particularly in cases in which some studies do not report one outcome, but do report a correlated outcome.
One-stage models of multivariate analysis were considered. Given the limited availability of outcomes, only bivariate analyses of composite VABS score with each other outcome were feasible. These analyses were done but are not reported here, owing to uncertainty as to their validity, given data limitations, and little evidence of any difference from the main univariate analyses.
Software
All data management and meta-analyses were performed at the Centre for Reviews and Dissemination, using the R software package (2016; The R Foundation for Statistical Computing, Vienna, Austria). Additional libraries in R were used as follows:
• data management and manipulation: tidyr, dplyr, tidyverse libraries
• two-stage analyses: meta and metafor libraries
• one-stage models: lme4 library
• forest plots: using in-house R code and meta library
• other graphics: ggplot2 library.