A comprehensive statistical analysis for the Families for Health trial was conducted with the aim of assessing the effectiveness of the Families for Health programme, in comparison with usual care, in the treatment of overweight and obesity among children aged 6–11 years. The prespecified primary outcome measure in the statistical analysis was the change in children’s BMI z-score from baseline to 12 months’ follow-up, so that clinical effectiveness would be declared based on a reduction in BMI z-score relative to the comparator. Secondary outcomes in the statistical analysis fell into four categories: (1) anthropometric measures in children; (2) anthropometric measures in parents; (3) (validated) questionnaires completed by children; or (4) (validated) questionnaires completed by parents. Table 5 provides an overview of all outcomes analysed and the time points at which they were measured.
TABLE 7 Evenson’s ActiGraph accelerometer cut points for sedentary, light, moderate and vigorous physical activity for children

| Time period | Sedentary | Light | Moderate | Vigorous |
| --- | --- | --- | --- | --- |
| Per 15 seconds | 0–25 | 26–573 | 574–1002 | ≥1003 |
| CPM | ≤100 | 101–2295 | ≥2296 (moderate to vigorous) | |

CPM, counts per minute.
Source: Evenson et al.41

General statistical considerations
The primary outcome analysis and all secondary outcome analyses were conducted at the conventional (two-sided) 5% level and, correspondingly, all presented CIs are 95% CIs. As specified in the statistical analysis plan, no formal adjustment for multiple testing among the secondary end points was used, as outcomes are likely to be highly correlated, so that standard adjustment techniques, such as the Bonferroni method, would be conservative. All outcome measures and child/parent characteristics were summarised by the trial allocation group and, for outcome measures, by follow-up period. The distribution of outcome data was investigated and transformations applied, if necessary, before performing statistical tests and modelling. The main analysis of clinical outcome data was the comparison of change from baseline between treatment groups. Using the change from baseline has a ‘normalising’ effect so that standard techniques could generally be used.
Unless otherwise stated, all analyses were performed on an intention-to-treat basis; that is, all participants were analysed in the arm to which they were allocated and were included regardless of whether or not the treatment and follow-up schedule was complied with. This reflects the pragmatic nature of the trial and ensures that conclusions drawn from the analysis will reflect the impact of the Families for Health programme in a real-world setting.
The statistician conducting the statistical analysis was unblinded for the final statistical analysis and the analysis presented at the final DMEC/TSC meeting. The multilevel models originally proposed to be adopted in the analyses (see Primary outcome analysis) consist of two hierarchical levels in the usual-care arm and three levels in the intervention arm. For these models to be fitted, the participants’ group affiliation needed to be revealed. Unblinding was agreed to by the TSC.
The entire statistical analysis was conducted using SAS v9.4 TSL1M2 (SAS Institute Inc., Cary, NC, USA) with the exception of the sample size calculation, which was conducted using R versions 2.10 and 3.0 (R Foundation for Statistical Computing, Vienna, Austria).
Primary outcome analysis
As indicated above, the primary outcome for the statistical analysis was the mean change in child BMI z-score after 12 months of follow-up compared between treatment arms. It was anticipated that these data would be correlated within families (if more than one child of a family participates in the trial) and within delivery groups in the Families for Health intervention arm. The analysis allowed for this clustering in order to obtain unbiased estimation of the treatment effect and its standard error (SE).67
A three-level hierarchical mixed-effects model was proposed in the statistical analysis plan. At the highest level of the model, a random effect for delivery group was intended. Delivery group-level clustering would have been allowed for in the Families for Health arm only, as usual-care interventions varied by site, were not necessarily group based and precise details on usual-care treatment received were generally not available. As the statistical analysis showed, there was no evidence of delivery-group clustering and models comprising this random effect failed to converge (see Chapter 4, Main primary outcome analysis). The decision was therefore made, and approved by the TSC, to remove this effect from all hierarchical modelling and to use a two-level hierarchical mixed-effects model instead. Correlation between measurements of children within family was allowed for in both arms.
The multilevel model was adjusted for the child-level characteristics, baseline BMI z-score and sex, and the family-level variable ‘locality’ as fixed effects. Additionally, adjustment for the family-level characteristic socioeconomic status (SES) and child-level characteristic ethnicity was explored. Restricted maximum likelihood estimation was employed for estimating covariance parameters in the multilevel modelling. The Satterthwaite approximation68 was used for computing the denominator degrees of freedom for the test of a treatment effect difference between the allocation groups (and for tests of other fixed effects). The primary analysis is a complete case analysis in the sense that if either the baseline or 12-month follow-up z-score was missing the subject had to be omitted from the analysis.
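For orientation, the two-level model described above can be written out as an equation. The notation is not taken from the trial documentation and is assumed here for illustration: i indexes children, j indexes families, T_j is the allocation indicator (1 for Families for Health, 0 for usual care), z^(0) is the baseline BMI z-score and u_j is the family-level random intercept.

```latex
\[
\Delta z_{ij} = \beta_0 + \beta_1 T_j + \beta_2 z^{(0)}_{ij} + \beta_3\,\mathrm{sex}_{ij}
  + \sum_{k} \gamma_k\,\mathrm{locality}_{jk} + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)
\]
```

In this sketch, the treatment effect of interest is \beta_1, and the originally planned three-level model would add a further random intercept for delivery group in the intervention arm.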
Sensitivity analyses for the primary outcome
Several sensitivity analyses were undertaken to assess the robustness of the primary outcome analysis to areas of uncertainty surrounding it. These involved re-estimating the treatment effect under the following scenarios: (1) conducting a per-protocol analysis in which families having participated in five or more sessions of the Families for Health programme are regarded as ‘programme completers’ (i.e. as having complied with the protocol sufficiently); and (2) (multiple) imputation of missing primary outcome data. Three standard imputation techniques were employed to assess the sensitivity of the analysis to the missing data: first, simple regression imputation, in which missing values are imputed by predicted values from a linear regression model using the same predictors as the primary outcome analysis; second, multiple imputation using the Markov chain Monte Carlo method;69 and, third, multiple imputation using fully conditional specification regression.70
For the two multiple imputation analyses, 200 burn-in iterations were used and estimates averaged over 100 imputed data sets. Baseline BMI z-score, age, sex and site were included as explanatory variables in the imputation models. Imputations were generated separately for each treatment group.
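Combining results across imputed data sets is conventionally done with Rubin's rules, in which the pooled estimate is the average of the per-imputation estimates and the pooled variance adds a between-imputation component. The sketch below illustrates that convention in Python. The text above states only that estimates were averaged, so the variance combination shown here is an assumption, and the function name and the numbers in the usage example are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and squared standard errors (Rubin's rules)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputed data sets")
    q_bar = mean(estimates)      # pooled point estimate: simple average
    within = mean(variances)     # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, sqrt(total)


# Hypothetical treatment-effect estimates from three imputed data sets
pooled_estimate, pooled_se = pool_rubin(
    [-0.10, -0.14, -0.12],
    [0.0025, 0.0025, 0.0025],
)
print(f"pooled estimate {pooled_estimate:.3f}, pooled SE {pooled_se:.3f}")
```

With 100 imputed data sets, as used in the trial, the factor (1 + 1/m) is close to 1, so the pooled variance is approximately the sum of the within- and between-imputation variances.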
Subgroup analyses
To explore heterogeneity in the trial population, the following prespecified exploratory subgroup analyses were conducted with respect to the primary outcome:
• child’s sex (male or female)
• locality (site)
• SES (according to The National Statistics Socio-Economic Classification’s55 four-class standard classifications)
• parent’s BMI at baseline (normal, overweight or obese)
• age of child at baseline (6–8 years or 9–11 years).
The difference in treatment effects by subgroups was initially assessed by interaction tests. These were performed via significance tests of interaction terms in the hierarchical model utilised for the primary outcome analysis. Variables that had been categorised for subgroup analyses (e.g. age, parent BMI) were also investigated as covariates on their original scale. Separate models were then fitted for each subgroup to obtain estimates of the treatment effects within subpopulations.
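As an illustration of the interaction test, a hierarchical model of this kind can be extended with a subgroup main effect and a treatment-by-subgroup interaction term. The notation below is assumed for illustration and is not taken from the trial documentation: i indexes children, j indexes families, T_j is the allocation indicator, S_ij is a binary subgroup indicator (e.g. sex), x_ij collects the remaining adjustment covariates and u_j is the family-level random intercept.

```latex
\[
\Delta z_{ij} = \beta_0 + \beta_T T_j + \beta_S S_{ij} + \beta_{TS}\,(T_j \times S_{ij})
  + \mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + u_j + \varepsilon_{ij},
\qquad H_0\colon \beta_{TS} = 0
\]
```

Under this sketch, rejecting H_0 would indicate that the treatment effect differs between the two subgroup levels; a subgroup factor with more than two levels would need one interaction coefficient per non-reference level, tested jointly.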
Repeated measures modelling
An exploratory analysis was performed to investigate the difference between arms in terms of change over time in the primary outcome measure, rather than a comparison between arms at either 3- or 12-month follow-up. In this analysis, the time at which follow-up data were provided was fitted as a continuous variable, accounting for the fact that the actual times varied widely (especially for the 3-month follow-up). For this purpose a repeated measures mixed model was fitted. This model was based on the aforementioned hierarchical model for the primary outcome analysis, comprising the same effects (plus time). Model complexity was increased as time became the new first-level (random) factor. The same model specification as in the primary outcome multilevel model (restricted maximum likelihood estimation) was used where possible. An unstructured covariance matrix was assumed for the correlation between measuring time points.
Analysis of baseline demographic data
Baseline demographic outcomes were obtained on a family level or on a child/parent level. Summary statistics (mean and SD for continuous variables and absolute number and percentage for categorical variables) were calculated for all participants recruited to the trial and for the two treatment groups separately. The characteristics within the two treatment groups were compared using t-tests and chi-squared tests for continuous and categorical variables, respectively. The statistical comparison of baseline characteristics was considered exploratory rather than confirmatory and ignored clustering by delivery group or family. The main intention was to identify baseline variable differences that might be deemed relevant by the main investigators and would consequently require adjustment in the multilevel models fitted in the primary and secondary outcome analyses.
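The two kinds of baseline comparison can be sketched with the Python standard library alone. This is a minimal illustration and not the trial's SAS code: the text does not say whether a pooled-variance or unequal-variance t-test was used, nor whether a continuity correction was applied, so the pooled-variance statistic and the uncorrected Pearson statistic below are assumptions, and all data are hypothetical.

```python
from math import erfc, sqrt
from statistics import mean, variance


def chi_squared_2x2(table):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Upper tail of a chi-squared distribution with 1 degree of freedom
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


def pooled_t_statistic(x, y):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df


# Hypothetical baseline data: sex by arm (counts) and age by arm (years)
sex_by_arm = [[18, 22], [21, 19]]
age_intervention = [7.1, 8.4, 9.0, 10.2, 6.8]
age_usual_care = [7.9, 8.8, 9.5, 10.9, 7.2]

print(chi_squared_2x2(sex_by_arm))
print(pooled_t_statistic(age_intervention, age_usual_care))
```

The t statistic would still need to be referred to a t distribution with the returned degrees of freedom to obtain a p-value, which the standard library does not provide.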
Secondary outcome analyses
All secondary outcomes, including subscales of questionnaires, were summarised by trial allocation group and follow-up period using means, SDs and 95% CIs for continuous variables and absolute numbers, percentages and 95% CIs for categorical variables. Statistical tests of differences for child secondary outcomes were performed using the same hierarchical mixed-effects model as the primary outcome analysis. In addition, child BMI z-score was also compared as change from baseline to 3-month follow-up and change from baseline to 12-month follow-up using independent group t-tests. Parent outcomes were compared using t-tests and chi-squared tests, as appropriate, as generally only one parent provided data and within-family correlation could not be modelled. In case parents and children provided data for the same outcome, the simpler test was used to ensure comparability. Where the analysis of parent outcomes was adjusted, hypothesis testing was done within a linear regression model and the adjustment variables provided in the respective results section.
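A summary of a continuous outcome by allocation group and follow-up period could be produced along the following lines. This is a hedged sketch: the text does not state how the 95% CIs were computed, so the normal-approximation interval used here is an assumption (a t-based interval would be wider in small groups), and the data and names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def summarise(values, level=0.95):
    """Return n, mean, SD and a normal-approximation CI for the mean."""
    n = len(values)
    m, sd = mean(values), stdev(values)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sd / sqrt(n)
    return {"n": n, "mean": m, "sd": sd, "ci": (m - half_width, m + half_width)}


# Hypothetical change-from-baseline scores keyed by (arm, follow-up period)
changes = {
    ("intervention", "12 months"): [-0.20, -0.05, 0.10, -0.15],
    ("usual care", "12 months"): [-0.10, 0.00, 0.05, -0.15],
}

for key, values in changes.items():
    s = summarise(values)
    low, high = s["ci"]
    print(key, f"n={s['n']} mean={s['mean']:.3f} SD={s['sd']:.3f} CI=({low:.3f}, {high:.3f})")
```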