
The quantity of missing data was much lower within the impact analyses for the primary KS2 maths outcome (414 cases missing, 6.6% of all baseline cases) compared with the interim CT test (2,391 cases missing, 38.4%). The impact of missing data on the balance between the intervention and control group samples is illustrated descriptively in Table 10 in terms of FSM status, gender and KS1 maths attainment. Statistics from four samples are shown for each of these factors. First, at the top, statistics for the full sample at baseline are shown. Below this, statistics from the sample included in the main ITT impact analyses for the primary outcome (KS2 maths) are shown. Below this, statistics based on two samples that relate to the impact analyses for the interim CT test outcome are shown: first, statistics from the raw sample of pupils in the complete sample of 40 intervention schools and 41 control schools with CT test data; second, statistics from the 'complete pairs' restricted subsample of pupils in the 31 intervention schools and their 31 matched control schools with CT test data.
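The missing-data percentages quoted above follow directly from the sample sizes. As a quick check, the sketch below recomputes them; it assumes the analysed sample sizes are the N values given in the row labels of Table 10 (5,818 and 3,841), and the names `BASELINE_N`, `ANALYSED_N` and `missing_summary` are illustrative only.

```python
# Counts taken from the text and from the row labels of Table 10.
BASELINE_N = 6232
ANALYSED_N = {
    "KS2 maths (primary ITT)": 5818,
    "CT test (raw sample)": 3841,
}


def missing_summary(baseline_n, analysed_n):
    """Return (number of missing cases, missing cases as % of baseline)."""
    missing = baseline_n - analysed_n
    return missing, 100 * missing / baseline_n


for outcome, n in ANALYSED_N.items():
    missing, pct = missing_summary(BASELINE_N, n)
    print(f"{outcome}: {missing:,} missing ({pct:.1f}% of baseline)")
```

Run as written, this gives 414 missing cases (6.6%) for the primary outcome and 2,391 (38.4%) for the CT test, matching the figures in the text.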

Table 10: Impact of missing data on the balance of intervention and control group samples for KS2 maths and CT test analyses

% Ever Classed as FSM [EVERFSM_ALL]

| Sample | Intervention n/N (missing) | Intervention % | Control n/N (missing) | Control % |
| --- | --- | --- | --- | --- |
| Baseline (N=6,232) | 830/2,895 (91) | 28.7% | 885/3,128 (118) | 28.3% |
| Primary Outcome ITT Analysis (N=5,818) | 788/2,800 (3) | 28.1% | 844/3,013 (2) | 28.0% |
| CT Test (Raw sample, N=3,841) | 517/1,777 (1) | 29.1% | 595/2,062 (1) | 28.9% |
| CT Test (Paired sample, N=3,077) | 435/1,446 (1) | 30.1% | 522/1,629 (1) | 32.0% |

Gender (% Female)

| Sample | Intervention n/N (missing) | Intervention % | Control n/N (missing) | Control % |
| --- | --- | --- | --- | --- |
| Baseline (N=6,232) | 1,436/2,898 (88) | 49.6% | 1,560/3,130 (116) | 49.8% |
| Primary ITT Analysis (N=5,818) | 1,401/2,803 (0) | 50.0% | 1,518/3,015 (0) | 50.3% |
| CT Test (Raw sample, N=3,841) | 885/1,778 (0) | 49.8% | 1,055/2,063 (0) | 51.1% |
| CT Test (Paired sample, N=3,077) | 717/1,447 (0) | 49.6% | 833/1,630 (0) | 51.1% |

Pupil-level (continuous): KS1 Maths Points Score

| Sample | Intervention n (missing) | Intervention mean (sd) | Control n (missing) | Control mean (sd) |
| --- | --- | --- | --- | --- |
| Baseline (N=6,232) | 2,897 (89) | 16.1 (3.44) | 3,128 (118) | 16.0 (3.44) |
| Primary ITT Analysis (N=5,818) | 2,803 (0) | 16.2 (3.35) | 3,015 (0) | 16.2 (3.25) |
| CT Test (Raw sample, N=3,841) | 1,778 (0) | 16.0 (3.45) | 2,063 (0) | 16.2 (3.40) |
| CT Test (Paired sample, N=3,077) | 1,447 (0) | 16.0 (3.46) | 1,630 (0) | 16.0 (3.43) |

As reported above, at baseline the difference between the intervention and control group samples in terms of KS1 attainment was small (an effect size of +0.03 sds). For the primary ITT analyses, which exclude the 414 pupils with missing KS2 or KS1 data, the difference is zero. For the analyses of the interim CT test outcome, within the raw sample, the difference was small but larger than at baseline and in the opposite direction (-0.06 sds). Follow-on sensitivity analyses restricted the sample to complete pairs of intervention and control schools with CT test data. When this was done, the difference returned to zero. As discussed in the randomisation section above, this illustrates how the propensity-score-paired-stratification design is robust to whole-school dropout (15 intervention and 14 control schools here). Specifically, it shows how this design can be used to best ensure good balance (albeit with a reduced sample and hence reduced statistical power).
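The KS1 effect sizes quoted above can be approximately reproduced from the summary statistics in Table 10. The following is a minimal sketch, assuming the effect size is the difference in means divided by the pooled standard deviation (the exact formula used by the authors is not stated in this section). It works from the rounded means and standard deviations in the table, so small discrepancies from figures computed on pupil-level data are possible. The function and variable names are illustrative.

```python
import math


def standardised_difference(n1, mean1, sd1, n2, mean2, sd2):
    """Difference in means (group 1 minus group 2) in pooled-SD units."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# (intervention n, mean, sd, control n, mean, sd) from Table 10, KS1 maths.
KS1_SAMPLES = {
    "Baseline": (2897, 16.1, 3.44, 3128, 16.0, 3.44),
    "Primary ITT": (2803, 16.2, 3.35, 3015, 16.2, 3.25),
    "CT test (raw)": (1778, 16.0, 3.45, 2063, 16.2, 3.40),
    "CT test (paired)": (1447, 16.0, 3.46, 1630, 16.0, 3.43),
}

for label, stats in KS1_SAMPLES.items():
    print(f"{label}: {standardised_difference(*stats):+.2f} sds")
```

Under these assumptions the sketch gives +0.03 at baseline, zero for the primary ITT sample, -0.06 for the raw CT test sample and zero for the paired sample, in line with the values reported in the text.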

After examining the impact of missing values on the baseline balance²⁸, we feel confident that our research design and analysis plan were robust enough to support the findings from the impact analyses for the primary outcome (overall maths attainment) and follow-on secondary outcomes (attainment in the three KS2 maths test papers). For the primary outcome, missing values were examined directly. Missingness was weakly correlated with KS1 attainment in maths (r=-0.21) and overall (r=-0.22), and was more common among male pupils (4.4%) than female pupils (2.6%) and among pupils classed as FSM (4.8%) than pupils not classed as FSM (2.9%). These patterns were consistent across the intervention and control group samples, which is reflected in the excellent balance shown in Table 10 at baseline and for the primary outcome ITT analysis.

²⁸ For the primary outcome, this is illustrated in Table 10 by comparing statistics for the baseline and primary outcome ITT analyses. For the secondary outcome it is illustrated in Table 10 by comparing statistics for the baseline and CT test raw sample analyses. The re-balancing provided by the propensity-score-paired-stratification for the CT test secondary outcome is illustrated in Table 10 by comparing the baseline, raw and paired statistics.

The missing data for the interim CT test outcome are more problematic and meant that an ITT approach for the impact analyses was precluded. The patterns in Table 10 suggest that the planned complete pairs sensitivity analyses will help to ensure a good balance in terms of KS1 mathematics between intervention and control samples. Whilst this does not completely eliminate the risk of bias brought by missing data, we feel that this approach provides a useful way of scrutinising the impact analysis finding.

Descriptive summary

Prior to presenting the multilevel impact analyses, Table 11 presents a descriptive summary of the primary and secondary outcomes for the intervention and control group samples in the ScratchMaths evaluation. In this table, the largest observed impact of ScratchMaths is on the interim CT test, and this is relatively small (effect size = +0.10 sds). For the primary KS2 maths outcome and across the three KS2 maths test papers, the observed impact is close to zero.

It would not be appropriate to use the descriptive summary to determine whether ScratchMaths had a causal impact on KS2 maths attainment. This is because the statistics presented in Table 11 do not take account of how pupils are clustered into schools within geographical areas and into classes within schools, nor do they control for different levels of KS1 maths attainment. However, both the clustering (area/school and class) and KS1 maths attainment are taken into account within the multilevel analyses used to evaluate the causal impact of ScratchMaths, which are presented in the next section.
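To give a sense of why ignoring clustering matters, the sketch below applies the standard design-effect approximation, DEFF = 1 + (m - 1) × ICC, where m is the mean cluster size and ICC is the intraclass correlation. It is an illustration only, not the multilevel model used in the evaluation. The ICC values are hypothetical and do not come from this report. The school count is inferred from the figures quoted earlier (40 + 15 intervention and 41 + 14 control schools). The approximation treats schools as a single level of equally sized clusters, which simplifies the area/school/class structure described above.

```python
def design_effect(mean_cluster_size, icc):
    """Standard approximation: DEFF = 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n_pupils, n_clusters, icc):
    """Pupil sample size deflated by the design effect."""
    mean_cluster_size = n_pupils / n_clusters
    return n_pupils / design_effect(mean_cluster_size, icc)


N_PUPILS = 5818                     # primary ITT sample (Table 10)
N_SCHOOLS = (40 + 15) + (41 + 14)   # inferred from the text, an assumption

# Hypothetical ICC values, chosen only to show the sensitivity.
for icc in (0.0, 0.05, 0.10, 0.20):
    n_eff = effective_sample_size(N_PUPILS, N_SCHOOLS, icc)
    print(f"ICC={icc:.2f}: effective sample size is about {n_eff:,.0f}")
```

Even a modest ICC shrinks the effective sample size well below the number of pupils, which is one reason the naive comparisons in the descriptive summary cannot stand in for the multilevel analyses.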

Table 11: Descriptive summary of ScratchMaths outcome variables

Overall KS2 Maths Attainment

| Outcome | Intervention n (missing) | Intervention mean (sd) | Control n (missing) | Control mean (sd) | E.S. (Hedges g) |
| --- | --- | --- | --- | --- | --- |
| KS2 Maths (Raw Points)¹ | 2,877 (105) | 76.2 (23.85) | 3,111 (133) | 76.5 (23.46) | -0.01 |
| KS2 Maths (Scaled) | 2,877 (105) | 104.9 (7.26) | 3,108 (136) | 105.0 (7.09) | -0.01 |

KS2 Maths Test Papers

| Outcome | Intervention n (missing) | Intervention mean (sd) | Control n (missing) | Control mean (sd) | E.S. (Hedges g) |
| --- | --- | --- | --- | --- | --- |
| KS2 Maths Paper 1 (Arithmetic) | 2,877 (105) | 31.4 (8.08) | 3,112 (132) | 31.8 (7.87) | -0.05 |
| KS2 Maths Paper 2 (Reasoning 1) | 2,878 (104) | 23.8 (8.54) | 3,112 (132) | 23.9 (8.85) | 0.00 |
| KS2 Maths Paper 3 (Reasoning 2) | 2,879 (103) | 21.0 (8.77) | 3,111 (133) | 20.9 (8.68) | +0.01 |

Computational Thinking

| Outcome | Intervention n (missing) | Intervention mean (sd) | Control n (missing) | Control mean (sd) | E.S. (Hedges g) |
| --- | --- | --- | --- | --- | --- |
| CT Test Score (Raw) | 1,820 (1,162) | 4.95 (2.22) | 2,136 (1,108) | 4.73 (2.18) | +0.10 |
| CT Test Score (Complete Pairs) | 1,483 (1,502) | 4.85 (2.21) | 1,688 (1,483) | 4.64 (2.20) | +0.10 |

¹ Note. These are bivariate statistics and so have fewer missing values (238 for the raw KS2 maths primary outcome) compared with the multivariate ITT analysis (414 missing cases, see Figure 5). The supplementary, scaled KS2 maths measure had slightly more missing cases (241).