
EVALUATION 1: NEW BEGINNINGS (SCHOOLS 1, 2, 3 AND 4)

5.4 Discussion of methodological implications and limitations

5.4.1 Threats to internal validity

As discussed in the methodology chapter, the chosen research design was the most realistic in an applied research context, but it was a compromise, as it is more vulnerable to threats to internal validity than designs such as RCTs. These threats weaken confidence in the experimental design, meaning that any effects at post-test could be attributed to factors other than the intervention being studied. In the present study, it would be tempting to conclude that the Getting On and Falling Out intervention caused the pre- to post-test gains seen in the experimental group but not in the control group. However, Campbell and Stanley (1963) argue that it is essential for designers of quasi-experiments to be aware of alternative explanations for their results. The following section explores some competing explanations.

Campbell and Stanley (1963) argue that the non-equivalent control group design controls for the following threats to internal validity: history, maturation, testing, instrumentation, selection and mortality (see table 3.1). However, they suggest that selection-maturation interactions are a definite weakness and that regression is a possible source of concern in this design.

Regarding selection-maturation interactions, in the present study the intervention group were selected for the Getting On and Falling Out intervention and the control group were on the waiting list for the Going for Goals intervention. However, the needs of the intervention group might have been greater than those of the control group. For example, children might have been chosen for Getting On and Falling Out not because of the specific content of the programme, but because they were in more urgent need of intervention. If this was the case (which is plausible given the pre-test differences between the intervention and control groups on empathy and behaviour), the intervention group gains could have occurred as a result of the maturation of a group with extreme scores, rather than due to receiving the intervention.

Regarding regression, Campbell and Stanley (1963) advise against attempting to control for pre-test differences between groups by matching if this is not accompanied by random assignment to conditions. This pitfall was avoided in the present study; however, regression to the mean is still a possible explanation for the Getting On and Falling Out findings. Barnett et al (2005) warn that regression to the mean is a problem in many repeated-measures studies, as an extreme score is likely to be followed by a score that is closer to the mean. They recommend using an analysis of covariance to overcome the effects of regression to the mean, but this could not be used with the data collected. Regression to the mean therefore remains a plausible explanation for the gains from pre- to post-test.
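The mechanism Barnett et al (2005) describe can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not an analysis of the study's data: the function name, the score distribution (mean 50, standard deviation 10), the equal amount of measurement noise and the 20% selection cut-off are all hypothetical choices made for illustration. No intervention is simulated, yet the pupils selected for their low pre-test scores still show an apparent gain at post-test.

```python
import random
import statistics


def simulate_regression_to_mean(n_pupils=10_000, cutoff_fraction=0.2, seed=0):
    """Return (pre_mean, post_mean) for pupils selected on low pre-test scores.

    All values are hypothetical. No intervention effect is modelled, so any
    pre- to post-test change in the selected group is regression to the mean.
    """
    rng = random.Random(seed)

    # Each pupil has a stable underlying score (assumed mean 50, SD 10).
    true_scores = [rng.gauss(50, 10) for _ in range(n_pupils)]

    # Each test occasion adds independent measurement noise (assumed SD 10).
    pre = [t + rng.gauss(0, 10) for t in true_scores]
    post = [t + rng.gauss(0, 10) for t in true_scores]

    # Select the pupils with the lowest pre-test scores, mimicking selection
    # of those judged to be in most urgent need.
    order = sorted(range(n_pupils), key=lambda i: pre[i])
    selected = order[: int(n_pupils * cutoff_fraction)]

    pre_mean = statistics.mean(pre[i] for i in selected)
    post_mean = statistics.mean(post[i] for i in selected)
    return pre_mean, post_mean


if __name__ == "__main__":
    pre_mean, post_mean = simulate_regression_to_mean()
    print(f"Selected group pre-test mean:  {pre_mean:.1f}")
    print(f"Selected group post-test mean: {post_mean:.1f}")
```

Under these assumptions the selected group's post-test mean moves back towards the overall mean even though nothing was done to the pupils, which is why a gain in a group chosen for extreme scores cannot, on its own, be attributed to the intervention.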

5.4.2 Threats to external validity

Campbell and Stanley (1963) suggest that testing-intervention interactions are a definite threat to external validity for non-equivalent control group designs, and that selection-intervention interactions and reactive arrangements are a possible source of concern in this design.

The testing-intervention interaction refers to the effect that the pre-test has on the effectiveness of the intervention, meaning that the benefits of the intervention cannot be generalised to participants who have not had the pre-test. However, Campbell and Stanley (1963) suggest that this is a particular threat in studies of attitude change and less of a threat in education, where assessment is more typical. This threat to external validity could be overcome by replicating the study with different outcome measures, as both this study [...] ELAI [...].

The selection-intervention interaction is the likelihood that the selection of participants affects the results, meaning that the benefits of the intervention cannot be generalised to other participants or settings. Campbell and Stanley (1963) suggest that schools which agree to take part in research are not representative and are more likely to have higher staff morale, lower fear of inspection and more zeal than most schools. They advocate researchers being clear about how many schools were approached, as the author does in figure 3.1. Mertens and McLaughlin (2004) argue that, where random sampling from the target population has not occurred, inferences beyond the sample are affected. This is certainly relevant to the present study. A convenience sample was used for practical and ethical reasons, but this results in a lack of generalisability as participants may not reflect the wider population.

Reactive arrangements are the effects of the artificiality of the experiment, meaning that results cannot be generalised beyond the experimental situation. Campbell and Stanley (1963) advise that these effects are worse in situations where the intervention or staff are unusual, but in this case both the intervention and the group leaders were part of normal school practice.

Another threat to generalisability in this study is sample size. Although there were some positive results for the Getting On and Falling Out intervention, these were based on an intervention group of 12 pupils and a control group of 11 pupils.

Although external validity is a concern, there are some reassurances. Firstly, replication is one way to improve generalisability and this study attempts to [...] evaluation. Secondly, since the study cannot be generalised beyond the sample, the participants have been described in detail so that readers can decide how similar they are to other populations, or can choose to replicate the study with a very different group of people.

5.4.3 Limitations of the study

A common limitation of educational research studies is their small scale. This is also the case in the present study, particularly for the Getting On and Falling Out evaluation and for sub-group analyses. However, it is hoped that one effect of the Development and Research project will be to aggregate trainee EP research, which might help to overcome this problem.

Another limitation is the lack of randomisation in sampling and in allocation to groups. This has resulted in threats to the internal and external validity of the quasi-experiment, which affect its interpretation.

A further criticism could be that self-report and informant-report measures were used, rather than direct measures of behaviour (for instance, role plays of social situations or direct observations of playground behaviour), as some researchers argue that direct measures are more objective. However, direct measures concern only observable behaviour rather than cognitions, and the problems of measuring social and emotional skills were discussed in section 3.4.2. Self-report measures were chosen over direct measures because they were felt to be more time-efficient and because they relate to the theory of trait, rather than ability, EI. Also, Humphrey et al (2008) used role play measures in case study schools and found no effects.

Another issue with measurement is that self-awareness is a component of emotional literacy. Therefore, if a child improved their emotional literacy by becoming more self-aware, their score on that component may have increased but their overall emotional literacy score might have decreased as a result of more realistic ratings in other areas. Since the pupil ELAI only gives total emotional literacy results, this may mask improvements in self-awareness.
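The masking effect can be shown with a small worked example. This is a hypothetical sketch only: the component names other than self-awareness, the rating values and the simple summed total are assumptions made for illustration and are not taken from the pupil ELAI or from the study's data.

```python
def total_score(ratings):
    """Sum component ratings into a single total (assumed scoring rule)."""
    return sum(ratings.values())


# Hypothetical self-ratings for one child; all values are invented.
pre_test = {"self_awareness": 2, "other_area_1": 4, "other_area_2": 4}

# After becoming more self-aware, the child rates self-awareness higher
# but gives more realistic (lower) ratings in the other areas.
post_test = {"self_awareness": 4, "other_area_1": 3, "other_area_2": 2}

self_awareness_change = post_test["self_awareness"] - pre_test["self_awareness"]
total_change = total_score(post_test) - total_score(pre_test)

if __name__ == "__main__":
    print(f"Change in self-awareness rating: {self_awareness_change:+d}")
    print(f"Change in total score:           {total_change:+d}")
```

With these invented values the self-awareness rating rises while the total falls, so an instrument that reports only the total would register a decline and give no sign of the improvement in self-awareness.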

A final criticism of the study is that it compared the SEAL small group interventions with a waiting list control group, rather than with an alternative intervention. This decision was also made on practical grounds, as schools were unlikely to be able to staff two interventions simultaneously.