2. The Aims and Designs of Value-Added Measures
2.3 Value-Added Model Designs
2.3.4 Multi-level Value-Added Models
By the late 1980s, educational effectiveness researchers were calling attention to methodological issues with the use of pupil-level models (Aitkin and Longford, 1986). These technical problems are detailed in Appendix A3. This section concentrates on explaining the difference between pupil-level and multi-level models at a less technical level.
As described above, it is possible to include school mean scores or other variables relating to larger organisational units (e.g. a local authority) within the OLS pupil-level model (e.g. Muñoz-Chereau and Thomas, 2015). In this sense, pupil-level models can be ‘multi-level’ and consider factors at school or other levels. This is not, however, the sense in which the term multi-level is conventionally used. The pupil-level models above are single-level models in the sense that they are ‘blind’ to the hierarchical nature of the data and cannot take group membership into account during the estimation process. This has two main consequences:
First, as was touched on when discussing school-level models, this limits the analytical possibilities of the model and prevents relationships within the data from varying at the level of the school. If, for example, one wanted to examine whether a school’s effectiveness varied across the ability range, this would be difficult to ascertain using pupil-level data for large numbers of schools (see example in Appendix A3). Moreover, relationships between factors may differ at different levels of analysis (see technical note on school-level models in Appendix A1). Analysis at only one level, therefore, has the potential to yield misleading results about effectiveness factors (Snijders and Bosker, 2011). These problems can often be addressed to some degree in pupil-level models through subsequent analysis of the residuals, use of school-level variables and the inclusion of interaction terms, but this is generally unfeasible for larger samples or more complex models.
The second limitation of pupil-level models is that non-independence of observations violates the assumptions underpinning statistical tests within the model (Aitkin and Longford, 1986). In short, the problem is that a pupil-level model is unable to account for group membership and so treats each pupil as independent. This is to assume that two pupils from the same school are not expected to be any more similar than two pupils from different schools. When one allows for correlation between pupil-level errors within schools, larger standard errors are produced and so the stringency of statistical tests tends to be higher in a multi-level framework (Snijders and Bosker, 2011; Goldstein, 1997). The effect on the estimates of the fixed effects (see below) tends to be fairly small (Snijders and Bosker, 2011).
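The scale of this problem can be illustrated with a rough sketch. The code below does not fit a multi-level model. It uses the standard design-effect approximation from survey sampling, 1 + (n - 1) × ρ, which assumes equal-sized schools and a known intra-school correlation ρ. The function names and the input values are hypothetical and chosen purely for illustration.

```python
import math


def design_effect(pupils_per_school: float, icc: float) -> float:
    """Design-effect approximation: 1 + (n - 1) * rho.

    Assumes every school has the same number of pupils and that
    `icc` (the intra-school correlation) lies between 0 and 1.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (pupils_per_school - 1.0) * icc


def effective_sample_size(n_pupils: float, pupils_per_school: float, icc: float) -> float:
    """Number of independent pupils the clustered sample is 'worth'."""
    return n_pupils / design_effect(pupils_per_school, icc)


def se_inflation(pupils_per_school: float, icc: float) -> float:
    """Factor by which a naive (independence-assuming) standard error
    would need to be multiplied under this approximation."""
    return math.sqrt(design_effect(pupils_per_school, icc))


# Illustrative, made-up values: 100 schools of 26 pupils each, rho = 0.2.
deff = design_effect(pupils_per_school=26, icc=0.2)
n_eff = effective_sample_size(n_pupils=2600, pupils_per_school=26, icc=0.2)
inflation = se_inflation(pupils_per_school=26, icc=0.2)

print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.0f} of 2600 pupils")
print(f"standard error inflation factor: {inflation:.2f}")
```

Under these assumed values, a model that treats pupils as independent would behave as though it had several times more information than the clustered data actually provide. With ρ = 0 the design effect is 1 and the two approaches agree.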
Critics have questioned whether multi-level models are any improvement on simpler methods in practice (Gorard, 2007). Moreover, Chapter 4, Section 4.3.3, examines the suitability and value of statistical techniques, raising several serious limitations. Nonetheless, whether or not multi-level models are as essential and valuable as is claimed, they have been overwhelmingly preferred in educational effectiveness research.
Outputs
Multi-level models offer two classes of output. First, coefficients on the fixed effects of the model show relationships between the dependent variable (pupil performance) and the independent variables included in the model, such as pupil prior attainment or pupil background variables. As fixed effects, these hold (on average) across the sample. Educational effectiveness researchers are typically interested in additional effectiveness factors which have been measured and included in the model, and in the strength of the relationships between these and performance. Isac et al. (2013, p.29), for example, use ‘an educational effectiveness approach’ to estimate the effect of a range of school factors (e.g. exposure to political and social issues information) and non-school factors (e.g. socio-economic status) on various student outcomes related to citizenship education (e.g. civic knowledge).
The second group of outputs which one might obtain from multi-level models are the ‘random effects’. Note that the use of causal language in ‘effect’ might be misleading given that these are model residuals. Simple random effects include the residuals partitioned into the school and pupil levels in the model, the former being school value-added. Examples of the use of random effects include the creation of school effects as value-added scores in the English performance tables and studies such as Noyes (2013), which examined school effects on mathematics performance and on post-16 participation. Random effects can also be estimated for school-level differences in the fixed effect coefficients. These can be understood as interaction effects between specific schools and the factor in question (see Appendix A3).
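For readers who find notation helpful, the two kinds of random effect described above can be written as a generic two-level model. This is a minimal sketch in conventional textbook notation, which is an assumption here and may differ from the notation used in Appendix A3: y_{ij} is the outcome for pupil i in school j, x_{ij} is a pupil-level predictor such as prior attainment, u_{0j} is the school-level residual (the school value-added) and e_{ij} is the pupil-level residual.

```latex
% Random-intercept model: the residual is split into a school part and a pupil part
\begin{align}
  y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_{0j} + e_{ij},
  & u_{0j} &\sim N(0, \sigma^2_{u0}), \quad e_{ij} \sim N(0, \sigma^2_{e}) \\
% Random-slope extension: the coefficient on x is allowed to differ by school
  y_{ij} &= \beta_0 + (\beta_1 + u_{1j})\, x_{ij} + u_{0j} + e_{ij},
  & u_{1j} &\sim N(0, \sigma^2_{u1})
\end{align}
```

In the second line, u_{1j} is the school-specific departure from the average coefficient \beta_1, which is what is meant by a random effect on a fixed effect coefficient.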
Other outputs which may be of interest are the standard errors of the estimates or other statistics associated with inferential statistical methods. Educational effectiveness researchers draw conclusions about effectiveness factors based in part on these statistical tests (see Chapter 4). Another noteworthy output of multi-level models, which has been examined in a large number of studies (Luyten, 2003), is the partitioning of the residual variance between the various levels in the model (e.g. pupil, class, school) to show the proportion of variance situated at each level. Sometimes the term school effect is used in this sense, as the percentage of variance situated at school level (either including or excluding variance at lower levels such as teacher-level).
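The arithmetic behind this variance partitioning is simple and can be sketched as follows. The variance components below are made-up numbers for illustration only, not estimates from any study, and the function name is hypothetical. The sketch also shows one reading of the 'including or excluding' distinction, in which class-level variance is either added to the school share or kept separate.

```python
def variance_shares(components: dict[str, float]) -> dict[str, float]:
    """Return each level's proportion of the total residual variance.

    `components` maps a level name to its estimated variance component.
    """
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: variance / total for level, variance in components.items()}


# Illustrative, made-up variance components from a three-level model.
components = {"school": 8.0, "class": 12.0, "pupil": 80.0}
shares = variance_shares(components)

# 'School effect' excluding lower levels: the school component alone.
school_only = shares["school"]

# 'School effect' including class-level variance (one possible reading).
school_including_class = shares["school"] + shares["class"]

for level, share in shares.items():
    print(f"{level}: {share:.1%} of residual variance")
print(f"school share excluding class level: {school_only:.1%}")
print(f"school share including class level: {school_including_class:.1%}")
```

The choice between the two school shares matters a great deal for the headline figure, so a reported 'school effect' of this kind needs to state which levels were modelled.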