In general, because of its mandates about standardized ELP assessment and disaggregated reporting, NCLB has played a major role in shaping research about ELs in the 21st century. In addition to codifying reclassification as a formalized event, NCLB’s mandates also forced states to pay attention to ELs’ progress and performance, and ensured that certain information about these students would be included in state-level datasets. In the fifteen years during which NCLB was law, research on these students proliferated (and it continues to do so, despite a new ESEA reauthorization in December 2015). Over the same period, however, many experts came to recognize how complex and difficult it can be to devise designs and methods that yield valid, meaningful information about ELs’ performance and progress. Thus, the research of the past fifteen years also represents something of a learning curve for the field, as researchers have refined their thinking about how to validly structure their EL data, apply their models, and interpret their findings.
Choosing valid research designs to study reclassification can be surprisingly tricky, because of the way the EL subgroup works. Even before NCLB, Linquanti (2001) identified an important “redesignation dilemma,” which is that high achieving ELs are more likely to be reclassified, while struggling ELs are more likely to stay in the subgroup. This dynamic, while obvious, can distort the data, particularly since
accountability data are primarily cross-sectional in nature, and do not track individual students over time. At the group level, this dynamic will make it consistently appear that
reclassified ELs do very well, which suggests that the EL programs and reclassification are functioning well and preparing students to succeed upon exiting. What it masks, however, is the fact that the same lower achieving ELs are remaining in the EL subgroup year after year without exiting. Thus, rather than each student making steady progress towards English proficiency, higher-achieving students are passing through and leaving these “long-term ELs” behind (Menken, Kleyn, & Chae, 2012; Olsen, 2010). This suggests, by contrast, that the programming is actually not working, precisely for the students who need the most help.
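The dynamic described above can be made concrete with a small sketch. The cohort, the scores, the yearly gains, and the exit cut score below are all invented for illustration and are not drawn from any study cited here; the point is only to show how yearly snapshots can look healthy while the same students remain in the subgroup.

```python
# Hypothetical cohort: (student id, starting ELP score, yearly score gain).
# All values are made up for illustration.
COHORT = [("A", 70, 12), ("B", 65, 10), ("C", 60, 8), ("D", 40, 2), ("E", 35, 1)]
EXIT_SCORE = 80  # assumed reclassification cut score


def simulate(cohort, exit_score, years):
    """Return one snapshot per year listing who is still an EL and who has exited."""
    snapshots = []
    exited = set()
    for year in range(1, years + 1):
        for sid, start, gain in cohort:
            # A student exits the first year their score reaches the cut score.
            if sid not in exited and start + gain * year >= exit_score:
                exited.add(sid)
        still_el = [sid for sid, _, _ in cohort if sid not in exited]
        snapshots.append({"year": year, "still_el": still_el, "exited": sorted(exited)})
    return snapshots


snapshots = simulate(COHORT, EXIT_SCORE, years=5)
for snap in snapshots:
    print(snap)
```

In this toy cohort, one higher-achieving student exits in each of the first three years, so a year-by-year report of reclassified students looks encouraging. Only by following individual students does it become visible that the same two students are still in the subgroup after five years.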
The primary implication of this dynamic is that typical accountability data, which group ELs only by grade, are not enough on their own to give an accurate picture of ELs’ progress. A cross-sectional sample of third grade ELs, for example, might include students who have been ELs for five years, three years, and one year, and who have low, medium, and near-fluent levels of English proficiency. Substantively, it matters which of these students make progress or exit the subgroup. In particular, if students who have already received years of instruction are not likely to exit, this suggests that something is not working in either the instruction, the reclassification standard, or both.
A related, secondary challenge in measuring ELs’ language development longitudinally is identifying an appropriate outcome measure for such studies. Since ELPA participation ceases within two years of being reclassified, and since reclassification may occur at any grade level after any length of time in the subgroup, there is no common outcome datum for all ELs other than reclassification itself. That event, moreover, may come at the end of a trajectory of many or few years spent taking the ELP assessment and participating in language instruction. Those differences matter for both design and interpretation.
It took a few years for researchers to appreciate these dynamics in their data. In time, however, states and researchers learned that the time an EL had spent in the subgroup needed to be accounted for in their designs, and the field has increasingly converged on a few appropriate methods for these purposes. First, descriptive analyses of state data are fairly common, as they can effectively provide a portrait of who is transitioning, when, and after how long, even if they cannot answer questions about what factors predict reclassification. The remaining two sections include several studies that are based solely on descriptive analyses of state datasets (American Institutes for Research, 2013; Carroll & Bailey, 2015; Grissom, 2004; Massachusetts Department of Elementary and Secondary Education & DePascale, 2012).
Second, discrete-time survival analysis (also referred to as event-history modeling) has increasingly been used to study the time it takes for students to be reclassified. The primary advantage of survival analysis is its capacity to model censoring within a dataset; that is, to account for the fact that some subjects may not experience the outcome of interest within an observation period. In the context of reclassification, it can distinguish between an EL who has been in the subgroup for 8 years and then is reclassified, and one who has been in the subgroup for 8 years and still has not been reclassified when the observation period of a study ends. Survival analysis also uses as its outcome the probability (referred to as the hazard) that the event will occur for a given subject in a given time period, provided that it has not already occurred. Because this probability is the outcome, and it is defined for every subject in every period, this solves the aforementioned problem of needing a common outcome variable to predict in these longitudinal studies. At least five studies to date have used survival analysis to study reclassification in the EL K-12 context, all of which will be discussed in the final literature review section (Cook et al., 2012; Parrish et al., 2006; Slama, 2014; Thompson, 2012; Umansky & Reardon, 2014).
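As a rough illustration of how censoring enters a discrete-time hazard estimate, the sketch below computes a simple life-table hazard for each year in the subgroup. The records are hypothetical and the function name is an assumption made for this example; it is not the procedure of any of the studies cited above, which typically fit regression-based hazard models with covariates.

```python
from collections import Counter

# Hypothetical records: (years observed in the EL subgroup, reclassified?).
# False means the student was still an EL when observation ended (censored).
RECORDS = [(2, True), (3, True), (3, False), (5, True), (8, True), (8, False)]


def discrete_time_hazard(records):
    """Life-table estimate: {year: (at_risk, events, hazard)}.

    The hazard in a year is the share of students still at risk at the start
    of that year who are reclassified during it. Censored students count in
    the risk set through their last observed year, then drop out without
    being treated as reclassified.
    """
    events = Counter(year for year, reclassified in records if reclassified)
    leaving = Counter(year for year, _ in records)  # reclassified or censored
    at_risk = len(records)
    table = {}
    for year in range(1, max(leaving) + 1):
        if at_risk == 0:
            break
        table[year] = (at_risk, events[year], events[year] / at_risk)
        at_risk -= leaving[year]
    return table


table = discrete_time_hazard(RECORDS)
for year, (at_risk, n_events, hazard) in table.items():
    print(f"year {year}: at risk={at_risk}, reclassified={n_events}, hazard={hazard:.2f}")
```

In these invented data, the two students observed for 8 years both count in the year-8 risk set, but only the reclassified one counts as an event. That is the distinction between a reclassified student and a censored one that the text describes.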