4. EXPLANATIONS FOR THE RETESTING EFFECT: LEARNING OR

4.1 Learning Effects

4.1.2 Learning Effects in Operational Settings

beyond lab settings into operational settings, both academic and organizational. These learning effects also occur across both knowledge and ability domains in the prediction of external criteria. Of particular relevance to learning, the literature also offers preliminary evidence that retesting not only increases retest performance but may also enhance the criterion-related validity of retests. Although the retesting effect is often conceptualized as a mean increase in test scores that reflects error (i.e., memory effects), this increase in the criterion-related validity of retests raises the question of whether the retest captures additional variance in the underlying construct (i.e., learning) or reduces some other confounding error that initially suppressed test scores. Nevertheless, this evidence is mixed, and learning is difficult to separate from memory effects in the absence of experimental designs and criterion-related validity data.

Research has consistently demonstrated that the retesting effect on knowledge tests extends outside lab settings to long-term educational settings and classroom materials (Carpenter, Pashler, & Cepeda, 2009; Cranney, Ahn, McKinnon, Morris, & Watts, 2009; Glass & Sinha, 2013; Kromann et al., 2009; Metcalfe, Kornell, & Son, 2007; Rawson & Dunlosky, 2011; Rees, 1986; Vojdanoska, Cranney, & Newell, 2010). In fact, the retesting effect also exhibits significant gains even after summative course assessments (Balch, 1998; Daniel & Broida, 2004; Lyle & Crawford, 2011; McDaniel, Agarwal, Huelser, McDermott, & Roediger, 2011; McDaniel, Anderson, Derbish, & Morrisette, 2007; McDaniel, Wildman, & Anderson, 2012). While these studies primarily investigated the retesting effect using formal educational materials and younger samples, evidence indicates that the retesting effect occurs at a similar magnitude across age groups, including primary, secondary, and undergraduate education, and from mid- to late age (Meyer & Logan, 2013).

Retest score increases also occur in applied selection contexts on both knowledge and ability tests. Lievens et al. (2005) reported a retesting effect d of 0.27 for scores on a test of science knowledge used to assess medical school applicants. Similarly,

full standard deviation (d = 0.93, Van Iddekinge et al., 2011). Schleicher et al. (2010) reported a retest increase d of 0.15 on a job-knowledge test used to select applicants for professional jobs within a federal agency. Raymond et al. (2007) found even larger increases on two certification tests completed by medical imaging workers (d = 0.79 and 0.48). Similarly, in the ability domain, Hausknecht et al.’s (2007) meta-analysis reported a retesting effect d of 0.26 across ability measures in selection settings. Retest score increases occur consistently across diverse constructs and settings; however, it is less clear whether these increases consistently reflect learning effects. Mean score increases upon retest do not necessarily imply learning, as they may still reflect memory effects. There is mixed evidence in both academic and selection contexts, across both ability and knowledge tests, that second administrations of tests yield higher scores while also displaying equal if not greater criterion-related validity. Although some differences between validity coefficients were not significant, increasing validity coefficients are not a necessary condition for demonstrating learning. That is, mean retest score increases with no concomitant change in criterion-related validity would still provide evidence of learning. However, the opposite pattern (a significant retest score increase coupled with a significant decrease in validity coefficients) would more clearly indicate the contaminating influence of memory on retest scores.
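The two quantities this argument turns on, the standardized retest gain and the pair of validity coefficients, can be computed side by side. The following is a minimal sketch, not a procedure taken from any of the cited studies: the function names and the six-examinee data set are hypothetical, and d is defined here as the mean gain divided by the pooled standard deviation of the two administrations, which is only one of several definitions in use.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def retest_summary(initial, retest, criterion):
    """Return the standardized gain d and both validity coefficients."""
    # Assumption: d = mean gain / pooled SD of the two administrations.
    pooled_sd = sqrt((stdev(initial) ** 2 + stdev(retest) ** 2) / 2)
    d = (mean(retest) - mean(initial)) / pooled_sd
    return {
        "d": d,
        "r_initial": pearson_r(initial, criterion),  # initial-score validity
        "r_retest": pearson_r(retest, criterion),    # retest-score validity
    }


# Hypothetical scores for six examinees (illustration only, not real data).
initial = [48, 52, 55, 60, 63, 70]
retest = [53, 55, 61, 64, 70, 75]
criterion = [2.4, 2.9, 2.8, 3.2, 3.5, 3.7]

summary = retest_summary(initial, retest, criterion)
print(summary)
```

Under the reasoning above, a positive d with a retest validity that is equal to or greater than the initial validity is consistent with learning, whereas a positive d with a clearly lower retest validity points toward memory contamination.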

Research within academic settings has found that retest scores are sometimes better predictors of academic criteria than initial test scores, although differences between initial and retest validity are often small (e.g., Allalouf & Ben-Shakhar, 1998; Coyle, 2006; Lievens et al., 2005, 2007; Reeve & Lam, 2005). Within academic settings, ability retest score increases are common while criterion-related validity increases or stays the same (Allalouf & Ben-Shakhar, 1998; Coyle, 2006). For example, ability retest scores correlated more highly with college matriculation exam scores than did initial test scores (Allalouf & Ben-Shakhar, 1998), and retest scores on the SAT correlated more highly with college grade point average (GPA) than did initial SAT scores (r = .54 versus r = .50, respectively, although these two validities were not significantly different; Coyle, 2006).
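Because initial and retest validities are estimated in the same sample against the same criterion, comparing them calls for a test of dependent correlations. The sketch below uses Williams's t as presented by Steiger (1980); it is offered only as an illustration and is not necessarily the test used in the studies cited above. The two validities (.54 and .50) are the values reported for Coyle (2006), but the initial-retest correlation (.80) and the sample size (200) are assumptions chosen for the example, not figures from that study.

```python
from math import sqrt


def williams_t(r_jk, r_jh, r_kh, n):
    """Williams's t for two dependent correlations sharing variable j.

    r_jk: criterion with retest score
    r_jh: criterion with initial score
    r_kh: retest score with initial score
    n:    sample size
    Returns (t, df) with df = n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r_jk**2 - r_jh**2 - r_kh**2 + 2 * r_jk * r_jh * r_kh
    r_bar = (r_jk + r_jh) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r_kh) ** 3
    t = (r_jk - r_jh) * sqrt(((n - 1) * (1 + r_kh)) / denom)
    return t, n - 3


# r = .54 and .50 come from the text; r_kh = .80 and n = 200 are
# hypothetical values used only to make the example runnable.
t_stat, df = williams_t(0.54, 0.50, 0.80, 200)
print(t_stat, df)
```

The resulting statistic is referred to a t distribution with n - 3 degrees of freedom. The outcome depends heavily on the initial-retest correlation and the sample size, which is one reason small validity differences often fail to reach significance.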

Higher validity coefficients for retest scores are not confined to academic settings. For example, in a sample of law enforcement applicants, Hausknecht et al. (2002) found that the second administration of a GMA test showed greater, although not significantly greater, criterion-related validity for training performance than the initial administration (e.g., r = .31 versus .27, respectively; limited only to applicants who chose to retest). Similarly, Lievens et al. (2005) found that medical school applicants’ retest scores on a science-knowledge test were significantly more predictive of subsequent grade point average than initial test scores (r = .21 versus .11; limited only to applicants who chose to retest). Van Iddekinge et al. (2011) likewise found that scores on a second job-knowledge test predicted job performance considerably better than initial scores (r = .38 vs. .27; limited only to applicants who chose to retest) and, more generally, that second administration scores were comparable in criterion-related validity when evaluated against numerous alternate operationalizations (i.e., most recent test scores, highest test scores, or the mean of initial and retest scores).

Thus, research in both laboratory and operational settings demonstrates that retest score increases occur across response formats and test forms, and that retest scores predict relevant external criteria. Nevertheless, little research has directly pitted this learning explanation against memory in laboratory or operational settings.