
internal processes. I pointed out that the major difference between the two kinds of data-elicitation measures may be the level of internal validity of the study with respect to the information on learners’ actual performances during exposure to and/or interaction with the L2 data. Online process measures provide relatively more substantial evidence of the processing and processes being measured than offline measures and thus are, by nature, higher in internal validity. The terms “processing” and “processes” are typically conflated in the SLA literature, and one way to distinguish them may be to view “processing” as an event taking place and “processes” as what (e.g., attention, awareness, knowledge) are being employed during this event. Offline measures can only make inferences as to whether learners, for example, paid attention to, became aware of, or employed prior knowledge during the processing of targeted items in the input and, consequently, constitute a broad-grained measurement of cognitive processing or processes. In addition, one other benefit of using online process measures is the opportunity for qualitative analyses that provide a richer source of information on learners’ internal processes when compared to quantitative analyses. Recently my colleagues and I (Leow, Grey, Marijuan, & Moorman, 2014) provided a critical overview of these three concurrent data-elicitation procedures in the SLA literature, specifically in relation to the early stages of the L2 learning process.

Let us take a brief look at each of these procedures and their benefits and limitations by beginning with the latest arrival on the scene to address the early stages of the L2 learning process: Reaction time.

Reaction Time (RT)

Reaction time (RT) measures have been a popular procedure in psychology and other non-SLA fields since the 1800s and have been used to address a range of issues that include retrieval of information from short-term memory (e.g., Klatsky & Smith, 1972) and long-term memory (e.g., Anderson, 1970), parallel and serial information processing (e.g., Egeth, Marcus, & Bevan, 1972), the psychological representation of semantic and logical representations, naming and letter classification tasks (e.g., Posner, 1978), and selective attention (e.g., Pachella, 1973). The measure itself was also a topic of research, seeking to understand what factors could cause or account for variation in RTs, especially with respect to the speed-accuracy trade-off (e.g., Yellott, 1971), and implicit learning (e.g., Reber & Allen, 1978), such as the co-occurrence of cues and first-order dependencies in sequence structure (e.g., Nissen & Bullemer, 1987), first-order dependencies (e.g., Frensch, Buchner, & Lin, 1994) and higher-order dependencies in sequence structure (e.g., Cleeremans & McClelland, 1991). Additionally, RT studies addressed the contextual cueing paradigm, largely carried out with spatial cues on non-linguistic targets (e.g., Chun, 2000), and first language acquisition of word segmentation (e.g., Saffran, Newport, & Aslin, 1996), verb distribution (e.g., Wonnacott, Newport, & Tanenhaus, 2008), and the acquisition of syntax (e.g., Chang, Dell, & Bock, 2006).

The Standard Procedure to Collect RT Data

The standard procedure for collecting reaction time data is to ask participants to press a button on a keyboard, computer mouse, or a response box as quickly and accurately as possible in response to a particular stimulus. In simple reaction time experiments, participants press the button whenever a pre-defined stimulus appears, such as a particular tone or image. In recognition reaction time experiments, participants respond only to certain stimuli (e.g., a memory set), but not to other stimuli (e.g., a distractor set). Finally, in choice reaction time experiments such as lexical decision or grammaticality judgment, participants are asked to press a pre-assigned button for a certain decision (i.e., “Yes, the sentence is good” or “Yes, that is a word”) and a different button for an alternative decision (i.e., “No, the sentence is bad” or “No, that is not a word”). Before beginning the experimental task, many RT experiments provide participants with practice or warm-up so that they can become accustomed to the task demands and responding quickly.
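As a concrete illustration of the choice-RT setup just described, the sketch below shows how a single grammaticality-judgment trial might be logged and scored from stimulus-onset and keypress timestamps. The trial fields, key assignments, and function names here are hypothetical, not drawn from any particular study or presentation package:

```python
from dataclasses import dataclass

@dataclass
class Trial:
    stimulus: str          # item shown to the participant
    correct_key: str       # pre-assigned button for the expected decision
    onset: float           # stimulus onset time, in seconds
    response_key: str      # key the participant actually pressed
    response_time: float   # timestamp of the keypress, in seconds

def score_trial(trial: Trial) -> dict:
    """Return latency in milliseconds and response accuracy for one trial."""
    rt_ms = (trial.response_time - trial.onset) * 1000.0
    return {"stimulus": trial.stimulus,
            "rt_ms": rt_ms,
            "accurate": trial.response_key == trial.correct_key}

# A hypothetical grammaticality-judgment trial
# ("f" = sentence is good, "j" = sentence is bad)
trial = Trial("La casa es blanca.", "f",
              onset=12.000, response_key="f", response_time=12.850)
print(score_trial(trial))  # rt_ms ≈ 850 ms, accurate = True
```

In a real experiment the onset and keypress timestamps would come from the presentation software’s clock; the scoring step itself is no more than the subtraction shown here, applied per trial and then aggregated per participant and condition.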

Given its popularity in cognitive psychology for investigating various types of information processing, memory, and implicit learning, it is not surprising to find a similarly wide range of research strands in SLA that have adopted the RT measure to study a variety of topics, including automaticity (e.g., DeKeyser, 2001; Segalowitz & Segalowitz, 1993), feedback (e.g., Lyster & Izquierdo, 2009), explicit instruction (e.g., Sanz, Lin, Lado, Bowden, & Stafford, 2009), L2 processing (e.g., Alarcón, 2009), and, more recently, implicit learning (e.g., Leung & Williams, 2011, 2012, 2014).

Benefits of RT

Based on the strands of reaction time research that are most pertinent to the topic of this book, namely automaticity, L2 processing, and implicit learning, the benefits of RTs include the exploration of (1) theoretical issues related to linguistic processing, specifically of gender and gender agreement, and native and non-native learner processing differences; (2) different speeds of processing for certain linguistic cues (e.g., animacy vs. noun class); (3) the role of L2 automaticity; and (4) the online operationalization of type of learning. At the same time, it may be worthwhile discussing the notion of automaticity more fully given that this notion appears to underlie most uses of RT in the SLA field.

The notion of automaticity refers to the instances when “we perform aspects of a task automatically, we perform them without the need to invest additional effort and attention . . . Also, performance appears to be more efficient; it is faster, more accurate, and more stable” (Segalowitz, 2003: 383). Automaticity has been operationalized as faster processing, ballistic (unstoppable) processing, or that which functions the same regardless of the amount of information to be processed (load independent). It has also been framed in terms of effortless or unconscious processing (Segalowitz, 2003). In SLA, the view of automaticity as being “faster processing” appears to be the most dominant angle (Hulstijn, Van Gelderen, & Schoonen, 2009).

As mentioned earlier, it appears that many strands of RT research in SLA have subsumed either explicitly or implicitly the notion of automaticity in their research designs. Many of the SLA studies that used RT measures appear to place a premium on faster reaction times, which is reflective of effortless or unconscious processing (e.g., Leung & Williams, 2011, 2012, 2014), positive effects of feedback or instructional context (barring decreases in accuracy; e.g., Lyster & Izquierdo, 2009), and speed of processing (e.g., Alarcón, 2009). As such, data evidencing faster mean reaction times compared to some experimental baseline (a priori assumptions of no or limited knowledge, or a pretest measure, for example) would be considered evidence of L2 automaticity. In other words, automaticity has been empirically measured by assuming that decreases in average reaction time over the course of experimental observation index increased automatic processing on the part of the L2 learner.

However, some researchers argue that it is not mean reaction time per se, but instead the coefficient of variation (calculated using participant mean reaction times and standard deviations) that is the most informative measure of automaticity—in that it may tease apart automatic processing and speeded-up control-like processing (Segalowitz, 2003; Segalowitz & Segalowitz, 1993; but see also Hulstijn et al., 2009, and Lim & Godfroid, 2014, for lexical decision and semantic classification in L2). The theoretical and empirical concern is essentially that “faster processing,” as indexed by analyses of mean reaction times, may not adequately distinguish between automatically carrying out a task and quickly applying control-related procedures to the task, where such a distinction would be crucial in L2 automaticity research. Note, however, that this analytic debate on reaction time data (i.e., means versus coefficients of variation) is somewhat tangential to favoring the collection of RT data for research on L2 automaticity. The use of reaction time measures to answer research questions on this issue seems to be a valid and potentially powerful application, assuming that researchers’ operationalization of automaticity coincides with the “faster processing” perspective and that this perspective appropriately captures qualitative differences in processing over time. However, it remains an open issue whether this is in fact the most valid operationalization and how RT data might be used both for differentiating between operationalizations of automaticity and especially for determining the status of automatic and controlled processing. In future research using RTs premised on L2 automaticity, researchers should be cognizant of all of these factors before making strong conclusions about how speed of processing, as measured by mean RTs, translates to one type of processing (automatic) compared to another (speeded-up control), and whether speed of processing is the counterpart of controlled processing, as opposed to automatic processing closely tied to implicit learning and knowledge.
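The contrast between mean RT and the coefficient of variation can be made concrete with a small sketch. The latencies below are invented and the function name is mine; the computation itself simply follows the definition CV = SD / mean that underlies the Segalowitz and Segalowitz (1993) argument, namely that a drop in CV alongside a drop in mean RT is more plausibly automatization than a mere across-the-board speed-up:

```python
import statistics

def automaticity_indices(rts_ms):
    """Mean RT and coefficient of variation (CV = SD / mean) for one
    participant's correct-response latencies, in milliseconds."""
    mean_rt = statistics.mean(rts_ms)
    cv = statistics.stdev(rts_ms) / mean_rt
    return mean_rt, cv

# Hypothetical latencies (ms) for one learner at two time points
pretest  = [820, 910, 780, 1005, 870, 955]
posttest = [610, 640, 595, 660, 625, 615]

pre_mean, pre_cv = automaticity_indices(pretest)
post_mean, post_cv = automaticity_indices(posttest)

# If both the mean and the CV decrease, the speed-up is less likely to be
# just faster application of the same controlled procedures.
print(f"pretest:  mean={pre_mean:.0f} ms, CV={pre_cv:.3f}")
print(f"posttest: mean={post_mean:.0f} ms, CV={post_cv:.3f}")
```

The numbers here are illustrative only; in an actual study these indices would be computed per participant over correct trials and then submitted to the group-level analyses the cited authors describe.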

Eye-Tracking (ET)

Did you know that early information about eye movement behavior, obtained centuries ago, was actually achieved using visual observation of the eyes? Eye-tracking is the process of measuring either the point of gaze (that is, where one is looking, which could be lateral or vertical) or the motion of an eye relative to the head. An eye tracker is a device that measures eye positions and voluntary or involuntary movements of the eye, which helps in obtaining, fixating on, and tracking visual stimuli. Not surprisingly, eye trackers are heavily used in research on the visual system, which forms part of the central nervous system that gives us the ability to process visual details as we receive them. Do you recall the studies on visual perception so popular in non-SLA attentional theories? Fortunately, with the advances in technology, studies using the eye-tracker have revealed a close link between the eye and the mind (Carpenter & Just, 1976; Rayner, 1998); that is, there may be a direct relationship between eye movements and underlying cognitive processes. Eye-tracking is an uncontroversial measure of the allocation of overt attention (Blair, Watson, Walshe, & Maj, 2009), and a close link between covert attentional processes and eye movement has been established (see Godfroid, Boers, & Housen, 2013; Rayner, 1998; and Wright & Ward, 2008, for reviews). Interestingly, it has even been proposed that eye-tracking data can be used to measure cognitive effort by means of intensity and time (observed through pupillary dilation), both of which can be captured by eye-fixation location and eye-movement time (Kahneman, 1973).

The Standard Procedure to Collect ET Data

Eye-tracking data are typically gathered in a laboratory setting and, depending on the type of equipment (e.g., a head-mounted, video-based eye-tracker like EyeLink II or a remote eye-tracker like Tobii 1750), some time is spent calibrating the eye-tracker to participants’ eyes in order to accurately record their gaze direction. Participants are usually given a practice run before actual data collection begins.
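Once calibrated, the eye-tracker outputs a stream of fixation events that researchers typically aggregate over areas of interest (AOIs), such as the screen region containing a target form. The sketch below illustrates that aggregation step in a minimal way; the coordinates, durations, and function name are invented for illustration and do not reflect any particular tracker’s data format:

```python
def total_fixation_time(fixations, aoi):
    """Sum the durations (ms) of fixations whose gaze point falls inside a
    rectangular area of interest (x1, y1, x2, y2), e.g., a target word."""
    x1, y1, x2, y2 = aoi
    return sum(duration for (x, y, duration) in fixations
               if x1 <= x <= x2 and y1 <= y <= y2)

# Hypothetical fixation records: (x-pixel, y-pixel, duration in ms)
fixations = [(120, 300, 210),   # earlier word in the sentence
             (415, 305, 260),   # first pass on the target form
             (430, 298, 180),   # refixation on the target form
             (700, 310, 150)]   # later word in the sentence

target_word_aoi = (400, 280, 460, 320)  # bounding box of the target form
print(total_fixation_time(fixations, target_word_aoi))  # 440
```

Measures such as first-fixation duration, gaze duration, or regression counts are computed in the same spirit, by slicing the fixation stream by AOI and by reading pass rather than simply summing it.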

Benefits of ET

The use of eye-tracking in SLA research is a recent effort to employ another concurrent data-elicitation procedure to address the initial stages of the learning process with, not surprisingly, much focus on the attention paid to L2 input by participants. In SLA, the eye-tracking procedure has been employed to address, for example, the constructs of attention and noticing in L2 development (e.g., Ellis et al., 2012; Godfroid, Housen, & Boers, 2013; Smith, 2010, 2012), L2 sentence and discourse processing while reading (e.g., Foucart & Frenck-Mestre, 2012), and L2 speech processing (e.g., Lew-Williams & Fernald, 2010).

The benefits of eye-tracking methodology include the following: (1) It is non-intrusive (Dussias, 2010; Godfroid et al., 2013), (2) it has been argued to measure overt attention (i.e., conscious focus, Ellis et al., 2012) and learner processing (e.g., L2 gender agreement) and to detect very subtle effects in relation to when and where difficulties occur during syntactic processing, as well as the extent of the difficulty (Foucart & Frenck-Mestre, 2012), (3) it offers high temporal resolution and the ability to divide reading time into distinct components during online L2 sentence comprehension (e.g., Dussias & Sagarra, 2007), (4) it is clearly superior to other measures of reading, such as self-paced reading, in that it allows reading to proceed naturally, (5) the eye-tracking procedure is arguably the most robust measure of learner attention given the type of data it gathers in relation to participants’ eye movements, and (6) unlike other concurrent procedures, ET data can provide insights even into what has only been peripherally attended to in the input.

Online Verbal Reports or Think Aloud Protocols (TA)

The use of online or concurrent verbal reports to investigate participants’ cognitive processing, thought processes, and strategies in many areas of psychology, cognitive science, and education is not a new data-elicitation procedure. Indeed, their use has been documented extensively in other fields since the 1950s. I have elected to discuss the think aloud procedure last given its extensive usage to address the language learning process when compared to reaction time and eye-tracking procedures. Since TAs appear to have the advantage of providing insights into learners’ cognitive processes as opposed to simple processing, as demonstrated in both non-SLA and L1 literature, the strands of research that have employed TAs in SLA also include L2 reading and writing (e.g., Cohen & Cavalcanti, 1987), as well as comparisons between L1 and L2 strategies (e.g., Yamashita, 2002), L2 test-taking strategies (e.g., Cohen, 2000), translation (e.g., Jaaskelainen, 2000), interlanguage pragmatics (e.g., Kasper & Blum-Kulka, 1993), and L2 attention and awareness studies (e.g., Alanen, 1995; de la Fuente, 2015; Hama & Leow, 2010; Hsieh, Moreno, & Leow, 2015; Leow, 1997, 1998a, 1998b, 2000, 2001a, 2001b; Martínez-Fernández, 2008; Medina, 2015; Rosa & Leow, 2004a, 2004b; Rosa & O’Neill, 1999; Sachs & Suh, 2007 [cf. Bowles, 2010, for a review]). However, not all verbal reports are equal. It is important to point out the different methods of eliciting verbal reports, broadly categorized as either introspective (concurrent or online) or retrospective (online or offline) and metacognitive or non-metacognitive (Ericsson & Simon, 1993).

Introspective vs. Retrospective

Introspective verbalization is gathered as participants are performing a task. Hence, verbalizations are not constrained by memory. Retrospective verbalization is usually conducted immediately after some form of processing has taken place, either during specific breaks in the actual task (online) or immediately after the completion of the task (offline). This type of verbalization has been critiqued for the potential effects of memory constraints and reconstructive processes—that is, additional information reported in one’s recall of the data (Nisbett & Wilson, 1977). Ericsson and Simon (1993) advise that retrospective protocols be used with caution, since it is impossible to “rule out the possibility that the information [subjects] retrieve at the time of the verbal report is different from the information they retrieved while actually performing the experimental task” (p. xii) or to rule out the issue of veridicality, that is, whether memory decay could be playing a role in the protocols.

Metacognitive vs. Non-Metacognitive Verbal Reports

In non-metacognitive verbalization, learners are focused on the task, with the think-aloud secondary to it, and only voice their thoughts without explaining them (Type 1 and Type 2 verbalization). In metacognitive verbalization, the researcher may ask for specific information (e.g., reasoning or explanation), and learners provide a metacognitive report on what they think their processes are (Type 3 verbalization).

Cohen (2000) distinguished metacognitive verbalizations from non-metacognitive verbalizations by characterizing the former as self-observational and the latter as self-revelational.

In order for verbalizations to reflect learners’ processes, it has been recommended that introspective, non-metacognitive verbalizations be gathered (Cohen, 2000; Ericsson & Simon, 1993) “to avoid this problem of accessing information at two different times—first during the actual cognitive processing and then at the time of report” (Ericsson & Simon, 1993: xiii).

To Think Aloud or Not to Think Aloud: The Issue of Reactivity

One of the prominent critiques of the TA procedure is that it is intrusive and may be subject to the issue of reactivity, that is, whether thinking aloud could have affected participants’ primary cognitive processes while engaging with the L2 or even added an additional processing load or secondary task for participants, which would not reflect a pure measure of their thoughts. Additionally, as noted by Rosa and O’Neill (1999), TAs may also present considerable variation due to individual differences. At the same time, as pointed out in Leow et al. (2014), the level of intrusiveness may depend on the type of protocol employed (non-metacognitive vs. metacognitive) and the type of experimental task employed (e.g., problem-solving vs. reading). Other variables may include working memory, language of report, and proficiency level.

To empirically address this methodological issue in SLA (as in other non-SLA fields), Leow and Morgan-Short (2004) reported the failure to find a reactive effect on participants’ performances after a reading exposure when compared to a control group. It is noteworthy that they also cautioned readers that “given the many variables that potentially impact the issue of reactivity in SLA research methodology, it is suggested that studies employing concurrent data-elicitation procedures include a control group that does not perform verbal reports as one way of addressing this issue” (p. 50).

The reactivity strand of research grew exponentially in this second part of the decade, with several studies (e.g., Bowles, 2008; Bowles & Leow, 2005; Egi, 2008; Rossomondo, 2007; Sachs & Polio, 2007; Sachs & Suh, 2007; Sanz et al., 2009; Yoshida, 2008) addressing the issue of reactivity in relation to various variables.

While a cursory glance at the eight studies (with a total of ten experiments) published in this period would reveal one study reporting positive effects (Sanz et al., 2009), one reporting negative effects in one of two experiments (Sachs & Polio, 2007, in which the protocols were produced in the L2), and another reporting positive effects (Rossomondo, 2007), a recent meta-analysis (Bowles, 2010) has reported an effect size value that “is not significantly different from zero” (p. 138), that is, it is not a reliable effect. A few more recent empirical studies (Morgan-Short, Heil, Botero-Moriarty, & Ebert, 2012; Stafford, Bowden, & Sanz, 2012) reported similar findings, while Stafford et al. (2012) also appeared to contradict Sanz et al.’s (2009) reactive findings in one of their experimental groups. Goo (2010), on the other hand, reported negative reactivity for comprehension based on a trend toward statistical significance (p = .054) with a medium effect size (d = .62). Another recent study (Yanguas & Lado, 2012) addressed the issue of reactivity in the written mode (learners writing in their heritage language) and reported positive reactivity in terms of fluency and accuracy.

Currently, the standard methodological practice in research designs employing concurrent TAs is to follow Leow and Morgan-Short’s (2004) suggestion cited above.

The Standard Procedure to Collect TA Data

The standard procedure to collect concurrent data is to ask participants to think aloud while performing an experimental task and to record the protocols produced during this experimental phase for subsequent coding. To guide participants to produce non-metacognitive protocols, instructions usually request that participants be as natural as possible as they perform the experimental task, think aloud constantly from the time they start the task until they finish it, and not try to plan out or explain what they are saying. The data collected are then coded to establish the presence or absence of the cognitive construct(s) under investigation. Before beginning the experimental task, many TA experiments provide participants with practice or warm-up so that they can become accustomed to the task demands.
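The coding step mentioned above can be sketched as a simple tally over researcher-assigned codes. The protocol segments and code labels below are invented for illustration only; actual coding schemes (e.g., levels of awareness) vary by study and require trained coders and inter-rater reliability checks:

```python
from collections import Counter

def tally_codes(coded_segments):
    """Count the coded categories across one participant's think-aloud
    segments. Each segment is a (verbalization, code) pair assigned by
    a human coder; the code labels here are hypothetical."""
    return Counter(code for _, code in coded_segments)

# A hypothetical coded protocol from a reading-exposure task
protocol = [
    ("hmm, 'pidió'... that ends in -ió",   "noticing"),
    ("so past tense, third person maybe",  "rule_formulation"),
    ("ok, next sentence",                  "no_report"),
    ("'sirvió', again that -ió ending",    "noticing"),
]
print(tally_codes(protocol))
# e.g., noticing: 2, rule_formulation: 1, no_report: 1
```

The tallies, or the presence versus absence of a given code, are what then enter the statistical comparisons between experimental conditions.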

Here is a typical instruction for participants to follow, taken from Bowles (2008):

INSTRUCTION

In this experiment I am interested in what you think about when you complete these tasks. In order to find out, I am going to ask you to THINK ALOUD as you work through the mazes. What I mean by “think aloud” is that I want you to verbalize your thoughts the entire time you are working on the tasks. I would like you to talk CONSTANTLY. Do not plan out what you are saying or explain what you’re saying. Just act as if you are alone in the room talking to yourself while you complete the tasks. What is most important is that you keep talking throughout and talk clearly into the microphone. You can speak in English. Just say whatever passes through your mind as you complete the tasks.

Benefits of TA

The benefits of TA protocols gathered in the SLA field include information on (1)