THEORETICAL FOUNDATIONS FOR THE ROLE OF ATTENTION IN LEARNING FROM NON-SLA FIELDS

One of the phrases we heard as we grew up, especially in the classroom setting, was, “Pay attention!” Yet have we thought seriously about what this admonition really means? Have we paused to contemplate the value of attention in every single task that we perform in our lives? Are we aware that when we attend to some information in the L2 input we raise our perception, which may then lead to some of the information being taken into our short-term or working memory, which may then lead to potential internalization of such information, ultimately leading to learning and remembering? Indeed, what do we mean when we use the term “attention”? Do we mean it is a single entity or mechanism that originates from one or more pools of resources, if viewed from a psychological perspective? Perhaps, if viewed from a cognitive science and neuroscience perspective, it comprises more than one entity or mechanism, each associated with a finite set of modal-specific brain processes that work together with other brain processes to fulfill specific tasks. As far back as 1890, William James provided the well-cited definition of attention as “the taking possession by the mind, in a clear and vivid form, of one out of what seem several simultaneously present objects or trains of thought. Focalization, concentration, of consciousness are of its essence. It implies withdrawal from some things in order to deal more effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state which in French is called distraction, and Zerstreutheit in German” (pp. 403–404). Clearly, the process of attention may not be as simple as saying, “Pay attention,” and it is useful to be aware of the theoretical foundations of this process before addressing how it is viewed in the SLA field.

The SLA research field is, relatively, like a baby when compared to other fields, and we like to assume that the other fields are more established in their research paradigms and frankly know what they are doing (well, sort of, as you will read later on). Consequently, many second language acquisition (SLA) researchers have looked mainly to the field of cognitive psychology or science and cognitive neuroscience to provide an explanation of or theoretical account for the role cognitive processes play in SLA (e.g., Bialystok, 1978; DeKeyser, 2007; Robinson, 1995; Schmidt, 1990; Truscott & Sharwood Smith, 2011; VanPatten, 2004).

Given that this book is situated within the human information processing framework of attention and/or awareness in the learning process, I shall focus on such models and even provide an overall description of the process before we begin to discuss the various tenets of non-SLA attentional models. In this way, we can use this general knowledge to access the finer details of said models.

To this end, this chapter discusses succinctly some of the major theoretical models of attention in cognitive psychology/science and neuroscience. From cognitive psychology/science, these models include filter theories, capacity and non-capacity models (including the notions of selective and focal attention), controlled versus automatic processing, and Wickens’ model of the structure of multiple resources. From neuroscience, Posner’s research on the relationships between attentional networks and other cognitive networks in the brain is reported. The chapter also discusses the process of attention in relation to short- and long-term memory, working memory, and whether learning without attention is possible.

Overview of an Attentional Model of Information Processing

The first step is perception of information that is conveyed by our senses. Some researchers (e.g., Gass, 1997, 1998) go further and associate such perception with some type of prior knowledge related to the sensory data received (apperception). A selection of some aspect(s) of the sensory data (via peripheral, selective, focal attention) is then made, potentially based on what is perceived as important to the learner. This selected information (intake?) enters into the learner’s working memory (read: short-term memory) and can potentially remain there for some time or be discarded from memory. For this selected information to move forward into the learner’s internal system (learning?), it needs to remain in this stage for at least some minimal time to be further processed or rehearsed. What has been further processed may then be available for output (production).

Here is another version of the same process with different terminology. External stimuli activate memory representations that remain in memory for a short period of time. Type of attention (peripheral, selective, focal) determines the level of activation and quality of these representations and allows this information to be available across several networks in the brain.

Now that we have a general idea of the basic sequence of information processing, let us discuss succinctly some of the major theoretical models of attention in cognitive psychology/science and neuroscience. Keep in mind that these theories were generally supported by research on visual attention, and take a close look at the measures employed to address the role or function of attention.

Cognitive Psychology/Science

Filter Theories

The early theories of attention were what were known as the filter theories of attention (e.g., Broadbent, 1958; Norman, 1968; Treisman, 1964), although attentional theories date back to 18th-century philosophy (cf. Neumann, 1996, for an excellent review of attentional theories). Let us take a closer look at Broadbent’s influential filter theory, since most reviews of theory building do not report the assumptions underlying the tenets of some theoretical underpinning.

This theory actually originated from Broadbent’s interest in studying the working environment in an aviation control tower in which flight controllers were simultaneously communicating vocally with pilots in several planes. The inspirations of theory building are quite interesting (cf. Wickens, 1980, 1989, 2007, discussed below), aren’t they? Simulating this scenario, Broadbent presented his participants with verbal questions that they needed to answer based on information visually presented (Broadbent, 1952a, 1952b, 1952c). He found that providing a temporal overlap between two different messages impeded participants, but that informing participants that one of the two messages was irrelevant lessened this interference. So, how does this translate into theory? Well, he postulated three tenets of his filter theory. First, interference was surely due to a limited capacity central channel, which could only handle so much information at the same time (cognitive overload). Second, a filter prevented unwanted information from moving forward, thereby reducing the potential for such information to enter the central channel. Finally, based on the results of ‘split-span’ or dichotic experiments (in which different types of sequences, for example, three-digit sequences, are provided simultaneously to each ear) that revealed participants’ ability to first remember one set of digits and then the other (Broadbent, 1958), Broadbent postulated the role of short-term memory, since the second set of digits was presumably held in a short-term storage system.

Filter theories, then, viewed the processing of incoming information as moving along a serial path comprising several storage structures (e.g., sensory register > detection device > short-term memory). More specifically, Broadbent’s (1958) model postulated that as information passes through an early sensory register, a selective filter selects specific information based on the form of the message and conveys this information to a detection device that, at this point, assigns semantic value to the message before it is encoded into short-term memory. Broadbent’s model, mainly based on acoustic processing, postulated that the basis on which events were selected to pass through the attentional channel might be their “physical intensity” (p. 297).

Capacity in this model, according to Neumann (1996), was conceptualized as the transmission capacity of a channel, while it was a filter that performed the selection by blocking or attenuating (reducing) information from moving forward. This model was aptly called the “bottle-neck” model, if you imagine the selective filter aspect of the model as two lanes on a highway becoming one lane.

It is interesting to see not only how each successive model began to build upon the limitations or critiques of the previous ones, but also how additional factors began to appear alongside the construct or process of attention. Treisman’s (1964) attenuated filter model argued that Broadbent’s model was too restrictive, since findings gleaned from dichotic listening tasks (again!) revealed that participants registered both the attended information and the information provided in an “unshadowed” ear. She postulated that her selective filter, or what she called attenuation control, unlike Broadbent’s model, allows for processing of all information (both the form and meaning of the message) before being relayed to the detection device. Treisman’s model was subsequently critiqued for the potential for cognitive overload occurring during such an early pre-attentive stage of the attentional process, and this led to what is called the late selection model (Norman, 1968) that removed the previous storage structures of selective filter and attenuation control.

A late selection model views all incoming information as being processed in parallel, and a decision to process it beyond short-term memory is made in this very storage structure based on its importance. Whatever aspect of the information is deemed to be important is further elaborated or rehearsed, while the rest of the information is quickly discarded. A simple term for this phenomenon is what is known as “selective or focal attention” versus “peripheral” attention, which may be broadly exemplified by the description of “glancing out of the corner of your eye.” Logical extensions of Norman’s (1968) late selection model of attention and sensory processing led to what are called capacity models, which have provided quite a strong foundation for several theoretical underpinnings in the SLA field.

Capacity Models

Three features of the filter theory began to emerge, namely, (1) the metaphor of a limited capacity channel (cf. the L2 learner as a limited capacity processor of incoming information), (2) learners’ voluntary control of the deployment of their limited attentional resources toward the incoming information, and (3) the amount of effort related to the nature of the task being performed. Put another way, there was the assumption that the human brain could only handle so much information at any one given time and, consequently, due to insufficient resources and the processing limitations of the human brain, incoming information was selected by the attentional system. Think of computer technology in the 1960s, upon which the capacity perspective was built, by the way, when its capacity was relatively limited compared to current computer capacities. Such attention and processing were also affected by what specifically the learners were doing. While capacity theories agree that there is competition for attentional resources to be paid to incoming information, they went a step further by postulating that what is paid attention to may depend on the amount of mental effort required to process the incoming information.

According to Neumann (1996), Kahneman’s (1973) capacity model of attention, in which capacity was now conceptualized as a general, unspecified processing capacity, became the dominant idea and led to the dual-process theory of controlled versus automatic processes (e.g., Posner & Snyder, 1975; Shiffrin & Schneider, 1977), to be discussed later. The metaphor for capacity now began to be one of a supplier, while selection was its allocation. This model, dependent on the learner’s state of arousal, postulated the allocation of attentional resources from a pool of cognitive resources to incoming information. Whereas the filter theories viewed an inevitable competition for the allocation of attentional resources for incoming information, Kahneman’s capacity model allows the possibility of dividing the allocation of resources to different aspects of incoming information. According to Kahneman, performance may not be negatively affected once the state of arousal is adequate and the task demands are not overwhelming. Put another way, the filter theory is like giving our kids money (attentional resources with the assumption that we have limited cash flow) and telling them to buy only one item (in the store/input), while the capacity theory is telling them that they may buy more than one item (in the store/input with the assumption that we have unlimited cash flow), and this is dependent upon whether they are interested in doing so, that is, going to the store in the mall.

Measures employed to test the unspecified capacity notion included responses gleaned from a probe stimulus (e.g., pressing a button when a tone was heard) compared to similar responses obtained when the probe stimulus was presented simultaneously with the identification of a visual stimulus that was supposedly engaging participants’ processing system.

However, the concept of unspecified capacity fell in popularity due to evidence indicating that simultaneously performing two tasks that employed the same processes did not suffer from interference (e.g., Posner & Boies, 1971).

This concept was then replaced by the notion of multiple, specific resources, while maintaining the notion of effort and diminishing the focus on selection.

To this end, Wickens’ (e.g., 1980, 1989; cf. also Allport, Antonis, & Reynolds, 1972; Navon & Gopher, 1979) influential model of the structure of multiple resources (later known as the Multiple Resources Model of divided attention to task demands, cf. Wickens, 2007) expanded Kahneman’s (1973) single pool of attentional resources model of attention to include the allocation of attentional resources from multiple pools and an additional focus on the nature of the task being performed. Based on the different locations of these attentional resources along three intersecting dimensions of resource systems (cf. Wickens, 1980, for further elaboration), Wickens argued that the difficulty level of two tasks performed simultaneously may depend on whether the attentional resources are coming from the same pool (serial processing) or different pools (parallel processing). For example, serial processing (try participating in two conversations at the same time) is a much more demanding task than parallel processing, which may be exemplified by driving a car and reading the billboards at the same time.

However, Wickens concedes that concurrent processing may be possible in serial processing if one of the tasks has been automatized, that is, it has been practiced many times, thus freeing up additional resources for the other task.

What needs to be kept in mind (and if you recall Broadbent’s source of his “bottle-neck” model) is that this Multiple Resources Model (together with the SEEV (selection, effort, expectancy, and value) model of selective attention that also addressed the early stage of the information processing sequence, cf. Wickens, Goh, Helleberg, Horrey, & Talleur, 2003) was initially developed to represent attention via scanning in dynamic visual worlds such as driving (Horrey, Wickens, & Consalus, 2006) and flying (Wickens et al., 2003). In addition, the early resource theories have been critiqued for their use of dual-task data that may rely on too small a number of basic resources to address observed patterns of interference or may be inadequate to tease out performance-resource functions, unless it is known in advance that two tasks are both subserved by the same resource (Neumann, 1996). It is also interesting to note that the 1980s witnessed the explosion of studies investigating different and specific attentional mechanisms and their functions, evidenced in the field of visual attention, the birth of the cuing paradigm, and the emerging impact of connectionism on attention theory by making an association between attention and neural activity in specific parts of the brain.

Non-Capacity Models

It is to be noted that other attentional models began to question the overall concept of limited attentional capacity, that is, the idea that the need for selection is a consequence of such a capacity. For example, instead of focusing on the interference in processing resulting from the so-called limited capacity, Sanders’ (1983) cognitive-energetical model approached the issue from a psychophysiological perspective and zeroed in on the local aspect of attention (e.g., investigating the effect of sleep loss on arousal) instead of a global one that attempts to explain all types of attentional phenomena. Neisser (1976) questioned the existence of any relationship between selective attention and brain capacity and, like Sanders, viewed interference in the performance of dual tasks not as a competition for limited resources but as local difficulties arising from the performance of one task having an effect on the other (cf. Allport, 1980, for relatively similar views).

This line of theorizing on interferences resulting from problems of coordination and control was continued (e.g., Allport, 1993), but now these problems were not viewed as the causes of interference. According to these researchers, there are specific mechanisms that are designed to deal with these problems, if one were to consider the functional characteristics of the central nervous system (cf. the modularity of the brain, the potential for massive parallel processing, etc.); they provide the need for selection, while limited capacity is a byproduct of such selection (cf. Neumann, 1996). It is noted that these theories were mainly based on visual attention.

Summary

These attentional theories, discussed above, arguably underlie in some way several of the theoretical underpinnings postulated for the L2 learning process in SLA, with special focus on the perception that the basic feature of attention is the concept of limited (un)specified capacity and that the main function of selection was to alleviate potential cognitive overload taking place due to this capacity.

However, it is important to note that three important trends began to take place in the 1980s, namely, the expansion of additional functional roles for selection beyond alleviating the limited capacity, a shift from dual-task interference to sensory attention (and especially the visual modality), and the emerging impact of connectionism on attention theory (Neumann, 1996). Indeed, connectionist models of visual attention began to view the effect of attention as one more piece to the already existing units in the brain that correspond to the selected stimuli, thereby reinforcing the strength of this unit. As a consequence, studies began to address specific attentional mechanisms and their accompanying functions, and the popular cuing paradigm (e.g., Posner, 1980; Posner, Snyder, & Davidson, 1980) began to be employed in the field of visual attention to investigate learner attentional shifts and their effects. I will elaborate below on this new shift in focus.

Neuroscience

The most cited source in this field of neuroscience in relation to attention is the work of Posner and his colleagues (Posner, 1992, 1994, 1995; Posner & Petersen, 1990). Note that these neuroscientists were primarily interested in (1) identifying the locations of different attentional processes and functions in the brains of both humans and animals, (2) examining the relationships between attentional networks and other cognitive networks, and (3) using this information to treat pathologies linked to attentional disorders.

Three main attentional networks (posterior, anterior, and vigilance) were identified in the brain and associated with their individual functions (Posner & Petersen, 1990) in the process of attention. As reported in Simard and Wong (2001), the posterior network is found in portions of the parietal cortex, associated thalamic areas of the pulvinar and reticular nuclei, and parts of the midbrain’s superior colliculus, and its associated attentional function is to orient attention to sensory stimuli, especially visual locations in visual space. The anterior network

In document (Second Language Acquisition Research Series) Ronald P. Leow-Explicit Learning in the L2 Classroom_ a Student-Centered Approach-Routledge (2015) (Page 42-67)