2. Literature Review
2.2 Aspects of Production and Decoding of Spoken English
2.2.4 Different Decoding Processes of English Speech by L1 and L2 Users
This sub-section deals with the different processes that L1 users and L2 learners employ to decode native-speaker (NS) English speech. It is structured as follows: the L1 speakers’ decoding process, which involves bottom-up processing, top-down processing and making intelligent guesses; the inefficient L2/EFL decoding process; and the nature of L1 decoding, EFL decoding and listening strategies.
A. L1 speakers’ decoding process
Receiving incoming acoustic signals, assigning correct labels to the segments and arriving at reasonable interpretations: this is the process called decoding. L1 language users can generally cope with the decoding process automatically.
Brown (1990) gives a thorough description of the natural decoding process of L1 speakers. Recognising the main message given by the prominent units (either the stressed words, or the salient information chunks), L1 listeners reconstruct the information and then achieve an intelligible interpretation. This process involves getting the phonological information from the acoustic signals, narrowing the content down to a particular topic and correctly detecting the words used by the speaker, and then arriving at a correct interpretation.
a) bottom-up processing of speech
The first step – recognition and assigning of phonological code – is called ‘bottom up’ processing (Brown, 1990, pp.10ff, pp.150ff). This is an essential process for listening. The more phonological information the listener gets, the better the listener grasps the topic being spoken of. This does not mean that the listener has to understand every single word in order to follow the whole conversation. In fact, listeners normally pay more attention to ‘what’ is said than to ‘how’ it is said (Brazil, 1994, p.2). They do not perceive spoken language as a series of sounds; instead, they capture the gist of the communication. As Roach (2000, p.130) puts it: ‘when children are learning their first language, they acquire features rather than individual phonemes’. The salient phonological characteristics, as examined earlier, mark significant communicative information in everyday spoken English. Having internalised these native features of English speech, L1 listeners can easily retrieve sufficient phonological information and build it stepwise into a representation, going from smaller units to larger ones and then to the whole utterance. Once this basic input has been acquired, the listener can move on to the step of ‘top down’ processing.
b) top-down processing of speech
‘Top down’ processing (Brown, 1990, pp.11-12, pp.151-52) means that, after sufficient contextual information on the topic has been obtained in the ‘bottom up’ process, intelligent predictions are made. This process is dependent on listeners’ personal, formalised experience both of language and of the world. In NS communication, people tend to use a large amount of ‘prefabricated’ language (reviewed later in Sections 2.3.3 and 2.4). In particular, discourse markers, as Chaudron and Richards (1986) discuss, function as signposts that facilitate top-down processing. Beyond these clear signposts, the continuous stream of English connected speech makes it impossible to pick up every single phoneme, so the listener uses personal language experience to compensate for any gaps left by words that were missed. In addition, this process involves completing the interpretation by bringing in the listener’s world knowledge. As de Beaugrande (1980, p.30) states, ‘the question of how people know what is going on in a text is a special case of the question of how people know what is going on in the world at all’. Armed with this set of stereotypical knowledge, L1 language users can generally predict what might be talked about in a given situation, and what might be discussed by a specific speaker. This familiar, shared knowledge has been acquired by L1 users from infancy, and is regarded as one of the automatic skills of L1 language users (Brown, ibid., pp.153-55).
c) making inferences in speech
The process of ‘making inferences’ (Brown, 1990, pp.155-58) refers to obtaining extra information which is not linguistically present in the specific expressions used by the speaker. This logical guessing operation depends on how relevant the extra information is in the context in which it occurs, and even on the interpretation of the ‘utterer’s beliefs and desires’ (Dennett, 1990, p.191).
In general, L1 language users are active listeners. They rely primarily on phonological cues, which provide them with the basic information they need to tune into the communication. Since some phonetic clues are reduced, modified or missing in the rapid flow of informal speech, L1 listeners do not depend on capturing every detail of the speaker’s production; rather, they pay attention only to the ‘shape’ of the word (McCarthy, 1990, p.35). L1 listeners have the innate ability to make up for phonetic weakness or slips of the tongue (Boomer & Laver, 1968) and to draw logical conclusions, and sometimes they can even finish the utterance for the speaker, which is termed ‘latch[ing]’ (Sacks et al., 1974).
B. L2 listeners’ decoding process
In contrast with L1 speakers, non-L1 language learners do not find it easy to perceive natural connected speech and arrive at a correct rendering. Most EFL learners in China learn the language in a classroom environment in which there is almost no exposure to authentic, natural, spoken English. As a result, when they deal with real L1-L1 English communication, they are easily over-tasked in the decoding process, as the following researchers have investigated.
a) inefficient use of bottom-up processing
Research carried out by Tsui and Fullilove (1998) indicates that ‘less-skilled listeners were more likely to process … linguistic input without understanding the entire text’ (p.447). This is a typical processing approach employed by most Chinese EFL listeners. When involved in informal native-to-native English conversations, typical L2/EFL listeners constantly struggle to scan the incoming signals and look for matches in their lexicon. However, several factors impede L2 listeners in recognising every feature of the linguistic input. Firstly, as considered in Section 2.2.3, the natural flow of English speech often leaves L2 listeners with the problem that ‘it is not clear to them how many words there are supposed to be in the utterance and where their boundaries might lie’ (Brown, 1990, p.150). Another factor is, as pointed out by Field (2004), ‘a limited vocabulary or grammar, or the inability to recognise known words in connected speech’ (p.308).
b) insufficient use of top-down process and making inferences
The skills of narrowing down information and making intelligent guesses from the surrounding context are often inappropriately or insufficiently employed by L2/EFL listeners. One of the problems is, as investigated by Aitchison (1994), the role of culture in the decoding of an L2. Given that L2/EFL learners come from different social and cultural backgrounds, they often lack sufficient ‘familiar knowledge’ (Brown, 1990, p.155) which could supply them with correct cultural information and facilitate their rendering of the L1 speakers’ intention. Another factor which may affect L2 listeners’ decoding process is kinesic behaviour (i.e., body movement), as investigated by Kellerman (1992). Owing to their lack of relevant L2 world knowledge, language learners often cannot narrow down the topic being talked about, even though they can pick up some of the words in the utterance. They often do not feel confident enough to make logical inferences by drawing on their previous L1 experience, which prevents them from achieving a degree of ‘automaticity’ in the way they decode the L2 (Field, 2004, p.308).
C. Nature of L1 decoding, EFL decoding and listening strategies
One point which needs to be clarified here is that a correct interpretation does not mean 100 percent identity between the speaker’s intention and the listener’s comprehension. As Brown (1990, p.10) emphasises, ‘communication is a risky business’. Every listener has different personal experiences and world knowledge, which can lead to different interpretations of the speaker’s intention. Even if listeners agree on what the speaker said, they may still perceive the real intention behind the speaker’s words differently. L1 speakers, however, can normally minimise potential misunderstanding, as they move towards maximal convergence in their communication (details reviewed in Section 2.3.2).
Given that listening is not a passive information-transmission process, how can one achieve Brown’s (1990) goal for language learners – to ‘listen as a native speaker listens’ (p.148)? Cauldwell (2000, pp.2-3) emphasises that ‘it is a mistake’ to abandon bottom-up activities, since they demonstrate to language learners the essential characteristics of speech. From his point of view, language teachers have inherited a top-down approach and are of the opinion that learners do not need to understand every word. Cauldwell suggests that this is illogical and unreasonable: the skill of top-down processing is ‘a goal to be reached’, rather than ‘a means of getting there’. Language teachers should teach learners how to perceive and work towards an imitation of the L1 listeners’ decoding style rather than teach them how to gain these abilities at an early stage. It seems that there needs to be some bottom-up processing before real top-down skill can be achieved. Wilson (2003, p.335) has a similar idea and proposes a more plausible ‘discovering listening’ method, based on Marslen-Wilson’s (1989) ‘bottom-up primary’ model, which improves language learners’ performance by pinpointing their listening difficulties after they have reconstructed a text. In other words, language learners should ‘spend more time with the signal’ (Cauldwell, 2000) and study how the L1 speaker speaks, not just what the L1 speaker says.
Field (1999, pp.338-39) also discusses bottom-up and top-down approaches, and he points out that these two operations are actually processed interactively. Bottom-up information is the basis for narrowing down the range of possible predictions. Meanwhile, the contextual information gained by top-down processing also influences or supports the basic phonological clues. Field advocates a more efficient skills-based approach, which shows the importance of conceptual and perceptual work in second language teaching activities. Teaching listening strategies is not ‘a waste of time’ (Ridgway, 2000, p.184), and Field (2000, p.194) argues, ‘[l]et us improve the lifebelts rather than relegate our swimmers to the paddling pool’. The present author agrees with Cauldwell’s and Field’s approaches to the teaching of L2/EFL listening.