Formulaic language and L2 acquisition

6.1 Formulaicity in language

6.1.3 Frequency as a mark of formulaicity

The essence of formulaicity has often been attributed to frequency of occurrence (see below for a discussion of how infrequent strings can be formulaic too). In a review article, Ellis (2002) asserts that formulas are frequently and regularly occurring wordstrings. He cites psycholinguistic evidence for frequency as an important measure at all levels of linguistic representation and processing. High-frequency input such as thank you, how are you? and nice to see you is processed more easily than low-frequency input.

In an auditory task in which participants monitored language input for a pre-determined target word, Sosa and MacFarlane (2002) measured the reaction times of native English speakers to the function word of in collocations of varying frequency, such as kind of (high frequency) and sort of (low frequency). According to Sosa and MacFarlane, reaction times for of in high-frequency combinations were significantly slower than in low-frequency combinations. They take this as suggesting that frequently used word combinations become chunks and are represented as wholes, which makes a component word harder to pick out.

Biber, Johansson, Leech, Conrad, and Finegan (1999) use frequency of occurrence as the defining feature and sole measure for identifying lexical bundles, which, according to them, are sequences of two, three or four words that co-occur at least ten times per million words, and sequences of more than four words that occur at least five times per million words in a corpus. It is important to note that the parameters for a ‘lexical bundle’ can be set where one likes. This means that the boundaries for ‘lexical bundles’ are essentially arbitrary and are set according to the researcher’s corpus size so as not to generate too many, or too few, examples. Lexical bundles are mostly non-idiomatic in meaning, do not constitute complete structural entities and tend to be fully compositional and systematic in their pattern of use (Biber et al., 2004). Examples are: I want to know, well that’s what I, in the case of, the base of the (ibid., p. 377).
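The frequency-threshold definition described above can be illustrated with a short sketch. This is not the procedure used by Biber et al.; it is a minimal illustration that assumes whitespace tokenization and a toy corpus, and the function names (ngrams, per_million, biber_threshold, lexical_bundles) are hypothetical. Only the cut-off values of ten and five per million come from the text.

```python
from collections import Counter


def ngrams(tokens, n):
    """Yield every contiguous n-word sequence in a list of tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def per_million(count, corpus_size):
    """Convert a raw count into a rate per million words."""
    return count * 1_000_000 / corpus_size


def biber_threshold(n):
    """Cut-offs reported in the text: 10 per million for sequences of
    up to four words, 5 per million for longer sequences."""
    return 10 if n <= 4 else 5


def lexical_bundles(tokens, n, threshold):
    """Return the n-word sequences whose per-million rate meets the threshold."""
    counts = Counter(ngrams(tokens, n))
    size = len(tokens)
    return {
        " ".join(gram): per_million(count, size)
        for gram, count in counts.items()
        if per_million(count, size) >= threshold
    }


# Toy corpus of 21 words (an assumption for illustration only).
tokens = (
    "in the middle of the room and in the middle of the night "
    "he sat in the front of the train"
).split()

# In so small a sample a single occurrence already exceeds 10 per million,
# so the cut-off is raised here to keep only sequences that occur twice.
bundles = lexical_bundles(tokens, n=4, threshold=60_000)
print(sorted(bundles))
```

The need to rescale the cut-off for the toy sample reflects the point made above: the threshold is a parameter chosen by the researcher in relation to corpus size, not a property of the sequences themselves.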

Tremblay and Baayen (2010) explored the processing of lexical bundles (regular four-word sequences, e.g. in the middle of) of different frequencies in the British National Corpus. Participants were shown blocks of six four-word strings, which they were to recall as accurately as possible at the end of the block. Recall was tested by behavioural (phrase recall) and electrophysiological (ERP) means. The results pointed to the frequency of occurrence of the sequences as a reason for improved recall. On the basis of evidence from the electrophysiological measurements, they concluded that “four-word sequences are retrieved in a holistic manner” (p. 170) and that “phrasal and non-phrasal four-word sequences leave memory traces in the brain” (p. 170). In another study, Tremblay, Derwing, Libben, and Westbury (2011) used a self-paced reading task and a sentence recall task to compare the processing of lexical bundles, such as in the middle of the, and matched non-lexical or control phrases, such as in the front of the. The string in the middle of the was considered a lexical bundle because it has a frequency of 15.3 per million in the British National Corpus, whereas the non-lexical phrase in the front of the has a frequency of only 0.4 per million. They found that sentences with lexical bundles (e.g., He sat in the middle of the bullet train) were read faster and remembered better than sentences containing no lexical bundles (e.g., He sat in the front of the bullet train). Tremblay et al. take this as evidence of the holistic storage and retrieval of lexical bundles as a result of frequency.

So far, formulaic sequences have been treated as having an integral identity, but we have now seen that some formulaic sequences (e.g. lexical bundles) do not. It is therefore important to explore the relationship between frequency, integral meaning/identity, and compositionality. We will consider what each looks like on its own, and in combination with the others.

Because different formulaic sequences exhibit different properties, Wray (2012) argues that formulaic sequences may be located along various continua. For example, on a frequency continuum, lexical bundles lie at one extreme whereas idioms lie at the other. Similarly, sequences can lie on a continuum of compositionality, with novel expressions and certain idioms at one end and names at the other. Wray argues that wordstrings may be frequent and noncompositional at the same time (e.g. names), and that this intersection may lead to their being processed faster than infrequent compositional strings, that is, novel utterances.

Fast processing of formulaic sequences may also be a result of another variable, namely salience, which is the degree to which something in the input can command the learner’s attention. Wray (2012) reasons that “One could observe processing advantages for frequent items compared with less frequent ones if the latter are not salient, as well as for salient items that are not frequent” (p. 243). She argues that infrequent strings can also be formulaic and that there is a tension between frequency and salience. DeLosh and McDaniel (1996), comparing pure lists of high-frequency words with pure lists of low-frequency words, found that lists of high-frequency items were better recalled than lists of low-frequency words. On the other hand, Merritt et al. (2006) found that in a mixed list of high- and low-frequency words, the latter were better recalled because they attracted more attention.

Claims about the frequency of formulaic units tend to entail that they occur in a variety of situations, since that is how they can be repeatedly triggered. Since there is only one Quran, frequency in that context can only refer to memorizers’ repeated encounters with the text, and to patterns of repetition within it. Perhaps some verses are more easily memorized than others because they contain words and phrases that have been encountered before. On the other hand, this could make memorization harder because of the need to remember the correct continuation. Interviews with the Quran memorizers (cf. chapter 5) indicated that the repeated occurrence of verses and phrases at different places in the Quran facilitates memorization: an earlier encounter with a verse makes it easier to process on a later occasion, in that it helps the memorizers to chunk verses and so to memorize more text. At the same time, the Quran is noncompositional for memorizers, and they may have to store and retrieve the text in the form of chunks. The combined effect of frequency and noncompositionality may lead to fast processing of the Quran text in terms of chunking and retrieval.

Having looked at the nature of formulaic sequences in general and some of their characteristics in particular, we will now explore the role of formulaic sequences in L2 acquisition under different learning conditions. Although this study deals with memorization in a foreign language, I will survey literature dealing with various learner populations to understand how formulaic language contributes to language learning in each case. There are two dimensions to the review: (a) type of exposure, such as classroom foreign language acquisition (FLA) and situational/naturalistic second language acquisition (SLA); (b) age, such as older and younger learners in different settings. So we need to ask: does formulaic language contribute differentially to learning outcomes in different conditions? If yes, why? If no, what could be the reasons? In the course of the narrative, some insights into the role of formulaic language in first language (L1) acquisition will also be used to provide a broader picture.

In document Does memorization without comprehension result in language learning? (Page 132-135)