The Interaction of Phonology with Morphology
5.3 Lexical Storage of Complex Forms, Both Regular and Irregular
We have now discussed some of the ways that morphological information is associated with phonological alternations. We turn now to properties of a model that can accommodate these and other facts of morphology and their interaction with phonology. In this section, we discuss the storage of morphologically complex forms in memory. In subsequent sections, we turn to ways of modeling the relations among these forms and the effects of high and low frequency on both the storage and relations among forms.
In Chapter 2, I suggested that a usage-based model of storage of linguistic material would settle on words as the basic unit of storage, because words are the smallest units that are complete both phonologically and semantically, and that can be appropriately used in isolation. Note, however, that this only specifies the smallest unit of storage.
In Chapter 6, we discuss the evidence for the storage of multiword sequences, such as phrases and constructions. In the present section, we will investigate the consequences of having multimorphemic words in the lexicon.
Also in Chapter 2, I pointed out that the model of storage being used here does not represent the lexicon like a dictionary – a long list of words and their properties – but rather as a complex network with multiple associations among words. Thus, any multimorphemic word or sequence is highly embedded in connections with other words containing at least one of the same morphemes. So verb forms such as strike, struck, and striking are highly connected with all other base forms, Past forms, and Participles, as are verbs such as start, started, and starting. What determines the forms that are actually in memory is usage: verb forms that are used frequently are stored in memory. Those that have not been used and those that are of very low frequency do not actually exist in memory, but if they are regular, they can easily be derived by using the associations in the network. That is, the fact that a word is classified as a verb automatically associates it with other verbs, which in turn have multiple closely associated forms.
For many years now, it has been clear from experimental results that irregular forms, such as English irregular Past Tense forms, are stored in memory in associative networks. Bybee and Moder (1983), replicated in Prasada and Pinker (1993), show that subjects’ responses to the task of giving Past forms for nonce verbs are highly influenced by the form of existing Past Tense verbs (see Section 5.9 for further discussion). However, there is still disagreement over the representation of regular forms. In a proposal for a dual-processing model, Pinker, Marcus, and colleagues (Clahsen 1999, Clahsen and Rothweiler 1992, Marcus et al. 1992, Pinker 1991, Prasada and Pinker 1993) argue that there is a discrete distinction between regular and irregular forms: irregulars are stored in memory, but regulars are created by a symbolic rule. In contrast, the model developed in Bybee (1985, 1988a, 1995), along with the connectionist models (Rumelhart and McClelland 1986) and the analogical model (Skousen 1989, 1992), would claim that both regulars and irregulars are handled by the same storage and processing mechanisms. My proposal is that what determines whether a morphologically complex form is stored in memory is its frequency of use, not its classification as regular or irregular.
This debate is an excellent example of the difference between a structuralist theory and a usage-based or functionalist theory. The dual-processing model claims that differences in storage and access correspond to differences in the structure of forms: those that are structurally irregular are stored in memory, while those that are structurally regular are derived by rule. The usage-based model claims that the difference comes down to a difference derived from usage: the high-frequency forms have storage in memory and low-frequency forms do not, independent of their structural properties. In fact, the structural properties are also derived from usage, since the only way irregularity can be preserved is through sufficient frequency for memory storage.
That is why low-frequency irregulars either regularize, or fall out of usage and disappear from the language.
The argument for using a symbolic rule to derive regular forms is that the regular pattern is extremely productive (in English at least) and does not seem to be influenced by existing local phonological patterning. The counterargument from the usage-based side is that productivity is strictly related to type frequency, and that the robust productivity and insensitivity to the lexicon of the English regular Past Tense formation is due to its overwhelmingly high type frequency. In languages in which the regular morphological patterns have lower type frequency, we find reduced productivity, but sensitivity to the lexicon (see Köpcke 1988 for German Plurals and Lobben 1991 for Hausa Plurals).
There is also positive evidence that high-frequency regular forms are stored in the lexicon just as irregulars are: namely, evidence for differential behavior of regulars based on frequency. I will cite several studies that reveal such differences. For other arguments, see Bybee (1995).
Stemberger and MacWhinney (1986, 1988) sought to elicit speech errors from subjects by asking them to form the Past Tense of regular English verbs as quickly as possible. The verbs under study all ended in /t/ or /d/, and 10 were of low frequency, and 10 of high frequency.
Verbs ending in /t/ or /d/ were chosen because it had been found in earlier studies that speakers often make errors of no change on such verbs, producing, say, wait as a Past Tense (Bybee and Slobin 1982).
Stemberger and MacWhinney found a significant association between token frequency and the number of errors made by subjects in that more than twice as many errors were made on low-frequency verbs as on high-frequency ones. This finding suggests that even regular, morphologically complex forms are stored in memory and that frequency can affect the strength of their representations (see Section 5.6).
Alegre and Gordon (1999a) describe a lexical decision task that revealed word frequency effects for regular English Past Tense forms.
Specifically, higher-frequency words were responded to more quickly than low-frequency words. Interestingly, Alegre and Gordon found significant frequency effects on reaction time for regularly inflected words with frequencies above 6 per million in Francis and Kučera (1982), but no frequency effects for words with frequencies below 6 per million.
This finding supports the hypothesis that regular forms of higher frequencies are stored in the lexicon and can be accessed as whole words, and that they have differential representation based on frequency.
Hare et al. (2001) created a dictation task in which English-speaking subjects were asked to write a sentence using an orally presented word that represented two homophones, one a Past Tense form, some of which were regular and some irregular, and one a monomorphemic word (for example, aloud and allowed, or spoke as the Past of speak or as a noun). The results showed that the subjects tended significantly to select the homophone that was the most frequent, even when that form was a regular Past Tense form.
Sereno and Jongman (1997) used regularly inflected English nouns in a lexical decision task and found that it was the surface frequency of the form, whether the Singular or the Plural, that corresponded to how fast the subjects could identify it as a word of English. When the Singular was presented, nouns with high-frequency Singulars were responded to faster than those with low-frequency Singulars. When the Plural was presented, responses to nouns with low-frequency Plurals were slower even if the Singular was in the high-frequency group. Thus, the frequency of the particular inflected form was what influenced the speed of recognition.
Evidence from an entirely different research paradigm also supports the conclusion that regularly inflected words are listed in the lexicon.
In a study of the lexical diffusion of the deletion of final /t/ and /d/ in English, I reported in Bybee (2000b) that there is a significant effect of frequency on t/d deletion for all words coded (2,000 words from running spoken text). In addition, when just the regular verbs with an -ed ending (either the Past Tense or Past Participle) are divided into those with frequencies of greater than 35 per million in Francis and Kučera (1982), the high-frequency group, and those with frequencies of less than 35 per million, the low-frequency group, the difference in the percentage of deletion is significant, as shown in Table 5.1. The tokens counted in Table 5.1 all occurred in non-prevocalic position (i.e., they occurred either before a consonant or a pause), as this is the phonetic context most favorable to deletion. Given that a sound change such as t/d deletion is gradual both phonetically and lexically and takes place incrementally as language is used (as argued in Chapter 3), the sound change will progress more quickly in high-frequency words than in low-frequency words. But these differential effects of sound change require that sound change be registered in the representation of words in storage.
Table 5.1. The Effects of Word Frequency on t/d Deletion in Regular Past Tense Verbs (Non-Prevocalic Only)

                  Deletion   Non-Deletion   % Deletion
High frequency       44           67           39.6
Low frequency        11           47           18.9

χ² = 5.00313, df = 1, p < .05
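The percentages in Table 5.1 follow directly from the token counts: each is the number of deleted tokens divided by all tokens in that frequency group. The short Python sketch below makes that arithmetic explicit. The dictionary layout and the function name are illustrative choices only, and the last digit of a computed percentage may differ slightly from the table depending on how the figure is rounded.

```python
# Token counts copied from Table 5.1 (non-prevocalic tokens only).
# The dictionary layout and key names are illustrative assumptions.
counts = {
    "high": {"deleted": 44, "retained": 67},
    "low": {"deleted": 11, "retained": 47},
}


def deletion_rate(group):
    """Return the percentage of tokens in a frequency group that show t/d deletion."""
    cell = counts[group]
    total = cell["deleted"] + cell["retained"]
    return 100 * cell["deleted"] / total


for group in counts:
    print(f"{group}-frequency verbs: {deletion_rate(group):.1f}% deletion")
```

On these counts the high-frequency group shows roughly twice the deletion rate of the low-frequency group, which is the contrast the argument rests on.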
Thus, if regularly inflected forms show differential effects of sound change, they must be stored in memory. If the Past Tense morpheme were added by a rule, there would be no way to derive a difference in the progress of the sound change based on the frequency of the Past Tense word form.
The evidence suggests that there are two ways of processing regular, morphologically complex forms. One is through direct, whole word access, and it occurs with higher-frequency forms. The other is through accessing a base and adding appropriate affixes. Given a highly networked representation for morphological classes such as nouns and verbs, these two avenues of access are not really very different, since even low-frequency regulars are highly associated with other verbs that have the appropriate forms stored in memory. The two methods are more different, however, the more frequent the forms in question are, because of the effect of frequency on representation, a matter to which we will turn now.
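The two avenues of access can be pictured with a deliberately crude Python sketch. Everything in it is hypothetical: the frequency figures are invented, the cutoff of 6 per million is borrowed from the Alegre and Gordon result only as a convenient placeholder, and the function regular_past merely stands in for derivation through associations with stored forms. It is not a claim that a symbolic rule applies, and the sharp cutoff is a simplification of what the text describes as a gradient network.

```python
# Hypothetical toy model: all frequencies and the cutoff are invented for illustration.
STORAGE_THRESHOLD = 6.0  # tokens per million; an assumed placeholder value

# Past forms the toy speaker has experienced, with invented per-million frequencies.
usage = {
    "start": ("started", 120.0),
    "strike": ("struck", 60.0),
    "gloat": ("gloated", 1.0),  # experienced, but too rarely to count as stored here
}


def regular_past(base):
    """Stand-in for deriving a Past form by analogy with stored regular forms.

    Spelling is simplified: add -d after a final e, otherwise add -ed.
    """
    return base + "d" if base.endswith("e") else base + "ed"


def access_past(base):
    """Return the Past form of a verb and the avenue of access used to reach it."""
    entry = usage.get(base)
    if entry is not None and entry[1] >= STORAGE_THRESHOLD:
        return entry[0], "whole-word"
    return regular_past(base), "base+affix"


for verb in ("start", "strike", "gloat", "mend"):
    form, route = access_past(verb)
    print(f"{verb} -> {form} ({route})")
```

In this sketch what selects the avenue of access is frequency, not regularity: regular started and irregular struck are both retrieved whole, while the low-frequency and unattested regulars are built from the base.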