
2.5. The present account on the development of complex adverbs and

2.5.4. Frequency, fixedness, and productivity

No study of grammaticalization that makes use of corpus data can escape addressing the role of frequency. Frequency is considered to play a significant role in most accounts of grammaticalization (e.g. Krug 2000; Hopper, Traugott 2003 [1993]; Bybee 2003, 2007, 2010; Hoffmann 2005). Grammaticalization is usually associated with high frequency. Due to their general meaning and their ability to occur in more diverse contexts, grammatical items are more frequent than lexical items. Thus, as grammaticalization by definition involves the development of grammatical items out of lexical items (or less grammatical items), an increase in frequency is to be expected in the process of grammaticalization (Bybee 2003: 602).

Yet there is still some obscurity about the exact nature of the role frequency plays in the process. First of all, it must be noted that an increase in frequency does not equal grammaticalization. For instance, Mair (2004: 125) claims that increased frequency does not always suggest grammaticalization but may be related to other phenomena or extra-linguistic factors. Secondly, it has been suggested that high frequency is not vital for grammaticalization to occur. For instance, Hoffmann (2004; 2005), who has studied the grammatical status of low-frequency NPN-constructions (such as in proximity to), has suggested that such low-frequency items may be considered complex prepositions on the basis that they may be affected by analogy with more frequent, formally similar constructions. He also underlines that when studying frequencies, one must take into account that some concepts are expressed less frequently than others, and if the linguistic element under study is the preferred means of expressing a concept, it might be more salient than would be concluded based on its overall frequency (Hoffmann 2004: 204–205; Hoffmann 2005: 164). Finally, there is still some dispute about the causal relationship between frequency and grammaticalization – it is not clear whether high frequency is the result or the prerequisite (or concomitant) of grammaticalization (Mair 2004: 126). Mair (2004) attempts to answer this question by analyzing several cases of grammaticalization, among them the development of the going to-future. He concludes that a rise in overall frequency is often a delayed result of grammaticalization which has occurred centuries earlier (Mair 2004: 38).

However, there are different ways to count frequency, and the results of an analysis may depend on the method used. For instance, Mair (2004: 128–129) reports that despite the belated increase in the overall frequency of going to, observing the frequencies of grammaticalizing items in relevant contexts yielded different results. His analysis showed that the uses of going to + INFINITIVE proportionally exceeded the contexts with prepositional complements already centuries before any change in the overall frequency of the phrase took place (Mair 2004: 128–129). Thus, although grammaticalization may not be reflected in the overall frequency of the phrase, it might be observable as changes in the proportion of usages that are relevant to the particular instance of grammaticalization. According to Bybee (2003: 604–605), frequencies may be observed as token frequency (text frequency) or type frequency (frequency of a pattern). Grammaticalization may be observed in both cases. Thus, in the present account, where the object of study occurs as the source form as well as the target form, it is useful to observe frequency in appropriate contexts.
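The difference between overall frequency and the proportion of relevant usages can be made concrete with a minimal sketch. The function names, the usage labels and the counts below are invented for illustration only; they are not taken from Mair (2004) or from the data of the present study.

```python
from collections import Counter


def per_million(count: int, corpus_size: int) -> float:
    """Overall (token) frequency normalized to occurrences per million words."""
    return count / corpus_size * 1_000_000


def usage_proportions(labels: list[str]) -> dict[str, float]:
    """Share of each usage type among the annotated occurrences of one phrase."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Invented toy annotations of one phrase in two periods (hypothetical labels).
period_a = ["source_use"] * 6 + ["target_use"] * 4
period_b = ["source_use"] * 2 + ["target_use"] * 8

# The overall frequency is identical in both periods (10 occurrences in a
# hypothetical corpus of one million words each) ...
print(per_million(len(period_a), 1_000_000), per_million(len(period_b), 1_000_000))

# ... while the proportion of target uses differs between the periods.
print(usage_proportions(period_a))
print(usage_proportions(period_b))
```

In this toy setting the normalized frequency stays constant, whereas the share of target uses changes, which is the kind of shift that an analysis based on overall frequency alone would miss.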

Mair (2004: 123) also suggests that in regard to frequency, it is useful to distinguish two types of grammaticalization – ‘dynamic’ and ‘static’. The former stands for the diachronic process observable in major shifts in frequency, and the latter for synchronic variation, whereby lexical items are occasionally used in a grammatical function. Instances of static grammaticalization are usually not associated with high frequency, nor are they usually detectable diachronically as a directed change. Mair suggests that such instances of grammaticalization may be better studied qualitatively (Mair 2004: 138–139). It must be noted that the present grammaticalization process – the development of complex postpositions in Estonian – is rather an instance of the static type. Thus, based on Mair, the corpus analysis cannot be expected to reveal any drastic changes in the frequency of the phrases under investigation. Nevertheless, I will attempt to account for frequency when tracing the development of complex function words diachronically (see section 4.8.).

As the diachronic data are few (see section 3.2.2), the analysis of complex function words focuses on synchronic data, where change in frequency cannot be observed. However, in a synchronic study, pattern frequency can still be observed, especially in the context of the parameters of grammaticalization described above. In addition to raw frequencies of the phrases, I will observe the proportion of freely combined phrases and complex units, the proportion of adverbial uses and postpositional uses, and the frequency of uses that indicate actualization of reanalysis (contextual expansion and non-agreement).

In addition to observing absolute (or relative) frequencies, it is useful to implement statistical methods that measure associations between words. Association measures show the strength of association between the words (Evert 2005: 75). They have advantages over absolute frequency measures because they allow us to determine whether there is a statistical association between the words or whether their co-occurrence is mere chance. For instance, two words that are both highly frequent may co-occur by coincidence, but association measures give a statistical interpretation of the relationship between the words (Evert 2005: 20–21). In this study, association measures are used for two purposes: measuring the collocational strength between the components of the phrases and measuring the collocational strength between the phrase and other elements in the sentential context.

The strength of association between the components of the phrase shows how tightly bound the units are. Tight connection between the components of (complex) structures is associated with increasing autonomy (Bybee 2010: 50), fixedness, freezing or fossilization, which have been associated with grammaticalization as well as lexicalization (Brinton, Traugott 2005: 105).22 Here, fixedness is measured with mutual information (Church, Hanks 1990). The method is also used by Móiron and Bouma (2003) to measure associational strength in Dutch collocational prepositional phrases. Mutual information compares the probability of occurrence of a phrase to the probabilities of occurrence of each component of the phrase independently. If the co-occurrence of a body part term and a simple postposition (such as kaela peal (neck+on)) is not due to chance, the mutual information of the components (I) is above 0 (Church, Hanks 1990: 77). In order to demonstrate that the scores do indeed suggest fixedness, the values of mutual information of the phrases in question will be compared to those of body part related phrases that do not behave as complex units (selja taga (back+behind) vs. pea all (head+under)), as well as to body part related phrases that consist of the same components as the phrases under investigation, but are formed with plural body part terms (selja taga (back+behind) vs. selgade taga (backs+behind)) (see section 4.1.).
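The comparison that mutual information makes can be illustrated with a minimal sketch. It assumes the pointwise formulation I(x, y) = log2(P(x, y) / (P(x) P(y))) of Church and Hanks (1990), with probabilities estimated as raw counts divided by corpus size. The function name and the counts are hypothetical and serve only as an illustration; they are not the figures or the implementation of the present study.

```python
import math


def mutual_information(pair_count: int, x_count: int, y_count: int,
                       corpus_size: int) -> float:
    """Mutual information I(x, y) = log2(P(x, y) / (P(x) * P(y))).

    Probabilities are estimated as count / corpus_size (an assumption of
    this sketch; other estimation schemes are possible).
    """
    if min(pair_count, x_count, y_count) <= 0:
        raise ValueError("all counts must be positive")
    p_xy = pair_count / corpus_size
    p_x = x_count / corpus_size
    p_y = y_count / corpus_size
    return math.log2(p_xy / (p_x * p_y))


# Invented counts: each component occurs 100 times in 10,000 tokens.
# One co-occurrence is exactly what chance predicts, so I is about 0.
chance_level = mutual_information(1, 100, 100, 10_000)

# Ten co-occurrences are ten times the chance expectation, so I is positive.
above_chance = mutual_information(10, 100, 100, 10_000)

print(chance_level, above_chance)
```

A score near 0 thus corresponds to co-occurrence at chance level, and increasingly positive scores to components that occur together more often than their individual frequencies would predict.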

22 To some extent, these terms are used as synonyms (e.g. Brinton, Traugott 2005). To avoid confusion, henceforth only fixedness will be used to refer to the fixation of the studied phrases.

The strength of association between the phrase as a whole and other elements in the sentential context is used to observe the productivity of the complex units. Productivity is here understood as the ability of a linguistic item to be used repeatedly to produce more instances of the same pattern (Crystal 2000: 310). Following Brinton and Traugott, productivity is here taken to be a scalar notion, i.e. there are more productive and less productive linguistic items (Brinton, Traugott 2005: 18). While grammaticalization often starts out in narrow contexts, the process of grammaticalization is associated with an increase in productivity, and grammatical items are considered to be of high productivity (Brinton, Traugott 2005: 17–18, 100, 109). Thus, the productivity of the studied phrases is of interest here because it can be used as one of the factors to determine the degree of grammaticalization among the studied phrases.

As the development of the function words studied here is still in its initial stages, it may be assumed that their use is still partially contextually restricted, i.e. unproductive. Indeed, dictionaries have treated most of the studied phrases as instances of figurative language. Most of the studied phrases are listed in the Phraseological Dictionary23 (Õim 2000) either as separate entries or as a part of a larger fixed expression. For instance, selja taga seisma lit. ‘stand behind [one’s] back’ has been listed as a phraseological expression meaning ‘to support somebody’. The same dictionary does not list käekõrval as part of any fixed expression. However, the database of Estonian verbal multi-word expressions24 lists käekõrvale võtma ‘take [something] beside [one’s] hand’ as a multiword expression25. The aim of the analysis of productivity in the present study is to systematically determine:

i strong collocates, which are suggestive of formulaic use of the studied phrases;

ii the number of examples that represent such formulaic uses and the number of examples that are freely combined.

Rich contexts are considered to suggest productive use of the complex units. However, if the use of the complex units is confined to certain restricted contexts, they may not be considered grammatical items but rather instances of fixed expressions.

In this type of grammaticalization, the productivity of the complex items is observed in two aspects – the occurrence of the complex postposition with a (pro)nominal complement (e.g. euro in example (50)) and the verb that co-occurs with the complex item. The verb that co-occurs with the complex unit is the verb that, together with the body part related complex item, expresses the relationship between the LM and the TR (varitsema ‘ambush’ in (50)).

23 http://www.eki.ee/dict/frs/ (Accessed 03.01.2016)

24 http://www.cl.ut.ee/ressursid/pysiyhendid/index.php?lang=en (Accessed 03.01.2016)

25 http://www.cl.ut.ee/ressursid/pysiyhendid/kasutajaliides?query=k%E4ek%F5rval (Accessed 03.01.2016)

(50) Kuus aasta-t ELi-s ei ole-ø eestlas-t kuigivõrd
six year-PRT EU-INE NEG be-CONNEG Estonian-PRT much

muut-nud, sest kolm kuu-d enne Euroopa-ø
change-PST.PTCP because three month-PRT before Europe-GEN

ühisraha-le ülemineku-t on palju-d asu-nud
common currency-ALL transition-PRT be.3PL many-PL start-PST.PTCP

euro-ø selja-ø taga varitse-va-Ø hinnatõusu-ø
Euro-GEN back-GEN behind.LOC lurk-PTCP-GEN price rise-GEN

hirmu-s oma-ø sääst-e kuluta-ma. [www.maaleht.ee]
fear-INE own-GEN saving-PL.PRT spend-SUP

‘Six years in the EU have not changed the Estonian much, because three months before the transition to the European common currency many have started to spend their savings in fear of the price rise lurking behind the euro’s back.’

To determine the association between the complex items and the (pro)nominal complement and the verb, a log-likelihood measure is used. The log-likelihood measure takes into account the frequency of both linguistic elements, the frequency of their co-occurrence, and the size of the corpus. The higher the log-likelihood score, the more closely bound the word pair. Log-likelihood is a widely used measure in linguistics. It can be used to find idiomatic expressions or other fixed word combinations and formulaic expressions (see Evert 2005: 21). This measure has been used on Estonian data (Uiboaed 2010), as well as to determine the strongest collocates of English complex prepositions. For instance, Hoffmann (2005: 78–79) implements this method to observe the strongest verb collocates of the complex preposition candidate in need of. He concludes that the very short list of collocates and the very high association score of the verb be suggest (along with other factors) that in need of should perhaps not be included in the list of common complex prepositions.
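How the four quantities named above enter the score can be shown with a minimal sketch. It assumes the common log-likelihood ratio (G²) computed over a 2×2 contingency table of observed and expected co-occurrence counts; the source does not specify which variant was used, and the function name and the counts below are hypothetical.

```python
import math


def log_likelihood(pair_count: int, x_count: int, y_count: int,
                   corpus_size: int) -> float:
    """Log-likelihood ratio G2 = 2 * sum(O * ln(O / E)) over a 2x2 table.

    pair_count: co-occurrences of item x and item y
    x_count, y_count: total occurrences of each item
    corpus_size: total number of observed pairs (or tokens)
    """
    # Observed cells: x with y, x without y, y without x, neither.
    observed = [
        pair_count,
        x_count - pair_count,
        y_count - pair_count,
        corpus_size - x_count - y_count + pair_count,
    ]
    rows = (x_count, corpus_size - x_count)
    cols = (y_count, corpus_size - y_count)
    # Expected cells under independence: row total * column total / N.
    expected = [r * c / corpus_size for r in rows for c in cols]
    # Cells with an observed count of 0 contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)


# Invented counts: both items occur 100 times in 10,000 observations.
print(log_likelihood(1, 100, 100, 10_000))   # co-occurrence at chance level
print(log_likelihood(10, 100, 100, 10_000))  # co-occurrence above chance
```

Note that the score itself only measures the departure from chance; whether a pair co-occurs more or less often than expected has to be read off by comparing the observed co-occurrence count with the expected one.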

III MATERIAL AND DATA SOURCES

This chapter is concerned with the data analyzed in the present study. In the following I will describe the selection of the postpositional phrases studied here, give an overview of the data sources and explain their selection.