2.2 Selection of the appropriate modality
2.3.6 Negation of modal markers
For those who consider modality an expression of the subjectivity or attitude of the speaker, negation can be understood as another marker that modifies the proposition (Masuoka & Takubo, 1992; Sanz Alonso, 1996). In this study, however, negation is not considered to carry modal meaning; it is treated as a semantic modifier of modality, an element attached to an auxiliary or a modal adjective that can change the type of modality it expresses, as we saw in Section 2.2.1. The most relevant issue concerns the modal auxiliaries: since they consist of an auxiliary verb paired with a main verb, the negative element can affect either of the two. If the negation affects the auxiliary, the modality changes, as the rules of logic predict. If it affects the main verb, the modality does not change. Examples (56) and (57) illustrate this.
(56) a. No pued-o com-er nada más
neg can-pres.MODaux eat-inf anything more
‘I can’t eat anything else’ (= It is not possible, i.e. necessary not, to eat more)
b. No deb-o com-er nada más
neg must-pres.MODaux eat-inf anything more
‘I mustn’t eat anything else’ (= necessary not, to eat more)
(57) a. 賛成 が でき-ません
sansei ga deki-masen
agree nom can-neg.MODneg
‘(I) can’t agree (with you)’ (= It is not possible, i.e. necessary not, to agree)
b. 私 もう 当たり-た-くない から 最初 から 外野 行く タイプ、
watashi mō atari-ta-kunai kara saisho kara gaiya iku taipu
I again get hit-want-neg.MODneg since beginning from outfield go-pln type
‘I don’t want to get hit again so I’m staying in the outfield from the beginning’ (= it is necessary not to get hit)
Studies have shown that this occurs in many other languages (Palmer, 2001; Radden, 2014) and that it remains a challenging feature for natural language processing tasks (Dowty, 1994; Wilson et al., 2009; Councill et al., 2010); Spanish and Japanese are no exception. Although negation can modify modal auxiliaries, it does not affect them all equally. As Kataoka (2012) argues, the issue relates not only to the scope but also to the polar point of the negative element (Fauconnier, 1975; Ladusaw, 1979; Huddleston & Pullum, 2002). That is, although every construction formed by a main verb and an auxiliary or suffix is under the scope of the negative element, the polar point depends on the modal auxiliary used. In sentences 56a and 57a the polarity of negation falls on the auxiliary. The modality is negated and hence its type changes, in this case from possibility to necessity¹³. However, in other constructions, such as 56b and 57b, the modality does not change. The necessity is maintained, apparently breaking the rules of negative logic operations (Hintikka, 2002). In these cases, the focus of the negation is understood to be on the main verb, and it therefore does not affect the modal semantics. The discussion of which modal marker belongs to each type will take place in Chapter 3.
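The logical relation invoked here can be written out with the standard duality of modal operators. The notation is an assumption of this sketch rather than the study's own: \Diamond stands for possibility, \Box for necessity and p for the proposition expressed by the main verb (the symbols require the amssymb package).

```latex
\begin{align*}
\neg \Diamond p &\equiv \Box \neg p
  && \text{56a, 57a: a negated possibility is a necessity that not } p \\
\neg \Box p &\equiv \Diamond \neg p
  && \text{what the same rule would predict for a negated necessity} \\
\Box \neg p &
  && \text{reading actually found in 56b, 57b: negation falls on the main verb}
\end{align*}
```

The third line is why 56b and 57b only appear to break the rule: the negation never applies to the modal operator, so the duality in the second line is not triggered.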
Grammatically, negation can be expressed in many different ways: at the lexical level, normally with an adverb or an auxiliary; at the morphological level, with affixes; or semantically, with predicates expressing doubt, opposition, etc. (Bosque, 1980, p. 26). These elements act as syntactic operators (RAE, 2009, p. 3631) that apply the negative notion to the constituents under their scope, or area of influence. We are interested in those lexical negative elements of the sentence that modify, or have scope over, the modal marker, whether in Spanish or in Japanese.
In Spanish, lexical negation is performed by different syntactic classes, mainly the adverbs no (“no”), nunca (“never”), jamás (“never”) and tampoco (“neither”, “nor”). The problem when processing modality automatically is detecting correctly which negative element affects the modal marker. Spanish proves especially problematic in spoken discourse because the words may be separated: the negative element can appear in different positions in the sentence, and the modal marker can fall outside its scope.
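The distance problem described above can be illustrated with a minimal Python sketch. It is not the program used in this study: the list of modal forms, the window size and the use of punctuation as a clause boundary are all assumptions made for the example; only the four negative adverbs come from the text.

```python
import re

NEGATORS = {"no", "nunca", "jamás", "tampoco"}  # adverbs listed in the text
# Hypothetical, deliberately tiny list of modal auxiliary forms (assumption).
MODAL_FORMS = {"puedo", "puedes", "puede", "debo", "debes", "debe"}
# Punctuation used as a crude clause boundary (assumption).
BOUNDARIES = {",", ".", ";", "?", "!"}


def tokenize(text):
    """Lowercase the text and split it into words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def negated_modals(text, window=3):
    """Return (negator, modal) pairs where a negator precedes a modal form
    by at most `window` tokens with no clause boundary in between."""
    tokens = tokenize(text)
    pairs = []
    for i, tok in enumerate(tokens):
        if tok not in NEGATORS:
            continue
        for nxt in tokens[i + 1 : i + 1 + window]:
            if nxt in BOUNDARIES:
                break  # the negator's clause ends before any modal is reached
            if nxt in MODAL_FORMS:
                pairs.append((tok, nxt))
                break
    return pairs


print(negated_modals("No puedo comer nada más"))
print(negated_modals("No, yo puedo comer"))
```

A fixed window is the simplest possible approximation of scope. In spontaneous speech it will both miss distant negators and pair unrelated ones, which is precisely the difficulty noted for Spanish.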
In Japanese, negation of a predicative element is performed mainly through an inflecting auxiliary, especially the grammaticalised adjective ない (nai) (Kaiser et al., 2013, p. 154). This auxiliary is attached to elements that can be inflected, i.e. verbs and adjectives. Predicative adjectives, since they are not inflected, must use the copula in its inflected negated form. Consequently, the negative ない (nai) has several variants. When attached to the copula, it becomes ではない (dehanai) in formal contexts, but in more spontaneous ones it can also appear as はない (hanai), がない (ganai), じゃない (janai), りゃない (ryanai), etc., depending on the consonant of the preceding syllable (Kaiser et al., 2013, p. 444). There are also variants depending on the type of discourse: in the written form it can appear as ぬ (nu), ず (zu), ざる (zaru) or にあらず (niarazu), whereas in spoken language the ending nai can be shortened to ん (n). Finally, the formal equivalent of nai is the inflection ません (masen), which becomes では/じゃありません (dewa/ja arimasen) when used with the copula. The distance problem found in Spanish is less of an issue in Japanese, since the negative markers appear attached to the auxiliary. The main problem when processing negation in Japanese is variation.
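Because the difficulty in Japanese is variation rather than distance, one way to approach it is to match the listed variants at the end of a predicate. The following Python sketch assumes plain surface strings and a hypothetical function name. A real system would more plausibly work on the output of a morphological analyser, since one-kana variants such as ん, ぬ and ず will also match many non-negative words.

```python
import re

# Surface variants of the negative ending mentioned in the text.
NEG_VARIANTS = [
    "ではありません", "じゃありません", "にあらず", "ではない", "じゃない",
    "りゃない", "ません", "はない", "がない", "ざる", "ない", "ぬ", "ず", "ん",
]
# The pattern is anchored at the end of the string, so the leftmost match
# found by search() is also the longest variant that closes the predicate.
NEG_SUFFIX = re.compile("(" + "|".join(NEG_VARIANTS) + ")$")


def negative_ending(predicate):
    """Return the negative variant that closes `predicate`, or None."""
    match = NEG_SUFFIX.search(predicate)
    return match.group(1) if match else None


for form in ["できません", "学生じゃありません", "当たりたくない", "行く"]:
    print(form, negative_ending(form))
```

Keeping the variants in one list mirrors the listing approach taken for the modal markers themselves: new spoken forms observed in the corpora can be appended without touching the matching logic.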
This concludes Chapter 2 of the study, which has set out the theoretical ideas that serve as its foundation. First, we gave a brief overview of the history of the concept of modality, how it has been studied by the most important linguists, philosophers and psychologists over the last centuries, and how it is considered today.
Secondly, we have explained the stance taken regarding modality, mainly defining it as a psychological connection between the mind of the speaker and a state of affairs (SOA) in the external world. The words that encode modality in the sentence state whether an SOA is necessarily true or, on the other hand, possibly true. A second level of classification distinguishes epistemic modality, if the speaker believes the SOA to be necessary or possible, from deontic modality, if he or she desires it to be necessarily or possibly true. At this level, however, we may encounter a great deal of ambiguity, since one marker may have both epistemic and deontic readings.
Finally, we have clarified that modality is represented grammatically by a modal marker, an element that is not present in all sentences, but only in those in which modality is overtly expressed. A modal marker needs to be marked, grammaticalised or registered by previous studies as an element with strong modal content, and has to modify a verb according to the rules of dependency grammar, as this is the most appropriate position for a comparative and computational study. The definition of a modal marker is subordinate to the definition of modality. Here we have taken a more restricted approach, considering only those elements that modify a verb by adding a meaning of necessity or possibility, thereby completing the semantics of the verb mood.
The next chapter will build on these ideas and explain the methodology followed in this study: the corpora and computational tools used, and how the markers in the corpora were tagged, including the selected tagset.
3.1 Steps
The previous chapter described the theoretical considerations that serve as the foundation of the study. We now move on to the description of the methodology, along with the data and the tools used. To recall, the objective of the work is divided into three main parts:
1. Selection of the appropriate approach towards modality for this work.
2. Development of a quantitative comparable study of modality from Spanish and Japanese spoken corpora.
3. Automatic implementation of modality annotation for future studies.
Chapter 2 has covered (1), the trends regarding modality from the last centuries until today. It concluded that modality signals the possibility or necessity of a state of affairs becoming true, whether perceived (epistemic) or desired (deontic) by the speaker. Modality is a semantic value, coded into grammatical elements that modify the verb, according to dependency syntax, and that add specialised meaning to its mood. The present chapter describes the development process for (2) and (3); Chapter 4 will cover (2) and the final Chapter 5 will explain (3). We now shift from theoretical to empirical information, from the present to the future: how the previous theoretical insights may apply to today’s language and to possible future texts. As with the theoretical decisions, the aim in this study is to follow a simple but precise and well-planned methodology: preparation, annotation, observation and implementation.
The preparation phase consists of two main steps. The first is the configuration of the tagset that will be used in the annotation of the corpora and by the automatic tagger. The objective is a comparative annotation and study; hence, the procedure uses the same XML tags, symbols assigning descriptive information to elements of the text (Leech & Smith, 1999), for both languages. The second is a listing of each possible modal marker in both languages. This listing draws on information found in the literature and on personal knowledge, and was refined iteratively after observing usage in the corpora.
Next, in the annotation phase, after the corpora had been cleaned and prepared for XML annotation, the markers compiled in the preparation phase were searched for and tagged in the texts. The procedure was carried out manually and semi-automatically, using the established tagset. Any new information gathered from the texts, such as new or different markers or problematic cases, was added to the listing. In our case, the XML tags assign modal information as well as additional characteristics of the marker as it appears in the text: whether it is negated, whether an element is missing through ellipsis or overlapping, whether it is separated by other words, and whether there is a misspelling or an error.
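To make the kind of annotation described above concrete, the following Python sketch wraps a marker in an XML element carrying modal and negation information. The element names `u` and `mod`, the attribute names `type` and `neg`, and the function name are hypothetical placeholders; the tagset actually used in the study is the one presented in Section 3.3.2.

```python
import xml.etree.ElementTree as ET


def tag_marker(text, marker, modality, negated=False):
    """Wrap the first occurrence of `marker` in `text` in a <mod> element.

    Element and attribute names are illustrative, not the study's tagset.
    """
    start = text.find(marker)
    if start == -1:
        raise ValueError(f"marker {marker!r} not found")
    utterance = ET.Element("u")
    utterance.text = text[:start]  # words before the marker
    mod = ET.SubElement(
        utterance,
        "mod",
        {"type": modality, "neg": "yes" if negated else "no"},
    )
    mod.text = marker
    mod.tail = text[start + len(marker):]  # words after the marker
    return ET.tostring(utterance, encoding="unicode")


print(tag_marker("No puedo comer nada más", "puedo", "possibility", negated=True))
```

The remaining characteristics mentioned in the text (ellipsis, overlapping, separation, misspelling) could be encoded as further attributes in the same way.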
Following this, the observation phase consists of a quantitative analysis of the modal markers found in the corpora. The objective here is to observe the usage of modality in natural spoken Spanish and Japanese discourse and to confirm a series of hypotheses. The results of these analyses will be presented in Chapter 4.
The last stage of the study is the automatic implementation of modality annotation, that is, the development of a program that can automatically find and tag these markers in a new text. The program is rule-based and draws on the theoretical information explained in Chapter 2 and on the information extracted from the corpus study in Chapter 4. Its development, along with the problems and challenges that Spanish and Japanese present in this area, is explained in Chapter 5. The complete script of the program can be found in Appendix B.
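As a rough illustration of what a rule-based, lexicon-driven search can look like, the Python sketch below looks up marker forms in a small dictionary and reports their position and modality type. It is only a sketch and not the program in Appendix B: the two-entry lexicon and the function name are assumptions, with the types taken from examples (56a) and (56b).

```python
import re

# Hypothetical two-entry lexicon; the types follow examples (56a) and (56b).
LEXICON = {"puedo": "possibility", "debo": "necessity"}
# One alternation over all lexicon entries, matched as whole words.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, LEXICON)) + r")\b", re.IGNORECASE
)


def find_markers(text):
    """Return (start, end, marker, modality) for every lexicon hit in `text`."""
    return [
        (m.start(), m.end(), m.group(1), LEXICON[m.group(1).lower()])
        for m in PATTERN.finditer(text)
    ]


print(find_markers("No debo comer, pero puedo"))
```

Returning character offsets rather than modified text keeps the search step separate from the annotation step, so the same hits can later be negation-checked or written out as XML tags.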
Figure 9 summarises the steps taken in the study:
Figure 9: Methodology followed in the study
Each step will be explained in the following sections: the description and preparation of the corpora (Section 3.2), the annotation language and tagset (Section 3.3.2), the computational tools used throughout the study (Section 3.4) and the discussion of each modal marker (Section 3.5).