Syntactic annotation and primary variables


2.7.2 Syntactic annotation and primary variables

The annotation for Chintang started at an early stage in the analysis, when it was already clear that S/A detransitivisation was functionally comparatively simple but other things, such as the role of arbitrary reference, the relation between quantifiability and identifiability, and the full range of parameters relevant to identification processes, had not been discovered yet. For this reason, quantifiability and identifiability were annotated side by side rather than as facets of one and the same phenomenon. The applied definition of identifiability was somewhat more conservative than the radical view presented in section 2.5, and arbitrary reference was not annotated at all. In addition to quantifiability and identifiability, various syntactic information was annotated that was not of direct interest to this study but to other ongoing research projects on Chintang.

A minimal amount of syntactic structure is represented by the variable domain, which marks elements that are syntactically associated by identical numeric IDs. For instance, all constituents (arguments and predicate) of the first sentence in a text get the domain ID 1, those of the next sentence get 2, and so forth. Nested structures can be indicated by slashes; for instance, 2/1 is the first clause embedded into sentence 2, and 2/1/3 is recursively embedded into sentence 2 and is the third element on level 2/1. Domains are not directly relevant for S/A detransitivisation but have a couple of indirect uses. First, they make it possible to locate objects within files easily. Second, they establish a link between an object and its predicate. Third, they make annotated texts easier to understand and allow some general statistics to be computed.
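The domain notation lends itself to simple processing. The following is a minimal Python sketch, assuming that each annotated element is stored as a record with a slash-separated domain string. The record layout, the field names, the label PRED for the predicate, and all helper functions are hypothetical and are not part of the actual annotation tooling.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Domain = Tuple[int, ...]


def parse_domain(domain_id: str) -> Domain:
    """Turn a domain string such as '2/1/3' into the tuple (2, 1, 3)."""
    return tuple(int(part) for part in domain_id.split("/"))


def parent(domain: Domain) -> Optional[Domain]:
    """Return the embedding domain, or None for a top-level sentence."""
    return domain[:-1] if len(domain) > 1 else None


def group_by_domain(elements: List[dict]) -> Dict[Domain, List[dict]]:
    """Collect all elements (arguments and predicate) that share a domain."""
    groups: Dict[Domain, List[dict]] = defaultdict(list)
    for element in elements:
        groups[parse_domain(element["domain"])].append(element)
    return dict(groups)


def predicate_of(element: dict, groups: Dict[Domain, List[dict]]) -> Optional[dict]:
    """Find the predicate in the same domain as the given element, if any."""
    for other in groups[parse_domain(element["domain"])]:
        if other["role"] == "PRED":  # PRED is an illustrative label only
            return other
    return None


# Hypothetical records: forms, roles and domains are invented for illustration.
elements = [
    {"form": "w1", "role": "A", "domain": "2"},
    {"form": "w2", "role": "PRED", "domain": "2"},
    {"form": "w3", "role": "P", "domain": "2/1"},
    {"form": "w4", "role": "PRED", "domain": "2/1"},
]

groups = group_by_domain(elements)
print(parent(parse_domain("2/1/3")))          # the embedding level of 2/1/3
print(predicate_of(elements[2], groups)["form"])
```

Representing a domain as a tuple of integers keeps the nesting explicit: the first member is always the sentence, and dropping the last member yields the embedding level.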

Another basic variable is role. The central roles were S, A, P, T, G with definitions based on Dowty (1991) and Bickel (2011) with some simplifications. In the definition of ditransitives, actual or metaphorical movement was taken as the central criterion distinguishing T and G. For copular clauses, the special labels CT (copular theme) and CR (copular rheme) were used instead of roles. Three other pseudo-roles were N.EXP (the experiencer noun featuring in the experiential frames), CSR (for the causer in causatives), and BEN (for an additional benefactor in benefactives).

The other, more directly relevant variables are shown with their values below. All variables had an additional value x that was to be used in cases of uncertainty and that was ignored in the statistical evaluation. More complete definitions can be found in the appended annotation guidelines (Appendix A).

• verb class – the lexical class of the verb as defined by its characteristic frame (cf. section 2.3.3):
  ◦ itr – intransitive
  ◦ tr – monotransitive
  ◦ dido – direct object ditransitive
  ◦ dipo – primary object ditransitive
  ◦ dioo – double object ditransitive
  ◦ exptr – transitive experiential
  ◦ expitr – intransitive experiential
  ◦ uninf – uninflected verboid
  ◦ aux – auxiliary
  ◦ other – any other minor class

• alternation – various syntactic alternations modifying the base frame. Where no alternation was present this variable stayed empty.
  ◦ sad – S/A detransitivisation
  ◦ idt – indeterminate as to S/A detransitivisation
  ◦ sod – S/O detransitivisation
  ◦ refl – reflexive
  ◦ recp – reciprocal
  ◦ ambrec – ambitransitive reciprocal
  ◦ pass – passive
  ◦ caus – causative
  ◦ ben – benefactive
  ◦ cop – copulative frame
  ◦ poss – possessive frame
  ◦ dumA – dummy A-AGR in the transitive experiential frame
  ◦ OtoS – S-AGR with embedded O in infinitival constructions

• quantifiability – as defined in section 2.6.1:
  ◦ qnt – quantifiable
  ◦ nonq – non-quantifiable

• identifiability – a less sophisticated version of the definition in section 2.5:
  ◦ def – identifiable for both speaker and hearer
  ◦ spec – identifiable for the speaker only
  ◦ idf – identifiable for neither speaker nor hearer

In the initial phase of the annotation, two files were worked through by both annotators independently and Cohen’s Kappa was calculated. Cohen’s Kappa (Cohen 1960) measures the proportion of interannotator agreement that is not due to chance. Table 2.10 shows the results for the central variables role, quantifiability, and identifiability.

2.7. QUANTITATIVE ANALYSIS BASED ON CORPUS DATA

session       variable         observed    expected chance  Cohen’s
                               agreement   agreement        Kappa
story rabbit  role             86%         17%              0.83
              quantifiability  93%         78%              0.67
              identifiability  95%         78%              0.79
kamce talk    role             77%         21%              0.71
              quantifiability  89%         78%              0.49
              identifiability  89%         71%              0.61

Table 2.10: Interannotator agreement for Chintang

As is well known in the literature (e.g. Carletta 1996, Sim and Wright 2005), there is no cutoff value for Cohen’s Kappa that is meaningful for all applications and thus generally accepted. One of the first proposals for evaluating Kappa is found in Landis and Koch (1977:165), according to which most of the values in Table 2.10 indicate “substantial” agreement (Kappa between 0.61 and 0.80). The only case where only a “moderate” level of agreement was reached (Kappa between 0.41 and 0.60) was quantifiability in the session kamce talk. Note, however, that Cohen’s Kappa is not only influenced by inter-annotator agreement. As noted by Sim and Wright (2005:261), the measure penalises high probabilities for chance agreement, so that the higher this probability, the lower Kappa. Both quantifiability and identifiability have high probabilities for chance agreement, between 70 and 80%, because some of their values are much more frequent than others (qnt 76%, def 69%). I therefore accepted the low value in question as sufficient.
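The relation between the three columns of Table 2.10 can be made explicit with the standard definition of Cohen’s Kappa, Kappa = (observed − expected) / (1 − expected). The following Python sketch recomputes Kappa from the rounded percentages in the table and attaches the verbal labels of Landis and Koch (1977). The function names are hypothetical, and because the inputs are rounded, the recomputed values can differ from the table in the second decimal.

```python
def cohens_kappa(observed: float, expected: float) -> float:
    """Cohen's Kappa from observed and expected chance agreement (proportions)."""
    if not 0.0 <= expected < 1.0:
        raise ValueError("expected chance agreement must lie in [0, 1)")
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa: float) -> str:
    """Verbal label following the bands proposed by Landis and Koch (1977)."""
    if kappa < 0.0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Rounded percentages taken from Table 2.10, converted to proportions.
table_2_10 = [
    ("story rabbit", "role", 0.86, 0.17),
    ("story rabbit", "quantifiability", 0.93, 0.78),
    ("story rabbit", "identifiability", 0.95, 0.78),
    ("kamce talk", "role", 0.77, 0.21),
    ("kamce talk", "quantifiability", 0.89, 0.78),
    ("kamce talk", "identifiability", 0.89, 0.71),
]

for session, variable, observed, expected in table_2_10:
    kappa = cohens_kappa(observed, expected)
    print(f"{session:13} {variable:16} {kappa:.2f} {landis_koch(kappa)}")

# The penalty for high chance agreement: same observed agreement (89%),
# but Kappa drops as the expected chance agreement rises.
print(cohens_kappa(0.89, 0.21), cohens_kappa(0.89, 0.78))
```

The last line illustrates the point made by Sim and Wright: with observed agreement held constant, a variable whose values are very unevenly distributed ends up with a much lower Kappa.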

2.7.3 The centrality of quantifiability

The results of the annotation confirm the central role of quantifiability for S/A detransitivisation. 95% of all non-quantifiable O referents co-occur with the S/A detransitivised frame, and 97% of all quantifiable O referents co-occur with the transitive frame. Unsurprisingly, a Fisher’s exact test on these numbers indicates an extremely high level of significance (p < 0.01) for the interaction between the two variables. Figure 2.5 visualises the proportions.

Figure 2.5: Quantifiability and S/A detransitivisation (categories: nonq, qnt; frames: default, sad)

The strength of the association can be expressed by Cramer’s V, which ranges between 0 (no association) and 1 (perfect association). Cramer’s V for quantifiability and S/A detransitivisation is 0.90. This value is far above anything that is reached for single variables in Nepali DOM (cf. section 3.6.4). S/A detransitivisation can thus be said to be linked much more tightly to a single variable and to be functionally less complex than DOM.
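For readers who want to see how such a value arises, the following Python sketch computes Cramer’s V for a 2×2 table as the square root of chi-square divided by n times (min(rows, columns) − 1). The cell counts are an assumption: they are illustrative figures back-calculated from the rounded percentages and the exception counts (38 and 19) reported in this chapter, not the actual corpus counts, and the function names are hypothetical. With these counts the result lands close to the reported 0.90.

```python
import math
from typing import List


def chi_square(table: List[List[float]]) -> float:
    """Pearson's chi-square statistic (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(table: List[List[float]]) -> float:
    """Cramer's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square(table) / (n * k))


# ILLUSTRATIVE counts only (assumption, not the corpus data).
# Rows: nonq, qnt; columns: sad frame, transitive frame.
illustrative_table = [
    [365, 19],
    [38, 1229],
]

v = cramers_v(illustrative_table)
print(f"Cramer's V = {v:.3f}")
```

A table with no association at all (identical row proportions) yields 0, and a table in which each row falls entirely into one column yields 1.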

For the other variable under investigation, identifiability, the numbers are less easy to read. While there are clear associations which also reach significance (χ², p < 0.01), the numbers fall behind those for quantifiability: 73% of all indefinite O have S/A detransitivisation, and 78% and 92% of all specific and definite O, respectively, have the transitive frame. Figure 2.6 visualises this.

Figure 2.6: Identifiability and S/A detransitivisation (categories: idf, spec, def; frames: default, sad)

Since definiteness was defined in section 2.5 as entailing specificity, it is possible to fuse the values spec and def into a category with the meaning ‘at least specific’. This category gets the transitive frame in 91% of all cases and therefore still stays slightly behind quantifiability.

Cramer’s V for identifiability and S/A detransitivisation is 0.64, which is still high compared to the values for DOM but low compared to the 0.90 reached by quantifiability. Cramer’s V for identifiability with def and spec fused also rounds to 0.64.

These results are unexpected given the discussion in section 2.5 and section 2.6, where quantifiability was viewed as a precondition for specificity, complemented by arbitrary reference. If this were truly the case, the fusion of def and spec should have produced equally good or better predictive results than quantifiability alone. However, as mentioned before, the definition of identifiability used for the annotation reflects an earlier stage of the analysis. Therefore, the aberrant behaviour of identifiability is an artefact of the annotation rather than a reflection of what is really going on. In some cases, mismatches between quantifiability and identifiability defined in a somewhat more conservative fashion point out some interesting differences between various conceptions of identifiability.

There are two kinds of such mismatches. One kind comprises cases where a referent was annotated as non-quantifiable but also as definite or specific. The majority of these cases can be traced back to the role of subamounts for quantifiability. As discussed in section 2.6.3.1, it is important whether a whole referent is affected or just some non-quantifiable subamount. While this distinction should in principle also be applied to identifiability, this would be somewhat less intuitive and was not done in the annotation. Consider, for instance, the example in (257):

(257) Pache  ciya  kha-pid-e                kinana  ciya  thu-i-hẽ.
      then   tea   1nsO-give-IND.PST[.3sA]  SEQ     tea   drink-1p[i]S-IND.PST

From the perspective of quantifiability it’s immediately clear that not all the tea is affected at once in the process of drinking but only a subamount, so the referent behind ciya was considered as non-quantifiable. For identifiability one could also have said that the precise affected amount is indefinite, but this seemed a bit awkward given the fact that all of the tea, which has just been mentioned, is much more relevant as a referent than the affected subamount. Therefore, referents like ciya in (257) were usually tagged as def.

Similarly, the other kind of mismatch – quantifiable indefinite referents – is for the most part due to examples like those shown in (258) and (259). Here, the O referents are clearly quantifiable because they are single referents. However, it’s not so clear whether they are also specific or definite. They are if one takes into account that sometimes very little information may be sufficient to identify a referent within a certain mental space, but in a more conservative view they are not:

(258) Sel-a        ekthopa  ta-ma     u-pi-c-o-nɨŋ.
      jackal-NTVZ  at.all   come-INF  3A-allow-d-3[s]O-NEG
      ‘They wouldn’t allow a jackal to come near at all.’ (CLC:ctn talk01.153)

(259) Ani-yɨŋ=le             u-nis-o-ko,              aru    u-nis-o-ko-nɨŋ.
      1piPOR-language=RESTR  3pA-know-3[s]O-IND.NPST  other  3pA-know-3[s]O-IND.NPST-NEG
      ‘They only know our language, they don’t know (any) other.’ (CLC:Durga Exp.55-56)

To summarise, quantifiability is of central importance to S/A detransitivisation. Identifiability would have been expected to yield the same predictive results under the strict definition presented in section 2.5, but since its definition for the annotation reflects an earlier stage of the analysis, the results of the annotation are not as relevant for the discussion of the function of S/A detransitivisation as they are for the meta-question of which definition of identifiability works best.

2.7.4 The role of exceptions

As stated above, quantifiability is the central variable for S/A detransitivisation. However, there are some cases which it does not explain. If we look at the counts from the perspective of prediction, quantifiability correctly predicts 97% of all frames (within the binary choice we are looking at here), or 98% of all transitive frames and 91% of all S/A detransitivised frames.
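The three percentages answer slightly different questions: the first is computed over all clauses, the other two only over the clauses that actually show a given frame. The Python sketch below makes the difference explicit. The counts are an assumption: they are illustrative figures back-calculated from the rounded percentages and the exception counts reported in this chapter, not the actual corpus counts, so the results only approximate the reported values. All names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# ILLUSTRATIVE counts only (assumption, not the corpus data).
# Keys: (quantifiability of O, observed frame); "tr" = transitive frame.
counts: Dict[Tuple[str, str], int] = {
    ("nonq", "sad"): 365,
    ("nonq", "tr"): 19,
    ("qnt", "sad"): 38,
    ("qnt", "tr"): 1229,
}

# The prediction under test: non-quantifiable O takes the S/A detransitivised
# frame, quantifiable O takes the transitive frame.
PREDICTED_FRAME = {"nonq": "sad", "qnt": "tr"}


def accuracy(counts: Dict[Tuple[str, str], int],
             frame: Optional[str] = None) -> float:
    """Share of clauses whose frame matches the prediction.

    If `frame` is given, only clauses with that observed frame are counted.
    """
    correct = 0
    total = 0
    for (quant, observed_frame), n in counts.items():
        if frame is not None and observed_frame != frame:
            continue
        total += n
        if PREDICTED_FRAME[quant] == observed_frame:
            correct += n
    return correct / total


print(f"all frames:        {accuracy(counts):.1%}")
print(f"transitive frames: {accuracy(counts, 'tr'):.1%}")
print(f"sad frames:        {accuracy(counts, 'sad'):.1%}")
```

Because the S/A detransitivised frame is much rarer than the transitive frame, its exceptions weigh more heavily in the per-frame figure than in the overall one.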

In a way this is what is expected. In section 2.6.2 it was stressed that quantifiability can be overridden by open (unknown or arbitrary) reference. In section 2.6.5 several cases were discussed where S/A detransitivisation is conventional. An inspection of the 38 annotated cases where a quantifiable O referent was used with S/A detransitivisation shows that almost all of them fall into one of the categories discussed in the two sections mentioned. 18 (47%) have arbitrary reference (partially with grammaticalisation in the case of frequent composite activities), 13 (34%) have pieces of information in O, and another 5 (13%) fall into other minor categories. The only two cases that cannot be explained at all (5%, or 0.5% of all instances of the S/A detransitivised frame) are shown in (260) and (261).

(260) Hun-ce  u-nis-a=kha         raicha!
      MED-ns  3pS-know-PST=NMLZ2  MIR
      ‘They had known it!’ (CLC:phidang talk.447)

(261) Pa      u-chau       pokt-e               kina  huŋ=go     khad-a-kt-e=ta.
      father  3sPOR-child  leave-IND.PST[.3sS]  SEQ   MED=NMLZ1  go-PST-IPFV1-IND.PST[.3sS]=FOC
      ‘After the father_i had left his child_j, he_j himself was going away.’ (CLF:sadstory RM.122)

The cases where a non-quantifiable referent was used with the transitive frame were fewer (19). The subtype that is easiest to understand here is one where the speaker chose to represent a non-quantifiable referent by a non-singular NP (where S/A detransitivisation is extremely rare, as discussed in section 2.6.3.1) or by a non-singular agreement prefix (kha- [1nsO], mai- [1nsiO]). Two examples are shown below.

(262) Warisa-ce       cahı̃   mai-mek-no=mo            kina  na    hun-ce
      young.woman-ns  RETRV  1nsiO-like-IND.NPST=CIT  SEQ   CTOP  MED-ns
      dukkha   pi-ma-ce       mahaʔ=kha          gonei.
      trouble  give-INF-3nsO  be.not.good=NMLZ2  ATTN
      ‘When we think that the girls like us we shouldn’t give trouble to them.’ (CLC:khinci talk.081-082)

(263) Koikoi  lɨk-ma     kha-pi-nɨʔ-nɨŋ=pho.
      some    enter-INF  1nsO-allow-IND.NPST-NEG=REP
      ‘(I heard that) they don’t allow some of us to enter.’ (CLC:chintang sahid.263)

This subtype constitutes 7 (37%) of the cases in question. Another, less transparent subtype seems to occur when O is filled by a dummy referent corresponding to the general situation. This type covers another 6 cases (32%). (264) is an example.

(264) Anaŋ  me-u-m,                        huŋ-khi=ta   nahaŋ!
      what  do.with-3[s]O-[SUBJ.NPST.]1pA  MED-MOD=FOC  but
      ‘What shall we do (with this whole thing), that’s just the way it is!’ (CLC:tang talk.146)

Finally, in 5 cases (26%) the speaker seems to have picked out an exemplar in order to refer to a category:

(265) Masala  kiya  mai-ta-yokt-a-kt-e                    ba=go       teı̃-be,
      spice   oil   NEG-come-NEG-PST-IPFV1-IND.PST[.3sS]  PROX=NMLZ1  village-LOC1
      abo  na    gududu-wa    u-tad-u-l-o-ŋs-e!
      now  CTOP  steady-ADVZ  3pA-bring-3[s]O-back-3[s]O-PRF-IND.PST
      ‘Spices and oil were not coming to this village, but now they bring the stuff non-stop!’ (CLC:khim ring.106-107)

There is only a single case (5%, or 0.08% of all instances of the transitive frame) where I do not have the slightest idea what could have conditioned the frame. This sentence is shown in (266).

(266) Tara  hana=yaŋ  sapphi  a-numd-o-ko              onei.
      but   2s=ADD    much    2[s]A-do-3[s]O-IND.NPST  ATTN
      ‘But you also do a lot.’ (CLC:tang talk.218)

To summarise, the majority of exceptions can be attributed to arbitrary reference, grammaticalisation, and varying construals. Thus, S/A detransitivisation is a phenomenon with a rigid core conditioned by quantifiability and with flexible fringes. It would be an interesting question to ask whether there was more or less flexibility in earlier times, but since there are no written records prior to the arrival of CPDP, this is impossible to answer.

In document Object-conditioned differential marking in Chintang and Nepali (Page 121-126)