
2.7 Acceptability judgement experiment

2.7.2 Pretest

2.7.2.4 Results

The AI of a given verb was calculated as the mean of the object-related change ratings across all subjects. Figure 2.1 shows the AI for each verb in ascending order: the lowest AI of 1.8 was obtained for the verb bewundern/admire and the highest AI of 7 for the verb enthaupten/decapitate. The verbs in between cover the whole range of the scale in a (nearly) continuous way, with a mean AI of 4.82 (median: 4.92, SD: 1.23).

The five verbs with the lowest AI are bewundern/beschatten/befragen/begrüßen/beraten (admire/tail/interview, interrogate/greet/advise), all of which indeed do not necessarily entail any kind of change in the participant realised as object and can be associated with […]

⁸ In addition to the rating about AI, participants provided two further ratings for each sentence. Since these do […]

Figure 2.1: AI rating for each verb (sorted by AI in ascending order; figure has been rotated by 90 degrees).

[…] AI are enthaupten/töten/ermorden/beseitigen/heilen⁹ (decapitate/kill/murder/remove/cure), which all entail a concrete and definite change in the object participant and belong to the highest level of the Affectedness Hierarchy, quantised change. While verbs in the mid-range are more mixed, the five verbs around the median AI of 4.92 – ausbeuten/bestechen/erpressen/unterdrücken/zurückweisen (cheer up/exploit/bribe/blackmail/oppress, subdue) – are largely consistent with classification as ‘potential for change’ or ‘non-quantised change’ predicates, i.e. the two levels covering the middle of the Affectedness Hierarchy. The continuous AI therefore seems to capture reasonably well the degree of affectedness entailed for a verb’s direct object, and hence its semantic prominence as conceptualised by Beavers (2010, 2011).

2.7.3 Acceptability judgement experiment

2.7.3.1 Materials

The -ung-nominalisations of the 85 base verbs (see also table A.1 in Appendix A for details on the nominalisations used) from the pretest were used to create materials for eliciting acceptability judgements about the possibility of linking GenS and GenO with each of these. To do so, sentence pairs were constructed for each nominalisation; these expanded the stimulus sentences applied in the pretest into short ‘stories’, as in examples (97) and (98): the ﬁrst sentence (henceforth ‘context sentence’ – see (97a) and (98a)) always set up the discourse context by describing an event or situation involving two participants. These corresponded to the respective base verb’s subject and object NPs used in the AI-rating pretest or were selected according to the same criteria stated above. In this context sentence, the base verb of the respective nominalisation was not used directly; rather synonyms of the respective verb or paraphrases were used. The second sentence (henceforth ‘continuation sentence’ – see (97b) and (98b)) represented a continuation of the story introduced by the context sentence; it featured the respective -ung -nominalisation as head of the sentence-initial subject NP. The nominalisation was followed by a post-nominal genitive representing the subject of the context sentence (i.e. GenS) or its object (i.e. GenO). The whole subject NP containing the nominalisation and the postnominal genitive was followed by the main verb inﬂected for past tense and further sentence material completing the sentence.

(97) Example story for Bewunderung (admiration):
     a. Context sentence with ‘role allocation’:
        Der     Sohn    himmelte den     Star     Tag und Nacht an.
        the.NOM son.NOM adored   the.ACC star.ACC day and night PART
        ‘The son adored the star day and night.’
     b. Continuation sentence:
        Die Bewunderung des     Star-s_GenO / des     Sohn-es_GenS beunruhigte den Vater  nach  einiger Zeit.
        the admiration  the.GEN star-GEN    / the.GEN son-GEN      worried     the father after some    time
        ‘The admiration of the star/of the son worried the father after some time.’

(98) Example story for Ermordung (murder/assassination):
     a. Context sentence with ‘role allocation’:
        Der     Räuber     erschoss den     Wächter   während seiner Flucht aus  der Bank.
        the.NOM robber.NOM shot     the.ACC guard.ACC during  his    flight from the bank
        ‘The robber shot the guard during his flight from the bank.’
     b. Continuation sentence:
        Die Ermordung     des     Wächter-s_GenO / des     Räuber-s_GenS schockierte die Angestellten in der Bank.
        the assassination the.GEN guard-GEN      / the.GEN robber-GEN    shocked     the employees    in the bank
        ‘The assassination of the guard/of the robber shocked the bank’s employees.’

⁹ Note that heilen (cure, heal) can occur as a syntactically intransitive ‘unaccusative’ verb – however, in this usage the subject slot must be filled by an entity which does not denote a human or animal, such as a body part as in der Arm heilte/the arm healed, whereas human subjects are unacceptable as in ?*der Patient heilte/the patient healed. Since subject and object NPs always referred to humans in this and all following experiments, it is clear that heilen here is used in its syntactically transitive form. Note that Blume (2000, p. 181) also lists heilen as a typical case of a semantically transitive verb.

For each of the 85 nominalisations, two different stories were constructed, resulting in a total of 170 stories. Each of these, in turn, was paired with either GenO or GenS as an argument to the respective -ung-nominalisation in its continuation sentence, yielding a pool of 340 sentence pair stimuli. These were distributed over four lists: each list contained 85 stories, one for each nominalisation; half of these featured GenS in the continuation sentence (condition ‘GenS’) and the other half GenO (condition ‘GenO’), with each condition covering the range of the AI representatively. Subjects were assigned to one of the four experimental lists on a random basis and the number of subjects per list was balanced. The order of presentation of stimuli within a list was again randomised for each participant. The complete set of stimuli is provided in Appendix B.
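The arithmetic of the stimulus pool and its distribution over lists can be illustrated with a minimal Python sketch. The rotation scheme used here is an assumption chosen for illustration, not the procedure actually used to build the lists; in particular, it does not model the requirement that each condition cover the AI range representatively. Since 85 is odd, the split between conditions within a list can only be approximately even (42 versus 43 items in this sketch).

```python
from itertools import product

N_NOMINALISATIONS = 85            # number of -ung-nominalisations (from the text)
STORIES = (1, 2)                  # two stories per nominalisation
LINKINGS = ("GenO", "GenS")       # genitive linking conditions
N_LISTS = 4                       # experimental lists

# The four (story, linking) versions available for each nominalisation.
VERSIONS = list(product(STORIES, LINKINGS))


def build_lists():
    """Hypothetical rotation: each nominalisation contributes one version per list."""
    lists = [[] for _ in range(N_LISTS)]
    for nom in range(N_NOMINALISATIONS):
        for k in range(N_LISTS):
            story, linking = VERSIONS[(nom + k) % N_LISTS]
            lists[k].append((nom, story, linking))
    return lists


lists = build_lists()
pool = {item for lst in lists for item in lst}

print("stories:", N_NOMINALISATIONS * len(STORIES))
print("stimulus pool:", len(pool))
for k, lst in enumerate(lists, start=1):
    n_gens = sum(1 for _, _, linking in lst if linking == "GenS")
    print(f"list {k}: {len(lst)} items, {n_gens} GenS, {len(lst) - n_gens} GenO")
```

With this rotation every one of the 340 stimuli appears in exactly one list and every list contains each nominalisation exactly once.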

2.7.3.2 Procedure

The acceptability ratings were again collected via the internet using the WebExp software of Keller et al. (1998). On the ﬁrst page, participants were informed about the general institutional background of the experiment and use of the data collected. They were informed that they would be asked to indicate general personal information such as age and sex, but that they would not be asked for their names and that the experiment would be analysed anonymously. In addition, participants were told that only native speakers of German were eligible for participation in the study.

During the experiment, each story was presented on a separate page, with the context sentence printed in black font and the continuation sentence in green font. Participants were told that they would read one pair of sentences on each page and that the first (black) sentence introduced a certain situation or event with two participants. They were further informed that the first sentence would define the ‘role’ each of the participants played in this event. The participants were then told that the second (green) sentence was intended to represent a continuation of the event/situation set up by the first one and that only one of the two participants introduced would appear in this continuation sentence. The subjects were instructed to judge whether the participant appearing in the continuation sentence still had the same role as in the context sentence – and thus could act as a coherent continuation of the event/situation introduced by the first sentence – or not. The judgement was provided by selecting ‘Yes’ or ‘No’ from a box titled ‘Continuation’.

Since the GenO linking is assumed to be the preferred linking option throughout and occurs much more frequently in natural speech, the participants were asked to base their judgements on whether the second sentence could in principle act as a continuation, even if accepting this interpretation took some mental effort – if so, they should select ‘Yes’, if not ‘No’. They were also informed that there were no ‘correct’ or ‘incorrect’ answers, and they were instructed to follow their own linguistic gut feeling and to provide their responses quickly. The instructions contained two commented example sentence pairs. After participants had provided judgements on four example stories, the experimental stimuli were presented page by page.

2.7.3.3 Hypotheses

As outlined at the beginning of section 2.7.3, the primary aim of the acceptability judgement experiment was to test whether the acceptability of GenS linking in eventive -ung- nominalisations is indeed graded, as I conjectured in section 2.4. There, I proposed that a graded notion of aﬀectedness as developed by Beavers (2010, 2011) may capture intuitions more adequately than the approach of Ehrich and Rapp (2000). If empirically supported, the linking mechanism investigated here may indeed be conceptualised as being semantically determined and a result of the semantic prominence of GenO, as indexed by its degree of aﬀectedness. As I pointed out, a closely related interpretation involves the degree of semantic transitivity as determined by the semantic distance between the GenO and GenS co-arguments.

The examples in (86), based upon my own intuitions and repeated here, illustrated the crucial pattern mapped onto Beavers’ Affectedness Hierarchy, with acceptability of GenS linking decreasing from a maximum to a minimum as affectedness increases, while GenO linking always seems equally acceptable.

(86) a. Unspecified for change:
        die Bewunderung / Begrüßung ✓✓des Patient-en_GenO / ✓✓des Arzt-es_GenS
        the admiration / greeting the.GEN patient-GEN / the.GEN doctor-GEN
        ‘the admiration/greeting of the patient/the doctor’
     b. Potential for change:
        die Behandlung / Untersuchung ✓✓des Patient-en_GenO / ✓des Arzt-es_GenS
        the treatment / examination the.GEN patient-GEN / the.GEN doctor-GEN
        ‘the treatment/examination of the patient/the doctor’
     c. Non-quantised change:
        die Sedierung / Ermutigung ✓✓des Patient-en_GenO / ✓?des Arzt-es_GenS
        the sedation / encouragement the.GEN patient-GEN / the.GEN doctor-GEN
        ‘the sedation/encouragement of the patient/the doctor’
     d. Quantised change:
        die Heilung / Einlieferung ✓✓des Patient-en_GenO / *des Arzt-es_GenS
        the cure / hospitalisation the.GEN patient-GEN / the.GEN doctor-GEN
        ‘the cure/hospitalisation of the patient/the doctor’

With degrees of aﬀectedness re-operationalised as the continuous AI, the following hypotheses are posited:

1. Analysis of the acceptability ratings should reveal an interaction of ‘Linking’ (‘GenO’ and ‘GenS’ conditions) and ‘AI’, with the simple slopes of the continuous ‘AI’ variable exhibiting the following patterns:

(a) In the ‘GenS’ condition, ‘AI’ should be associated with a negative slope, showing a signiﬁcant decrease in acceptability with increasing AI, as discussed in section 2.4. Increasing semantic prominence of GenO should thus result in lower acceptability of GenS linking.

(b) Following the claim of Ehrich and Rapp (2000) that GenO arguments can always be linked independently of the semantic structure of the underlying base verb, acceptability should be high or at ceiling throughout the range of ‘AI’ in the ‘GenO’ condition, without a significant effect of ‘AI’. The semantic prominence of GenO should thus not influence acceptability of GenO linking.

2. Acceptability of GenO and GenS linking should be at a similar level at the lower end of the AI range.

2.7.3.4 Participants

Complete ratings were collected from 40 native German speakers with a mean age of 28.3 (range 19–68, 8 male, 32 female). Note that again the data of a number of additional participants were not saved due to technical problems, so their data had to be discarded. For participation in the study, participants could opt for course credits or for taking part in the drawing of a voucher for an online store.

2.7.3.5 Data analysis

The binary judgement data were analysed using logistic mixed models (or generalised linear mixed models – GLMMs) via the glmer function of package lme4 (Bates, Maechler, Bolker, & Walker, 2014) for the statistical environment R (R Development Core Team, 2013). Fixed effects included in the model were ‘c.AI’ and ‘c.Linking’ as well as their interaction, testing for linear effects only. All variables were mean-centred, as indicated by the prefix ‘c.’; note that this reduces collinearity between the main effects and their interaction term and changes the interpretation of the intercept estimate to the estimate of the overall mean. ‘GenO’ was defined as the reference level for the ‘Linking’ variable. Significance of the fixed effects was assessed using p-values based upon Wald z-scores.¹⁰
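The effect of mean-centring can be illustrated with a small Python sketch. The toy data below are hypothetical (four observations, using only the lowest and highest AI from the pretest), and the 0/1 dummy coding with ‘GenO’ as 0 is an assumption about how the reference level translates into numbers. In a balanced design, centring such a dummy yields the values −0.5 and +0.5, and in this toy example the centred product term is uncorrelated with the centred ‘Linking’ predictor, whereas the raw product term is not.

```python
import math
from statistics import mean


def centre(values):
    """Mean-centre a list of numbers."""
    m = mean(values)
    return [v - m for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy data: two observations per linking condition.
linking = [0, 0, 1, 1]            # assumed dummy coding, GenO = 0 (reference level)
ai = [1.8, 7.0, 1.8, 7.0]         # lowest and highest AI reported in the pretest

c_linking = centre(linking)       # balanced dummy becomes -0.5 / +0.5
c_ai = centre(ai)

raw_interaction = [l * a for l, a in zip(linking, ai)]
centred_interaction = [l * a for l, a in zip(c_linking, c_ai)]

print("c.Linking:", c_linking)
print("r(Linking, raw product):      ", round(pearson(linking, raw_interaction), 3))
print("r(c.Linking, centred product):", round(pearson(c_linking, centred_interaction), 3))
```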

Maximal random effects structures were used for subjects and items, taking a largely design-guided approach as recommended by Barr, Levy, Scheepers, and Tily (2013), resulting in by-subject random intercepts and random slopes for the predictors ‘c.AI’ and ‘c.Linking’ as well as their interaction. ‘Story’ was used as the grouping factor for the ‘Items’ random effect, including by-story random intercepts and random slopes for the ‘c.Linking’ predictor, which varied within ‘Story’. Correlation parameters between random effects were included by default. The final random effects specification for subjects and items in lmer notation was thus (1 + c.Linking * c.AI | Subject) + (1 + c.Linking | Story). Random intercept or slope effects for subjects and items were only omitted in the face of convergence problems or indications of overparameterisation of the model (e.g. total correlation between random effects).

Planned post-hoc assessments comprised testing for simple slope effects of ‘AI’ for each level of ‘Linking’. To this end, separate GLMMs were fitted for the ‘GenO’ and ‘GenS’ subsets, taking the model specification of the final model for the complete data set as starting point and reducing the fixed and random effects parts as appropriate (i.e. removing any fixed and random effects involving ‘Linking’).

In addition to the information available by default from the lmer output, a number of complementary statistics were computed: univariate 95% confidence intervals for the fixed effects estimates were obtained using the functions confint and glht of package multcomp (Hothorn, Bretz, & Westfall, 2008). Predictions for the continuous ‘c.AI’ predictor based upon the model’s fixed effects estimates were computed for each level of the ‘c.Linking’ predictor (i.e. corresponding to ‘GenO’ and ‘GenS’) using the ezPredict function of package ez (Lawrence, 2013) along with 95% bootstrap-based prediction intervals using 5000 bootstrap samples. Predictions were computed across the range of the ‘c.AI’ covariate at 100 equally spaced values between the minimum and maximum value of ‘c.AI’. Predictions and corresponding intervals for the difference between the ‘GenS’ and ‘GenO’ predictions across the range of ‘c.AI’ were also computed using bootstrap samples.

¹⁰ Note that the significance of fixed effects was additionally confirmed via likelihood ratio tests using the anova […]

Model diagnostics included a check of the model fit using the plotlogistic.fit.fnc function of package languageR (Baayen, 2013), which plots observed proportions against mean predicted probabilities. The influence of outliers was assessed using several functions from package influence.ME (Nieuwenhuis, Grotenhuis, & Pelzer, 2012). Function influence made it possible to assess the influence of a given story on the final model’s estimates by refitting the model while leaving out one story at each iteration. Subsequently, functions cooks.distance and dfbetas were used to obtain indices of the influence of each story on the model estimates, and models were refitted using reduced data sets leaving out stories which were above the cut-off thresholds for Cook’s distance and DFBETAS values; the critical thresholds for either value were calculated as suggested by Nieuwenhuis et al. (2012). When the estimates of these reduced models did not significantly differ from those of the full models, the latter are presented without further discussion of possible effects of outlying items.
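The screening step can be sketched in Python as follows. The cut-offs used are the rules of thumb given by Nieuwenhuis et al. (2012), namely 4/n for Cook’s distance and 2/√n for DFBETAS, with n the number of groups; that n corresponds to the 170 stories here is an assumption, as are all influence values in the example. Treating a story as exceeding the DFBETAS cut-off when any one of its fixed-effect values does so is likewise an illustrative choice.

```python
import math


def cooks_cutoff(n_groups):
    """Rule-of-thumb cut-off for Cook's distance: 4 / n."""
    return 4 / n_groups


def dfbetas_cutoff(n_groups):
    """Rule-of-thumb cut-off for DFBETAS: 2 / sqrt(n)."""
    return 2 / math.sqrt(n_groups)


def flag_influential(cooks_d, dfbetas, n_groups):
    """Return the stories that exceed BOTH cut-offs.

    cooks_d: dict mapping story -> Cook's distance
    dfbetas: dict mapping story -> list of DFBETAS values (one per fixed effect)
    """
    cook_cut = cooks_cutoff(n_groups)
    dfb_cut = dfbetas_cutoff(n_groups)
    return [
        story
        for story in cooks_d
        if cooks_d[story] > cook_cut
        and any(abs(v) > dfb_cut for v in dfbetas[story])
    ]


N_STORIES = 170  # assumption: the grouping factor 'Story' has 170 levels

# Hypothetical influence values for three stories (not taken from the study).
cooks_d = {"story_A": 0.010, "story_B": 0.050, "story_C": 0.030}
dfbetas = {
    "story_A": [0.05, -0.02],
    "story_B": [0.30, -0.10],
    "story_C": [0.10, 0.05],
}

print("Cook's distance cut-off:", round(cooks_cutoff(N_STORIES), 4))
print("DFBETAS cut-off:", round(dfbetas_cutoff(N_STORIES), 4))
print("flagged:", flag_influential(cooks_d, dfbetas, N_STORIES))
```

Stories flagged in this way would then be dropped and the model refitted on the reduced data set for comparison with the full model.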

2.7.3.6 Results

To illustrate the effect of the ‘Linking’ variable on the acceptability judgements, the proportion of ‘Yes’ responses for each participant and level of ‘Linking’ was computed. The boxplots in figure 2.2a show that GenOs were accepted much more often as coherent continuations than GenSs (GenO: mean = 0.93, SD = 0.06; GenS: mean = 0.32, SD = 0.17). Figure 2.2b shows the proportion of ‘Yes’ responses per nominalisation for each ‘Linking’ condition over the range of ‘AI’ as a scatterplot; separate linear smooths for ‘AI’ are added for the ‘GenO’ and ‘GenS’ conditions. The smooth for the ‘GenO’ condition suggests high acceptability rates throughout the range of ‘AI’, though acceptability seems to have increased slightly with higher AI values. Conversely, in the ‘GenS’ condition, acceptability was highest for nominalisations with a low AI and decreased rapidly as AI increased, with the absolute minimum close to zero at the maximum AI values.

The final model specification of the model for the complete data set in lmer notation is given in (99) and table 2.1 summarises the random and fixed effects of the model. The upper subtable lists the variances and standard deviations of the included random effects as well as their correlations and the lower subtable contains the following information for each fixed effect: the coefficient estimate (in log-odds) and its associated standard error in the second and third column, 95% CI (confidence intervals) lower and upper bounds in the fourth and fifth columns, followed by the summary of Wald’s z-test in the next two columns.

(99) Yes_Response ~ c.Linking * c.AI +
         (1 + c.Linking * c.AI | Subject) + (1 + c.Linking | Story)

The model summarised in table 2.1 shows significant effects of ‘c.Linking’ (B = −4.87, SE = 0.37, p < 0.001), ‘c.AI’ (B = −0.20, SE = 0.08, p = 0.02) as well as their interaction ‘c.Linking:c.AI’ (B = −1.61, SE = 0.21, p < 0.001).
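To see what the interaction implies for each condition, the simple slopes of ‘AI’ can be worked out from the reported estimates. The following Python sketch assumes that the centred ‘Linking’ predictor takes the values −0.5 (‘GenO’) and +0.5 (‘GenS’), which holds only approximately if the conditions are not perfectly balanced; the coefficient values are those reported for the full model.

```python
# Fixed-effect estimates reported for the full model (log-odds scale).
B_AI = -0.20            # main effect of c.AI
B_INTERACTION = -1.61   # c.Linking:c.AI interaction

# ASSUMPTION: values of the centred Linking predictor in a balanced design.
C_LINKING = {"GenO": -0.5, "GenS": 0.5}


def simple_slope(condition):
    """Slope of c.AI within one Linking condition, in log-odds per AI unit."""
    return B_AI + B_INTERACTION * C_LINKING[condition]


for condition in ("GenO", "GenS"):
    print(condition, round(simple_slope(condition), 3))
```

Under this assumption the implied slope is positive (about 0.6) for GenO and negative (about −1.0) for GenS, which is close to the estimates obtained from the separate subset models (0.57 and −1).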

Figure 2.2: Proportion of ‘Yes’ responses for each ‘Linking’ condition. a: Boxplot of acceptance proportions for each subject and ‘Linking’ level. Note that the horizontal position of the points varies randomly within a condition to avoid overplotting and enhance visibility. b: Acceptance proportions for each nominalisation at the ‘GenS’ and ‘GenO’ levels of ‘Linking’ plus linear smooths for ‘AI’.

Figure 2.3 summarises the predictions derived from model (99) together with their 95% bootstrap CIs: figure 2.3a shows the predicted slopes of ‘AI’ for the ‘GenO’ and ‘GenS’ conditions in log-odds space and figure 2.3c shows the same transformed to probability space. Both illustrate that the likelihood of accepting a continuation sentence as a coherent continuation was equally high for GenO and GenS occurring with nominalisations associated with a minimal or low AI (estimate at the minimal AI value for GenO: log-odds = 1.85, probability = 0.86; for GenS: log-odds = 1.83, probability = 0.86). As already suggested by figure 2.2b above, acceptability decreases rapidly with increasing AI in the GenS condition to a minimal likelihood at the upper end of the AI scale (estimate at the maximal AI value for GenS: log-odds = −3.37, probability = 0.03), whereas it increases to a maximal likelihood for GenOs (estimate at the maximal AI value for GenO: log-odds = 5, probability = 0.99). Figures 2.3b and 2.3d show the estimates for the corresponding difference between GenS and GenO (i.e. GenS – GenO) across the AI range together with the bootstrap CIs in log-odds and probability space.
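The relation between the log-odds and probability estimates quoted above can be checked with a short Python sketch that evaluates the fixed-effects part of the model and applies the inverse logit. Two assumptions are made: ‘c.Linking’ is taken to be −0.5 for ‘GenO’ and +0.5 for ‘GenS’, and ‘c.AI’ is obtained by subtracting the pretest mean AI of 4.82. Because the inputs are rounded published estimates, small deviations from the reported predictions are to be expected.

```python
import math

# Fixed-effect estimates of the full model (log-odds scale).
INTERCEPT = 1.24
B_LINKING = -4.87
B_AI = -0.20
B_INTERACTION = -1.61

# ASSUMPTIONS: centring constant for AI and centred values of Linking.
AI_MEAN = 4.82
C_LINKING = {"GenO": -0.5, "GenS": 0.5}

AI_MIN, AI_MAX = 1.8, 7.0   # lowest and highest AI from the pretest


def predicted_log_odds(ai, condition):
    """Fixed-effects prediction on the log-odds scale."""
    c_ai = ai - AI_MEAN
    c_link = C_LINKING[condition]
    return (INTERCEPT + B_LINKING * c_link + B_AI * c_ai
            + B_INTERACTION * c_link * c_ai)


def inv_logit(x):
    """Transform log-odds into a probability."""
    return 1 / (1 + math.exp(-x))


for condition in ("GenO", "GenS"):
    for ai in (AI_MIN, AI_MAX):
        eta = predicted_log_odds(ai, condition)
        print(f"{condition} at AI = {ai}: log-odds = {eta:.2f}, "
              f"probability = {inv_logit(eta):.2f}")
```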

Tables 2.2 and 2.3 summarise the post-hoc models fitted to test for simple slope effects of ‘c.AI’ in each ‘Linking’ condition, with the model specifications given in (100).¹¹

(100) a. GenO-model:
         Yes_Response ~ c.AI + (1 | Subject) + (1 | Story)
      b. GenS-model:
         Yes_Response ~ c.AI + (1 + c.AI | Subject) + (1 | Story)

Random effects

Groups   Name            Variance  SD    Corr.
Story    (Intercept)     0.37      0.61
         c.Linking       4.26      2.06  -0.19
Subject  (Intercept)     0.46      0.68
         c.Linking       2.24      1.50   0.42
         c.AI            0.01      0.10  -0.63  -0.10
         c.Linking:c.AI  0.20      0.44  -0.82  -0.26  0.96

Fixed effects

                Estimate  SE    Lower  Upper  z-value  Pr(>|z|)
(Intercept)      1.24     0.17   0.91   1.57    7.40   <0.001 ***
c.Linking       -4.87     0.37  -5.60  -4.14  -13.03   <0.001 ***
c.AI            -0.20     0.08  -0.36  -0.03   -2.33    0.02  *
c.Linking:c.AI  -1.61     0.21  -2.02  -1.19   -7.64   <0.001 ***

Table 2.1: Summary tables for the GLMM for the acceptability judgement data.

¹¹ The by-subject random slope for ‘c.AI’ was removed from the GenO model due to perfect correlation with the […]

The model fitted to the GenO subset (table 2.2) shows a significant effect of ‘c.AI’ (B = 0.57, SE = 0.14, p < 0.001) and table 2.3 reveals a significant effect of ‘c.AI’ in the model fitted to the GenS subset (B = −1, SE = 0.11, p < 0.001). The positive slope coefficient of 0.57 estimated for the GenO model indicates that a unit increase on the AI scale was

In document Predicate-induced semantic prominence in online argument linking: experiments on affectedness and analytical tools (Page 101-118)