A Recursive Annotation Scheme for Referential Information Status
Arndt Riester
1, David Lorenz
2, Nina Seemann
11Institute for Natural Language Processing, University of Stuttgart, Azenbergstr. 12, 70174 Stuttgart, Germany
2English Department, University of Freiburg, Rempartstr. 15, 79085 Freiburg, Germany
arndt.riester@ims.uni-stuttgart.de, david.lorenz@frequenz.uni-freiburg.de, nina.seemann@googlemail.com
Abstract
We provide a robust and detailed annotation scheme for information status, which is easy to use, follows a semantic rather than cogni-tive motivation, and achieves reasonable inter-annotator scores. Our annotation scheme is based on two main assumptions: firstly, that information status strongly depends on (in)definiteness, and secondly, that it ought to be understood as a property of referents rather than words. Therefore, our scheme banks on overt (in)definiteness marking and provides different categories for each class. Definites are grouped according to the information source by which the referent is identified. A special aspect of the scheme is that non-anaphoric expressions (e.g. names) are classified as to whether their referents are likely to be known or unknown to an expected audience. The annotation scheme provides a solution for annotating complex nominal expressions which may recursively contain embedded
expres-sions. In annotating a corpus of German radio news bulletins, a kappa score of.66for the full scheme was achieved, a core scheme of
six top-level categories yieldsκ=.78.
1.
Introduction
We provide a robust and detailed annotation scheme for
information status, which covers all – also embedded – DPs/PPs in a text (excluding idioms) and achieves rea-sonable inter-annotator scores. Other annotation schemes, e.g. Nissim et al. (2004) or Dipper et al. (2007) take information status to be an independent dimension from (in)definiteness. They adopt three-way distinctions, which – in varying shares – mix the partitions postulated by Prince (1981) (evoked / inferrable / new), Prince (1992) (discourse-old / discourse new but hearer-old (unused) / discourse-new) and Chafe (1994) (given / accessible / new). By contrast, our scheme is built on the convic-tion that informaconvic-tion status is strongly dependent on (in)definiteness. At least for languages, in which morpho-logical (in)definiteness marking is available, like English or German, annotations become more meaningful (and easier to accomplish) if such marking is not neglected.
Instead of assuming a cognitive hierarchy, we claim that definite expressions are grouped according toinformation sources in which they are identified: the discourse text (what has explicitly been uttered), the utterance con-text (entities which are perceptible or available by default to the discourse participants, cf. Kaplan (1989)), the en-cyclopaedic context (entities familiar to the recipient) as well asframes, which act as small contexts at the interface between lexical and world knowledge. Our classification subsumes annotations for coreference and bridging which have already been treated, for instance, in Poesio and Vieira (1998), Poesio (2004) or Nedoluzhko et al. (2009). We also propose a classification for indefinites.
Another important point is that the notion of information status (in particular, givenness) can be understood in two ways: as a property ofwords (suggesting a definition of givenness as recurrence or hyperonymy with regard to a preceding word), or as a property of thereferentsof deter-mined phrases (on this view, givenness means coreference). While the two notions sometimes overlap, there are cases in which they contradict each other: identical words in a text
may (and in fact often do) refer to different individuals, whereas lexically unrelated phrases may refer to one and the same entity. In the present scheme we have opted for the
referential interpretationof information status,1 which
in-evitably means that the markables we are targeting must be full (and possibly complex) prepositional phrases or deter-miner phrases (Montague’s (1974) “terms”, often referred to as “noun phrases” but here the DP terminology is crucial since only determined phrases are referential). Complex phrases, in turn, may recursively contain other DPs or PPs, which have their own information status. For instance, the phrase [DP the red gem [P P in [DP the Queen’s] crown]]
is built from linguistic material referring to three different individuals/entities (the Queen, the crown, the gem) which potentially possess different information status and need to be annotated recursively. In the following, we will provide an overview on the taxonomy using examples adapted from various issues ofTime, The Guardian,andNature.
2.
The scheme
Availability in the discourse context:GIVENA referential expression isGIVEN2if and only if it is def-inite and has a coreferential antecedent in the discourse context. Our notions ofcoreferenceanddiscourse context
are those adopted, for instance, in Discourse Representa-tion Theory (Kamp and Reyle, 1993). GIVENentities are further classified as shown in Table 1.
Several coreferential items taken together form a givenness chain. SubsequentGIVENitems must be classified with re-gard to theentire chain, not only with regard to its
immedi-1
But see Baumann and Riester (2010) for a scheme which ad-dresses both levels.
2
In earlier publications related to the scheme, e.g. Riester
(2008b), this category is calledD-GIVEN(discourse-given) to
GIVEN Example sentences
-PRONOUN “Russia is ruled by the same people
who own it .”
-REFLEXIVE Cancer hides itself within normal
structures like blood vessels.
-SHORT Both had the blessings of Dr. Richard
Klausner. But even Klausner had to be persuaded at first.
-REPEATED It is hard to think of any industrial
society that in its essentials is less like the U.S. than Japan. [. . . ] Japan is a nation with very deep cultural roots and habits.
-EPITHET Before the European Union’s ban on
incandescent lightbulbs went into effect on Sept. 1, consumers across Europe raided stores to stockpile the . . .
familiar bulbs .
Table 1: Subcategories ofGIVEN
ate predecessor. Occasionally, the labels conflict with each other. For instance, the last item in the sequencePresident Obama . . . Obama . . . Obama would count asSHORTwith regard to the first and asREPEATEDwith regard to the sec-ond item. In such cases, as a rule, the label REPEATED wins.
In languages like German, differences of case are ignored. In the sequence ein Elefant (indefinite, nominative) . . .
dem Elefanten (definite, dative), the latter counts asRE
-PEATED.
We annotate both grammatical “binding” and pronomi-nal, e.g. cross-sentential, anaphora. The subclassEPITHET (Clark, 1977) – compare Figure 2 for an example annota-tion in SALTO/SALSA (Burchardt et al., 2006) – deserves special comment: it combines the givenness of its referent with the discourse novelty of its descriptive content. As such, it may represent a case of contentaccommodation. There is no conflict involved here, since we have defined givenness strictly at the referential level; see Riester (2009) for a discussion.
Availability in the utterance context: SITUATIVE Expressions that refer to entities in the utterance context (indexicals, deictic expressions, cf. Fillmore (1975), Ka-plan (1989), Levinson (2006)) receive the labelSITUATIVE. Such entities are either generally available in communica-tion (discourse participants; times and places relative to the communicative situation) or identifiable via sensory per-ception. They are called instances ofsymbolicandgestural
deixis, respectively; compare Table 2. The latter are typ-ically limited to face-to-face dialogue and need to be ac-companied by a pointing gesture.
In this scheme, we have opted to subsume both types un-der the labelSITUATIVE. Note, however, that under some frameworks, e.g. DRT, entities referred to by means of sym-bolic deixis are treated as forming part of thediscourse
con-text (and could thus be labeledGIVEN).3
SITUATIVE– Example sentences
The wildfires that now annually singe Southern California came early this year .
How much did you pay for (pointing)the drawing ?
Table 2: SITUATIVE
Typical participants in a scenario: BRIDGING
The class ofBRIDGINGentities is defined as those definites which neither possess a corefential antecedent nor can be understood in the absence of a preceding expression which defines a scenario or frame in which they play a typical role. We do not attempt a subclassification according to this role, e.g. part-whole, participant in a situation etc., as proposed in Nissim et al. (2004). Such roles have been treated in
frame semantics(Fillmore, 1985) and, as we think, are not relevant to information status. Note, further, that the no-tion of bridging has originally been defined in a broader sense than we are using it here (Clark, 1977; Asher and Lascarides, 1998); nevertheless our use explicitly focuses on the prevalent cases in the contemporary literature. We use the simple labelBRIDGINGif and only if a salient bridging antecedent can be identified. In our annotations, we draw an anchoring link from theBRIDGINGanaphor to that antecedent. Sometimes a frame is evoked by a stretch of text rather than by a single word. In these cases, we use the label BRIDGING-TEXT, without any specific an-chor. A special case are complex DPs in which the bridging “antecedent” is the genitive or prepositional adjunct con-tained in the phrase itself. We assign the labelBRIDGING
-CONTAINEDto these DPs,4compare Table 3.
BRIDGING Example sentences
-0 Their two-day visit is the second step in the beginning of a dialogue with Burma. [. . . ] For years, the US had isolated the junta with political and economic sanctions.
-TEXT United were trailing 3-1 when Fletcher was felled in the area by Aleksei Berezutski. The Scotland midfielder was then yellow-carded by the . . .
referee .
-CONTAINED The Republicans won the . . .
governorship of Virginia .
Table 3: Subtypes ofBRIDGING
(Non-)Availability in the recipients’ encyclopaedic context: ACCESSIBLE/UNUSED
According to Prince (1981), textual annotations of the sort described here should reflect a speaker or writer’s
assump-3Under such a view instances of gestural deixis find their
ref-erents in the so-calledenvironment context(Kamp, to appear).
4
tions concerning the hearer’s knowledge state (“assumed familiarity”), which is a simplification with regard to Clark and Haviland’s (1977) notionshared knowledge. However, we think that annotating the writer’s assumptions is an un-reachable goal, if it is to mean more than distinguishing cer-tain morphosyntactic patterns – particularly when it comes to classifying non-anaphoric, discourse-new entities. Therefore, in order to make the annotations tractable, we introduce a further simplification and propose to label whether theannotatorhimself expects a certain expression to be known or unknown by theintended audienceof a text, cf. Riester (2008a; Riester (2008b). Moreover, this does justice to the fact that, for instance, news texts normally do not only have one addressee but are designed for a large number of recipients, which of course never possess exactly the same knowledge.
Entities which are expected to be widely known are labeled
ACCESSIBLE-GENERAL-TOKEN,5 those which refer to a
(generic) class or type are labeledACCESSIBLE-GENERAL -TYPE. Entities which are expected to be unknown to the audience are labeledACCESSIBLE-DESCRIPTION, com-pare Table 4. From a semantic point of view, only ACCESSIBLE-GENERALentities form part of the recipients’ encyclopaedic context (Kamp, to appear), whereas entities labeledACCESSIBLE-DESCRIPTIONare unknown and have to be accommodated. This important distinction is not suf-ficiently accounted for in any of the previous annotation schemes for information status.
ACCESSIBLE Example sentences
-GENERAL-TYPE Dispersal in the Eurasian . . .
(UNUSED-TYPE) badger is believed to be very limited.
-GENERAL-TOKEN Secretary of state Hillary . . .
(UNUSED-KNOWN) Clinton jettisoned the respect earned on a tough diplomatic mission to Pakistan by claiming that [. . . ]
-DESCRIPTION The hoard was found in January
(UNUSED-UNKNOWN) near Harrogate by David . . . and Andrew Whelan, father . . .
and son hobby metal . . .
detectorists, in a bare field due to be ploughed for spring sowing.
Table 4: Subtypes ofACCESSIBLE(UNUSED)
Note that the use ofACCESSIBLEon these occasions differs from the one in Chafe (1994), who uses it to describe in-formation which is derived from previously mentioned ma-terial. In Table 4 we therefore also include an alternative notion (UNUSED).6 The distinction betweenACCESSIBLE
-GENERAL-TOKEN / UNUSED-KNOWN and ACCESSIBLE
-DESCRIPTION / UNUSED-UNKNOWN entities also
corre-5
ACCESSIBLE-GENERALis used in Dipper et al. (2007).
6
Compare Prince (1981) and Baumann and Riester (2010).
sponds to the difference betweenfamiliarandnon-familiar but uniquely identifiablediscussed by Gundel et al. (1993).
ACCESSIBLE-DESCRIPTIONvs.
BRIDGING-CONTAINED
An important rule concerns the demarcation between the categories ACCESSIBLE-DESCRIPTION (when as-signed to complex definite descriptions) and BRIDGING
-CONTAINED, which both apply tocontext-freeexpressions:
BRIDGING-CONTAINEDitems may occur in autonomy
be-cause they possess a subconstituent representing the an-chorfor the entire expression.ACCESSIBLE-DESCRIPTION items generally consist of sufficient linguistic material to enable the addressee toaccommodate(Lewis, 1979) an ad-equate discourse entity. The two types are distinguished by the following rule: an item labeled as BRIDGING
-CONTAINED refers to an object or person that stands in
aprototypicalrelation to its genitive or PP-argument. By contrast, the relation holding between the referent of an ex-pression labeled ACCESSIBLE-DESCRIPTIONand its argu-ment phrase is non-prototypical; compare Table 5.
BRIDGING The gunman was on the roof of . . .
-CONTAINED the checkpoint when he opened
fire.
ACCESSIBLE [he was] convicted of helping to
-DESCRIPTION organise the seizure of Osama . . .
Moustafa Hassan Nasr from a Milan street in February 2003.
Table 5: BRIDGING-CONTAINED vs. ACCESSIBLE -DESCRIPTION
A further test is that it is always possible to separate the parts of BRIDGING-CONTAINED phrases (the checkpoint . . . the roof), while it is considerably marked (till out-right impossible) to split expressions labeledACCESSIBLE
-DESCRIPTION (Osama Moustafa Hassan Nasr . . . ??the
seizure).
Indefinites: INDEF
Indefinites are grouped intospecificones (calledNEW be-cause they introduce a novel discourse referent that may be taken up later) andGENERIC ones (those referring to a class or a non-specific concept under consideration). There are, furthermore, indefinites standing in a characteristic re-lation to the context: they may pick up a concept mentioned before (literally or paraphrastically:RESUMPTIVE), or they may represent a subset of an aforementioned group:PARTI -TIVE. Similar to theBRIDGINGcases, that group may occa-sionally occur embedded in a complex phrase: PARTITIVE
-CONTAINED. See Table 6.
Other categories
The remaining categories are:CATAPHOR(cataphoric defi-nite or indefidefi-nite expressions which are not reflexives);EX
-PLETIVE(pronouns lacking a thematic role); NULL(
noth-ing, nobody etc.); andRELATIVE(a label fornon-restrictive
INDEF Example sentences
-NEW In a hilltop suburb south of . . .
Jerusalem called Efrat , Sharon Katz serves a neat plate of sliced cake .
-GENERIC Serious beer drinkers should head
straight to this 550-year-old institution.
-RESUMPTIVE That’s close to how a cancer vaccine
works, but not precisely. Most experts see cancer vaccines as a hybrid of treatment and prevention.
-PARTITIVE At violent clashes between the police
and demonstrating Kurds, 52 . . .
police officers and three . . .
demonstrators were injured.
-PARTITIVE In one of the eggs containing four . . .
-CONTAINED haploid nuclei , the nuclei were
dividing.
Table 6: Subtypes ofINDEF
included under the information-status label assigned to this entire DP.)
3.
Evaluation
The present evaluation has been conducted for German. We have annotated a corpus of transcripts from German radio news bulletins. The annotations were done by two students of linguistics, well-trained in the annotation task, using the SALTO/SALSA tool (Burchardt et al., 2006). For this evaluation, we chose a section comprising at least 1149 DPs/PPs. Since the coders themselves had to identify the markables while annotating, a few markables were acci-dentally overlooked by either one of the coders. We are ignoring these. The number of markables annotated by both coders is 1100. Of those, 757 are labeled identically. Table 7 shows the distribution of annotations we obtain.
According to the metric of Cohen (1960) and Artstein and Poesio (2008), we achieve a κscore of .66 using the en-tire scheme. Computing the inter-annotator agreement for a core variant of our scheme using six top-level categories
GIVEN, SITUATIVE,BRIDGING,ACCESSIBLE,INDEFand
OTHERyieldsκ = .78. This is lower than the scores
re-ported in Nissim et al. (2004) (.788 (ext.) / .845 (core)) for their annotations on dialogue data (Switchboard). How-ever, their system distinguishes fewer categories (16) and does not classify all DPs/PPs. Nissim et al. (2004) exclude locatives and directionals and moreover use a category not-understood to exclude some controversial cases. Ritz et al. (2008) report κ = .55 performance for the scheme by Dipper et al. (2007) on newspaper commentaries (the most closely related to our radio news transcripts), going up to .73 for question/answer pairs. Hempelmann and oth-ers (2005) report .72 for Prince’s (1981) classification using seven categories on narratives from 4th grade textbooks.
4.
Conclusion
We think that the way our system proposes annotate in-formation status recursively is convincing (cf. Figure 1), because information status is defined at the level of deter-mined phrases. In other words, it is thediscourse referents
involved, not the expressions used, which are classified ac-cording to information status.
the door to negotiations with Teheran
Figure 1: Nested Annotation
The scheme straightforwardly connects to semantic theory, since the categories we use are definable in DRT, com-pare e.g. Riester (2009). We integrate, in a transcom-parent way, insights from the semantic-philosophical tradition on (in)definiteness, anaphora and accommodation. In related work, we could show that the classes we assume tend to be asymmetrically ordered when coocurring in the same clause (Cahill and Riester, 2009). Furthermore, some of them are statistically correlated with distinct pitch accent distributions (Schweitzer et al., 2009).
5.
Acknowledgements
said Kirchner in C´ordoba . . . emphasised the Argentinian head of state
Figure 2: Anaphoric link:GIVEN-EPITHET
ACC-DES ACC-GEN-T O
ACC-GEN-TY BRI BRI-CON BRI-TXT GIV -EPI
GIV -PR
O
GIV -REF
GIV -REP
GIV -SHO
IND-GEN IND-NEW IND-P AR
IND-P AR-CO
IND-RES CA T
EXP NUL REL SIT Σ
ACC-DES 122 25 7 3 20 2 1 180
ACC-GEN-TO 7 125 2 3 134
ACC-GEN-TY 1 3 32 1 5 1 45
BRI 35 5 8 21 5 8 6 89
BRI-CON 22 5 1 51 1 1 81
BRI-TXT 1 4 5
GIV-EPI 3 4 2 4 3 5 43 64
GIV-PRO 65 1 66
GIV-REF 14 14
GIV-REP 1 6 3 1 28 1 40
GIV-SHO 1 2 6 2 23 34
IND-GEN 6 38 34 2 80
IND-NEW 5 1 4 7 20 98 1 7 1 144
IND-PAR 3 4 12 9 28
IND-PAR-CO 1 1 6 8
IND-RES 1 3 1 5
CAT 1 12 3 16
EXP 11 11
NUL 4 4
REL 1 5 6
SIT 1 45 46
Σ 197 175 67 29 90 25 57 65 14 30 23 67 151 12 13 1 12 15 4 5 28 1100