

4.1.1 TLINK Based Annotations

A large fraction of corpora that provide temporal information for events use temporal links (TLINKs) to anchor events in time. The concept of TLINKs was introduced by Setzer (2001) with the objective of determining the temporal order of events and, where possible, the calendar date of each event.

A TLINK is defined as the temporal relation between two events, two temporal expressions, or an event and a temporal expression. As many existing corpora are based on news reports, they often contain a special TLINK that states the relation between the event and the document creation time (DCT). The number of relation classes varies between annotation schemes. Setzer proposed five classes: INCLUDED and INCLUDE to mark that an event is temporally included within another event or temporal expression, BEFORE and AFTER to mark that an event happens before or after another event or temporal expression, and SIMULTANEOUS, which she proposed as a fuzzy relation to mark pairs that happen at roughly the same time. The TimeML specification (Saurí et al., 2004) extended those five classes to 14 classes. The newly added classes were IMMEDIATELY AFTER, IMMEDIATELY BEFORE, IDENTITY, BEGINS, ENDS, BEGUN BY, ENDED BY, DURING and DURING INVERSE. A well-known corpus using this TimeML specification is the TimeBank Corpus (Pustejovsky et al., 2003). As of writing, the latest version of the TimeBank Corpus was 1.2, with 183 news articles annotated according to the TimeML specification.
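The relation inventory described above can be sketched as a small data structure. This is only an illustration: the identifiers mirror the class names used in the text rather than the exact attribute values of the TimeML markup, and the `TLink` record with its `source`/`target` id fields is a hypothetical representation, not part of any corpus format.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # The five classes proposed by Setzer (names as given in the text).
    INCLUDED = "INCLUDED"
    INCLUDE = "INCLUDE"
    BEFORE = "BEFORE"
    AFTER = "AFTER"
    SIMULTANEOUS = "SIMULTANEOUS"
    # The nine classes added by the TimeML specification.
    IMMEDIATELY_AFTER = "IMMEDIATELY AFTER"
    IMMEDIATELY_BEFORE = "IMMEDIATELY BEFORE"
    IDENTITY = "IDENTITY"
    BEGINS = "BEGINS"
    ENDS = "ENDS"
    BEGUN_BY = "BEGUN BY"
    ENDED_BY = "ENDED BY"
    DURING = "DURING"
    DURING_INVERSE = "DURING INVERSE"


@dataclass(frozen=True)
class TLink:
    """A temporal link between two annotated elements (hypothetical record)."""
    source: str          # id of an event or temporal expression, e.g. "e1"
    target: str          # id of an event, a temporal expression, or the DCT
    relation: Relation


# Example: event e1 happens before the document creation time.
link = TLink(source="e1", target="DCT", relation=Relation.BEFORE)
print(link.relation.value, len(Relation))
```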

The 14 temporal relation classes defined by TimeML are quite fine-grained, and it can be challenging for annotators to agree on the right class (Mani et al., 2006). For example, the difference between BEFORE and IMMEDIATELY BEFORE can be difficult to grasp, resulting in disagreement between annotators.

Subsequent corpora often reduced the number of possible relation types. For the shared task TempEval-1 (Verhagen et al., 2007), the organizers used the event and time annotations verbatim from TimeBank. TLINKs were newly added for this task by seven annotators, with a focus on only six relation classes. The same six relation classes were also used for the shared task TempEval-2 (Verhagen et al., 2010). For the most recent shared task on TimeBank, TempEval-3 (UzZaman et al., 2013), the organizers used 13 relation types, omitting the DURING INVERSE relation from TimeML.

A challenge for the annotation of TLINKs is that the number of possible links grows quadratically. The following sentence, which contains two events and two temporal expressions, has six possible TLINKs:


Mary [left]Event on [Thursday]Time and John [arrived]Event [the day after]Time

If n is the number of events and temporal expressions in a document, the number of possible TLINKs is n(n−1)/2. A mid-sized news article can contain more than 200 events and temporal expressions, which means that in theory up to 19,900 TLINKs would be possible. As it is infeasible to annotate all possible relations, different strategies have been used to restrict the number of relations to annotate.
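The counting argument can be checked with a few lines of Python. The sketch below enumerates all unordered pairs for the four annotated elements of the example sentence and evaluates n(n−1)/2 for a document with 200 elements; the helper name `max_tlinks` is chosen here for illustration only.

```python
from itertools import combinations


def max_tlinks(n: int) -> int:
    """Number of unordered pairs among n events and temporal expressions."""
    return n * (n - 1) // 2


# The two events and two temporal expressions of the example sentence.
mentions = ["left", "Thursday", "arrived", "the day after"]
pairs = list(combinations(mentions, 2))

for a, b in pairs:
    print(f"{a} <-> {b}")

print(len(pairs))       # 6
print(max_tlinks(200))  # 19900
```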

A large fraction of corpora on events use sparse annotations. The TimeBank Corpus (Pustejovsky et al., 2003) only has annotations for salient temporal relations. The subsequent TempEval competitions tried to improve the coverage and added some further temporal links for pairs in the same sentence. However, which relations are salient is not clearly defined and was left as a subjective decision to the annotators. This introduces a major ambiguity for each unlabeled pair, as any of three cases may apply:

1. The annotator overlooked the pair; hence, a salient relation may or may not exist.

2. The annotator looked at the pair but decided that no salient temporal relation exists.

3. The annotator looked at the pair but could not decide on the correct relation class.

Denser annotations were applied by Bramsen et al. (2006), Kolomiyets et al. (2012), Do et al. (2012) and Cassidy et al. (2014). Bramsen et al. created a directed acyclic graph that encodes the temporal relations found in a text by annotating multi-sentence segments of text. Kolomiyets et al. focused on dependency trees of temporal relations where all the events of a narrative are linked via partial ordering relations. Do et al. performed an annotation where “the annotator was not required to annotate all pairs of event mentions, but as many as possible”.

Figure 4.2: Annotation with sparse TLINKs (left) and the same paragraph annotated with a dense annotation (image from Cassidy et al. (2014)).

The densest annotation was performed by Cassidy et al. and was published as the TimeBank-Dense Corpus. There, all Event-Event, Event-Time, and Time-Time pairs in the same sentence as well as in directly succeeding sentences were annotated. They adopted the VAGUE relation from TempEval-1 for cases where no particular relation can be established. The difference between a sparse annotation, as used for TimeBank, and a dense annotation, as used for TimeBank-Dense, is illustrated in Figure 4.2. An overview of the different corpora and the number of annotated relations is provided in Table 4.1.

Corpus                      Events   Times   TLINKs
TimeBank                      7935    1414     6418
TempEval-1                    6832    1249     5790
TempEval-2                    5688    2117     4907
TempEval-3                   11145    2078    11098
Bramsen et al. (2006)          627       -      615
Kolomiyets et al. (2012)      1233       -     1139
Do et al. (2012)               324     232     3132
Cassidy et al. (2014)         1729     289    12715

Table 4.1: Statistics for corpora that use TLINKs.
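How dense each annotation is can be estimated from Table 4.1 by dividing the number of TLINKs by the number of annotated events and temporal expressions. The sketch below does this for every row; it assumes that a missing Times entry ("-") can be treated as zero for the purpose of this ratio, which is a simplification made here and not a statement from the original sources.

```python
# Rows of Table 4.1: corpus -> (events, times, tlinks); None marks a "-" entry.
corpora = {
    "TimeBank": (7935, 1414, 6418),
    "TempEval-1": (6832, 1249, 5790),
    "TempEval-2": (5688, 2117, 4907),
    "TempEval-3": (11145, 2078, 11098),
    "Bramsen et al. (2006)": (627, None, 615),
    "Kolomiyets et al. (2012)": (1233, None, 1139),
    "Do et al. (2012)": (324, 232, 3132),
    "Cassidy et al. (2014)": (1729, 289, 12715),
}


def tlinks_per_element(events, times, tlinks):
    """TLINKs per annotated element; a missing Times count is assumed to be 0."""
    return tlinks / (events + (times or 0))


for name, row in corpora.items():
    print(f"{name:26s} {tlinks_per_element(*row):5.2f}")
```

Under this assumption, the sparse TimeBank annotation yields well below one TLINK per element, whereas the dense annotation of Cassidy et al. yields roughly 6.3.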

There are two major drawbacks of dense annotations. First, a large number of relations have to be annotated. For the TimeBank-Dense Corpus, around 6.3 TLINKs had to be annotated for each event and temporal expression. Second, the annotation is limited to links between expressions in the same or in succeeding sentences. As we will show, for 59% of the events that did not happen at the document creation time, the temporal expression that allows inferring the calendar date of the event is more than one sentence away from the event expression. Hence, a dense TLINK annotation will not include this important relation for most events. As a consequence, a large set of events cannot be anchored temporally, even though it is straightforward for readers to extract this information from the text. Increasing the window size would reduce the number of events that cannot be temporally anchored; however, it would significantly increase the annotation effort, as the number of links grows quadratically.
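The trade-off between window size and annotation effort can be illustrated with a toy document. The sketch below counts the pairs that fall within a given sentence window; the function name `pairs_in_window` and the synthetic document (10 sentences with two annotated elements each) are assumptions made for illustration and are not taken from any of the corpora.

```python
from itertools import combinations


def pairs_in_window(sentence_ids, window=1):
    """Return index pairs of elements at most `window` sentences apart.

    sentence_ids[i] is the sentence index of the i-th event or
    temporal expression; window=1 covers the same sentence and
    directly adjacent sentences.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(sentence_ids)), 2)
        if abs(sentence_ids[i] - sentence_ids[j]) <= window
    ]


# Toy document: 10 sentences, each with two annotated elements.
sentence_ids = [s for s in range(10) for _ in range(2)]

for window in (0, 1, 2, 9):
    count = len(pairs_in_window(sentence_ids, window))
    print(f"window={window}: {count} pairs")
```

With a window that spans the whole document, the count reaches the full n(n−1)/2 pairs, which is why widening the window quickly becomes expensive.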

The TLINK specifications from TimeML have also been used in more recent corpora than the TimeBank Corpus. The most recent is the MEANtime corpus (van Erp et al., 2015), which applied a sparse TLINK annotation in which only temporal links between events and temporal expressions in the same and in succeeding sentences were annotated. The MEANtime corpus distinguishes between main event mentions and subordinated event mentions, and the focus for TLINKs was on main events. The annotation guidelines define 12 different TLINK classes. Further corpora based on TimeML are the Spanish TimeBank (Saurí and Badia, 2012), the modern Spanish TimeBank (Nieto et al., 2011), and the French TimeBank (Bittar et al., 2011).