4.1.2 Time as Event Argument
The ACE 2005 corpus (Walker et al., 2005), as well as the Rich ERE annotation scheme (Song et al., 2015), defines time as a general event attribute that is tagged if it is within the scope of the corresponding event. The scope of the event is defined as the same sentence that contains the event trigger.
We watched the state funeral in Montreal today for Canada’s former prime minister Pierre Trudeau, who [died]Event [last week]Time-Argument at 80.
For the above sentence, the event died would have last week as the value for the time argument. While this annotation is simple to perform, it has the shortcoming that the scope is limited to the same sentence.
Temporal information is given for only 19.80% of the events in the ACE 2005 corpus. For all other events, the time argument is empty. A similarly low number could be observed for the TimeBank-Dense corpus, where only 23.68% of the events had the temporal information needed for temporal anchoring in the same sentence.
Extending the scope of the event might solve the issue that the temporal information for an event is not in the same sentence. However, we observe in our annotation study that for at least 32% of the Single Day Events there is no single continuous text passage that defines when the event happened. Instead, the temporal anchoring is provided by several text passages scattered throughout the document. This is especially the case for events where the document only provides a rough time frame for when the event happened, e.g. the start of the document reveals that it happened after August 1998 and a later text passage reveals that it happened before December 1998.
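The August/December 1998 case can be made concrete with a minimal sketch. It assumes, purely for illustration, that each text passage contributes a strict "after" or "before" bound on the event date and that the bounds are combined by intersection. The names `Bound` and `combine_bounds` are hypothetical, and reading "after August 1998" as "after August 31" and "before December 1998" as "before December 1" is an interpretation, not something prescribed by the annotation scheme.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Bound:
    """One temporal hint taken from a text passage (hypothetical structure)."""
    kind: str  # "after" or "before"
    day: date  # the event lies strictly after / strictly before this day


def combine_bounds(bounds: Iterable[Bound]) -> Tuple[Optional[date], Optional[date]]:
    """Intersect all hints into an inclusive (earliest, latest) window.

    A side of the window stays None if no passage constrains it.
    """
    earliest: Optional[date] = None
    latest: Optional[date] = None
    for b in bounds:
        if b.kind == "after":
            candidate = b.day + timedelta(days=1)
            if earliest is None or candidate > earliest:
                earliest = candidate
        elif b.kind == "before":
            candidate = b.day - timedelta(days=1)
            if latest is None or candidate < latest:
                latest = candidate
        else:
            raise ValueError(f"unknown bound kind: {b.kind!r}")
    return earliest, latest


# Two passages from different parts of the same document.
hints = [
    Bound("after", date(1998, 8, 31)),   # "after August 1998" (assumed reading)
    Bound("before", date(1998, 12, 1)),  # "before December 1998" (assumed reading)
]
window = combine_bounds(hints)
```

Under these assumptions, `window` spans September 1, 1998 to November 30, 1998. Neither passage alone yields this window, which is why a sentence-level scope cannot capture it.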
A corpus that uses a similar idea for the annotation of the event time is the corpus released for SemEval-2015 Task 4 on automatic timeline generation (Minard et al., 2015). The organizers provided annotated news articles on four different topics: Apple Inc., Airbus, General Motors and general stock market news. Events that involved a specific target entity were anchored in time. The format for anchoring was YYYY-MM-DD. By omitting the value DD, events could be anchored in a specific month, indicating that the event happened at some point within that month. By omitting MM-DD, the event was anchored within a year.
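To illustrate what such a partially specified anchor covers, the following sketch expands an anchor into an inclusive range of days. The function name `anchor_to_range` is hypothetical, and the accepted spellings of an omitted component (either left off, as in `2011-11`, or written as a trailing `XX` placeholder, as in `2011-XX-XX`) are assumptions made for this example rather than a description of the task's actual file format.

```python
import calendar
from datetime import date
from typing import Tuple


def anchor_to_range(anchor: str) -> Tuple[date, date]:
    """Expand a day, month, or year anchor into an inclusive date range."""
    parts = anchor.split("-")
    # Treat trailing "XX" placeholders as omitted components (assumption).
    while parts and parts[-1].upper() == "XX":
        parts.pop()
    numbers = [int(p) for p in parts]  # raises ValueError on malformed input

    if len(numbers) == 1:  # year only
        (year,) = numbers
        return date(year, 1, 1), date(year, 12, 31)
    if len(numbers) == 2:  # year and month
        year, month = numbers
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(numbers) == 3:  # full date
        day = date(*numbers)
        return day, day
    raise ValueError(f"cannot interpret anchor: {anchor!r}")
```

For example, `anchor_to_range("2011-11")` covers November 1 to November 30, 2011, while `anchor_to_range("2011-XX-XX")` covers the whole of 2011.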
This annotation scheme does not address three key challenges. First, a large set of events lasts longer than a day; in the TimeBank-Dense Corpus, around 41% of the events lasted longer than a day. The annotation scheme does not provide a notation to specify the begin and end point of such multi-day events. Second, the granularity is either day, month, or year. Specifying that an event happened between November 1st, 2011 and December 31st, 2011 is not possible in this scheme. The best anchoring for such an event would be 2011-XX-XX. Worse, if the time frame crosses a year boundary, no temporal anchoring is possible. Third, the annotation scheme cannot deal with temporal information stating that something happened before or after a certain date, e.g. that a person was born before 1980. Such temporal information is quite common in news articles: for 28% of the events in the TimeBank-Dense corpus, only the information that the event happened before or after a certain date is provided.
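The granularity problem can be sketched as follows: given the time frame in which an event is known to have happened, the function returns the finest day, month, or year anchor that still covers it, or `None` if no such anchor exists. The function name `best_single_anchor` is hypothetical, and the `XX` placeholder notation simply follows the 2011-XX-XX example above; this is an illustration under those assumptions, not a reimplementation of any existing tool.

```python
from datetime import date
from typing import Optional


def best_single_anchor(start: date, end: date) -> Optional[str]:
    """Finest day/month/year anchor covering [start, end], or None."""
    if start > end:
        raise ValueError("start must not be after end")
    if start == end:
        # The time frame is a single day: a full date anchor suffices.
        return start.isoformat()
    if (start.year, start.month) == (end.year, end.month):
        # Both ends fall in the same month: anchor at month level.
        return f"{start.year:04d}-{start.month:02d}-XX"
    if start.year == end.year:
        # Both ends fall in the same year: anchor at year level.
        return f"{start.year:04d}-XX-XX"
    # The time frame crosses a year boundary: no single anchor covers it.
    return None


two_months = best_single_anchor(date(2011, 11, 1), date(2011, 12, 31))
across_years = best_single_anchor(date(2011, 12, 20), date(2012, 1, 5))
```

Here `two_months` is `"2011-XX-XX"`, so a two-month window is widened to a full year, and `across_years` is `None`.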
4.2 Document-Wide Event Time Annotation Scheme
Our annotation scheme was designed with the goal of creating a knowledge base from the extracted events in combination with their event times. It assumes that events are already annotated. In our annotation study, we extend the TimeBank-Dense Corpus (Cassidy et al., 2014) with our new annotation scheme. The TimeBank-Dense Corpus is based on TimeML (Saurí et al., 2004), which defines an event as a cover term for situations that happen or occur. Events can be punctual or last for a period of time. Predicates describing states or circumstances in which something holds true are also events. For the TimeBank Corpus, the smallest extent of text (usually a single word) that expresses the occurrence of an event is annotated.
The aspectual type of the annotated events in the TimeBank Corpus can be divided into achievement events, accomplishment events, and states (Pustejovsky, 1991). An achievement is an event that results in an instantaneous change of some sort. Examples of achievement events are to find, to be born, or to die. Accomplishment events also result in a change of some sort; however, the change spans a longer time period. Examples are to build something or to walk somewhere. States, on the other hand, do not describe a change of some sort, but that something holds true for some time, for example, to be sick or to love someone. The aspectual type of an event depends not only on the event itself, but also on the context in which the event is expressed.
Punctual events are a single point on the time axis, while events that last for a period of time have a begin point and an end point. It can be difficult to distinguish between punctual events and events with a short duration. Furthermore, the documents typically do not report precise starting and ending times for events. Hence, we decided to distinguish between Single Day Events, which happened on a single day, and Multi-Day Events, which span multiple days. We used days as the smallest granularity for the annotation, as none of the annotated articles contained any information on the hour, the minute or the second when the event happened. If a corpus contains this information, the annotation scheme could be extended to include it as well.
For Single Day Events, the event time is written in the format YYYY-MM-DD. For Multi-Day Events, the annotator annotates the begin point and the end point of the event. If no statement can be made about when an event happened, the event is annotated with the label not applicable. This applies to only 0.67% of the annotated events in the TimeBank Corpus, mainly due to annotation errors in that corpus.
He was sent into space on May 26, 1980. He spent six days aboard the Salyut 6 spacecraft.
The first event in this text, sent, will be annotated with the event time 1980-05-26. The second event, spent, is a Multi-Day Event and is annotated with the event time beginPoint=1980-05-26 and endPoint=1980-06-01.
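For building a knowledge base, the three kinds of annotation could be represented along the following lines. This is a minimal sketch: the class names `SingleDay`, `MultiDay` and `NotApplicable`, the `render` method, and the rule that a Multi-Day Event must end after it begins are assumptions made for this example, not part of the annotation guidelines.

```python
from dataclasses import dataclass
from datetime import date
from typing import Union


@dataclass(frozen=True)
class SingleDay:
    """Event that happened on one day, written as YYYY-MM-DD."""
    day: date

    def render(self) -> str:
        return self.day.isoformat()


@dataclass(frozen=True)
class MultiDay:
    """Event spanning several days, with a begin point and an end point."""
    begin_point: date
    end_point: date

    def __post_init__(self) -> None:
        # Assumed consistency check: a multi-day event ends after it begins.
        if self.end_point <= self.begin_point:
            raise ValueError("end_point must be after begin_point")

    def render(self) -> str:
        return (f"beginPoint={self.begin_point.isoformat()} "
                f"endPoint={self.end_point.isoformat()}")


@dataclass(frozen=True)
class NotApplicable:
    """No statement can be made about when the event happened."""

    def render(self) -> str:
        return "not applicable"


EventTime = Union[SingleDay, MultiDay, NotApplicable]

# The two events from the example text above.
sent: EventTime = SingleDay(date(1980, 5, 26))
spent: EventTime = MultiDay(date(1980, 5, 26), date(1980, 6, 1))
```

With this representation, `sent.render()` gives `1980-05-26` and `spent.render()` gives `beginPoint=1980-05-26 endPoint=1980-06-01`; the two points of `spent` lie six days apart.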