
Guidelines for Annotating Spatio-temporal Events

4.5 Event Extraction

4.5.2 Guidelines for Annotating Spatio-temporal Events

Since we introduced our own concept of spatio-temporal events in the context of this thesis (Strötgen et al., 2011; Strötgen and Gertz, 2012a), the extraction of spatio-temporal events is not a well-established task in the research community in contrast to, for instance, the task of temporal tagging addressed in Chapter 3. Thus, while annotation standards and annotated corpora are available for the temporal tagging task, neither annotation guidelines nor annotated corpora exist for spatio-temporal event extraction.

For the manual annotation of spatio-temporal events and also for the development of automatic extraction approaches that go beyond the cooccurrence approach, it is important to clearly specify when a combination of temporal and geographic expressions forms a valid spatio-temporal event. According to the definition of spatio-temporal events (Definition 4.1), we formulate the following specifications:

• A temporal expression te_i and a geographic expression ge_j have to cooccur within a window w in a document. This window w is set to one sentence.

• Within w, something has to be described which is, was, or will be happening at the time referred to by te_i at the place referred to by ge_j.

• The temporal expression has to be of type “date” or “time” since durations and set expressions often cannot be anchored on a timeline (cf. Section 2.3.2).
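The mechanically checkable parts of these specifications (cooccurrence within one sentence and the restriction to “date” and “time” expressions) can be sketched as a candidate filter. This is a minimal illustration, not the extraction approach of this thesis: the class and function names are hypothetical, the input is assumed to be the already extracted expressions of a single sentence, and the specification that something must actually happen at the given time and place is not decided by the code and remains a judgment of the annotator.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TemporalExpression:
    text: str
    type: str  # assumed to be one of "date", "time", "duration", "set"


@dataclass(frozen=True)
class GeographicExpression:
    text: str


# Only these types can be anchored on a timeline according to the guidelines.
ANCHORABLE_TYPES = {"date", "time"}


def candidate_events(temporal, geographic):
    """Pair the expressions of ONE sentence (window w = one sentence).

    Returns (te_i, ge_j) text pairs that pass the mechanical checks only.
    Whether a pair is a valid event still has to be decided manually.
    """
    anchorable = [te for te in temporal if te.type in ANCHORABLE_TYPES]
    return [(te.text, ge.text) for te, ge in product(anchorable, geographic)]


# Expressions of one sentence; the duration is a hypothetical addition
# that only serves to show the type filter.
temporal = [
    TemporalExpression("Sept. 20, 2011", "date"),
    TemporalExpression("two hours", "duration"),
]
geographic = [GeographicExpression("Greece"), GeographicExpression("Athens")]

events = candidate_events(temporal, geographic)
print(events)
```

The duration is dropped by the type filter, while both geographic expressions are paired with the date, so the result still contains candidates that a human annotator may reject.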

In Section 4.3.2, we already discussed examples of potential spatio-temporal events and explained when combinations of extracted temporal and extracted geographic expressions form valid events. All these examples can be decided based on the specifications described above.


Note, however, that it is important that the geographic and temporal expressions are indeed used to refer to the location and time of the spatio-temporal event, and that a location or time is not accidentally valid. For instance, in the Spatio-temporal Event Example – Sentence 1 (see footnote 10), e_1 = ⟨“Sept. 20, 2011”, “Greece”⟩ is not a valid spatio-temporal event although the valid event e_2 = ⟨“Sept. 20, 2011”, “Athens”⟩ is geographically contained in it. Applying geographic mappings to the geographic component of e_2 thus transforms e_2 into e_1. However, the geographic containment relationship in the sentence between “Athens” and “Greece” is accidental, and “Greece” could be replaced by any other geographic expression for which the containment relationship does not hold. If the sentence started with “Evangelos Venizelos, <PLACE>Turkey</PLACE>’s finance minister”, it would be obvious that e_1' = ⟨“Sept. 20, 2011”, “Turkey”⟩ is not a valid spatio-temporal event since the sentence describes something happening in Athens and not in Turkey, i.e., the geographic expression “Turkey” – and thus “Greece” in the original example – is not used to refer to a location where a spatio-temporal event takes place.

Another difficult issue occurs when geographic expressions are not used to refer to a location but to another entity type, as in the following example:

Spatio-temporal Event Example – Sentence 9.

In a historic first, <PLACE>Iraq</PLACE> announced the creation of its first national park in <TIME>July 2013</TIME>.

potential event e_24 = ⟨“July 2013”, “Iraq”⟩

In this example, it is described that a representative of the country “Iraq” announced something, i.e., “Iraq” is used to refer to an agent. Thus, “Iraq” is not used to refer to a location where a spatio-temporal event takes place and e_24 is not a valid spatio-temporal event. However, one can also argue that the geographic expression is used to refer to an agent located in “Iraq”, and thus, e_24 could also be considered a valid spatio-temporal event following the above annotation specifications. A further reason why the above example and similar constructions are difficult is that it is arguable whether expressions in such contexts should be considered geographic expressions at all and whether they should thus be extracted by a geo-tagger. For instance, Leveling and Hartrumpf (2008) argue that expressions in these contexts should not be considered as locations since they are used metonymically and “[m]etonymic location names refer to other, related entities and possess a meaning different from the literal, geographic sense” (Leveling and Hartrumpf, 2008). Although it was shown that such expressions “are to be treated differently to improve performance of geographic information retrieval” (Leveling and Hartrumpf, 2008), most geo-taggers extract them.

Due to this conflict, we want to handle such potential events neither as invalid spatio-temporal events nor in the same way as clearly valid events. In addition, spatio-temporal events can often be considered as taking place at the respective locations although the expressions are used metonymically – e.g., in “Washington announced”, “Paris declines”, and “Berlin says” where “Washington”, “Paris”, and “Berlin” refer to the governments located in the respective cities – so that we make the following distinction: We mark such events as “agent-based spatio-temporal events” when manually annotating spatio-temporal events in our annotated data sets, and allow their extraction when addressing the task automatically.
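The resulting three-way distinction can be written down as a small labeling scheme. The following is only a sketch under assumptions: the label names and the helper function are hypothetical and not taken from the annotated data sets, and the labels shown are those argued for in the text.

```python
from enum import Enum


class EventLabel(Enum):
    """Hypothetical label set for manually annotated potential events."""
    VALID = "valid"
    AGENT_BASED = "agent-based"
    INVALID = "invalid"


def extraction_allowed(label: EventLabel) -> bool:
    """Agent-based events are marked separately during annotation,
    but an automatic approach is allowed to extract them."""
    return label in (EventLabel.VALID, EventLabel.AGENT_BASED)


# Labels for the potential events discussed in this section.
annotations = {
    ("Sept. 20, 2011", "Athens"): EventLabel.VALID,
    ("Sept. 20, 2011", "Greece"): EventLabel.INVALID,
    ("July 2013", "Iraq"): EventLabel.AGENT_BASED,
}

allowed = [event for event, label in annotations.items() if extraction_allowed(label)]
print(allowed)
```

Keeping the agent-based label separate from the valid label means both a strict and a lenient reading of the guidelines can be derived from the same annotations.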

10 For convenience, we repeat this sentence here: Evangelos Venizelos, <PLACE>Greece</PLACE>’s finance minister, listens to an aide during a session of parliament in <PLACE>Athens</PLACE> on <TIME>Sept. 20, 2011</TIME>.


Note that in the first example above, “Greece” cannot be considered as an agent since the agent is “Evangelos Venizelos, <PLACE>Greece</PLACE>’s finance minister”. Thus, the event e_1 = ⟨“Sept. 20, 2011”, “Greece”⟩ is neither a valid spatio-temporal event nor an agent-based spatio-temporal event.
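To see why pure cooccurrence over-generates on this sentence, the inline <PLACE> and <TIME> markup used in the examples can be read with a regular expression and all pairs enumerated. This is a rough sketch, assuming the markup is flat (no nesting) and that every <TIME> span is a date or time; the function name is hypothetical.

```python
import re
from itertools import product

# Matches <PLACE>...</PLACE> and <TIME>...</TIME>, ignoring padding spaces.
TAG = re.compile(r"<(PLACE|TIME)>\s*(.*?)\s*</\1>")


def cooccurrence_candidates(sentence):
    """Return every (time, place) pair tagged in one sentence."""
    times, places = [], []
    for tag, text in TAG.findall(sentence):
        (times if tag == "TIME" else places).append(text)
    return list(product(times, places))


sentence = (
    "Evangelos Venizelos, <PLACE>Greece</PLACE>'s finance minister, "
    "listens to an aide during a session of parliament in "
    "<PLACE>Athens</PLACE> on <TIME>Sept. 20, 2011</TIME>."
)

candidates = cooccurrence_candidates(sentence)
print(candidates)
```

Both pairs are produced, although according to the guidelines only the pair with “Athens” is a valid spatio-temporal event; the pair with “Greece” has to be rejected by the annotator.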