2.3 Chapter Summary
3.1.2 Micro-level Argument Corpora
Corpora annotated with argument components address the microstructure of arguments. They include annotations of different argument components such as claims and premises and allow for a fine-grained analysis of arguments in text. In this section, we highlight the differences in the applied argumentation schemes, i.e. the types of argument components and their granularity.
Kwon et al. (2007) annotated the main claim in English online comments about the emission standard rules proposed by the Environmental Protection Agency (EPA). They annotated claims at the sentence level in 119 documents and achieved an agreement of κ = .62 between two annotators. Furthermore, they annotated each claim as support, oppose, or propose with an inter-annotator agreement of κ = .80. They found that 59% of the claims are opposing claims, 7% are supporting claims and 34% are proposing claims. Their corpus includes non-argumentative sentences and different types of argument components. It is therefore usable for argument component identification and component classification.
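Agreement scores such as those above can be illustrated with a minimal sketch, assuming that κ denotes Cohen's kappa for two annotators (the text does not name the variant). The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from any of the cited corpora.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement expected from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy data: 1 = sentence contains a claim, 0 = it does not.
annotator_1 = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In this toy example the annotators agree on 8 of 10 sentences (observed agreement .80) while the chance agreement is .52, so κ is noticeably lower than the raw agreement. This is why chance-corrected scores are reported rather than plain percentages.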
The ECHR corpus contains legal cases of the European Court of Human Rights annotated with argument components (Mochales-Palau and Moens, 2008). The annotation scheme includes claims and supporting or opposing premises. In their first annotation study, Mochales-Palau and Moens (2008) annotated 10 documents and obtained an inter-annotator agreement of κ = .58. In a subsequent study, they extended their experiments to 47 documents and achieved an inter-annotator agreement of κ = .75 (Mochales-Palau and Moens, 2009). The final corpus includes 1,449 non-argumentative sentences and 1,067 argumentative sentences. The argumentative sentences include 304 claims and 763 premises, i.e. about 2.5 premises per claim on average. This proportion indicates that arguers in legal cases provide several reasons per claim to ensure a robust standpoint.
Biran and Rambow (2011a) annotated claims and premises (justifications) in 309 blog threads from LiveJournal.com (a virtual and informal blog community). The corpus contains 1,377 multi-sentence argument components. They achieved an inter-annotator agreement of κ = .69 between two annotators. In subsequent work, they applied their annotation scheme to 118 Wikipedia talk pages (Biran and Rambow, 2011b). They annotated 2,404 argument components and obtained an inter-annotator agreement of κ = .75. Both corpora contain non-argumentative text units and different types of argument components, i.e. claims and premises.
Rosenthal and McKeown (2012) created a corpus of 285 blog posts collected from LiveJournal.com and 51 Wikipedia discussion pages. Two annotators identified claims at the sentence level and reached an agreement of κ = .53 (κ = .5 on LiveJournal and κ = .557 on Wikipedia discussion pages). The corpus is appropriate for component identification but not for component classification since it includes only a single type of argument component.
Sardianos et al. (2015) annotated argument components in 300 news articles written in Greek. Two annotators labeled claims and premises at the clause level and achieved an agreement of F1 = .76. In total, the corpus contains 1,191 argument components. It can be employed for component identification as well as for component classification.
Chapter 3. Computational Argumentation
Goudas et al. (2014) presented a Greek corpus about renewable energy collected from social media. It contains documents from different sources such as news, blogs and microblogs. The corpus comprises 16k sentences, of which 760 include claims and premises at the clause level. The authors do not report agreement scores. The corpus can be applied for identifying and classifying argument components.
All corpora described above contain annotations of argument components in single documents, i.e. all components of an argument are encapsulated in the same document. However, it is worthwhile to identify argument components across several documents, e.g. in order to collect evidence for or against a given claim. Therefore, Aharoni et al. (2014) created a corpus that contains claims and premises at the clause level over multiple Wikipedia articles. Starting with a set of 33 topics from iDebate.org, 20 annotators selected 1,392 related claims from Wikipedia articles with an inter-annotator agreement of κ = .39. Subsequently, they annotated 1,291 associated premises. They classified each premise as study (quantitative analysis), expert (testimony by a person) or anecdotal (specific events) and achieved an inter-annotator agreement of κ = .40. The data set has been continuously extended in subsequent work at IBM (Rinott et al., 2015). The current version includes 58 topics and 547 documents (Wikipedia articles) annotated with 2,294 claims and 4,960 associated premises. The corpus is particularly suitable for information retrieval tasks. It can be used to train supervised machine learning models that identify evidence for a given claim in multiple documents. Furthermore, the different argument component types enable the development of component classification methods.
Habernal and Gurevych (2016a) presented a corpus of user-generated web content (blog posts, forum posts, user comments, etc.) annotated with a modified Toulmin model (cf. Section 2.1.1). First, three annotators annotated 990 documents as argumentative (on-topic persuasive) or non-argumentative (non-persuasive) and achieved an inter-annotator agreement of κ = .59. Subsequently, they annotated 340 argumentative documents with multi-sentence claims, premises, backings, rebuttals and refutations. They achieved an average inter-annotator agreement of αU = .48 across different topics. The corpus is appropriate for identifying arguments at the macro-level and also allows for a more fine-grained analysis of argument components and their types at the micro-level.
A drawback of all corpora described above is that they do not model the targets of argument components, i.e. the argument component to which a particular premise refers. Consequently, it is not possible to separate several arguments in a document or to model serial argumentation structures (cf. Section 2.1.3). In order to model the targets of premises, Eckle-Kohler et al. (2015) proposed an annotation scheme that indicates whether a premise refers to a preceding or following claim. In addition, their scheme distinguishes between supporting and attacking premises. They applied their annotation scheme to 88 German news articles collected with a focused crawler. Three annotators annotated 1,708 multi-sentence argument components (74% of the tokens are argumentative) and reached an agreement of αU = .40. However, the annotation scheme is limited to convergent argument structures and does not model serial argumentation structures. In addition, the annotation scheme cannot always identify the target of a premise unambiguously, e.g. if several independent reasons follow two adjacent claims.
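The limitation with two adjacent claims can be made concrete with a small sketch. It assumes that a premise labeled as referring to a preceding or following claim is attached to the nearest claim in that direction; this resolution rule, the `Component` class and the function `resolve_targets` are illustrative assumptions and are not described by Eckle-Kohler et al. (2015).

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Component:
    text: str
    kind: str                        # "claim" or "premise"
    stance: Optional[str] = None     # premises only: "support" or "attack"
    direction: Optional[str] = None  # premises only: "preceding" or "following"

def resolve_targets(components: List[Component]) -> List[Optional[int]]:
    """Attach each premise to the nearest claim in its annotated direction.

    Returns one entry per component: the index of the target claim for a
    premise, or None for claims and for premises without any candidate.
    """
    targets: List[Optional[int]] = []
    for i, comp in enumerate(components):
        if comp.kind != "premise":
            targets.append(None)
            continue
        if comp.direction == "preceding":
            candidates = range(i - 1, -1, -1)   # walk backwards from the premise
        else:
            candidates = range(i + 1, len(components))
        targets.append(
            next((j for j in candidates if components[j].kind == "claim"), None)
        )
    return targets

# Hypothetical document: two adjacent claims followed by two reasons.
doc = [
    Component("Claim A", "claim"),
    Component("Claim B", "claim"),
    Component("Reason 1", "premise", "support", "preceding"),
    Component("Reason 2", "premise", "support", "preceding"),
]
targets = resolve_targets(doc)
```

Under this assumed rule both reasons are attached to Claim B, and the direction label alone offers no way to express that one of them supports Claim A. An explicit target reference per premise would be needed to resolve such cases.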
3.1. Existing Corpora
Corpus                           Domain                 Lang  Task   CompGran  #Doc  #Comp  Reliability
Biran and Rambow (2011a)         blog threads           en    CI+CC  multi      309  1,377  κ = .69
Biran and Rambow (2011b)         Wikipedia talk pages   en    CI+CC  multi      118  2,404  κ = .75
Eckle-Kohler et al. (2015)       news                   de    CI+CC  multi       88  1,708  αU = .40
Goudas et al. (2014)             social media           gr    CI+CC  clause     204    760  -
Habernal and Gurevych (2016a)    web content            en    CI+CC  multi      340  1,319  αU = .48
Kwon et al. (2007)               online comments        en    CI+CC  sentence   119    240  .62 ≤ κ ≤ .80
Mochales-Palau and Moens (2009)  court cases            en    CI+CC  sentence    47  1,067  κ = .75
Rinott et al. (2015)             Wikipedia articles     en    CI+CC  clause     547  7,254  κ = .39
Rosenthal and McKeown (2012)     blogs and discussions  en    CI     sentence   336  2,479  κ = .53
Sardianos et al. (2015)          news                   gr    CI+CC  clause     300  1,191  F1 = .76
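Table 3.2: Corpora annotated with argument components at the micro-level (Task: CI = component identification, CC = component classification; CompGran: granularity of the annotated components, where multi = multi-sentence; #Doc: number of documents; #Comp: number of argument components).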
The corpora listed in Table 3.2 cover a variety of text types including news, legal documents, online discussions and different types of user-generated web content. The investigation of the employed annotation schemes shows that the claim-premise scheme is the most frequent one. The reported reliability scores range from moderate to substantial agreement. Thus, the overview provides some evidence that the claim-premise scheme can be reliably applied to heterogeneous types of text. A drawback of the applied annotation schemes is that none of the corpora listed above explicitly models the relations between argument components in single documents. However, knowing the targets of argument components, and thus the structure of arguments, is crucial for argument analysis (Henkemans, 2000, p. 448; Govier, 2010, p. 22; Sampson and Clark, 2006; Sergeant, 2013). First, the structure of arguments is essential for evaluating their quality, since it is not possible to examine how well a claim is justified without knowing which premises belong to it. Second, modeling only the types of argument components is not sufficient for recognizing more complex argument structures such as serial arguments. Third, the structure is required to separate several arguments within a single text: without knowing if and how argument components are related to each other, it is not possible to group the components of an individual argument.
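The third point can be sketched in a few lines, under the assumption that explicit relations between components are available as (source, target) pairs. The function `group_arguments`, the component identifiers and the relations below are hypothetical; the sketch treats an individual argument as a connected group of related components, which is one possible reading rather than a definition given in the cited work.

```python
from collections import defaultdict

def group_arguments(component_ids, relations):
    """Group components into arguments via connected relation links."""
    neighbors = defaultdict(set)
    for source, target in relations:
        # Treat relations as undirected when grouping components.
        neighbors[source].add(target)
        neighbors[target].add(source)
    seen, groups = set(), []
    for start in component_ids:
        if start in seen:
            continue
        # Depth-first traversal collects everything linked to `start`.
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups

# Hypothetical text with two claims (c1, c2) and three premises.
components = ["c1", "p1", "p2", "c2", "p3"]
# p2 supports p1, which supports c1 (a serial structure); p3 supports c2.
relations = [("p1", "c1"), ("p2", "p1"), ("p3", "c2")]
arguments = group_arguments(components, relations)
```

With the relations given, the five components fall into two arguments, one of which is serial. Without the relations, every component would remain an isolated unit, and the same text could not be separated into its arguments.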