Argument attribution focuses on identifying properties of arguments or their components. For instance, Feng and Hirst (2011) proposed an approach for identifying the five most frequent argumentation schemes in AraucariaDB (argument from example, argument from cause to effect, practical reasoning, argument from consequences, and argument from verbal classification). They experimented with several classification setups and achieved accuracies between 62.9% and 97.9% using a binary C4.5 decision tree for each argumentation scheme. However, their approach is based on features extracted from mutual information of claims and premises and thus requires that the argument components are reliably identified in advance.
Oraby et al. (2015) addressed the argumentation style of arguments. They employed the IAC corpus and classified each argument as factual or emotional in order to separate arguments with argumentative merit from those based on emotional reasons. They used a bootstrapping approach for extracting linguistic patterns from unlabeled arguments and achieved .80 accuracy. Although this approach increases precision, it exhibits a significantly lower recall than a supervised unigram baseline.
Wachsmuth et al. (2015) focused on the identification of sentiment flows in product reviews. They modeled the sentiment flow of a review as a sequence of local sentiments, i.e., sentiment scores of discourse units, in order to identify common argumentation patterns. They showed that reviews across several domains exhibit similar sentiment flows. In later work, they extended their approach to rhetorical moves, including sequences of discourse relations, discourse functions, and argumentative roles (Wachsmuth and Benno, 2016). Their results suggest that flow patterns generalize well across different text types such as product reviews and student essays. Furthermore, they showed that features derived from flow patterns improve the classification of global sentiment and the scoring of essay organization.
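The notion of a sentiment flow can be illustrated with a minimal sketch. It assumes that each discourse unit already carries a sentiment score in [-1, 1]; the function names, the neutral band of 0.1, and the collapsing of repeated labels are illustrative assumptions and not necessarily the abstraction used by Wachsmuth et al. (2015).

```python
from itertools import groupby


def local_sentiments(scores, neutral_band=0.1):
    """Map per-unit sentiment scores to coarse labels.

    The threshold `neutral_band` is a hypothetical choice for illustration.
    """
    labels = []
    for score in scores:
        if score > neutral_band:
            labels.append("pos")
        elif score < -neutral_band:
            labels.append("neg")
        else:
            labels.append("neu")
    return labels


def collapse_flow(labels):
    """Merge runs of identical labels so that only sentiment changes remain."""
    return tuple(label for label, _ in groupby(labels))


# Toy review with five discourse units (scores are made up).
scores = [0.8, 0.6, -0.4, 0.0, 0.7]
flow = collapse_flow(local_sentiments(scores))
print(flow)
```

Two reviews of different length map to the same flow whenever their sentiment changes occur in the same order, which is one simple way to compare patterns across texts.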
Besides these approaches, there are several approaches to stance recognition and only a few that address the quality of arguments. We introduce both in the following two sections.
3.3.1 Stance Recognition
Stance recognition aims at identifying the stance of an author toward a controversial topic. The task is usually framed as labeling an author’s comment in an online debate as either for or against.
Somasundaran and Wiebe (2009) proposed an approach that maximizes the overall side-score of a comment using integer linear programming, since a single comment can also include concessions or statements opposing the view of the author. They determined the probability that a particular term is positively or negatively associated with the topic by extracting subjectivity clues and associating them with targets from topic-relevant documents. In addition, they considered concessions recognized with a list of discourse constructs. In their experiments, they achieved accuracies between .611 and 1.0 across four topics on a test set of 117 comments.
In follow-up work, Somasundaran and Wiebe (2010) experimented with clue words and sentiment features using a supervised classifier. They extracted 3,094 positive and 668 negative words from the annotations of the MPQA corpus (Wilson et al., 2005) and showed that combining them with content words yields promising results. They achieved their best result of .639 accuracy using a support vector machine with a combination of sentiment and clue word features.
In addition, there are several other approaches to stance recognition. For instance, Anand et al. (2011) experimented with lexical, structural, dependency, and context features, and Hasan and Ng (2012) showed that jointly modeling contextual information and authors’ stances on particular subtopics improves the accuracy. Hasan and Ng (2014) found that combining reason classification and stance classification yields promising results, and Qiu and Jiang (2013) proposed a novel generative latent variable model to capture the viewpoint, user identity, and user interactions.
3.3.2 Argument Quality
Approaches for automatically assessing the quality of arguments are still very rare. Cabrio and Villata (2012a) employed textual entailment to identify accepted arguments in online communities. They built a graph that represents attack and support relations between arguments and applied the abstract argumentation framework proposed by Dung (1995) to identify accepted arguments. They report an accuracy of 75% for identifying accepted arguments. However, their approach focuses on the acceptance of entire arguments (macro-level) instead of the acceptability of individual premises as required by the RAS-criteria (cf. Section 2.2.4). In addition, the reliability of their data set is unknown. They consider an argument as accepted if it is not attacked in the debate, i.e., if there is no related counterargument. However, it is unclear whether humans agree with these automatically extracted labels.
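To make the notion of acceptance in an abstract argumentation framework concrete, the following minimal sketch computes the grounded extension of a small attack graph in the sense of Dung (1995) and contrasts it with the simpler "not attacked" criterion. It is an illustration under stated assumptions, not the implementation of Cabrio and Villata (2012a): the function names, the toy arguments, and the choice of grounded semantics are assumptions, and support relations are ignored.

```python
def grounded_extension(arguments, attacks):
    """Return the grounded extension of the framework (arguments, attacks).

    `attacks` is a set of (attacker, target) pairs over `arguments`.
    Starting from the empty set, repeatedly collect every argument whose
    attackers are all counter-attacked by an already accepted argument,
    until nothing changes.
    """
    attackers = {a: set() for a in arguments}
    for src, tgt in attacks:
        attackers[tgt].add(src)

    accepted = set()
    while True:
        defended = {
            a
            for a in arguments
            if all(
                any((d, b) in attacks for d in accepted)
                for b in attackers[a]
            )
        }
        if defended == accepted:
            return accepted
        accepted = defended


def unattacked(arguments, attacks):
    """Arguments that no other argument attacks."""
    targets = {tgt for _, tgt in attacks}
    return {a for a in arguments if a not in targets}


# Hypothetical debate: A attacks B, and B attacks C.
arguments = {"A", "B", "C"}
attacks = {("A", "B"), ("B", "C")}

print(sorted(grounded_extension(arguments, attacks)))  # ['A', 'C']
print(sorted(unattacked(arguments, attacks)))          # ['A']
```

In this toy debate, C is attacked by B but defended by A, so it belongs to the grounded extension although it would not count as accepted under a pure "not attacked" criterion.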
Park and Cardie (2014) presented an approach for classifying argument components as verifiable or unverifiable. Their best approach, based on a support vector machine and various features, achieved a macro F1 score of .690. Although they claim that verifiability allows for determining appropriate types of premises and consequently the strength of an argument, it is unclear how the verifiability of propositions relates to the logical quality of arguments.
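For readers unfamiliar with the measure, macro F1 is the unweighted mean of the per-class F1 scores, so each class contributes equally regardless of its frequency. The short sketch below computes it for two classes; the label names and the toy gold and predicted labels are made up for illustration and are unrelated to the data of Park and Cardie (2014).

```python
def f1_for_class(gold, predicted, label):
    """F1 score of one class, treating `label` as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores over all gold classes."""
    labels = sorted(set(gold))
    return sum(f1_for_class(gold, predicted, lab) for lab in labels) / len(labels)


# Hypothetical labels for five argument components.
gold = ["verifiable", "verifiable", "unverifiable", "unverifiable", "unverifiable"]
pred = ["verifiable", "unverifiable", "unverifiable", "unverifiable", "verifiable"]

print(round(macro_f1(gold, pred), 3))  # 0.583
```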
Persing and Ng (2015) introduced an approach for recognizing the argumentation strength of an entire essay. They report that POS n-grams, prompt adherence features (Persing and Ng, 2014), and predicted argument components perform best. However, their system outputs a single holistic score that summarizes the strength of all arguments in an essay. Consequently, it is of only limited use for pinpointing the weak points of arguments in persuasive essays.