5. Detecting comparisons in product reviews
5.2. Approach
5.2.4. Context from annotation design
Context from annotation design refers to the annotation design decisions we have made in creating our corpus as described in Chapter 4. We focus on three specific questions: the first two concern the annotation of multiword predicates like “less good” or “as good as”, and the third concerns the annotation of entities. In our data (and to a certain degree also in the other existing resources), we can systematically change the annotations to reflect a different outcome for each decision and investigate the effect these changes have on the classification performance of our system.
Anchoring of multiword predicates. Most comparative predicates are single words like “better” or “best”, but multiword predicates account for about 10-20% of comparative predicates in our data. Some are expressions like “X has the edge over Y” or “X is on par with Y”, which we will not discuss in this work. But the majority of multiword predicates in our data are systematically introduced by English grammar rules. Consider the following variations of a sentence:
(5.4) a. “[It]E1 had a sturdier [feel]A than [X]E2.”
b. “[It]E1 had a less sturdy [feel]A than [X]E2.”
Sentence 5.4a compares the aspect “feel” of a camera to some other camera with the comparative predicate “sturdier”. If we change the direction of the comparison, we get a multiword predicate with the modifier “less” added to the adjective (Sentence 5.4b). In the following, we refer to such a modifier as the function word and to the modified adjective or adverb as the content word. Besides the modifiers “less” and “more” for comparative forms, and “most” and “least” for the superlative, the list of function words includes “as”, which is used to introduce an equative comparison like “X is as good as Y”.2
Our system, like the majority of systems to date, uses a single-token-based approach for the automatic detection of comparative predicates, which raises the question of which word to select as the anchor for multiword predicates. A strong argument can be made for selecting the function word as the anchor token. A given function word will occur in more training instances than any individual content word, so sparsity is reduced. On the other hand, choosing the content word may be more informative for end users.
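The sparsity argument can be made concrete with a small count. The following is a minimal sketch, assuming multiword predicates are available as (function word, content word) pairs; the pairs below are invented for illustration and are not taken from our corpus.

```python
from collections import Counter

# Toy multiword predicates as (function word, content word) pairs.
# These pairs are invented for illustration and are not corpus data.
multiword_predicates = [
    ("less", "sturdy"),
    ("more", "expensive"),
    ("less", "noisy"),
    ("as", "good"),
    ("more", "reliable"),
    ("less", "expensive"),
]

# Number of training instances per anchor token under each anchoring choice.
function_anchor_counts = Counter(f for f, _ in multiword_predicates)
content_anchor_counts = Counter(c for _, c in multiword_predicates)

print("function anchors:", dict(function_anchor_counts))
print("content anchors:", dict(content_anchor_counts))
```

In this toy list, the six predicates are spread over three distinct function-word anchors but over five distinct content-word anchors, so each function word is observed more often than most content words.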
The first question we want to investigate in this study is whether the different annotation decisions translate into a difference in classification performance. In our first experiment we identify all occurrences of multiword predicates. In one setting (Function predicates), we annotate the modifying function word as the comparative predicate. In the second setting (Content predicates), we annotate the modified content word. The following illustrates the different annotations for an example sentence:
(5.5) a. “. . . had a less [sturdy]A [feel]A . . . ” (Function predicates)
b. “. . . had a less [sturdy]A [feel]A . . . ” (Content predicates)
In both cases we have the same number of comparative predicates; only the annotations differ. Argument annotations are identical, even if this causes one token to carry a predicate and an argument annotation at the same time.
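The two settings can be sketched as a re-anchoring step over token-level annotations. This is a minimal illustration and not our actual preprocessing: the `Token` class, the `anchor_predicate` function, and the assumption that the positions of the function word and the content word are already known from the gold annotation are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    labels: set = field(default_factory=set)  # argument labels, e.g. {"A"}
    is_predicate: bool = False


def anchor_predicate(tokens, function_idx, content_idx, setting):
    """Return a copy of `tokens` with the multiword predicate anchored on
    the function word ("function") or on the content word ("content").
    Argument labels are copied unchanged in both settings."""
    if setting not in ("function", "content"):
        raise ValueError(f"unknown setting: {setting}")
    out = [Token(t.text, set(t.labels), t.is_predicate) for t in tokens]
    out[function_idx].is_predicate = setting == "function"
    out[content_idx].is_predicate = setting == "content"
    return out


# "... had a less sturdy feel ..." with both "sturdy" and "feel" as aspects.
sentence = [
    Token("had"), Token("a"), Token("less"),
    Token("sturdy", {"A"}), Token("feel", {"A"}),
]

function_setting = anchor_predicate(sentence, 2, 3, "function")
content_setting = anchor_predicate(sentence, 2, 3, "content")

for name, toks in [("function", function_setting), ("content", content_setting)]:
    anchors = [t.text for t in toks if t.is_predicate]
    print(name, "anchor:", anchors)
```

In the content setting of this sketch, the token “sturdy” carries both the predicate flag and the aspect label, which is the double annotation mentioned above.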
Annotation of aspects and scale. The second question deals with the annotation of the content word when we use function predicates for multiword predicates. Most existing corpora annotate the content word as an aspect. We will illustrate some problems with this approach with the following examples:
(5.6) a. “. . . a sturdier [feel]A . . . ”
2 Note that not all occurrences of the keywords indicate multiword predicates, e.g., in “X has less noise”
b. “. . . a less [sturdy]A [feel]A . . . ”
c. “. . . a less [sturdy]A feel . . . ”
d. “. . . a less sturdy [feel]A . . . ”
e. “. . . a less flimsy [feel]A . . . ”
If we compare the annotations of Sentence 5.6a and Sentence 5.6b, we see that changing the direction of the comparison introduces a new aspect. This is counter-intuitive because what is compared (i.e., the aspect) should not depend on the introduced ranking. Additionally, if there is only one slot for the aspect, as is the case in one of the corpora we use, annotators will need to decide between annotations 5.6c and 5.6d. Annotation 5.6c is inconsistent when compared to annotation 5.6a, as only the direction of the comparison has changed, but as a result we have different annotations for the aspect. With annotation 5.6d we lose information about the actual sentiment polarity that is expressed, as we are not able to distinguish it from the annotation in Sentence 5.6e.
To solve these issues, we have proposed to introduce a separate argument called scale with the sole purpose of modeling the content word in a multiword predicate (cf. Section 4.4). In our second design context experiment, we use function words as predicates and change the label of the content word from aspect (Aspect) to scale (Scale). This results in the following annotations being compared:
(5.7) a. “. . . had a less [sturdy]A [feel]A . . . ” (function predicates with Aspect argument)
b. “. . . had a less [sturdy]S [feel]A . . . ” (function predicates with Scale argument)
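The relabeling in this experiment can be sketched as follows. This is a minimal illustration under stated assumptions: argument labels are stored as one entry per token (`None` for tokens without an argument label), the index of the content word is known from the gold annotation, and the function name `label_content_word` is hypothetical.

```python
def label_content_word(labels, content_idx, setting):
    """Return a copy of the per-token argument labels in which the content
    word of a multiword predicate is labeled as aspect ("A") or scale ("S").
    All other labels are left untouched."""
    if setting not in ("aspect", "scale"):
        raise ValueError(f"unknown setting: {setting}")
    relabeled = list(labels)
    relabeled[content_idx] = "A" if setting == "aspect" else "S"
    return relabeled


tokens = ["had", "a", "less", "sturdy", "feel"]
gold_labels = [None, None, None, "A", "A"]  # content word "sturdy" at index 3

aspect_labels = label_content_word(gold_labels, 3, "aspect")
scale_labels = label_content_word(gold_labels, 3, "scale")

for token, a, s in zip(tokens, aspect_labels, scale_labels):
    print(f"{token:8} aspect-setting={a} scale-setting={s}")
```

Only the label of the content word differs between the two outputs; the token spans themselves stay the same, so the change affects argument classification only.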
The tasks of predicate and argument identification are independent of argument labels, so the only change is in argument classification. We expect a drop in classification performance due to the increased number of classes. We hope that the drop is not significant, as the new argument class is well-defined and should be relatively easy to distinguish from real aspects.
Annotation of entities. Our third question deals with the annotation of entities. Usually, two entities participate in a comparison. It is conceivable to treat both entities as one type of argument and to not differentiate between them (Unified entities). When two different types of entities are distinguished, there are two possibilities. Entities can be annotated according to surface position as entity 1 and entity 2 (Surface entities), or by preference as a preferred and a non-preferred entity (Preference entities). Thus, the following setups are compared:
(5.8) a. “[It]E had a less [sturdy]A [feel]A than [the other one]E.” (Unified entities)