A review corpus annotated for negation, speculation and their scope

(1)

A review corpus annotated for negation, speculation and their scope

Natalia Konstantinova

1

, Sheila C.M. de Sousa

1

, Noa P. Cruz

2

, Manuel J. Maña

2

, Maite

Taboada

3

and Ruslan Mitkov

1

1_{Research Group in Computational Linguistics, University of Wolverhampton (UK)} 2_{Departamento de Tecnologías de la Información, Universidad de Huelva (Spain)}

3_{Simon Fraser University, Department of Linguistics (Canada)}

E-mail: {n.konstantinova,sheila.castihomonteirodesousa, R.Mitkov}@wlv.ac.uk {noa.cruz,manuel.mana}@dti.uhu.es

[email protected]

Abstract

This paper presents a freely available resource for research on handling negation and speculation in review texts. The SFU Review Corpus, consisting of 400 documents of movie, book, and consumer product reviews, was annotated at the token level with negative and speculative keywords and at the sentence level with their linguistic scope. We report statistics on corpus size and the consistency of annotations. The annotated corpus will be useful in many applications, such as document mining and sentiment analysis.

Keywords: corpus, annotation, negation and speculation

1. Introduction

Processing negation and speculation can be useful for several NLP applications such as information extraction, paraphrasing, recognizing textual entailment, opinion mining and sentiment analysis.

For example, many authors have studied the role of negation in the sentiment analysis task, where it is one of the most common linguistic means that can lead to a change in polarity. Councill et al. (2010) describe a system that can identify exactly the scope of negation in free text. The authors concluded that performance was improved dramatically by introducing negation scope detection. In more recent work, Dadvar et al. (2011) investigated the problem of determining the polarity of sentiment in movie reviews when negation words occur in the sentences. The authors also observed significant improvements on the classification of the documents after applying negation detection.

Distinguishing between objective and subjective facts is also crucial for sentiment analysis. Speculation is a linguistic expression that tends to correlate with subjectivity (also known as private state). Pang & Lee (2004) showed that subjectivity detection in the review domain helps to improve polarity classification.

To the best of our knowledge, there are no publicly available standard corpora of reasonable size from the review domain annotated with negation and speculation. This motivated our annotation of the SFU corpus, which is widely used in the domain of sentiment analysis and opinion mining.

In this paper, we expand the previous work developed by Konstantinova & de Sousa (2011). The corpus is now annotated in its entirety and available, and the annotation guidelines are fully developed as well.

The paper is organised in the following way: Section 2 describes related research; Section 3 focuses on corpus

characteristics and provides some statistics of the SFU review corpus. Section 4 provides more details about annotation guidelines and also presents statistics about negation and speculation cues in the annotated corpus.

Section 5 discusses agreement analysis and the most regular cases of disagreements between the annotators. Section 6 provides information about the corpus and guidelines availability. The paper finishes with conclusions and future work (Section 7).

2. Related work

Even though negation and speculation detection has gained much attention in recent years, open access annotated resources are rare and relatively small in size. Most of the work has been done for the biomedical domain, where there are several annotated corpora. The GENIA Event corpus (Kim et al., 2008) contains annotation of biological events with negation and two types of uncertainty. Medlock and Briscoe (2007) based their system on a corpus consisting of six papers from genomics literature, which were annotated for speculation. Settles et al. (2008) constructed a corpus where sentences were classified as either speculative or definite; however, no keywords were marked in the corpus. Vincze et al. (2008) developed standard corpora of reasonable size with information about negative/speculative keywords and their scope.

The research community is trying to explore other domains as well: Morante et al. (2011a) discuss the need for corpora which cover different domains than biomedical. The authors point out that the existing guidelines should be adapted to new domains and they annotated literary texts by Conan Doyle, although only for the case of negation (Morante et al., 2011b).

(2)

corpus is also rather small in size, containing only 2111 sentences in total, out of which 679 contain negation. Therefore, our corpus is the first one with an annotation of negative/speculative information and their scope in the review domain.

3. Corpus characteristics

The Simon Fraser University Review corpus (Taboada et al., 2006) was chosen for our annotation of negation and speculation. This corpus consists of 400 documents (50 of each type) of movie, book, and consumer product reviews from the website Epinions.com. Each text was assigned a label based on whether it is a positive or negative review. All the texts differ in size and are written by different people (more information about the size of the corpus can be found in Table 1). As shown in this table, there are appreciable differences in the length of the documents depending on the domain but not in the length of sentences, so sentence complexity in the entire corpus is comparable.

Domain #Senten ces

Av. length document

#Words Av. length sentences

Books 1,596 31.92 32,908 20.61

Cars 3,027 60.54 58,481 19.32

Computers 3,036 60.72 51,668 17.01

Cookware 1,504 30.08 27,323 18.16

Hotels 2,129 42.58 40,344 18.95

Movies 1,802 36.04 38,507 21.36

Music 3,110 62.20 54,058 17.38

Phones 1,059 21.18 18,828 17.77

Total 17,263 43.15 303,289 17.56

Table 1: Statistics of the SFU Review Corpus. Av. length document is shown in number of sentences. Av. length

sentences are shown in number of words.

4. Annotation guidelines

The entire corpus was annotated by one linguist. A second linguist annotated 10% of the documents, randomly selected and in a stratified way, with the aim of measuring inter-annotator agreement (Section 5 provides more details about this analysis).

The guidelines presented in this paper have been adapted from the existing Bioscope corpus guidelines (Vincze et al., 2008) in order to fit the needs of the review domain.

4.1 General remarks

There are several general principles to be followed when annotating negation and speculation:

 Only sentences with some instance of speculative language or negation should be considered.

 Questions should not be annotated at all.

 The min-max strategy should be followed during annotation, following the BioScope corpus guidelines (Vincze et al., 2008):

- When annotating keywords, try to choose the minimal unit which expresses negation or speculation (special attention should be paid to distinguishing complex cues and sequences of several keywords).

- When annotating scope, try to annotate the maximum number of words affected by the phenomenon.

 Cue words are not included in the scope.

 Transitional words (e.g. in addition, not to mention, etc.) should not be included in the scope.

 When unsure of the scope, annotate only a keyword.

 When unsure what category the keyword should be assigned to (whether it expresses negation or speculation), use the 'undecided' label.

As mentioned earlier we did not agree with the BioScope guidelines completely and introduced some modifications. These main changes are summarised below:

 We did not include cue words in their scope;

 A different scheme for annotating coordination was used;

 Embedded scopes were quite a frequent case;

 We had a case of ‘no scope’ both in the case of negation and speculation.

More information about the differences with the BioScope principles can be found in Konstantinova & de Sousa (2011).

The nature of the review domain texts introduces a greater possibility of encountering difficult cases than in the biomedical domain. Some of these special cases are discussed in Section 5. More detail can be found in the full version of the guidelines (see Section 6).

4.2 Negation

(3)

Domain # Cues % Negated sentences

Books 406 22.7

Cars 576 17.1

Computers 590 17.2

Cookware 376 21.3

Hotels 387 16.3

Movies 490 23.7

Music 470 13.4

Phones 232 19.5

Total 3527 18.1

Table 2: Negation statistics in the SFU Review Corpus

The total amount of distinct negation cues in our corpus amounted to 53, with the top 10 most frequent cues shown in Table 3. It is interesting to note that the first two cues for negation (not and no) constitute more than 55% of the total frequency of all the cues found in the corpus, while the remaining 51 cues cover only 45%.

Cue Frequency Percentage

Not 1419 40.23

No 524 14.85

Don’t 296 8.39

Never 248 7.03

Doesn’t 154 4.36

Without 151 4.28

Didn’t 119 3.37

Isn’t 89 2.52

Can’t 68 1.92

Wasn’t 57 1.61

Table 3: The most frequent negation keywords in the SFU Review Corpus

4.3 Speculation

In the case of speculation, the statistical analysis described in Table 4 shows that, out of a total of 17,263 sentences, around 22% are speculative. Therefore the proportion of speculative sentences in the annotated corpus is higher than negative ones. This can be explained by the nature of the corpus which consists of

reviews, which are subjective and where speculation is extensively used to express opinions.

Domain #Cues %Speculative

sentences

Books 370 17.2

Cars 1068 26.0

Computers 944 23.2

Cookware 583 27.3

Hotels 695 23.7

Movies 648 26.0

Music 643 15.1

Phones 408 27.4

Total 5359 22.7

Table 4: Speculation statistics in the SFU Review corpus

More than 100 different cues were used in our corpus for expressing speculation. This number is considerably higher than the number of negation cues encountered during the annotation. In addition, as described in Table 5, the amount of occurrences of each cue was equally distributed across all the cues, so the top most frequent cues did not represent the majority of speculation cases as happened for negation.

Cue Frequency Percentage

If 876 16.34

Or 820 15.30

Can 765 14.27

Would 594 11.08

Could 299 5.57

Should 213 3.97

Think 211 3.93

May 157 2.92

Seems 150 2.79

Probably 121 2.25

(4)

(5)

(6)

for creating semantic orientation dictionaries. In Proceedings of 5th International Conference on Language Resources and Evaluation (LREC), pages 427–432, Genoa, Italy.

Vincze, V.; Szarvas, G.; Farkas, R.; Móra, G. and Csirik J.

(2008). The Bio-Scope corpus: biomedical texts

annotated for uncertainty, negation and their scopes. BMC