
4.5 Empirical framework

4.5.4 Sentiment extraction

AFINN, BING, NRC and TM can be run in R. I use the “syuzhet” package by Jockers (2016) for this analysis, since it bundles the first three methods. The “syuzhet” package draws on the “tm” package for the pre-processing of the corpora. TM, or topic modelling, has been widely used, and a variety of plug-ins have been developed for it over the years. Among them is a sentiment-specific plug-in, which I use for the analysis as the fourth method.

Besides the methods presented here, a variety of other methods are available. Some require a deeper understanding of other programming languages such as Python. Well-known representatives are the Stanford CoreNLP application and the Natural Language Toolkit (NLTK).

The chosen methods for this study rely on categorized word dictionaries, mainly sorted into positive and negative words. In the following, the individual methods and their specifics will be summarized.

AFINN

The AFINN method is based on the work of Nielsen (2011). Similar to Liu et al. (2005), the author developed his own dictionary. One of the main reasons was that the tweets he analysed used different wording than other texts. He collected a range of positive and negative words and scored them manually. Manual scoring gave the author higher accuracy, since algorithms in many cases follow a static structure.

In contrast to the BING method, the author scored the terms on a scale from –5 to 5, which allows a more detailed analysis. Nielsen (2011) then ran a correlation analysis of his new dictionary against other methods (SentiStrength, OpinionFinder and the General Inquirer) and against human-labelled entities (Amazon’s Mechanical Turk). The latter was used as a reference point. His method generated a higher positive Pearson correlation with this reference than the other three methods.
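The valence-based scoring described above can be illustrated with a minimal sketch. It is written in Python rather than R and is not the syuzhet implementation; the toy lexicon, its words and their scores are assumptions made for illustration and are not taken from the AFINN word list.

```python
import re

# Hypothetical toy lexicon: words and valences are illustrative assumptions,
# not entries copied from the AFINN word list.
TOY_VALENCE = {
    "strong": 2,
    "growth": 2,
    "outstanding": 5,
    "weak": -2,
    "decline": -2,
    "crisis": -3,
}

def tokenize(sentence):
    """Lower-case a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())

def valence_score(sentence, lexicon=TOY_VALENCE):
    """Sum the integer valences of all lexicon words found in the sentence."""
    return sum(lexicon.get(token, 0) for token in tokenize(sentence))

sentences = [
    "Office demand showed strong growth.",
    "Vacancy rose amid a weak market and a decline in rents.",
    "The report lists the new lease contracts.",
]
scores = [valence_score(s) for s in sentences]
print(scores)
```

Words that are missing from the lexicon contribute zero, so a sentence without any scored word is treated as neutral.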

Figure 4:2 - AFINN example

Note 4.6: The graph illustrates the sentiment within the example file: Cushman & Wakefield – Market beat Office Snapshot Q1 2014. I used the AFINN method to extract the sentiment from the file. The graph shows three different illustrations: a Loess smooth (locally weighted scatterplot smoothing), the rolling mean of the positive and negative relations within each sentence, and the syuzhet DCT (discrete cosine transform). The sentiment has been scaled to a range from –1 to 1.
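Two of the three curves in such a graph, together with the scaling step, can be sketched as follows. This is a simplified Python illustration under my own assumptions (window size, number of retained DCT coefficients, edge handling); it is not the syuzhet code, and the Loess smooth is omitted.

```python
import math

def rolling_mean(values, window=3):
    """Centered moving average; windows are truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def rescale(values):
    """Linearly map values onto [-1, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]

def dct_low_pass(values, keep=2):
    """Keep the first `keep` DCT-II coefficients, then invert the transform."""
    n = len(values)
    # Forward DCT-II (unnormalized).
    coeffs = [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]
    # Low-pass filter: zero out all higher-frequency coefficients.
    coeffs = [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
    # Inverse transform (DCT-III with matching normalization).
    return [
        (coeffs[0] / 2
         + sum(coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n)))
        * 2 / n
        for i in range(n)
    ]

# Hypothetical sentence-level scores, for illustration only.
raw = [4, -4, 0, 2, -1, 3]
smoothed = rescale(dct_low_pass(raw, keep=3))
print([round(v, 2) for v in smoothed])
```

Keeping only the first few coefficients removes sentence-to-sentence noise and leaves the broad shape of the sentiment over the document; the fewer coefficients are kept, the smoother the curve.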

BING

The BING method is based on the work of Hu and Liu (2005) as well as Liu et al. (2005). As pointed out earlier, the authors were motivated to improve the process of reviewing products.

Due to the vast number of online product reviews, it has become difficult for a customer to read all of them. The authors therefore developed a sentiment analysis that translates into a graphical visualization. They used the semantic meaning of words to group them into positive and negative categories, taking WordNet and a set of 30 words (positive and negative) as the starting point for their classified dictionary.
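A binary dictionary of this kind can be applied by counting matches in each category. The following Python sketch shows one common way to turn the counts into a score (positive matches minus negative matches); the word sets are hypothetical and are not taken from the Opinion Lexicon.

```python
import re

# Hypothetical toy word sets; assumptions for illustration only.
POSITIVE = {"improve", "robust", "gain"}
NEGATIVE = {"risk", "loss", "vacant", "slump"}

def binary_score(sentence):
    """Return (#positive words) - (#negative words) for one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives

print(binary_score("Rents gain on robust demand."))
print(binary_score("The slump left many floors vacant, a risk for owners."))
```

Because every match counts equally, this approach cannot distinguish a mildly negative word from a strongly negative one, which is the limitation that a graded scale addresses.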

Figure 4:3 - BING example

Note 4.7: The graph illustrates the sentiment within the example file: Cushman & Wakefield – Market beat Office Snapshot Q1 2014. I used the BING method to extract the sentiment from the file. The graph shows three different illustrations: a Loess smooth (locally weighted scatterplot smoothing), the rolling mean of the positive and negative relations within each sentence, and the syuzhet DCT (discrete cosine transform). The sentiment has been scaled to a range from –1 to 1.

NRC

A different approach was taken by Mohammad and Turney (2010). They identified a lack of lexicons that measure emotions. Again, they drew on Amazon’s Mechanical Turk to categorize their entries. Different words evoke different emotions depending on their context. Because the categorization was done by humans, the precision of their lexicon is satisfactory. The syuzhet help file does not state which part of the NRC word lexicon is used. Since I am able to measure positive and negative words, I assume that the included lexicon ignores the emotion categories for the sentiment extraction and relies on the positive and negative label of each word.
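The assumption stated above, that only the positive and negative labels enter the sentiment score while the emotion categories are ignored, can be made concrete with a small Python sketch. The lexicon entries are hypothetical and only mimic the structure of an emotion lexicon; they are not taken from EmoLex, and the function names are my own.

```python
import re
from collections import Counter

# Hypothetical toy entries: each word carries polarity and emotion labels.
TOY_EMOLEX = {
    "crisis": {"negative", "fear"},
    "recovery": {"positive", "anticipation"},
    "trust": {"positive", "trust"},
    "abandoned": {"negative", "sadness", "fear"},
}

def label_counts(sentence):
    """Count every label (polarity and emotion) attached to the words found."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    counts = Counter()
    for token in tokens:
        counts.update(TOY_EMOLEX.get(token, ()))
    return counts

def polarity(sentence):
    """Use only the positive/negative labels; emotion labels are ignored."""
    counts = label_counts(sentence)
    return counts["positive"] - counts["negative"]

example = "Investors trust the recovery after the crisis."
print(polarity(example))
print(dict(label_counts(example)))
```

The emotion counts remain available from `label_counts`, but they do not affect the polarity score.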

Figure 4:4 - NRC example

Note 4.8: The graph illustrates the sentiment within the example file: Cushman & Wakefield – Market beat Office Snapshot Q1 2014. I used the NRC method to extract the sentiment from the file. The graph shows three different illustrations: a Loess smooth (locally weighted scatterplot smoothing), the rolling mean of the positive and negative relations within each sentence, and the syuzhet DCT (discrete cosine transform). The sentiment has been scaled to a range from –1 to 1.

TOPIC MODELLING (TM)

The TM package and its plug-ins make the program a useful resource for NLP. I apply the tm.lexicon.GeneralInquirer package by Theussl. The package links the analysis to the Harvard General Inquirer dictionary. This lexicon has been used in a variety of studies [Maynard and Bontcheva (2016); Kiritchenko and Mohammad (2016)] and can be seen as one of the more reliable sources in the NLP world. The lexicon is organized into different categories and combines four different sources. I assume that the syuzhet package draws on the positive and negative categorization within the Harvard IV-4 dictionary.

Figure 4:5 - Topic modelling example

Note 4.9: The graph illustrates the sentiment within the example file: Cushman & Wakefield – Market beat Office Snapshot Q1 2014. I used the TM.Sentiment.Plugin to extract the sentiment from the file. Unfortunately, the sentiment results are not presented at a sentence level; only the overall scores for positive and negative words are given.

[Figure 4:5: TM example. Bar chart of the negative and positive word counts; axis from 0 to 70.]

All four methods are based on word lexicons. Table 4:5 shows, for each lexicon, the number of words, their separation into neutral, positive and negative words, and the initial purpose. In all four cases the number of negative words exceeds the number of positive words. This might indicate why negative word counts perform better: the underlying dictionaries are more finely graded on the negative side.

Table 4:5 - Overview of the different lexicons

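| | AFINN | AFINN | BING | NRC | TM |
| --- | --- | --- | --- | --- | --- |
| Name | AFINN-96 | AFINN-111 | Opinion Lexicon | EmoLex | General Inquirer: H4 and H4Lvd |
| Initial purpose | Twitter tweets | Twitter tweets | Product reviews | Measuring of emotions | Multiple |
| Number of words | 1468 | 2477 | 6788 | 14182 | 11787 |
| Neutral | 1 | 1 | 0 | 0 | 0 |
| Positive | 515 | 878 | 2005 | 2312 | 1915 |
| Negative | 964 | 1598 | 4783 | 3324 | 2291 |
| Score | 1 - 5 | 1 - 5 | 0 or 1 | 0 or 1 | positive or negative |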

Note 4.10: The table illustrates the four different sentiment lexicons and their initial purpose.
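The imbalance between negative and positive entries can be quantified directly from the counts in Table 4:5. The short Python sketch below computes the number of negative words per positive word for each lexicon; the ratio is my own summary measure and is not reported in the table.

```python
# Positive and negative word counts as reported in Table 4:5.
LEXICON_COUNTS = {
    "AFINN-96": (515, 964),
    "AFINN-111": (878, 1598),
    "BING": (2005, 4783),
    "NRC": (2312, 3324),
    "TM": (1915, 2291),
}

# Ratio above 1 means the lexicon holds more negative than positive words.
ratios = {name: neg / pos for name, (pos, neg) in LEXICON_COUNTS.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} negative words per positive word")
```

A ratio above 1 for every lexicon is consistent with the observation that the dictionaries cover the negative side in more detail.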

4.6 RESULTS

In the following, I present the results for the three different subcorpora. The dependent variable is adjusted according to the focus of the corpus that has been used to construct the textual sentiment indicators.
