As already discussed in Section 5.1, target extraction plays an important role in sentiment analysis. The most important application is to extract clauses which are relevant towards providing the sentiment of the review. It is already established that opinion is expressed towards the target, thus any opinionated clause will have a target present in it. Thus, to extract any opinionated clause, we can extract clauses which have a target word present in it. Below are the clauses which were extracted from the camera review.
• the software was terrible
• It ’s batteries died all the time and lost all of the pictures . • The neck strap is not worth the difference
• And the price is right for the features and MP • This lens is one of my favorites
• Camera comes with the software
The clauses were extracted with the highlighted targets. It can been seen that the target-based extraction of the opinionated clauses works reasonably well. Since the pronominal resolution of the corpus was not done, clauses like “it was great” were not extracted. The other limitation of this approach is that the target containing term may not be subjective/opinionated all the times. It may be the case that the reviewer is just talking about the product without expressing any opinion on it. For example, the last clause in the above list has a target word camera but does not show any opinion. But this can be solved by extracting the clauses which contain both targets and the opinionated word. So a subjective clause will be the one which has the presence of not only the target word but also the word which expresses an opinion towards that target.
Conclusions and Directions
for Future Work
This thesis investigated the various aspects of opinion analysis, namely, opin- ion lexicon extraction, opinion classification (including multi-class), opinion target extraction and summarising opinions of reviews.
The opinion lexicon extraction and opinion classification focused on using the compositional property of opinionated text. The thesis described an additive model with constraints optimisation which is used for both opinion lexicon extraction and opinion classification.
The thesis described an unsupervised algorithm to acquire opinion targets from the reviews. It also investigated the relations between opinion targets and opinionated words to devise an algorithm for extraction of fine-grained subjective clauses.
The thesis also showed the use of feature selection methods to improve on a previous work of opinion classification.
6.1
Summary of Results and Contribution
6.1.1 Weighted Opinion Lexicon
The major part of the thesis revolves around the idea that in a given domain two opinionated words can have different weights, even if they share the same polarity. If such is the case, the thesis showed with examples how a weighted lexicon can be used to successfully identify the polarity of the opinion on a text. The thesis explained an effective algorithm to generate such weighted lexicon from opinionated text. The generated lexicon when used for opinion classification showed promising results. The result bolstered our claim that the opinion lexicon should not only contain the polarity of the words but should also contains weight of the polarity.
6.1.2 Linear Sum for Opinion Classification
The thesis showed how an opinionated text can be represented by a linear additive equation. This part of the work was influenced by the successful use of compositionality for opinion classification in [19]. Contrary to prior belief that opinion classification of a text cannot be done by linear sum of constituent opinion words, the thesis successfully built a model which used the linear sum of weighted opinion words for opinion classification.
6.1.3 Need for Multi-class Opinion Classification
The thesis showed that a text cannot just be positive and negative; some are more positive/negative than others. With examples, the thesis showed the need for multi-class opinion classification. The experiments showed that our additive model which worked well for binary opinion classification failed to produce impressive results for multi-class opinion classification. We also used SVM for the same task; the results obtained by SVM for multi-class classification were also very low on accuracy as compared to SVM’s result on binary classification. This led us to the conclusion that the irregularities in tagging a review as either 1 star or 2 star (5 star or 4 star) makes the multi-class opinion classification a very difficult problem.
6.1.4 Fine-grained subjectivity by exploiting opinion targets and opinion words relations
Many previous works used relations between opinion targets and opinion words to extract opinion targets and opinion words. The thesis showed that such relations can also be used to extract sentences/clauses which show an opinion towards a single feature/target of a topic. Using the output of syntactic parser, we developed hierarchical rules to successfully extract feature/target specific clauses/sentences from the whole review text. Since not all the words in a review text are opinionated and an opinion word modifies its target through certain syntactic modifiers, we can select a subset of words from the review text which will be subjective and the opinion will be targeted to a single feature.