
Enhanced aspect level opinion mining knowledge extraction and representation


Academic year: 2020


ENHANCED ASPECT LEVEL OPINION MINING KNOWLEDGE EXTRACTION AND REPRESENTATION

MAQBOOL RAMDHAN IBRAHIM AL-MAIMANI 

UNIVERSITI TEKNOLOGI MALAYSIA


ENHANCED ASPECT LEVEL OPINION MINING KNOWLEDGE EXTRACTION AND REPRESENTATION

MAQBOOL RAMDHAN IBRAHIM AL-MAIMANI

A thesis submitted in fulfilment of the requirements for the award of the degree of

Doctor of Philosophy (Computer Science)

Faculty of Computing Universiti Teknologi Malaysia

AUGUST 2015   

 


ACKNOWLEDGEMENT

In the name of Allah, Most Gracious and Most Merciful

All praise and thanks are for Allah, and peace and blessings be upon His messenger, Muhammad (S.A.W).

First of all, I am grateful to Almighty Allah, who gave me the strength to complete this thesis. All praise goes first to Him alone, as without His help I would not have been able to reach this successful end.

Then, I would like to express my sincere and special appreciation to my supervisor, Prof Dr Naomie bt Salim, who helped me a great deal to complete this research. Her valuable advice is unforgettable and spurred me to achieve this vital milestone of my thesis. I am speechless and cannot find the right words to express my thanks for her valuable input and guidance during my study.

I would also like to express my thanks to my colleagues at UTM and Oman Air, whose moral support helped ensure that I completed my research and submitted it on time.


ABSTRACT


ABSTRAK

There is a need to find more effective techniques to extract, classify, represent and summarize online customer opinions on products and services for better sentiment analysis. This thesis aims to enhance opinion extraction and representation. The study uses the SentiWordNet lexical resource, which is built specifically for opinion mining and is widely used in sentiment analysis. The study introduces an approach that uses Adjectives, Verbs, Adverbs and Nouns (AVAN) to analyze all types of opinion-related sentiment words, rather than being limited to adjectives and adverbs as in conventional approaches. Opinion representation is enhanced by capturing the main elements of an opinion in a predicate consisting of the opinion word, strength, score and category, which improves opinion representation and classification. Next, opinion accounting is introduced to summarize opinion scores at various group levels. In addition, this thesis introduces a new concept known as opinion strength by classifying it into specific opinion degrees. Scores are assigned to opinions based on the strength with which the opinion is expressed. Furthermore, as opinions are fuzzy in nature, this study shows that fuzzy logic is an effective technique to use because human reasoning contains vagueness. This is important because opinions should not be categorized only into classical Boolean sentiments. The study identifies SentiWordNet, AVAN, Opinion Strength and Fuzzy Logic as features for classifying customer reviews into a 5-class prediction model (Excellent, Good, Moderate, Poor and Bad). The results show that the Sequential Minimal Optimization classifier using these features yields an accuracy of 92%, higher than the earlier techniques of Support Vector Machine and Logistic Regression. In addition, the combination of AVAN, Opinion Strength and Fuzzy Logic outperforms SentiWordNet alone by 30% in accuracy.


TABLE OF CONTENTS

CHAPTER TITLE PAGE

DECLARATION ii

DEDICATION iii

ACKNOWLEDGEMENT iv

ABSTRACT v

ABSTRAK vi

TABLE OF CONTENTS vii

LIST OF TABLES xiii

LIST OF FIGURES xv

LIST OF ABBREVIATIONS xvi

LIST OF APPENDICES xvii

1 INTRODUCTION 1

1.1 Introduction 1

1.2 Problem Background 4

1.3 The Problem Statement 6

1.4 Research Objectives 8

1.5 Research Questions 8

1.6 Scope of Study 8

1.7 Significance and Contributions of the Study 10

1.8 Structure of the Study 12

1.9 Chapter Summary 13

2 LITERATURE REVIEW 15

2.1 Background 15


Representation – A survey of Previous Works 19

2.3.1 Item Extraction Techniques 20

2.3.2 Aspect-Level Extraction Techniques 22

2.3.3 Opinion Knowledge Representation 28

2.3.4 Discussion on Opinion Extraction and Representation 30

2.3.5 Sentiment Polarity Classification and Summarization 33

2.3.6 Strength of Sentiments (Senti Strength) 36

2.3.7 Summarization Techniques 39

2.3.8 Discussion on Sentiment Polarity Classification and Summarization 42

2.4 Opinion Mining Lexical Resources and Databases 45

2.4.1 WordNet and WordNet Effect 46

2.4.2 HowNet 46

2.4.3 ConceptNet 47

2.4.4 Amazon Mechanical Turk (AMT) 47

2.4.5 SenticNet 48

2.4.6 SentiWordNet 49

2.4.7 Why SentiWordNet? 50

2.5 Opinion Mining Domain Taxonomy 51

2.6 Building Datasets and Corpus for Opinion Mining 54

2.7 Fuzzy Logic 58

2.7.1 Fuzzy Logic System 59

2.7.2 Research on Fuzzy Logic 60

2.7.3 Challenges for Fuzzy-based approaches 64

2.7.4 Why Fuzzy Logic? 65

2.8 Chapter Summary 67

3 RESEARCH METHODOLOGY 73

3.1 Introduction 73

3.2 Research Operational Framework 74


3.2.2 Phase II: Experiment Setup 78

3.2.2.1 Knowledge Acquisition and Dataset Preparation for Domain Taxonomy 79

3.2.2.2 Algorithm Development 82

3.2.2.3 Overall Solution Design 87

3.2.2.4 Questionnaire 87

3.2.3 Phase III: Enhancing Aspect Level Opinion Mining Knowledge Extraction and Representation by using Opinion Predicates and opinion Senti Strength for Adjectives, Verbs, Adverbs and Nouns (AVAN) 88

3.2.3.1 Aspect-based Approach 88

3.2.3.2 Expert Opinions on Proposed Opinion Levels and Value Ranges 90

3.2.3.3 Proposing AVAN Structure 93

3.2.3.4 Building Opinion Predicate and Opinion Accounting 96

3.2.4 Enhancing Opinion Scoring by Using SentiWordNet and Opinion Senti Strengths 100

3.2.4.1 Study SentiWordNet structure and Scoring Methods 100

3.2.4.2 Define Scoring Formulas for AVAN Components 101

3.2.4.3 Enhancing AVAN scoring further 101

3.2.4.4 Define Opinion Senti Strengths 102

3.2.4.5 Enhance scoring for AVAN using Opinion Senti Strengths 103

3.2.5 Phase IV – Enriching Aspect level Opinion Mining Representation by Using Fuzzy Logic 104

3.2.5.1 Fuzzy Sets and Logic 105

3.2.6 Stages of Opinion Score Enhancements 108


4 ENHANCING ASPECT LEVEL OPINION MINING KNOWLEDGE EXTRACTION AND REPRESENTATION BY USING OPINION PREDICATES AND OPINION SENTI STRENGTH FOR ADJECTIVES, VERBS, ADVERBS AND NOUNS (AVAN) 112

4.1 Introduction 112

4.2 The AVAN Approach 113

4.3 Extraction of Opinion, Aspect and Senti Strength Words 113

4.4 Building Opinion Predicates 115

4.4.1 Opinion Predicates – Development Stages 117

4.4.2 Building Opinion Predicates – An Example 118

4.5 Building Opinion Accounting 121

4.6 Enhancing Opinion Scoring By SentiWordNet and Opinion Senti Strengths 124

4.6.1 SentiWordNet 125

4.6.2 The Proposed Approach 129

4.6.3 Opinion Senti Strengths 131

4.6.4 Enhancing Opinion Scoring 133

4.6.4.1 Adjective Scoring 134

4.6.4.2 Verb Scoring 136

4.6.4.3 Adverb Scoring 138

4.6.4.4 Noun Scoring 139

4.6.5 Opinion Senti Strength Scoring 140

4.6.6 Overall Opinion Strength 143

4.6.7 AVAN Scoring Algorithm 147

4.7 Applying the Proposed Approach 149

4.7.1 Opinion Extraction and Score Calculations 149

4.7.2 Opinion Accounting Calculation 157

4.7.3 Discussion and Analysis 159

4.8 Grouping Aspects into Categories 160

4.9 Results and Scoring Analysis 163


5 ENRICHING ASPECT LEVEL OPINION MINING REPRESENTATION BY USING FUZZY LOGIC 168

5.1 Introduction 168

5.2 Fuzzy Logic Components 169

5.3 Fuzzy Logic and Experiment setup 171

5.3.1 The Proposed Fuzzy Algorithm 171

5.3.2 Fuzzy Sets, Membership Functions and Fuzzy Rules 172

5.3.2.1 Positive and Negative Sets 173

5.3.2.2 Opinion Senti Strength Sets 174

5.3.2.3 Fuzzy Knowledgebase 175

5.3.2.4 Fuzzy Rules 178

5.4 The Detailed Fuzzy Process 179

5.4.1 A Trial Run of Fuzzy Process 182

5.4.1.1 Fuzzy Sets 182

5.4.1.2 Fuzzification 184

5.4.1.3 Fuzzy Rules 184

5.4.1.4 Defuzzification 185

5.5 Score Comparison 186

5.6 Analyzing the Fuzzy Logic Approach 189

5.7 Chapter Summary 191

6 CLASSIFICATION RESULTS 192

6.1 Introduction 192

6.2 Dataset 192

6.3 Experiment Setup 193

6.4 Results and Analysis 194

6.5 Benchmarking 199

6.6 Conclusion 201

7 SUMMARY AND CONCLUDING REMARKS 202

7.1 Summary 202


7.3 Enhancements / Limitations of This Study 208

7.4 Future Enhancements and Open Research Areas 208

REFERENCES 212


LIST OF TABLES

TABLE NO. TITLE PAGE

2.1 Corpus and Data Sets in an Alphabetical Order (Pang and Lee 2008) 56

3.1 Research Phases and Activities 75

3.2 Result of Expert Opinions 91

3.3 Opinion Levels 106

3.4 Fuzzy Knowledgebase 107

4.1 Record Structure of SentiWordNet Database 127

4.2 Sample SentiWordNet Data 127

4.3 Score Statistics Per Part of Speech (Esuli et al., 2006) 128

4.4 AVAN Scoring Formulas 144

4.5 Analysis of On-line Passenger Reviews (PARA-1) 150

4.6 Analysis of On-line Passenger Reviews (PARA-2) 152

4.7 Analysis of On-line Passenger Reviews (PARA-3) 155

4.8 Opinion Accounting (ALL PARAs) 157

4.9 Aspect Categorization for Oman Air In-flight Items and Services 162

5.1 Fuzzy Knowledgebase 176

5.2 Scores of Opinion Words Using SentiWordNet, AVAN


6.1 Performance of Selected Classifiers 194

6.2 Class-wise Performance Using All Selected Classifiers 196

6.3 Class-wise Overall Performance Using All Selected Classifiers 196

6.4 Comparison Among Selected Classification Features Using SMO 197

6.5 Performance Comparison (Jorge’s Vs.


LIST OF FIGURES

FIGURE NO. TITLE PAGE

2.1 Fuzzy Logic System 60

3.1 Research Operational Framework 74

3.2 Algorithm – Oman Air Corpus 85

3.3 Process for a Fuzzy-based System (Mamdani 1977) 105

4.1 Algorithm for Extracting Opinion Words and Predicates 114

4.2 Position of a Term in SentiWordNet (Esuli 2005) 125

4.3 Overall Extraction and Representation Process 131

4.4 AVAN Scoring and Opinion Accounting Algorithm 148

5.1 Fuzzy Logic Overall Structure 169

5.2 Fuzzy Logic Scoring Algorithm 172

5.3 Positive-Negative (PN) Membership Graph 173

5.4 Opinion Senti Strength (D) Graph 175

5.5 Opinion (OP) Graph 177

5.6 Positive-Negative (PN) Graph (X = 55 & Y = 1) 183

5.7 Positive-Negative Graph (X = 15 & Y = 0.82) 183

5.8 The Opinion Graph (Crisp value of 63 for “Less Clear”) 185

6.1 Classifiers’ Precision, Recall and F-Measure 194

6.2 Classifier-wise Accuracies 195

6.3 Accuracy of Classification Features Using SMO 198

6.4 Accuracy of Different Combination of Classification Features Using SMO 198

6.5 Accuracy of SentiWordNet with Individual


LIST OF ABBREVIATIONS

AVAN - Adjectives, Verbs, Adverbs, Nouns

CBA - Classification Based on Association

FL - Fuzzy Logic

OM - Opinion Mining

POS - Part of Speech


LIST OF APPENDICES

APPENDIX TITLE PAGE

A Review Of Extraction, Polarity And Summarization Tools, Techniques And Methods 229

B Sample Reviews About Oman Air Services 245

C Oman Air On-Board Item And Service Aspects (Data Dictionary) 251


CHAPTER 1

INTRODUCTION

1.1 Introduction

With the advent of Web 2.0, many new technologies and platforms have emerged, such as blogs, discussion forums and e-commerce sites, that enable people to procure products and services and to provide their opinions and feedback online. Consumers have at their disposal different types of information on the web, which enables them to share their experiences and opinions (positive or negative) on any product or service (Zabin and Jefferies 2008). Different people often express their experiences, opinions and thoughts on almost anything, on different occasions and in different places. One person may find a particular feature interesting, whereas it may not make sense to another.


triggered the need to enhance existing methods and techniques to extract and summarize opinions of different online reviews (Pang and Lee, 2004).

Opinion mining is a field that offers a number of tools and techniques for finding people’s or customers’ opinions on particular products, services, events, occasions and so on. The mining process can be as simple as learning the polarity (positive or negative) and sentiment of words, or as complicated as performing deep parsing of the data to identify the grammar and structure of sentences. Opinion mining seeks to extract useful information from opinionated sentences written in forums, articles, books, product reviews and similar sources, and then to present that information in textual summaries or visual presentations for quick reference and decision making.

Opinion mining is an important field as it helps to achieve the following objectives (Pang and Lee 2008):

• To understand customers’ feelings and opinions on a particular product or service, as expressed in everyday communications, in order to improve the quality and delivery of such goods; this will in turn help to enhance products and services.

• To scientifically record the different opinions and positions of people and various parties on a specific event, accident, incident, occasion etc. This will in turn help to put proper measures in place for handling such cases based on people’s opinions.

• To improve social services provided to the public by governments and social organizations by understanding the public’s demands and suggestions.

• To obtain the opinions that people express on goods and services.

• To enable companies, suppliers and manufacturers to use online reviews to respond to consumer insights by modifying their marketing messages, brand positioning, product development and other activities accordingly (Zabin and Jefferies 2008).


• The rise of many machine learning methods in natural language processing (NLP) and information retrieval (IR);

• The availability of datasets on which machine learning algorithms can be trained, due to the blossoming of the World Wide Web and, specifically, the development of review-aggregation websites;

• Awareness of the intellectual challenges and the commercial and intelligence applications that the area offers.

Opinionated English sentences can be Positive, Negative or Neutral. The following sentences illustrate a few of the challenges:

Jane Austen’s books madden me so that I can’t conceal my frenzy from the reader

=> Positive

The Power puff girls learned that with great power comes great responsibility

=> Neutral

At movies gonna watch the mechanic, hope this thing is good

=> Neutral

I don’t think I’ve seen one Adam Sandler movie that’s not good

=> Positive

If I don’t see Source Code this weekend it will have been a complete waste.

=> Positive

The battery life of this laptop is very very low

=> Negative


also affected by the presence of linguistic hedges such as modifiers (e.g., “not”), concentrators (e.g., “very,” “extremely”), and dilators (e.g., “quite,” “almost,” and “nearly”). Zadeh developed the concept of fuzzy linguistic variables and linguistic hedges that modify the meaning and intensity of their operands (Huynh et al., 2002). Recent papers in this field have also pointed out that the task of opinion mining is sensitive to such hedges and taking the effect of linguistic hedges into consideration can improve the efficiency of the sentiment classification task (Dalal and Zaveri 2013).
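To make the effect of hedges concrete, the following minimal sketch applies the classical Zadeh-style operators to a fuzzy membership degree: the complement for a modifier such as “not”, squaring for a concentrator, and the square root for a dilator. The function names, the mapping of “quite” to a square-root dilation and the sample degree of 0.8 are illustrative assumptions, not values or code taken from this thesis.

```python
import math


def negate(mu: float) -> float:
    """Modifier such as "not": complement of the membership degree."""
    return 1.0 - mu


def concentrate(mu: float) -> float:
    """Concentrator such as "very": squaring lowers partial memberships."""
    return mu ** 2


def dilate(mu: float) -> float:
    """Dilator such as "quite": the square root raises partial memberships."""
    return math.sqrt(mu)


# Hypothetical degree to which a review word belongs to the fuzzy set "good".
mu_good = 0.8

hedged = {
    "good": mu_good,
    "not good": negate(mu_good),
    "very good": concentrate(mu_good),
    "quite good": dilate(mu_good),
}

for phrase, degree in hedged.items():
    print(f"{phrase}: {degree:.2f}")
```

Because a concentrator lowers and a dilator raises any partial membership, the same opinion word ends up with a different degree depending on the hedge attached to it, which is the sensitivity the paragraph above describes.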

1.2 Problem Background

Opinions of people about a specific subject, product or service require effective techniques and methods to extract, classify, represent and summarize them for better decision making (Haji 2009; Dan 2010; Ayesha et al., 2012; Jayashri and Mayura 2013). There are still rich research areas which are not well addressed by scholars in this field (Haji 2009; Yongyong et al., 2010; Bjørkelund et al., 2012; Moraes et al., 2013). Among these areas are the following:

• The existing literature on opinion mining focuses mostly on adverbs and adjectives (Farah et al., 2007; Pang and Lee 2009; Hana 2011; Moraes et al., 2013). Other parts of speech, such as verbs and nouns, are not widely addressed, especially when analyzing and processing opinion scores. Opinion words can be adjectives, adverbs, verbs or nouns, yet the analysis of this combination had not been addressed as of the date of this research. Using such a combination to enhance opinion classification, scoring and summarization remains an important open problem (Bjørkelund et al., 2012; Moraes et al., 2013).


“opinion object”, “opinion creation date”, “opinion polarity” and “detailed opinions”). Neither of these representations captures all the characteristics of an opinion. Elements such as product or service features, the opinion score, the intensity of the opinion and the group to which a product feature belongs are missing and need to be considered when representing an extracted opinion. Moreover, better representation structures need to be designed to enable further processing of extracted opinions. Existing structures can be improved by adding the elements suggested above.

• Moreover, in order to properly identify or measure the strength of an opinion, it is important to convert the opinion to a value (called a score) which represents its strength. For example, the opinion word ‘beautiful’ can be given a score of 85 to show that it is a highly positive opinion. Existing studies on scoring opinion words, sentences and documents are still in their infancy, and the topic needs to be well researched. There are many permutations and options to be considered and studied when analyzing the area of opinion scoring in depth (Elomaa et al., 2011). SentiWordNet, a widely used lexical resource in the field of opinion mining, provides scores for various subjective words. However, its scoring method is basic, and in many cases the assigned scores do not reflect the actual meaning and level of opinion words. For example, the score for the word “large” is 0.50, which does not really reflect the meaning of this adjective. There is a need to enhance such scores by introducing better formulas that yield scores reflecting the meaning of the subjective word.


score of 85 is assigned to the opinion word “beautiful,” a higher score (like 90) needs to be assigned to the opinion “very beautiful”. Moreover, the opinion “extremely beautiful” needs to be assigned a much higher score. Such opinion strengths and the assignment of strength scores are not well addressed in the literature, and this is one of the major areas that need to be studied.

• Although opinions are fuzzy in nature, fuzzy logic has not been properly used to enhance the opinion mining field, as explained in Chapter 2. Since opinion words are vague, fuzzy logic is a well-known tool for addressing that vagueness for better sentiment analysis. Only very few studies have been done, namely by Samaneh et al. (2010) and Animesh et al. (2011). Animesh et al. did not utilize the features of fuzzy logic covering the fuzzification, fuzzy rules and defuzzification processes. These are very important phases of fuzzy logic, and implementing them can enhance the scoring of opinion words. Samaneh et al., on the other hand, did define membership functions for positive, negative and opinion intensities, but the values are predefined based on the classification module defined at the beginning. Moreover, the fuzzy rules were based on predefined linguistic patterns, which cannot be assured to cover all cases in reviews. In addition, the defuzzification crisp values are also predefined using a set of expected results. These issues need to be addressed, and the full features of fuzzy calculus need to be applied to produce better crisp opinion values or scores.
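The three phases named in the last item above (fuzzification, fuzzy rules and defuzzification) can be sketched end to end as follows. This is a minimal Mamdani-style illustration under stated assumptions: the triangular sets, the three-rule base, the 0 to 100 scale and all names (`INPUT_SETS`, `OUTPUT_SETS`, `RULES`, `fuzzy_score`) are hypothetical and are not the knowledgebase or algorithm developed in this thesis.

```python
from typing import Dict, Tuple

Triangle = Tuple[float, float, float]

# Assumed fuzzy sets on a 0-100 scale, given as (left foot, peak, right foot).
INPUT_SETS: Dict[str, Triangle] = {
    "negative": (-50.0, 0.0, 50.0),
    "neutral": (0.0, 50.0, 100.0),
    "positive": (50.0, 100.0, 150.0),
}
OUTPUT_SETS: Dict[str, Triangle] = {
    "poor": (-50.0, 0.0, 50.0),
    "average": (0.0, 50.0, 100.0),
    "excellent": (50.0, 100.0, 150.0),
}
# Assumed rule base: IF polarity IS <input set> THEN opinion IS <output set>.
RULES: Dict[str, str] = {"negative": "poor", "neutral": "average", "positive": "excellent"}


def triangle(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at the feet a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(score: float) -> Dict[str, float]:
    """Phase 1: map a crisp polarity score to a degree in every input set."""
    return {name: triangle(score, *abc) for name, abc in INPUT_SETS.items()}


def apply_rules(memberships: Dict[str, float]) -> Dict[str, float]:
    """Phase 2: each rule fires with the degree of its antecedent."""
    return {RULES[name]: degree for name, degree in memberships.items()}


def defuzzify(firing: Dict[str, float]) -> float:
    """Phase 3: centroid of the clipped (min) and aggregated (max) output sets."""
    numerator = denominator = 0.0
    for y in range(0, 101):
        mu = max(min(firing[name], triangle(y, *abc)) for name, abc in OUTPUT_SETS.items())
        numerator += y * mu
        denominator += mu
    return numerator / denominator if denominator else 0.0


def fuzzy_score(score: float) -> float:
    """Run all three phases and return a crisp opinion value."""
    return defuzzify(apply_rules(fuzzify(score)))


for crisp_input in (20, 50, 80):
    print(crisp_input, "->", round(fuzzy_score(crisp_input), 1))
```

In this sketch the crisp output is computed from the membership functions and rules rather than looked up from predefined values, which is the distinction the item above draws with respect to earlier work.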

1.3 The Problem Statement

In view of the problem background and based on previous approaches, it can be concluded that the following major problems are encountered and need to be further improved:


powerful techniques for the betterment of opinion mining and sentiment analysis.

• The lack of proper representation of extracted opinions for better scoring and summarization.

• Existing scoring methods are dependent on SentiWordNet, with limited improvement in the use of scoring formulas.

• Opinion intensity, degree or strength of expressed sentiment is judged by the use of adverbs. Opinions can be expressed at different senti strengths and levels by using words other than adverbs, and such senti strengths are not properly measured.

• Fuzzy logic is an important tool which was developed to address vague and unclear problems. Hence, this tool can be used to enhance opinion representation by placing the crisp values it produces in opinion predicates.
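As a rough illustration of the strength problem listed above, the sketch below raises a base opinion score according to an accompanying strength word. Only the base score of 85 for “beautiful” comes from the example given earlier in this chapter; the headroom-share rule, the share values and the names `STRENGTH_SHARE` and `strengthened_score` are assumptions made purely for illustration and are not the scoring formulas of this thesis.

```python
from typing import Optional

# Base score taken from the example in the text (0-100 scale assumed).
BASE_SCORES = {"beautiful": 85.0}

# Hypothetical strength lexicon: the share of the remaining headroom
# (the distance from the base score to 100) that each strength word adds.
STRENGTH_SHARE = {"very": 0.3, "extremely": 0.6}


def strengthened_score(opinion_word: str, strength_word: Optional[str] = None) -> float:
    """Return the base score, moved toward 100 when a strength word is present."""
    base = BASE_SCORES[opinion_word]
    if strength_word is None:
        return base
    share = STRENGTH_SHARE[strength_word]
    return base + share * (100.0 - base)


for strength in (None, "very", "extremely"):
    phrase = "beautiful" if strength is None else f"{strength} beautiful"
    print(phrase, round(strengthened_score("beautiful", strength), 1))
```

Scaling by the remaining headroom keeps every strengthened score within the 0 to 100 range while preserving the ordering of plain, “very” and “extremely” forms.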

Hence, this research focuses on resolving the above problems by addressing the following questions and issues:

• Can parts of speech other than adjectives and adverbs be used to enhance opinion mining and opinion scoring, especially if combined with tools like SentiWordNet and other effective techniques?

• How can knowledge extraction of opinions be enhanced?

• How can representation of extracted opinions be improved?

• Can extracted opinions be represented in vectors called predicates?

• How can the strength of an opinion be extracted and represented?

• How can summarization of opinions be enhanced?

• How can SentiWordNet be used to score opinion words and opinion strengths?


1.4 Research Objectives

This research is intended to meet the following key objectives:

• To enhance aspect level opinion mining knowledge extraction by introducing opinion predicates for Adjectives, Verbs, Adverbs and Nouns (AVAN).

• To enhance aspect level opinion mining knowledge representation by introducing opinion Senti Strength.

• To enrich aspect level opinion mining representation by using fuzzy logic.

1.5 Research Questions

This study will answer the following questions:

• Can aspect level opinion mining knowledge extraction be enhanced by introducing opinion predicates for Adjectives, Verbs, Adverbs and Nouns (AVAN)?

• How can aspect level opinion mining knowledge representation be enhanced by introducing opinion senti strength?

• How can aspect level opinion mining representation be enriched by using fuzzy logic?

1.6 Scope of the Study


Sentiment analysis can be carried out at the document, sentence and aspect levels. This study focuses on aspect level opinion mining: its main aim is to enhance the extraction and representation of opinion mining knowledge at the aspect level only. This thesis does not address sentiment analysis at the document and sentence levels.

This study handles explicit opinions only and does not deal with other types of opinions such as hidden, emotion, implicit, spam and sarcastic opinions. Hidden opinions are opinions that are not explicitly stated in the sentence, as in “I have stayed in this hotel more than 10 times.” This implies that the person likes the hotel, but it is not explicitly stated. Spam opinions are fake opinions made about a product or service for the purpose of spreading false information. Sarcastic opinions are expressed without using sentiment words and in an opposite manner, as in “What a great car! It stopped working in two days.” This example also expresses an emotion opinion, a type which is likewise not covered in this thesis. Emotion opinions are opinions that are expressed with emotions and are not explicitly stated (Bing Liu 2012).

In addition to the above, this thesis does not address context-aware sentiment analysis, in which the meaning of an opinion word can change based on the context in which it is used. A word can be viewed as positive in one situation and as negative in another. For example, the word ‘long’ is positive if it is used for the battery life of a mobile phone. On the other hand, ‘long’ has a negative meaning if it is used for a process that takes a long time. In many cases, it is difficult to distinguish between the two contexts. Another challenge is that in a few cases opinion words also occur in objective sentences like “it is a long distance between Florida and California.” Here the word ‘long’ is used in a factual sentence. This thesis does not cover the “context awareness” of opinions used in text.


In the first part, the thesis looks at opinion mining through the extraction and representation of data as opinion predicates, and this representation is then enhanced by producing opinion scoring and accounting. In the second part, fuzzy logic analysis is used as a supporting method to enhance the representation of opinions by determining their polarity and strength.

In addition to the above, this study uses SentiWordNet as a lexical resource in order to enhance the extraction, representation and scoring of opinions. This is due to the following reasons (Esuli and Sebastiani 2006; Elomaa et al., 2011; Kennedy et al., 2002; Burns et al., 1990).

• SentiWordNet is built specifically for the field of opinion mining and sentiment analysis. Hence, it is a powerful resource for extraction, representation and scoring of opinions.

• SentiWordNet is built on the well-known lexical resource WordNet, which groups words into sets of synonyms called synsets and records the relations among these synonym sets or their members (Pang and Lee 2008).

• SentiWordNet is widely used in the opinion mining literature for opinion representation, classification and scoring. Hence, using SentiWordNet makes it more appropriate and accurate to compare the work of this thesis with previous works.

• SentiWordNet provides three scores for each word or synset (Positive, Negative and Objective). These scores help a great deal in analyzing and classifying sentiments easily.

• All other existing lexical resources like HowNet, ConceptNet, SenticNet, WordNet and other dictionaries lack the above characteristics.
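The three-score structure mentioned in the list above can be sketched as follows. In SentiWordNet the positive, negative and objective scores of a synset sum to 1, so the objective score is derived here from the other two. The lexicon entries below are invented placeholders rather than real SentiWordNet values, and averaging the net polarity over a word's synsets is one common convention assumed for this sketch, not the scoring formula of this thesis.

```python
from typing import Dict, List, NamedTuple


class SynsetScores(NamedTuple):
    """Positive and negative scores of one synset (each between 0 and 1)."""
    positive: float
    negative: float

    @property
    def objective(self) -> float:
        # The three scores sum to 1, so the objective score is the remainder.
        return 1.0 - self.positive - self.negative

    @property
    def polarity(self) -> float:
        # Net polarity in [-1, 1]: positive minus negative.
        return self.positive - self.negative


# Hypothetical toy lexicon: placeholder values, NOT taken from SentiWordNet.
TOY_LEXICON: Dict[str, List[SynsetScores]] = {
    "good": [SynsetScores(0.75, 0.0), SynsetScores(0.5, 0.0)],
    "poor": [SynsetScores(0.0, 0.625), SynsetScores(0.125, 0.5)],
    "seat": [SynsetScores(0.0, 0.0)],
}


def word_polarity(word: str) -> float:
    """Average the net polarity over all synsets listed for the word."""
    senses = TOY_LEXICON[word]
    return sum(s.polarity for s in senses) / len(senses)


def label(word: str) -> str:
    """Map the averaged polarity to a coarse sentiment label."""
    polarity = word_polarity(word)
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "objective"


for word in TOY_LEXICON:
    print(word, word_polarity(word), label(word))
```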

1.7 Significance and Contributions of the Study


• The main contribution of this study is the enhanced way to represent aspect level opinion mining knowledge using a structure called ‘Opinion Predicate.’ A predicate is a vector consisting of the main components of an opinion in an ordered way. Such a rich structure will empower opinion processing, analysis and scoring. A few previous studies represented opinions in tuples which hold only very basic elements of an opinion. This study introduces an opinion predicate structure which covers important elements of an opinion (such as opinion strength, score, opinion aspect and aspect group) that were not covered in previous representations.

• This is the first study to introduce the combination of AVAN (Adjectives, Verbs, Adverbs and Nouns) for extracting, classifying, scoring and summarizing extracted opinion words by using SentiWordNet, opinion senti strength and fuzzy logic tools and techniques. Most previous studies focused on adjectives and adverbs as the main source of opinion words. Nouns and verbs can also be used as opinion words, but have been considered by only a few previous studies. The reasons for selecting SentiWordNet are highlighted above and in chapter two, after analyzing various opinion mining lexical resources and databases.

• This study introduces opinion strength or degree to properly capture the strength of an opinion (referred to in the literature as ‘Senti Strength’). As of the date of this thesis, no study had introduced opinion strength for opinion words of all four parts of speech (AVAN), which is an important contribution of this research. Previous studies used only adverbs as words that can intensify opinion adjectives.


• Using fuzzy logic is another major contribution, which further enhances the representation of opinions by adding a more accurate score to the opinion predicate. This study has utilized the power of fuzzy logic to analyze opinion words in depth. The reasons behind using fuzzy logic are given in section 2.7. All three major phases of fuzzy logic (Fuzzification, Fuzzy Rules and Defuzzification) are implemented. In addition, this research demonstrates the importance of an integrative use of NLP and fuzzy logic as a basis for modeling abstract relationships. None of the existing studies has done such multi-level analysis using fuzzy logic in the way applied in this thesis.
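The opinion predicate named in the contributions above, together with a simple form of opinion accounting, might be sketched as below. The element list (aspect, aspect group, opinion word, strength, score) follows the text, but the field names, their order, the sample reviews and the use of a per-group mean as the accounting rule are all assumptions made for illustration; they are not the structure or formulas defined in later chapters.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class OpinionPredicate:
    """One extracted opinion, held as an ordered record of its main elements."""
    aspect: str                   # item or service the opinion is about
    aspect_group: str             # category to which the aspect belongs
    opinion_word: str             # adjective, verb, adverb or noun carrying the opinion
    strength_word: Optional[str]  # word expressing opinion strength, if any
    score: float                  # opinion score on an assumed 0-100 scale


def opinion_accounting(predicates: Iterable[OpinionPredicate]) -> Dict[str, float]:
    """Summarize scores per aspect group (assumed rule: arithmetic mean)."""
    by_group: Dict[str, List[float]] = defaultdict(list)
    for predicate in predicates:
        by_group[predicate.aspect_group].append(predicate.score)
    return {group: mean(scores) for group, scores in by_group.items()}


# Hypothetical predicates, as if extracted from in-flight service reviews.
sample = [
    OpinionPredicate("seat", "Cabin", "comfortable", "very", 88.0),
    OpinionPredicate("legroom", "Cabin", "tight", None, 30.0),
    OpinionPredicate("meal", "Catering", "tasty", None, 80.0),
]

for group, summary in opinion_accounting(sample).items():
    print(group, summary)
```

Keeping the aspect group inside each predicate is what allows scores to be rolled up at the group level without re-reading the original review text.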

1.8 Structure of the Study

The thesis is laid out in seven chapters, as follows:

• The first chapter introduces the reader to the concept of opinion mining and describes the aims and objectives of the study. It also indicates the scope of the research and sheds light on the contributions of this study.

• The study proper begins in chapter two with a review of the existing literature in this field. The second chapter analyzes in depth the existing approaches, methods, techniques and challenges found in the area of sentiment analysis.

• Chapter three reviews and explains the stages of this research and the methodology adopted to complete it effectively. It also highlights the various techniques used to meet the set objectives. All definitions and experiment setups are explained in this chapter.


SentiWordNet, its structure and its scoring norms. In addition, this chapter explains the AVAN scoring formulas and the enhancements made to effectively score opinion words using opinion senti strength.

• Chapter five explains how aspect level opinion mining knowledge representation can be enriched by using fuzzy logic. All stages of the fuzzy process are explained with examples to illustrate these concepts.

• Chapter six gives detailed results of classifying customer reviews with the SMO classifier, using SentiWordNet, AVAN, opinion senti strength and fuzzy logic as classification features.

• Chapter seven summarizes the thesis, highlights all the contributions and lists major open issues and challenges that need to be addressed in future research.

1.9 Chapter Summary

Opinion mining aims to track and summarize the opinions of the public about a product or a service. Research on opinion mining is a vast area covering many aspects, among which is how to properly extract, classify, represent and score opinion strengths in order to make proper decisions. Opinion mining systems should study the degrees and strengths of opinions rather than classifying them as either zero or one. Existing studies on scoring opinion words, sentences and documents still have a long way to go. There are many rich areas that need to be addressed and solved in order to scientifically present proper opinions to both service or product providers and customers. This study was undertaken in response to these needs and to the many open areas in this rich field of research.


people may like a few features and dislike others, and hence polarity analysis can be measured more accurately. Aspect-level analysis also allows opinions to be captured and better represented together with sentiment strengths. The representation of aspect-level sentiments as predicates is the main contribution of this thesis. These predicates also capture sentiment strengths, which can be exploited for various uses. To achieve this, powerful resources and tools like SentiWordNet and fuzzy logic are utilized in order to enhance opinion knowledge extraction and representation. Furthermore, this thesis has proposed novel ideas and interesting directions for coupling fuzzy logic with an integrated use of NLP in representing and characterizing aspect-level sentiments.

In addition to the above, and since this thesis focuses on aspect-level analysis of sentiments, there is a clear connection between opinion mining and both SentiWordNet and fuzzy logic. SentiWordNet is a lexical resource, or database, which assigns positive and negative scores to English words. Hence, this resource can help a great deal in analyzing opinions at the aspect level. Fuzzy logic, on the other hand, is built to solve complex problems such as handling the vagueness of words based on defined classes, a knowledgebase and fuzzy rules. Hence, fuzzy logic is an important tool for addressing the fuzzy nature of opinions. In view of the above, opinion mining, SentiWordNet and fuzzy logic form an excellent combination for enhancing opinion extraction and representation at the aspect level.

(36)

REFERENCES

Abbasi, A. 2007. Affect intensity analysis of dark web forums. Proceedings of the 2007 Intelligence and Security Informatics, 2007 IEEE: IEEE, 282-288. Amati, G., Amodeo, G., Capozio, V., Gambosi, G. and Gaibisso, C. 2010. Assessing

the Quality of Opinion Retrieval Systems. Proceedings of the 2010 Web Intelligence/IAT Workshops, 235-238.

Andreevskaia, A. and Bergler, S. 2006. Mining WordNet for a Fuzzy Sentiment: Sentiment Tag Extraction from WordNet Glosses. Proceedings of the 2006 EACL, 209-216.

Al Masum, S. M., Prendinger, H. and Ishizuka, M. 2007. SenseNet: A linguistic tool to visualize numerical-valence based sentiment of textual data. Proceedings of the 2007 Proceedings of the International Conference on Natural Language Processing (ICON), 147-152.

Asbagh, M. J., Sayyadi, M. and Abolhassani, H. 2009. Blog Summarization for Blog Mining Software Engineering, Artificial Intelligence, Networking and

Parallel/Distributed Computing (pp. 157-167)Springer.

Balahur, A. and Montoyo, A. 2008. A feature dependent method for opinion mining and classification. Proceedings of the 2008 Natural Language Processing and Knowledge Engineering, 2008. NLP-KE'08. International Conference on: IEEE, 1-7.

Balahur, A., Kabadjov, M., Steinberger, J., Steinberger, R. and Montoyo, A. 2012. Challenges and solutions in the opinion summarization of user-generated content. Journal of Intelligent Information Systems. 39(2), 375-398. Balahur, A., Kozareva, Z. and Montoyo, A. 2009. Determining the polarity and

(37)

Beata Beigman Klebanov , Jill Burstein , Nitin Madnani , Adam Faulkner , Joel Tetreault (2012). Building subjectivity lexicon(s) from scratch for essay data, Proceedings of the 13th international conference on Computational

Linguistics and Intelligent Text Processing, March 11-17, 2012, New Delhi, India

Benamara, F., Cesarano, C., Picariello, A., Recupero, D. R. and Subrahmanian, V. S. (2007). Sentiment Analysis: Adjectives and Adverbs are better than

Adjectives Alone. Proceedings of the 2007 ICWSM,

Bestgen, Y., Fairon, C. and Kerves, L. 2004. Un barometre affectif effectif: Corpus de référence et méthode pour déterminer la valence affective de phrases. Journées internationales d’analyse statistique des donnés textuelles (JADT). 182-191.

Bhattacharyya, D., Das, P., Mitra, K., Ganguly, D., Mukherjee, S., Bandyopadhyay, S. K. and Kim, T.-h. 2009. A Novel Approach for Refinement of Corpus in the Field of Opinion Mining. Proceedings of the 2009 Future Networks, 2009 International Conference on: IEEE, 281-285.

Bhattacharyya, D., Das, P., Mitra, K., Mukherjee, S., Ganguly, D., Bandyopadhyay, S. K. and Kim, T.-h. 2009. Refine Crude Corpus for Opinion Mining. Proceedings of the 2009 Computational Intelligence, Communication

Systems and Networks, 2009. CICSYN'09. First International Conference on: IEEE, 17-22.

Binali, H., Potdar, V. and Wu, C. 2009. A state of the art opinion mining and its application domains. Proceedings of the 2009 Industrial Technology, 2009. ICIT 2009. IEEE International Conference on: IEEE, 1-6.

Bjørkelund, E., Burnett, T. H. and Nørvåg, K. 2012. A study of opinion mining and visualization of hotel reviews. Proceedings of the 2012 Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services: ACM, 229-238.

(38)

Service Reviews. WWW2008 Workshop : Natural Language Processing Challenges in the Information Explosion Era (NLPIX 2008).

Blitzer, J., Dredze, M. and Pereira, F. 2007. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. Proceedings of the 2007 ACL, 440-447.

Bouchlaghem, R., Elkhlifi, A. and Faiz, R. (2010). Automatic extraction and classification approach of opinions in texts. Proceedings of the 2010

Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on: IEEE, 918-922.

Breck, E., Choi, Y. and Cardie, C. 2007. Identifying Expressions of Opinion in Context. Proceedings of the 2007 IJCAI, 2683-2688.

Cambria, E., Speer, R.; Havasi, C., and Hussain, A. (2010). SenticNet: A publicly available semantic resource for opinion mining. In Proceedings of AAAI CSK, 14–18.

Cardie, C., Farina, C. and Bruce, T. 2006. Using natural language processing to improve erulemaking: project highlight. Proceedings of the 2006 Proceedings of the 2006 international conference on Digital government research: Digital Government Society of North America, 177-178.

Carenini, Giuseppe, Raymond T. Ng, and Ed Zwart. 2005. Extracting Knowledge from Evaluative Text. In Proceedings of the 3rd international conference on Knowledge captur.

Chang R, Pimentel S, Svistunov A (2011). Sentiment analysis of Occupy Wall Street Tweets.

http://cs229.stanford.edu/proj2011/ChangPimentelSvistunov-SentimentAnalysisOfOccupyWallStreetTweets.pdf

(39)

Culibrk, D., Mirkovic, M., Lugonja, P. and Crnojevic, V. (2010). Mining web videos for video quality assessment. Proceedings of the 2010 Soft Computing and Pattern Recognition (SoCPaR), 2010 International Conference of: IEEE, 75-80.

Das, A. and Bandyopadhyay S. (2011). Towards the Global SentiWordNet. Pro-ceedings of the 24th Paci_c Asia Conference on Language, Information and Com-putation, pp. 799-808.

Das, S. and Chen, M. 2001. Yahoo! for Amazon: Extracting market sentiment from stock message boards. Proceedings of the 2001 Proceedings of the Asia Pacific finance association annual conference (APFA): Bangkok, Thailand, 43.

Dave, K., Lawrence, S. and Pennock, D. M. 2003. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews.

Proceedings of the 2003 Proceedings of the 12th international conference on World Wide Web: ACM, 519-528.

Davies, J. 2010. Lightweight ontologies Theory and Applications of Ontology: Computer Applications (pp. 197-229)Springer.

de Albornoz, J. C., Plaza, L., Gervás, P. and Díaz, A. 2011. A joint model of feature mining and sentiment analysis for product review rating Advances in

information retrieval (pp. 55-66)Springer.

Ding, X. and Liu, B. 2007. The utility of linguistic rules in opinion mining.

Proceedings of the 2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval: ACM, 811-812.

Ding, X., Liu, B., & Yu, P. S 2008. A holistic lexicon based approach to opinion mining. In Proceedings of the international conference on Web search and web data mining. Palo Alto, California, USA: ACM, 231-240

Dragoni, M., da Costa Pereira, C. and Tettamanzi, A. G. 2012. A conceptual

(40)

using light ontologies. Expert Systems with applications. 39(12), 10376-10388.

Esuli, A. and Sebastiani, F. 2005. Determining the semantic orientation of terms through gloss classification. Proceedings of the 2005 Proceedings of the 14th ACM international conference on Information and knowledge management: ACM, 617-624.

Esuli, A. and Sebastiani, F. 2006. Determining Term Subjectivity and Term Orientation for Opinion Mining. Proceedings of the 2006 EACL, 2006. Esuli, A. and Sebastiani, F. 2006. Sentiwordnet: A publicly available lexical resource

for opinion mining. Proceedings of the 2006 Proceedings of LREC: Citeseer, 417-422.

Feng, S., Zhang, M., Zhang, Y. and Deng, Z. 2010. Recommended or Not Recommended? Review Classification through Opinion Extraction.

Proceedings of the 2010 Web Conference (APWEB), 2010 12th International Asia-Pacific: IEEE, 350-352.

Furuse, O., Hiroshima, N., Yamada, S. and Kataoka, R. 2007. Opinion Sentence Search Engine on Open-Domain Blog. Proceedings of the 2007 IJCAI, 2760-2765.

Gamon, M., Aue, A., Corston-Oliver, S. and Ringger, E. 2005. Pulse: Mining customer opinions from free text Advances in Intelligent Data Analysis VI (pp. 121-132)Springer.

Ganapathibhotla, Murthy and Bing Liu (2008). Mining opinions in comparative sentences. in Proceedings of International Conference on Computational Linguistics (COLING-2008).

(41)

Gautami T. Naganna S (2014). Opinion Mining: A Review. International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 16, pp. 1625-1635

Ghose, A., Ipeirotis, P. G. and Sundararajan, A. 2007. Opinion mining using econometrics: A case study on reputation systems. Proceedings of the 2007 Annual Meeting-Association for Computational Linguistics, 416.

Greene, S. C. 2007. Spin: Lexical semantics, transitivity, and the identification of implicit sentiment. ProQuest.

Godbole, N., Srinivasaiah, M. and Skiena, S. 2007. Large-Scale Sentiment Analysis for News and Blogs. ICWSM. 7, 21.

Gomez-Perez, A. (1999), Evaluation of taxonomic knowledge in ontologies and knowledge bases, in `Proceedings of the 12th Ban® Knowledge Acquisition for Knowledge-Based Systems Workshop, Ban, Alberta, Canada'.

Gu, Y. H. and Yoo, S. J. 2009. Rules for Mining Comparative Online Opinions. Proceedings of the 2009 Computer Sciences and Convergence Information Technology, 2009. ICCIT'09. Fourth International Conference on: IEEE, 1294-1299.

Guohong Fu and Xin Wang (2010). Chinese Sentence-Level Sentiment

Classification Based on Fuzzy Sets. Coling: Poster Volume, Beijing, pp 312-319.

Ha, Q.-T., Vu, T.-T., Pham, H.-T. and Luu, C.-T. 2011. An upgrading feature-based opinion mining model on vietnamese product reviews Active Media

Technology (pp. 173-185)Springer.

Hagen-Zanker, Jessica; Duvendack, Maren; Mallett, Richard; Slater, Rachel; Carpenter, Samuel; Tromme, Mathieu (January 2012). "Making systematic reviews work for international development research". Overseas

(42)

Han, P., Du, J. and Chen, L. 2010. Web opinion mining based on sentiment phrase classification vector. Proceedings of the 2010 Network Infrastructure and Digital Content, 2010 2nd IEEE International Conference on: IEEE, 308-312. Hatzivassiloglou, V. and McKeown, K. R. 1997. Predicting the semantic orientation

of adjectives. Proceedings of the 1997 Proceedings of the 35th annual meeting of the association for computational linguistics and eighth conference of the european chapter of the association for computational linguistics: Association for Computational Linguistics, 174-181.

Hiroshi, K., Tetsuya, N. and Hideo, W. 2004. Deeper sentiment analysis using machine translation technology. Proceedings of the 2004 Proceedings of the 20th international conference on Computational Linguistics: Association for Computational Linguistics, 494.

Horrigan, J. A. 2008. Online shopping. Pew Internet & American Life Project Report. 36.

Hu, M. and Liu, B. 2004. Mining and summarizing customer reviews. Proceedings of the 2004 Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining: ACM, 168-177.

Hu, Minqing and Liu, Bing 2004. Mining Opinion Features in Customer Reviews. AAAI 2004, pages 755-760.

Hu, W., Gong, Z. and Guo, J. 2010. Mining product features from online reviews. Proceedings of the 2010 e-Business Engineering (ICEBE), 2010 IEEE 7th International Conference on: IEEE, 24-29.

Hu, X. and Wu, B. (2009). Classification and summarization of pros and cons for customer reviews. Proceedings of the 2009 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology-Volume 03: IEEE Computer Society, 73-76. Hu, X. and Wu, B. 2006. Automatic keyword extraction using linguistic features.

(43)

Huang, Y., Zhu, L. and Zhang, W. (2002). [From clone selection to danger model]. Zhongguo yi xue ke xue yuan xue bao. Acta Academiae Medicinae Sinicae. 24(4), 430-433.

Houen, S. 2011. Opinion mining with semantic analysis. Online:< http://www. diku. dk/forskning/Publikationer/specialer/2011/specialerapport_final_Soren_Houe n. pdf.

Jayashri, K. and Mayura, K. 2013. Machine Learning Algorithms for Opinion Mining and Sentiment Classification. International Journal of Scientific and Research Publications, Volume 3, Issue 6, June 2013 1 ISSN 2250-3153 www.ijsrp.org.

Jeong, H., Shin, D. and Choi, J. 2011. Ferom: Feature extraction and refinement for opinion mining. ETRI Journal. 33(5), 720-730.

Jia, W.-J., Zhang, S., Xia, Y.-J., Zhang, J. and Yu, H. 2010. A novel product features categorize method based on twice-clustering. Proceedings of the 2010 Web Information Systems and Mining (WISM), 2010 International Conference on: IEEE, 281-284.

Jin, X., Li, Y., Mah, T. and Tong, J. 2007. Sensitive webpage classification for content advertising. Proceedings of the 2007 Proceedings of the 1st international workshop on Data mining and audience intelligence for advertising: ACM, 28-33.

John, C. 2000. Fast Training of Support Vector Machines using Sequential Minimal Optimization. Platt. Microsoft Research. WA 98052, USA,

[email protected]. http://www.research.microsoft.com/_jplatt Jorge, C. A., Laura, P., Pablo G. and Alberto D. 2011, A Joint Model of Feature

Mining and Sentiment Analysis for Product Review Rating. ECIR, LNCS 6611, pp. 55–66, Springer-Verlag Berlin Heidelberg 2011.

(44)

Kar, A. and Mandal, D. P. 2011. Finding opinion strength using fuzzy logic on web reviews. International Journal of Engineering and Industries. 2(1), 37-43. Li, C.-h. (2009). Sentence Factorization for Opinion Feature Mining.

Proceedings of the 2009 Computational Aspects of Social Networks, 2009. CASON'09. International Conference on: IEEE, 129-132.

Kim P. (2006), “The forrester wave: Brand monitoring, Q3 2006,” Forrester Wave (white paper).

Kim, S.-M. and Hovy, E. 2004. Determining the sentiment of opinions. Proceedings of the 2004 Proceedings of the 20th international conference on

Computational Linguistics: Association for Computational Linguistics, 1367. Kobayashi, N., Iida, R., Inui, K. and Matsumoto, Y. 2006. Opinion mining as

extraction of attribute-value relations New Frontiers in Artificial Intelligence (pp. 470-481)Springer.

Kobayashi, N., Inui, K. and Matsumoto, Y. 2007. Opinion mining from web documents: Extraction and structurization. Information and Media Technologies. 2(1), 326-337.

Ku, L.-W., Lo, Y.-S. and Chen, H.-H. 2007. Using polarity scores of words for sentence-level opinion extraction. Proceedings of the 2007 Proceedings of NTCIR-6 workshop meeting: Citeseer, 316-322.

Lau, R. Y., Lai, C. C., Ma, J. and Li, Y. 2009. Automatic domain ontology extraction for context-sensitive opinion mining. ICIS 2009 Proceedings. 35-53.

Lin, W.-H., Wilson, T., Wiebe, J. and Hauptmann, A. 2006. Which side are you on?: identifying perspectives at the document and sentence levels. Proceedings of the 2006 Proceedings of the Tenth Conference on Computational Natural Language Learning: Association for Computational Linguistics, 109-116. Lita, L. V., Schlaikjer, A. H., Hong, W. and Nyberg, E. 2005. Qualitative dimensions

(45)

Liu, B., Hu, M. and Cheng, J. 2005. Opinion observer: analyzing and comparing opinions on the web. Proceedings of the 2005 Proceedings of the 14th international conference on World Wide Web: ACM, 342-351.

Liu, B. 2007. Web data mining: exploring hyperlinks, contents, and usage data. Springer Science & Business Media.

Liu, B. 2012. Sentiment Analysis and Opinion Mining Morgan & Claypool Publishers. May.

Li, C.-h. 2009. Sentence Factorization for Opinion Feature Mining. Proceedings of the 2009 Computational Aspects of Social Networks, 2009. CASON'09. International Conference on: IEEE, 129-132.

Liu, C.-L., Hsaio, W.-H., Lee, C.-H., Lu, G.-C. and Jou, E. 2012. Movie rating and review summarization in mobile environment. Systems, Man, and

Cybernetics, Part C: Applications and Reviews, IEEE Transactions on. 42(3), 397-407.

Liu, D., Ma, S.-x. and Guo, Z.-h. 2010. Research on the Method of Opinion Mining Based on Danger Theory. Proceedings of the 2010 2010 Second International Workshop on Education Technology and Computer Science, 385-387.

Lloret, E., Balahur, A., Gómez, J. M., Montoyo, A. and Palomar, M. (2012). Towards a unified framework for opinion retrieval, mining and

summarization. Journal of Intelligent Information Systems. 39(3), 711-747. Liu, J., Seneff, S. and Zue, V. 2012. Harvesting and Summarizing User-Generated

Content for Advanced Speech-Based HCI. Selected Topics in Signal Processing, IEEE Journal of. 6(8), 982-992.

Lo, Y. W. and Potdar, V. 2009. A review of opinion mining and sentiment

classification framework in social networks. Proceedings of the 2009 Digital Ecosystems and Technologies, 2009. DEST'09. 3rd IEEE International Conference on: Ieee, 396-401.

(46)

Màrquez, L., Carreras, X., Litkowski, K. C. and Stevenson, S. 2008. Semantic role labeling: an introduction to the special issue. Computational linguistics. 34(2), 145-159.

Mei, I.-H., Mi, H. and Quiaot, J. 2007. Sentiment Mining and Indexing in Opinmind. Proceedings of the 2007 ICWSM: Citeseer,

Miao, Q., Li, Q. and Dai, R. 2008. A unified framework for opinion retrieval. Proceedings of the 2008 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology-Volume 01: IEEE Computer Society, 739-742.

Min, Hye-Jin and Jong C. Park (2011). Detecting and Blocking False SentimentPropagation. in Proceedings of the 5th International Joint Conference onNatural Language Processing (IJCNLP-2010)

Mishne, G. and Glance, N. S. 2006. Predicting Movie Sales from Blogger Sentiment. Proceedings of the 2006 AAAI Spring Symposium: Computational

Approaches to Analyzing Weblogs, 155-158.

Mishne, G. 2005. Experiments with mood classification in blog posts. Proceedings of the 2005 Proceedings of ACM SIGIR 2005 workshop on stylistic analysis of text for information access: Citeseer,

Moraes, R., Valiati, J. F. and Neto, W. P. G. 2013. Document-level sentiment classification: An empirical comparison between SVM and ANN. Expert Systems with Applications. 40(2), 621-633.

Morinaga, S., Yamanishi, K., Tateishi, K. and Fukushima, T. 2002. Mining product reputations on the web. Proceedings of the 2002 Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining: ACM, 341-349.

(47)

Nasukawa, T. and Yi, J. 2003. Sentiment analysis: Capturing favorability using natural language processing. Proceedings of the 2003 Proceedings of the 2nd international conference on Knowledge capture: ACM, 70-77.

Navrat, P., Ezzeddine, A. B. and Slizik, L. 2010. Mining Overall Sentiment in Large Sets of Opinions Advances in Intelligent Web Mastering-2 (pp.

167-173)Springer.

Nayana Mariya Varghese and Jomina John,(2012). Cluster Optimization for Enhanced Web Usage Mining using Fuzzy Logic”, World Congress on Information and Communication Technologies,IEEE, pp.948-952

Ortigosa, A., Martín, J. M. and Carro, R. M. 2014. Sentiment analysis in Facebook and its application to e-learning. Computers in Human Behavior. 31, 527-541.

Ounis I., de Rijke M., Macdonald C., Mishne G., and Soboroff I. 2006, “Overview of the TREC-2006 blog track,” in Proceedings of the 15th Text Retrieval

Conference (TREC).

Pang, B. and Lee, L. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. Proceedings of the 2004 Proceedings of the 42nd annual meeting on Association for Computational Linguistics: Association for Computational Linguistics, 271.

Pang, B. and Lee, L. 2005. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. Proceedings of the 2005

Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics: Association for Computational Linguistics, 115-124.

Pang, B. and Lee, L. 2008. Opinion mining and sentiment analysis. Foundations and trends in information retrieval. 2(1-2), 1-135.

Pang, B. and Lee, L. 2008. Using Very Simple Statistics for Review Search: An Exploration. Proceedings of the 2008 COLING (Posters), 75-78.

(48)

Proceedings of the 2012 Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on: IEEE, 709-716.

Popescu, A.-M. and Etzioni, O. 2007. Extracting product features and opinions from reviews Natural language processing and text mining (pp. 9-28)Springer. Qi Zhang, Yuanbin Wu, Yan Wa, Xuanjing 2011. ‘Opinion Mining Sentiment

Graph’, 2011 IEEE/WIC/ACM International Conferences on Wed Intelligence Agent technology, pp. 249-252

Rainie L. and Horrigan J. 2007. “Election 2006 online,” Pew Internet & American Life Project Report, January.

Raut, V. B. and Londhe, D. 2014. Survey on Opinion Mining and Summarization of User Reviews on Web. International Journal of Computer Science &

Information Technologies. 5(2).

Richa, S., Shweta, N. and Rekha, J. 2013. Supervised Opinion Mining Techniques: A Survey. International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 8 (2013), pp. 737-742. http://www. irphouse.com /ijict.htm

Rodrigues, L. M. and Dimuro, G. P. 2011. Measuring the Quality of Internet Shopping: An Experiment Using Fuzzy Logic. Proceedings of the 2011 Theoretical Computer Science (WEIT), 2011 Workshop-School on: IEEE, 60-66.

Saleh, M. R., Martín-Valdivia, M. T., Montejo-Ráez, A. and Ureña-López, L. 2011. Experiments with SVM to classify opinions in different domains. Expert Systems with Applications. 38(12), 14799-14804.

(49)

Seki, Y., Evans, D. K., Ku, L.-W., Chen, H.-H., Kando, N. and Lin, C.-Y. 2007. Overview of opinion analysis pilot task at NTCIR-6. Proceedings of the 2007 Proceedings of NTCIR-6 Workshop Meeting, 265-278.

Shandilya, S. K. and Jain, S. 2009. Automatic opinion extraction from web

documents. Proceedings of the 2009 Computer and Automation Engineering, 2009. ICCAE'09. International Conference on: IEEE, 351-355.

Sindhu, C. and Ch, S. 2013. A Survey on Opinion Mining and Sentiment Polarity Classification.

Sing, J., Sarkar, S. and Mitra, T. K. 2012. Development of a novel algorithm for sentiment analysis based on adverb-adjective-noun combinations.

Proceedings of the 2012 Emerging Trends and Applications in Computer Science (NCETACS), 2012 3rd National Conference on: IEEE, 38-40. Singh, V. K., Adhikari, R. and Mahata, D. 2010. A clustering and opinion mining

approach to socio-political analysis of the blogosphere. Proceedings of the 2010 Computational Intelligence and Computing Research (ICCIC), 2010 IEEE International Conference on: IEEE, 1-4.

Snyder, B. and Barzilay, R. 2007. Multiple Aspect Ranking Using the Good Grief Algorithm. Proceedings of the 2007 HLT-NAACL, 300-307.

Strapparava C.and Valitutti A. (2004). WordNet-Affect: An Affective Extension of WordNet. Proc. Int\'l Conf. Language Resources and Evaluation, vol. 4, pp.1083 -1086

Su, X., Gao, G. and Tian, Y. 2010. A Framework to Answer Questions of Opinion Type. Proceedings of the 2010 Web Information Systems and Applications Conference (WISA), 2010 7th: IEEE, 166-169.

Tadayon Tabrizi, Mir and G.(2012). Improving Data Clustering Using Fuzzy Logic and PSO Algorithm. 20th Iranian Conference on Electrical Engineering, (ICEE2012), May 15-17,Tehran, Iran. IEEE, pp .784-788

(50)

Annual Meeting on Association for Computational Linguistics: Association for Computational Linguistics, 133-140.

Tatemura, J. 2000. Virtual reviewers for collaborative exploration of movie reviews. Proceedings of the 2000 Proceedings of the 5th international conference on Intelligent user interfaces: ACM, 272-275.

Trilla, T. and Alias, F. 2013. Sentence-Based Sentiment Analysis for Expressive Text-to-Speech. Audio, Speech, and Language Processing, IEEE

Transactions on. 21(2), 223-233.

Tong, R. M. 2001. An operational system for detecting and tracking opinions in on-line discussion. Proceedings of the 2001 Working Notes of the ACM SIGIR 2001 Workshop on Operational Text Classification, 6.

Turney, P. D. 2002. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. Proceedings of the 2002 Proceedings of the 40th annual meeting on association for computational linguistics: Association for Computational Linguistics, 417-424.

Tushar, Dilip Kumar Pratihar (2009). Design of cluster-wise optimal fuzzy logic controller to model input-output relationships of some manufacturing processes. Int J.of Data Mining, Modelling and Management,vol.1,no2 pp.178-205,DOI :10.1504/IJDMMM 2009.026075.

Verma, S. and Bhattacharyya, P. 2009. Incorporating semantic knowledge for sentiment analysis. Proceedings of ICON.

Wei, H., Xin, C. and Haibo, W. 2010. Product information retrieval based on opinion mining. Proceedings of the 2010 Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on: IEEE, 2489-2492. Weng, C.-H. and Chen, Y.-L. 2010. Mining fuzzy association rules from uncertain

data. Knowledge and Information Systems. 23(2), 129-152.

(51)

Wiebe, J., Wilson, T. and Cardie, C. 2005. Annotating expressions of opinions and emotions in language. Language resources and evaluation. 39(2-3), 165-210. Wilson, T., Wiebe, J. and Hwa, R. 2004. Just how mad are you? Finding strong and

weak opinion clauses. Proceedings of the 2004 aaai, 761-769.

Wilson, T., Wiebe, J. and Hoffmann, P. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. Proceedings of the 2005 Proceedings of the conference on human language technology and empirical methods in natural language processing: Association for Computational Linguistics, 347-354. Wong T. and Lam W. (2005). Hot item mining and summarization from multiple

auction Web sites. Proceedings of the 5th IEEE International Conference on Data Mining (ICDM '05); November 2005; Houston, Tex, USA. pp. 797– 800.

Xia, Y.-Q., Hao, B.-Y. and Dai, L.-L. (2009). Term extraction from web reviews with opinion heuristics. Proceedings of the 2009 Machine Learning and Cybernetics, 2009 International Conference on: IEEE, 3516-3521. Xuan, H. N. T., Le, A. C. and Nguyen, L. M. 2012. Linguistic Features for

Subjectivity Classification. Proceedings of the 2012 Asian Language Processing (IALP), 2012 International Conference on: IEEE, 17-20. Yan-Yan, Z., Bing, Q. and Ting, L. 2010. Integrating intra-and inter-document

evidences for improving sentence sentiment classification. Acta Automatica Sinica. 36(10), 1417-1425.

Yao, T., Cheng, X., Xu, F., Uszkoreit, H. and Wang, R. 2008. A survey of opinion mining for texts. Journal of Chinese information processing. 22(3), 71-80. Ye, Q., Zhang, Z. and Law, R. 2009. Sentiment classification of online reviews to

travel destinations by supervised machine learning approaches. Expert Systems with Applications. 36(3), 6527-6535.

(52)

Empirical methods in natural language processing: Association for Computational Linguistics, 129-136.

Yu, L., Ma, J., Tsuchiya, S. and Ren, F. 2008. Opinion mining: A study on semantic orientation analysis for online document. Proceedings of the 2008 Intelligent Control and Automation, 2008. WCICA 2008. 7th World Congress on: IEEE, 4548-4552.

Zabin, J. and Jefferies, A. 2008. Social media monitoring and analysis: Generating consumer insights from online conversation. Aberdeen Group Benchmark Report. 37(9).

Zhai, Y., Chen, Y., Hu, X., Li, P. and Wu, X. 2010. Extracting Opinion Features in Sentiment Patterns. Proceedings of the 2010 Information Networking and Automation (ICINA), 2010 International Conference on: IEEE, V1-115-V111-119.

Zhao, Lili, and Chunping Li. 2009. Ontology Based Opinion Mining for Movie Reviews. In Proceedings of the 3rd International Conference on Knowledge Science, Engineering and Management.

Zhendong Dong, Qiang Dong, and Changling Hao. 2010. Hownet and its computation of meaning. In Proc. The 23rd International Conference on Computational Linguistics: Demonstrations, pages 53–56. ACL.

Zhuang, L., Jing, F. and Zhu, X.-Y. (2006). Movie review mining and
