Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN

(1)

Contents lists available at ScienceDirect

Expert

Systems

With

Applications

journal homepage: www.elsevier.com/locate/eswa

Improving

sentiment

analysis

via

sentence

type

classiﬁcation

using

BiLSTM-CRF

and

CNN

Tao Chen

a

, Ruifeng Xu

a ,b ,∗

, Yulan He

c

, Xuan Wang

a

a_Shenzhen_Engineering_Laboratory_of_Performance_Robots_at_Digital_Stage,_Shenzhen_Graduate_School,_Harbin_Institute_of_Technology,_Shenzhen,_China b_Guangdong_Provincial_Engineering_Technology_Research_Center_for_Data_Science,_Guangzhou,_China

c_School_of_Engineering_and_Applied_Science,_Aston_University,_Birmingham,_UK

a

r

t

i

c

l

e

i

n

f

o

Articlehistory:

Received9July2016 Revised1October2016 Accepted21October2016 Availableonline9November2016

Keywords:

Naturallanguageprocessing Sentimentanalysis Deepneuralnetwork

a

b

s

t

r

a

c

t

Differenttypesofsentencesexpresssentimentinverydifferentways. Traditionalsentence-level senti-mentclassificationresearchfocusesonone-technique-fits-allsolutionoronlycentersononespecialtype ofsentences.Inthispaper,weproposeadivide-and-conquerapproachwhichfirstclassifiessentences intodifferenttypes,thenperforms sentimentanalysis separatelyonsentencesfromeachtype. Specif-ically,wefindthat sentencestend tobemore complexifthey containmoresentiment targets. Thus, weproposetofirstapplyaneuralnetworkbasedsequencemodeltoclassifyopinionatedsentencesinto threetypesaccordingtothenumberoftargetsappearedinasentence.Eachgroupofsentencesisthen fedintoaone-dimensionalconvolutionalneuralnetworkseparatelyforsentimentclassification.Our ap-proachhasbeenevaluatedonfoursentimentclassificationdatasetsandcomparedwithawiderangeof baselines.Experimentalresultsshowthat:(1)sentencetypeclassificationcanimprovetheperformance ofsentence-levelsentimentanalysis;(2)theproposedapproachachievesstate-of-the-artresultson sev-eralbenchmarkingdatasets.

1. Introduction

Sentiment analysisis theﬁeld of studythat analyzes people’s opinions, sentiments, appraisals, attitudes, and emotions toward entities andtheir attributesexpressedin writtentext (Liu, 2015 ). With the rapid growth of social media on the web, such as re-views, forum discussions, blogs, news, and comments, more and more peopleshare their viewsandopinionsonline.As such, this fascinatingproblemisincreasinglyimportantinbusinessand soci-ety.

One of themain directions of sentimentanalysis is sentence-level sentiment analysis. Much of the existing research on this topicfocused on identifying thepolarity ofa sentence(e.g. posi-tive,negative,neutral)basedonthelanguagecluesextractedfrom thetextualcontentofsentences(Liu, 2012; Pang & Lee, 2004; Tur- ney, 2002 ). They solved this task as a general problem without considering different sentence types. However, different types of sentences expresssentiment invery differentways.Forexample, for the sentence“It is good.”,the sentiment polarity is deﬁnitely

∗ _{Corresponding}_author._Fax.:_{8618988759558.}

E-mailaddresses:[email protected](T.Chen),[email protected](R. Xu),[email protected](Y.He),[email protected](X.Wang).

positive;fortheinterrogativesentence“Isitgood?”,thesentiment polarityisobscure,anditslightlyinclinedtothenegative;forthe comparativesentence“Ais betterthanB.”,weevencannot decide its sentiment polarity, becauseit is dependent on whichopinion targetwefocuson(AorB).

Unlikefactual text,sentimenttext canoftenbeexpressed ina moresubtleor arbitrarymanner,making it difficultto be identi-fiedby simplylooking ateachconstituentwordin isolation.It is argued that there is unlikely to havea one-technique-fits-all so-lution(Narayanan, Liu, & Choudhary, 2009 ). Adivide-and-conquer approach may be needed to deal with some special sentences with unique characteristics, that is, different types of sentences mayneed differenttreatments onsentence-level sentiment anal-ysis(Liu, 2015 ).

There are many ways in classifying sentences in sentiment analysis. Sentences can be classiﬁed as subjective and objec-tive which is to separate opinions from facts (Wiebe & Wilson, 2002; Wiebe, Bruce, & O’Hara, 1999; Yu & Hatzivassiloglou, 2003 ). Some researchers focused on target-dependent sentiment classi-ﬁcation, which is to classify sentiment polarity for a given tar-get on sentences consisting of explicit sentiment targets (Dong et al., 2014; Jiang, Yu, Zhou, Liu, & Zhao, 2011; Mitchell, Aguilar, Wilson, & Durme, 2013; Tang, Qin, Feng, & Liu, 2015a; Vo & http://dx.doi.org/10.1016/j.eswa.2016.10.065

(2)

Zhang, 2015 ). Others dealt with mining opinions in comparative sentences, which is to determinate the degree of positivity sur-round the analysis of comparative sentences (Ganapathibhotla & Liu, 2008; Jindal & Liu, 2006b; Yang & Ko, 2011 ). There has also beenworkfocusingonsentimentanalysisofconditionalsentences (Narayanan et al., 2009 ), orsentences withmodality,which have somespecialcharacteristicsthatmakeithard fora systemto de-terminesentimentorientations(Liu, Yu, Chen, & Liu, 2013 ).

In thispaper,we propose adifferent wayindealingwith dif-ferent sentence types. In particular, we investigate the relation-ship betweenthe numberof opinion targetsexpressed in a sen-tence and the sentiment expressed in this sentence; propose a novel framework for improving sentiment analysis via sentence typeclassiﬁcation.Opiniontarget(hereafter,targetforshort)can beanyentityoraspectoftheentityonwhichanopinionhasbeen expressed(Liu, 2015 ). Anopinionated sentencecanexpress senti-mentswithoutamentionofanytarget,ortowardsonetarget,two ormore targets.We deﬁne three types ofsentences: non-target sentences,one-targetsentencesandmulti-target sentences, re-spectively.Considerthefollowingexamplesfromthemoviereview sentence polarity dataset v1.0 (hereafter, MR dataset for short) (Pang & Lee, 2005 )1_:

Example1. Amasterpiecefouryearsinthemaking.

Example2. Ifyousometimesliketogotothemoviestohavefun, Wasabiisagoodplacetostart.

Example3. DirectorKapurisafilmmakerwitharealflairforepic landscapesandadventure,andthisisabetterfilmthanhisearlier English-languagemovie,theoverpraisedElizabeth.

Example 1 is a non-target sentence. In order to infer its tar-get,weneed toknowits context. Example 2 isa one-target sen-tence,inwhichthesentimentpolarityofthetargetWasabiis posi-tive. Example 3 isamulti-targetsentence,inwhichtherearethree targets:DirectorKapur,ﬁlm andhisearlier English-languagemovie, theoverpraisedElizabeth.Wecanobservethatsentencestendtobe morecomplexwithmoreopiniontargets,andsentimentdetection ismorediﬃcultforsentencescontainingmoretargets.

Basedonthisobservation,weapplyadeepneuralnetwork se-quence model, which is a bidirectional long short-term memory withconditionalrandom ﬁelds(henceforthBiLSTM-CRF) (Lample, Ballesteros, Subramanian, Kawakami, & Dyer, 2016 ), toextract tar-getexpressionsinopinionatedsentences.Basedonthetargets ex-tracted,we classify sentencesinto three groups:non-target, one-targetandmulti-target.Then,one-dimensional convolutional neu-ralnetworks(1d-CNNs)(Kim, 2014 )aretrainedforsentiment clas-siﬁcationoneachgroup separately.Finally,thesentimentpolarity ofeachinputsentenceispredictedbyoneofthethree1d-CNNs.

We evaluate the effectiveness of our approach empirically on various benchmarking datasets including the Stanford sentiment treebank (SST)2 ₍_Socher_et_al.,₂₀₁₃₎ _and _the _customer _reviews dataset (CR)3 ₍_Hu_{& Liu, 2004}_). _We _compare _our _results _with _a wide range of baselines including convolutional neural networks (CNN) with multi-channel (Kim, 2014 ), recursive auto-encoders (RAE) (Socher, Pennington, Huang, Ng, & Manning, 2011 ), recur-siveneural tensor network(RNTN)(Socher et al., 2013 ), dynamic convolutionalneural network(DCNN) (Kalchbrenner, Grefenstette, & Blunsom, 2014 ),NaiveBayes supportvector machines(NBSVM) (Wang & Manning, 2012 ), dependency tree withconditional ran-dom ﬁelds (tree-CRF) (Nakagawa, Inui, & Kurohashi, 2010 ) et al. Experimental results show that the proposed approach achieves

1_Available_at:_{https://www.cs.cornell.edu/people/pabo/movie-review-data/}_. 2_{http://nlp.stanford.edu/sentiment/}_.

3_{https://www.cs.uic.edu/}∼_{liub/FBS/sentiment-analysis.html#datasets}_.

state-of-the-art results on several benchmarking datasets. This shows that sentence type classiﬁcation can improve the perfor-manceofsentence-levelsentimentanalysis.

Themaincontributionsofourworkaresummarizedbelow:

• Weproposeanoveltwo-steppipelineframeworkfor sentence-levelsentimentclassiﬁcationby ﬁrstclassifyingsentencesinto different types based on the number of opinion targets they contain,andthentraining1d-CNNsseparately forsentencesin eachtypeforsentimentdetection;

• Whileconventional sentiment analysismethods largely ignore differentsentencetypes,wehavevalidatedinourexperiments that learning a sentiment classiﬁer tailored to each sentence typewouldresultinperformancegainsinsentence-level senti-mentclassiﬁcation.

The rest of this article is organized as follows: we review related work in Section 2 ; and then present our approach in Section 3 ; experimental setup, evaluationresults anddiscussions are reported in Section 4 ; ﬁnally, Section 5 concludes the paper andoutlinesfutureresearchdirections.

2. Relatedwork

2.1. Sentencetypeclassiﬁcationforsentimentanalysis

Since early 2000, sentiment analysis has grown to be one of the most active research areas in natural language processing (NLP) (Liu, 2015; Ravi & Ravi, 2015 ). It is a multifaceted prob-lemwithmanychallengingandinterrelatedsub-problems, includ-ingsentence-levelsentimentclassiﬁcation.Manyresearchers real-ized that different type ofsentence need different treatment for sentiment analysis. Models of different sentencetypes, including subjectivesentences,target-dependentsentences,comparative sen-tences, negation sentences, conditional sentences, sarcastic sen-tences,havebeenproposedforsentimentanalysis.

Subjectivity classification distinguishes sentences that express opinions(calledsubjectivesentences)fromsentencesthatexpress factual information (called objective sentences) (Liu, 2015 ). Al-though some objective sentences can imply sentiments or opin-ionsandsome subjectivesentences maynot expressanyopinion orsentiment,manyresearchersregard subjectivityandsentiment asthesameconcept(Hatzivassiloglou & Wiebe, 20 0 0; Wiebe et al., 1999 ),i.e.,subjectivesentencesexpressopinionsandobjective sen-tences express fact. Riloff and Wiebe (2003) presented a boot-strappingprocesstolearnlinguisticallyrichextractionpatternsfor subjectiveexpressions froma largeunannotated data. Rill, Reinel, Scheidt, and Zicari (2014) presenteda systemto detect emerging politicaltopics ontwitter andthe impact onconcept-level senti-mentanalysis. Appel, Chiclana, Carter, and Fujita (2016) proposed a hybrid approach using SentiWordNet (Baccianella, Esuli, & Se- bastiani, 2010 ) and fuzzy sets to estimate the semantic orienta-tion polarityandintensity ofsentimentwords, beforecomputing the sentencelevelsentiments. Muhammad, Wiratunga, and Loth- ian (2016) introducedalexicon-basedsentimentclassification sys-tem for social media genres, which captures contextual polarity fromboth localandglobalcontext. Fernández-Gavilanes, Álvarez- López, Juncal-Martínez, Costa-Montenegro, and González-Castaño (2016) proposed a novel approachto predict sentiment inonline texts based on an unsupervised dependency parsing-based text classificationmethod.

Mostprevioustargetrelatedworksassumedtargetshavebeen givenbeforeperformingsentimentclassiﬁcation(Dong et al., 2014; Jiang et al., 2011; Mitchell et al., 2013; Vo & Zhang, 2015 ). Little researchhasbeenconductedonclassifyingsentencebythetarget numberalthoughthereisalargebodyofworkfocusingonopinion targetextractionfromtext.

(3)

Acomparativeopinionsentenceexpressesarelationof similar-ities ordifferencesbetweentwoormoreentitiesand/ora prefer-enceof the opinion holderbased on some shared aspects ofthe entities. Jindal and Liu (2006a) showed that almost every com-parative sentence had a keyword (a word or phrase) indicating comparison, and identified comparative sentences by using class sequential rules based on human compiled keywords as features for a naive Bayes classifier. Ganapathibhotla and Liu (2008) re-portedthey were thefirst workfor miningopinionsin compara-tive sentences. Theysolved theproblem by usinglinguistic rules and a large external corpus of Pros and Cons from product re-views to determine whether the aspect and sentiment context were moreassociatedwitheach otherin ProsorinCons. Kessler and Kuhn (2014) presentedacorpusofcomparisonsentencesfrom English camera reviews. Park and Yuan (2015) proposed two lin-guisticknowledge-driven approachesforChinesecomparative ele-mentsextraction.

Negation sentences occur fairly frequently in sentiment anal-ysis corpus. Manyresearchers considered the impact of negation words or phrases aspart of their works (Hu & Liu, 2004; Pang, Lee, & Vaithyanathan, 2002 );afewresearchersinvestigated nega-tionwordsidentificationand/ornegativesentenceprocessingasa singletopic. Jia, Yu, and Meng (2009) studiedthe effectof nega-tionon sentimentanalysis, includingnegation termandits scope identification,byusingaparsetree,typeddependenciesand spe-cial linguisticrules. Zhang, Ferrari, and Enjalbert (2012) proposed a compositional model to detect valence shifters, such as nega-tions, which contribute to the interpretation of the polarity and theintensityofopinionexpressions. Carrillo-de Albornoz and Plaza (2013) studiedtheeffectofmodifiersontheemotionsaffectedby negation,intensifiersandmodality.

Conditional sentences are another commonly used language constructsintext.Such asentencetypically containstwoclauses: theconditionclauseandtheconsequentclause.Theirrelationship has signiﬁcant impact on the sentiment orientation of the sen-tence (Liu, 2015 ). Narayanan et al. (2009) ﬁrst presented a lin-guistic analysis of conditional sentences, and built some super-vised learning models to determine if sentiments expressed on different topics in a conditional sentence are positive, negative or neutral. Liu (2015) listed a set of interesting patterns in con-ditional sentences that often indicate sentiment, which was par-ticularly useful for reviews, online discussions, and blogs about products.

Sarcasm is a sophisticated formof speech act widely used in onlinecommunities.Inthecontextofsentimentanalysis,itmeans that when one says something positive,one actually means neg-ative, and vice versa. Tsur, Davidov, and Rappoport (2010) pre-sented a novel semi-supervised algorithm for sarcasm identiﬁ-cation that recognized sarcastic sentences in product reviews. González-Ibánez, Muresan, and Wacholder (2011) reported on a method for constructing a corpus of sarcastic Twitter messages, andusedthiscorpustoinvestigatetheimpactoflexicaland prag-maticfactorsonmachinelearningeffectivenessforidentifying sar-casticutterances. Riloff et al. (2013) presenteda bootstrapping al-gorithm for sarcasm recognition that automatically learned lists ofpositivesentimentphrasesandnegativesituationphrases from sarcastictweets.

Adversative and concessive structures, asanother kind of lin-guistical feature, are constructions express antithetical circum-stances (Crystal, 2011 ). A adversative or a concessive clause is usually in clear opposition to the main clause about the fact or event commented. Fernández-Gavilanes et al. (2016) treated the constructions as an extension of intensiﬁcation propaga-tion, where the sentiment formulated could be diminished or intensiﬁed, depending on both adversative/concessive and main clauses.

2.2.Opiniontargetdetection

Hu and Liu (2004) used frequentnouns and noun phrases as featurecandidates foropinion target extraction. Qiu, Liu, Bu, and Chen (2011) proposed a bootstrapping method where a depen-dency parserwas used to identify syntactic relations that linked opinion words and targetsfor opinion target extraction. Popescu and Etzioni (2005) considered product features to be concepts formingcertainrelationshipswiththeproductandsoughtto iden-tify the features connected with the product name by comput-ing the point wise mutual information (PMI) score between the phrase and class-specific discriminators through a web search. Stoyanov and Cardie (2008) treatedtargetextractionasatopic co-referenceresolutionproblemandproposedto traina classifier to judgeiftwo opinionswere onthesametarget. Liu, Xu, and Zhao (2014) constructedaheterogeneousgraphtomodelsemantic rela-tions andopinionrelations, andproposed aco-rankingalgorithm toestimatetheconfidenceofeachcandidate.Thecandidateswith higher confidence would be extracted as opinion targets. Poria, Cambria, and Gelbukh (2016) presentedthefirstdeeplearning ap-proachtoaspectextractioninopinionminingusinga7-layerCNN andasetoflinguisticpatternstotageachwordinsentences.

Mitchell et al. (2013) modeled sentiment detection as a se-quence taggingproblem, extracted namedentities and their sen-timent classes jointly. They referred this kind of approach open domain targeted sentiment detection. Zhang, Zhang, and Vo (2015) followedMitchellet al.’swork, studied theeffect ofword embeddings and automatic feature combinations on the task by extendingaCRFbaselineusingneuralnetworks.

2.3.Deeplearningforsentimentclassiﬁcation

Deeplearningapproachesare abletoautomaticallycapture, to someextent,thesyntacticandsemanticfeaturesfromtextwithout featureengineering,whichislabor intensiveandtimeconsuming. Theyattract much research interest in recent years, andachieve state-of-the-artperformancesinmanyﬁeldsofNLP,including sen-timentclassiﬁcation.

Socher et al. (2011) introduced semi-supervised recursive au-toencoders for predicting sentiment distributions without using anypre-definedsentimentlexicaorpolarityshiftingrules. Socher et al. (2013) proposed a family of recursive neural network, in-cluding recursive neural tensor network (RNTN), to learn the compositional semantic of variable-length phrases andsentences over a human annotatedsentiment treebank. Kalchbrenner et al. (2014) and Kim (2014) proposed different CNN modelsfor senti-mentclassification, respectively. Bothofthemcan handlethe in-put sentences with varying length and capture short and long-rangerelations. Kim (2014) ’smodelhaslittlehyperparameter tun-ing and can be trained on pre-trained word vectors. Irsoy and Cardie (2014a) presentedadeeprecursiveneuralnetwork(DRNN) constructed by stacking multiple recursive layers for composi-tionality inLanguage andevaluated theproposed model on sen-timent classification tasks. Tai, Socher, and Manning (2015) in-troduced a tree long short-term memory (LSTM) for improving semantic representations, which outperforms many existing sys-temsandstrongLSTM baselineson sentimentclassification. Tang et al. (2015c) proposed a joint segmentation and classification frameworkforsentence-levelsentimentclassification. Liu, Qiu, and Huang (2016) useda recurrentneural network(RNN) based mul-titasklearningframework to jointly learn acrossmultiple related tasks. Chaturvedi, Ong, Tsang, Welsch, and Cambria (2016) pro-poseda deep recurrentbelief network withdistributed time de-lays forlearningworddependencies intext which usesGaussian networkswithtime-delaystoinitializetheweightsofeachhidden neuron. Tang, Qin, and Liu (2015b) gaveasurveyonthistopic.

(4)

Fig.1. FrameworkofsentencetypeclassiﬁcationbasedsentimentanalysisusingBiLSTM-CRFand1d-CNN.

Table1

AnexamplesentencewithlabelsinIOBformat.Thetargetistheact, thelabelBindicatesthebeginningofatarget,Iindicatesthattheword isinsideatarget,andOindicatesawordbelongstonotarget.

Words: Yet the act is still charming here .

Labels: O B I O O O O O

3. Methodology

We presentourapproach forimprovingsentimentanalysisvia sentence type classification in this section. An overview of the approach is shown in Fig. 1 . We first introduce the BiLSTM-CRF model which extracts target expressions from input opinionated sentences, andclassifies each sentence according to the number oftargetexplicitlyexpressedinit(Section 3.1 ).Then,wedescribe the 1d-CNN sentiment classification model which predicts senti-mentpolarity fornon-target sentences,one-target sentences and multi-targetsentences,separately(Section 3.2 ).

3.1.Sequencemodelforsentencetypeclassiﬁcation

We describe our approach for target extraction and sentence type classiﬁcation with BiLSTM-CRF. Target extraction is similar tothe classic problemof named entityrecognition (NER),which viewsasentenceasasequenceoftokensusuallylabeledwithIOB format(shortforInside,Outside,Beginning). Table 1 showsan ex-amplesentencewiththeappropriatelabelsinthisformat.

Deep neural sequence models have shown promising success inNER(Lample et al., 2016 ), sequencetagging (Huang, Xu, & Yu, 2015 ) and fine-grained opinion analysis (Irsoy & Cardie, 2014b ). BiLSTM-CRFisoneofdeepneuralsequencemodels,wherea bidi-rectional long short-term memory (BiLSTM) layer (Graves, Mo- hamed, & Hinton, 2013 ) and a conditional random fields (CRF) layer(Lafferty, McCallum, & Pereira, 2001 )arestackedtogetherfor sequencelearning, asshownin Fig. 2 .BiLSTM incorporatesa for-wardlongshort-termmemory(LSTM)layerandabackwardLSTM layerinordertolearn informationfromprecedingaswell as fol-lowingtokens. LSTM (Hochreiter & Schmidhuber, 1997 ) is a kind ofrecurrent neural network (RNN) architecture withlong short-termmemoryunitsashiddenunits.NextwebrieflydescribeRNN, LSTM,BiLSTMandBiLSTM-CRF.

RNN(Elman, 1990 )isaclassofartiﬁcialneuralsequencemodel, whereconnections between unitsform a directed cycle. It takes arbitraryembedding sequences x=

(

x1,...,xT

)

as input, uses its internalmemorynetworktoexhibitdynamictemporalbehavior.It

consisting ofa hiddenunit h andan optionaloutput y. T is the last time step.It is alsothe length ofinput sentenceinthis text sequencelearningtask.Ateach timestept,thehiddenstatehtof theRNNiscomputedbasedontheprevioushiddenstateht−1and

theinputatthecurrentstepxt:

ht=g

(

Uxt+Wht−1

)

(1)

where U and W are weight matrices of the network; g( · ) is a non-linear activation function, such as an element-wise logis-tic sigmoid function. The output at time step t is computed as

yt=softmax

(

Vht

)

, where V is another weight parameter of the network, softmaxis an activation function oftenimplemented at theﬁnallayerofanetwork.

LSTMisavariantofRNNdesignedtodealwithvanishing gra-dientsproblem(Hochreiter & Schmidhuber, 1997 ).TheLSTMused in the BiLSTM-CRF(Lample et al., 2016 ) has two gates (an input gateit,anoutputgateot)andacellactivationvectorsct.

BiLSTM uses two LSTMs to learn each token of the sequence based on both the past andthe future context of the token. As shown in Fig. 2 , one LSTM processes the sequence from left to right,theotheronefromrighttoleft.Ateachtimestept,ahidden forwardlayerwithhiddenunitfunction−→h iscomputedbasedon theprevioushiddenstate−→ht−1andtheinputatthecurrentstepxt andahiddenbackwardlayerwithhiddenunitfunction←h−is com-putedbasedonthefuturehiddenstate ←h−t+1 andtheinputatthe

currentstepxt.Theforwardandbackwardcontextrepresentations, generatedby−→htand←h−t respectively,areconcatenatedintoalong vector.The combinedoutputsarethepredictionsofteacher-given targetsignals.

As another widely used sequence model, conditional random ﬁelds (CRF) is a type of discriminative undirected probabilistic graphicalmodel,whichrepresentsasinglelog-lineardistributions over structured outputs asa function ofa particular observation inputsequence.

GivenobservationsvariablesXwhosevaluesareobserved, ran-domvariablesYwhosevaluesthetaskrequiresthemodelto pre-dict, and a undirected graph G where Y are connected by undi-rectededgesindicating dependencies.CRFdeﬁnesthe conditional probabilityofasetofoutputvaluesy_∈Ygivenasetofinput val-uesx∈Xtobe proportionaltotheproductofpotential functions oncliquesofthegraph(McCallum, 2003 ),

p

(

y

|

x

)

= _Z1

x

s∈S(y,x)

(5)

Fig.2. AnillustrationofBiLSTM-CRFfortargetextractionandsentencetypeclassiﬁcation.BiLSTMlayerincorporatesaforwardLSTMlayerandabackwardLSTMlayer.

whereZx isanormalizationfactoroveralloutputvalues,S(y,x)is thesetofcliquesofG,

s(ys,xs)isthecliquepotentialoncliques. Afterwards,intheBiLSTM-CRFmodel,asoftmaxoverall possi-bletagsequencesyieldsaprobabilityforthesequencey.The pre-dictionoftheoutputsequenceiscomputedasfollows:

y∗ =argmaxy∈Y

σ

(

X,y

)

(3)

where

σ

(X,y)isthescorefunctiondeﬁnedasfollows:

σ

(

X,y

)

= n i=0 Ayi,yi+1+ n i=1 Pi,yi (4)

where A is a matrix of transition scores, Ay_i,y_i₊₁ represents the score ofa transitionfromthe tag yi to yi+1.n isthe length ofa

sentence,P isthematrixofscoresoutputbytheBiLSTMnetwork,

Pi,y_i isthescoreoftheythi tagoftheithwordinasentence. As shownin Fig. 2 ,dropout techniqueis used afterthe input layerofBiLSTM-CRFtoreduceoverﬁttingonthetrainingdata.This technique is ﬁrstly introduced by Hinton, Srivastava, Krizhevsky, Sutskever, and Salakhutdinov (2012) for preventing complex co-adaptationsonthetrainingdata.Ithasgivenbigimprovementson manytasks.

AftertargetextractionbyBiLSTM-CRF,allopinionatedsentences are classiﬁedinto non-targetsentences, one-targetsentences and multi-target sentences, according to the number of targets ex-tractedfromthem.

3.2. 1d-CNNforsentimentclassiﬁcationoneachsentencetype

1d-CNN, firstly proposed by Kim (2014) , takes sentences of varyinglengthsasinputandproducesfixed-lengthvectorsas out-put.Beforetraining, wordembeddingsforeach wordinthe glos-saryofallinputsentencesaregenerated.Allthewordembeddings arestackedinamatrixM.Intheinputlayer,embeddingsofwords comprisingcurrenttrainingsentencearetakenfromM.The max-imumlengthofsentencesthatthenetworkhandlesisset.Longer sentencesarecut;shortersentencesarepaddedwithzerovectors. Then,dropoutregularizationisusedtocontrolover-fitting.

Intheconvolutionlayer,multiplefilterswithdifferentwindow size move on the wordembeddings to perform one-dimensional convolution. As the filtermoveson,many sequences, which cap-ture the syntactic and semantic features in the filtered n-gram,

are generated. Manyfeature sequences are combined into a fea-turemap.In thepoolinglayer,a max-overtime poolingoperation (Collobert et al., 2011 )isapplied tocapturethemostusefullocal features fromfeaturemaps.Activationfunctionsare addedto in-corporateelement-wisenon-linearity.Theoutputs ofmultiple ﬁl-ters are concatenated in the merge layer. After another dropout process,afullyconnectedsoftmaxlayeroutputtheprobability dis-tributionoverlabelsfrommultipleclasses.

CNNis oneof mostcommonlyused connectionism modelfor classification.Connectionismmodelsfocuson learningfrom envi-ronmentalstimuliandstoringthis informationin aform of con-nections betweenneurons. The weights in a neural network are adjusted according to the training data by some learning algo-rithm. It is the greater the difference in the training data, the moredifficultforthelearningalgorithmtoadaptthetrainingdata, andtheworseclassificationresults.Dividingopinionatedsentences intodifferenttypesaccordingtothe numberoftargetsexpressed inthemcanreducethedifferencesoftrainingdataineachgroup, therefore,improveoverallclassificationaccuracy.

4. Experiment

We conduct experiments to evaluate the performance of the proposed approach for sentence-level sentiment classiﬁcation on variousbenchmarkingdatasets.Inthissection,wedescribethe ex-perimentalsetupandbaselinemethodsfollowedbythediscussion ofresults.

4.1. Experimentalsetup

FortrainingBiLSTM-CRFfortargetextractionandsentencetype classiﬁcation,weusetheMPQAopinioncorpusv2.0(MPQAdataset for short) provided by Wiebe, Wilson, and Cardie (2005) 4 _since itcontainsa diverserange ofsentenceswith variousnumbers of opinion targets.It contains14,492sentences froma wide variety of news sources manually annotated with opinion target at the phrase level (7,026 targets). All the sentences are used to train BiLSTM-CRF.

(6)

Forsentimentclassiﬁcationwith1d-CNN,wetestourapproach ondifferentdatasets:

• MR: Movie review sentence polarity dataset v1.0. It contains 5331 positive snippets and 5331 negative snippets extracted from Rotten Tomatoes web site pages where reviewsmarked with“fresh” are labeled aspositive,andreviewsmarked with “rotten” are labeled as negative. 10-fold cross validation was usedfortesting.

• SST-1: Stanford sentiment treebank contains11,855 sentences also extracted from the original pool of Rotten Tomatoes page ﬁles. These sentences are split into 8544/1101/2210 for train/dev/test.Each ofthem is ﬁne-grainedlabeled (very pos-itive,positive,neutral,negative,verynegative).

• SST-2: Binary labeled version of Stanford sentiment treebank, inwhichneutralreviewsare removed,verypositive and posi-tivereviewsarelabeledaspositive,negativeandverynegative reviews are labeled asnegative (Kim, 2014 ). It contains 9613 sentencessplitinto6920/872/1821fortrain/dev/test.

• CR: Customerreviewsof5digitalproducts contains3771 sen-tences extracted from amazon.com, including 2405 positive sentencesand1366negativesentences.10-foldcrossvalidation wasusedfortesting.

Following Kim (2014) ’swork,weuseaccuracyastheevaluation metrictomeasuretheoverallsentimentclassiﬁcationperformance. During training a BiLSTM-CRF for target extraction in a sen-tence, the input sequence xt is set to the t-th word embedding (adistributedrepresentation fora word (Bengio, Ducharme, Vin- cent, & Jauvin, 2003 ))inainputsentence.Publiclyavailable word vectorstrainedfromGoogle News5 _are _used_as_pre-trained _word embeddings.Thesizeoftheseembeddingsis300.U,W,Vandh0

areinitializedtoarandomvectorofsmallvalues,ht+1 are

initial-izedtoacopyofhtrecursively.Aback-propagationalgorithmwith Adamstochasticoptimizationmethodisusedtotrainthenetwork throughtimewithlearningrateof0.05.Aftereachtrainingepoch, thenetworkistestedonvalidationdata.Thelog-likelihoodof val-idationdataiscomputedforconvergencedetection.

For trainingCNN,we use: CNN-non-staticmodel, ReLUas ac-tivationfunction, Adadeltadecay parameter of0.95, dropout rate of0.5,thesizeofinitialwordvectorsof300.Weusedifferent fil-terwindowsandfeaturemapsfordifferenttargetclasses.For non-targetsentences,weusefilterwindowsof3,4,5with100feature mapseach;Forone-targetsentences, weusefilterwindows of3, 4,5,6with100featuremapseach;Formulti-targetsentences,we usefilterwindowsof3,4,5,6,7with200featuremapseach.

4.2.Baselinemethods

We benchmark the following baseline methods for sentence-levelsentimentclassiﬁcation,some ofthemhavebeenpreviously usedin Kim (2014) :

• MNB:MultinomialnaiveBayeswithuni-bigrams.

• NBSVM:SVMvariantusingnaiveBayeslog-countratiosas fea-turevaluesproposedby Wang and Manning (2012) .

• Tree-CRF:Dependencytreebasedmethodforsentiment classi-ﬁcationusingCRFwithhiddenvariablesproposedby Nakagawa et al. (2010) .

• RAE: Semi-supervisedrecursiveautoencoders withpre-trained wordvectorsfromWikipediaproposedby Socher et al. (2011) .

• MV-RNN: Recursive neural networkusing a vector anda ma-trixoneverynodeinaparsetreeforsemanticcompositionality proposedby Socher, Huval, Manning, and Ng (2012) .

5_{https://drive.google.com/ﬁle/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=} sharing.

Table2

ExamplesentencesineachtargetclassofStanfordsentimenttreebank. T0,T1 andT2+refertonon-targetsentences,one-targetsentencesandmulti-target sen-tencesrecognized byBiLSTM-CRF,respectively. In eachtarget class,we show 3examplesentences(onepositive,oneneutral,onenegativesentence, respec-tively),s1tos9aretheordernumbersoftheexamples.

Class Examplesentences

T0 s1:...veryfunny,veryenjoyable...

s2:Darkanddisturbing,yetcompellingtowatch.

s3:Hey,whoelseneedsashower?

T1 s4:Yettheactisstillcharminghere.

s5:Asadirector,Mr.Ratliff wiselyrejectsthetemptationtomakefun ofhissubjects.

s6:NotoriousC.H.O.hasoodlesofvulgarhighlights.

T2+ s7:Singer/composerBryanAdamscontributesaslewofsongs– afew potentialhits,afewmoresimplyintrusivetothestory– butthe wholepackagecertainlycapturestheintended,er,spiritofthepiece.

s8:YouShouldPayNineBucksforThis:Becauseyoucanhearabout sufferingAfghanrefugeesonthenewsandstillbeunaffected.

s9:...whileeachmomentofthisbrokencharacterstudyisrichin emotionaltexture,thejourneydoesn’treallygoanywhere.

• RNTN: Recursivedeep neural network for semantic composi-tionalityover asentimenttreebankusing tensor-basedfeature functionproposedby Socher et al. (2013) .

• Paragraph-Vec:Anunsupervisedalgorithmlearningdistributed feature representations from sentences and documents pro-posedby Le and Mikolov (2014) .

• DCNN:Dynamicconvolutionalneuralnetworkwithdynamick -maxpoolingoperationproposedby Kalchbrenner et al. (2014) .

• CNN-non-static: 1d-CNN with pre-trained word embeddings andﬁne-tuningoptimizingstrategyproposedby Kim (2014) .

• CNN-multichannel:1d-CNNwithtwosetsofpre-trainedword embeddingsproposedby Kim (2014) .

• DRNN: Deep recursive neural networkswith stacked multiple recursivelayersproposedby Irsoy and Cardie (2014a) .

• Multi-taskLSTM:Amulti-tasklearningframeworkusingLSTM to jointlylearn across multiple relatedtasks proposed by Liu et al. (2016) .

• TreeLSTM:AgeneralizationofLSTMtotreestructurednetwork topologiesproposedby Tai et al. (2015) .

• Sentic patterns: A concept-level sentiment analysis approach using dependency-based rules proposed by Poria, Cambria, Winterstein, and Huang (2014) .

4.3. Results

4.3.1. Qualitativeevaluations

In Table 2 ,weshowtheexamplesentencesineachtargetclass of Stanford sentiment treebank. It is observed that many non-target sentences are smallimperative sentences, havedirect sub-jective expressions (DSEs) which consist of explicit mentions of private states orspeech events expressingprivate states(Irsoy & Cardie, 2014b ), e.g., funny and enjoyable in s1, dark and disturb-ing in s2.For some non-targetsentences, it is diﬃcult to detect itssentimentwithoutcontext,e.g.,itisunclearwhethertheword

shower in s3 conveys positive or negative sentiment. Non-target sentences tend to be short comparing with two other types of sentences.Manyone-targetsentencesaresimplesentences,which containbasicconstituentelementsformingasentence.Thesubject ismostlytheopinionated targetinaone-targetsentence,e.g.,the act ins4, Mr. Ratliffin s5 andC.H.O. ins6. Almostall the multi-target sentences are compound/complex/compound-complex sen-tences, which have two or more clauses, and are very complex inexpressions. Manyof themhavecoordinating orsubordinating conjunctions,whichmakeitdiﬃculttoidentifythesentimentofa wholesentence,e.g.,butins7,becauseandandins8,whileins9.

(7)

Table3

Experimentalresultsofsentimentclassiﬁcationaccuracy. %isomitted.Thebestresultsarehighlightedinboldface. The resultsofthe top10 approacheshavebeen previ-ouslyreportedbyKim(2014).Thetop3approachesare conventional machine learning approaches with hand-craftedfeatures.Sentic patternsisrulebasedapproach. Other 11approaches,including ourapproach, aredeep neuralnetwork(DNN)approaches,whichcan automati-callyextractfeaturesfrominputdataforclassiﬁer train-ingwithoutfeatureengineering.

Model MR SST-1 SST-2 CR MNB 79.0 – – 80.0 NBSVM 79.4 – – 81.8 Tree-CRF 77.3 – – 81.4 Senticpatterns – – 86.2 – RAE 77.7 43.2 82.4 – MV-RNN 79.0 44.4 82.9 – RNTN – 45.7 85.4 – Paragraph-Vec – 48.7 87.8 – DCNN – 48.5 86.8 – CNN-non-static 81.5 48.0 87.2 84.3 CNN-multichannel 81.1 47.4 88.1 85.0 DRNN – 49.8 86.6 – Multi-taskLSTM – 49.6 87.9 – TreeLSTM – 50.6 86.9 – Ourapproach 82.3 48.5 88.3 85.4

Overall, as the result of the qualitative evaluations, the diﬃ-culty degree ofsentiment classiﬁcation on each sentence type is

T2+>T1>T0,i.e.,multi-targetsentencesaremostdiﬃcult,while non-targetsentences aremuch easierforsentimentclassiﬁcation. Theexperimentalresultslistedinthenextsubsectionvalidatethis observation.

4.3.2. Overallcomparison

Table 3 shows the results achieved on the MR, SST-1, SST-2 and CRdatasets. It isobserved that comparing with three hand-crafted features based methods, although RAE and MV-RNN per-formworse on MRdataset,twoCNN basedmethods givesbetter results on both MRandCRdatasets. This indicates the effective-nessofDNNapproaches.Among11DNNapproaches,ourapproach outperforms other baselineson all the datasets exceptSST-1, i.e., our approach gives relative improvements of 0.98% compared to CNN-non-staticonMRdataset,0.23%and0.47% relative improve-mentscomparedtoCNN-multichannelonSST-2andCRdataset, re-spectively.ComparingwithtwoCNNbasedmethods,oursentence type classification based approach givessuperior performance on all the fourdatasets (including SST-1 dataset). Thesevalidate the influencesofsentencetypeclassificationintermsofsentence-level sentimentanalysis.

4.3.3. Comparisononeachtargetclass

Table 4 showsthestatisticsandcomparisonofeachtargetclass ontheMR,SST-1,SST-2andCRdatasets.Therelativeimprovement ratio

Δ

calculatesasfollows:

Δ

=

(

Accour−AccCNN

)

÷AccCNN×100 (5)

Itisobviousthat theperformanceforeverytarget classis im-provedusingsentencetypeclassification.Yet,theimprovementfor themulti-targetsentences(T2+)ismoresignificantthanothertwo target classes on three of the four dataset, e.g. the relative im-provement ratio of T2+ class on the SST-1 and CR datasets are 4.75% and5.26%,respectively, whichare abouttwice higherthan therelativeimprovementratioofT1class. Table 4 isaclear indica-tionthattheproposedsentencetypeclassificationbasedsentiment classification approach is very effective for complex sentences. Both the AccCNN and Accour use 1d-CNN (non-static) and pre-trainedGoogleNewswordembedding,ourapproachachieves

bet-Table4

Theclass-by-classclassificationresultsusingsentencetypeclassificationaswell aswithoutusingsentencetypeclassificationonthefourdatasets.#trainand #testarethewordnumberofsentencesintrainingand testdataset, respec-tively;lmaxandlavgaremaxandaveragewordlengthofsentences,respectively;

AccCNNistheexperimentalresultthatwedosentimentclassiﬁcationdirectlyon

thefourdatasetsusing1d-CNN(non-static)withoutsentencetypeclassiﬁcation, andstatistictheaccuracyoneachtargetclassthesamewiththetargetclass recognizedbyBiLSTM-CRF.Accour istheexperimentalresult ofourapproach

oneachtargetclass,whichusingbothsentencetypeclassiﬁcationand1d-CNN (non-static).Δistherelativeimprovementratiocalculates.IntheAccCNN,Accour

andΔcolumns,%isomittedforconciseness.

#train #test lmax lavg AccCNN Accour Δ

MR T0 6,426 698 52 18.8 83.3 84.5 1.44 T1 2,552 290 56 22.7 77.9 78.8 1.16 T2+ 618 78 51 27.1 73.9 75.1 1.62 SST-1 T0 5,436 1,367 51 17.5 49.8 50.9 2.21 T1 2,495 655 56 21.0 45.1 46.1 2.22 T2+ 613 189 52 25.5 38.0 39.8 4.74 SST-2 T0 4,373 1,134 50 16.9 87.6 89.6 2.28 T1 2,047 534 53 20.3 83.9 86.7 3.34 T2+ 500 153 51 24.9 82.0 84.1 2.56 CR T0 1,982 191 75 17.4 85.5 88.4 3.39 T1 1,140 152 105 19.1 80.9 83.2 2.84 T2+ 273 33 95 27.9 74.1 78.0 5.26 Table5

Experimentalresultsofdifferentsequencemodels. %isomittedforconciseness.Thebestresultsare highlightedinboldface.

Model MR SST-1 SST-2 CR CRF 81.7 47.6 87.4 84.4 LSTM 81.3 47.5 87.6 84.1 BiRNN 81.7 48.1 87.9 84.8 BiRNN-CRF 81.8 48.3 87.9 84.9 BiLSTM 82.0 48.3 88.0 85.3 BiLSTM-CRF 82.3 48.5 88.3 85.4

ter performancebecause thedivide-and-conquerapproach, which first classifies sentences into different types, then optimize the sentimentclassifierseparatelyonsentencesfromeachtype.

4.3.4. Comparisonwithdifferentsequencemodels

Wehavealsoexperimentedwithdifferentsequencemodels, in-cludingCRF, LSTM,BiRNN (Schuster & Paliwal, 1997 ), BiRNN-CRF, BiLSTMandBiLSTM-CRF,forsentence typeclassification. ForCRF, we use CRFSuite (Okazaki, 2007 ) withword, Part-Of-Speech tag, prefix, suffix and a sentiment dictionary as features. For LSTM, BiRNN, BiRNN-CRF and BiLSTM, we also use Google News word embeddingsas pre-trained wordembeddings. For other parame-ters,weusedefaultparametersettings.

Table 5 showstheexperimentalresultsontheMR,SST-1,SST-2 andCRdatasets.Itcanbe observedthatBiLSTM-CRFoutperforms all the other approaches on all the four datasets. It is because BiLSTM-CRF has more complicated hidden units, and offers bet-ter composition capability than other DNN approaches. CRFwith hand-craftedfeatures givescomparableperformance toLSTM,but lower performance than more complex DNN models. BiRNN and BiLSTMgivesbetter performancecomparedtoLSTMbecausethey canlearneach tokenofthesequencebasedonboththepastand thefuturecontextofthetoken,whileLSTMonlyusethepast con-textof thetoken.ComparingBiRNNandBiLSTMwithBiRNN-CRF and BiLSTM-CRF, respectively, it is observed that combining CRF andDNNmodelscanimprovetheperformanceofDNNapproaches.

4.3.5. EvaluationonopiniontargetextractionwithBiLSTM-CRF

Oneunavoidableproblemforeverymulti-stepapproachis the propagationoferrors.In ourapproach,we usea BiLSTM-CRF/1d-CNNpipeline for sentiment analysis.It is interesting to see how

(8)

Table6

ExperimentalresultsoftargetextractionwithBiLSTM-CRFonSemEval16task5 as-pectbasedsentimentanalysisdatasetsubtask1slot2.BestSystemreferstothe participationsystemwithbestperformancesubmittedtoSemEval16task5. Base-linereferstobaselinemodelprovidedbytheorganizers;Creferstothemodelonly usestheprovidedtrainingdata;Ureferstothemodelusesotherresources(e.g., publiclylexica)andadditionaldatafortraining;“-” referstonosubmissionswere made.%isomittedforconciseness.Thebestresultsarehighlightedinboldface.

Models English Spanish French Russian Dutch Turkish

BestSystem U 72.34 68.39 66.67 33.47 56.99 –

BestSystem C 66.91 68.52 65.32 30.62 51.78 –

Baseline C 44.07 51.91 45.46 49.31 50.64 41.86

BiLSTM-CRF C 72.44 71.70 73.50 67.08 64.29 63.76

thefirst stage of opinion target extraction impacts thefinal sen-timentclassification.Evaluationontargetextractionwith BiLSTM-CRFisafundamentalstepforthiswork.

Lample et al. (2016) reportedthatBiLSTM-CRFmodelobtained state-of-the-artperformance inNERtasksinfourlanguages with-out resorting to any language-speciﬁc knowledge or resources. Specially, in CoNLL-2002 dataset,it achieved85.75 and 81.74F1 scoreinSpanishandDutchNERtasks,respectively;InCoNLL-2003 dataset,itachieved90.94and78.76F1 scoreinEnglish and Ger-manNERtasks,respectively.

We have also conducted experiments with BiLSTM-CRF using theSemEval-2016task 5aspect based sentimentanalysis dataset (Pontiki et al., 2016 ).Thereare3subtasksinthistask,eachsubtask containsseveralslots.Wehaveconductedexperimentsonsubtask 1slot2:sentence-levelopiniontargetexpressionextraction,onthe restaurantsdomain.F1 scoreisused asmetric.The experimental resultsareshownin Table 6 .

In this table, for English, the best systems are NLANG (Toh & Su, 2016 ) (U) and UWB (Hercig, Brychcín, Svoboda, & Konkol, 2016 ) (C), respectively; For Spanish, GTI (Álvarez López, Juncal- Martínez, Fernández-Gavilanes, Costa-Montenegro, & González- Castaño, 2016 )achievesboththebestsystemsUandC;ForFrench, they are IIT-T (Kumar, Kohail, Kumar, Ekbal, & Biemann, 2016 ) (U) andXRCE(Brun, Perez, & Roux, 2016 ) (C); ForRussian,Danii achievesboth thebestsystemsUandC;ForDutch,theyareIIT-T (Kumar et al., 2016 )(U)andTGB(Çetin, Yıldırım, Özbey, & Eryi ˘git, 2016 )(C).

It is observedthat BiLSTM-CRFachieves thebest performance onall thedataset usingdifferentlanguages,andoutperformsthe othersby agoodmarginin5outof6languages.Itindicatesthat BiLSTM-CRFiseffectiveinopiniontargetexpressionextraction.

WehavealsoevaluatedtheperformanceofBiLSTM-CRFonthe MPQA dataset described in Section 4.1 . We randomly select 90% sentencesinMPQAdatasetfortrainingandtheremaining10% sen-tences for testing. BiLSTM-CRF achieves 20.73 F1 score on opin-ion target extraction. This is due to the complex nature of the datathatmanyopiniontargetsarenotsimplenamedentitiessuch asperson, organizationand location intypical NERtasks. Rather, theopinion targetscouldbe events,abstractnounsormulti-word phrases. For example, “overview of Johnson’s eccentric career” in sentence“AnengagingoverviewofJohnson’seccentriccareer.”. Tar-getnumberclassificationismuch easier.Itachieves65.83% accu-racy,whenweclassifythetestsentencesinto3groupsbythe tar-getnumbers extractedfrom them. These results show that even though the performance of the first step of our approach is not veryhigh,ourpipelineapproachstillachievesthestate-of-the-art resultsonmostbenchmarkingdatasets.Ifwecanimprovethe per-formanceofthesequencemodelforopiniontargetextraction,the finalsentimentclassificationperformanceofourapproachmaybe furtherimproved.

Wehavealsoconsideredusingotherexistingopiniontarget de-tectionsystems,whicharespeciﬁcallytrainedforthistask. Unfor-tunately,itisnotveryeasytoﬁndan applicableone.Some opin-iontargetdetectionsystems,such as Liu et al. (2014) ,canalsobe regardasNERmodels.

4.3.6. Erroranalysisforsentencetypeclassiﬁcation

We havealso done erroranalysis forsentence type classiﬁca-tion.Inthissection, we listsome resultexamplesfromthe Stan-fordsentiment treebank.The__O, __Band__Iconcatenatedafter eachwordarethelabelpredictedbyBiLSTM-CRF.

Easyexample1:Yet__Othe__Bact__Iis__Ostill__Ocharming__O here__O.__O.

Easy example 2: The__B-MPQA movie__I-MPQA is__O pretty__O funny__Onow__Oand__Othen__Owithout__Oin__O any__O way__Odemeaning__Oits__Osubjects__O.__O

Easyexample3:Chomp__Ochomp__O!__O.

Diﬃcultexample1:You__B’ll__Oprobably__Olove__Oit__B.__O

Diﬃcult example 2: This__B is__O n’t__O a__B new__I idea__I .__O.

Diﬃcultexample3:An__Oengaging__Ooverview__Oof__O John-son__O’s__Oeccentric__Ocareer__O.__O

It is observed that sentences with basic constituentelements (Easy example 1), even ifa litter longin length (Easy example2), are relatively easier for target extraction with BiLSTM-CRF. One reason is that in these two sentences, the targets (the art and

themovie) arecommonlyused nouns;Anotherreasonisthat the MPQA dataset, used for training BiLSTM-CRF model, is obtained fromnewssources.Newstextisusuallymorestructuredthanthe text from other sources, such as web reviews. Small imperative sentence(Easyexample3)isalsorelativelyeasierfortarget extrac-tion,becausemanyofthemarenon-targetsentences.

Sentences containingpronouns, such asyou and itin Difficult example1andthisinDifficultexample2,are relativelymore diffi-cultfortargetextractionwithBiLSTM-CRF.Moreover,complex tar-get,suchasoverviewofJohnson’seccentriccareerinDifficult exam-ple3,isalsoverydifficult.

Example sentence: Their computer-animated faces are very ex-pressive.

Result of CRF: Their__O computer-animated__Ofaces__B are__O very__Oexpressive__O.__O

Result of BiLSTM-CRF: Their__B computer-animated__I faces__I are__Overy__Oexpressive__O.__O

We have also analyzed examples in which BiLSTM-CRF de-tects opinion targets better than CRF. As shown above, CRF can onlyidentifyapartialopiniontarget(faces),whileBiLSTM-CRFcan identifythewholeopiniontarget moreaccurately(their computer-animatedfaces).

5. Conclusion

This paper has presented a novel approach to improve sentence-levelsentiment analysisvia sentence type classification. Theapproach employsBiLSTM-CRFtoextracttarget expressionin opinionated sentences, and classifies these sentences into three types according to the number of targets extracted from them. These three types of sentences are then used to train separate 1d-CNNsforsentimentclassification.Wehaveconductedextensive experimentsonfoursentence-levelsentimentanalysisdatasetsin comparisonwith11otherapproaches.Empiricalresultsshow that ourapproachachievesstate-of-the-artperformanceonthreeofthe fourdatasets.Wehavefoundthatseparatingsentencescontaining differentopiniontargetsbooststheperformance ofsentence-level sentimentanalysis.

(9)

In future work, we plan to explore other sequence learning modelsfortargetexpressiondetectionandfurtherevaluateour ap-proachonotherlanguagesandotherdomains.

Acknowledgment

This workis supported by the National Natural Science Foun- dation of China (No. 61370165, 61632011 ), National 863Program ofChina2015AA015405, ShenzhenFundamental ResearchFunding JCYJ20150625142543470, Shenzhen Peacock Plan Research Grant KQCX20140521144507925 and Guangdong Provincial Engineering TechnologyResearchCenterforDataScience2016KF09.

References

ÁlvarezLópez,T.,Juncal-Martínez,J.,Fernández-Gavilanes,M.,Costa-Montenegro,E., &González-Castaño,F.J.(2016).GTIatSemEval-2016task5:Svmandcrffor aspectdetectionandunsupervisedaspect-basedsentimentanalysis.In Proceed-ingsof the10thinternationalworkshop onsemanticevaluation (SemEval-2016)

(pp.306–311).SanDiego,California:AssociationforComputationalLinguistics.

URL:http://www.aclweb.org/anthology/S16-1049.

Appel,O.,Chiclana,F.,Carter,J.,&Fujita,H.(2016).Ahybridapproachtothe sen-timentanalysisproblematthe sentencelevel.Knowledge-BasedSystems,108, 110–124.

Baccianella, S., Esuli,A.,&Sebastiani, F.(2010). SentiWordNet3.0:An enhanced lexicalresourcefor sentimentanalysisandopinionmining.InProceedingsof theinternationalconferenceonlanguageresourcesandevaluation(LREC):vol.10 (pp.2200–2204).

Bengio,Y.,Ducharme,R.,Vincent,P.,&Jauvin,C.(2003).Aneuralprobabilistic lan-guagemodel.JournalofMachineLearningResearch,3,1137–1155.

Brun,C.,Perez,J.,&Roux,C.(2016).XRCEatSemEval-2016task5:Feedbacked en-semblemodelingonsyntactico-semanticknowledgeforaspectbasedsentiment analysis.InProceedingsofthe10thinternationalworkshoponsemanticevaluation (SemEval-2016)(pp.277–281).SanDiego,California:Associationfor Computa-tionalLinguistics. URL:http://www.aclweb.org/anthology/S16-1044.

Carrillo-deAlbornoz, J.,&Plaza,L.(2013).Anemotion-basedmodelofnegation, intensiﬁers,andmodalityforpolarityandintensityclassiﬁcation.Journalofthe AmericanSocietyforInformationScienceandTechnology,64(8),1618–1633.

Çetin,F.S.,Yıldırım,E.,Özbey,C.,&Eryi˘git,G.(2016).TGBatSemEval-2016task 5:Multi-lingualconstraintsystemforaspectbasedsentimentanalysis.In Pro-ceedingsofthe10thinternationalworkshoponsemanticevaluation(SemEval-2016)

(pp.337–341).SanDiego,California:AssociationforComputationalLinguistics.

URL:http://www.aclweb.org/anthology/S16-1054.

Chaturvedi,I.,Ong,Y.-S.,Tsang,I.W.,Welsch,R.E.,&Cambria,E.(2016).Learning worddependenciesintextbymeansofadeeprecurrentbeliefnetwork. Knowl-edge-BasedSystems,108,144–154.

Collobert,R.,Weston,J.,Bottou,L.,Karlen,M.,Kavukcuoglu,K.,&Kuksa,P.(2011). Naturallanguageprocessing(almost)fromscratch.TheJournalofMachine Learn-ingResearch,12,2493–2537.

Crystal,D.(2011).Dictionaryoflinguisticsandphonetics:vol.30.JohnWiley&Sons. Dong,L.,Wei,F.,Tan,C.,Tang,D.,Zhou,M.,&Xu,K.(2014).Adaptiverecursive neu-ralnetworkfortarget-dependenttwittersentimentclassiﬁcation.InProceedings ofthe52ndannualmeetingoftheassociationforcomputationallinguistics(ACL) (pp.49–54).Baltimore,Maryland:AssociationforComputationalLinguistics. Elman,J.L.(1990).Findingstructureintime.CognitiveScience,14(2),179–211. Fernández-Gavilanes, M., Álvarez-López, T., Juncal-Martínez, J.,

Costa-Montene-gro,E.,&González-Castaño,F.J. (2016).Unsupervised methodforsentiment analysisinonlinetexts.ExpertSystemswithApplications,58,57–75.

Ganapathibhotla,M.,&Liu,B.(2008).Miningopinionsincomparativesentences. InProceedingsofthe22nd internationalconferenceoncomputationallinguistics (COLING):vol.1(pp.241–248).AssociationforComputationalLinguistics. González-Ibánez,R., Muresan,S., &Wacholder,N. (2011).Identifying sarcasmin

twitter:acloser look.InProceedingsofthe 49thannual meetingofthe asso-ciationforcomputationallinguistics:Humanlanguage technologies(ACL):vol.2 (pp.581–586).AssociationforComputationalLinguistics.

Graves,A.,Mohamed,A.-r.,&Hinton,G.(2013).Speechrecognitionwithdeep re-currentneuralnetworks.InIEEEInternationalconferenceonacoustics,speechand signalprocessing(ICASSP)(pp.6645–6649).IEEE.

Hatzivassiloglou,V.,&Wiebe,J.M.(2000).Effectsofadjectiveorientationand grad-abilityonsentencesubjectivity.InProceedingsofthe18thconferenceon compu-tationallinguistics(COLING):vol.1(pp.299–305).AssociationforComputational Linguistics.

Hercig,T., Brychcín, T.,Svoboda, L., &Konkol,M. (2016).UWBatSemEval-2016 task5:Aspectbasedsentimentanalysis.InProceedingsofthe10thinternational workshoponsemanticevaluation(SemEval-2016)(pp.342–349).SanDiego, Cali-fornia:AssociationforComputationalLinguistics. URL:http://www.aclweb.org/ anthology/S16-1055.

Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdi-nov,R.R. (2012).Improvingneuralnetworks bypreventingco-adaptation of feature detectors.ComputingResearchRepository (CoRR),abs/1207.0580. URL:

http://arxiv.org/abs/1207.0580.

Hochreiter,S.,&Schmidhuber,J.(1997).Longshort-termmemory.Neural Computa-tion,9(8),1735–1780.

Hu,M.,&Liu,B.(2004).Miningand summarizingcustomerreviews.In Proceed-ingsofthetenthACMSIGKDDinternationalconferenceonknowledgediscoveryand datamining(KDD)(pp.168–177).ACM.

Huang,Z.,Xu,W.,&Yu,K.(2015).Bidirectionallstm-crfmodelsforsequence tag-ging.arXivpreprintarXiv:1508.01991.

Irsoy,O.,&Cardie,C. (2014a).Deeprecursiveneuralnetworksfor compositional-ityinlanguage.InAdvancesinneuralinformationprocessingsystems27:annual conferenceonneuralinformationprocessingsystems(NIPS)(pp.2096–2104). Irsoy,O.,&Cardie,C. (2014b).Opinionminingwithdeeprecurrentneuralnetworks.

InProceedingsofthe2014conferenceonempiricalmethodsinnaturallanguage processing(EMNLP)(pp.720–728).

Jia,L.,Yu,C.,&Meng,W.(2009).Theeffectofnegationonsentimentanalysisand retrievaleffectiveness.InProceedingsofthe18thACMconferenceoninformation andknowledgemanagement(CIKM)(pp.1827–1830).ACM.

Jiang, L., Yu,M., Zhou, M., Liu,X., & Zhao,T. (2011). Target-dependent twitter sentimentclassiﬁcation.InProceedingsofthe49thannualmeetingofthe asso-ciationforcomputationallinguistics: Humanlanguagetechnologies(ACL):vol.1 (pp.151–160).AssociationforComputationalLinguistics.

Jindal,N.,&Liu,B. (2006a).Identifyingcomparativesentencesintextdocuments. InProceedingsofthe29thannualinternationalACMSIGIRconferenceonresearch anddevelopmentininformationretrieval(SIGIR)(pp.244–251).ACM.

Jindal, N., & Liu, B. (2006b). Mining comparative sentences and relations. In Proceedings, the 21st national conference on artiﬁcial intelligence and the 18thinnovative applications of artiﬁcial intelligence conference (AAAI): vol. 22 (pp.1331–1336).

Kalchbrenner,N.,Grefenstette,E.,&Blunsom,P.(2014).Aconvolutionalneural net-workformodellingsentences.InProceedingsofthe52ndannualmeetingofthe associationforcomputationallinguistics (ACL)(pp. 655–665).Baltimore, Mary-land:AssociationforComputationalLinguistics.

Kessler,W.,&Kuhn,J.(2014).Acorpusofcomparisonsinproductreviews.InIn pro-ceedingsofthe9thlanguageresourcesandevaluationconference(LREC),Reykjavik, Iceland(pp.2242–2248). URL:http://www.lrec-conf.org/proceedings/lrec2014/ pdf/1001_Paper.pdf.

Kim,Y.(2014).Convolutionalneuralnetworksforsentenceclassiﬁcation.In Proceed-ingsofthe2014conferenceonempiricalmethodsinnaturallanguageprocessing (EMNLP)(pp.1746–1751).

Kumar, A., Kohail, S., Kumar, A., Ekbal, A., & Biemann, C. (2016). IIT-TUDA at SemEval-2016task 5:Beyond sentimentlexicon: Combiningdomain depen-dencyanddistributionalsemanticsfeaturesforaspectbasedsentiment anal-ysis.InProceedingsofthe 10thinternationalworkshoponsemanticevaluation (SemEval-2016)(pp.1129–1135).SanDiego,California:Associationfor Compu-tationalLinguistics. URL:http://www.aclweb.org/anthology/S16-1174.

Lafferty,J.,McCallum,A.,&Pereira,F.C.(2001).Conditionalrandomﬁelds: Proba-bilisticmodelsforsegmentingandlabelingsequencedata.InProceedingsofthe eighteenthinternationalconferenceonmachinelearning(ICML),WilliamsCollege, Williamstown,MA,USA,June28,-July1,2001(pp.282–289).

Lample,G.,Ballesteros,M.,Subramanian,S.,Kawakami,K.,&Dyer,C.(2016).Neural architecturesfornamedentityrecognition.arXivpreprintarXiv:1603.01360.

Le,Q.,&Mikolov,T.(2014).Distributedrepresentationsofsentencesanddocuments. InProceedingsofthe 31thinternationalconferenceonmachinelearning(ICML) (pp.1188–1196).

Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers. doi:10.2200/ S00416ED1V01Y201204HLT016.

Liu,B.(2015). Sentimentanalysis:Miningopinions,sentiments,andemotions. Cam-bridgeUniversityPress.

Liu,P.,Qiu,X.,&Huang,X.(2016).Recurrentneuralnetworkfortextclassiﬁcation withmulti-tasklearning.arXivpreprintarXiv:1605.05101.

Liu,K.,Xu,L.,&Zhao,J.(2014).Extractingopiniontargetsandopinionwordsfrom onlinereviewswithgraphco-ranking.InProceedingsofthe52ndannual meet-ingoftheassociationforcomputationallinguistics(ACL)(pp.314–324).Baltimore, Maryland:AssociationforComputationalLinguistics.

Liu,Y.,Yu,X.,Chen,Z.,&Liu,B.(2013).Sentimentanalysisofsentenceswith modal-ities.InProceedingsofthe2013internationalworkshoponminingunstructuredbig datausingnaturallanguageprocessing(UnstructureNLP@CIKM)(pp.39–44).ACM. McCallum,A.(2003).Efficientlyinducingfeaturesofconditionalrandomfields.In Proceedingsofthe19thconferenceonuncertaintyinartificialintelligence (UAI) (pp.403–410).MorganKaufmann.

Mitchell,M.,Aguilar,J.,Wilson,T.,&Durme,B.V.(2013).Opendomaintargeted sentiment.InProceedingsofthe2013conferenceonempiricalmethodsinnatural languageprocessing,EMNLP2013,18–21October2013,GrandHyattSeattle, Seat-tle,Washington,USA,AmeetingofSIGDAT, aSpecialInterestGroupoftheACL (pp.1643–1654).

Muhammad,A.,Wiratunga,N.,&Lothian,R.(2016).Contextualsentimentanalysis forsocialmediagenres.Knowledge-BasedSystems,108,92–101.

Nakagawa,T., Inui,K.,&Kurohashi,S. (2010). Dependencytree-basedsentiment classiﬁcationusingCRFswithhiddenvariables.InHumanlanguagetechnologies: The2010annualconferenceoftheNorthAmericanchapteroftheassociationfor computationallinguistics(NAACL)(pp.786–794).AssociationforComputational Linguistics.

Narayanan,R.,Liu,B.,&Choudhary,A.(2009).Sentimentanalysisofconditional sentences.InProceedingsofthe2009conferenceonempiricalmethodsinnatural languageprocessing(EMNLP):vol.1(pp.180–189).AssociationforComputational Linguistics.

(10)

Okazaki,N. (2007).Crfsuite:afast implementationofconditionalrandomﬁelds (CRFs).

Pang,B.,&Lee,L.(2004).Asentimentaleducation:Sentimentanalysisusing subjec-tivitysummarizationbasedonminimumcuts.InProceedingsofthe42ndannual meetingonassociationforcomputationallinguistics(ACL)(pp.271–278). Associa-tionforComputationalLinguistics.

Pang,B.,&Lee,L.(2005).Seeingstars:Exploitingclassrelationshipsforsentiment categorizationwithrespecttoratingscales.InProceedingsofthe43rdannual meetingonassociationforcomputationallinguistics(ACL)(pp.115–124). Associa-tionforComputationalLinguistics.

Pang,B.,Lee,L.,&Vaithyanathan,S.(2002).Thumbsup?:sentimentclassiﬁcation usingmachinelearningtechniques.InProceedingsofthe2002conferenceon em-piricalmethodsinnaturallanguageprocessing(EMNLP)(pp.79–86).

Park,M.,&Yuan,Y.(2015).Linguisticknowledge-drivenapproachtochinese com-parativeelementsextraction.InProceedingsofthe8thSIGHANworkshopon chi-neselanguageprocessing(SIGHAN-8)(pp.79–85).AssociationforComputational Linguistics.

Pontiki,M.,Galanis,D.,Papageorgiou,H.,Androutsopoulos,I.,Manandhar,S., AL-S-madi,M.,Al-Ayyoub,M.,Zhao,Y.,Qin,B.,Clercq,O.D.,Hoste,V.,Apidianaki,M., Tannier, X., Loukachevitch,N., Kotelnikov,E., Bel, N., Jimnez-Zafra, S. M., & Eryiit,G.(2016).SemEval-2016task5:Aspectbasedsentimentanalysis.In Pro-ceedingsofthe10thinternationalworkshoponsemanticevaluation(SemEval).In SemEval’16 (pp.19–30).SanDiego, California:Association forComputational Linguistics.

Popescu,A.,&Etzioni,O.(2005).Extractingproductfeaturesand opinionsfrom reviews.InProceedingsofthe2005,humanlanguagetechnologyconferenceand conference on empirical methods in natural language processing (HLT/EMNLP) (pp.9–28).Springer.

Poria,S., Cambria, E., &Gelbukh, A.(2016). Aspect extraction for opinion min-ingwithadeepconvolutionalneuralnetwork.Knowledge-BasedSystems,108, 42–49.

Poria,S.,Cambria,E.,Winterstein,G.,&Huang,G.-B.(2014).Senticpatterns: Depen-dency-basedrulesfor concept-levelsentimentanalysis. Knowledge-Based Sys-tems,69,45–63.

Qiu,G., Liu, B., Bu, J., & Chen, C. (2011). Opinion word expansion and target extractionthrough doublepropagation.ComputationalLinguistics, 37(1),9–27. doi:10.1162/coli_a_00034.

Ravi,K.,&Ravi,V.(2015). Asurveyonopinionmining andsentimentanalysis: tasks,approachesandapplications.Knowledge-BasedSystems,89,14–46. Rill,S.,Reinel,D.,Scheidt,J.,&Zicari,R.V.(2014).Politwi:Earlydetectionof

emerg-ingpoliticaltopicsontwitterandtheimpactonconcept-levelsentiment anal-ysis.Knowledge-BasedSystems,69,24–33.

Riloff,E.,Qadir,A.,Surve,P.,DeSilva,L.,Gilbert,N.,&Huang,R.(2013).Sarcasmas contrastbetweenapositivesentimentandnegativesituation..InProceedingsof the2013conferenceonempiricalmethodsonnaturallanguageprocessing(EMNLP) (pp.704–714).AssociationforComputationalLinguistics.

Riloff,E.,&Wiebe,J.(2003).Learningextractionpatternsforsubjectiveexpressions. InProceedingsofthe2003conferenceonempiricalmethodsinnaturallanguage processing(emnlp)(pp.105–112).AssociationforComputationalLinguistics. Schuster,M.,&Paliwal,K.K.(1997).Bidirectionalrecurrentneuralnetworks.IEEE

TransactionsonSignalProcessing,45(11),2673–2681.

Socher,R.,Huval,B.,Manning,C.D.,&Ng,A.Y.(2012).Semanticcompositionality throughrecursivematrix-vectorspaces.InProceedingsofthe2012joint confer-enceonempiricalmethodsinnaturallanguageprocessingandcomputational nat-urallanguagelearning(EMNLP-CoNLL)(pp.1201–1211).ACL.

Socher,R.,Pennington,J.,Huang,E.H.,Ng,A.Y.,&Manning,C.D.(2011). Semi-su-pervisedrecursiveautoencodersforpredictingsentimentdistributions.In Pro-ceedingsofthe conferenceonempiricalmethodsinnaturallanguage processing (pp.151–161).AssociationforComputationalLinguistics.

Socher, R., Perelygin, A., Wu, J. Y., Chuang, J., Manning, C. D., Ng, A. Y., & Potts, C.(2013).Recursivedeep modelsforsemanticcompositionalityover a sentimenttreebank.InProceedingsofthe2013conferenceonempiricalmethods innaturallanguageprocessing(EMNLP)(pp.1631–1642).Citeseer.

Stoyanov,V.,&Cardie,C.(2008).Topicidentiﬁcationforﬁne-grainedopinion anal-ysis.InProceedingsofthe22ndinternationalconferenceoncomputational linguis-tics(COLING):vol.1(pp.817–824).AssociationforComputationalLinguistics.

Tai, K. S.,Socher, R., & Manning, C. D. (2015). Improved semantic representa-tionsfromtree-structured longshort-termmemory networks.arXivpreprint arXiv:1503.00075.

Tang,D.,Qin,B.,Feng,X.,&Liu,T.(2015a).Target-dependentsentiment classiﬁca-tionwithlongshorttermmemory.arXivpreprintarXiv:1512.01100.

Tang,D.,Qin,B.,&Liu,T.(2015b). Deeplearningforsentimentanalysis:Successful approachesandfuturechallenges.WileyInterdisciplinaryReviews:DataMining andKnowledgeDiscovery,5(6),292–303.doi:10.1002/widm.1171. URL:http:// dx.doi.org/10.1002/widm.1171.

Tang,D., Qin,B.,Wei,F.,Dong, L., Liu,T.,&Zhou,M.(2015c). Ajoint segmen-tationandclassiﬁcationframeworkforsentencelevelsentimentclassiﬁcation.

IEEE/ACMTransactionsonAudio,Speech,andLanguageProcessing,23(11),1750– 1761.doi:10.1109/TASLP.2015.2449071.

Toh,Z.,&Su,J.(2016).NLANGPatSemEval-2016task5:Improvingaspectbased sentimentanalysisusingneuralnetworkfeatures.InProceedingsofthe10th in-ternationalworkshoponsemanticevaluation(SemEval-2016)(pp.282–288).San Diego,California:AssociationforComputationalLinguistics. URL:http://www. aclweb.org/anthology/S16-1045.

Tsur,O.,Davidov,D.,&Rappoport,A.(2010).Icwsm-agreatcatchyname: Semi-su-pervisedrecognition ofsarcasticsentencesinonlineproductreviews.In Pro-ceedingsofthe4thinternationalconferenceonweblogsandsocialmedia(ICWSM) (pp.162–169).

Turney,P.D.(2002).Thumbsuporthumbsdown?:semanticorientationappliedto unsupervisedclassiﬁcationofreviews.InProceedingsofthe40thannualmeeting onassociationforcomputationallinguistics(acl)(pp.417–424).Associationfor ComputationalLinguistics.

Vo,D.-T.,&Zhang,Y.(2015).Target-dependenttwittersentimentclassiﬁcationwith richautomatic features.InProceedingsofthe twenty-fourthinternationaljoint conferenceonartiﬁcialintelligence(IJCAI)(pp.1347–1353).

Wang,S.,&Manning,C.D.(2012).Baselinesandbigrams:Simple,goodsentiment andtopicclassiﬁcation.InProceedingsofthe50thannualmeetingofthe associa-tionforcomputationallinguistics(ACL):vol.2(pp.90–94).Associationfor Com-putationalLinguistics.

Wiebe,J.M.,Bruce,R.F.,&O’Hara,T.P.(1999).Developmentanduseofa gold-s-tandarddatasetforsubjectivityclassiﬁcations.InProceedingsofthe37thannual meetingoftheassociationforcomputationallinguistics(ACL)(pp.246–253). As-sociationforComputationalLinguistics.

Wiebe, J., &Wilson, T. (2002). Learning to disambiguate potentially subjective expressions.InProceedingsofthe 6thconferenceonnaturallanguagelearning (CoNLL)(pp.1–7).AssociationforComputationalLinguistics.

Wiebe,J.,Wilson,T.,&Cardie,C.(2005).Annotatingexpressionsofopinionsand emotionsinlanguage.LanguageResourcesandEvaluation,39(2–3),165–210. Yang,S.,&Ko,Y.(2011).Extractingcomparativeentitiesandpredicatesfromtexts

usingcomparativetypeclassiﬁcation. InProceedingsofthe49thannual meet-ingoftheassociationforcomputationallinguistics:Humanlanguagetechnologies (ACL):vol.1(pp.1636–1644).AssociationforComputationalLinguistics. Yu,H.,&Hatzivassiloglou,V.(2003).Towardsansweringopinionquestions:

Sepa-ratingfactsfromopinionsandidentifyingthepolarityofopinionsentences.In Proceedingsofthe2003conferenceonempiricalmethodsinnaturallanguage pro-cessing(EMNLP)(pp.129–136).AssociationforComputationalLinguistics. Zhang,L.,Ferrari,S.,&Enjalbert,P.(2012).Opinionanalysis:theeffectofnegation

onpolarityandintensity.InKONVENSworkhopPATHOS-1stworkshoponpractice andtheoryofopinionminingandsentimentanalysis(pp.282–290).

Zhang,M.,Zhang,Y.,&Vo,D.-T.(2015).Neuralnetworksforopendomaintargeted sentiment.InProceedingsofthe2015conferenceonempiricalmethodsinnatural languageprocessing(EMNLP)(pp.612–621).

221–230

ScienceDirect

www.elsevier.com/locate/eswa

(

https://www.cs.cornell.edu/people/pabo/movie-review-data/

http://nlp.stanford.edu/sentiment/

https://www.cs.uic.edu/

http://mpqa.cs.pitt.edu/

5

http://www.aclweb.org/anthology/S16-1049

110–124

2200–2204)

1137–1155

1618–1633

144–154

2493–2537

30.

49–54).

179–211

57–75

241–248).

581–586).

6645–6649).

299–305).

Cali-fornia:

http://arxiv.org/abs/1207.0580

1735–1780

168–177).

1508.01991

2096–2104)

720–728)

1827–1830).

151–160).

244–251).

1331–1336)

655–665).

2242–2248)

1746–1751)

282–289)

1603.01360

1188–1196)

10.2200/

Liu,

1605.05101

314–324).

39–44).

403–410).

1643–1654)

92–101

786–794).

180–189).

271–278).

115–124).

79–86)

79–85).

19–30).

Popescu,

42–49

45–63

034

14–46

24–33

704–714).