Dynamic joint sentiment-topic model

Social media data are produced continuously by a large and uncontrolled number of users. The dynamic nature of such data requires the sentiment and topic analysis model to be dynamically updated as well, capturing the most recent language use of sentiments and topics in text. We propose a dynamic joint sentiment-topic model (dJST) which allows the detection and tracking of views of current and recurrent interests and shifts in topic and sentiment. Both topic and sentiment dynamics are captured by assuming that the current sentiment-topic specific word distributions are generated according to the word distributions at previous epochs. We study three different ways of accounting for such dependency information: (1) the sliding window model, where the current sentiment-topic-word distributions are dependent on the previous sentiment-topic specific word distributions in the last S epochs; (2) the skip model, where historical sentiment-topic-word distributions are considered by skipping some epochs in between; and (3) the multiscale model, where previous long- and short-timescale distributions are both taken into consideration. We derive efficient online inference procedures to sequentially update the model with newly arrived data and show the effectiveness of our proposed model on Mozilla add-on reviews crawled between 2007 and 2011.
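The sliding-window dependency described above can be sketched as a weighted mixture of the previous S epochs' sentiment-topic word distributions, which then serves as the prior for the current epoch. A minimal sketch (function and variable names are ours, and the particular weighting is an assumption; the paper's exact scheme may differ):

```python
def sliding_window_prior(history, weights):
    """history: the sentiment-topic word distributions (dicts mapping
    word -> probability) of the last S epochs, oldest first;
    weights: S non-negative mixing weights, e.g. decaying with age."""
    total = sum(weights)
    prior = {}
    for dist, w in zip(history, weights):
        for word, p in dist.items():
            prior[word] = prior.get(word, 0.0) + (w / total) * p
    return prior

# Two epochs of a toy (sentiment, topic)-specific word distribution;
# the heavier weight on the second epoch favours recent language use.
history = [{"slow": 0.7, "crash": 0.3}, {"slow": 0.2, "crash": 0.8}]
prior = sliding_window_prior(history, weights=[1.0, 3.0])
```

The skip and multiscale variants differ only in which historical distributions enter `history` and how the weights are set.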

Research paper on Sentiment Analyzer by using a Supervised Joint Topic Modeling Approach

For example, when people read a product review, they often care about which specific aspects of the product are commented on, and what sentiment orientations (e.g., positive or negative) have been expressed on those aspects. Instead of employing the bag-of-words representation typically adopted for processing ordinary text documents, the review is represented in the form of opinion pairs, where each opinion pair consists of an aspect term and a related opinion word in the review. We propose a novel supervised joint aspect and sentiment model (SJASM), which can cope with the overall and aspect-based sentiment analysis problems in one go under a unified framework. Probabilistic topic models, notably latent Dirichlet allocation (LDA) [8], have been widely used for analyzing the semantic topical structure of text data. On top of the basic LDA, we introduce an additional aspect-level sentiment identification layer and construct a probabilistic joint aspect and sentiment framework to model the textual bag-of-opinion-pair data.
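The bag-of-opinion-pairs representation can be illustrated with a toy extractor. The pairing heuristic and the mini-lexicons below are our assumptions for illustration only; they are not SJASM's actual extraction method:

```python
# Hypothetical mini-lexicons; SJASM does not prescribe these.
ADJ = {"great", "poor", "noisy"}       # opinion words
NOUN = {"battery", "screen", "price"}  # aspect terms

def opinion_pairs(tokens):
    """Pair each opinion word with the next aspect term to its right."""
    pairs, pending = [], None
    for tok in tokens:
        if tok in ADJ:
            pending = tok
        elif tok in NOUN and pending is not None:
            pairs.append((tok, pending))
            pending = None
    return pairs

opinion_pairs("great battery but noisy screen".split())
# → [("battery", "great"), ("screen", "noisy")]
```

A review is then the multiset of such (aspect, opinion) pairs rather than a bag of individual words.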

Multi-task learning with mutual learning for joint sentiment classification and topic detection

in the NN-based text classifier as illustrated in Figure 2(b). In order to explain our key idea better, we give an example in Figure 3 to illustrate the word-level attention weights generated from a traditional sentiment classification model and our proposed topic-sentiment mutual learning. We can observe that the traditional sentiment classification model puts a higher attention weight on the polarity word ‘incredible’. But with topic-sentiment mutual learning, higher attention weights are not only put on the polarity word ‘incredible’, but also on the associated topic indicated by the words ‘acting’ and ‘movie’. We argue that mutual learning would benefit sentiment classification since it enriches the information required for the training of the sentiment classifier (e.g., when the word ‘incredible’ is used to describe ‘acting’ or ‘movie’, the polarity should be positive). At the same time, mutual learning could also benefit topic model learning since it encourages the clustering of words which are not only relevant under a similar topic but are also linked with a similar polarity.

Topic Modeling with Sentiment Clues and Relaxed Labeling Schema

There are several works that simultaneously modeled topic and sentiment. Mei et al. (2007) proposed the Topic Sentiment Mixture (TSM) model, a multinomial mixture model that mixes topic models and a sentiment model. Lin et al. (2012) proposed the joint sentiment-topic model (JSTM), which extends LDA to jointly model topic and sentiment. Jo and Oh (2011) proposed the Aspect and Sentiment Unification Model (ASUM), which adapts LDA to model aspect and sentiment pairs. Titov and McDonald (2008a) proposed the Multi-Aspect Sentiment (MAS) model, which models topics with observed aspect ratings and latent overall sentiment ratings. Blei and McAuliffe (2007) proposed supervised LDA (sLDA), which can handle sentiments as observed labels. Our method differs from the TSM model, JSTM, and ASUM, since these models handle sentiments as latent variables. The MAS model and sLDA utilize sentiments explicitly, as in our method. However, unlike the relaxed labeling schema of our method, they have not presented a technique specialized for non-strict labels.

A Dynamic Bayesian Network Approach for Analysing Topic-Sentiment Evolution

On the other hand, topic-level or hashtag-level sentiment analysis, which detects the overall or general sentiment tendency towards topics, is also important in many scenarios [4], [6]. For example, people's overall sentiment tendency on the topic ‘‘#Brexit’’ on Twitter is an important indicator of the outcome of political events. As another example, the topic-level sentiment orientation towards a new product ‘‘#iphone10’’ can be used to predict the sales of this new phone model [4], and even the stock price of the relevant company [8]. However, people are not only interested in the sentiment tendency of a single topic, but also want to know how the sentiment of a topic has been influenced by other topics over time; for example, how the sentiment of the topic ‘‘#Brexit’’ has been influenced by other related topics. In all these scenarios, comprehensive topic-level sentiment dynamics analysis is needed.

Sentiment Independent Topic Detection in Rated Hospital Reviews

Much work has been done on developing joint topic-sentiment models, usually to improve sentiment detection. Lin and He (2009) propose a method based on LDA that explicitly deals with the interaction of topics and sentiments in text. However, their goal is exactly opposite to ours: they exploit the fact that the topic distribution is different for positive and negative documents, and in fact use the polarity of topics to enhance sentiment detection, which is the main goal of their efforts. Thus the algorithm is encouraged to find topics that have a high sentiment bias. The joint topic-sentiment model of Eguchi and Lavrenko (2006) goes in the same direction: they optimize sentiment detection using the fact that the polarity of words depends on the topic. The paper of Maas et al. (2011) also follows this general direction. Paul and Dredze (2012) propose a multidimensional model with word distributions for each topic-sentiment combination. This model was used to analyze patient reviews by Wallace et al. (2014).

A Joint Model for Chinese Microblog Sentiment Analysis

Topic-based sentiment analysis for Chinese microblogs aims to identify the user attitude on specified topics. In this paper, we propose a joint model incorporating Support Vector Machines (SVM) and a deep neural network to improve the performance of sentiment analysis. Firstly, an SVM classifier is constructed using N-gram, N-POS and sentiment lexicon features. Meanwhile, a convolutional neural network is applied to learn paragraph representation features as the input of another SVM classifier. The classification results output by these two classifiers are merged as the final classification results. The evaluations on the SIGHAN-8 topic-based Chinese microblog sentiment analysis task show that our proposed approach achieves the second rank on micro average F1 and the fourth rank on macro average F1 among a total of 13 submitted systems.
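The excerpt does not spell out how the two classifiers' outputs are merged. One common choice, assumed here for illustration and not necessarily the authors' rule, is to average per-label decision scores and pick the top label:

```python
def merge_scores(scores_a, scores_b):
    """scores_a, scores_b: dicts mapping label -> decision score from
    the two classifiers; returns the label with the highest mean score."""
    labels = set(scores_a) | set(scores_b)
    return max(labels,
               key=lambda l: (scores_a.get(l, 0.0) + scores_b.get(l, 0.0)) / 2)

merge_scores({"pos": 0.8, "neg": 0.2}, {"pos": 0.4, "neg": 0.6})  # → "pos"
```

Other merge rules (majority vote, stacking a third classifier on the two score vectors) fit the same interface.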

A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

it is difficult to deduce on the basis of local lexical features alone that the opinion about the view is negative. The clause let’s not talk about the view could by itself be neutral or even positive given the right context (e.g., I’ve never seen such a fancy hotel room, my living room doesn’t look that cool... and let’s not talk about the view). However, the contrast relation signaled by the connective but makes it clear that the second clause has a negative polarity. The same observations can be made about transitions between aspects: changes in aspect are often clearly marked by discourse connectives. Importantly, some of these cues are not discourse connectives in the strict linguistic sense and are specific to the review domain (e.g., the phrase I would also in a review indicates that the topic is likely to be changed). In order to accurately predict sentiment and topic, a model needs to account for such discourse cues.

What Affects Patient (Dis)satisfaction? Analyzing Online Doctor Ratings with a Joint Topic-Sentiment Model

Figure 1: Mean absolute errors (markers) and ranges (vertical lines) over five folds with respect to predicting the sentiment scores of held-out reviews for three aspects (staff, helpfulness and knowledgeability). B: f-LDA without priors; W: priors over words; R: priors on ratings; WR: priors over words and ratings. Results include 3, 6 and 9 topics (x-axis). Top row: predictions made using only features representing the inferred distribution over (topic, sentiment) pairs; the baseline corresponds to simply predicting the observed mean score for each aspect. Bottom row: adding bag-of-words (BoW) features; we also show results using the standard BoW representation (with no topic information). Results for each model show the performance achieved when the inferred topic distributions are added to the BoW representations.

A neural generative model for joint learning topics and topic-specific word embeddings

There have also been Bayesian extensions of the Skip-Gram models for word embedding learning. Barkan (2017) inherited the probabilistic generative line while extending the Skip-Gram by placing a Gaussian prior on the parameterized word vectors. The parameters were estimated via variational inference. In a similar vein, Rios et al. (2018) proposed to generate words in bilingual parallel sentences by shared hidden semantics. They introduced a latent index variable to align the hidden semantics of a word in the source language to its equivalent in the target language. More recently, Bražinskas et al. (2018) proposed the Bayesian Skip-Gram (BSG) model, in which each word type, with its related word senses collapsed, is associated with a ‘prior’ or static embedding; then, depending on the context, the representation of each word is updated by a ‘posterior’ or dynamic embedding. Through Bayesian modeling, BSG is able to learn context-dependent word embeddings. It does not explicitly model topics, however. In our proposed JTW, global topics are shared among all documents and learned from data. Also, whereas BSG only models the generation of context words given a pivot word, JTW explicitly models the generation of both the pivot word and the context words with different generative routes.

A Joint Model of Text and Aspect Ratings for Sentiment Summarization

In its simplest form, MAS introduces a classifier for each aspect, which is used to predict its rating. Each classifier is explicitly associated to a single topic in the model, and only words assigned to that topic can participate in the prediction of the sentiment rating for the aspect. However, it has been observed that ratings for different aspects can be correlated (Snyder and Barzilay, 2007), e.g., a very negative opinion about room cleanliness is likely to result not only in a low rating for the aspect rooms, but is also very predictive of low ratings for the aspects service and dining. This complicates discovery of the corresponding topics, as in many reviews the most predictive features for an aspect rating might correspond to another aspect. Another problem with this overly simplistic model is the presence of opinions about an item in general without referring to any particular aspect. For example, “this product is the worst I have ever purchased” is a good predictor of low ratings for every aspect. In such cases, non-aspect ‘background’ words will appear to be the most predictive. Therefore, the use of aspect sentiment classifiers based only on the words assigned to the corresponding topics is problematic. Such a model will not be able to discover coherent topics associated with each aspect, because in many cases the most predictive fragments for each aspect rating will not be the ones where this aspect is discussed.

A Joint Sentiment Target Stance Model for Stance Classification in Tweets

SemEval 2016 Task 6 organizers (Mohammad et al., 2016) released a joint stance and sentiment annotated dataset. Studying the correlation between sentiment and stance, and how the former can help detect the latter, is an important research question that we address in this paper. Our approach relies on one observation for stance detection in tweets: ignoring general words and stopwords, much of the time we can expect a rough dichotomy on the remaining n-grams of the tweets. Concretely, a stance-related n-gram either refers to a topic related to the target or bears a sentiment. In Table 1, Christian, religion, Feminism, and campaign are of the first type, while murder and enjoyed are of the second type. We design the model such that the probability of a stance y given the text x, and its associated target t and

BeamSeg: A Joint Model for Multi Document Segmentation and Topic Identification

We propose BeamSeg, a Bayesian unsupervised topic modeling approach to breaking documents into coherent segments while identifying similar topics. The generative process assumes that segments can share the same topic and, consequently, are generated from the same lexical distribution. Lexical cohesion is achieved by having higher segmentation likelihoods when the probability mass is concentrated in a narrow subset of words. This is in the same spirit as topic modeling approaches such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003), but here the inherent topics are constrained to the linear discourse structure. To model interactions between lexical distributions, we use a dynamic prior, which assumes that the word probabilities change smoothly across topics. To model segment length characteristics, we assign prior variables conditioned on document modality. The linear segmentation constraint has been used to make inference tractable by exhaustively exploring the segmentation space to obtain the exact maximum-likelihood estimation (Eisenstein and Barzilay, 2008). Given a multi-document setting, this is not feasible, as segments can share topics. We address this issue using a beam search algorithm, which allows the inference procedure to recover from early mistakes. In our experiments, we show that BeamSeg is able to perform well when segmenting learning materials, where previously single-document models obtained better results (Mota et al., 2018). We also observe that topic identification is more accurately determined in a joint model, as opposed to a pipeline approach (performing the tasks sequentially), indicating that both problems should be modeled simultaneously.
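The cohesion criterion above, a higher segmentation likelihood when probability mass concentrates in a narrow subset of words, can be illustrated with a simple concentration score. Negative entropy is our stand-in here; BeamSeg's actual segmentation likelihood is a Bayesian quantity, not this score:

```python
import math

def concentration(dist):
    """Negative entropy of a word distribution: higher (closer to 0)
    when the probability mass sits in a narrow subset of words."""
    return sum(p * math.log(p) for p in dist.values() if p > 0)

cohesive = {"neural": 0.5, "network": 0.5}                # narrow segment
scattered = {"neural": 0.25, "network": 0.25,
             "cooking": 0.25, "weather": 0.25}            # mixed segment
concentration(cohesive) > concentration(scattered)        # → True
```

A segmenter following this intuition prefers boundaries that split the scattered distribution into two cohesive ones.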

Exploring Joint Neural Model for Sentence Level Discourse Parsing and Sentiment Analysis

We evaluate our models using 10-fold cross validation on the sentiment treebank and on RST-DT. In Table 1 and Table 3, a star indicates statistical significance with a p-value less than 0.05. The significance is with respect to the joint model versus the model before joining. The results for discourse parsing are shown in Table 1. To build the most probable tree, a CKY-like bottom-up parsing algorithm that uses dynamic programming to compute the most likely parses is applied (Joty et al., 2015), and we have used the 41 relations outlined in (Mann and Thompson, 1988) for training and evaluation of the Relation prediction. From the results, we see some improvement on Discourse Structure prediction when using a joint model, but the improvement is statistically significant only for the Nuclearity and Relation predictions. The improvements on the Relation predictions were mainly on the Contrastive set (Bhatia et al., 2015), specifically the classes of Contrast, Comparison and Cause relations as defined in (Mann and Thompson, 1988). The results for each of these relations under different training settings are shown in Table 2. Notice that the accuracies may seem low, but because we train over 41 classes of relations, a random prediction would result in 2.43% accuracy. Among the contrastive relations, Problem-Solution did not improve due to the fact that this relation is hardly seen at the sentence level. This confirms our hypothesis that knowing the sentiment of the two Discourse

Joint Sentiment-Topic Detection from Text Document

The work related to jointly determining sentiment and topic simultaneously from text is relatively sparse. Most closely related to our work are [7], [8], [9]. The Topic-sentiment model (TSM) [7] models the mixture of topics and sentiments simultaneously from web-blogs. TSM is based on probabilistic latent semantic indexing (pLSI). It finds the latent topics in a weblog collection, the sentiments, and the subtopics in the results of a query. If a word is common English, it is sampled from a background component model; otherwise, a word is sampled from a topical model or a sentiment model. Thus, the word generation for sentiment is independent of topic, while in JST a word is drawn from the joint distribution of sentiment and topic label. To obtain the sentiment coverage, TSM performs postprocessing, whereas JST gives the document sentiment using the probability distribution of the sentiment label for a given document. The Multi-Grain Latent Dirichlet Allocation (MG-LDA) model [8] is more appropriate for building topics in which a customer provides a rating for each aspect, i.e., the customer annotates every sentence and phrase in a review as being relevant to some aspect. Each word is generated from either a global topic or a local topic. The model uses a topic model in that it assigns words to a set of induced topics, each of which may represent one particular aspect. The limitation of MG-LDA is that it does not consider the associations between sentiments and topics.
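JST's generative story, drawing a sentiment label first, then a topic conditioned on that sentiment, then a word from the joint sentiment-topic distribution, can be sketched as follows. This is a simplified illustration with hand-set distributions; real JST places Dirichlet priors on all of them and infers them via Gibbs sampling:

```python
import random

def generate_word(pi_d, theta_d, phi, rng):
    """pi_d: P(sentiment | doc); theta_d[s]: P(topic | doc, sentiment s);
    phi[(s, z)]: P(word | sentiment s, topic z)."""
    s = rng.choices(range(len(pi_d)), weights=pi_d)[0]
    z = rng.choices(range(len(theta_d[s])), weights=theta_d[s])[0]
    words = list(phi[(s, z)])
    w = rng.choices(words, weights=[phi[(s, z)][x] for x in words])[0]
    return s, z, w

# Degenerate example: the document is fully positive (sentiment 0),
# so the sampled word always comes from the positive topic's distribution.
rng = random.Random(0)
phi = {(0, 0): {"good": 1.0}, (1, 0): {"bad": 1.0}}
generate_word([1.0, 0.0], [[1.0], [1.0]], phi, rng)  # → (0, 0, "good")
```

In TSM, by contrast, the word would be drawn from a sentiment model or a topic model independently, which is exactly the difference the excerpt describes.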

Weakly supervised joint sentiment-topic detection from text

Compared to traditional topic-based text classification, sentiment classification is deemed to be more challenging, as sentiment is often embodied in subtle linguistic mechanisms such as the use of sarcasm, or incorporated with highly domain-specific information. Among various efforts for improving sentiment detection accuracy, one direction is to incorporate prior information from a general sentiment lexicon (i.e., words bearing positive or negative sentiment) into sentiment models. These general sentiment lexicons can be acquired from domain-independent sources in many different ways, from manually built appraisal groups [5] to semi-automatically [16] or fully automatically [17] constructed lexicons. On incorporating lexical knowledge as prior information into a sentiment-topic model, Andreevskaia and Bergler [18] integrated the lexicon-based and corpus-based approaches for sentence-level sentiment annotation across different domains. A recently proposed non-negative matrix tri-factorization approach [19] also employed lexical prior knowledge for semi-supervised sentiment classification, where domain-independent prior knowledge was incorporated in conjunction with domain-dependent unlabelled data and a few labelled documents. However, this approach performed worse than the JST model on the movie review data, even with 40% labelled documents, as will be discussed in Section 5.

Joint Sentiment/Topic Model for Sentiment Analysis

essentially transformed to a simple LDA model with only S topics, each of which corresponds to a sentiment label. Consequently, it ignores the correlation between sentiment labels and topics. It can be observed from Figure 3 that JST performs worse with a single topic compared to 50 and 100 topics, except for the case of the full subjectivity lexicon shown in Figure 3(d), where the single-topic performance is almost the same as that with 100 topics. For paradigm words + MI, the filtered subjectivity lexicon, and the filtered subjectivity lexicon (subjective MR) (Figures 3(c), 3(e), and 3(f)), the result with 100 topics outperforms the ones with other topic number settings. For the case when no prior information is applied, as well as for paradigm words, as shown in Figures 3(a) and 3(b), the results with 50 topics are almost the same as those achieved with 100 topics, and both are higher than that of the single-topic setting. It can also be easily seen that the results with the filtered subjectivity lexicon in Figure 3(e) give the most balanced classification accuracy on both positive and negative documents. From the above, we can conclude that topic information indeed helps in sentiment classification, as the JST model with a mixture of topics consistently outperforms a simple LDA model ignoring the mixture of topics. This justifies the proposal of our JST model. Also, the empirical results reveal that the optimum number of topics for the movie review dataset is 100.

Weakly-Supervised Joint Sentiment-Topic Detection from Text

This is a crucial difference compared to the JST model as in JST one draws a word from the distribution over words jointly conditioned on both topic and sentiment label. Third, for sentiment detection, TSM requires postprocessing to calculate the sentiment coverage of a document, while in JST the document sentiment can be directly obtained from the probability distribution of sentiment label given a document. Other models by Titov and McDonald [12], [20] are also closely related to ours, since they are all based on LDA. The Multi-Grain Latent Dirichlet Allocation model (MG-LDA) [20] is argued to be more appropriate to build topics that are representative of ratable aspects of customer reviews, by allowing terms being generated from either a global topic or a local topic. Being aware of the limitation that MG-LDA is still purely topic-based without considering the associations between topics and sentiments, Titov and McDonald further proposed the Multi-Aspect Sentiment model (MAS) [12] by extending the MG-LDA framework. The major improvement of MAS is that it can aggregate sentiment text for the sentiment summary of each rating aspect extracted from MG-LDA. Our model differs from MAS in several aspects. First, MAS works in a supervised setting as it requires that every aspect is rated at least in some documents, which is infeasible in real-world applications. In contrast, JST is weakly supervised with only minimum prior information being incorporated, which in turn is more flexible. Second, the MAS model was designed for sentiment text extraction or aggregation, whereas JST is more suitable for the sentiment classification task.

Exploiting Topic based Twitter Sentiment for Stock Prediction

We note that neighboring days may share the same or closely related topics because some topics may last for a long period of time covering multiple days, while other topics may just last for a short period of time. Given a set of time-stamped tweets, the overall generative process should be dynamic as the topics evolve over time. There are several ways to model this dynamic nature (Sun et al., 2010; Kim and Oh, 2011; Chua and Asur, 2012; Blei and Lafferty, 2006; Wang et al., 2008). In this paper, we follow the approach of Sun et al. (2010) due to its generality and extensibility.

Sentiment Classification: A Topic Sequence-Based Approach

Because of their high feature dimension, traditional feature representation methods usually contain many useless features and cannot capture the main meanings of the document. Therefore, it is necessary to remove the useless features in order to increase the accuracy of classification results. As a feature extraction method, it is worth mentioning that LDA has been successfully applied to sentiment classification in recent years. LDA is one of the most popular topic models; it assumes that documents are mixtures of topics, where a topic is a probability distribution over words. LDA can reduce the number of features significantly. Lin and He [13] proposed a novel probabilistic modelling framework based on LDA, called the joint sentiment/topic model (JST), which was fully unsupervised. Li et al. [14] presented a sentiment-LDA model for sentiment classification with global topics and local dependency. Jo and Oh [15] proposed two models, Sentence-LDA (SLDA) and the Aspect and Sentiment Unification Model (ASUM), to tackle the problem of automatically discovering what aspects are evaluated in reviews and how sentiments for different aspects are expressed. Although LDA has been applied to sentiment classification for many years, little effort has been made to explore the order relationships among topics.
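The dimensionality-reduction idea can be illustrated with a toy projection from a bag of words to topic proportions. The word-to-topic assignments below are fixed by hand purely for illustration; real LDA infers them from the corpus:

```python
# Hypothetical hard word->topic assignments (topic 0: film content,
# topic 1: cinematography); LDA would learn soft assignments instead.
topic_of = {"plot": 0, "acting": 0, "camera": 1, "lens": 1}

def topic_features(tokens, num_topics=2):
    """Project a V-dimensional bag of words onto K topic proportions."""
    counts = [0] * num_topics
    for tok in tokens:
        if tok in topic_of:
            counts[topic_of[tok]] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

topic_features("great acting weak plot camera".split())  # e.g. [2/3, 1/3]
```

A classifier trained on these K-dimensional vectors sees far fewer, and less noisy, features than one trained on the raw vocabulary.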
