Research Hotspots Evolving Action Detection based on Time Sequence Journal Topic Model


Chunxiu Du, Fang Huang*, Shaoyong Wang,

Yijian Zhao, Chengyuan Zhang

School of Information Science and Engineering Central South University

Changsha, 410083, China *[email protected]

Abstract: Research hotspots in an academic field are reflected mainly in the contents of its journals, so analyzing how the topics of journal data sets evolve is important for researchers who want to trace hotspot development and grasp its trends. This paper combines two characteristics of academic journals, 1) the topic property and 2) the time-sequence feature, to extract time-sequence topics from journals, and puts forward TS-JTM (Time Sequence Journal Topic Model). On the basis of TS-JTM, we develop a topic-snapshot model of journal research hotspot evolution over time, and we propose a method that detects the continuing, emerging, splitting, amalgamating or disappearing of topics between two neighboring topic snapshots, using a topic similarity measure based on Kullback-Leibler (KL) divergence. This allows us to analyze the evolving traces of research hotspots, and our experiments show that the proposed method analyzes the evolution of journals' research hotspots effectively.

Keywords: time sequence journal topic model; KL divergence; research hotspot; evolving action

I. INTRODUCTION

Research hotspots change and evolve along with the development of scientific research and exploration. Because subjects permeate one another and new technologies are put to use, academic research hotspots evolve over time. In this process, some outdated research problems disappear while new research topics emerge. Some topics or directions also split or merge. All of these changes drive the evolution of research hotspots. Therefore, analyzing hotspot evolution and capturing its traces are important for predicting how research hotspots will develop. Such analysis not only informs researchers about current research hotspots, but also helps scientific personnel and administrators grasp research trends. Researchers' achievements and progress are reflected mainly in the journals where their papers are published, and these academic journals collect and classify a large number of research results. Because they are published periodically, journals keep track of the academic development of the fields they focus on. It therefore makes sense to analyze how hotspots evolve over time by extracting topics from journal documents.

In the field of document topic analysis, ATM (Author Topic Model) is one of the common topic clustering methods. ATM builds its model from the view of authors' interests and can analyze authors' academic preferences[2]. ATM is a three-layer Bayesian model of words, topics and authors' interests. It maps directly onto a journal topic model, in which a journal chooses a topic with some probability and the topic then generates words with some probability. However, the evolution of topics over time is a crucial factor that influences topic extraction. Blei proposed DTM[11] on the basis of LDA and realized the extraction of time-sequence topics. Because both the topic property and the time-sequence feature influence the journal topic distribution and its evolution, this paper combines these two characteristics and proposes TS-JTM (Time Sequence Journal Topic Model), which extracts time-varying topics from academic journals. On this basis, we build a topic-snapshot model of journal research hotspot evolution over time. We also propose an algorithm that detects evolving actions between two neighboring topic snapshots, adopting Kullback-Leibler divergence (relative entropy) to realize fine-grained analysis.

The remainder of this paper is organized as follows. In Section II, we survey related methods for research hotspot evolution. Section III illustrates how TS-JTM is used to analyze hotspot evolving action. TS-JTM, its principles, model parameters and evaluation are presented in Section IV. The topic-snapshot journal research hotspot evolution model and the topic evolving action detection algorithm using KL divergence are presented in Section V, followed by the conclusion and future work in Section VI.

II. RELATED WORK

Topic models automatically discover latent topic information in large amounts of data and are used for information retrieval, classification, clustering, abstract extraction, similarity computation, relationship judgment, etc. Princeton professor Blei first proposed the LDA (Latent Dirichlet Allocation) model in 2003. It is the most fundamental topic model and is built on a document-topic-word framework[1]: each document is a multinomial distribution over topics and each topic is a multinomial distribution over words. Later, many researchers proposed improved models based on LDA. In 2004, ATM (Author-Topic Model) was introduced by Rosen-Zvi[2]; it models authors' interests by mining documents and obtains the probability distribution of each author's contribution to each topic. In 2005, McCallum proposed the ART[3] (Author-Recipient-Topic) model, which models the relationship between senders and receivers in a corpus. This model can analyze large amounts of message data and discover topics that depend on the sender-recipient relationship. In 2006, Wallach relaxed the bag-of-words assumption[4], combining the topic model with word order, which improved prediction accuracy. CTM (Correlated Topic Model) was proposed by Blei's team[5]; it adopts the logistic normal distribution to describe the dependency between topics, solving the problem that standard LDA cannot capture correlations among topics. In 2010, Blei and his colleagues introduced the HLDA[6] (Hierarchical LDA) model. It addresses the topic hierarchy problem, using Bayesian methods with the nested Chinese Restaurant Process to give rise to an appropriate prior. It builds a hierarchical tree structure for the data set and can mine the hierarchical relationships among topics. Because the number of topics in LDA is a hyper-parameter that must be set by researchers, Teh proposed HDP (Hierarchical Dirichlet Processes)[7] in 2006. It replaces the traditional Dirichlet distribution with a Dirichlet process and exploits the infinite-dimensional nature of the DP to cluster topics; in other words, it generates topics automatically without a preset number. In 2014, F. Ying added Dirichlet processes to the LDA model[8], which not only obtains the latent topic variables but also updates the Dirichlet prior parameters dynamically. In 2015, S. Liu introduced the HB-HDP model[9]. It combines time information and user interests with topic labels and clusters Weibo topics efficiently, overcoming the data sparsity caused by the short length of Weibo posts. In 2016, C. Ying proposed discipline hotspot mining based on hierarchical Dirichlet topic clustering and co-word networks[10], which can mine research hotspots.

Since topics evolve over time, DTM[11] (Dynamic Topic Model) takes the influence of earlier topics into account: it divides the time sequence into slices and introduces a Markov chain to connect them. This approach brings the time sequence into topic model building. In 2010, Zhang introduced evolutionary hierarchical Dirichlet processes[14], based on HDP, for multiple correlated time-varying corpora. In 2013, Dubey et al. proposed npTOT[12] (non-parametric Topics Over Time), which allows a flexible distribution over an unbounded number of changing topics together with their strength over time. In 2016, Li proposed DOHDP[13] (Dynamic Online Hierarchical Dirichlet Process), which applies the OnlineHDP model within each period and uses an exponential decay function to model the dependence on historical periods. As a result, it can discover the topic evolution of Chinese social documents to a large extent.

Up to now, building topic evolution models is still an open and active research area, and evolving action analysis is the key to recognizing the traces of evolution. Because DTM is designed for general data sets, it does not take into account the influence of the thematic nature and periodic evolution of journals on the topic model, so it cannot directly meet the demands of journal topic extraction. An effective approach is also needed to detect the evolutionary actions of a journal's time-sequence topics. In this paper, we propose an approach for detecting hotspot evolution actions based on a KL divergence similarity measure. By combining the topic property with the time-sequence property of journals, we propose the time-sequence journal topic model TS-JTM and use it to extract sequential topics from journals. The proposed topic similarity measure based on KL divergence is then used to detect topic evolution actions: continuation, emergence, division, amalgamation and extinction.

III. THE ANALYZING PROCESS OF HOTSPOTS EVOLVING ACTION

The analysis of hotspot evolving action based on journal topics is composed of four main steps: 1) data preprocessing, 2) dynamic topic extraction, 3) neighboring topic similarity measurement and 4) detection and analysis of topic evolving action.

Step 1: We preprocess the scientific documents and build feature vectors for them. First, we obtain document information from a public document repository, including title, abstract, keywords, journal name, publication time, etc. We then normalize the raw information by word segmentation and stop-word removal to establish document feature vectors, which form a corpus with a time property.

Step 2: We combine the journal topic property with the time-sequence feature to build TS-JTM. The model extracts topics from the above corpus along the time sequence and yields the distribution of journal topics in each time slice.

Step 3: Using the topic-snapshot journal research hotspot evolution model, we measure the similarity of every pair of topics in two neighboring time slices of a journal with Kullback-Leibler divergence. These similarity measurements establish the relationships between the topics of the two time slices.

Step 4: Analysis of journal topic evolution. On the basis of the topic similarities from Step 3, we detect the topic evolving actions of continuing, emerging, amalgamating, splitting or disappearing in neighboring time slices.
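As an illustration of Step 1, the following is a minimal sketch of turning raw records into a corpus with a time property. It is not the authors' implementation: the record fields (`abstract`, `journal`, `year`), the tiny `STOP_WORDS` set and the whitespace `tokenize` function are hypothetical stand-ins for a real stop-word list and a real word segmenter.

```python
from collections import defaultdict

# Hypothetical stop-word list; a real run would use a full list for the corpus language.
STOP_WORDS = {"the", "of", "a", "and", "for", "in", "on", "based"}

def tokenize(text):
    """Whitespace tokenizer standing in for a real word segmenter (assumption)."""
    return [tok.strip(".,;:()").lower() for tok in text.split()]

def build_time_sliced_corpus(records):
    """Group (feature_words, journal) pairs by publication year, one time slice per year."""
    corpus = defaultdict(list)
    for rec in records:
        words = [w for w in tokenize(rec["abstract"]) if w and w not in STOP_WORDS]
        corpus[rec["year"]].append((words, rec["journal"]))
    # Return the slices in chronological order.
    return dict(sorted(corpus.items()))

# Toy records (made up for illustration only).
records = [
    {"abstract": "Face recognition based on texture features.", "journal": "J003", "year": 2011},
    {"abstract": "A neural network for speech recognition.", "journal": "J003", "year": 2010},
]
corpus = build_time_sliced_corpus(records)
```

Each value of `corpus` is a list of (feature words, journal) pairs for one time slice, which is the shape of input that Steps 2 to 4 operate on.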

IV. TIME SEQUENCE JOURNAL TOPIC MODEL

An academic field usually has many journals, and the topics of all documents belonging to a journal are the journal's topics. In order to analyze the journal topic distribution, we bring the journal's topic property and time-sequence feature into the model. We also define and re-derive the parameters of the model based on references [2] and [11], which leads to the formation of TS-JTM.


A. Time Sequence Journal Topic Model

TS-JTM is the journal topic model embedded in each time slice. The journal topic model part is shown in Fig. 1. In this model, α and β are the prior parameters of the Dirichlet distributions over the journal-topic distribution θ and the topic-word distribution ϕ respectively, K is the total number of journals, and T is the number of topics. The main idea of the journal topic model is: 1) select a topic z from the topic distribution θ of the journal J that contains the target paper; 2) generate a word w at random from the word distribution ϕ of the selected topic z; 3) repeat 1) and 2) until every word of the target paper has been generated.
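The generative process of the journal topic model described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the sizes `K`, `T`, `N` and the value of `beta` are toy values, and `generate_paper` is a hypothetical helper name.

```python
import numpy as np

def generate_paper(journal, theta, phi, n_words, rng):
    """Generate word ids for one paper of the given journal under the journal topic model."""
    words = []
    for _ in range(n_words):
        z = rng.choice(theta.shape[1], p=theta[journal])  # 1) pick a topic from the journal's theta
        w = rng.choice(phi.shape[1], p=phi[z])            # 2) pick a word from that topic's phi
        words.append(int(w))                              # 3) repeat until the paper is complete
    return words

rng = np.random.default_rng(0)
K, T, N = 3, 4, 20          # journals, topics, dictionary size (toy values)
alpha, beta = 50 / T, 0.1   # symmetric Dirichlet priors (toy beta, chosen for the sketch)
theta = rng.dirichlet(np.full(T, alpha), size=K)  # journal-topic distributions, shape K x T
phi = rng.dirichlet(np.full(N, beta), size=T)     # topic-word distributions, shape T x N
paper = generate_paper(0, theta, phi, n_words=15, rng=rng)
```

Inference runs this process in reverse: given the observed words and journals, it recovers θ and ϕ.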

Figure 1. Journal topic model (JTM)

Figure 2. Time Sequence Journal Topic Model (TS-JTM )

Because the above journal topic model does not consider the time-sequence feature, applying JTM separately to each time slice would be unreliable: the parameters of the time slices would be independent of one another. The Dynamic Topic Model (DTM), in contrast, uses the dependency between the model parameters of neighboring time slices to reflect topic evolution. This paper applies the same time-sequence dependency to the parameters of JTM, which yields the TS-JTM framework in Fig. 2. In this figure, the model inside each time slice is the JTM, and neighboring time slices are connected through the parameters α and β. Most importantly, the influence of the journal-topic distribution θ and the topic-word distribution ϕ in the previous time slice is transmitted to the model parameters of the next time slice.

B. Model parameter

There are two parameters in TS-JTM: 1) the journal-topic distribution θ and 2) the topic-word distribution ϕ. Parameter inference adopts the Gibbs sampling method. The topic and journal assignments of each word are sampled according to (1). The right side of the equation is p(topic|journal)·p(word|topic), the probability that the journal chooses a topic and that the word is then selected from that topic. Because there are T topics and K journals, the practical meaning of (1) is that sampling is carried out over the T×K combinations of topic and journal.

$$
p(z_i = j, x_i = k \mid w_i = m, Z_{-i}, X_{-i}) \propto \frac{C^{KT}_{kj} + \alpha}{\sum_{j'} C^{KT}_{kj'} + T\alpha} \cdot \frac{C^{WT}_{mj} + \beta}{\sum_{m'} C^{WT}_{m'j} + N\beta} \tag{1}
$$

Here $z_i = j$ and $x_i = k$ mean that word i of a paper is assigned to topic j and journal k, and $w_i = m$ means that word i is the m-th entry of the prepared dictionary. $Z_{-i}$ and $X_{-i}$ denote the topic and journal assignments of all words other than word i. $C^{WT}_{mj}$ is the number of times word m has been assigned to topic j so far, and $C^{KT}_{kj}$ is the number of times journal k has been assigned to topic j. N is the number of words in the dictionary, which consists of the unique words appearing in the data set. Parameter estimation with (1) only requires keeping track of two matrices: the N×T word-by-topic count matrix and the K×T journal-by-topic count matrix. From these two count matrices, we can estimate the topic-word distribution ϕ and the journal-topic distribution θ, as shown in (2) and (3).

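$$
\phi_{mj} = \frac{C^{WT}_{mj} + \beta}{\sum_{m'} C^{WT}_{m'j} + N\beta} \tag{2}
$$

$$
\theta_{kj} = \frac{C^{KT}_{kj} + \alpha}{\sum_{j'} C^{KT}_{kj'} + T\alpha} \tag{3}
$$

$\phi_{mj}$ represents the probability that word m is used in topic j, and $\theta_{kj}$ represents the probability that journal k selects topic j.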
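The estimates in (2) and (3) can be computed directly from the two count matrices. The sketch below assumes symmetric scalar priors and uses made-up toy counts; `estimate_phi_theta` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def estimate_phi_theta(c_wt, c_kt, alpha, beta):
    """Estimate phi (N x T) and theta (K x T) from count matrices, following (2) and (3).

    c_wt[m, j]: times word m was assigned to topic j (N x T).
    c_kt[k, j]: times journal k was assigned to topic j (K x T).
    """
    n_words, n_topics = c_wt.shape
    # (2): normalize over words m' within each topic j (column-wise).
    phi = (c_wt + beta) / (c_wt.sum(axis=0, keepdims=True) + n_words * beta)
    # (3): normalize over topics j' within each journal k (row-wise).
    theta = (c_kt + alpha) / (c_kt.sum(axis=1, keepdims=True) + n_topics * alpha)
    return phi, theta

# Toy counts: N = 3 words, T = 2 topics, K = 2 journals (illustrative values only).
c_wt = np.array([[3, 0], [1, 2], [0, 4]])
c_kt = np.array([[4, 1], [0, 5]])
phi, theta = estimate_phi_theta(c_wt, c_kt, alpha=50 / 2, beta=0.01)
```

Each column of `phi` sums to one (a distribution over words for one topic) and each row of `theta` sums to one (a distribution over topics for one journal).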

In Fig. 2, the parameters are transmitted between two neighboring time slices as follows.

$$
\beta_t \mid \beta_{t-1} \sim N(\beta_{t-1}, \sigma^2 I) \tag{4}
$$

$$
\alpha_t \mid \alpha_{t-1} \sim N(\alpha_{t-1}, \delta^2 I) \tag{5}
$$

$$
\phi_t \sim Dir(\beta_t) \tag{6}
$$

$$
\theta_t \sim Dir(\alpha_t) \tag{7}
$$

Equation (4) states that the prior parameter βt of the topic-word distribution ϕt in the current time slice is influenced by the βt-1 generated from model training in the previous time slice; βt and βt-1 satisfy the Markov property. Likewise, (5) states that the prior parameter αt of the journal-topic distribution θt in the current time slice is influenced by the αt-1 generated from model training in the previous time slice. Equations (6) and (7) state that βt and αt are the prior parameters of the topic-word distribution ϕt and the journal-topic distribution θt respectively [11]. In this way αt and βt influence the journal-topic distribution and the topic-word distribution.
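As a rough illustration of the chaining in (4) to (7), the sketch below draws the prior of the current time slice from a Gaussian centred on the previous one and then draws a distribution from the resulting Dirichlet. It is an assumption-laden sketch, not the paper's inference procedure: the variance value is arbitrary, and the `floor` clipping is added here only because a Gaussian step can produce non-positive values, which are not valid Dirichlet parameters.

```python
import numpy as np

def propagate_prior(prev, sigma, rng, floor=1e-3):
    """Draw the current prior from N(prev, sigma^2 I), clipped to stay positive (assumption)."""
    step = rng.normal(loc=0.0, scale=sigma, size=prev.shape)
    return np.maximum(prev + step, floor)

rng = np.random.default_rng(0)
T = 5
alpha_prev = np.full(T, 50 / T)                           # alpha_{t-1}, as set in the first slice
alpha_t = propagate_prior(alpha_prev, sigma=0.5, rng=rng)  # (5)
theta_t = rng.dirichlet(alpha_t)                          # (7): journal-topic distribution at t
```

The same helper applies to β and ϕ in (4) and (6).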

V. ANALYSIS OF TOPIC-SNAPSHOT JOURNAL RESEARCH HOTSPOT EVOLVING ACTION

TS-JTM gives us the journal-topic distribution in each time slice, and we analyze research hotspot evolving action by establishing a journal topic-snapshot model. We use Kullback-Leibler divergence to establish the relationships between topics in neighboring time slices and, on that basis, detect topic evolving action.


A. Topic-snapshot Journal Research Hotspot Evolution Model

Figure 3. Topic-snapshot journal research hotspot evolution model

The model is shown in Fig. 3. The three parts of the figure represent three topic snapshots in successive neighboring time slices. The dotted lines represent topic relationships, of which five kinds are shown in the figure: 1) 'one-to-one' means that the current topic continues the previous one; 2) 'null-to-one' represents an emerging topic that has no relationship with any previous topic; 3) 'one-to-many' means that the previous topic is divided into several topics; 4) 'many-to-one' means that several previous topics are amalgamated into one topic; 5) 'one-to-null' means that the previous topic has no relationship with any topic in the next time slice and has perished.

B. The Detection of Topic Evolving Action Based on Kullback-Leibler Divergence

Topics change and evolve over time: they disappear, emerge, split and amalgamate. All of these changes constitute topic evolving action. We detect the evolving action and establish the relationships between topics in neighboring time slices through Kullback-Leibler divergence.

1) Topic Similarity Measurement using Kullback-Leibler Divergence

Kullback-Leibler divergence, also called relative entropy, was proposed by Solomon Kullback and Richard Leibler[15]. It is commonly used to measure how much two probability distributions differ. Here, we use it to measure the similarity of each pair of topics in neighboring time slices: the smaller the divergence, the more similar the two topics. Equation (9) defines the Kullback-Leibler divergence, where P(x) and Q(x) are two probability distributions. If they are identical, the KL divergence is zero.

$$
D(P \| Q) = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)} \tag{9}
$$

Measuring the similarity of topic A and topic B amounts to calculating the similarity of their topic-word distributions (ϕA and ϕB). We establish the relationships between the topics of two neighboring time slices by calculating the KL divergence of each pair of their topics.
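A minimal sketch of the divergence in (9) applied to two topic-word distributions is given below. The natural logarithm and the small smoothing constant `eps` (added so that zero probabilities do not cause division by zero) are assumptions of this sketch; the paper does not state either choice. The toy distributions are made up.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D(P || Q) = sum_x P(x) * log(P(x) / Q(x)) for two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps  # smoothing is an assumption of this sketch
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

# Toy topic-word distributions over a 4-word dictionary (illustrative values only).
phi_a = [0.5, 0.3, 0.1, 0.1]
phi_b = [0.4, 0.3, 0.2, 0.1]
d_ab = kl_divergence(phi_a, phi_b)
d_ba = kl_divergence(phi_b, phi_a)
```

Because KL divergence is not symmetric, `d_ab` and `d_ba` generally differ, which is why the direction of each comparison between neighboring time slices matters.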

2) The detection of evolving action

As time goes by, the topic evolving actions include continuing, emerging, disappearing, splitting and amalgamating. The essence of these actions is the change of the topic-word distribution inside each topic. KL divergence measures the similarity of a pair of topics, so we can set a threshold on the KL divergence to detect evolving action. The five detection rules for the five actions that drive topic evolution are listed in (10).

$$
status_i^{T^t} =
\begin{cases}
0\ (\text{Continuing}), & \min D(\phi_i^t \| \phi_j^{t+1}) < THR,\ \phi_j^{t+1} \in T^{t+1}\ \&\&\ D(\phi_i^t \| \phi_k^{t+1}) > THR,\ \forall \phi_k^{t+1} \in T^{t+1} \setminus \{\phi_j^{t+1}\} \\
1\ (\text{Emerging}), & \min D(\phi_j^{t-1} \| \phi_i^t) > THR,\ \forall \phi_j^{t-1} \in T^{t-1} \\
2\ (\text{Disappearing}), & \min D(\phi_i^t \| \phi_j^{t+1}) > THR,\ \forall \phi_j^{t+1} \in T^{t+1} \\
3\ (\text{Splitting}), & D(\phi_i^t \| \phi_j^{t+1}) < THR,\ \phi_j^{t+1} \in T^{t+1},\ |j| \ge 2\ \&\&\ D(\phi_i^t \| \phi_k^{t+1}) > THR,\ \forall \phi_k^{t+1} \in T^{t+1} \setminus j \\
4\ (\text{Amalgamating}), & D(\phi_j^{t-1} \| \phi_i^t) < THR,\ \phi_j^{t-1} \in T^{t-1},\ |j| \ge 2\ \&\&\ D(\phi_k^{t-1} \| \phi_i^t) > THR,\ \forall \phi_k^{t-1} \in T^{t-1} \setminus j
\end{cases}
\tag{10}
$$

The collection of topics in the current time slice is $T^t$, and $\phi_i^t$ is the topic-word distribution of topic i. $status_i^{T^t}$ represents the evolving action state of topic i in time slice t. THR is the similarity threshold, and two topics are considered similar when their KL divergence is smaller than THR. The principle of detecting evolving action is to calculate the similarity of topic distributions in neighboring time slices: 1) if topic i in the current time slice is similar to exactly one topic in the next time slice and different from all the others, topic i continues in the next time slice; 2) if topic i in the current time slice is different from all topics in the previous time slice, it is an emerging topic; 3) accordingly, if it is different from all topics in the next time slice, it has perished; 4) if topic i in the current time slice has more than one similar topic in the next time slice, it is splitting; 5) conversely, if it is similar to more than one topic in the previous time slice, it is the result of amalgamation.
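The five rules can be sketched as a small function over the KL divergences of one topic to its neighbors. This is an illustrative sketch, not the authors' implementation. Two choices are assumptions of the sketch: a divergence exactly equal to THR is treated as "different", and the function returns every rule that holds rather than a single status, because the rules are not mutually exclusive (for example, a topic can be new relative to the previous slice and still continue into the next one). The test values are the divergences reported in the experiments of this paper with THR = 0.4.

```python
import numpy as np

CONTINUING, EMERGING, DISAPPEARING, SPLITTING, AMALGAMATING = range(5)

def detect_actions(d_prev, d_next, thr):
    """Return the set of evolving actions of topic i in time slice t.

    d_prev[j] = D(phi_j^{t-1} || phi_i^t) for every topic j of the previous slice.
    d_next[j] = D(phi_i^t || phi_j^{t+1}) for every topic j of the next slice.
    Pass None when the neighboring slice does not exist (first or last slice).
    """
    actions = set()
    if d_next is not None:
        similar_next = int(np.sum(np.asarray(d_next) < thr))
        if similar_next == 1:
            actions.add(CONTINUING)      # similar to exactly one later topic
        elif similar_next == 0:
            actions.add(DISAPPEARING)    # different from every later topic
        else:
            actions.add(SPLITTING)       # similar to two or more later topics
    if d_prev is not None:
        similar_prev = int(np.sum(np.asarray(d_prev) < thr))
        if similar_prev == 0:
            actions.add(EMERGING)        # different from every earlier topic
        elif similar_prev >= 2:
            actions.add(AMALGAMATING)    # similar to two or more earlier topics
    return actions

THR = 0.4
nn_2013_next = [0.55, 0.27, 0.21, 0.69, 1.84, 1.16, 0.92, 1.53]
cc_2011_prev = [1.16, 0.75, 1.37, 2.32, 1.51]
split = detect_actions(None, nn_2013_next, THR)
emerge = detect_actions(cc_2011_prev, None, THR)
```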

3) Algorithm

The first step of detecting topic evolving action is to use TS-JTM for topic clustering, which gives the topic distribution of each time slice. We then use KL divergence to measure the similarity of each pair of topics in neighboring time slices. We apply (10) to detect the evolving actions of all topics in each time slice and establish topic relationships from the IDs of the previous and follow-up topics. Finally, these steps produce a description of the time-dependent topic evolving action of journals. The pseudo code of the algorithm is shown in Table I.

TABLE I. THE DETECTING ALGORITHM OF TOPIC EVOLVING ACTION

Algorithm 1: Topic evolution behavior detection algorithm.

Define:
    TC = ∅;                   // Collection of time-sequence topics
    Action                    // Struct describing a topic's action attributes
    Action.topic_ID = null;   // Topic's ID
    Action.time = null;       // Topic's current time slice
    Action.status = null;     // Topic's action status
    Action.pre_ID = ∅;        // Collection of previous related topic IDs
    Action.next_ID = ∅;       // Collection of follow-up related topic IDs
    Result = ∅;               // Collection of the above Action structs

Input: C_t = {(w_1,v_1),…,(w_n,v_n)}, t∈(1,…,n); C_t is the collection of paper feature vectors.

Output: TC, Result

For each time slice t
    Use the TS-JTM model to extract topics T^t;
    Add T^t to TC;            // Keep the topic distribution in TC
    Update the TS-JTM model with new α_t, β_t;
End-for
For each time slice t
    Get T^{t-1}, T^t, T^{t+1} in TC
    For each topic ϕ_i^t in T^t
        If min(D(ϕ_i^t || ϕ_j^{t+1})) < THR, ϕ_j^{t+1} ∈ T^{t+1} && D(ϕ_i^t || ϕ_k^{t+1}) > THR, ϕ_k^{t+1} ∈ T^{t+1} \ ϕ_j^{t+1}
            Action.status = 0;            // Topic continuing
            Action.time = t;
            Action.topic_ID = i_ID;
            Action.next_ID = j_ID;
        Else If min(D(ϕ_j^{t-1} || ϕ_i^t)) > THR, ϕ_j^{t-1} ∈ T^{t-1}
            Action.status = 1;            // Topic emerging
            Action.time = t;
            Action.topic_ID = i_ID;
        Else If min(D(ϕ_i^t || ϕ_j^{t+1})) > THR, ϕ_j^{t+1} ∈ T^{t+1}
            Action.status = 2;            // Topic disappearing
            Action.time = t;
            Action.topic_ID = i_ID;
        Else If D(ϕ_i^t || ϕ_j^{t+1}) < THR, ϕ_j^{t+1} ∈ T^{t+1}, |j| ≥ 2 && D(ϕ_i^t || ϕ_k^{t+1}) > THR, ϕ_k^{t+1} ∈ T^{t+1} \ j
            Action.status = 3;            // Topic splitting
            Action.time = t;
            Action.topic_ID = i_ID;
            Action.next_ID = {j1_ID, j2_ID,…, jm_ID};
        Else If D(ϕ_j^{t-1} || ϕ_i^t) < THR, ϕ_j^{t-1} ∈ T^{t-1}, |j| ≥ 2 && D(ϕ_k^{t-1} || ϕ_i^t) > THR, ϕ_k^{t-1} ∈ T^{t-1} \ j
            Action.status = 4;            // Topic amalgamating
            Action.time = t;
            Action.topic_ID = i_ID;
            Action.pre_ID = {j1_ID, j2_ID,…, jm_ID};
        Add Action to Result;             // Record each topic Action in Result
    End-for
End-for
Return TC, Result;

VI. ANALYSIS OF TS-JTM PERFORMANCE AND EVOLUTIONARY ACTIONS DETECTION

A. Model Performance

In order to evaluate the validity of TS-JTM, we use perplexity to compare ATM and DTM with TS-JTM. The experiments were executed on a PC running Windows 7 with 8 GB of memory and a dual-core processor. The algorithms were implemented in Python.

1) Data Pre-process

We build the topic and word sets from scientific documents gathered from the public resources of China National Knowledge Infrastructure (CNKI). As experimental data we select 6487 paper records in the computer science field dated from 2010 to 2016, including their abstracts, journals and publication times, and we divide the data into 7 time slices according to publication year. We then use the NLPIR system to perform Chinese word segmentation on the abstracts and delete stop words. This process establishes the collection w={c1,c2,…,cn} of topic words for each paper, and (wi,vi) represents the paper-journal feature vector, where wi is the feature word collection of paper i and vi is the journal to which paper i belongs. In time slice t, the data set Ct composed of n papers can be represented as Ct = {(w1,v1), (w2,v2),…,(wn,vn)}, which forms the collection of document feature vectors based on time sequence.

2) Estimation of Model Performance

Equation (8) defines perplexity. Dtest represents the test data, a collection of M documents. p(wd) represents the probability that the model assigns to the words of paper d. Nd represents the number of words in paper d, and wd = (w1d, w2d,…, wid,…, wnd) is the word vector of paper d. The lower the perplexity, the better the performance of the model.

$$
Perplexity(D_{test}) = \exp\left\{ -\frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right\} \tag{8}
$$

Three parameters are set for each model before the experiments. The number of topics |T| is increased starting from 10. The two hyper-parameters of ATM's Dirichlet distributions are set to α=50/|T| and β=0.01. The two hyper-parameters of the Dirichlet distributions of DTM and TS-JTM in the first time slice are also set to α=50/|T| and β=0.01; the α and β of the remaining time slices are set automatically by the model. The results are compared in Fig. 4, where the horizontal axis represents the number of topics and the vertical axis represents the perplexity. For every number of topics tested, the perplexity of TS-JTM is the smallest, which shows that TS-JTM performs best among the three models. In addition, perplexity decreases as the number of topics increases, and once the number of topics exceeds about 50 it remains almost unchanged. We therefore infer that 50 is a suitable number of topics for TS-JTM.
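As a worked check of the perplexity definition in (8), the sketch below computes perplexity from per-document log-likelihoods and word counts. The uniform "model" over a 100-word dictionary is a made-up toy case chosen because its perplexity is known in closed form: it equals the dictionary size. `perplexity` is a hypothetical helper name.

```python
import numpy as np

def perplexity(log_probs, doc_lengths):
    """exp(-sum_d log p(w_d) / sum_d N_d), with natural logarithms."""
    return float(np.exp(-np.sum(log_probs) / np.sum(doc_lengths)))

# Toy test set: two papers of 10 and 20 words, scored by a model that gives
# every word probability 1/100, so log p(w_d) = N_d * log(1/100).
doc_lengths = [10, 20]
log_probs = [n * np.log(1.0 / 100) for n in doc_lengths]
ppl = perplexity(log_probs, doc_lengths)
```

A model that concentrates probability on the words that actually occur obtains a perplexity well below the dictionary size, which is why lower values indicate a better fit.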

Figure 4. The curve of changing perplexity along with number of topics

B. Analysis of topic evolution

We use TS-JTM for topic clustering. The number of topics |T| is set to 50, with α = 50/|T| and β = 0.01, which gives the journal topic distribution in each time slice. The choice of the similarity threshold (THR) depends on the actual data set. Based on the principle of detecting topic evolving action and on repeated experiments, we set THR to 0.4.


In the journal labeled ID:003 in the data set, topic (ID:2) among the 2010 topic distributions was related to Face Recognition. The topic-word distributions of topic (ID:2) in this journal from 2010 to 2016 are shown in Table II, which lists the 10 most probable topic words of topic (ID:2) in each year. Over time, the core words of Face Recognition in topic (ID:2), such as 'image', 'feature' and 'face recognition', remained stable and appeared in every year. However, 'genetic algorithm' appeared in 2013 and 'deep learning' appeared in 2015; these were novel approaches applied in the field of Face Recognition. From 2010 to 2016, the KL divergences of this topic between neighboring time slices are 0.20, 0.26, 0.23, 0.17, 0.21 and 0.19, all smaller than THR. At the same time, the KL divergences between the Face Recognition topic and all other topics in the next time slice were always greater than THR. Therefore, the Face Recognition topic was continuing from 2010 to 2016.

TABLE II. THE TOPIC-WORD DISTRIBUTIONS OF FACE RECOGNITION

Year | Topic-word Distribution
2010 | image(0.066) feature(0.061) algorithm(0.055) face recognition(0.042) classification(0.037) amalgamation(0.033) texture(0.027) sampling(0.025) threshold(0.019) dimension reduction(0.016)
2011 | image(0.063) feature(0.059) amalgamation(0.057) algorithm(0.052) face recognition(0.045) segment(0.038) classification(0.031) dimension reduction(0.026) extraction(0.022) training(0.017)
2012 | image(0.078) feature(0.062) noise(0.055) algorithm(0.046) extraction(0.038) edge(0.032) face recognition(0.026) watermark(0.020) detection(0.018) threshold(0.014)
2013 | image(0.083) algorithm(0.066) feature(0.052) sampling(0.044) filtering(0.040) face recognition(0.035) genetic algorithm(0.031) segment(0.024) edge(0.019) extraction(0.017)
2014 | image(0.074) feature(0.061) algorithm(0.057) noise(0.052) classification(0.046) segment(0.043) face recognition(0.039) genetic algorithm(0.027) filtering(0.020) threshold(0.015)
2015 | image(0.069) feature(0.059) amalgamation(0.052) algorithm(0.043) face recognition(0.037) SVM(0.032) deep learning(0.028) training(0.023) classification(0.022) segment(0.019)
2016 | image(0.071) feature(0.064) algorithm(0.060) amalgamation(0.056) deep learning(0.047) face recognition(0.035) classification(0.033) perception(0.029) segment(0.024) contour(0.022)

2) Journal topics evolve over time

For brevity, we refer to topics by their abbreviations in the following. Table III shows the top 10 topic words of the 'Neural Network (NN)', 'Deep Learning (DL)' and 'Speech Recognition (SR)' topics from 2010 to 2016. The core words of topic NN were almost unchanged, while the distributions of some peripheral words such as 'sampling' and 'particle-swarm' changed noticeably. The words shared by topic NN in 2013 and topic DL in 2014 are 'training', 'performance', 'feature', 'classification' and 'neuron'. Because the word distributions of these two topics are similar, their KL divergence is 0.27, which is smaller than THR. The KL divergences between topic NN in 2013 and all topics in the next time slice are 0.55, 0.27, 0.21, 0.69, 1.84, 1.16, 0.92 and 1.53. The two smallest values correspond to topics DL and NN, and the rest are all greater than THR. In this case, topic DL split off from topic NN.

TABLE III. THE WORD DISTRIBUTIONS OF THREE TOPICS: NN, DL, SR FROM 2010 TO 2016

Year | Topic Name | Topic Words
2010 | NN | neural-network, neuron, perception, optimization, sampling, recognition, performance, training, threshold, feature
2010 | SR | speech-recognition, voice, noise, filtering, imitation, training, combination, performance, feature, extraction
2011 | NN | neural-network, neuron, convolution, training, hidden-layer, performance, feature, optimization, prediction, classification
2011 | SR | speech-recognition, semantic, mechanism, compress, transformation, combination, training, performance, feature, detection
2012 | NN | neural-network, convolution, hidden-layer, convergence, training, performance, feature, propagation, water-mark, perception
2012 | SR | speech-recognition, mechanism, noise, simulation, modeling, performance, feature, combination, detection, robot
2013 | NN | neural-network, neuron, perception, convergence, particle-swarm, performance, feature, training, classification, precision
2013 | SR | speech-recognition, acoustic, simulation, performance, noise, signal, combination, detection, feature, extraction
2014 | NN | neural-network, neuron, particle-swarm, convolution, wave, deep-learning, optimization, perception, feature, classification
2014 | DL | deep-learning, neuron, convergence, optimization, prediction, classification, performance, training, feature, extraction
2014 | SR | speech-recognition, acoustic, filtering, key-words, convolution, feature, performance, combination, training, extraction
2015 | NN | neural-network, neuron, convolution, combination, deep-learning, speech-recognition, feature, performance, training, classification
2015 | DL | deep-learning, hidden-layer, convergence, image, face-recognition, vector, feature, algorithm, training, extraction
2015 | SR | speech-recognition, semantic, neural-network, filtering, key-words, robot, feature, performance, combination, training
2016 | DL | deep-learning, particle-swarm, optimization, algorithm, image, face-recognition, classification, training, feature, performance
2016 | SR | speech-recognition, pronunciation, neural-network, neuron, semantic, simulation, combination, extraction, performance, feature

3) Analysis of journal topic evolution

The topic evolution of the journal (ID:003) from 2010 to 2016 is shown in Fig. 5. Because clustering assigns different IDs to the same topic in different time slices, topic abbreviations are used to represent the topics. As shown in Fig. 5, the KL divergences between all topics in 2015 and topic SR in 2016 are 0.74, 0.46, 0.23, 0.16, 0.81, 0.95 and 1.37. The values smaller than THR belong to topic NN and topic SR, and the rest are greater than THR, which means that topic NN was amalgamated into topic SR. The KL divergences between the topic ‘Aircraft (AC)’ in 2014 and all topics in 2015 are 1.72, 1.46, 1.25, 1.07, 1.20, 0.83 and 1.59. The smallest value, 0.83, is greater than THR, so topic AC disappeared at that time. Furthermore, the KL divergences between all topics in 2010 and the topic ‘Cloud Computing (CC)’ in 2011 are 1.16, 0.75, 1.37, 2.32 and 1.51. The smallest of them, 0.75, is greater than THR, so topic CC was an emerging one. By the same criterion, the topic ‘Target Tracking (TT)’ remained in the continuing state throughout, and the topic ‘Entity Recognition (ER)’ emerged in 2013.
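The threshold reasoning in this analysis can be sketched as a small classifier over the reported divergences. The value `THR = 0.4` is an assumption made only for illustration, and the function names are hypothetical. The rules are a simplified reading of the cases discussed in this section, not necessarily the exact detection procedure of the proposed method.

```python
THR = 0.4  # assumed threshold, for illustration only


def close_topics(divergences, thr=THR):
    """Indices of neighbouring-slice topics whose KL divergence is below thr."""
    return [i for i, d in enumerate(divergences) if d < thr]


def forward_action(divergences, thr=THR):
    """Action of a topic at time t, given its divergences to all topics at t+1."""
    n_close = len(close_topics(divergences, thr))
    if n_close == 0:
        return "disappearing"
    return "continuing" if n_close == 1 else "splitting"


def backward_action(divergences, thr=THR):
    """Action of a topic at time t+1, given divergences from all topics at t."""
    n_close = len(close_topics(divergences, thr))
    if n_close == 0:
        return "emerging"
    return "continuing" if n_close == 1 else "amalgamating"


# KL divergences reported in the text.
nn_2013_to_all_2014 = [0.55, 0.27, 0.21, 0.69, 1.84, 1.16, 0.92, 1.53]
all_2015_to_sr_2016 = [0.74, 0.46, 0.23, 0.16, 0.81, 0.95, 1.37]
ac_2014_to_all_2015 = [1.72, 1.46, 1.25, 1.07, 1.20, 0.83, 1.59]
all_2010_to_cc_2011 = [1.16, 0.75, 1.37, 2.32, 1.51]

print("NN 2013:", forward_action(nn_2013_to_all_2014))
print("SR 2016:", backward_action(all_2015_to_sr_2016))
print("AC 2014:", forward_action(ac_2014_to_all_2015))
print("CC 2011:", backward_action(all_2010_to_cc_2011))
```

With the assumed threshold, the four lists yield splitting, amalgamating, disappearing and emerging, which agrees with the actions described for topics NN, SR, AC and CC. A topic with exactly one close neighbour is labelled as continuing here; a full implementation would also need to check whether that neighbour is shared with another topic.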

Figure 5. Topic evolving snapshot model of journal (ID:003)

In addition, to measure the running time of TS-JTM, we compared it with ATM and DTM. Given the same input, the running times of TS-JTM, ATM and DTM are 23.8 minutes, 25.6 minutes and 24.2 minutes, respectively. TS-JTM is therefore close to DTM, and ATM has the longest running time. Combined with the perplexity results in Fig. 4, TS-JTM achieves not only a low perplexity but also a good running time.

VII. CONCLUSION

The topic evolution of academic journals reflects the tendency of research hotspots. Because the topic property and the time-sequence feature influence topic distributions and evolving processes, and because there are several kinds of evolving actions, tracing research hotspots is complicated. Combining the topic property with the time-sequence feature, we proposed the TS-JTM model. We extract research hotspots with it and verify its performance through a perplexity comparison. On this basis, we developed a time-sequence-based topic-snapshot evolution model of journal research hotspots and used the KL divergence to measure topic similarity, detecting the continuing, emerging, splitting, amalgamating or disappearing of topics between two neighboring topic-snapshots, which enables a fine-grained analysis of the evolution of journal research hotspots. Building on this work, our next step is to study how to recognize semantic relationships at the topic layer and the evolution of topic concepts.

ACKNOWLEDGMENT

This work was supported by Project 2016JC2011 of Science and Technology Plan of Hunan Province and Project 61073105 of National Natural Science Foundation of China.

REFERENCES

[1] Blei D M, Ng A Y, Jordan M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003: 993–1022.

[2] Rosen-Zvi M, Griffiths T, Steyvers M. The Author-Topic Model for Authors and Documents[C]. Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. 2004 :487-494.

[3] Mccallum A, Corrada-Emmanuel A,Wang X. The Author-Recipient-Topic Model for Author-Recipient-Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email[R]. University of Massachus-sets, Amherst, UM-CS-2004-0961,2005.

[4] Wallach H. Topic modeling: beyond bag-of-words[C]. Proceedings of the 23rd International Conference on Machine Learning , 2006 :977-984. [5] Blei D M, Lafferty J D. correlated topic model of Science[C]. Proceedings of the 23rd International Conference on Machine Learning , 2006:17-35.

[6] Blei D M, Griffiths T L, Jordan M I, et al. The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies[J]. Journal of the ACM, 2007 , 57(2) :17-24.

[7] Teh Y W, Jordan M I, Beal M J, et al. Hierarchical Dirichlet processes[J]. Journal of the American Statistical Association, 2006, 101(476): 1566-1581.

[8] Fang Y, Huang H Y, Xin X, et al.Topic Evolutionary Analysis for Dynamic Topic Number[J]. Journal of Chinese Information Processing, 2014,28(3):142-149.

[9] Liu S P, Yin J, Huang Y, et al. Topic Mining from Microblogs Based on MB-HDP Model [J]. Chinese Journal of Computers, 2015(7):1408-1419. [10] Cai Y, Huang F, Peng M Y. Discipline Hotspots Mining Based on Hierarchical Dirichlet Topic Clustering and Co-word Network [J]. Journal of Software, 2016,11(11):1089-100.

[11] Blei D M, Lafferty J D. Dynamic Topic Models[C]. Proceedings of the 23rd International Conference on Machine Learning, 2006:113-120. [12] Dubey A, Hefny A, Williamson S, et al. A non-parametric mixture

model for topic modeling over time[J]. Statistics, 2013: 530-538. [13] Li J, Yang K, Cui L, et al. Dynamic Online HDP model for discovering

evolutionary topics from Chinese social texts[J]. Neurocomputing, 2016, 171(C):412-424.

[14] Zhang J, Song Y, Zhang C, et al. Evolutionary hierarchical dirichlet processes for multiple correlated time-varying corpora[C]. Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2010:1079-1088.

[15] Berend D, Harremoës P, Kontorovich A. Minimum KL-divergence on complements of balls[J]. IEEE Transactions on Information Theory, 2014, 60(6):3172-3177. 0.25 0.20 0.29 0.22 1 min( ( t|| t )) 0.4 D TT TT  Continuing 0.18 0.27 2010 2011 1 min( (Dj ||CC ))0.75 2015 2016 ( || ) 0.23 D NN SR  2012 2013 1 min( (Dj ||ER ))0.71 2014 2015 min( (D AC ||j ))0.83 Amalgamation Splitting 2013 2014 ( || ) 0.27 D NN DL  0.19 0.21 0.17 0.26 0.20 0.23 DL 21 2 33 2 26 2 AC 13 2 6 2 18 2 36 2 23 2 Disappearing +1 min( ( t|| t )) 0.4 D FR FR  FR 2 2 7 2 12 2 2 8 2 7 16 2 30 2 Continuing CC Emerging 22 2 34 2 28 2 36 2 47 2 36 2 2010 2011 2012 2013 2014 2015 2016 TT Emerging 2 2 2 2 ER 39 18 20 18 SR 6 2 1 2 39 2 34 2 14 2 26 2 12 2 NN 2 5 15 2 6 2 14 2 19 2 15 2 2 38 2 2 27 2 11 2 22 2 2 29 2 15