Unsupervised Learning Of Finite Mixture Models With Deterministic Annealing For Large scale Data Analysis

Academic year: 2020

(1)

Unsupervised Learning Of Finite Mixture Models With Deterministic Annealing For Large-scale Data Analysis

Student: Jong Youl Choi
Advisor: Geoffrey Fox

SALSA project
http://salsahpc.indiana.edu

Thesis Defense, January 12, 2012
School of Informatics and Computing
Pervasive Technology Institute

(2)

I. Finite Mixture Model (FMM)
II. Model Fitting with Deterministic Annealing (DA)
III. Generative Topographic Mapping with Deterministic Annealing (DA-GTM)
IV. Probabilistic Latent Semantic Analysis with Deterministic Annealing (DA-PLSA)
V. Conclusion And Future Work

(4)

Machine Learning Problem

▸ Learning from observed data
▸ Inferring a data generating process
▸ Essential structure, abstract, summary, …

(5)

Finite Mixture Model (FMM)

▸ Component model
–  A mixture of simple distributions (components)
–  Hidden or latent components
–  Serve as an abstract or summary of the data
▸ Generative model
–  Simulates the observed random sample
▸ Convenient and flexible
▸ Model fitting is hard
–  Too many parameters

[Figure: example mixture density; horizontal axis from 0 to 10, vertical axis from 0.00 to 0.25]

Jong Youl Choi (Jan 12, 2012)

(6)

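Two Finite Mixture Models

▸ FMM Type-1: traditional model
–  Component ≈ Cluster
–  Clustering, Gaussian Mixture (GM), GTM, …
–  Components ω1, …, ωK generate each observation xn with mixing weights π1, …, πK, which sum to 1 over k = 1, …, K
▸ FMM Type-2: factor model
–  Component ≈ Generator
–  PLSA
–  Components ω1, …, ωK are shared by the observations x1, x2, …, xN; each observation i has its own mixing weights, which sum to 1 over k = 1, …, K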

(7)

Model Fitting

▸ Bayes' Theorem
–  P(θ | X) = P(X | θ) P(θ) / P(X)
–  θ : Parameters
–  X : Observations
–  P(θ) : Prior
–  P(X | θ) : Likelihood
▸ Maximum Likelihood Estimator (MLE)
–  Used to find the most plausible θ, given X
–  Maximize the likelihood or log-likelihood → an optimization problem

(8)

▸ Problems in MLE
–  The observation X is often incomplete
–  Latent (hidden) variables Z exist
–  It is hard to explore the whole parameter space
▸ EM algorithm
–  Random initialization of θ_old
–  E-step: compute the expectation under P(Z | X, θ_old)
–  M-step: maximize the expected (log-)likelihood
–  Repeat the E- and M-steps until convergence
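The E-step and M-step loop listed above can be sketched for a one-dimensional Gaussian mixture. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function name `em_gmm_1d` and all of its parameters are hypothetical, and the random initialization named on the slide is swapped for a deterministic quantile-based start so that the run is reproducible.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=500, tol=1e-8):
    """Fit a K-component 1-D Gaussian mixture with plain EM (illustrative sketch)."""
    n = x.size
    # Assumption: deterministic start at evenly spaced data quantiles
    # (the slide uses a random initialization instead).
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    prev_llh = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities P(Z | X, theta_old), computed in log space.
        log_p = (np.log(pi) - 0.5 * np.log(2.0 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: closed-form maximizers of the expected log-likelihood.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
        pi = nk / n
        # Log-likelihood under the parameters used in this E-step.
        llh = log_norm.sum()
        if llh - prev_llh < tol:
            break
        prev_llh = llh
    return mu, var, pi, llh
```

Because EM only climbs to the nearest local optimum, a different starting point can end in a different solution, which is the sensitivity the following slides set out to address.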

(9)

Motivation

▸ Local optimum problem
–  EM is easily trapped in a local optimum
–  Sensitive to initial conditions or parameters
–  High-variance solutions
–  Common remedies: simulated annealing (SA), genetic algorithms (GA), …
▸ Overfitting problem
–  Poor generalization quality
–  Directly related to predictive power
–  Common remedies: early stopping, cross validation, …

[Figure: local and global maxima and minima (Maxima and Minima, Wikipedia)]
[Figure: training error and validation error curves, from underfitting to overfitting]

(10)

▸ Solve FMM with DA
–  Avoid local optimum problem
–  Avoid overfitting problem
▸ Present DA applications
–  GTM with DA (DA-GTM)
–  PLSA with DA (DA-PLSA)
▸ Experimental results
–  Data visualization
–  Text mining

(12)

▸ Optimization
–  Gradually lowers a numeric temperature
–  No stochastic process
▸ Local optimum avoidance
–  Traces the global solution by changing the level of smoothness
–  Smoothed → bumpy
▸ Principle of Maximum Entropy
–  A solution with maximum entropy
–  Minimize the free energy F of the log-likelihood
–  Eventually, the log-likelihood is maximized

(13)

Finite Mixture Model with EM

[Flowchart: E-step → M-step → Update Log-Likelihood → Converged? If no, return to the E-step.]

(14)

Finite Mixture Model with DA

[Flowchart: Set Temp High → E-step → M-step → Update Free Energy → Converged? (if no, return to the E-step) → Last? (if no, Update Temperature and return to the E-step; if yes, stop)]

•  Minimize free energy
•  Free energy F = f(Temp, Entropy of Log-Likelihood)
•  Annealing (High → Low)
•  High temperature
-  Soft (or fuzzy) association
-  Smooth cost function
•  Low temperature
-  Hard association
-  Bumpy cost function
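The annealing loop in this flowchart can be sketched for a one-dimensional mixture. This is a simplified illustration, not the author's code. The name `da_fit_means`, the fixed variance `sigma2`, the uniform weight 1/K, the exponential cooling factor `alpha`, and the small deterministic nudge applied to the centers at each temperature are all assumptions made for the example.

```python
import numpy as np

def da_fit_means(x, k, t_start=50.0, alpha=0.9, sigma2=1.0,
                 inner_iter=100, tol=1e-9):
    """Deterministic-annealing fit of K centers in 1-D (illustrative sketch)."""
    mu = np.full(k, x.mean())            # at high temperature all centers coincide
    t = t_start                          # "Set Temp High"
    while True:
        # Assumption: a tiny deterministic nudge so that coinciding centers
        # can separate once the temperature is low enough.
        mu = mu + 1e-3 * np.linspace(-1.0, 1.0, k)
        prev_f = np.inf
        for _ in range(inner_iter):
            # Cost d_nk = -log{c(n,k) p(x_n | y_k)} with c(n,k) = 1/K.
            d = (0.5 * (x[:, None] - mu) ** 2 / sigma2
                 + 0.5 * np.log(2.0 * np.pi * sigma2) + np.log(k))
            # E-step: tempered (soft) associations proportional to exp(-d_nk / T).
            log_z = np.logaddexp.reduce(-d / t, axis=1)
            resp = np.exp(-d / t - log_z[:, None])
            # M-step: each center moves to its responsibility-weighted mean.
            mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
            # Update free energy F = -T * sum_n ln Z_n and test convergence.
            f = -t * log_z.sum()
            if abs(prev_f - f) < tol:
                break
            prev_f = f
        if t <= 1.0:                     # "Last?": stop after the T = 1 level
            break
        t = max(1.0, alpha * t)          # "Update Temperature"
    return mu, f
```

At high temperature the associations are nearly uniform and the centers stay together; as the temperature drops they become harder and the centers separate, which is the soft-to-hard transition described in the bullets.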

(15)

Free Energy for Finite Mixture Model

▸ Free Energy
\[ F = D - TS = -T \sum_{n=1}^{N} \ln Z_n \]
-  D : expected cost <d_nk>
-  S : Shannon entropy
-  T : computational temperature
-  Z_n : partition function
\[ Z_n = \sum_{k=1}^{K} \exp\left( -\frac{d_{nk}}{T} \right) \]
▸ General form for Finite Mixture Model
–  Cost function: \( d_{nk} = -\log\{ c(n,k)\, p(x_n \mid y_k) \} \)
\[ F_{FMM} = -T \sum_{n=1}^{N} \log \sum_{k=1}^{K} \{ c(n,k)\, p(x_n \mid y_k) \}^{1/T} \]
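As a numerical companion to the free energy of a finite mixture, F = -T Σ_n log Σ_k {c(n,k) p(x_n | y_k)}^(1/T), the sketch below evaluates it from a matrix of log terms. The function names and the `log_cp` argument are hypothetical; `log_cp[n, k]` is assumed to hold log{c(n,k) p(x_n | y_k)}.

```python
import numpy as np

def free_energy(log_cp, t):
    """F = -T * sum_n log sum_k {c(n,k) p(x_n|y_k)}^(1/T), computed in log space."""
    return -t * np.logaddexp.reduce(log_cp / t, axis=1).sum()

def log_likelihood(log_cp):
    """L = sum_n log sum_k c(n,k) p(x_n|y_k)."""
    return np.logaddexp.reduce(log_cp, axis=1).sum()
```

Two limits are worth checking with such a helper. At T = 1 the free energy equals the negative log-likelihood. As T approaches 0 it approaches the hard-assignment cost, the sum over n of the smallest d_nk.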

(17)

Dimension Reduction

▸ Simplification, feature selection/extraction, visualization, etc.
▸ Preserve the original data's information as much as possible in a lower dimension

[Figure: high-dimensional data (PubChem data, 166 dimensions) mapped to low-dimensional data]

(18)

Generative Topographic Mapping

▸ An algorithm for dimension reduction
–  Find an optimal set of K latent variables in a latent space
–  f is a non-linear mapping
–  Strict Gaussian mixture model
–  EM model fitting
▸ DA optimization can improve the fitting process

[Figure: the mapping f takes a latent point z_k in the latent space (L dimensions) to y_k in the data space (D dimensions), where the data points x_n lie]

(19)

Advantages of GTM

▸ Computational complexity is O(KN), where
–  N is the number of data points
–  K is the number of latent variables or clusters, K << N
▸ Efficient compared with MDS, which is O(N^2)
▸ Produces a more separable map (right) than PCA (left)

[Figure: two scatter plots with axes Dim1 and Dim2, PCA (left) and GTM (right), of the oil flow data (1000 points, 12 dimensions). Panel annotation: Maximum Log-Likelihood = 1721.554]

(20)

▸ GTM Model Setting
–  Strict Gaussian assumption: \( p(x_n \mid y_k) = \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \)
–  Constant mixing weight: \( c(n,k) = \frac{1}{K} \)
▸ Free Energy for GTM
\[ F_{FMM} = -T \sum_{n=1}^{N} \log \sum_{k=1}^{K} \{ c(n,k)\, p(x_n \mid y_k) \}^{1/T} \]

(21)

GTM with Deterministic Annealing

▸ EM-GTM (traditional method)
–  Objective function: \( L = \sum_{n=1}^{N} \ln \left\{ \frac{1}{K} \sum_{k=1}^{K} p(x_n \mid y_k) \right\} \)
–  Optimization: maximize log-likelihood L
–  Pros & cons: very sensitive to parameters; trapped in local optima; faster
▸ DA-GTM (new algorithm)
–  Objective function: \( F = -T \sum_{n=1}^{N} \ln \left\{ \left( \frac{1}{K} \right)^{1/T} \sum_{k=1}^{K} p(x_n \mid y_k)^{1/T} \right\} \)
–  Optimization: minimize free energy F
–  Pros & cons: less sensitive to poor parameters; avoids local optima; requires more computational time
▸ When T = 1, L = -F.

(22)

Cooling Schedules

▸ Traditional method: static cooling schedule
▸ Adaptive cooling, a dynamic cooling schedule
–  Able to adjust to the problem on the fly
–  Move to a temperature at which F may change

[Figure: temperature versus iteration for the cooling schedules; surviving panel labels: Linear, Exponential]
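The two static schedules named in the figure can be sketched as plain generators of temperature sequences. The function names, the step size `delta`, and the stopping temperature `t_end` are assumptions made for illustration. The multiplicative factor `alpha` follows the usual exponential form T ← αT. The adaptive schedule is not sketched here because it depends on the critical temperatures of the model being fitted.

```python
def linear_schedule(t_start, delta, t_end=1.0):
    """Static linear cooling: subtract a fixed step until t_end is reached."""
    temps = []
    t = t_start
    while t > t_end:
        temps.append(t)
        t -= delta
    temps.append(t_end)      # always finish exactly at the final temperature
    return temps

def exponential_schedule(t_start, alpha, t_end=1.0):
    """Static exponential cooling: multiply by alpha (0 < alpha < 1) each step."""
    temps = []
    t = t_start
    while t > t_end:
        temps.append(t)
        t *= alpha
    temps.append(t_end)
    return temps
```

For example, `linear_schedule(5.0, 1.0)` visits the temperatures 5, 4, 3, 2, 1, and `exponential_schedule(4.0, 0.5)` visits 4, 2, 1. Both are fixed in advance and cannot react to where the free energy actually changes.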

(23)

Phase Transition

▸ Discrete behavior of DA
–  Over some temperature ranges, the free energy is stable
–  At a specific temperature, known as the critical temperature Tc, it starts to explode
▸ Critical temperature Tc
–  The free energy F changes drastically at Tc
–  Second derivative test: the Hessian matrix loses its positive definiteness at Tc
–  det(H) = 0 at Tc, where
\[ H = \begin{bmatrix} H_{11} & \cdots & H_{1K} \\ \vdots & \ddots & \vdots \\ H_{K1} & \cdots & H_{KK} \end{bmatrix}, \qquad H_{kk} = \frac{\partial^2 F}{\partial y_k \, \partial y_k^{Tr}}, \qquad H_{kk'} = \frac{\partial^2 F}{\partial y_k \, \partial y_{k'}^{Tr}} \]
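The second derivative test can be illustrated with a generic eigenvalue check: a symmetric matrix is positive definite exactly when its smallest eigenvalue is positive, so the point where that eigenvalue reaches zero is also where det(H) = 0. The helper below is a hypothetical sketch that takes an already assembled Hessian. It does not reproduce how the thesis builds H for GTM, and the tolerance `eps` is an assumption.

```python
import numpy as np

def loses_positive_definiteness(hessian, eps=1e-10):
    """Return True when the smallest eigenvalue of the symmetric part is <= eps."""
    h = 0.5 * (hessian + hessian.T)      # symmetrize against round-off
    return bool(np.linalg.eigvalsh(h).min() <= eps)
```

In an annealing loop, such a test would be evaluated as the temperature is lowered, and the first temperature at which it returns True marks a phase transition.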

(24)

DA-GTM with Adaptive Cooling

[Figure, first panel: log-likelihood versus iteration, with the average log-likelihood of EM-GTM shown for reference]
[Figure, second panel: temperature versus iteration (ticks from 2000 to 8000), marking the starting temperature and the 1st critical temperature]

(25)

DA-GTM Result

[Figure: log-likelihood (llh) versus start temperature (N/A, 5.0, 7.0, 9.0) for four methods: EM, Adaptive, Exp-A, and Exp-B. The annotations α = 0.95 and α = 0.99 mark the exponential schedules; the 1st Tc is 4.64.]

(26)

Conclusion

▸ GTM with Deterministic Annealing (DA-GTM)
–  Overcomes shortcomings of the traditional EM method
–  Avoids local optima
–  Robust against poor initial parameters
▸ Phase transitions in DA-GTM
–  Use the Hessian matrix for detection
–  Eigenvalue computation
▸ Adaptive cooling schedule
–  New convergence approach

(28)

Corpus Analysis

▸ Polysemes
–  A word with multiple meanings
–  E.g., 'thread'
▸ Synonyms
–  Different words that have a similar meaning, a topic
–  E.g., 'car' and 'automotive'

[Figure: diagram of word nodes labeled Wa, Wb, Wc, and Wd]

(29)

Probabilistic Latent Semantic Analysis (PLSA)

▸ Topic model
–  Assume K latent topics generating words
–  Each document is a mixture of K topics
▸ FMM Type-2
–  The original proposal used EM for model fitting

[Figure: Topic 1, Topic 2, …, Topic K linked to Doc 1, Doc 2, …, Doc N]

(30)

An Example of DA-PLSA

Topic 1      Topic 2      Topic 3      Topic 4      Topic 5
percent      stock        soviet       bush         percent
million      market       gorbachev    dukakis      computer
year         index        party        percent      aids
sales        million      i            i            year
billion      percent      president    jackson      new
new          stocks       union        campaign     drug
company      trading      gorbachevs   poll         virus
last         shares       government   president    futures
corp         new          new          new          people
share        exchange     news         israel       two

Top 10 words per topic for the AP news dataset with 30 topics, processed by DA-PLSA; only 5 of the 30 topics are shown.

(31)

Overfitting Problem

▸ Predictive power
–  Maintain good performance on unseen data
–  A generalized model is preferable

[Figure: a model is trained on a sample and then queried (inference) with unseen data to return the queried documents]

(32)

▸ PLSA Model Setting
–  Use a multinomial distribution: \( p(x_n \mid y_k) = \mathrm{Multi}(x_n \mid y_k) \)
–  Flexible mixing weight: c(n,k) varies with both n and k
▸ Free Energy for PLSA
\[ F_{FMM} = -T \sum_{n=1}^{N} \log \sum_{k=1}^{K} \{ c(n,k)\, p(x_n \mid y_k) \}^{1/T} \]

(33)

▸ DA can control smoothness
–  Smoothed solution at high temperature
–  The solution becomes more specific as annealing proceeds
–  Early stopping yields a smoothed (general) model
▸ Stop condition
–  Use a V-fold cross validation method
–  Measure total perplexity, the sum of the log-likelihoods of both the training set and the testing set
▸ Tempered EM was proposed by Hofmann (the original author of PLSA), but its annealing is done in the reverse direction

(34)

Annealing in DA-PLSA

▸ Improved fitting quality with the training set during annealing
▸ Over-fitting at Temp = 1
▸ Early-stop temperatures depend on the scheme:
–  A (a=0.0, b=1.0)
–  B (a=0.5, b=0.5)
–  C (a=0.9, b=0.1)
–  D (a=1.0, b=0.0)

[Figure: curves for the training set, the testing set, Mix B, and Mix C during annealing, with the early-stop points A, B, C, and D marked]
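One way to read the schemes A to D is as a weighted score, a × (training log-likelihood) + b × (testing log-likelihood), tracked along the annealing path, with the early-stop temperature taken where the score peaks. That reading is an assumption: the slide does not state which weight belongs to which set. The helper name `pick_early_stop` is hypothetical and the numbers in the usage note are made up purely for illustration.

```python
def pick_early_stop(temps, train_llh, test_llh, a, b):
    """Return the temperature maximizing a * train_llh + b * test_llh.

    Assumption: `a` weights the training set and `b` the testing set.
    """
    scores = [a * tr + b * te for tr, te in zip(train_llh, test_llh)]
    best = max(range(len(temps)), key=scores.__getitem__)
    return temps[best]
```

With synthetic values where the training score keeps improving down to T = 1 while the testing score peaks earlier, a training-only weighting selects T = 1, a testing-only weighting stops at the earlier peak, and mixed weightings fall in between.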

(35)

Predicting Power in DA-PLSA

[Figure: log of word probabilities of AP data (100 topics for 10,473 words), shown as two panels: early stop (Temp = 49.98) and over-fitting (Temp = 1.0). Vertical axis: word index (1000 to 10000); horizontal ticks: 20 to 100; scale ticks from 40 down to 0, with a label marking high word probability.]

(36)

AP data with DA-PLSA

[Figure: two panels, "Test Set Only" and "Training Set Only"]

(37)

NIPS data with DA-PLSA

[Figure: two panels, "Test Set Only" and "Training Set Only", each titled "Maximum Sum of Log-Likelihood of Training And Testing Sets". Axes: latent space dimensions (1, 5, 10, 50, 100, 500) versus log-likelihood. Methods: DA-Train, DA-Test, EM-Train, EM-Test.]

(38)

DA-PLSA with DA-GTM

[Pipeline: Corpus (set of documents) → DA-PLSA → Corpus in K-dimension → Embedded Corpus in 3D]

(39)

AP Data Top Topic Words

▸ In the previous picture, we found among 500 topics:

Topic 331    Topic 435    Topic 424    Topic 492    Topic 445    Topic 406
lately       lately       mandate      mandate      mandate      plunging
oferrell     oferrell     kuwaits      kuwaits      lately       referred
mandate      ACK          cardboard    cardboard    ACK          informal
ACK          fcc          commuter     commuter     cardboard    Anticommu.
fcc          mandate      ACK          lately       fcc          origin
cardboard    cardboard    fcc          ACK          commuter     details
commuter     exam         lately       exam         oferrell     relieve
exam         commuter     exam         fcc          exam         psychologist
kuwaits      fabrics      fabrics      oferrell     kuwaits      lately
fabrics      corroon      oferrell     fabrics      fabrics      thatcher

ACK : acknowledges
Anticommu. : anticommunist

(40)

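EM vs. DA-{GTM, PLSA}

▸ Objective functions
–  GTM, EM: \( L = \sum_{n=1}^{N} \ln \left\{ \frac{1}{K} \sum_{k=1}^{K} p(x_n \mid y_k) \right\} \)
–  GTM, DA: \( F = -T \sum_{n=1}^{N} \ln \left\{ \left( \frac{1}{K} \right)^{1/T} \sum_{k=1}^{K} p(x_n \mid y_k)^{1/T} \right\} \)
–  PLSA, EM: \( L = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} c(n,k)\, \mathrm{Multi}(x_n \mid y_k) \)
–  PLSA, DA: \( F = -T \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \{ c(n,k)\, \mathrm{Multi}(x_n \mid y_k) \}^{1/T} \)
▸ Optimization
–  EM: maximize log-likelihood L
–  DA: minimize free energy F
▸ Pros & cons
–  EM: very sensitive; trapped in local optima; faster
–  DA: less sensitive to the initial condition; finds the global optimum; requires more computational time
▸ Note: when T = 1, L = -F. This implies that EM can be treated as a special case of DA.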

(42)

▸ Finite Mixture Model (FMM) problems
–  FMM-1 and FMM-2
–  Maximize log-likelihood for model fitting (MLE)
–  Traditional solutions use EM
▸ Solve FMMs with DA
–  Avoid local optimum problem
–  Find a generalized (smoothed) solution
▸ Enhance and develop two data mining algorithms
–  DA-GTM
–  DA-PLSA

(43)

▸ Determine the number of components
–  Helps to choose the right number of clusters, topics, lower dimensions, …
–  Bayesian model selection, minimum description length (MDL), Bayesian information criterion (BIC), …
–  Needs to be developed in a DA framework
▸ Quality study for DA-PLSA
–  Comparison with LDA
–  Precision and recall measurements
▸ Performance study for data-intensive analysis
–  MPI, MapReduce, PGAS, …

(44)

▸  [CCPE] J. Y. Choi, S.-H. Bae, J. Qiu, B. Chen, and D. Wild. Browsing large scale cheminformatics data with dimension reduction. Concurrency and Computation: Practice and Experience, 2011.

▸  [ECMLS] J. Y. Choi, S.-H. Bae, J. Qiu, G. Fox, B. Chen, and D. Wild. Browsing large scale cheminformatics data with dimension reduction. In Workshop on Emerging Computational Methods for Life Sciences (ECMLS), in conjunction with the 19th ACM International Symposium on High Performance Distributed Computing (HPDC) 2010, HPDC '10, pages 503–506, Chicago, Illinois, June 2010. ACM.

▸  [CCGrid] J. Y. Choi, S.-H. Bae, X. Qiu, and G. Fox. High performance dimension reduction and visualization for large high-dimensional data analysis. Cluster Computing and the Grid, IEEE International Symposium on, 0:331–340, 2010.

▸  [ICCS] J. Y. Choi, J. Qiu, M. Pierce, and G. Fox. Generative Topographic Mapping by Deterministic Annealing. In Proceedings of the 10th International Conference on Computational Science and Engineering (ICCS 2010), 2010.

(45)

Thank you!

Questions?

Email me at jychoi@cs.indiana.edu
