• No results found

Probabilistic Model. Search Engine

N/A
N/A
Protected

Academic year: 2021

Share "Probabilistic Model. Search Engine"

Copied!
16
0
0

Loading.... (view fulltext now)

Full text

(1)

Probabilistic Model

(2)

Probability: Definitions & Notations

 Probability is a numerical measure of the likelihood of a given event.

 number of desirable outcome / total number of possible outcomes

 P(A) denotes the probability that event A will occur

1 - P(A) : probability that A will not occur

 0.00  P(A)  1.00

 The probability that any event A will occur varies from 0 to 1.

+---+---+

0 .5 1 never occur always occur half of the time occur

) ( 1 ) ( ) ( )

( A P A P A P A

P  

c

  

(3)

Probability: Definitions & Notations

 P(A or B)  P(A  B)  The probability that either A or B will occur

 Mutually Exclusive events

Events A and B cannot occur simultaneously

e.g. A = coin toss turns up a head B = coin toss turns up a tail

 Not Mutually Exclusive events

Events A and B can occur simultaneously

e.g. A = a card drawn is an ace B = a card drawn is a heart

 P(A and B)  P(A  B)  The probability that both A and B will occur

 Independent Events

The occurrence of A is independent of the occurrence of B

e.g. A = the first coin toss turns up a head B = the second coin toss turns up a tail

 Dependent Events

The occurrence of A is not independent of the occurrence of B

e.g. A = the first card drawn is an ace

B = the second card drawn (without replacement) is a heart

 P(B|A)  The probability that B will occur given A has occurred.

(4)

Probability: Examples

P(A or B)  P(A  B)  The probability that either A or B will occur

 Mutually Exclusive events

 Events A and B cannot occur simultaneously

 P(A  B) = P(A) + P(B)

 What is the probability of rolling a die and getting 1 or 6?

A = The die rolls a 1

B = The die rolls a 6

P(A) = 1/6

P(B) = 1/6

P(A  B) = 1/6 + 1/6 = 1/3

 Not Mutually Exclusive events

 Events A and b can occur simultaneously

 P(A  B) = P(A) + P(B) - P(A  B)

 What is the probability that a card drawn from a deck is an ace or spade?

A = The card is an ace

B = The card is a spade

P(A) = 4/52

P(B) = 13/52

P(A  B) = 1/52

P(A  B) = 4/52 + 13/52 – 1/52 = 16/52 = 4/13

(5)

Probability: Examples

P(A and B)  P(A  B)  The probability that both A and B will occur

 Independent Events

 The occurrence of A is independent of the occurrence of B

 P(A  B) = P(A) P(B)

 What is the probability that fair coin tosses will come up heads twice in a row?

A = a head on the first toss

B = a head on the second toss

P(A) = P(B) = 1/2

P(A  B) = 1/2*1/2 = 1/4

 Dependent Events

 The occurrence of A is not independent of the occurrence of B

 P(A  B) = P(A) P(B|A) = P(B) P(A|B)

 What is the probability that two cards drawn from a deck will both be aces?

A = the first card is an ace

B = the second card is an ace

P(A) = 4/52 = 1/13

P(B|A) = 3/51 = 1/17

P(A  B) = 1/13*1/17 = 1/221

(6)

Addition

• Mutually Exclusive

P(A  B  C) = P(A) + P(B) + P(C)

• Not Mutually Exclusive

P(A  B  C) = P(A) + P(B) + P(C) – P(A  B) - P( (A  B)  C)

AB = A  B

P(A  B  C) = P(AB  C) = P(AB) + P(C) – P(AB  C)

» P(AB) = P(A) + P(B) – P(A  B)

» P(AB  C) = P( (A  B)  C)

Multiplication

• Independent

P(A  B  C) = P(A) P(B) P(C)

• Dependent

P(A  B  C) = P(A) P(B|A) P(C|(A  B))

AB = A  B

P(A  B  C) = P(AB  C) = P(AB) P(C|AB)

» P(AB) = P(A) P(B|A)

Combination of Probability

OR  addition

P(A  B) AND  multiplication

P(A  B)

Mutually Exclusive P(A) + P(B)

NOT Mutually Exclusive P(A) + P(B) – P(A  B)

Independent P(A) P(B)

Dependent P(A) P(B|A)

P(A) P(B)

Not Mutually Exclusive Events P(A  B)

P(A) P(B)

Mutually Exclusive Events

(7)

Probability: Examples

P(A|B)  The probability that A will occur given B has occurred.

 The conditional probability of A given B

 P(A|B) = P(A  B) / P(B)

Derived from P(A  B) = P(B) P(A|B)

 At a restaurant, 90% of the customers order a burger. If 72% of the customers order a burger and fries, what is the probability that a customer who orders a burger will also order fries?

A = a customer orders fries

B = a customer orders a burger

P(B) = 90/100, P(A  B) = 72/100

P(A|B) = (72/100) / (90/100) = 72/90

(8)

Probability: Bayes Theorem

 Formula for calculating conditional probabilities

 P(A|B) as a function of P(B|A)

 Bayesian Inference

 Learning about a hypothesis from data

Hypothesis

 Prior probability: initial subjective degree of belief in the hypothesis

Learning

 Posterior probablity: the degree of belief in the hypothesis is updated based on evidence

H = hypothesis, E = evidence

P(H) = prior probability of H

P(E) = marginal probability of E

Probability of witnessing the evidence E

P(E|H) = likelihood of E given H

Conditional probability of seeing the evidence E given that hypothesis H is true

P(H|E) = posterior probability of H given E

)

| ( ) ( )

| ( ) (

)

| ( ) ( )

( )

| ( ) ( )

( ) ) (

|

( P A P B A P A P B A

A B P A P B

P A B P A P B

P B A B P

A

P     

)

| ( ) ( )

| ( ) ( ) (

) ( )

(B P B A P B A P A P B A P A P B A

P      

)

| ( ) ( )

| ( ) (

)

| ( ) ( )

(

)

| ( ) ) (

|

( P H P E H P H P E H

H E P H P E

P

H E P H E P

H

P   

(9)

Probability: Bayesian Inference

 Box 1 has 10 oranges and 20 apples. Box 2 has 20 oranges and 10 apples. If you randomly picked out a fruit from a random box and got an apple, what is the probability that it came from box 1?

 H= The fruit (e.g., apple) comes from box 1

 Hc= The fruit (e.g., apple) comes from box 2

 E = picks an apple

 P(H|E) = P(H)P(E|H) / P(E) = P(H)P(E|H) / (P(H)P(E|H) + P(HC)P(E|HC))

 P(H) = P(HC) = ½

 P(E|H) = 20/30

 P(E|HC) = 10/30

 P(H|E) = (½ *2/3) / (½ *2/3 + ½ *1/3)

)

| ( ) ( )

| ( ) (

)

| ( ) ( )

(

)

| ( ) ) (

|

( P H P E H P H P E H

H E P H P E

P

H E P H E P

H

P   

(10)

Probability: Odds

Probability and odds are two ways of measuring the chances of an event occurring.

Probability

• number of desirable outcome / total number of possible outcomes

occurrence / whole

• chances for / total chances

• predictive: long run ratio

Odds

• number of desirable outcome / number of undesirable outcomes

occurrence / non-occurrence

• chances for / chances against

O(A) = 2 / 1  odds are 2 to 1 in favor of A

O(A) = 1 / 2  odds are 2 to 1 against A

• payoff to break even

Probability and odds have similar values for infrequent events (P < 0.1)

Probability Odds Transformation

Range 0 to 1 0 to

Comparison

1

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

 9.00 4.00 2.33 1.50 1.00 0.67 0.43 0.25 0.11 0.00

) ( 1

) ) (

( O A

A A O

P  

) ( 1

) ) (

( P A

A A P

O  

(11)

Logarithm

Logarithm of x to base b

x = by

Common Logarithm

• Logarithm to base 10

Natural Logaritm

• Logarithm to base e, where e = 2.718281828…

Properties

Log Transformation

x = by,

y

b

x  log

x x log log

10

x

e

x ln

log 

2 1

2

1 log log

logx xxx

2 1

2

1 log log

log x x

x

x  

x k xk log

log 

b x x

c c

b log

log log

y

b x

log

b x b

y b

x c y c b c

c log log log log

log   

logx

(12)

IR: Probabilistic Model

Probability Ranking Principle

• Assumptions

− Each document is either relevant or not relevant

− Relevance of one document does not affect that of another document

• Ranking documents according to the probability of relevance to the query

 Maximizes the overall retrieval effectiveness

Binary Independence Retrieval Model

Represent documents d as binary term vectors

• Rank documents according to the relevance odds (i.e., probability of relevance)

1. compute the odds that document d will be relevant:

» : probability that d is relevant

» : probability that d is not relevant

2. sort the documents by the descending order of

• Term Independence Assumption

− Presence/absence of one term is unrelated to the presence/absence of any other term.

− assumes that occurrence of terms in documents are independent from each other





otherwise d to belongs t

t if where t

t

d n i i

, 0

, ), 1

,..., (1

)

| (

)

| ) (

|

( P R d

d R d P

R

O

 

 )

| (R d P

)

| (R d P

)

| (R d

O

(13)

IR: BIR Model Derivation

)

| (

)

| ( ) (

) (

R d P

R d P R P

R

P

 O(R)

 )

| (

)

| ) (

|

( P R d

d R d P

R

O

 

) (

)

| ( ) (

) (

)

| ( ) (

d P

R d P R P

d P

R d P R P

 ( | )

)

| (

1P t R

R t P

i n i i

apply Bayes Theorem assume Term Independence

P(R) = probability that a (randomly selected) document is relevant

split by presence/absence of terms

)

| 0 (

)

| 0 ( )

| 1 (

)

| 1 ) (

(

0

1 P t R

R t

P R t P

R t R P

O

i i i t

i

ti i

 

 

Ignore , which is constant across documents, and take the log:

)

| 1

(t R

P pii

)

| 1

(t R

P qii

i i i t

i

t q

p q

R p O d R O

i

i

 

1

) 1 ( )

| (

0 1

i i

i i

i i

t

i i t

i i n

i i

i i n

i

i i i n

i t

i t i

t i t i n

i q

p q

R p O t

q if p

t q if p q

q p R p

O



 

 

 







 

 

 

 

1

1 1

1 1

1

1 1

) 1 ( 0 ),

1 (

) 1 (

1 ,

) 1 (

) 1 ) (

(

Let : probability that ti occurs in a relevant document : probability that ti occurs in a non-relevant document

) (

) ) (

( P R

R R P

O







 

 

 

 







 

 

 



n

i

t

i i t

i i t

i i t

i i n

i

i i

i i

q p q

p q

p q

d p RSV

1

1 1

1 1

log 1 1

log 1 )

(

 



 

 

 



 

 

n

i i

i

i i i

i i i n

i i

i i

i i

i q

p q

t p q t p q

t p q

t p

1

1 1

log1 1

log1 1 log

log1 ) 1 ( log

 

 

 









 

  n

i

n

i i

i

i i

i i i n

i i

i

i i i i

i q

p p

q q t p

q p

q p q p t

1 1

1 1

log1 )

1 (

) 1 log ( 1

log1

1

log1 Ignore ,

which is constant across documents

n

i i

i

q p

1 1

log1

 n

i i n

n i

n i

x x

x x x

x x x

1 2

1 2

1 1

log log

..

log log ) ..

log(

log

(14)

IR: Term Relevance Weight

n

i i i

i i

i q p

q t p

d RSV

1 (1 )

) 1 log ( )

(

tr

k

= Log odds ratio

Probability Estimation for Term Relevance Weight

• Without relevance information

− Known

Nd = total number of documents in collection (collection size)

dk = number of documents in which term k appears (postings)

− Estimated

qk = probability that term k occurs in a non-relevant document

» qk  dk / Nd (conventional) qk = (dk+0.5) / (Nd+1)

» assume that number of non-relevant documents  collection size

pk = probability that term k occurs in a relevant document

» pk  0.5

» assume random chance

− Same as idf for large Nd

) (

) log ( ) 1 log( ) 1 log( ) 1 log( ) 1 log( ) 1 (

) 1 log (

k k

k k

k k

k k

k k

k k

k k

k O q

p O q

q p

p q

q p

p p

q q

tr p

 

 

 

 

 



 

relevant document non

a in appears k

term that odds

document relevant

a in appears k

term that log odds

k k d

d k d

k d

d k d k

k k

k k

k k

k k

k d

d N N

d N

d N

N d N d q

q p

p p

q q

tr p

 

 

 

 

  0 log log

1 5 log . 0 1

5 . log 0 ) 1 log( ) 1 log( ) 1 (

) 1 log (

5 . 0

5 . log 0

5 . 0

) 5 . 0 ( log 1

1 5 . 0

1 5 . 1 0

log ) 0

1 log( ) 1 log( ) 1 (

) 1 log (

:

k k d k

k d

k d k

k k k

k k

k k k

k d

d N d

d N

N d

N d q

q p

p p

q q tr p

al Convention

(15)

IR: Term Relevance Weight

Probability Estimation for Term Relevance Weight

• With relevance information

− Known

Nd = total number of documents in collection (collection size)

dk = number of documents in which term k appears (postings)

Nr = total number of relevant documents in collection

» Nd - Nr = total number of non-relevant documents in collection

rk = number of relevant documents in which term k appears

» dk - rk = number of non-relevant documents in which term k appears

− Estimated

qk = probability that term k occurs in a non-relevant document

» (conventional)

pk = probability that term k occurs in a relevant document

» (conventional)

r d

k k

k N N

r q d

 

rk

p   r 0.5

1 5 . 0

 

r d

k k

k N N

r q d

Relevant Documents

Non-relevant Documents

Total

Documents

with Term k rk dk – rk Dk

Documents

without Term k Nr – rk Nd – Nr –(dk – rk) Nd – dk

Total Nr Nd – Nr Nd

(16)

IR: Term Relevance Weight

Probability Estimation for Term Relevance Weight

• With relevance information

qk = probability that term k occurs in a non-relevant document

» (conventional)

pk = probability that term k occurs in a relevant document

» (conventional)

(conventional)



 



 

 

 

 

 

k r k d

k k

k r

k

r d

k k r d

r d

k k

r k r

r k

k k

k k

k k

k k k

r N d N

r d

r N

r

N N

r d N N

N N

r d

N r N

N r

q q

p p p

q q

tr p log log

1 log1 ) 1 (

) 1 log (

r d

k k

k N N

r q d

 

r k

k N

pr

1 5 . 0

 

r k

k N

p r

1 5 . 0

 

r d

k k

k N N

r q d



 



 









  









 

 

 

5 . 0 5 . 0

5 . 0 5 . 0 log

1 5 . 0 1

1 5 . 0 1

5 . 0 1

1 5 . 0

) log 1

( ) 1 log (

k r k d

k k

k r

k

r d

k k r

d

r d

k k

r k r

r k

k k

k k k

r N d N

r d

r N

r

N N

r d N

N

N N

r d

N r N

N r

p q

q tr p

References

Related documents

The addition of edges in a graph (preserving the nodes) always increases its dimension, because tiles may cover more nodes, and the number N ( r ) of tiles required to cover the

Although Somoza introduced the first Income Tax Law in 1952, and additional taxes on coffee exports in 1955, he tried to accommodate the interests of economic elites

1-1-2 Pasívny, agresívny alebo rotácia hráčov 1-1-2 Prvý útočník vytláča smerom k mantinelu a druhý útočník blokuje stred ihriska.. 1-3 Smerom k mantinelu 1-3

Due to possible bias resulting from the absolute number of respiratory specimens processed (up to 6) compared to only one stool specimen tested on Xpert MTB/RIF, Xpert Tube Fill

The Minnesota Department of Health Asthma Program partnered with Pediatric Home Service to conduct a demonstration project Reducing Environmental Triggers of Asthma (RETA) in homes

In this study, the CFD simulation method is applied in order to assess the effect of the wind direction to the resistance of naval vessels resistance in calm

This study was conducted with the aim of evaluating the modulatory effect of ascorbic acid supplementation on rectal temperature, body weight changes and

The combination of the parasite control expertise of the Asian Centre of International Parasite Control (ACIPAC) at Mahidol University's Faculty of Tropical Medicine, the research