• No results found

ProbLog: a probabilistic programming language for data analysis

N/A
N/A
Protected

Academic year: 2021

Share "ProbLog: a probabilistic programming language for data analysis"

Copied!
202
0
0

Loading.... (view fulltext now)

Full text

(1)

ProbLog: a probabilistic

programming language

for data analysis

CATCH meeting 04.10.13

Angelika Kimmig

(2)

codes for gene protein pathway cellular component homologgroup phenotype biological process locus is homologous to participates in participates in is located in is related to refers to belongs to is found in is found in participates in refers to

Networks of Uncertain

Information

(3)

Biomine

network

(4)

Biomine

network

(5)
(6)

Biomine Network

presenilin 2

Gene

EntrezGene:81751

Notch receptor processing

BiologicalProcess

(7)

Biomine Network

-participates_in

0.220

(8)

Biomine Network

-participates_in

0.220

BiologicalProcess

different types of nodes & links

automatically extracted from

text, databases, ...

(9)

Biomine

network

(10)

Biomine

network

gene

(11)

Biomine

network

What is the most

relevant subnetwork

(with at most k edges)

connecting these?

gene

(12)

Biomine

network

What is the most

relevant subnetwork

(with at most k edges)

connecting these?

gene

(13)

Biomine

network

gene

phenotype

Should there be a link?

If so, of what type?

(14)

Node Classification

Can we predict

the type of a node

given information

on its neighbors?

e.g., the type of a

webpage given its links

and the words on the

(15)

Homer

Marge

Bart

Lisa

Lenny

Moe

Maggie

?

?

?

?

?

?

?

?

?

?

?

?

?

?

?

?

+$5

-$3

Which

strategy

gives the

maximum

expected utility

?

Viral Marketing

Which advertising

strategy maximizes

expected profit?

(16)

Dynamic networks

border border border border Alliance 2 Alliance 3 Alliance 4 Alliance 6 P 2 1081 895 1090 1090 1093 1084 915 1081 1040 1077 955 1073 804 1054 830 942 1087 621 P 3 744 748 559 P 5 P 6 950 644 985 932 837 871 777 P 7 946 878 864 913 P 9

Travian: A massively multiplayer

real-time strategy game

Can we build a model

of this world ?

Can we use it for playing

better ?

(17)

Can we find the mechanism

Molecular interaction

networks

(18)

Diagnosing machine

failures

Can we build a model of the robot’s working

and use it to find causes of failures?

(19)

How to achieve a specific configuration

of objects on the shelf?

Where’s the orange mug?

Where’s something to serve soup in?

(20)

Analyzing

Video Data

Track people or objects

over time? Even if

temporarily hidden?

Recognize activities?

Infer object properties?

Fig. 4. Tracking results from experiment 2. In frame 5, two groups are

present. In frame 15, the tracker has correctly split group 1 into 1-0 and 1-1 (see Fig. 3). Between frames 15 and 29, group 1-0 has split up into groups 1-0-0 and 1-0-1, and split up again. New groups, labeled 2 and 3, enter the field of view in frames 21 and 42 respectively.

Six frames of the current best hypothesis from experiment 2 are shown in Fig. 4, the corresponding hypothesis tree is shown in Fig. 3. The sequence exemplifies movement and formation of several groups.

A. Clustering Error

Given the ground truth information on a per-beam basis we can compute the clustering error of the tracker. This is done by counting how often a track’s set of points P contains too many or wrong points (undersegmentation) and how often P is missing points (oversegmentation) compared to the ground truth. Two examples for oversegmentation errors can be seen

0 0.1 0.2 0.3 0.4 0.5 0.6 0.5 1 1.5 2 2.5 3 3.5

Error rates per track and frame

Clustering distance threshold dP (m) w/o tracking Overs. + Unders. Oversegm. Undersegm. 0 0.2 0.4 0.6 0.8 1 0 4 8 12 16 20

Avg. cycle time (sec)

Number of people in ground truth Group tracker

People tracker

Fig. 5. Left: clustering error of the group tracker compared to a memory-less single linkage clustering (without tracking). The smallest error is achieved for a cluster distance of 1.3 m which is very close to the border of personal and social space according to the proxemics theory, marked at 1.2 m by the vertical line. Right: average cycle time for the group tracker versus a tracker for individual people plotted against the ground truth number of people.

relations can be determined in such cases.

For experiment 1, the resulting percentages of incorrectly clustered tracks for the cases undersegmentation, overseg-mentation and the sum of both are shown in Fig. 5 (left), plotted against the clustering distance dP . The figure also

shows the error of a single-linkage clustering of the range data as described in section II. This implements a memory-less group clustering approach against which we compare the clustering performance of our group tracker.

The minimum clustering error of 3.1% is achieved by the tracker at dP = 1.3 m. The minimum error for the

memory-less clustering is 7.0%, more than twice as high. In the more complex experiment 2, the minimum clustering error of the tracker rises to 9.6% while the error of the memory-less clustering reaches 20.2%. The result shows that the group tracking problem is a recursive clustering problem that requires integration of information over time. This occurs when two groups approach each other and pass from opposite directions. The memory-less approach would merge them immediately while the tracking approach, accounting for the velocity information, correctly keeps the groups apart.

In the light of the proxemics theory the result of a minimal clustering error at 1.3 m is noteworthy. The theory predicts that when people interact with friends, they maintain a range of distances between 45 to 120 cm called personal space. When engaged in interaction with strangers, this distance is larger. As our data contains students who tend to know each other well, the result appears consistent with Hall’s findings.

B. Tracking Efficiency

When tracking groups of people rather than individuals, the assignment problems in the data association stage are of course smaller. On the other hand, the introduction of an additional tree level on which different models hypoth-esize over different group formation processes comes with additional computational costs. We therefore compare our system with a person-only tracker which is implemented by

(21)

Common theme

Dealing with

uncertainty

Reasoning with

relational data

Learning

(22)

ProbLog

probabilistic Prolog

Dealing with

uncertainty

Reasoning with

relational data

Learning

(23)

Prolog / logic

programming

ProbLog

probabilistic Prolog

Dealing with

uncertainty

Learning

stress(ann). influences(ann,bob). influences(bob,carl).

(24)

Prolog / logic

programming

ProbLog

probabilistic Prolog

Dealing with

uncertainty

Learning

stress(ann). influences(ann,bob). influences(bob,carl). smokes(X) :- stress(X).

one world

(25)

Prolog / logic

programming

atoms as random

variables

ProbLog

probabilistic Prolog

Learning

stress(ann). influences(ann,bob). influences(bob,carl). 0.8::stress(ann). 0.6::influences(ann,bob). 0.2::influences(bob,carl).

one world

(26)

Prolog / logic

programming

atoms as random

variables

ProbLog

probabilistic Prolog

Learning

stress(ann). influences(ann,bob). influences(bob,carl). smokes(X) :- stress(X). 0.8::stress(ann). 0.6::influences(ann,bob). 0.2::influences(bob,carl).

one world

(27)

Prolog / logic

programming

atoms as random

variables

ProbLog

probabilistic Prolog

Learning

stress(ann). influences(ann,bob). influences(bob,carl). 0.8::stress(ann). 0.6::influences(ann,bob). 0.2::influences(bob,carl).

one world

several possible worlds

Distribution Semantics [Sato, ICLP 95]:

probabilistic choices + logic program

(28)

adapted

relational

learning

techniques

Prolog / logic

programming

atoms as random

variables

ProbLog

probabilistic Prolog

stress(ann). influences(ann,bob). influences(bob,carl). smokes(X) :- stress(X). 0.8::stress(ann). 0.6::influences(ann,bob). 0.2::influences(bob,carl).

one world

several possible worlds

Distribution Semantics [Sato, ICLP 95]:

probabilistic choices + logic program

(29)

ProbLog Basics

-

ProbLog by example

-

Inference

-

Parameter Learning

Selected Topics

-

Upgrading relational learning

-

Dynamics under uncertainty

-

Continuous-valued random variables

(30)

ProbLog Basics

-

ProbLog by example

-

Inference

-

Parameter Learning

Selected Topics

-

Upgrading relational learning

-

Dynamics under uncertainty

-

Continuous-valued random variables

-

Decision making

(31)

ProbLog by example:

A bit of gambling

h

toss (biased) coin & draw ball from each urn

(32)

0.4 :: heads.

ProbLog by example:

A bit of gambling

h

toss (biased) coin & draw ball from each urn

win if (heads and a red ball) or (two balls of same color)

probabilistic fact

: heads is true with

(33)

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

ProbLog by example:

A bit of gambling

h

toss (biased) coin & draw ball from each urn

win if (heads and a red ball) or (two balls of same color)

annotated disjunction

: first ball is red

(34)

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green);

0.5 :: col(2,blue) <- true.

annotated disjunction

: second ball is red with

probability 0.2, green with 0.3, and blue with 0.5

ProbLog by example:

A bit of gambling

h

toss (biased) coin & draw ball from each urn

(35)

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green);

0.5 :: col(2,blue) <- true.

ProbLog by example:

A bit of gambling

h

toss (biased) coin & draw ball from each urn

(36)

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green);

0.5 :: col(2,blue) <- true.

win :- heads, col(_,red).

logical rule

encoding

ProbLog by example:

A bit of gambling

h

toss (biased) coin & draw ball from each urn

(37)

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green);

0.5 :: col(2,blue) <- true.

ProbLog by example:

A bit of gambling

h

toss (biased) coin & draw ball from each urn

win if (heads and a red ball) or (two balls of same color)

(38)

Questions

Probability of

win

?

Probability of

win

given

col(2,green)

?

Most probable world where

win

is true?

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

(39)

Questions

Probability of

win

?

Probability of

win

given

col(2,green)

?

Most probable world where

win

is true?

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

marginal probability

query

(40)

Questions

Probability of

win

?

Probability of

win

given

col(2,green)

?

Most probable world where

win

is true?

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

marginal probability

conditional probability

evidence

(41)

Questions

Probability of

win

?

Probability of

win

given

col(2,green)

?

Most probable world where

win

is true?

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

marginal probability

(42)

Possible Worlds

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

(43)

Possible Worlds

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

(44)

Possible Worlds

H

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

(45)

Possible Worlds

H

R

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(46)

Possible Worlds

H

R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(47)

Possible Worlds

H

R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(48)

Possible Worlds

H

W

R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

(49)

Possible Worlds

R

H

R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

(50)

Possible Worlds

R R

H

W

R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

×

0.2

(51)

Possible Worlds

R R

H

R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

×

0.2

(52)

Possible Worlds

W

R R

H

W

R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

×

0.2

(1

0.4)

(53)

Possible Worlds

R R

H

R R

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

×

0.2

(1

0.4)

×

0.3

(54)

Possible Worlds

W

R R

H

W

R R G

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

×

0.2

(1

0.4)

×

0.3

×

0.3

(55)

Possible Worlds

R R

H

R R G

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

×

0.2

(1

0.4)

×

0.3

×

0.3

(56)

Possible Worlds

W

R R

H

W

R R G

×

0.3

0.4 :: heads.

0.3 :: col(1,red); 0.7 :: col(1,blue) <- true.

0.2 :: col(2,red); 0.3 :: col(2,green); 0.5 :: col(2,blue) <- true. win :- heads, col(_,red).

win :- col(1,C), col(2,C).

×

0.3

0.4

(1

0.4)

×

0.3

×

0.2

(1

0.4)

×

0.3

×

0.3

(57)

All Possible Worlds

W

R R

H

W

R G

H

W

R R R G

H

B G

H

W

R B B R G B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

(58)

Most likely world

where

win

is true?

W

R R

H

R B

H

W

R G

H

W

R R R G R B

H

B B

H

B G

H

W

R B B R G B B B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

MPE Inference

(59)

Most likely world

where

win

is true?

W

R R

H

W

R G

H

W

R R R G

H

B G

H

W

R B B R G B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

MPE Inference

(60)

MPE Inference

Most likely world where

col(2,blue)

is false?

W

R R

H

R B

H

W

R G

H

W

R R R G R B

H

B B

H

B G

H

W

R B B R G B B B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

(61)

MPE Inference

Most likely world where

col(2,blue)

is false?

W

R R

H

W

R G

H

W

R R R G

H

B G

H

W

R B B R G B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

(62)

P(win)=

W

R R

H

R B

H

W

R G

H

W

R R R G R B

H

B B

H

B G

H

W

R B B R G B B B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

?

Probability

Marginal

(63)

P(win)=

W

R R

H

W

R G

H

W

R R R G

H

B G

H

W

R B B R G B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

Probability

Marginal

(64)

P(win)=

W

R R

H

R B

H

W

R G

H

W

R R R G R B

H

B B

H

B G

H

W

R B B R G B B B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

=0.562

Probability

Marginal

(65)

P(win|col(2,green))=

W

R R

H

W

R G

H

W

R R R G

H

B G

H

W

R B B R G B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

?

Conditional

Probability

(66)

=P(win

col(2,green))/P(col(2,green))

P(win|col(2,green))=

W

R R

H

R B

H

W

R G

H

W

R R R G R B

H

B B

H

B G

H

W

R B B R G B B B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

/

Conditional

Probability

(67)

=P(win

col(2,green))/P(col(2,green))

P(win|col(2,green))=

W

R R

H

W

R G

H

W

R R R G

H

B G

H

W

R B B R G B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

/

Conditional

Probability

(68)

P(win|col(2,green))=

=0.036/0.3=0.12

W

R R

H

R B

H

W

R G

H

W

R R R G R B

H

B B

H

B G

H

W

R B B R G B B B

0.024

0.036

0.060

0.036

0.054

0.090

0.056

0.084

0.084

0.126

0.140

0.210

/

Conditional

Probability

(69)

ProbLog by example:

(70)

ProbLog by example:

Rain or sun?

day 0

0.5

0.5

(71)

0.5::weather(sun,0) ; 0.5::weather(rain,0) <- true.

ProbLog by example:

Rain or sun?

day 0

0.5

0.5

(72)

0.5::weather(sun,0) ; 0.5::weather(rain,0) <- true.

ProbLog by example:

Rain or sun?

day 0

0.5

0.5

day 1

0.6

0.4

(73)

0.5::weather(sun,0) ; 0.5::weather(rain,0) <- true. 0.6::weather(sun,T) ; 0.4::weather(rain,T) <- T>0, Tprev is T-1, weather(sun,Tprev).

ProbLog by example:

Rain or sun?

day 0

0.5

0.5

day 1

0.6

0.4

day 2

0.6

0.4

day 3

0.6

0.4

day 4

0.6

0.4

day 5

0.6

0.4

day 6

0.6

0.4

(74)

0.5::weather(sun,0) ; 0.5::weather(rain,0) <- true. 0.6::weather(sun,T) ; 0.4::weather(rain,T) <- T>0, Tprev is T-1, weather(sun,Tprev).

ProbLog by example:

Rain or sun?

day 0

0.5

0.5

day 1

0.6

0.4

day 2

0.6

0.4

day 3

0.6

0.4

day 4

0.6

0.4

day 5

0.6

0.4

day 6

0.6

0.4

0.8

0.2

(75)

0.5::weather(sun,0) ; 0.5::weather(rain,0) <- true. 0.6::weather(sun,T) ; 0.4::weather(rain,T) <- T>0, Tprev is T-1, weather(sun,Tprev). 0.2::weather(sun,T) ; 0.8::weather(rain,T) <- T>0, Tprev is T-1, weather(rain,Tprev).

ProbLog by example:

Rain or sun?

day 0

0.5

0.5

day 1

0.6

0.4

day 2

0.6

0.4

day 3

0.6

0.4

day 4

0.6

0.4

day 5

0.6

0.4

day 6

0.6

0.4

0.8

0.2

0.8

0.2

0.8

0.2

0.8

0.2

0.8

0.2

0.8

0.2

(76)

0.5::weather(sun,0) ; 0.5::weather(rain,0) <- true. 0.6::weather(sun,T) ; 0.4::weather(rain,T) <- T>0, Tprev is T-1, weather(sun,Tprev). 0.2::weather(sun,T) ; 0.8::weather(rain,T) <- T>0, Tprev is T-1, weather(rain,Tprev).

ProbLog by example:

Rain or sun?

day 0

0.5

0.5

day 1

0.6

0.4

day 2

0.6

0.4

day 3

0.6

0.4

day 4

0.6

0.4

day 5

0.6

0.4

day 6

0.6

0.4

0.8

0.2

0.8

0.2

0.8

0.2

0.8

0.2

0.8

0.2

0.8

0.2

(77)

ProbLog by example:

Friends & smokers

1

2

3

4

person(1). person(2). person(3). person(4). friend(1,2). friend(2,1). friend(2,4). friend(3,4). friend(4,2).

(78)

0.3::stress(X):- person(X). 0.2::influences(X,Y):-

person(X), person(Y).

ProbLog by example:

Friends & smokers

1

2

3

4

person(1). person(2). person(3). person(4). friend(1,2). friend(2,1). friend(2,4). friend(3,4). friend(4,2).

typed probabilistic facts

(79)

0.3::stress(X):- person(X). 0.2::influences(X,Y):-

person(X), person(Y).

ProbLog by example:

Friends & smokers

1

2

3

4

person(1). person(2). person(3). person(4). friend(1,2). friend(2,1). friend(2,4). friend(3,4). friend(4,2).

typed probabilistic facts

= a probabilistic fact for each grounding

0.3::stress(1). 0.3::stress(2). 0.3::stress(3). 0.3::stress(4). 0.2::influences(1,1). 0.2::influences(1,2). 0.2::influences(1,3). 0.2::influences(1,4). 0.2::influences(2,1). ...

(80)

0.3::stress(X):- person(X). 0.2::influences(X,Y):-

person(X), person(Y). smokes(X) :- stress(X).

smokes(X) :-

friend(X,Y), influences(Y,X), smokes(Y).

ProbLog by example:

Friends & smokers

1

2

3

4

person(1). person(2). person(3). person(4). friend(1,2). friend(2,1). friend(2,4). friend(3,4). friend(4,2).

(81)

0.3::stress(X):- person(X). 0.2::influences(X,Y):-

person(X), person(Y). smokes(X) :- stress(X).

smokes(X) :-

friend(X,Y), influences(Y,X), smokes(Y). 0.4::asthma(X) <- smokes(X).

ProbLog by example:

Friends & smokers

1

2

3

4

person(1). person(2). person(3). person(4). friend(1,2). friend(2,1). friend(2,4). friend(3,4). friend(4,2).

(82)

0.3::stress(X):- person(X). 0.2::influences(X,Y):-

person(X), person(Y). smokes(X) :- stress(X).

smokes(X) :-

friend(X,Y), influences(Y,X), smokes(Y). 0.4::asthma(X) <- smokes(X).

ProbLog by example:

Friends & smokers

1

2

3

4

person(1). person(2). person(3). person(4). friend(1,2). friend(2,1). friend(2,4). friend(3,4). friend(4,2).

(83)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

(84)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

flexible probability:

(85)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

flexible probability:

computed from the weight of the item

1/6::pack(skis). 1/4::pack(boots). 1/3::pack(helmet). 1/2::pack(gloves).

(86)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

excess(Limit) :- excess([skis,boots,helmet,gloves],Limit).

(87)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

excess(Limit) :- excess([skis,boots,helmet,gloves],Limit). excess([],Limit) :- Limit<0.

excess([I|R],Limit) :-

pack(I), weight(I,W), L is Limit-W, excess(R,L). excess([I|R],Limit) :-

(88)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

excess(Limit) :- excess([skis,boots,helmet,gloves],Limit). excess([],Limit) :- Limit<0.

excess([I|R],Limit) :-

pack(I), weight(I,W), L is Limit-W, excess(R,L). excess([I|R],Limit) :-

\+pack(I), excess(R,Limit).

pack first item, decrease

(89)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

excess(Limit) :- excess([skis,boots,helmet,gloves],Limit). excess([],Limit) :- Limit<0.

excess([I|R],Limit) :-

pack(I), weight(I,W), L is Limit-W, excess(R,L). excess([I|R],Limit) :-

(90)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

excess(Limit) :- excess([skis,boots,helmet,gloves],Limit). excess([],Limit) :- Limit<0.

excess([I|R],Limit) :-

pack(I), weight(I,W), L is Limit-W, excess(R,L). excess([I|R],Limit) :-

(91)

ProbLog by example:

Limited Luggage

weight(skis,6). weight(boots,4). weight(helmet,3). weight(gloves,2).

P::pack(Item) :- weight(Item,Weight), P is 1.0/Weight.

excess(Limit) :- excess([skis,boots,helmet,gloves],Limit). excess([],Limit) :- Limit<0.

excess([I|R],Limit) :-

pack(I), weight(I,W), L is Limit-W, excess(R,L). excess([I|R],Limit) :-

(92)

probabilistic

choices

+ their

consequences

probability distribution over

possible worlds

how to efficiently answer

questions

?

most probable world (MPE inference)

probability of query (computing marginals)

probability of query given evidence

(93)

Answering Questions

program

queries

evidence

marginal

probabilities

conditional

probabilities

MPE state

Given:

Find:

?

(94)

Answering Questions

program

queries

evidence

marginal

probabilities

conditional

probabilities

MPE state

Given:

Find:

?

possible worlds

inf

easible

(95)

Answering Questions

program

queries

evidence

marginal

probabilities

conditional

probabilities

MPE state

Given:

Find:

?

possible worlds

inf

easible

logical reasoning

probabilistic inference

data structure

(96)

Initial Approach

(ProbLog1)

Find all proofs of query

calculate marginal by

dynamic programming

Binary Decision

Diagram (BDD)

(97)

Initial Approach

(ProbLog1)

Find all proofs of query

calculate marginal by

Binary Decision

Diagram (BDD)

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2),heads(3).

win

(98)

heads(1)

heads(2) & heads(3)

Initial Approach

(ProbLog1)

Find all proofs of query

calculate marginal by

dynamic programming

Binary Decision

Diagram (BDD)

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2),heads(3).

win

(99)

heads(1)

heads(2) & heads(3)

Initial Approach

(ProbLog1)

Find all proofs of query

calculate marginal by

Binary Decision

Diagram (BDD)

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2),heads(3).

win

h(1) h(2) h(3)

0

1

(100)

heads(1)

heads(2) & heads(3)

Initial Approach

(ProbLog1)

Find all proofs of query

calculate marginal by

dynamic programming

Binary Decision

Diagram (BDD)

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2),heads(3).

win

true

false

win?

h(1) h(2) h(3)

0

1

(101)

heads(1)

heads(2) & heads(3)

Initial Approach

(ProbLog1)

Find all proofs of query

calculate marginal by

Binary Decision

Diagram (BDD)

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2),heads(3).

win

win?

0.4

h(1) h(2) h(3)

0

1

0.6

0.7

0.3

0.5

0.5

P(

win

) =

probability of

(102)

Current Approach

(ProbLog2)

Find relevant ground

program for queries &

evidence

use weighted model

counting / satisfiability

(103)

Current Approach

(ProbLog2)

Find relevant ground

program for queries &

evidence

use weighted model

Weighted CNF

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2), heads(3).

win

(104)

Current Approach

(ProbLog2)

Find relevant ground

program for queries &

evidence

use weighted model

counting / satisfiability

Weighted CNF

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2), heads(3). win :- heads(1).

win :- heads(2), heads(3).

(105)

Current Approach

(ProbLog2)

Find relevant ground

program for queries &

evidence

use weighted model

Weighted CNF

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2), heads(3). win :- heads(1).

win :- heads(2), heads(3).

win h(1)

(h(2)

h(3))

(106)

Current Approach

(ProbLog2)

Find relevant ground

program for queries &

evidence

use weighted model

counting / satisfiability

Weighted CNF

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2), heads(3). win :- heads(1).

win :- heads(2), heads(3).

win h(1)

(h(2)

h(3))

(¬win

h(1)

h(2))

(¬win

h(1)

h(3))

(win

¬h(1))

(win

¬h(2)

¬h(3))

win

(107)

Current Approach

(ProbLog2)

Find relevant ground

program for queries &

evidence

use weighted model

Weighted CNF

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2), heads(3). win :- heads(1).

win :- heads(2), heads(3).

win h(1)

(h(2)

h(3))

(¬win

h(1)

h(2))

(¬win

h(1)

h(3))

(win

¬h(1))

(win

¬h(2)

¬h(3))

win

(108)

Current Approach

(ProbLog2)

Find relevant ground

program for queries &

evidence

use weighted model

counting / satisfiability

Weighted CNF

0.4::heads(1). 0.7::heads(2). 0.5::heads(3). win :- heads(1). win :- heads(2), heads(3). win :- heads(1).

win :- heads(2), heads(3).

h(1)

0.4

h(2)

0.7

h(3)

0.5

win h(1)

(h(2)

h(3))

(¬win

h(1)

h(2))

(¬win

h(1)

h(3))

(win

¬h(1))

(win

¬h(2)

¬h(3))

win

use

standard

tool

(109)

Diagnostics for Prognostics

Visual

representation of a

ProbLog diagnostics

program

The GUI knows

how to interpret

certain predicates

(e.g. failure, partof)

(110)

ProbLog for activity recognition from

video

 

(111)

Parameter Learning

class(Page,C)

:-

has_word(Page,W), word_class(W,C).

class(Page,C)

:-

links_to(OtherPage,Page),

for each

CLASS1, CLASS2

and each

WORD

?? :: link_class(Source,Target,

CLASS1

,

CLASS2

).

?? :: word_class(

WORD

,

CLASS

).

(112)

Sampling

(113)

Sampling

(114)
(115)

Parameter Estimation

(116)

Learning from partial

interpretations

Not all facts observed

Soft-EM

use

expected count

instead of

count

(117)

ProbLog Basics

-

ProbLog by example

-

Inference

-

Parameter Learning

(118)

ProbLog Basics

-

ProbLog by example

-

Inference

-

Parameter Learning

Selected Topics

-

Upgrading relational learning

-

Dynamics under uncertainty

-

Continuous-valued random variables

-

Decision making

(119)

ProbLog Basics

-

ProbLog by example

-

Inference

-

Parameter Learning

Selected Topics

-

Upgrading relational learning

-

Dynamics under uncertainty

-

Continuous-valued random variables

(120)

Prolog

infl(a,b).

...

ProbLog

0.4::infl(a,b).

...

Reasoning

query true?

yes/no

with probability P

query true?

Machine

Learning

example covered?

yes/no

example covered?

with probability P

(121)

Prolog

infl(a,b).

...

ProbLog

0.4::infl(a,b).

...

Reasoning

query true?

yes/no

with probability P

query true?

Machine

example covered?

example covered?

(122)

Biomine

network

What is the most

relevant subnetwork

(with at most k edges)

connecting these?

gene

(123)

Theory compression

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

(124)

Theory compression

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

(125)

Theory compression

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

(126)

Theory compression

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

simulate edge deletion by setting probability to 0

(127)

Theory compression

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.5

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

(128)

Theory compression

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.5

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

simulate edge deletion by setting probability to 0

(129)

Theory compression

a

b

c

d

f

e

g

0.4

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.5

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

(130)

Theory compression

a

b

c

d

f

e

g

0.4

0.9

0.6

0.5

0.9

0.2

0.6

0.7

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

simulate edge deletion by setting probability to 0

(131)

Theory compression

a

b

c

d

f

e

g

0.4

0.9

0.6

0.5

0.9

0.6

0.7

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

(132)

Theory compression

a

b

c

d

f

e

g

0.9

0.6

0.5

0.9

0.6

0.7

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

simulate edge deletion by setting probability to 0

(133)

Theory compression

a

b

c

d

f

e

g

0.9

0.6

0.9

0.6

0.7

best subnetwork of at most

5 edges where

path(a,b)

but not

path(e,g)

?

path(X,Y) :- edge(X,Y).

path(X,Y) :- edge(X,Z), path(Z,Y). 0.8::edge(e,c).

0.3::edge(e,a). ...

build inference data structures for all examples

(134)

Biomine

network

gene

phenotype

Should there be a link?

If so, of what type?

(135)

Learning Patterns

Probabilistic Explanation Based Learning

:

example + background theory

rule

Probabilistic Query Mining

:

pos./neg. examples

set of independent rules

Probabilistic Rule Learning

:

probabilistic examples

(probabilistic) concept

definition

(136)

Probabilistic explanation

based learning

1

2

3

4

0.4::stress(1). 0.9::stress(2). 0.5::stress(3). 0.2::stress(4). 0.8::influences(1,2). 0.7::influences(2,4). 0.5::influences(3,4). smokes(X) :- stress(X). smokes(X) :- influences(Y,X), smokes(Y).

(137)

Probabilistic explanation

based learning

1

2

3

4

0.4::stress(1). 0.9::stress(2). 0.5::stress(3). 0.2::stress(4). 0.8::influences(1,2). 0.7::influences(2,4). 0.5::influences(3,4). smokes(X) :- stress(X). smokes(X) :- influences(Y,X), smokes(Y). smokes(4)

example

(138)

Probabilistic explanation

based learning

1

2

3

4

0.4::stress(1). 0.9::stress(2). 0.5::stress(3). 0.2::stress(4). 0.8::influences(1,2). 0.7::influences(2,4). 0.5::influences(3,4). smokes(X) :- stress(X). smokes(X) :- influences(Y,X), smokes(Y). smokes(4) stress(4)

influences(2,4) & stress(2)

influences(2,4) & influences(1,2) & stress(1) influences(3,4) & stress(3)

example

proofs

(139)

Probabilistic explanation

based learning

1

2

3

4

0.4::stress(1). 0.9::stress(2). 0.5::stress(3). 0.2::stress(4). 0.8::influences(1,2). 0.7::influences(2,4). 0.5::influences(3,4). smokes(X) :- stress(X). smokes(X) :- influences(Y,X), smokes(Y). smokes(4) stress(4)

influences(2,4) & stress(2)

influences(2,4) & influences(1,2) & stress(1) influences(3,4) & stress(3)

0.200 0.630 0.224 0.250

example

proofs

(140)

Probabilistic explanation

based learning

1

2

3

4

0.4::stress(1). 0.9::stress(2). 0.5::stress(3). 0.2::stress(4). 0.8::influences(1,2). 0.7::influences(2,4). 0.5::influences(3,4). smokes(X) :- stress(X). smokes(X) :- influences(Y,X), smokes(Y). smokes(4) stress(4)

influences(2,4) & stress(2)

influences(2,4) & influences(1,2) & stress(1) influences(3,4) & stress(3)

0.200 0.630 0.224 0.250

smokes(4) if influences(2,4) & stress(2)

example

proofs

(141)

Probabilistic explanation

based learning

1

2

3

4

0.4::stress(1). 0.9::stress(2). 0.5::stress(3). 0.2::stress(4). 0.8::influences(1,2). 0.7::influences(2,4). 0.5::influences(3,4). smokes(X) :- stress(X). smokes(X) :- influences(Y,X), smokes(Y). smokes(4) stress(4)

influences(2,4) & stress(2)

influences(2,4) & influences(1,2) & stress(1) influences(3,4) & stress(3)

0.200 0.630 0.224 0.250

example

proofs

(142)

Probabilistic query mining

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

pos(a).

(143)

Probabilistic query mining

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

pos(a).

pos(d).

not pos(b).

not pos(g).

subgraph queries that

maximize ?

X

pos

P X

neg

(144)

Probabilistic query mining

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

1.08

pos(a).

pos(d).

pos(X) :- edge(X,Y),edge(Y,Z).

not pos(b).

not pos(g).

subgraph queries that

maximize ?

X

pos

P X

neg

(145)

Probabilistic query mining

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

1.08

a-d-g 0.54

d-g-b 0.54

pos(a).

pos(d).

pos(X) :- edge(X,Y),edge(Y,Z).

not pos(b).

not pos(g).

subgraph queries that

maximize ?

X

pos

P X

neg

(146)

Probabilistic query mining

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

1.08

a-d-g 0.54

d-g-b 0.54

b-

pos(a).

pos(d).

pos(X) :- edge(X,Y),edge(Y,Z).

not pos(b).

not pos(g).

subgraph queries that

maximize ?

X

pos

P X

neg

(147)

Probabilistic query mining

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

1.08

a-d-g 0.54

d-g-b 0.54

pos(a).

pos(d).

pos(X) :- edge(X,Y),edge(Y,Z).

not pos(b).

not pos(g).

subgraph queries that

maximize ?

X

pos

P X

neg

(148)

Probabilistic query mining

a

b

c

d

f

e

g

0.4

0.8

0.9

0.6

0.5

0.9

0.2

0.6

0.7

0.3

0.8

0.5

1.08

0.93

0.756

a-d-g 0.54

d-g-b 0.54

b-

pos(a).

pos(d).

pos(X) :- edge(X,Y),edge(Y,Z). pos(X) :- edge(X,Y),edge(X,Z). pos(X) :-

not pos(b).

not pos(g).

subgraph queries that

maximize ?

X

pos

P X

neg

(149)

ProbFOIL

Upgrading FOIL to learn a set of rules from

References

Related documents

6 CORR, ICC, MSE yes 5 Abbreviations: Landmark Localization (Land. Local.) – AAM: Active Apparance Model, ASM: Active Shape Model, CLM: Constrained Local Model, PFFL: Particle

In par with existing languages for PP, our proposed extension consists of two parts: a generative Datalog program that specifies a prior probability space over (finite or infinite)

We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space.. YAGO2 is built automatically from

Youth Ages 12-17 with Serious Emotional Disturbance (SED): Stanislaus County has access equivalent to the state mean, but utilization is the highest in the state,

lts Universities Evaluated Universities Accr edited Remarks Š Ch onb uk Nation al Un iv ersity Š Ch onn am Nati on al Un iv ersity Š C hu ngnam Nat ional Uni versi ty Š

We …nd that transfer pricing in intermediate production factors does not a¤ect real activity of a multi- national …rm if the …rm’s concealment e¤ort as well as the probability

2002 Nordiska Ministerrådet Gallery , Copenhagen, Denmark (paintings &amp; luminous performance; Dancers from local dance academy, musician). Corridor Gallery , Brooklyn, New

[The new welfare state – abdication of party politics?] In Ann-Helén Bay, Axel West Pedersen and Jo Saglie (eds.), “Når velferd blir politikk.. Partier, organisasjoner og