• No results found

Separation theorems for chain event graphs

N/A
N/A
Protected

Academic year: 2020

Share "Separation theorems for chain event graphs"

Copied!
41
0
0

Loading.... (view fulltext now)

Full text

(1)

University of Warwick institutional repository: http://go.warwick.ac.uk/wrap

This paper is made available online in accordance with

publisher policies. Please scroll down to view the document

itself. Please refer to the repository record for this item and our

policy information available from the repository home page for

further information.

To see the final version of this paper please visit the publisher’s website

.

Access to the published version may require a subscription.

Author(s): PA Thwaites and JQ Smith

Article Title: Separation Theorems for Chain Event Graphs

Year of publication: 2011

Link to published article:

http://www2.warwick.ac.uk/fac/sci/statistics/crism/research/2011/paper

11-09

(2)

SEPARATION THEOREMS FOR CHAIN EVENT GRAPHS

By Peter A. Thwaites

and Jim Q. Smith

University of Warwik

Aseparationtheoremonagraphialmodelallowsananalystto

identifythe onditionalindependenestatementsitlogially entails

usingonlythetopologyofthegraph.Inthispaperweprove

separa-tiontheoremsassoiatedwithanewolouredgraphialmodelalled

a ChainEventGraph(CEG).ThelassofCEG modelsgeneralises

thelassofnitedisreteBayesianNetworkmodels.Hereweformally

denethismodellass,andonsiderthesetofpermissibleonditional

independenequerieson thisgraph. We provideneessaryand

suf-ient onditions for these onditional independenestatements to

holdonasublassofunolouredCEGsalledsimpleCEGs.Wethen

prove suÆient onditions for suh statements to hold on a muh

larger sublassalled regularCEGs.Thepaperis illustratedwitha

runningexampledemonstratingtheappliationofthesetheorems.

1. Introdution. IftheDAG (diretedayligraph) G ofa Bayesian

Network (BN) has a vertex set fX

1 ;X

2

;:::;X n

g, then there are n

on-ditional independene assertions whih an simply be read o the graph.

These are the properties that state that a vertex-variable is independent

of its non-desendants given its parents (the direted loal Markov

prop-erty[14 ℄).Answeringmostonditionalindependenequerieshowever,isnot

so straightforward. The d-separation theorem for BNs was rst proved by

VermaandPearl[31 ℄,andanalternativeversiononsideredin[15 ,14,5 ℄.The

theoremaddresseswhethertheonditional independenequeryAqB jC ?

an be answeredfrom thetopologyof theDAG ofa BN,where A;B;C are

disjointsubsetsofthesetofvertex-variablesoftheDAG.ItallowstheBNto

be interrogated and irrelevanesheked beforeanyquantitative

embellish-ments of distribution on its onditional probability tables are added. This

provides a valuabletool inthe proess of disovering requisitemodels [21℄,

aswellasa logialframework forpropagation algorithmsandlearning(see

forexample[5℄ andthe TETRADsoftware of Sheines etal).

However formanyproblems theavailablequantitative dependene

infor-mation annot all be embodied in the DAG of a BN. Separation theorems

ThisresearhissupportedbyEPSRCgrantno.EP/F036752/1

AMS 2000subjetlassiations:Primary68T30,68T37;seondary62F30,68R10

Keywordsandphrases: Bayesian Network, Chain Event Graph, onditional

indepen-dene,diretedayligraph,graphialmodel,separationtheorem

(3)

havebeenprovedformoregenerallassesofgraphialmodelinludinghain

graphs [3 ℄, alternative hain graphs [2 ℄, and anestral graphs [23 ℄. In this

paperwe prove separation theorems fora partiularly expressive graphial

model{ theChainEventGraph (CEG).

OurmotivationforthedevelopmentofthislassisthatCEGsare

proba-blythemostnaturalgraphialmodelsfordisreteproesseswheneliitation

involves questions abouthow situations might unfold. Although the

topol-ogyofthesegraphsismoreompliatedthanthatoftheBN,theyaremuh

more expressive, as they allow us to represent all strutural quantitative

informationwithin thegraph itself.Context-spei symmetrieswhih are

notintrinsito thestrutureoftheBN[4 , 16,22 ,24 ℄are fullyexpressedin

thetopology of the CEG, whih also reognises logial zeros in probability

tables, and thenumbersof levels taken byproblem-variables.This lasthas

been found to be essential to understanding the geometry of BN models

withhiddenvariables[1 , 18 ℄.

TheCEGhasalreadybeendemonstratedtobeausefulinferential

frame-workforappliationsasdiverseasforensisiene[26 ℄,biologialregulatory

models[27℄, and eduation [8 ℄. The graphsprovidea framework for

repre-sentation[27 ℄,probabilitypropagation[29 ℄,learningandmodelseletion[8 ℄,

and forausalanalysis[30℄.

These papers onentrate on the appliation of CEG-based tehniques.

Whilstthey usetheonditionalindependene properties ofthegraph,they

donotprovideafullformal developmentforthelassof CEGmodels.This

paper reties this lak. In doing so we identify the form of the types of

onditional independene statements it is natural to query, and also prove

a number of separation theorems whih allow us to answer eah query as

always true ornot,solelyon thebasisof thetopology ofthegraph.

Wenotethat,even moresothanistheasewithBNs,thereareanumber

of onditional independene properties whih an simply be read o the

CEG.These aredesribed inSetions2.4and 5.2, andgiven thetree-based

natureoftheCEGthesepropertiesarenaturallyontext-spei.Thatisto

saytheyarepropertiesoftheformAqBjforsomeevent.Ananalogous

statement foradisrete BNwould be of theform

p(A=a jB =b;C=)=p(A=a jC =)

forsomesubsetsofvariablesA;B;C,somespeivetorvalue ofCand

allvetor valuesa ofA andb ofB.Thelassof onditioningeventswean

takle with a CEG is however muh riher than that generally onsidered

(4)

w

0

w

1

w

2

w

3

w

4

w

5

w

inf

w

6

1

2

1

2

3

1

2

3

1

2

1

2

1

2

1

2

w

7

w

8

w

9

1

2

1

2

1

2

Fig1.AnSCEGC

2. The Simple Chain Event Graph.

2.1. The basi denition of an SCEG. The Chain Event Graph C(V;E)

is a direted ayli graph (DAG), whih is onneted with a unique root

vertex(withno inomingedges) andauniquesink vertex(with nooutgoing

edges).UnliketheBNmorethanone edgean existbetweentwovertiesof

aCEG.TheregularChain EventGraph (RCEG) disussedinsetion4also

hasits vertiesand edgesoloured.

We rst onsider a sublass of the lass of CEGs alled a simple Chain

EventGraph (SCEG).Neither theverties(alled positions)w2V(C),nor

theedgese(w;w 0

)2E(C)ofanSCEGareoloured.AnexampleofanSCEG

isgiven inFigure 1.

TherootandsinkvertiesofaCEGarelabelledw 0

andw

1

.Eahposition

w2V(C)nfw

1

ghasasetE(w)ofk(w)outgoingedges,whihwhenwewish

toemphasisetheironnetionwiththepositionw,maybelabelledfe

x (w):

x=1;2;:::;k(w)g.

Adireted w

0

!w

1

pathinC isalled aroute.The set ofroutes ofC

(5)

ContextforFigure 1

Desriptor Edges

male e1(w0)

female e

2 (w

0 )

displayedsymptomS beforepuberty e1(w1);e1(w2)

displayedsymptomS afterpuberty e 2

(w 1

);e 2

(w 2

)

neverdisplayedsymptomS e3(w1);e3(w2)

developedondition e1(w3);e1(w4)

didnotdevelopondition e2(w3);e2(w4)

diedbeforetheageof50 e1(w5);e1(w6);e1(w7);

e 1

(w 8

);e 1

(w 9

)

diedattheageof50orolder e2(w5);e2(w6);e2(w7);

e 2

(w 8

);e 2

(w 9

)

probability spae represented by C { see below). Note that eah route is

uniquelydetermined by asequene of edges.Thusinthe CEGinFigure 1,

one suh route is

1

fe

1 (w

0 );e

1 (w

1 );e

1 (w

3 );e

1 (w

6

)g. It is easy to hek

that C here has 20 suh routes. We write w w

0

when the position w

preedesthepositionw

0

ona route.

When our CEG is applied to a population, eah route orresponds to a

possible set of attributes that a member of the population ould take. For

example,iftheCEGinFigure 1is appliedto apopulationof people whose

parentssuereredfromaninheritedmedialondition,andtheedgesofthe

CEG arry the desriptors given in Table 1, then the route

1

desribed

above orresponds to male, displayed symptom S before puberty, developed

ondition, died beforethe age of 50.

AnSCEGis route ompatible fora population ofunits ifeahpossible

historyof aunitinthepopulation(oratomoftheevent spae)orresponds

totheunitpassingalongone oftheroutes2(C) .We useF( C) todenote

the sigma eld of events formed by these atoms. F( C) orresponds to the

power set of (C) . Sine eah atom of this event spae odes what might

happento a unitin , theSCEGenodesan additionallongitudinal

devel-opmentdepitingthepossiblewaysthefuturemightunfold,notenodedby

thesigmaeld F( C) alone(see [25℄).

We labelan event inF(C) by,and notethatbeause theCEG'satoms

have this impliit longitudinal development assoiated with them, ertain

events in F( C) are partiularlyimportant. Let (w) denote the event that

a unit takes a route that passes through the position w 2 V(C). (w;w

0 )

is then the union of all routes passing through the positions w and w

0 ,

(e(w;w

0

))istheunionofallroutespassingthroughtheedgee(w;w 0

),and

((w;w

0

)) istheunion ofall routesutilisingthesubpath(w;w 0

(6)

Certain subsets of the set of positions also have an important status in

this ontext. In this paper we will all a set R V(C) a regular subset if

theeventsf(w):w2R garedisjoint.NotethatR isregularifandonlyif

there is no route 2(C) ontaining more than one positionw 2R . Call

Raposition-ut iff(w) :w2R gformsapartitionof(C) .A position-ut

an be assoiated with a random variable that labels whih of a lass of

developmentsa unitmight take (seesetion 3).

2.2. Probabilities on an SCEG. Underlying the SCEG there is a

prob-abilityspae whih is speiedbyassigning probabilitiesto the atoms. We

dothisasfollows:Foreahpositionw2V(C)nfw 1

g andedgee(w;w 0

)

em-anatingfrom w, we all

e (w

0

jw) a primitive probability if e

(w 0

jw) 0

and P

w 0

e (w

0

jw)=1.

Definition 1. A probability mass funtion p(); 2 (C) is said to

have themonomial property fora population ifthereexistsasetof

prim-itive probabilities = f

e (w

0

jw) :e(w;w 0

)2E(w);w2V(C)nfw

1

gg on

theedges ofC suhthat forallroutes 2(C)

p()=

Y

e(w;w 0

)2

e (w

0

jw) (2:1)

wheree(w;w

0

)2meansthat theedgee(w;w

0

) lieson theroute .

Notethat(2.1)fullydenesaprobabilitymeasureoverF(C)byspeifying

eah atomiprobabilityasa funtionofits primitiveprobabilities.

The assignment of probabilities (2.1), determined by impliitly

de-mands a Markov property over the ow of the units through the graph.

Thus, in the ontext of our medial example, the probablility of an

indi-vidual with attributes (male, displayed symptom S before puberty), (male,

displayed symptom S after puberty) or(female, displayed symptom S before

puberty)developingtheonditiondependsonlyonthefatthatthesubpaths

orresponding tothese pairsofattributesterminateat thepositionw 3

,and

notonthepartiularsubpathleadingto w

3

.Theprobabilitythisindividual

developstheonditionisthen

e (w

6 jw

3

)p((e(w

3 ;w

6

))j(w

3

)). Sowe

onlyneedtoknowthepositionaunithasreahedinordertopreditaswell

asispossiblewhat thenext unfoldingof itsdevelopment willbe.

ThisMarkovhypothesislooksstrongbutinfatholdsformanyfamiliesof

statistialmodel.Forexamplealleventtreedesriptionsofaproblemsatisfy

thisproperty,allnitestatespaeontextspeiBayesianNetworksaswell

(7)

Wean gofurtherandstate thatthesets ofpossiblefuturedevelopments

(whetherornottheydeveloped theonditionandwhether ornotthey died

before the age of 50) for individuals taking any of these three subpaths

must be the same. Moreover the onditional probability of any partiular

subsequentdevelopmentmustbethesameforindividualstakinganyofthese

threesubpaths.Thisaounts forthetermposition fora (non-sink)vertex.

In this paper we disuss minimal CEGs where if positions w

and w

aresuhthat the sets of possiblefuture developmentsfrom w

and w

are

idential, and the onditional probability distributions over these sets are

idential, thenw

andw

arethe sameposition.Anyreferene to aCEG,

SCEGorRCEGshouldtherefore be taken to meanaminimalCEG, SCEG

orRCEG.

Definition 2. An SCEG C is said to be valid fora population if it

isroute ompatible andhas themonomialpropertyfor .

Note that like the BN, the SCEG an be valid without its assoiated

primitive probabilitiesbeing known.We just needto believe that some set

exists so that the assoiated Markov hypothesis holds. We are free to

assign any set of probabilities to the edges of a valid SCEG within the

simplexonditions above. Soin partiularthe probabilitymodelspae of a

valid C an be dened as the produt spae of these jV(C))j 1 dierent

simplieswherethesimplexassoiatedwithw2V(C)nfw

1

ghasEulidean

dimension k(w) 1: The probabilityof any event in F(C) is then of the

form

p()=

X

2

p()=

X

2 Y

e(w;w 0

)2

e (w

0 jw)

where 2 means that is one of the omponent atoms of the event .

Inthispaperwe willalso usethefollowingfurthernotation:

(w 0

j w)p(((w;w

0

))j (w)) denotes theprobability of utilisingthe

subpath(w;w

0

) (onditionalon passingthroughw),

(w 0

jw) p((w;w

0

) j(w))=

P

(w

0

jw) denotesthe probabilityof

arrivingat w 0

onditional onpassing throughw.

2.3. Conditioning on intrinsi events. In thispaperweareinterestedin

onditioningsets whih give rise to onditional independene queries that

an be answered purely byinspeting thetopologyof an SCEG C. An

im-portant sublassof these areevents inF(C) whihare alledintrinsi.

Definition 3. Anintrinsievent inF(C) isasetofroutesofCwhih

arealsoroutesofC

whereC

(8)

w 0

and thesink vertexw

1

of C inits vertex set, and wherew 0

is theonly

vertex in V(C

) with no parent, and w

1

is the onlyvertex in V(C

) with

nohild. Callsuh a subgraphC

a sub SCEG.

Note that the sub SCEG C

is itself an SCEG. All atoms of F( C) are

intrinsi, as are (w) and (w;w

0

) (provided this is non-empty) for all

w;w 0

2 V(C), and as is the exhaustive set (w

0

). If we inlude the empty

set in the set of intrinsi events then we note that intrinsisets are losed

underintersetionandsotehniallyforma-system (seeforexample[12℄)

we an assoiate withtheSCEGC.

Notall eventsinF(C) areneessarilyintrinsibeause thelassof

intrin-sieventsisnotlosedunder union.Forexample,fortheCEGinFigure 1,

theevent onsistingoftheunionofthetwoatoms(e

1 (w

0 );e

1 (w

1 );e

1 (w

3 );

e 1

(w 6

))and (e 1

(w 0

);e 2

(w 1

);e 1

(w 3

);e 2

(w 6

)) produes a subgraph C

whih

hasfourdistint routes, so is notintrinsi.However thelass of intrinsi

events is rih enough to enompass virtually all of the onditioning events

intheonditional independene statementswe would like to query.In

par-tiular,ifourmodelan be expressedasa BNthen any setof observations

expressiblein theform O(A )=fX

j

2A

j

g (for subsets fA j

gof the sample

spaesof fX

j

g,thevertex-variablesoftheBN)isa propersubsetof theset

ofintrinsieventsdened ontheCEG ofour model[29 ℄.

The rst important property of the lass of validSCEG models is that

they arelosedunder onditioningbyan intrinsievent:

Theorem1. If an SCEGC is valid on a population then the

proba-bilitymodel on F(Cj) of anyof its subSCEGs C

isa probability model on

F(C

) whih isalso valid.

Theobvious setof primitive probabilitiesforthesub-SCEGC

isgiven by

=

e

(w 0

jw):e(w;w 0

)2E(w);w2V(C)nfw

1 g

where

e

(w 0

jw)=

p( j(e(w;w

0 )))

p( j(w))

e

(w 0

jw)

providing this is well-dened.A proof of this theorem an be foundin the

appendix. We note that this property has now been suessfuly used to

develop fastpropagationalgorithmsforCEGs (see [29℄).

Notethat the probabilityof an atom in C onditionedon the intrinsi

event is the probability of that atom in the SCEG C

. We denote this

probability p

(). It is then trivially the ase that the probability of an

eventinC onditionedontheevent istheprobabilityofthateventinthe

SCEGC

(9)

2.4. RandomvariablesonanSCEG.. Randomvariablesmeasurablewith

respet to F( C) partition the set of atoms into events. So for example, we

an denevariablesX;Y,measurable withrespetto F( C),whihpartition

the set of atoms into events f

X

g;f

Y

g. Moreover for an event (with

p()6=0) we an writeXqY j ifp(X =x jY =y;)=p(X =x j)

forall valuesx of X and y ofY (see forexample[7 ℄).

Lemma 1. ForaCEGC,variablesX;Y measurablewithrespettoF( C),

and intrinsi onditioning event , the statement XqY j is true if and

onlyif XqY istrue in the CEG C

.

Theproof of thislemma isin theappendix. This isa partiularlyuseful

propertybeauseitallowsusto hekanyontext-speionditional

inde-pendene propertybyhekinganon-onditionalindependenepropertyon

asub-SCEG.

We now turnourattention to two typesofelementaryrandomvariables,

measurable with respetto F( C), that an be identied witheah position

w2V(C)nfw

1

g.These are thevariablesf I(w):w2V(C)nfw 1

gg dened

by

I(w)= (

1 ifpasses throughw

0 otherwise

and thevariablesfX(w):w2V(C)nfw

1

gg denedby

X(w)=

(

x if passes along edgee

x

(w)2E(w)

0 if thepositionw doesnotlieon

where x = 1;2;:::k(w) index the edges emanating from w. Notie that

sineI(w) islearly a funtion of X(w), to speifya fulljoint distribution

over f (I(w);X(w)):w2V(C)nfw 1

gg it is suÆient to speify the joint

distributionoff X(w):w2V(C)nfw 1

gg .Notethatallatomieventsan

be expressedas theintersetionof events

=

\

w2

fX(w)=x

g

andeventsinF( C) astheunion ofthese atomi events

=

[

2 8

<

: \

w2

fX(w)=x

g

9

=

;

wherew2denotesthat thepositionw lieson theroute , and x

6=0 is

(10)

w

0

w

1

w

2

w

3

w

4

w

inf

w

6

1

2

1

2

1

2

1

2

1

2

1

2

w

7

w

8

w

9

1

2

1

2

1

2

Fig2.C onditionedontheeventthatdisplayedsymptomS

Figure 2 shows the SCEG C from Figure 1 onditioned on the intrinsi

event =(X(w

1

)=1)[(X(w

1

) =2)[(X(w

2

)=1)[(X(w

2

) =2) or in

theontext of ourmedialexample,displayed symptom S.

ForanysetAV(C),letX A

denotethesetofrandomvariablesfX(w):

w2AgandI

A

thesetf I(w):w2Ag.Also,foranyw2V(C),letU(w)be

thesetof positionsinV(C) whih lieupstreamof thepositionw,D(w) the

setofpositionswhihliedownstreamofw,U

(w) thesetofpositionswhih

do not lie upstream of w, and D

(w) the set of positions whih do not lie

downstream of w.

Lemma 2. ForanySCEGCandpositionw2V(C)nfw

1

g,thevariables

I(w);X(w) exhibit the positionindependeneproperty that

X(w)qX

D

(w) jI(w)

TheresultgiveninthislemmaisanalogoustothatwhihPearl[20 ℄usesto

deneBNs,whihstatesthataBNvertex-variableisindependentofits

non-desendantsgiven its parents. Itprovidesaset ofonditional independene

statements that an simply be read from the graph, one for eah position

inV(C).The proof ofthe lemmais intheappendix.

The statement that X(w)qX

D

(w)

j (I(w) =1) an be read as: Given

a unit reahes a position w 2V(C), whatever happens immediately after w

(11)

reahed, but also of all positions that logially have not happened or ould

not now happen beause the unit has passed through w. Thus, in the sense

above, the position of a valid SCEG C is suÆient to desribe the future

development ofunitspassing throughit.

As already noted, the produt spae dened by f(I(w);X(w)) :

w2V(C)gisoverspeied.ThisissorstlybeauseI(w)=0,X(w)=0

and I(w) = 1 ) X(w

0

) = 0 for w

0

2 D

(w) \U

(w). Probability

dis-tributions exist whih satisfy the set of statements of the form X(w)q

X D

(w)

jI(w) whihdonotobeythese impliations,butsuhdistributions

annotbe represented onan SCEG.

Moresigniantly,if the SCEGis used for the purpose forwhih it was

intended, as a representation of an asymmetri proess or problem, then

therewillbemanyprobabilitiesinthejointprobabilitytablesoverthespae

denedbyf(I(w);X(w)) :w2V(C)g whih areidentially zero. The joint

massfuntionisthenextremelysparse.Thesezerosorrespondtoimpossible

eventswhihnonethelessaregiven equalsignianewithpossibleeventsin

a BN-representation of the problem. In many ases these events are not

just impossiblebutmeaningless. Forexample ifX(w

a

)= 1 orrespondsto

patient dies, X(w

b

) = 1 orresponds to patient is given treatment 2, and

w a

w

b

,thentheevent (X(w

a

)=1;X(w

b

)=1) hasno logialmeaning.

Asthesetofstatementsoftheformf(I(w);X(w)):w2V(C)gdonot

de-netheSCEG,theseadditional ounterfatual statements produedbythe

produtspaerepresentationarenotanintegralpartoftheCEG-framework.

Theprodutspaedenedbythefullsetofstatementsisneverthelessa

use-fulonstrutbeauseitallowsustoenodesetsofonditionalindependene

statementsinto a validSCEGand so allowsus to quikly prove separation

theoremsforsuh graphs.

The struture of the CEG illustrates a further aspet of the graphial

modellingproess whihis nottransparent inthe topologyof theBN. The

CEG depits all possible histories of a unit in a population, and gives a

probability distribution over these histories. However, when a single unit

traversesoneoftheroutesintheCEG,valuesareassignedtoI(w);X(w)for

allpositionsw2V(C).Thoseonditionalindependenestatementsenoded

bythe positions andedges throughwhih ourunit hasnot passedarenow

truly ounterfatual [6 ℄ in that they answer queries of the form If X had

not been the ase, what would be the hane of Y happening? So the CEG

simultaneously depits both the \reality" and the ounterfatual aspets

of the problem one we start to observe the atual behaviour of units in

the population. It also makes it a powerful framework for expressing rih

(12)

3. A separation theorem for Simple CEGs. We all a position

w2V(C)astalkiftheremovalofwfromV(C)wouldresultinagraphwith

two disonneted omponents. In (non-probabilisti) graph theory suh a

vertexis alleda utvertex (see forexample[11 ℄).

Theorem2. In an SCEG C with w

1 ;w

2

2 V(C) and w

2

6 w

1 ,

X(w 1

)qX(w

2

) if and only if either w 2

is a stalk, or there exists a stalk

downstream of w

1

andupstream of w

2 ,for w

0

w

1

w

1

; w

0

w

2

w

1 .

The proof of this theorem is in the appendix. Theorem 2 has a number

of powerful orollaries,whih we give after introduingtwo new variables.

CallJ(R ) theinidene variable ofa regular subsetR if

J(R )

X

w2R

I(w) sup

w2R I(w)

andall Y(R )the riterionvariable of a regularsubsetR if

Y(R )

X

w2R

X(w) sup

w2R X(w)

Lemma 3. For an SCEGC with position uts R

a

=fw

a

g; R

b

=fw

b g:

If X(w

a

)qX(w

b

) for any w

a

2 R

a

; w

b

2 R

b

, then Y(R

a

)qY(R b

) in

every distributionompatible with C.

Conversely,if Y(R a

)qY(R

b

) holdsforalldistributions ompatiblewithC ,

then X(w

a

)qX(w

b

) for all w a

2R

a

; w

b

2R

b .

This lemma and Corollary 5 in Setion 7 formalise and generalise the

result given in [27 ℄ Theorem2.The proofof thelemma is inthe appendix.

The onverse result is somewhat surprising, but is a onsequene of the

partiularstrutureof thesigma eldassoiatedwithan SCEG.

Corollary 1. Let C be an SCEG, an intrinsi event, R

a

= fw

a g;

R b

=fw

b

g beposition uts of C.

Ifin thesub-CEG C

, w a

andw

b

areseparated by a stalk,for anyw a

2R

a ;

w b

2R

b , w

a ;w

b 2V(C

), then Y(R

a

)qY(R b

) j .

Theproofofthisorollaryisintheappendix.Thishasmajoronsequenes

formodelswhih admitaprodutspaestruture, where othogonaluts of

the CEG have a natural meaning orresponding to measurement variables

oftheproblem.ModelsofthissortanberepresentedasBNs,withpossible

(13)

Corollary 2. If an SCEG C is of a model whih admits a produt

spae struture, A;B are measurement variables of the model, and R

a ;R

b

are the position uts of C orreponding tothese variables, then:

If X(w

a

)qX(w

b

) for any w

a

2 R

a

; w

b

2 R

b

, then AqB in every

distributionompatible with C.

Conversely, if AqB holds for all distributions ompatible with C, then

X(w a

)qX(w

b

) for all w a

2R

a

; w

b

2R

b .

Theproofof thisfollowsimmediatelyfrom Lemma 3.

Corollary 3. Let C be an SCEG of a model whih admits a produt

spaestruture,A;B bemeasurement variablesof the model, an intrinsi

event, R a

;R b

be position uts of C orrepondingto the variables A andB.

Ifin thesub-CEG C

, w a

andw

b

areseparated by a stalk,for anyw a

2R

a ;

w b

2R

b , w

a ;w

b 2V(C

), then AqB j.

The proof of this follows diretly from Corollaries 1 and 2. In the ase

where our model has a natural produt spae struture, the topology of

the SCEG allows us to replae onditional independene queries suh as

AqB jC ?bysets of ontext-speiqueriessuhasfAqB j(C =)?g,

allowing us to interrogate the graph using Corollary 3. If in addition our

modeladmits noontext-spei onditionalindependeneproperties,then

thesymmetriesintheSCEGmeanthatweneedonlyhektheanswerto a

singlequery,forinstane AqB j(C =1) ?

Example 3.1. Figure3showstheSCEGCfromFigure1onditionedon

theintrinsievent =(X(w

1

)=1)[(X(w

2

)=1) ordisplayed symptom S

beforepuberty.Thisgraphhasastalkatw 3

,andbyTheorem2wehavethat

X(w 0

)qfX(w

3 );X(w

6 );X(w

7

)g inthisgraph.

Considerthepositionuts R

0

=fw

0 g;R

1

=fw

1 ;w

2 g;R

2

=fw

3 ;w

4 ;w

5 g;

R 3

= fw

5 ;w

6 ;w

7 ;w

8 ;w

9

g of C. Then as Figure 3 depits a onditioned

CEGC

fortheintrinsievent , Corollary1 givesusthat

Y(R

0

)q(Y(R 2

);Y(R 3

)) j

Now the CEG C from Figure 1 does not have a natural produt spae

struture, butthisisno obstale to our usingCorollary3 here.AsC

does

admita produt spaestruture we an impose thisonto C byforexample

deningAY(R

0

);B Y(R

1

), DY(R

3 )and

C=

(

1 if sup(X(w

3 );X(w

4 ))=1

(14)

w

0

w

1

w

2

w

3

w

inf

w

6

1

2

1

1

1

2

1

2

w

7

1

2

Fig3.C onditionedondisplayedsymptomSbeforepuberty

Thisallows usto useCorollary3 and givesusthat

Aq(C ;D) j(B=1)

whih in our medial ontext reads as whether an individual develops the

ondition and whether they die before 50 are independent of their gender

giventhat they displayed symptom S before puberty.

4. RegularCEGs. AlthoughSCEGsformanimportantlassof

graph-ialmodel,byaddingextrastrutureto themwe anmakethemevenmore

expressive.Wedothisbyolouringpositionsandedges.Theresultantgraph

is alled a regular Chain Event Graph (RCEG). We note that oloured

graphs have reently been found to provide a valuable embellishment to

othergraphialmodels(see forexample [9℄).

An RCEG is a oloured SCEG C where the set V(C) has an assoiated

partitionU(C)=fu

1 ;u

2 ;:::u

t

gforwhiheahsetuV(C) isregular.The

setuisalledastageandissuhthatforeahw2uthedistributionfuntion

ofX(w)j(I(w) =1)isdependentonlyonuandnotonthepartiularw2u.

Definition 4. w

1 ;w

2

2 V(C)nfw

1

g are in the same stage u if there

exists a bijetion (w

1 ;w

2

) between E(w

1

) and E(w

2

) suh that if :

e x

(w 1

)7!e x

(w 2

) thenp((e

x (w

1

))j(w

1

))=p((e

x (w

2

)) j(w

2 )).

The positions w

1 ;w

2

(15)

w

0

w

1

w

2

w

3

w

4

w

5

w

8

w

6

1

2

1

2

3

1

2

3

1

2

1

2

1

2

1

2

Fig 4.TheRCEGforExample4.1

andtheedges e x

(w 1

);e x

(w 2

)have thesameolour ifw 1

;w 2

areinthesame

stage ande

x (w

1

) maps to e

x (w

2

) under thisbijetion.

Theexisteneorotherwiseofabijetionbetweentwoedgesetsisnormally

apparentfromtheontextoftheproblem.Notethatife x

(w 1

)mapstoe x

(w 2

)

underabijetion ,thentheseedgesmustorrespondtothesame outome

(forexamplepatient dies)giventhetwo histories(w 1

)and (w

2

).Weall

theolouringof theRCEG thestage-struture ofthe graph.

Example 4.1. Produing an RCEG from the SCEG in Figure 1 we

anadd theextrainformationthatthepositionsw

3

and w

4

areinthesame

stage{thatistheprobabilityofdevelopingtheondition(ornot)isthesame

whetheramemberofthepopulationhasattributesorrespondingtothe

sub-paths(e 1

(w 0

);e 1

(w 1

));(e 1

(w 0

);e 2

(w 1

));(e 2

(w 0

);e 1

(w 2

))or(e 2

(w 0

);e 2

(w 2

)).

TheRCEGC is giveninFigure 4.

This additional struture allows us to express a riher set of

ontext-spei properties and sample spae information than we an with the

(16)

subsetthelass of modelsexpressible asfaithful regular orontext-spei

BNsonnitevariables.UnliketheBN,theRCEGembodiesthestrutureof

the model state spae and any ontext-spei information inits topology

andolouring.

RCEGsareroute-ompatibleand havethemonomialpropertyfora

pop-ulation iftheirunderlyingSCEGdoes, and henearevalidfora

popula-tion if their underlyingSCEG is. The subgraph C

of an RCEG C

on-ditionedon an intrinsievent is anRCEG. Theorem1holdsforRCEGs.

NotehoweverthatC

maynothavethesamestage-stutureasCinthat

po-sitionsoredgeswhihhavethesame olourinC may have dierent olours

inC

.Lemma1 and theposition independene property holdforRCEGs.

TheonditionsstipulatedinCorollaries1and 3an nowberelaxed.It is

suÆientthatC

shouldbesimple(ratherthanC) forthese resultstohold.

ThesubgraphofaCEGwhihonsistsofapositionw,thesink-nodew

1 ,

and all edges and positions whih lie on a w ! w

1

subpath is alled the

subgraph rooted in w. When the CEG is used as a pratial tool it is

im-portanttomaximiseitsrepresentationaleÆieny.SoifinthesubgraphC

,

thesubgraphs rootedin the positionsw

and w

have idential topologies

and olouring we an ombine the positions w

and w

into a single

posi-tion[30 ℄.NotethatifwedothisthenC

althoughnowminimal,isnolonger

asubgraphof C (see Denition3).

Followingtheideasof setion3, we let

J(u)=sup

w2u

I(w) and Y(u)=sup

w2u X(w)

The RCEGis also a powerfultool for interrogation purposes, butto

max-imise its potential in this area we use the Augmented Chain Event Graph

(ACEG) desribed inthenext setion.

5. Augmented CEGs.

5.1. Denition of an Augmented CEG. Analogously to the denition

of X

A , let Y

A

= fY(u) : u 2 Ag and J

A

= fJ(u) : u 2 Ag. Sine the

CEGC is aDAG, there existsa partialorder ofthe stagesintheset U(C).

LetP(u)bethe setof allu 0

stages thatpreedeu inthispartialorder.Let

Y Q(u)

be aminimalsubsetof Y

P(u)

suh that

J(u)qY

P(u) jY

Q(u)

Definition 5. An augmented CEG (ACEG) A(C) is a funtion of the

(17)

1

1

1

1

2

2

2

1

2

ω

0

ω

1

ω

2

ω

3

ω

4

ω

5

2

2

2

2

2

1

1

1

1

Fig 5.TheRCEGforExample5.1

The edge set E(A(C)) onsists of direted edges onneting the parents

of any vertex in V(A(C)) to that vertex. Eah vertex Y(u) has a single

parent J(u),and theparentsofJ(u)arepreiselythose Y(u 0

) vertiesthat

aremembersof Y

Q(u) .

Example 5.1. A researh group has taken a sample from the

popu-lation desribed in Setion 2.1 whih ontains only people who displayed

symptomS.Analysisofthissamplesuggests thatwhetheran individual

de-velops theonditionandwhetherthey diebefore50areindependent oftheir

gendergivenwhentheydisplayed symptom S.TheRCEGforthisisgivenin

Figure5.AnACEGforthisgraphisgiveninFigure6,whereforillustrative

onvenienethe edges emanatingfrom Y(u) nodeshave beenlabelledwith

valuesofA (=Y(R 0

)forR 0

=fw

0

g),B (=Y(R 1

) forR 1

=fw

1 ;w

2

g),and

C (=Y(R

2

) forR 2

=fw

3 ;w

4 g).

5.2. ACEGsareBayesianNetworks. Weextendthenotationofsetion3

to let X D

(u)

be the vetor of random variables of the form X(w)

(18)

A

=1

B=

1

C

=1

C=1

C=2

A

=2

B=2

B=

1

B=2 C

=2

B=1

B=1

B=2

B=2

J(u

A

)

Y(u

A

)

J(u

B1

) Y(u

B1

)

J(u

C

)

Y(u

C

)

J(u

B2

)

Y(u

B2

)

J(u

D1

)

Y(u

D1

)

Fig 6.AnACEGfortheRCEGinFigure 5

Y D

(u)

;J D

(u)

be the vetors of random variables of the form Y(u

0 );J(u

0 )

assoiated withstagesinC whihdo notliedownstreamof thestage u.

Lemma 4. For CEG C, and stage u2U(C)

Y(u)qY

D

(u) j J(u)

Theresultgiveninthislemmais analogoustothatgiveninLemma2for

positions, and so also to the result quoted there for BNs. It provides a set

of onditional independene statements that an simply be read from the

graph,one for eah stage in U(C).A partial readingof the lemma givesus

thattheimmediatefuturefora unitat astage u isindependent ofhowthe

unitreahedthatstage. Theproofof thelemma isintheappendix.

Byonstrution,ifastageu 0

isnotdownstreamofuinC,thenJ(u 0

);Y(u 0

)

arenotdownstream ofJ(u);Y(u) inA(C).Sine forevery stageu 0

,J(u 0

) is

afuntionof Y(u 0

),it follows that

Y(u)q(J D

(u)

;Y D

(u)

(19)

andhene that

Y(u)q(J P(u)

;Y P(u)

) jJ(u)

inany partialorder of U(C).Clearlywe also have that

J(u)q(J

P(u) ;Y

P(u) ) jY

Q(u)

and henethatallvertiesinan ACEGA(C) areindependent oftheir

pre-deessor vertiesgiven theirparental vertiesinanypartialorder ofU(C).

In[19 ℄itisshownthataprobabilitydistributionP isMarkovrelativetoa

DAGGifandonlyifeahvariableinGisindependentofallitspredeessors

onditionalonitsparents,insomeorderingofthevariablesthatagreeswith

thearrowsofG.ClearlyourACEGisaDAG,andfromtheabovereasoning

eahvariableinA(C)isindependentofallitspredeessorsonditionalonits

parentsforallP denedontheCEGC.SoourACEGobeyswhatPearl[20 ℄

alls the ordered Markov ondition, and hene also obeys the loal Markov

ondition[13 ℄. Results in[10℄ allow ustherefore to dedue that theACEG

isitself aBN.

Thisdedutionmeansthatanyresult availableforusewithBNsanalso

be usedwith ACEGs. In partiular we an use d-separation to allow usto

interrogate ACEGs foronditional independeneproperties.Theadvantage

thatthe ACEG has hereoverthe BNis that intheformer ontext-spei

onditionalindependenepropertiesaredepitedexpliitly inthetopology

ofthe graph,and soit an be interrogated diretly forsuhproperties.We

begin however bylookingat modelswhih anbe representedby BNs.

6. Models depitable by Bayesian Networks and others. If a

modelhasanaturalprodutspaestrutureandadmits noontext-spei

onditionalindependene properties thenit an be depited bya BN

with-out any further annotation. In this setion we show that if our CEG is of

suha modelthenanyseparation-based onditionalindependeneproperty

readablefrom theBNan also beread fromits assoiatedACEG.

IfourCEGisofamodelwhihhasanaturalprodutspaestruturethen

foreah variable X

i

inthe BN there exists a olletion of vertiesfJ(u

i )g

in the ACEG whose members orrespond to the possible ongurations of

Q(X i

)(theparentvariablesofX i

),andaolletionofvertiesfY(u i

)gwhose

members orrespondto X

i

given those ongurations.

Theorem3. Ifamodelwithanaturalprodutspaestrutureadmitting

no ontext-spei onditional independene properties, has a BN

(20)

If fY(u i

)g is d-separated from fY(u

j

)g by fY(u

k

)g in A(C), then

X i

qX

j

jX

k

in every distributionompatible with C and G.

Conversely,if X i

qX

j

jX

k

holdsforalldistributions ompatible withC and

G, then fY(u i

)g is d-separated from fY(u j

)g by fY(u k

)g in A(C).

The proof of this theorem is given in the appendix. This result an be

explained as follows: The CEG C is of a model whih has a natural

prod-utspae struture admitting no ontext-spei onditional independene

properties,and an be depitedbyaBN G.Thereforethere exist (inA(C))

edgesfromvertiesinfY(u i

)gtovertiesinfJ(u j

)gifandonlyifthereexists

anedgefromX

i

toX

j

inG,andsothereisa1:1orrespondenebetweenthe

parental onditional independene statements in G and the parental

ondi-tionalindependenestatementsinA(C).By[31 ℄Corollary1,theonditional

independenestatementsina DAG anbederivedfromd-separation ifand

only if they an be derived from the list of parental onditional

indepen-denestatementsusingthesemi-graphoid axioms[28 ℄. AsbothG andA(C)

areDAGs, wean inferthatthereis a1:1orrespondenebetweenthe

on-ditional independene statements derived from d-separation in G and the

onditionalindependene statements derivedfrom d-separation inA(C).

Essentially, Theorem 3 allows us to use the olletions fY(u

i

)g in the

ACEGassurrogates forX

i

intheBN,whenansweringonditional

indepen-denequeries.

Example 6.1. For the RCEG in Figure 5, let A;B;C be as in

Exam-ple 5.1, R 3

=fw

5 ;w

6 ;w

7 ;w

8

g and D= Y(R

3

). Then usingTheorem 3 on

theACEGforthisRCEG (given inFigure 6),wesee that

Y(u A

) isd-separated fromfY(u C

);Y(u D

)g byfY(u B

)g)Aq(C ;D) jB

fY(u A

);Y(u B

)g ared-separatedfrom fY(u

C

)g)(A;B)qC

Y(u A

) isnotd-separated fromY(u C

) byfY(u D

)g)Aq/C jD

Y(u B

) isnotd-separated fromY(u C

) byfY(u D

)g)Bq/C jD

for the RCEG in Figure 5. In our medial ontext whether an individual

develops the ondition and whether they die before the age of 50 are

inde-pendent of theirgender givenwhenthey displayed symptom S;whetherthey

developtheonditionisindependent oftheirgenderandwhentheydisplayed

symptom S; butwhether they develop the ondition is not independent of

eithertheir gender or whenthey displayed symptom S givenwhether or not

(21)

Corollary 4. If X i

;X j

;X k

aredistintsubsets of the vertex-variables

of G, and fY(u

i

)g;fY(u j

)g;fY(u k

)g are the orresponding olletions in

A(C), then the results of Theorem 3 still hold.

This follows from the proof of Theorem 3 (whih does not depend on

X i

;X j

;X k

being singlevariables).

We have alled the olletions fY(u

i

)g in A(C) surrogates for X

i in G,

but when we replae the statement \X

i

qX

j

j X

k

in G" by \fY(u i

)g is

d-separated from fY(u j

)g by fY(u

k

)g in A(C)", only fY(u k

)g is atually

a surrogate. This is beause the latter statement implies that fY(u

i )gq

fY(u j

)g jfY(u k

)g (sine A(C) is a BN),and if thisstatement is true then

X i

qX

j

j fY(u k

)g sine X

i

supY(u

i )

is a funtion of fY(u

i )g. By

onstrution only one Y(u

i

) within the set fY(u

i

)g an take a non-zero

value,and the value thisvariable takesisequal to thevaluetaken byX i

.

Notethat the ACEGhas J(u) and Y(u)nodes foreah stage u 2U(C),

and eah stage u in a CEG is assoiated with a partiular olletion of

parents{thesetofu 0

2U(C)orrespondingto Y

Q(u)

intheACEG.Indeed,

ifourCEGhassuÆient symmetryto beembeddedintoafamilyofmodels

withaprodutspae struture,thenthepositionsonstitutingeahstageu

are members of a spei orthogonal ut R (setion 2.1), and u enodes a

partiularongurationof theparentalvariablesof Y(R ).

For this reason an ACEG has many more nodes than a standard BN,

and so admits a far larger olletion of onditional independene

state-ments. This olletion inludesmany ontext-spei properties whih an

onlybe represented in BNs by modifyingtheir struture [4 , 16 , 22 , 24 ℄. It

also inludesmanyounterfatual statements of thetype desribed in

Se-tion2.5onSCEGs.So forexample,ifweonsidertheCEGinFigure 2,but

ombine the positions w

6 ;w

7 ;w

8

and w

9

into a sink-node w

1

, we get the

ACEG depitedinFigure 7,whereforonvenienewehave let A=Y(R

0 );

B=Y(R

1

); C=Y(R

2

) with R

0 ;R

1 ;R

2

dened asinExample5.1above.

Using the ACEG in Figure 7 we an dedue that C qB j (A = 1) {

whether an individual develops the ondition is independent of when they

displayed symptom S given that their gender is male (and symptom S was

displayed), and that CqA j(B = 1) { whether they develop the ondition

is independent of their gender given that they displayed symptom S before

puberty. But we also have statements suh as Y(u

C2

)qY(u B1

) j Y(u A

),

whihhas noobvious meaningintheontext of theproblem.

7. More on ontext-spei onditional independene. One of

thedistintadvantages oftheCEGwhen representing andanalysing

(22)

J(u

A

)

Y(u

A

)

J(u

B1

)

Y(u

B1

)

J(u

B2

)

Y(u

B2

)

J(u

C1

)

Y(u

C2

)

J(u

C2

)

Y(u

C2

)

A

=

1

A=2

A

=

2

A=

1,2

B

=

1

B=2

Fig 7.AnACEGfortheadaptedCEGfrom Figure2

spei event, perhaps a spei value of a variable, and use the CEG's

topologyto disoveronditionalindependenieswhihwouldnotexist ifwe

were to ondition on a related event, suh as a dierent value of our

vari-able.Ifwe areinterestedinthese ontext-spei onditional independene

propertiestheninthisdisreteontextweneedtoonentrateourattention

on statements where the onditioning element is an event. In most ases

thisevent willbeexpressible asa valueof a singleY(u) (or J(u)) variable,

and so queries an be heked diretly on an ACEG without the need of

the surrogate argument of the last setion. What happens in ases where

our onditioningevent annot be expressed asa value of a Y(u) (or J(u))

variable? An example of this is the event = (B = 1) for the model

de-pitedin Figure 4. We ould draw an ACEG for the fullCEG C here,but

the ACEG for C

is muh more useful. The sub-CEG C

for this event is

given in Figure 3. Note that the edge-probabilities on this graph are now

A = 1 j B = 1, A = 2 j B = 1 for the edges leaving w

0

; 1 for the edges

leaving w

1 & w

2

(the positions w

1 & w

2

ould be ombined into a single

positionassuggested inSetion 4);C =1; C =2 fortheedges leavingw

3 ;

(23)

B=1

C

=1

B=

1

C

=2

B=

1

J(u

A

) Y(u

A

)

J(u

B=1

)

Y(u

B=1

)

J(u

C

)

Y(u

C

)

J(u

D1

)

Y(u

D1

)

Fig 8.AnACEGforthesub-CEGfrom Figure3

and D = 2 j B = 1;C =2 forthe edges leaving w

5 & w

6

(see Setion 2.3

and[29 ℄).

An ACEG for C

is given in Figure 8. Notie that unlike in Figure 6,

J(u A

) is not a root-vertex as A is now dependent on B.Also, beause we

haveonditionedontheevent(B=1),thesetofY(u B

)vertieshasbeome

a single vertex Y(u

B=1

) with no anestors exept J(u

B=1

). We now use

d-separationto read that

fY(u C

);Y(u D

)g are d-separatedfrom Y(u A

)byY(u B=1

)

) (Y(u

C

);fY(u D

)g)qY(u A

) jY(u B=1

)

) (C ;D)qA jY(u B=1

)

sinetheACEGisaBN,andA;C ;DarefuntionsofY(u A

);Y(u C

);fY(u D

)g.

AlsoY(u B=1

)=1 , B =1,sothisinturnimpliesthat(C ;D)qAj(B=1)

withouttheuseof thesurrogate argument.

Thismethodwillalwayswork forases like thissineonditioningonan

event suh asB =1 always produes a single vertex of the form Y(u

B=1 ),

(24)

AswithSCEGs(seeSetion3),standardonditionalindependenequeries

onanRCEGangenerallybeansweredbylookingatontext-spei

ondi-tionalindependenequeriesonsubgraphsoftheCEGwhihareoftensimple.

Suh onditioning an only remove olouring from the graph and not add

it. Beause of the way CEGs are onstruted, the onditioning event in a

ontext-spei onditional independene query an very often be written

as(w) forsome w2V(C).But ifwe onditionon an event =(w), we

an readonditional independenepropertiesothe graphC

even ifC

is

notsimple.

Corollary 5. Let C be an RCEG with position uts R

a

= fw

a g;

R b

=fw

b g.Ifw

a ;w

b

areseparatedbyastalk,foranyw a

2R

a ;w

b

2R

b ,then

Y(R a

)qY(R b

).

The proof of this orollary is in theappendix. This result an obviously

be extended to give suÆient onditions for Y(R

a

) q Y(R

b

) j just as

Corollary1 extendsthe resultforSCEGs.

So if = (w) for some w 2 V(C), then w is a stalk in C

, and in

thisgraph,Y(R a

)qY(R b

) foranypositionuts R

a

upstreamofw,and R

b

downstreamofw.HeneY(R

a

)qY(R b

)jprovidingwehavedenedthese

variablesonsistentlyon theCEGsC andC

(see proof ofCorollary1).

Example 7.1. Consider the RCEG from Figure 4 onditioned on the

event =(X(w

1

) =1)[(X(w

1

)= 2)[(X(w

2

)= 1)[(X(w

2

) =2). The

RCEG forthisis given inFigure9.

Ignoring themedial ontext here, we note that in this graph the event

=(w

3

) an beharaterisedas(min(A;B)=1). Ifwe onditiononthis

event we getC

asinFigure10,whihasalreadynotedmusthavea stalk.

Forillustrative onveniene edges inFigure 10 have beengiven probability

labels.

Using Corollary 5 we get (C ;D)q(A;B) j (min(A;B) = 1), and

on-ditioning on = (w

4

) we get a CEG from whih we an trivially read

that (C ;D)q(A;B) j (min(A;B) = 2). Combining these we get (C ;D)q

(A;B) j min(A;B).

8. Conlusion. Insummary,theresultsof setion3giveusonditions

forthetruthofAqB statementsonSCEGswhiharediretlyanalogousto

those given in (for examplePearl's [20 ℄ or Lauritzen's[14℄ versionsof) the

d-separationtheoremforBNs.Corollary1alsogivesussuÆientonditions

(25)

1

1

1

1

2

2

2

1

2

ω

0

ω

1

ω

2

ω

3

ω

4

ω

5

2

2

1

Fig 9.TheRCEGforExample7.1

onditions for AqB j statements to hold on RCEGs. Queries suh as

AqB jC ?wheretheonditioningelement is alsoa variable(orolletion

ofvariables)an generallybeanswered byonsideringsets of queriesofthe

formAqB j ?Methodsfordoing thisaresuggested at variouspointsin

thetext, but inthespeialase where theRCEGdesribesa modelwhih

anbedepitedbyaBN, Theorem3givesonditionsforAqB jCdiretly

analogous to those given inthe d-separation theorem forBNs. The ACEG

fromSetion5isveryusefulforalltypesofonditionalindependenequery,

butis partiularlyuseful for queriesof the form AqB j ? insituations

whereusingothertehniquesisnotstraightforward.ThefatthattheACEG

isitselfaBNopensupanexitingrange ofpossibilitiesstillto be explored.

Analysts working with BNs have found that attempts to feed bak to

a user all the impliit onditional independenies assoiated with a given

graph an be rather overwhelming unless the BN is very simple. Clearly

this would also be the ase with CEG-based models. However, within any

given ontext thetypes of independeniesthat it is natural forthe user to

be able to understand, examine and verify are smallin number. Sine the

(26)

A

=1

|M

in

(A

,B

)=

1

B=1|A=1

C=

1

D

=1

|M

in

(A

,B

)=

1,C

=1

C=2

A

=2

|M

in

(A

,B

)=

1

B=2|A=1

B

=

1

w

it

h

pr

ob

ab

il

it

y

1

Fig10.TheRCEGfrom Figure9onditionedon=(w3)

appliationof theCEGwedefer thisdisussionto a future paper.

APPENDIX1:PROOFS ANDONE ADDITIONAL LEMMA

LIMITED MEMORY LEMMA. For any CEG C, w

1 ;w

2 ;w

3

2 V(C) with

w 1

w

2

w

3 ,

I(w 3

)qI(w 1

) j(I(w 2

)=1)

PROOF. Itis suÆientto prove that

p(I(w 3

)=1 jI(w 1

)=1;I(w 2

)=1)=p(I(w

3

)=1jI(w 2

)=1)

Soonsidera singleroute passingthroughw

1 ;w

2 ;w

3

.Thisroute onsists

ofasetofedgesandbyonstrutiontheprobabilityp()oftherouteisequal

to the produt of the probabilitieslabellingeah of these edges. Moreover,

the probability of any subpath of is equal to the produt of the

proba-bilitieslabellingeah ofits edges. So p() an be written astheprodut of

the probabilities of four subpaths:

0 (w

0 ;w

1

),

1 (w

1 ;w

2

),

2 (w

2 ;w

3 ), and

3

(w 3

;w 1

). Thus

p()=

0 (w

1 jw

0

)

1 (w

2 jw

1

)

2 (w

3 jw

2

)

3 (w

1 jw

3 )

Considernowthe event (I(w

1

)=1;I(w 2

)=1;I(w 3

)=1) or(w

1 ;w

2 ;w

3 ),

whihis theunionofall w 0

!w

1

routespassingthroughw

1 ;w

2 ;w

(27)

sine(w 1 ;w 2 ;w 3

)is anintrinsievent we an write

p((w 1 ;w 2 ;w 3 ))= X 0 2M 0 0 (w 1 jw 0 ) X 1 2M 1 1 (w 2 jw 1 ) X 2 2M 2 2 (w 3 jw 2 ) X 3 2M 3 3 (w 1 jw 3 ) whereM i

(i=0;1;2) is theset of all subpathsfrom w i

to w i+1

,and M

3 is

theset of all subpaths from w 3 to w 1 .But P 0 2M 0 0 (w 1 jw 0

) is simply

theprobabilityofreahingw 1

fromw

0

,or(w

1 jw 0 ), so p((w 1 ;w 2 ;w 3

))=(w

1 jw 0 ) (w 2 j w 1 ) (w 3 jw 2 ) (w 1 jw 3 ) =(w 1 jw 0 ) (w 2 j w 1 ) (w 3 jw 2

)1

sineallpaths passing throughw

3

terminateinw

1

.Therefore

p(I(w 3

)=1 jI(w 1

)=1;I(w 2

)=1)= p((w 1 ;w 2 ;w 3 )) p((w 1 ;w 2 )) = (w 1 jw 0 ) (w 2 jw 1 )(w 3 jw 2

)1

(w 1 jw 0 ) (w 2 jw 1

) 1

=(w 3 jw 2 ) =p(I(w 3

)=1jI(w 2

)=1)

If we replae I(w

1

) = 1 by ((w

0 ;w

2

)) for any subpath (w

0 ;w 2 ), and I(w 3

)=1by(e(w

2 ;w

0 2

)) forsome edgee(w

2 ;w

0 2

) thenwe obtain

COROLLARYA. For any CEG C with w2V(C)

p((e(w;w

0

))j((w

0

;w));(w))=p((e(w;w

0

))j(w))

Similarly,ifw 0 1

w

2

andwereplaeI(w

1

)=1by(e(w

1 ;w 0 1 ))(X(w 1 )=x

1

forsome x 1

21;:::k(w 1

)), andI(w 3

)=1 by(e(w

3 ;w 0 3 )) (X(w 3 )=x

3 for

some x

3

21;:::k(w 3

))thenwe obtain

COROLLARYB. For any CEG C,w

1 ;w

2 ;w

3

2V(C), with w 0 1 w 2 w 3 p((e(w 3 ;w 0 3

))j(e(w

1 ;w 0 1 ));(w 2

))=p((e(w

3 ;w

0 3

))j(w

2 ))

p(X(w 3

)=x 3

jX(w

1

)=x

1 ;I(w

2

)=1)=p(X(w

3 )=x

3 jI(w

2 )=1)

PROOF OF THEOREM 1. Sine the atoms of F( Cj) are routes in C

,

F(Cj ) = F(C

), and so the onditioned model is route ompatible

(28)

inC

is given byp

()=p(j) whihequals

p((w 0 );(e(w 0 ;w 1 ));(e(w 1 ;w 2

));:::(e(w m

;w 1

))j)

=p ((w 0 );(e(w 0 ;w 1 ));(e(w 1 ;w 2

));:::(e(w m ;w 1 ))) =p ((w 0 ))p ((e(w 0 ;w 1

) j(w

0 )) p ((e(w 1 ;w 2

) j(w

0 );(e(w 0 ;w 1 ))

::: p

((e(w m

;w 1

))j(w

0

);::: (e(w

m 1

;w m

)))

whihbyCorollary Aof theLimitedMemory Lemmaequals

1p

((e(w

0 ;w

1

) j(w

0 ))p ((e(w 1 ;w 2

) j(w

1 )) p ((e(w 2 ;w 3

) j(w

2

))::: p

((e(w m

;w 1

))j(w

m )) = Y e(w;w 0 )2 p ((e(w;w 0

)) j(w))

Soletting

e

(w 0

jw)=p

((e(w;w

0

)) j(w)) we have

p(j)= Y e(w;w 0 )2 e (w 0 j w)

andtheprobabilitymassfuntionp(j)hasthemonomialproperty.Hene

C

is avalidSCEG.

Notethat asstatedinsetion 2.3

p

((e(w;w

0

))j(w))=p((e(w;w

0

))j;(w))

=

p(j(w);(e(w;w

0 )))

p(j(w))

p((e(w;w

0

))j(w))

=

p(j(w);(e(w;w

0 )))

p(j(w))

e

(w 0

jw)

PROOF OF LEMMA 1. X;Y partition the set of atoms of C, and sine

(C), X;Y also partition the set of atoms of C

. Consider

arbi-trary events

X

and

Y

from the sets f

X

g and f

Y

g (partitions of the

set of atoms of C), and the event

A

=

X

\

Y

. Then p(

X

j ) =

p ( X ); p( Y

j )=p

(

Y

) and p(

X ;

Y

j)=p(

A

j)=p

( A )= p ( X ; Y

).The statement

p( X

; Y

j)=p(

X

j)p(

Y

(29)

isthentrue ifand onlyifthestatement

p

( X

; Y

)=p

( X

) p

( Y

)

istrue;and thisholdsforall X

2f

X

g

Y

2f

Y

g.

PROOF OF LEMMA 2. By denition I(w) = 0 ) X(w) = 0, so if

I(w) = 0, X(w) is known and so in partiular is independent of all other

variables. If I(w)=1then X(w 0

)=0 forall w 0

2D

(w)\U

(w).

So onsiderI(w) =1 and w

0

2U(w). We now use the monomial property

to showthat X(w)qX

U(w)

j(I(w)=1).

The primitive probability

e (w

+

j w) is a fator of the probabilityp()

foranumberofroutes.Consideroneoftheseroutesanddenotethesubpath

of this route between w

0

and w by (w

0

;w). Then by Corollary A of the

LimitedMemoryLemma, wean write

e

(w +

jw)=p((e(w;w

+

))j(w))

=p((e(w;w

+

))j((w

0

;w));(w))

and this is learly true for all subpaths between w

0

and w. But the set of

these subpaths is in1:1 orrespondene with theset of vetorsof values of

X U(w)

whih are onsistent with the topology of the SCEG and with the

event I(w)=1.The event (w) an bewritten asI(w)=1, and theevent

(e(w;w

0

))asX(w)=x forsome x>0.Hene

p(X(w)=x jI(w)=1)=p(X(w)=x jX

U(w)

;I(w)=1)

and X(w)qX

U(w)

j(I(w)=1).

Combiningthiswiththetwo previousresultswetherefore have

X(w)qX

D

(w)

jI(w)

PROOFOFTHEOREM2. (1) SUFFICIENTCONDITIONSFORINDEPENDENCE

Consideran SCEG C,and two positions w

1 ;w

2

2V(C), where w

2

6w

1 ,

and byonstrution I(w

1

)60; I(w 2

)60.

Supposethereexistsastalkdownstreamofw

1

andupstreamofw

2 .Label

thispositionw.Thenallpathspassingthroughw 1

passthroughw,allpaths

passingthroughw

2

pass throughw,and neessarilyw

1

ww

2 .Also

p(X(w 2

)=x

2

jX(w

1 )=x

1

)=p(X(w

2 )=x

2

jX(w

1 )=x

1

;I(w)=1)

=p(X(w

2 )=x

2

(30)

sinew 1

ww

2

,and usingCorollaryBof theLimitedMemory Lemma

=p(X(w

2 )=x

2

jX(w

1 )=x

0 1

;I(w)=1)

=p(X(w

2 )=x

2

jX(w

1 )=x

0 1

)

forall valuesx 1 ;x 0 1 ofX(w 1

)and all values x 2 of X(w 2 ) ) X(w 1

)qX(w

2 )

If w 2

is itself a stalk,then we replae I(w) =1 by I(w 2

) =1 in the above

argument withthesame result.

So a suÆient onditionfor X(w

1

)qX(w

2

) is that eitherw 2

is itself a

stalk,orthere existsa stalkdownstream ofw 1

andupstreamof w

2 .

(2) NECESSARYCONDITIONSFORINDEPENDENCE

LetX(w

1

)qX(w

2

) (andsineI(w) isafuntionofX(w),X(w

1

)qI(w 2

)

andI(w 1

)qI(w 2

)).LetthesetofroutesofCbepartitionedintofoursubsets.

Call a route Type A if it passes through w

2

, but notthrough w

1

,Type B

ifitpassesthroughneither w 1

norw

2

,Type C ifit passesthroughbothw 1

andw

2

,and TypeDifitpassesthroughw 1

,butnotthroughw 2

.Ourproof

proeeds asfollows:

(a) We show that we must have w

1

w

2

(ie. the set of Type C routes is

non-empty.

(b) We show that every route intersets with every other route at some

point downstream of w

0

and upstream of w

1

. If two w

0

! w

1

routes

share no verties exept w

0

and w

1

, we all them internally disjoint (see

forexample[11 ℄). Soweansaythatthereannotbetwointernallydisjoint

diretedroutes inC

()Weshowthatthereannotbetwointernallydisjointroutesinthe

undi-reted version of the CEG (the CEG withits edge arrows removed), and

thattherefore there mustbe astalk betweenw

0

and w

1 .

(d)Weshowthat eitherw 1

isa stalkorw 2

isa stalk,orthereexistsastalk

downstream of w

1

and upstreamofw

2 .

(e) Finally we show that if w

1

is a stalk then there must also either be a

stalkat w 2

orastalk downstreamof w

1

and upstreamof w

2 .

(a) Suppose that w

1

6 w

2

(and reall that w

2 6 w 1 ). Then p(I(w 2

) = 1 j I(w

1

) = 1) 0. I(w

1

) q I(w 2

) ) p(I(w

2

) = 1) 0

)I(w

2

)0. Thisisimpossiblebyonstrution.Thereforew

1

w

2 .

(b) We rst show that eah Type C route intersets with every other

routeat w 1

orat w

2

(31)

µ1

λ2

λ2

p

p

p

0

0

1

p = arbitrary probability in (0,1)

ω1

ω2

Fig11.Illustrationfor Type CandType B routes

Let

1

be a Type C route, and

1 (w

1 ;w

2

) the subpath oinident with

1

between w

1

and w

2

. If the set of Type B routes is non-empty then let

2

be a Type B route whihdoesnotintersetwith

1

(ie.

2

and

1

have no

positions inommon).

ConsideradistributionP whih(1)assigns aprobabilityof1to every edge

of the subpath

1 (w

1 ;w

2

), and (2) an arbitrary probabilitygreater than 0

and less than 1 to eah edge of the route

2

(Figure 11). If our SCEG

is minimal and

2

does not interset with

1

then this is always possible.

UnderP,assignment(1) givesusthat

p(I(w 2

)=1 jI(w 1

)=1)=1

andI(w 1

)qI(w 2

) impliesthat underthisP

p(I(w 2

)=1 jI(w 1

)=0)=1 ) p(I(w

2

)=0 jI(w 1

)=0)=0

Butassignment(2) givesusthatp(I(w

2

)=0jI(w 1

)=0)>0 ¸

The assumption I(w

1

)qI(w

2

) is inompatible withthe assignments of (1)

and (2). But these assignments arealways possibleif

2

doesnot interset

with

1

.Hene

2

mustinterset with

(32)

λ4

λ3

λ3

ω1

ω2

λ4

µ5

µ5

λ3

&

µ5

Fig12.Illustrationfor non-TypeC routes:w4nw3m

HeneeahTypeC routeintersetswithevery TypeB routeat somepoint

downstream of w

1

and upstreamof w

2

. Also eah Type C route intersets

witheveryTypeAroute(atw

2

),witheveryTypeDroute(atw 1

)andwith

every other Type C route (atbothw

1

and w

2 ).

WenowonsiderroutesthatarenotofTypeC.Ifthesetofnon-TypeC

routesisnon-emptylet

3 ;

4

bemembersofthissetwhihdonotinterset

exeptat w

0

and w

1

.Let (w

1 ;w

2

) be asubpathbetweenw

1

and w

2 .

From above both

3

and

4

must interset with. Let

3

interset with

only at the positions w 31

;:::w 3m

, where w

31

w

3m

; and let

4

in-terset with only at the positions w

41 ;:::w

4n

, where w

41

w

4n .

Without lossof generalitylet w 1

w

31

w

41

w

2

,sothat

3

ould be a

routeofTypeB orType D,and

4

ould be aroute ofTypeAorTypeB.

Suppose that w

4n

w

3m

(Figure 12). Consider the subpath

5 (w

1 ;w

2 )

whihoinideswithfromw

1 tow

31 (ifw

31 6=w

1

),oinideswith

3 from

w 31

to w

3m

, and oinides with from w

3m

to w

2

. This subpath

5 does

not interset with the route

4

. This is impossible sine every route in C

intersetswithevery (w

1 ;w

2

) subpath.

Suppose therefore that w

3m

w

4n

References

Related documents

payable from cash funds as defined by Arkansas Code 19-4-801 of the North 33. Arkansas Community/Technical College, for personal services and operating

Poudre River Explorers welcome all Catholic boys, who are between the age of 8 to 12, to join our Timber Wolf den for hot dog roast and flag.. retirement ceremony on

Building a culture of change is not an easy task, and business leaders need to take bold steps to make sure it percolates to every process and function within an

board shall issue certification as a Certified Professional Health Educator to 9. any

license to practice funeral directing in this state, on application to the 31. board and on providing the board with such evidence as to his

It is issued by the local Probate Registry in England and Wales (Probate Office in Northern Ireland or local Sheriff Court in Scotland).. Get several copies as you will need them

undivided interests in pools of loans the Commission shall retain the services 5. of a financial advisor who shall render prior to the closing of the

Individuals in the Architecture role provide overall direction, guidance, definition and facilitation for the development of current and future architecture required to meet