University of Warwick institutional repository: http://go.warwick.ac.uk/wrap
This paper is made available online in accordance with
publisher policies. Please scroll down to view the document
itself. Please refer to the repository record for this item and our
policy information available from the repository home page for
further information.
To see the final version of this paper please visit the publisher’s website
.
Access to the published version may require a subscription.
Author(s): PA Thwaites and JQ Smith
Article Title: Separation Theorems for Chain Event Graphs
Year of publication: 2011
Link to published article:
http://www2.warwick.ac.uk/fac/sci/statistics/crism/research/2011/paper
11-09
SEPARATION THEOREMS FOR CHAIN EVENT GRAPHS
By Peter A. Thwaites
and Jim Q. Smith
University of Warwik
Aseparationtheoremonagraphialmodelallowsananalystto
identifythe onditionalindependenestatementsitlogially entails
usingonlythetopologyofthegraph.Inthispaperweprove
separa-tiontheoremsassoiatedwithanewolouredgraphialmodelalled
a ChainEventGraph(CEG).ThelassofCEG modelsgeneralises
thelassofnitedisreteBayesianNetworkmodels.Hereweformally
denethismodellass,andonsiderthesetofpermissibleonditional
independenequerieson thisgraph. We provideneessaryand
suf-ient onditions for these onditional independenestatements to
holdonasublassofunolouredCEGsalledsimpleCEGs.Wethen
prove suÆient onditions for suh statements to hold on a muh
larger sublassalled regularCEGs.Thepaperis illustratedwitha
runningexampledemonstratingtheappliationofthesetheorems.
1. Introdution. IftheDAG (diretedayligraph) G ofa Bayesian
Network (BN) has a vertex set fX
1 ;X
2
;:::;X n
g, then there are n
on-ditional independene assertions whih an simply be read o the graph.
These are the properties that state that a vertex-variable is independent
of its non-desendants given its parents (the direted loal Markov
prop-erty[14 ℄).Answeringmostonditionalindependenequerieshowever,isnot
so straightforward. The d-separation theorem for BNs was rst proved by
VermaandPearl[31 ℄,andanalternativeversiononsideredin[15 ,14,5 ℄.The
theoremaddresseswhethertheonditional independenequeryAqB jC ?
an be answeredfrom thetopologyof theDAG ofa BN,where A;B;C are
disjointsubsetsofthesetofvertex-variablesoftheDAG.ItallowstheBNto
be interrogated and irrelevanesheked beforeanyquantitative
embellish-ments of distribution on its onditional probability tables are added. This
provides a valuabletool inthe proess of disovering requisitemodels [21℄,
aswellasa logialframework forpropagation algorithmsandlearning(see
forexample[5℄ andthe TETRADsoftware of Sheines etal).
However formanyproblems theavailablequantitative dependene
infor-mation annot all be embodied in the DAG of a BN. Separation theorems
ThisresearhissupportedbyEPSRCgrantno.EP/F036752/1
AMS 2000subjetlassiations:Primary68T30,68T37;seondary62F30,68R10
Keywordsandphrases: Bayesian Network, Chain Event Graph, onditional
indepen-dene,diretedayligraph,graphialmodel,separationtheorem
havebeenprovedformoregenerallassesofgraphialmodelinludinghain
graphs [3 ℄, alternative hain graphs [2 ℄, and anestral graphs [23 ℄. In this
paperwe prove separation theorems fora partiularly expressive graphial
model{ theChainEventGraph (CEG).
OurmotivationforthedevelopmentofthislassisthatCEGsare
proba-blythemostnaturalgraphialmodelsfordisreteproesseswheneliitation
involves questions abouthow situations might unfold. Although the
topol-ogyofthesegraphsismoreompliatedthanthatoftheBN,theyaremuh
more expressive, as they allow us to represent all strutural quantitative
informationwithin thegraph itself.Context-spei symmetrieswhih are
notintrinsito thestrutureoftheBN[4 , 16,22 ,24 ℄are fullyexpressedin
thetopology of the CEG, whih also reognises logial zeros in probability
tables, and thenumbersof levels taken byproblem-variables.This lasthas
been found to be essential to understanding the geometry of BN models
withhiddenvariables[1 , 18 ℄.
TheCEGhasalreadybeendemonstratedtobeausefulinferential
frame-workforappliationsasdiverseasforensisiene[26 ℄,biologialregulatory
models[27℄, and eduation [8 ℄. The graphsprovidea framework for
repre-sentation[27 ℄,probabilitypropagation[29 ℄,learningandmodelseletion[8 ℄,
and forausalanalysis[30℄.
These papers onentrate on the appliation of CEG-based tehniques.
Whilstthey usetheonditionalindependene properties ofthegraph,they
donotprovideafullformal developmentforthelassof CEGmodels.This
paper reties this lak. In doing so we identify the form of the types of
onditional independene statements it is natural to query, and also prove
a number of separation theorems whih allow us to answer eah query as
always true ornot,solelyon thebasisof thetopology ofthegraph.
Wenotethat,even moresothanistheasewithBNs,thereareanumber
of onditional independene properties whih an simply be read o the
CEG.These aredesribed inSetions2.4and 5.2, andgiven thetree-based
natureoftheCEGthesepropertiesarenaturallyontext-spei.Thatisto
saytheyarepropertiesoftheformAqBjforsomeevent.Ananalogous
statement foradisrete BNwould be of theform
p(A=a jB =b;C=)=p(A=a jC =)
forsomesubsetsofvariablesA;B;C,somespeivetorvalue ofCand
allvetor valuesa ofA andb ofB.Thelassof onditioningeventswean
takle with a CEG is however muh riher than that generally onsidered
w
0
w
1
w
2
w
3
w
4
w
5
w
inf
w
6
1
2
1
2
3
1
2
3
1
2
1
2
1
2
1
2
w
7
w
8
w
9
1
2
1
2
1
2
Fig1.AnSCEGC
2. The Simple Chain Event Graph.
2.1. The basi denition of an SCEG. The Chain Event Graph C(V;E)
is a direted ayli graph (DAG), whih is onneted with a unique root
vertex(withno inomingedges) andauniquesink vertex(with nooutgoing
edges).UnliketheBNmorethanone edgean existbetweentwovertiesof
aCEG.TheregularChain EventGraph (RCEG) disussedinsetion4also
hasits vertiesand edgesoloured.
We rst onsider a sublass of the lass of CEGs alled a simple Chain
EventGraph (SCEG).Neither theverties(alled positions)w2V(C),nor
theedgese(w;w 0
)2E(C)ofanSCEGareoloured.AnexampleofanSCEG
isgiven inFigure 1.
TherootandsinkvertiesofaCEGarelabelledw 0
andw
1
.Eahposition
w2V(C)nfw
1
ghasasetE(w)ofk(w)outgoingedges,whihwhenwewish
toemphasisetheironnetionwiththepositionw,maybelabelledfe
x (w):
x=1;2;:::;k(w)g.
Adireted w
0
!w
1
pathinC isalled aroute.The set ofroutes ofC
ContextforFigure 1
Desriptor Edges
male e1(w0)
female e
2 (w
0 )
displayedsymptomS beforepuberty e1(w1);e1(w2)
displayedsymptomS afterpuberty e 2
(w 1
);e 2
(w 2
)
neverdisplayedsymptomS e3(w1);e3(w2)
developedondition e1(w3);e1(w4)
didnotdevelopondition e2(w3);e2(w4)
diedbeforetheageof50 e1(w5);e1(w6);e1(w7);
e 1
(w 8
);e 1
(w 9
)
diedattheageof50orolder e2(w5);e2(w6);e2(w7);
e 2
(w 8
);e 2
(w 9
)
probability spae represented by C { see below). Note that eah route is
uniquelydetermined by asequene of edges.Thusinthe CEGinFigure 1,
one suh route is
1
fe
1 (w
0 );e
1 (w
1 );e
1 (w
3 );e
1 (w
6
)g. It is easy to hek
that C here has 20 suh routes. We write w w
0
when the position w
preedesthepositionw
0
ona route.
When our CEG is applied to a population, eah route orresponds to a
possible set of attributes that a member of the population ould take. For
example,iftheCEGinFigure 1is appliedto apopulationof people whose
parentssuereredfromaninheritedmedialondition,andtheedgesofthe
CEG arry the desriptors given in Table 1, then the route
1
desribed
above orresponds to male, displayed symptom S before puberty, developed
ondition, died beforethe age of 50.
AnSCEGis route ompatible fora population ofunits ifeahpossible
historyof aunitinthepopulation(oratomoftheevent spae)orresponds
totheunitpassingalongone oftheroutes2(C) .We useF( C) todenote
the sigma eld of events formed by these atoms. F( C) orresponds to the
power set of (C) . Sine eah atom of this event spae odes what might
happento a unitin , theSCEGenodesan additionallongitudinal
devel-opmentdepitingthepossiblewaysthefuturemightunfold,notenodedby
thesigmaeld F( C) alone(see [25℄).
We labelan event inF(C) by,and notethatbeause theCEG'satoms
have this impliit longitudinal development assoiated with them, ertain
events in F( C) are partiularlyimportant. Let (w) denote the event that
a unit takes a route that passes through the position w 2 V(C). (w;w
0 )
is then the union of all routes passing through the positions w and w
0 ,
(e(w;w
0
))istheunionofallroutespassingthroughtheedgee(w;w 0
),and
((w;w
0
)) istheunion ofall routesutilisingthesubpath(w;w 0
Certain subsets of the set of positions also have an important status in
this ontext. In this paper we will all a set R V(C) a regular subset if
theeventsf(w):w2R garedisjoint.NotethatR isregularifandonlyif
there is no route 2(C) ontaining more than one positionw 2R . Call
Raposition-ut iff(w) :w2R gformsapartitionof(C) .A position-ut
an be assoiated with a random variable that labels whih of a lass of
developmentsa unitmight take (seesetion 3).
2.2. Probabilities on an SCEG. Underlying the SCEG there is a
prob-abilityspae whih is speiedbyassigning probabilitiesto the atoms. We
dothisasfollows:Foreahpositionw2V(C)nfw 1
g andedgee(w;w 0
)
em-anatingfrom w, we all
e (w
0
jw) a primitive probability if e
(w 0
jw) 0
and P
w 0
e (w
0
jw)=1.
Definition 1. A probability mass funtion p(); 2 (C) is said to
have themonomial property fora population ifthereexistsasetof
prim-itive probabilities = f
e (w
0
jw) :e(w;w 0
)2E(w);w2V(C)nfw
1
gg on
theedges ofC suhthat forallroutes 2(C)
p()=
Y
e(w;w 0
)2
e (w
0
jw) (2:1)
wheree(w;w
0
)2meansthat theedgee(w;w
0
) lieson theroute .
Notethat(2.1)fullydenesaprobabilitymeasureoverF(C)byspeifying
eah atomiprobabilityasa funtionofits primitiveprobabilities.
The assignment of probabilities (2.1), determined by impliitly
de-mands a Markov property over the ow of the units through the graph.
Thus, in the ontext of our medial example, the probablility of an
indi-vidual with attributes (male, displayed symptom S before puberty), (male,
displayed symptom S after puberty) or(female, displayed symptom S before
puberty)developingtheonditiondependsonlyonthefatthatthesubpaths
orresponding tothese pairsofattributesterminateat thepositionw 3
,and
notonthepartiularsubpathleadingto w
3
.Theprobabilitythisindividual
developstheonditionisthen
e (w
6 jw
3
)p((e(w
3 ;w
6
))j(w
3
)). Sowe
onlyneedtoknowthepositionaunithasreahedinordertopreditaswell
asispossiblewhat thenext unfoldingof itsdevelopment willbe.
ThisMarkovhypothesislooksstrongbutinfatholdsformanyfamiliesof
statistialmodel.Forexamplealleventtreedesriptionsofaproblemsatisfy
thisproperty,allnitestatespaeontextspeiBayesianNetworksaswell
Wean gofurtherandstate thatthesets ofpossiblefuturedevelopments
(whetherornottheydeveloped theonditionandwhether ornotthey died
before the age of 50) for individuals taking any of these three subpaths
must be the same. Moreover the onditional probability of any partiular
subsequentdevelopmentmustbethesameforindividualstakinganyofthese
threesubpaths.Thisaounts forthetermposition fora (non-sink)vertex.
In this paper we disuss minimal CEGs where if positions w
and w
aresuhthat the sets of possiblefuture developmentsfrom w
and w
are
idential, and the onditional probability distributions over these sets are
idential, thenw
andw
arethe sameposition.Anyreferene to aCEG,
SCEGorRCEGshouldtherefore be taken to meanaminimalCEG, SCEG
orRCEG.
Definition 2. An SCEG C is said to be valid fora population if it
isroute ompatible andhas themonomialpropertyfor .
Note that like the BN, the SCEG an be valid without its assoiated
primitive probabilitiesbeing known.We just needto believe that some set
exists so that the assoiated Markov hypothesis holds. We are free to
assign any set of probabilities to the edges of a valid SCEG within the
simplexonditions above. Soin partiularthe probabilitymodelspae of a
valid C an be dened as the produt spae of these jV(C))j 1 dierent
simplieswherethesimplexassoiatedwithw2V(C)nfw
1
ghasEulidean
dimension k(w) 1: The probabilityof any event in F(C) is then of the
form
p()=
X
2
p()=
X
2 Y
e(w;w 0
)2
e (w
0 jw)
where 2 means that is one of the omponent atoms of the event .
Inthispaperwe willalso usethefollowingfurthernotation:
(w 0
j w)p(((w;w
0
))j (w)) denotes theprobability of utilisingthe
subpath(w;w
0
) (onditionalon passingthroughw),
(w 0
jw) p((w;w
0
) j(w))=
P
(w
0
jw) denotesthe probabilityof
arrivingat w 0
onditional onpassing throughw.
2.3. Conditioning on intrinsi events. In thispaperweareinterestedin
onditioningsets whih give rise to onditional independene queries that
an be answered purely byinspeting thetopologyof an SCEG C. An
im-portant sublassof these areevents inF(C) whihare alledintrinsi.
Definition 3. Anintrinsievent inF(C) isasetofroutesofCwhih
arealsoroutesofC
whereC
w 0
and thesink vertexw
1
of C inits vertex set, and wherew 0
is theonly
vertex in V(C
) with no parent, and w
1
is the onlyvertex in V(C
) with
nohild. Callsuh a subgraphC
a sub SCEG.
Note that the sub SCEG C
is itself an SCEG. All atoms of F( C) are
intrinsi, as are (w) and (w;w
0
) (provided this is non-empty) for all
w;w 0
2 V(C), and as is the exhaustive set (w
0
). If we inlude the empty
set in the set of intrinsi events then we note that intrinsisets are losed
underintersetionandsotehniallyforma-system (seeforexample[12℄)
we an assoiate withtheSCEGC.
Notall eventsinF(C) areneessarilyintrinsibeause thelassof
intrin-sieventsisnotlosedunder union.Forexample,fortheCEGinFigure 1,
theevent onsistingoftheunionofthetwoatoms(e
1 (w
0 );e
1 (w
1 );e
1 (w
3 );
e 1
(w 6
))and (e 1
(w 0
);e 2
(w 1
);e 1
(w 3
);e 2
(w 6
)) produes a subgraph C
whih
hasfourdistint routes, so is notintrinsi.However thelass of intrinsi
events is rih enough to enompass virtually all of the onditioning events
intheonditional independene statementswe would like to query.In
par-tiular,ifourmodelan be expressedasa BNthen any setof observations
expressiblein theform O(A )=fX
j
2A
j
g (for subsets fA j
gof the sample
spaesof fX
j
g,thevertex-variablesoftheBN)isa propersubsetof theset
ofintrinsieventsdened ontheCEG ofour model[29 ℄.
The rst important property of the lass of validSCEG models is that
they arelosedunder onditioningbyan intrinsievent:
Theorem1. If an SCEGC is valid on a population then the
proba-bilitymodel on F(Cj) of anyof its subSCEGs C
isa probability model on
F(C
) whih isalso valid.
Theobvious setof primitive probabilitiesforthesub-SCEGC
isgiven by
=
e
(w 0
jw):e(w;w 0
)2E(w);w2V(C)nfw
1 g
where
e
(w 0
jw)=
p( j(e(w;w
0 )))
p( j(w))
e
(w 0
jw)
providing this is well-dened.A proof of this theorem an be foundin the
appendix. We note that this property has now been suessfuly used to
develop fastpropagationalgorithmsforCEGs (see [29℄).
Notethat the probabilityof an atom in C onditionedon the intrinsi
event is the probability of that atom in the SCEG C
. We denote this
probability p
(). It is then trivially the ase that the probability of an
eventinC onditionedontheevent istheprobabilityofthateventinthe
SCEGC
2.4. RandomvariablesonanSCEG.. Randomvariablesmeasurablewith
respet to F( C) partition the set of atoms into events. So for example, we
an denevariablesX;Y,measurable withrespetto F( C),whihpartition
the set of atoms into events f
X
g;f
Y
g. Moreover for an event (with
p()6=0) we an writeXqY j ifp(X =x jY =y;)=p(X =x j)
forall valuesx of X and y ofY (see forexample[7 ℄).
Lemma 1. ForaCEGC,variablesX;Y measurablewithrespettoF( C),
and intrinsi onditioning event , the statement XqY j is true if and
onlyif XqY istrue in the CEG C
.
Theproof of thislemma isin theappendix. This isa partiularlyuseful
propertybeauseitallowsusto hekanyontext-speionditional
inde-pendene propertybyhekinganon-onditionalindependenepropertyon
asub-SCEG.
We now turnourattention to two typesofelementaryrandomvariables,
measurable with respetto F( C), that an be identied witheah position
w2V(C)nfw
1
g.These are thevariablesf I(w):w2V(C)nfw 1
gg dened
by
I(w)= (
1 ifpasses throughw
0 otherwise
and thevariablesfX(w):w2V(C)nfw
1
gg denedby
X(w)=
(
x if passes along edgee
x
(w)2E(w)
0 if thepositionw doesnotlieon
where x = 1;2;:::k(w) index the edges emanating from w. Notie that
sineI(w) islearly a funtion of X(w), to speifya fulljoint distribution
over f (I(w);X(w)):w2V(C)nfw 1
gg it is suÆient to speify the joint
distributionoff X(w):w2V(C)nfw 1
gg .Notethatallatomieventsan
be expressedas theintersetionof events
=
\
w2
fX(w)=x
g
andeventsinF( C) astheunion ofthese atomi events
=
[
2 8
<
: \
w2
fX(w)=x
g
9
=
;
wherew2denotesthat thepositionw lieson theroute , and x
6=0 is
w
0
w
1
w
2
w
3
w
4
w
inf
w
6
1
2
1
2
1
2
1
2
1
2
1
2
w
7
w
8
w
9
1
2
1
2
1
2
Fig2.C onditionedontheeventthatdisplayedsymptomS
Figure 2 shows the SCEG C from Figure 1 onditioned on the intrinsi
event =(X(w
1
)=1)[(X(w
1
) =2)[(X(w
2
)=1)[(X(w
2
) =2) or in
theontext of ourmedialexample,displayed symptom S.
ForanysetAV(C),letX A
denotethesetofrandomvariablesfX(w):
w2AgandI
A
thesetf I(w):w2Ag.Also,foranyw2V(C),letU(w)be
thesetof positionsinV(C) whih lieupstreamof thepositionw,D(w) the
setofpositionswhihliedownstreamofw,U
(w) thesetofpositionswhih
do not lie upstream of w, and D
(w) the set of positions whih do not lie
downstream of w.
Lemma 2. ForanySCEGCandpositionw2V(C)nfw
1
g,thevariables
I(w);X(w) exhibit the positionindependeneproperty that
X(w)qX
D
(w) jI(w)
TheresultgiveninthislemmaisanalogoustothatwhihPearl[20 ℄usesto
deneBNs,whihstatesthataBNvertex-variableisindependentofits
non-desendantsgiven its parents. Itprovidesaset ofonditional independene
statements that an simply be read from the graph, one for eah position
inV(C).The proof ofthe lemmais intheappendix.
The statement that X(w)qX
D
(w)
j (I(w) =1) an be read as: Given
a unit reahes a position w 2V(C), whatever happens immediately after w
reahed, but also of all positions that logially have not happened or ould
not now happen beause the unit has passed through w. Thus, in the sense
above, the position of a valid SCEG C is suÆient to desribe the future
development ofunitspassing throughit.
As already noted, the produt spae dened by f(I(w);X(w)) :
w2V(C)gisoverspeied.ThisissorstlybeauseI(w)=0,X(w)=0
and I(w) = 1 ) X(w
0
) = 0 for w
0
2 D
(w) \U
(w). Probability
dis-tributions exist whih satisfy the set of statements of the form X(w)q
X D
(w)
jI(w) whihdonotobeythese impliations,butsuhdistributions
annotbe represented onan SCEG.
Moresigniantly,if the SCEGis used for the purpose forwhih it was
intended, as a representation of an asymmetri proess or problem, then
therewillbemanyprobabilitiesinthejointprobabilitytablesoverthespae
denedbyf(I(w);X(w)) :w2V(C)g whih areidentially zero. The joint
massfuntionisthenextremelysparse.Thesezerosorrespondtoimpossible
eventswhihnonethelessaregiven equalsignianewithpossibleeventsin
a BN-representation of the problem. In many ases these events are not
just impossiblebutmeaningless. Forexample ifX(w
a
)= 1 orrespondsto
patient dies, X(w
b
) = 1 orresponds to patient is given treatment 2, and
w a
w
b
,thentheevent (X(w
a
)=1;X(w
b
)=1) hasno logialmeaning.
Asthesetofstatementsoftheformf(I(w);X(w)):w2V(C)gdonot
de-netheSCEG,theseadditional ounterfatual statements produedbythe
produtspaerepresentationarenotanintegralpartoftheCEG-framework.
Theprodutspaedenedbythefullsetofstatementsisneverthelessa
use-fulonstrutbeauseitallowsustoenodesetsofonditionalindependene
statementsinto a validSCEGand so allowsus to quikly prove separation
theoremsforsuh graphs.
The struture of the CEG illustrates a further aspet of the graphial
modellingproess whihis nottransparent inthe topologyof theBN. The
CEG depits all possible histories of a unit in a population, and gives a
probability distribution over these histories. However, when a single unit
traversesoneoftheroutesintheCEG,valuesareassignedtoI(w);X(w)for
allpositionsw2V(C).Thoseonditionalindependenestatementsenoded
bythe positions andedges throughwhih ourunit hasnot passedarenow
truly ounterfatual [6 ℄ in that they answer queries of the form If X had
not been the ase, what would be the hane of Y happening? So the CEG
simultaneously depits both the \reality" and the ounterfatual aspets
of the problem one we start to observe the atual behaviour of units in
the population. It also makes it a powerful framework for expressing rih
3. A separation theorem for Simple CEGs. We all a position
w2V(C)astalkiftheremovalofwfromV(C)wouldresultinagraphwith
two disonneted omponents. In (non-probabilisti) graph theory suh a
vertexis alleda utvertex (see forexample[11 ℄).
Theorem2. In an SCEG C with w
1 ;w
2
2 V(C) and w
2
6 w
1 ,
X(w 1
)qX(w
2
) if and only if either w 2
is a stalk, or there exists a stalk
downstream of w
1
andupstream of w
2 ,for w
0
w
1
w
1
; w
0
w
2
w
1 .
The proof of this theorem is in the appendix. Theorem 2 has a number
of powerful orollaries,whih we give after introduingtwo new variables.
CallJ(R ) theinidene variable ofa regular subsetR if
J(R )
X
w2R
I(w) sup
w2R I(w)
andall Y(R )the riterionvariable of a regularsubsetR if
Y(R )
X
w2R
X(w) sup
w2R X(w)
Lemma 3. For an SCEGC with position uts R
a
=fw
a
g; R
b
=fw
b g:
If X(w
a
)qX(w
b
) for any w
a
2 R
a
; w
b
2 R
b
, then Y(R
a
)qY(R b
) in
every distributionompatible with C.
Conversely,if Y(R a
)qY(R
b
) holdsforalldistributions ompatiblewithC ,
then X(w
a
)qX(w
b
) for all w a
2R
a
; w
b
2R
b .
This lemma and Corollary 5 in Setion 7 formalise and generalise the
result given in [27 ℄ Theorem2.The proofof thelemma is inthe appendix.
The onverse result is somewhat surprising, but is a onsequene of the
partiularstrutureof thesigma eldassoiatedwithan SCEG.
Corollary 1. Let C be an SCEG, an intrinsi event, R
a
= fw
a g;
R b
=fw
b
g beposition uts of C.
Ifin thesub-CEG C
, w a
andw
b
areseparated by a stalk,for anyw a
2R
a ;
w b
2R
b , w
a ;w
b 2V(C
), then Y(R
a
)qY(R b
) j .
Theproofofthisorollaryisintheappendix.Thishasmajoronsequenes
formodelswhih admitaprodutspaestruture, where othogonaluts of
the CEG have a natural meaning orresponding to measurement variables
oftheproblem.ModelsofthissortanberepresentedasBNs,withpossible
Corollary 2. If an SCEG C is of a model whih admits a produt
spae struture, A;B are measurement variables of the model, and R
a ;R
b
are the position uts of C orreponding tothese variables, then:
If X(w
a
)qX(w
b
) for any w
a
2 R
a
; w
b
2 R
b
, then AqB in every
distributionompatible with C.
Conversely, if AqB holds for all distributions ompatible with C, then
X(w a
)qX(w
b
) for all w a
2R
a
; w
b
2R
b .
Theproofof thisfollowsimmediatelyfrom Lemma 3.
Corollary 3. Let C be an SCEG of a model whih admits a produt
spaestruture,A;B bemeasurement variablesof the model, an intrinsi
event, R a
;R b
be position uts of C orrepondingto the variables A andB.
Ifin thesub-CEG C
, w a
andw
b
areseparated by a stalk,for anyw a
2R
a ;
w b
2R
b , w
a ;w
b 2V(C
), then AqB j.
The proof of this follows diretly from Corollaries 1 and 2. In the ase
where our model has a natural produt spae struture, the topology of
the SCEG allows us to replae onditional independene queries suh as
AqB jC ?bysets of ontext-speiqueriessuhasfAqB j(C =)?g,
allowing us to interrogate the graph using Corollary 3. If in addition our
modeladmits noontext-spei onditionalindependeneproperties,then
thesymmetriesintheSCEGmeanthatweneedonlyhektheanswerto a
singlequery,forinstane AqB j(C =1) ?
Example 3.1. Figure3showstheSCEGCfromFigure1onditionedon
theintrinsievent =(X(w
1
)=1)[(X(w
2
)=1) ordisplayed symptom S
beforepuberty.Thisgraphhasastalkatw 3
,andbyTheorem2wehavethat
X(w 0
)qfX(w
3 );X(w
6 );X(w
7
)g inthisgraph.
Considerthepositionuts R
0
=fw
0 g;R
1
=fw
1 ;w
2 g;R
2
=fw
3 ;w
4 ;w
5 g;
R 3
= fw
5 ;w
6 ;w
7 ;w
8 ;w
9
g of C. Then as Figure 3 depits a onditioned
CEGC
fortheintrinsievent , Corollary1 givesusthat
Y(R
0
)q(Y(R 2
);Y(R 3
)) j
Now the CEG C from Figure 1 does not have a natural produt spae
struture, butthisisno obstale to our usingCorollary3 here.AsC
does
admita produt spaestruture we an impose thisonto C byforexample
deningAY(R
0
);B Y(R
1
), DY(R
3 )and
C=
(
1 if sup(X(w
3 );X(w
4 ))=1
w
0
w
1
w
2
w
3
w
inf
w
6
1
2
1
1
1
2
1
2
w
7
1
2
Fig3.C onditionedondisplayedsymptomSbeforepuberty
Thisallows usto useCorollary3 and givesusthat
Aq(C ;D) j(B=1)
whih in our medial ontext reads as whether an individual develops the
ondition and whether they die before 50 are independent of their gender
giventhat they displayed symptom S before puberty.
4. RegularCEGs. AlthoughSCEGsformanimportantlassof
graph-ialmodel,byaddingextrastrutureto themwe anmakethemevenmore
expressive.Wedothisbyolouringpositionsandedges.Theresultantgraph
is alled a regular Chain Event Graph (RCEG). We note that oloured
graphs have reently been found to provide a valuable embellishment to
othergraphialmodels(see forexample [9℄).
An RCEG is a oloured SCEG C where the set V(C) has an assoiated
partitionU(C)=fu
1 ;u
2 ;:::u
t
gforwhiheahsetuV(C) isregular.The
setuisalledastageandissuhthatforeahw2uthedistributionfuntion
ofX(w)j(I(w) =1)isdependentonlyonuandnotonthepartiularw2u.
Definition 4. w
1 ;w
2
2 V(C)nfw
1
g are in the same stage u if there
exists a bijetion (w
1 ;w
2
) between E(w
1
) and E(w
2
) suh that if :
e x
(w 1
)7!e x
(w 2
) thenp((e
x (w
1
))j(w
1
))=p((e
x (w
2
)) j(w
2 )).
The positions w
1 ;w
2
w
0
w
1
w
2
w
3
w
4
w
5
w
8
w
6
1
2
1
2
3
1
2
3
1
2
1
2
1
2
1
2
Fig 4.TheRCEGforExample4.1
andtheedges e x
(w 1
);e x
(w 2
)have thesameolour ifw 1
;w 2
areinthesame
stage ande
x (w
1
) maps to e
x (w
2
) under thisbijetion.
Theexisteneorotherwiseofabijetionbetweentwoedgesetsisnormally
apparentfromtheontextoftheproblem.Notethatife x
(w 1
)mapstoe x
(w 2
)
underabijetion ,thentheseedgesmustorrespondtothesame outome
(forexamplepatient dies)giventhetwo histories(w 1
)and (w
2
).Weall
theolouringof theRCEG thestage-struture ofthe graph.
Example 4.1. Produing an RCEG from the SCEG in Figure 1 we
anadd theextrainformationthatthepositionsw
3
and w
4
areinthesame
stage{thatistheprobabilityofdevelopingtheondition(ornot)isthesame
whetheramemberofthepopulationhasattributesorrespondingtothe
sub-paths(e 1
(w 0
);e 1
(w 1
));(e 1
(w 0
);e 2
(w 1
));(e 2
(w 0
);e 1
(w 2
))or(e 2
(w 0
);e 2
(w 2
)).
TheRCEGC is giveninFigure 4.
This additional struture allows us to express a riher set of
ontext-spei properties and sample spae information than we an with the
subsetthelass of modelsexpressible asfaithful regular orontext-spei
BNsonnitevariables.UnliketheBN,theRCEGembodiesthestrutureof
the model state spae and any ontext-spei information inits topology
andolouring.
RCEGsareroute-ompatibleand havethemonomialpropertyfora
pop-ulation iftheirunderlyingSCEGdoes, and henearevalidfora
popula-tion if their underlyingSCEG is. The subgraph C
of an RCEG C
on-ditionedon an intrinsievent is anRCEG. Theorem1holdsforRCEGs.
NotehoweverthatC
maynothavethesamestage-stutureasCinthat
po-sitionsoredgeswhihhavethesame olourinC may have dierent olours
inC
.Lemma1 and theposition independene property holdforRCEGs.
TheonditionsstipulatedinCorollaries1and 3an nowberelaxed.It is
suÆientthatC
shouldbesimple(ratherthanC) forthese resultstohold.
ThesubgraphofaCEGwhihonsistsofapositionw,thesink-nodew
1 ,
and all edges and positions whih lie on a w ! w
1
subpath is alled the
subgraph rooted in w. When the CEG is used as a pratial tool it is
im-portanttomaximiseitsrepresentationaleÆieny.SoifinthesubgraphC
,
thesubgraphs rootedin the positionsw
and w
have idential topologies
and olouring we an ombine the positions w
and w
into a single
posi-tion[30 ℄.NotethatifwedothisthenC
althoughnowminimal,isnolonger
asubgraphof C (see Denition3).
Followingtheideasof setion3, we let
J(u)=sup
w2u
I(w) and Y(u)=sup
w2u X(w)
The RCEGis also a powerfultool for interrogation purposes, butto
max-imise its potential in this area we use the Augmented Chain Event Graph
(ACEG) desribed inthenext setion.
5. Augmented CEGs.
5.1. Denition of an Augmented CEG. Analogously to the denition
of X
A , let Y
A
= fY(u) : u 2 Ag and J
A
= fJ(u) : u 2 Ag. Sine the
CEGC is aDAG, there existsa partialorder ofthe stagesintheset U(C).
LetP(u)bethe setof allu 0
stages thatpreedeu inthispartialorder.Let
Y Q(u)
be aminimalsubsetof Y
P(u)
suh that
J(u)qY
P(u) jY
Q(u)
Definition 5. An augmented CEG (ACEG) A(C) is a funtion of the
1
1
1
1
2
2
2
1
2
ω
0
ω
1
ω
2
ω
3
ω
4
ω
5
2
2
2
2
2
1
1
1
1
Fig 5.TheRCEGforExample5.1
The edge set E(A(C)) onsists of direted edges onneting the parents
of any vertex in V(A(C)) to that vertex. Eah vertex Y(u) has a single
parent J(u),and theparentsofJ(u)arepreiselythose Y(u 0
) vertiesthat
aremembersof Y
Q(u) .
Example 5.1. A researh group has taken a sample from the
popu-lation desribed in Setion 2.1 whih ontains only people who displayed
symptomS.Analysisofthissamplesuggests thatwhetheran individual
de-velops theonditionandwhetherthey diebefore50areindependent oftheir
gendergivenwhentheydisplayed symptom S.TheRCEGforthisisgivenin
Figure5.AnACEGforthisgraphisgiveninFigure6,whereforillustrative
onvenienethe edges emanatingfrom Y(u) nodeshave beenlabelledwith
valuesofA (=Y(R 0
)forR 0
=fw
0
g),B (=Y(R 1
) forR 1
=fw
1 ;w
2
g),and
C (=Y(R
2
) forR 2
=fw
3 ;w
4 g).
5.2. ACEGsareBayesianNetworks. Weextendthenotationofsetion3
to let X D
(u)
be the vetor of random variables of the form X(w)
A
=1
B=
1
C
=1
C=1
C=2
A
=2
B=2
B=
1
B=2 C
=2
B=1
B=1
B=2
B=2
J(u
A
)
Y(u
A
)
J(u
B1
) Y(u
B1
)
J(u
C
)
Y(u
C
)
J(u
B2
)
Y(u
B2
)
J(u
D1
)
Y(u
D1
)
Fig 6.AnACEGfortheRCEGinFigure 5
Y D
(u)
;J D
(u)
be the vetors of random variables of the form Y(u
0 );J(u
0 )
assoiated withstagesinC whihdo notliedownstreamof thestage u.
Lemma 4. For CEG C, and stage u2U(C)
Y(u)qY
D
(u) j J(u)
Theresultgiveninthislemmais analogoustothatgiveninLemma2for
positions, and so also to the result quoted there for BNs. It provides a set
of onditional independene statements that an simply be read from the
graph,one for eah stage in U(C).A partial readingof the lemma givesus
thattheimmediatefuturefora unitat astage u isindependent ofhowthe
unitreahedthatstage. Theproofof thelemma isintheappendix.
Byonstrution,ifastageu 0
isnotdownstreamofuinC,thenJ(u 0
);Y(u 0
)
arenotdownstream ofJ(u);Y(u) inA(C).Sine forevery stageu 0
,J(u 0
) is
afuntionof Y(u 0
),it follows that
Y(u)q(J D
(u)
;Y D
(u)
andhene that
Y(u)q(J P(u)
;Y P(u)
) jJ(u)
inany partialorder of U(C).Clearlywe also have that
J(u)q(J
P(u) ;Y
P(u) ) jY
Q(u)
and henethatallvertiesinan ACEGA(C) areindependent oftheir
pre-deessor vertiesgiven theirparental vertiesinanypartialorder ofU(C).
In[19 ℄itisshownthataprobabilitydistributionP isMarkovrelativetoa
DAGGifandonlyifeahvariableinGisindependentofallitspredeessors
onditionalonitsparents,insomeorderingofthevariablesthatagreeswith
thearrowsofG.ClearlyourACEGisaDAG,andfromtheabovereasoning
eahvariableinA(C)isindependentofallitspredeessorsonditionalonits
parentsforallP denedontheCEGC.SoourACEGobeyswhatPearl[20 ℄
alls the ordered Markov ondition, and hene also obeys the loal Markov
ondition[13 ℄. Results in[10℄ allow ustherefore to dedue that theACEG
isitself aBN.
Thisdedutionmeansthatanyresult availableforusewithBNsanalso
be usedwith ACEGs. In partiular we an use d-separation to allow usto
interrogate ACEGs foronditional independeneproperties.Theadvantage
thatthe ACEG has hereoverthe BNis that intheformer ontext-spei
onditionalindependenepropertiesaredepitedexpliitly inthetopology
ofthe graph,and soit an be interrogated diretly forsuhproperties.We
begin however bylookingat modelswhih anbe representedby BNs.
6. Models depitable by Bayesian Networks and others. If a
modelhasanaturalprodutspaestrutureandadmits noontext-spei
onditionalindependene properties thenit an be depited bya BN
with-out any further annotation. In this setion we show that if our CEG is of
suha modelthenanyseparation-based onditionalindependeneproperty
readablefrom theBNan also beread fromits assoiatedACEG.
IfourCEGisofamodelwhihhasanaturalprodutspaestruturethen
foreah variable X
i
inthe BN there exists a olletion of vertiesfJ(u
i )g
in the ACEG whose members orrespond to the possible ongurations of
Q(X i
)(theparentvariablesofX i
),andaolletionofvertiesfY(u i
)gwhose
members orrespondto X
i
given those ongurations.
Theorem3. Ifamodelwithanaturalprodutspaestrutureadmitting
no ontext-spei onditional independene properties, has a BN
If fY(u i
)g is d-separated from fY(u
j
)g by fY(u
k
)g in A(C), then
X i
qX
j
jX
k
in every distributionompatible with C and G.
Conversely,if X i
qX
j
jX
k
holdsforalldistributions ompatible withC and
G, then fY(u i
)g is d-separated from fY(u j
)g by fY(u k
)g in A(C).
The proof of this theorem is given in the appendix. This result an be
explained as follows: The CEG C is of a model whih has a natural
prod-utspae struture admitting no ontext-spei onditional independene
properties,and an be depitedbyaBN G.Thereforethere exist (inA(C))
edgesfromvertiesinfY(u i
)gtovertiesinfJ(u j
)gifandonlyifthereexists
anedgefromX
i
toX
j
inG,andsothereisa1:1orrespondenebetweenthe
parental onditional independene statements in G and the parental
ondi-tionalindependenestatementsinA(C).By[31 ℄Corollary1,theonditional
independenestatementsina DAG anbederivedfromd-separation ifand
only if they an be derived from the list of parental onditional
indepen-denestatementsusingthesemi-graphoid axioms[28 ℄. AsbothG andA(C)
areDAGs, wean inferthatthereis a1:1orrespondenebetweenthe
on-ditional independene statements derived from d-separation in G and the
onditionalindependene statements derivedfrom d-separation inA(C).
Essentially, Theorem 3 allows us to use the olletions fY(u
i
)g in the
ACEGassurrogates forX
i
intheBN,whenansweringonditional
indepen-denequeries.
Example 6.1. For the RCEG in Figure 5, let A;B;C be as in
Exam-ple 5.1, R 3
=fw
5 ;w
6 ;w
7 ;w
8
g and D= Y(R
3
). Then usingTheorem 3 on
theACEGforthisRCEG (given inFigure 6),wesee that
Y(u A
) isd-separated fromfY(u C
);Y(u D
)g byfY(u B
)g)Aq(C ;D) jB
fY(u A
);Y(u B
)g ared-separatedfrom fY(u
C
)g)(A;B)qC
Y(u A
) isnotd-separated fromY(u C
) byfY(u D
)g)Aq/C jD
Y(u B
) isnotd-separated fromY(u C
) byfY(u D
)g)Bq/C jD
for the RCEG in Figure 5. In our medial ontext whether an individual
develops the ondition and whether they die before the age of 50 are
inde-pendent of theirgender givenwhenthey displayed symptom S;whetherthey
developtheonditionisindependent oftheirgenderandwhentheydisplayed
symptom S; butwhether they develop the ondition is not independent of
eithertheir gender or whenthey displayed symptom S givenwhether or not
Corollary 4. If X i
;X j
;X k
aredistintsubsets of the vertex-variables
of G, and fY(u
i
)g;fY(u j
)g;fY(u k
)g are the orresponding olletions in
A(C), then the results of Theorem 3 still hold.
This follows from the proof of Theorem 3 (whih does not depend on
X i
;X j
;X k
being singlevariables).
We have alled the olletions fY(u
i
)g in A(C) surrogates for X
i in G,
but when we replae the statement \X
i
qX
j
j X
k
in G" by \fY(u i
)g is
d-separated from fY(u j
)g by fY(u
k
)g in A(C)", only fY(u k
)g is atually
a surrogate. This is beause the latter statement implies that fY(u
i )gq
fY(u j
)g jfY(u k
)g (sine A(C) is a BN),and if thisstatement is true then
X i
qX
j
j fY(u k
)g sine X
i
supY(u
i )
is a funtion of fY(u
i )g. By
onstrution only one Y(u
i
) within the set fY(u
i
)g an take a non-zero
value,and the value thisvariable takesisequal to thevaluetaken byX i
.
Notethat the ACEGhas J(u) and Y(u)nodes foreah stage u 2U(C),
and eah stage u in a CEG is assoiated with a partiular olletion of
parents{thesetofu 0
2U(C)orrespondingto Y
Q(u)
intheACEG.Indeed,
ifourCEGhassuÆient symmetryto beembeddedintoafamilyofmodels
withaprodutspae struture,thenthepositionsonstitutingeahstageu
are members of a spei orthogonal ut R (setion 2.1), and u enodes a
partiularongurationof theparentalvariablesof Y(R ).
For this reason an ACEG has many more nodes than a standard BN,
and so admits a far larger olletion of onditional independene
state-ments. This olletion inludesmany ontext-spei properties whih an
onlybe represented in BNs by modifyingtheir struture [4 , 16 , 22 , 24 ℄. It
also inludesmanyounterfatual statements of thetype desribed in
Se-tion2.5onSCEGs.So forexample,ifweonsidertheCEGinFigure 2,but
ombine the positions w
6 ;w
7 ;w
8
and w
9
into a sink-node w
1
, we get the
ACEG depitedinFigure 7,whereforonvenienewehave let A=Y(R
0 );
B=Y(R
1
); C=Y(R
2
) with R
0 ;R
1 ;R
2
dened asinExample5.1above.
Using the ACEG in Figure 7 we an dedue that C qB j (A = 1) {
whether an individual develops the ondition is independent of when they
displayed symptom S given that their gender is male (and symptom S was
displayed), and that CqA j(B = 1) { whether they develop the ondition
is independent of their gender given that they displayed symptom S before
puberty. But we also have statements suh as Y(u
C2
)qY(u B1
) j Y(u A
),
whihhas noobvious meaningintheontext of theproblem.
7. More on ontext-spei onditional independene. One of
thedistintadvantages oftheCEGwhen representing andanalysing
J(u
A
)
Y(u
A
)
J(u
B1
)
Y(u
B1
)
J(u
B2
)
Y(u
B2
)
J(u
C1
)
Y(u
C2
)
J(u
C2
)
Y(u
C2
)
A
=
1
A=2
A
=
2
A=
1,2
B
=
1
B=2
Fig 7.AnACEGfortheadaptedCEGfrom Figure2
spei event, perhaps a spei value of a variable, and use the CEG's
topologyto disoveronditionalindependenieswhihwouldnotexist ifwe
were to ondition on a related event, suh as a dierent value of our
vari-able.Ifwe areinterestedinthese ontext-spei onditional independene
propertiestheninthisdisreteontextweneedtoonentrateourattention
on statements where the onditioning element is an event. In most ases
thisevent willbeexpressible asa valueof a singleY(u) (or J(u)) variable,
and so queries an be heked diretly on an ACEG without the need of
the surrogate argument of the last setion. What happens in ases where
our onditioningevent annot be expressed asa value of a Y(u) (or J(u))
variable? An example of this is the event = (B = 1) for the model
de-pitedin Figure 4. We ould draw an ACEG for the fullCEG C here,but
the ACEG for C
is muh more useful. The sub-CEG C
for this event is
given in Figure 3. Note that the edge-probabilities on this graph are now
A = 1 j B = 1, A = 2 j B = 1 for the edges leaving w
0
; 1 for the edges
leaving w
1 & w
2
(the positions w
1 & w
2
ould be ombined into a single
positionassuggested inSetion 4);C =1; C =2 fortheedges leavingw
3 ;
B=1
C
=1
B=
1
C
=2
B=
1
J(u
A
) Y(u
A
)
J(u
B=1
)
Y(u
B=1
)
J(u
C
)
Y(u
C
)
J(u
D1
)
Y(u
D1
)
Fig 8.AnACEGforthesub-CEGfrom Figure3
and D = 2 j B = 1;C =2 forthe edges leaving w
5 & w
6
(see Setion 2.3
and[29 ℄).
An ACEG for C
is given in Figure 8. Notie that unlike in Figure 6,
J(u A
) is not a root-vertex as A is now dependent on B.Also, beause we
haveonditionedontheevent(B=1),thesetofY(u B
)vertieshasbeome
a single vertex Y(u
B=1
) with no anestors exept J(u
B=1
). We now use
d-separationto read that
fY(u C
);Y(u D
)g are d-separatedfrom Y(u A
)byY(u B=1
)
) (Y(u
C
);fY(u D
)g)qY(u A
) jY(u B=1
)
) (C ;D)qA jY(u B=1
)
sinetheACEGisaBN,andA;C ;DarefuntionsofY(u A
);Y(u C
);fY(u D
)g.
AlsoY(u B=1
)=1 , B =1,sothisinturnimpliesthat(C ;D)qAj(B=1)
withouttheuseof thesurrogate argument.
Thismethodwillalwayswork forases like thissineonditioningonan
event suh asB =1 always produes a single vertex of the form Y(u
B=1 ),
AswithSCEGs(seeSetion3),standardonditionalindependenequeries
onanRCEGangenerallybeansweredbylookingatontext-spei
ondi-tionalindependenequeriesonsubgraphsoftheCEGwhihareoftensimple.
Suh onditioning an only remove olouring from the graph and not add
it. Beause of the way CEGs are onstruted, the onditioning event in a
ontext-spei onditional independene query an very often be written
as(w) forsome w2V(C).But ifwe onditionon an event =(w), we
an readonditional independenepropertiesothe graphC
even ifC
is
notsimple.
Corollary 5. Let C be an RCEG with position uts R
a
= fw
a g;
R b
=fw
b g.Ifw
a ;w
b
areseparatedbyastalk,foranyw a
2R
a ;w
b
2R
b ,then
Y(R a
)qY(R b
).
The proof of this orollary is in theappendix. This result an obviously
be extended to give suÆient onditions for Y(R
a
) q Y(R
b
) j just as
Corollary1 extendsthe resultforSCEGs.
So if = (w) for some w 2 V(C), then w is a stalk in C
, and in
thisgraph,Y(R a
)qY(R b
) foranypositionuts R
a
upstreamofw,and R
b
downstreamofw.HeneY(R
a
)qY(R b
)jprovidingwehavedenedthese
variablesonsistentlyon theCEGsC andC
(see proof ofCorollary1).
Example 7.1. Consider the RCEG from Figure 4 onditioned on the
event =(X(w
1
) =1)[(X(w
1
)= 2)[(X(w
2
)= 1)[(X(w
2
) =2). The
RCEG forthisis given inFigure9.
Ignoring themedial ontext here, we note that in this graph the event
=(w
3
) an beharaterisedas(min(A;B)=1). Ifwe onditiononthis
event we getC
asinFigure10,whihasalreadynotedmusthavea stalk.
Forillustrative onveniene edges inFigure 10 have beengiven probability
labels.
Using Corollary 5 we get (C ;D)q(A;B) j (min(A;B) = 1), and
on-ditioning on = (w
4
) we get a CEG from whih we an trivially read
that (C ;D)q(A;B) j (min(A;B) = 2). Combining these we get (C ;D)q
(A;B) j min(A;B).
8. Conlusion. Insummary,theresultsof setion3giveusonditions
forthetruthofAqB statementsonSCEGswhiharediretlyanalogousto
those given in (for examplePearl's [20 ℄ or Lauritzen's[14℄ versionsof) the
d-separationtheoremforBNs.Corollary1alsogivesussuÆientonditions
1
1
1
1
2
2
2
1
2
ω
0
ω
1
ω
2
ω
3
ω
4
ω
5
2
2
1
Fig 9.TheRCEGforExample7.1
onditions for AqB j statements to hold on RCEGs. Queries suh as
AqB jC ?wheretheonditioningelement is alsoa variable(orolletion
ofvariables)an generallybeanswered byonsideringsets of queriesofthe
formAqB j ?Methodsfordoing thisaresuggested at variouspointsin
thetext, but inthespeialase where theRCEGdesribesa modelwhih
anbedepitedbyaBN, Theorem3givesonditionsforAqB jCdiretly
analogous to those given inthe d-separation theorem forBNs. The ACEG
fromSetion5isveryusefulforalltypesofonditionalindependenequery,
butis partiularlyuseful for queriesof the form AqB j ? insituations
whereusingothertehniquesisnotstraightforward.ThefatthattheACEG
isitselfaBNopensupanexitingrange ofpossibilitiesstillto be explored.
Analysts working with BNs have found that attempts to feed bak to
a user all the impliit onditional independenies assoiated with a given
graph an be rather overwhelming unless the BN is very simple. Clearly
this would also be the ase with CEG-based models. However, within any
given ontext thetypes of independeniesthat it is natural forthe user to
be able to understand, examine and verify are smallin number. Sine the
A
=1
|M
in
(A
,B
)=
1
B=1|A=1
C=
1
D
=1
|M
in
(A
,B
)=
1,C
=1
C=2
A
=2
|M
in
(A
,B
)=
1
B=2|A=1
B
=
1
w
it
h
pr
ob
ab
il
it
y
1
Fig10.TheRCEGfrom Figure9onditionedon=(w3)
appliationof theCEGwedefer thisdisussionto a future paper.
APPENDIX1:PROOFS ANDONE ADDITIONAL LEMMA
LIMITED MEMORY LEMMA. For any CEG C, w
1 ;w
2 ;w
3
2 V(C) with
w 1
w
2
w
3 ,
I(w 3
)qI(w 1
) j(I(w 2
)=1)
PROOF. Itis suÆientto prove that
p(I(w 3
)=1 jI(w 1
)=1;I(w 2
)=1)=p(I(w
3
)=1jI(w 2
)=1)
Soonsidera singleroute passingthroughw
1 ;w
2 ;w
3
.Thisroute onsists
ofasetofedgesandbyonstrutiontheprobabilityp()oftherouteisequal
to the produt of the probabilitieslabellingeah of these edges. Moreover,
the probability of any subpath of is equal to the produt of the
proba-bilitieslabellingeah ofits edges. So p() an be written astheprodut of
the probabilities of four subpaths:
0 (w
0 ;w
1
),
1 (w
1 ;w
2
),
2 (w
2 ;w
3 ), and
3
(w 3
;w 1
). Thus
p()=
0 (w
1 jw
0
)
1 (w
2 jw
1
)
2 (w
3 jw
2
)
3 (w
1 jw
3 )
Considernowthe event (I(w
1
)=1;I(w 2
)=1;I(w 3
)=1) or(w
1 ;w
2 ;w
3 ),
whihis theunionofall w 0
!w
1
routespassingthroughw
1 ;w
2 ;w
sine(w 1 ;w 2 ;w 3
)is anintrinsievent we an write
p((w 1 ;w 2 ;w 3 ))= X 0 2M 0 0 (w 1 jw 0 ) X 1 2M 1 1 (w 2 jw 1 ) X 2 2M 2 2 (w 3 jw 2 ) X 3 2M 3 3 (w 1 jw 3 ) whereM i
(i=0;1;2) is theset of all subpathsfrom w i
to w i+1
,and M
3 is
theset of all subpaths from w 3 to w 1 .But P 0 2M 0 0 (w 1 jw 0
) is simply
theprobabilityofreahingw 1
fromw
0
,or(w
1 jw 0 ), so p((w 1 ;w 2 ;w 3
))=(w
1 jw 0 ) (w 2 j w 1 ) (w 3 jw 2 ) (w 1 jw 3 ) =(w 1 jw 0 ) (w 2 j w 1 ) (w 3 jw 2
)1
sineallpaths passing throughw
3
terminateinw
1
.Therefore
p(I(w 3
)=1 jI(w 1
)=1;I(w 2
)=1)= p((w 1 ;w 2 ;w 3 )) p((w 1 ;w 2 )) = (w 1 jw 0 ) (w 2 jw 1 )(w 3 jw 2
)1
(w 1 jw 0 ) (w 2 jw 1
) 1
=(w 3 jw 2 ) =p(I(w 3
)=1jI(w 2
)=1)
If we replae I(w
1
) = 1 by ((w
0 ;w
2
)) for any subpath (w
0 ;w 2 ), and I(w 3
)=1by(e(w
2 ;w
0 2
)) forsome edgee(w
2 ;w
0 2
) thenwe obtain
COROLLARYA. For any CEG C with w2V(C)
p((e(w;w
0
))j((w
0
;w));(w))=p((e(w;w
0
))j(w))
Similarly,ifw 0 1
w
2
andwereplaeI(w
1
)=1by(e(w
1 ;w 0 1 ))(X(w 1 )=x
1
forsome x 1
21;:::k(w 1
)), andI(w 3
)=1 by(e(w
3 ;w 0 3 )) (X(w 3 )=x
3 for
some x
3
21;:::k(w 3
))thenwe obtain
COROLLARYB. For any CEG C,w
1 ;w
2 ;w
3
2V(C), with w 0 1 w 2 w 3 p((e(w 3 ;w 0 3
))j(e(w
1 ;w 0 1 ));(w 2
))=p((e(w
3 ;w
0 3
))j(w
2 ))
p(X(w 3
)=x 3
jX(w
1
)=x
1 ;I(w
2
)=1)=p(X(w
3 )=x
3 jI(w
2 )=1)
PROOF OF THEOREM 1. Sine the atoms of F( Cj) are routes in C
,
F(Cj ) = F(C
), and so the onditioned model is route ompatible
inC
is given byp
()=p(j) whihequals
p((w 0 );(e(w 0 ;w 1 ));(e(w 1 ;w 2
));:::(e(w m
;w 1
))j)
=p ((w 0 );(e(w 0 ;w 1 ));(e(w 1 ;w 2
));:::(e(w m ;w 1 ))) =p ((w 0 ))p ((e(w 0 ;w 1
) j(w
0 )) p ((e(w 1 ;w 2
) j(w
0 );(e(w 0 ;w 1 ))
::: p
((e(w m
;w 1
))j(w
0
);::: (e(w
m 1
;w m
)))
whihbyCorollary Aof theLimitedMemory Lemmaequals
1p
((e(w
0 ;w
1
) j(w
0 ))p ((e(w 1 ;w 2
) j(w
1 )) p ((e(w 2 ;w 3
) j(w
2
))::: p
((e(w m
;w 1
))j(w
m )) = Y e(w;w 0 )2 p ((e(w;w 0
)) j(w))
Soletting
e
(w 0
jw)=p
((e(w;w
0
)) j(w)) we have
p(j)= Y e(w;w 0 )2 e (w 0 j w)
andtheprobabilitymassfuntionp(j)hasthemonomialproperty.Hene
C
is avalidSCEG.
Notethat asstatedinsetion 2.3
p
((e(w;w
0
))j(w))=p((e(w;w
0
))j;(w))
=
p(j(w);(e(w;w
0 )))
p(j(w))
p((e(w;w
0
))j(w))
=
p(j(w);(e(w;w
0 )))
p(j(w))
e
(w 0
jw)
PROOF OF LEMMA 1. X;Y partition the set of atoms of C, and sine
(C), X;Y also partition the set of atoms of C
. Consider
arbi-trary events
X
and
Y
from the sets f
X
g and f
Y
g (partitions of the
set of atoms of C), and the event
A
=
X
\
Y
. Then p(
X
j ) =
p ( X ); p( Y
j )=p
(
Y
) and p(
X ;
Y
j)=p(
A
j)=p
( A )= p ( X ; Y
).The statement
p( X
; Y
j)=p(
X
j)p(
Y
isthentrue ifand onlyifthestatement
p
( X
; Y
)=p
( X
) p
( Y
)
istrue;and thisholdsforall X
2f
X
g
Y
2f
Y
g.
PROOF OF LEMMA 2. By denition I(w) = 0 ) X(w) = 0, so if
I(w) = 0, X(w) is known and so in partiular is independent of all other
variables. If I(w)=1then X(w 0
)=0 forall w 0
2D
(w)\U
(w).
So onsiderI(w) =1 and w
0
2U(w). We now use the monomial property
to showthat X(w)qX
U(w)
j(I(w)=1).
The primitive probability
e (w
+
j w) is a fator of the probabilityp()
foranumberofroutes.Consideroneoftheseroutesanddenotethesubpath
of this route between w
0
and w by (w
0
;w). Then by Corollary A of the
LimitedMemoryLemma, wean write
e
(w +
jw)=p((e(w;w
+
))j(w))
=p((e(w;w
+
))j((w
0
;w));(w))
and this is learly true for all subpaths between w
0
and w. But the set of
these subpaths is in1:1 orrespondene with theset of vetorsof values of
X U(w)
whih are onsistent with the topology of the SCEG and with the
event I(w)=1.The event (w) an bewritten asI(w)=1, and theevent
(e(w;w
0
))asX(w)=x forsome x>0.Hene
p(X(w)=x jI(w)=1)=p(X(w)=x jX
U(w)
;I(w)=1)
and X(w)qX
U(w)
j(I(w)=1).
Combiningthiswiththetwo previousresultswetherefore have
X(w)qX
D
(w)
jI(w)
PROOFOFTHEOREM2. (1) SUFFICIENTCONDITIONSFORINDEPENDENCE
Consideran SCEG C,and two positions w
1 ;w
2
2V(C), where w
2
6w
1 ,
and byonstrution I(w
1
)60; I(w 2
)60.
Supposethereexistsastalkdownstreamofw
1
andupstreamofw
2 .Label
thispositionw.Thenallpathspassingthroughw 1
passthroughw,allpaths
passingthroughw
2
pass throughw,and neessarilyw
1
ww
2 .Also
p(X(w 2
)=x
2
jX(w
1 )=x
1
)=p(X(w
2 )=x
2
jX(w
1 )=x
1
;I(w)=1)
=p(X(w
2 )=x
2
sinew 1
ww
2
,and usingCorollaryBof theLimitedMemory Lemma
=p(X(w
2 )=x
2
jX(w
1 )=x
0 1
;I(w)=1)
=p(X(w
2 )=x
2
jX(w
1 )=x
0 1
)
forall valuesx 1 ;x 0 1 ofX(w 1
)and all values x 2 of X(w 2 ) ) X(w 1
)qX(w
2 )
If w 2
is itself a stalk,then we replae I(w) =1 by I(w 2
) =1 in the above
argument withthesame result.
So a suÆient onditionfor X(w
1
)qX(w
2
) is that eitherw 2
is itself a
stalk,orthere existsa stalkdownstream ofw 1
andupstreamof w
2 .
(2) NECESSARYCONDITIONSFORINDEPENDENCE
LetX(w
1
)qX(w
2
) (andsineI(w) isafuntionofX(w),X(w
1
)qI(w 2
)
andI(w 1
)qI(w 2
)).LetthesetofroutesofCbepartitionedintofoursubsets.
Call a route Type A if it passes through w
2
, but notthrough w
1
,Type B
ifitpassesthroughneither w 1
norw
2
,Type C ifit passesthroughbothw 1
andw
2
,and TypeDifitpassesthroughw 1
,butnotthroughw 2
.Ourproof
proeeds asfollows:
(a) We show that we must have w
1
w
2
(ie. the set of Type C routes is
non-empty.
(b) We show that every route intersets with every other route at some
point downstream of w
0
and upstream of w
1
. If two w
0
! w
1
routes
share no verties exept w
0
and w
1
, we all them internally disjoint (see
forexample[11 ℄). Soweansaythatthereannotbetwointernallydisjoint
diretedroutes inC
()Weshowthatthereannotbetwointernallydisjointroutesinthe
undi-reted version of the CEG (the CEG withits edge arrows removed), and
thattherefore there mustbe astalk betweenw
0
and w
1 .
(d)Weshowthat eitherw 1
isa stalkorw 2
isa stalk,orthereexistsastalk
downstream of w
1
and upstreamofw
2 .
(e) Finally we show that if w
1
is a stalk then there must also either be a
stalkat w 2
orastalk downstreamof w
1
and upstreamof w
2 .
(a) Suppose that w
1
6 w
2
(and reall that w
2 6 w 1 ). Then p(I(w 2
) = 1 j I(w
1
) = 1) 0. I(w
1
) q I(w 2
) ) p(I(w
2
) = 1) 0
)I(w
2
)0. Thisisimpossiblebyonstrution.Thereforew
1
w
2 .
(b) We rst show that eah Type C route intersets with every other
routeat w 1
orat w
2
µ1
λ2
λ2
p
p
p
0
0
1
p = arbitrary probability in (0,1)
ω1
ω2
Fig11.Illustrationfor Type CandType B routes
Let
1
be a Type C route, and
1 (w
1 ;w
2
) the subpath oinident with
1
between w
1
and w
2
. If the set of Type B routes is non-empty then let
2
be a Type B route whihdoesnotintersetwith
1
(ie.
2
and
1
have no
positions inommon).
ConsideradistributionP whih(1)assigns aprobabilityof1to every edge
of the subpath
1 (w
1 ;w
2
), and (2) an arbitrary probabilitygreater than 0
and less than 1 to eah edge of the route
2
(Figure 11). If our SCEG
is minimal and
2
does not interset with
1
then this is always possible.
UnderP,assignment(1) givesusthat
p(I(w 2
)=1 jI(w 1
)=1)=1
andI(w 1
)qI(w 2
) impliesthat underthisP
p(I(w 2
)=1 jI(w 1
)=0)=1 ) p(I(w
2
)=0 jI(w 1
)=0)=0
Butassignment(2) givesusthatp(I(w
2
)=0jI(w 1
)=0)>0 ¸
The assumption I(w
1
)qI(w
2
) is inompatible withthe assignments of (1)
and (2). But these assignments arealways possibleif
2
doesnot interset
with
1
.Hene
2
mustinterset with
λ4
λ3
λ3
ω1
ω2
λ4
µ5
µ5
λ3
&
µ5
Fig12.Illustrationfor non-TypeC routes:w4nw3m
HeneeahTypeC routeintersetswithevery TypeB routeat somepoint
downstream of w
1
and upstreamof w
2
. Also eah Type C route intersets
witheveryTypeAroute(atw
2
),witheveryTypeDroute(atw 1
)andwith
every other Type C route (atbothw
1
and w
2 ).
WenowonsiderroutesthatarenotofTypeC.Ifthesetofnon-TypeC
routesisnon-emptylet
3 ;
4
bemembersofthissetwhihdonotinterset
exeptat w
0
and w
1
.Let (w
1 ;w
2
) be asubpathbetweenw
1
and w
2 .
From above both
3
and
4
must interset with. Let
3
interset with
only at the positions w 31
;:::w 3m
, where w
31
w
3m
; and let
4
in-terset with only at the positions w
41 ;:::w
4n
, where w
41
w
4n .
Without lossof generalitylet w 1
w
31
w
41
w
2
,sothat
3
ould be a
routeofTypeB orType D,and
4
ould be aroute ofTypeAorTypeB.
Suppose that w
4n
w
3m
(Figure 12). Consider the subpath
5 (w
1 ;w
2 )
whihoinideswithfromw
1 tow
31 (ifw
31 6=w
1
),oinideswith
3 from
w 31
to w
3m
, and oinides with from w
3m
to w
2
. This subpath
5 does
not interset with the route
4
. This is impossible sine every route in C
intersetswithevery (w
1 ;w
2
) subpath.
Suppose therefore that w
3m
w
4n