ContentslistsavailableatSciVerseScienceDirect
Artificial
Intelligence
in
Medicine
j o ur na l h o mep a g e :w w w . e l s e v i e r . c o m / l o c a t e / a i i mIdentifying
significant
edges
in
graphical
models
of
molecular
networks
Marco
Scutari
a,∗,
Radhakrishnan
Nagarajan
baGeneticsInstitute,UniversityCollegeLondon,DarwinBuilding,GowerStreet,WC1E6BTLondon,UnitedKingdom
bDivisionofBiomedicalInformatics,DepartmentofBiostatistics,CollegeofPublicHealth,UniversityofKentucky,725RoseStreet,MultidisciplinaryScienceBldg,230F,Lexington,KY
40536-0082,USA
a
r
t
i
c
l
e
i
n
f
o
Articlehistory:
Received6December2011
Receivedinrevisedform
14December2012 Accepted16December2012 Keywords: Graphicalmodels Bayesiannetworks Modelaveraging L1norm Molecularnetworks
a
b
s
t
r
a
c
t
Objective:Modellingtheassociationsfromhigh-throughputexperimentalmoleculardatahasprovided
unprecedentedinsightsintobiologicalpathwaysandsignallingmechanisms.Graphicalmodelsand networkshaveespeciallyproventobeusefulabstractionsinthisregard.Adhocthresholdsareoften usedinconjunctionwithstructurelearningalgorithmstodeterminesignificantassociations.Thepresent studyovercomesthislimitationbyproposingastatisticallymotivatedapproachforidentifyingsignificant associationsinanetwork.
Methodsandmaterials:Anewmethodthatidentifiessignificantassociationsingraphicalmodelsby
estimatingthethresholdminimisingtheL1normbetweenthecumulativedistributionfunction(CDF)
oftheobservededgeconfidencesandthoseofitsasymptoticcounterpartisproposed.Theeffectiveness oftheproposedmethodisdemonstratedonpopularsyntheticdatasetsaswellaspubliclyavailable experimentalmoleculardatacorrespondingtogeneandproteinexpressionprofiles.
Results:Theimprovedperformanceoftheproposedapproachisdemonstratedacrossthesyntheticdata
setsusingsensitivity,specificityandaccuracyasperformancemetrics.Theresultsarealso demon-stratedacrossvaryingsamplesizesandthreedifferentstructurelearningalgorithmswithwidelyvarying assumptions.Inallcases,theproposedapproachhasspecificityandaccuracycloseto1,whilesensitivity increaseslinearlyinthelogarithmofthesamplesize.Theestimatedthresholdsystematically outper-formscommonadhoconesintermsofsensitivitywhilemaintainingcomparablelevelsofspecificityand accuracy.Networksfromexperimentaldatasetsarereconstructedaccuratelywithrespecttotheresults fromtheoriginalpapers.
Conclusion:Currentstudiesusestructurelearningalgorithmsinconjunctionwithadhocthresholds
foridentifyingsignificantassociationsingraphicalabstractionsofbiologicalpathwaysandsignalling mechanisms.Suchanadhocchoicecanhavepronouncedeffectonattributingbiologicalsignificanceto theassociationsintheresultingnetworkandpossibledownstreamanalysis.Thestatisticallymotivated approachpresentedinthisstudyhasbeenshowntooutperformadhocthresholdsandisexpectedto alleviatespuriousconclusionsofsignificantassociationsinsuchgraphicalabstractions.
© 2012 Elsevier B.V.
1. Introductionandbackground
Graphicalmodels[1,2]areaclassofstatisticalmodelswhich combinetherigourofaprobabilisticapproachwiththeintuitive representationofrelationshipsgivenbygraphs.Theyarecomposed byasetX={X1,X2,...,XN}ofrandomvariablesdescribingthe quan-titiesofinterestandagraphG=(V,E)inwhicheachnodeorvertex
v
∈Vis associatedwithoneoftherandomvariables inX(they areusuallyreferredtointerchangeably).Theedgese∈Eareused toexpressthedependencerelationshipsamongthevariablesinX. Thesetoftheserelationshipsisoftenreferredtoasthedependence∗Correspondingauthor.
E-mailaddresses:[email protected](M.Scutari),[email protected]
(R.Nagarajan).
structureofthegraph. Differentclassesof graphsexpressthese
relationshipswithdifferentsemantics,whichhaveincommonthe principlethatgraphicalseparationoftwoverticesimpliesthe con-ditionalindependenceofthecorrespondingrandomvariables[2].
ThetwoexamplesmostcommonlyfoundinliteratureareMarkov
networks[3,4],whichuseundirectedgraphs,andBayesiannetworks
(BNs)[5,6],whichusedirectedacyclicgraphs.
Inprinciple,therearemanypossiblechoicesforthejoint
dis-tribution of X, dependingon the natureof the data. However,
literaturehasfocusedmostlyontwocases:thediscretecase[3,7], inwhichbothXandtheXiaremultinomialrandomvariables,and
thecontinuouscase[3,8],inwhichXismultivariatenormaland
theXiareunivariatenormalrandomvariables.Intheformer,the parametersofinterestaretheconditionalprobabilitiesassociated witheachvariable,usuallyrepresentedas conditional probabil-itytables;inthelatter,theparametersofinterestarethepartial
0933-3657© 2012 Elsevier B.V.
http://dx.doi.org/10.1016/j.artmed.2012.12.006
Open access under CC BY license.
correlationcoefficientsbetweeneachvariableand itsneighbours (i.e.theadjacentnodesinG).
TheestimationofthestructureofthegraphGiscalledstructure
learning[1,4],andinvolvesdeterminingthegraphstructurethat
encodestheconditionalindependenciespresentinthedata.Ideally itshouldcoincidewiththedependencestructureofX,oritshould atleastidentifyadistributionascloseaspossibletothecorrectone intheprobabilityspace.Severalalgorithmshavebeenpresentedin theliteratureforthisproblem,thankstotheapplicationofmany results from probability, information and optimisation theory. Despitedifferencesintheoreticalbackgroundsand terminology, theycanallbegroupedintoonlythreeclasses:constraint-based
algorithms,thatarebasedonconditionalindependencetests;
score-based algorithms,that are based on goodness-of-fit scores; and
hybridalgorithms,thatcombinetheprevioustwoapproaches.For
someexamplesseeBrombergetal.[9],CasteloandRoverato[10], Friedmanetal.[11],Larra ˜nagaetal.[12]andTsamardinosetal.
[13].
Ontheotherhand,thedevelopmentoftechniquesforassessing thestatisticalrobustnessofnetworkstructureslearnedfromdata (e.g.thepresenceofartefactsarisingfromnoisydata)hasbeen limited.Structurelearningalgorithmsarecommonlystudied
mea-suring differences from the true (known) structure of a small
numberofreferencedatasets[14,15].Theusefulnessofsuchan approachininvestigatingnetworkslearnedfromreal-worlddata setsislimited,sincethetruestructureoftheirprobability distribu-tionisunknown.
Amoresystematicapproachtomodelassessment,andin partic-ulartotheproblemofidentifyingstatisticallysignificantfeatures inanetwork, hasbeendevelopedbyFriedmanet al.[16]using bootstrap resampling [17] and model averaging [18]. It can be
summarisedasfollows:
1.Forb=1,2,...,m:
(a)sampleanewdatasetX∗bfromtheoriginaldataXusingeither parametricornonparametricbootstrap;
(b)learnthestructureofthegraphicalmodelGb=(V,Eb)from
X∗b.
2.Estimatetheprobabilitythateachpossibleedgeei,i=1,...,kis presentinthetruenetworkstructureG0=(V,E0)as
ˆ P(ei)= 1 m m
b=1 1{ei∈Eb}, (1)where1{ei∈Eb}istheindicatorfunctionoftheevent{ei∈Eb}(i.e. itisequalto1ifei∈Eband0otherwise).
Theempiricalprobabilities ˆP(ei)areknownasedgeintensitiesor
arcstrengths,andcanbeinterpretedasthedegreeofconfidence
thateiispresentinthenetworkstructureG0 describingthetrue dependencestructureofX.1However,theyaredifficulttoevaluate,
becausetheprobabilitydistributionofthenetworksGbinthespace
ofthenetworkstructuresisunknown.Asaresult,thevalueofthe confidencethreshold(i.e.theminimumdegreeofconfidenceforan edgetobesignificantandthereforeacceptedasanedgeofG0)isan unknownfunctionofboththedataandthestructurelearning algo-rithm.Thisisaseriouslimitationintheidentificationofsignificant edgesandhasledtotheuseofadhoc,pre-definedthresholdsin spiteoftheimpactonmodelassessmentevidencedbyseveral stud-ies[16,19].AnexceptionisNagarajanetal.[20],whoseapproach willbediscussedbelow.
1 Theprobabilities ˆP(e
i)areinfactanestimatoroftheexpectedvalueofthe{0,
1}randomvectordescribingthepresenceofeachpossibleedgeinG0.Assuch,they
donotsumtooneandaredependentononeanotherinanontrivialway.
Apartfromthislimitation,Friedman’sapproachisverygeneral andcanbeusedinawiderangeofsettings.Firstofall,itcanbe appliedtoanykindofgraphicalmodelwithonlyminor adjust-ments(forexample,accountingforthedirectionoftheedgesin BNs,seeSection4).Nodistributionalassumptiononthedatais requiredinadditiontotheonesneededby thestructure learn-ingalgorithm.Noassumptionismadeonthelatter,either,soany score-based, constraint-based or hybridalgorithm can beused. Furthermore,parallelcomputingcaneasilybeusedtooffsetthe additionalcomputationalcomplexityintroducedbymodel averag-ing,becausebootstrapisembarrassinglyparallel.
Inthis paper,wepropose astatisticallymotivatedestimator fortheconfidencethresholdminimisingtheL1normbetweenthe cumulativedistributionfunction(CDF)oftheobservedconfidence levelsandtheCDFoftheconfidencelevelsoftheunknown net-workG0.Subsequently,wedemonstratetheeffectivenessofthe proposedapproachbyre-investigatingtwoexperimentaldatasets fromNagarajanetal.[20]andSachsetal.[21].
2. Selectingsignificantedges
Considertheempiricalprobabilities ˆP(ei)definedinEq.(1),and denotethemwith ˆp={pi,ˆ i=1,...,k}.ForagraphwithNnodes, k=N(N−1)/2.Furthermore,considertheorderstatistic
ˆ
p(·)=(ˆp(1),pˆ(2),...,pˆ(k)) with pˆ(1)pˆ(2)···pˆ(k) (2) derivedfrom ˆp.Itisintuitivelyclearthatthefirstelementsof ˆp(·) aremorelikelytobeassociatedwithnon-significantedges,and thatthelastelementsof ˆp(·)aremorelikelytobeassociatedwith significantedges.Theidealconfiguration ˜p(·)of ˆp(·)wouldbe ˜ p(i)=
1 if e(i)∈E0 0 otherwise , (3) thatisthesetofprobabilitiesthatcharacterisesanyedgeaseither significantor non-significant withoutany uncertainty. In other words,˜
p(·)={0,...,0,1,...,1}. (4) Sucha configurationarisesfromthelimitcase inwhich allthe networksGbhaveexactlythesamestructure.Thismayhappenin
practicewithaconsistentstructurelearningalgorithmwhenthe samplesizeislarge[22,23].
Ausefulcharacterisationof ˆp(·)andp(˜·)canbeobtainedthrough theempiricalCDFsoftherespectiveelements,
Fpˆ(·)(x)= 1 k k
i=1 1{pˆ(i)<x} (5) and Fp˜(·)(x)=⎧
⎨
⎩
0 if x∈(−∞,0) t if x∈[0,1) 1 if x∈[1,+∞) . (6)Inparticular,tcorrespondstothefractionofelementsof ˜p(·)equal tozeroandisameasureofthefractionofnon-significantedges.At thesametime,tprovidesathresholdforseparatingtheelements of ˜p(·),namely
e(i)∈E0⇔p˜(i)>Fp˜−(1·)(t) (7)
whereFp˜−(1·)(t)=infx∈R{Fp˜(·)(x)t}isthequantilefunction[24].
Moreimportantly,estimatingt fromdataprovidesa statisti-callymotivated threshold for separatingsignificant edgesfrom non-significantones.Inpractice,thisamountstoapproximating
Fig.1. TheempiricalCDFFpˆ(·)(left),theCDFFp˜(·)(centre)andtheL1normbetweenthetwo(right),shadedingrey. theideal, asymptoticempirical CDF Fp˜(·) withits finite sample
estimateFpˆ(·).Suchanapproximationcanbecomputedinmany
differentways,dependingonthenormusedtomeasurethe dis-tancebetweenFpˆ(·)andFp˜(·)asafunctionoft.Commonchoicesare
theLpfamilyofnorms[25],whichincludestheEuclideannorm, andCsiszar’sf-divergences[26],whichincludeKullback–Leibler divergence.
TheL1norm L1(t; ˆp(·))=
|Fpˆ(·)(x)−Fp˜(·)(x;t)|dx (8)
appearstobeparticularlysuitedtothisproblem;anexampleis showninFig.1.Firstofall,notethatFpˆ(·) ispiecewiseconstant,
changingvalueonlyatthepoints ˆp(i);thisdescendsfromthe defi-nitionofempiricalCDF.Therefore,fortheproblemathandEq.(8)
simplifiesto
L1(t; ˆp(·))=
xi∈{{0}∪pˆ(·)∪{1}}
|Fpˆ(·)(xi)−t|(xi+1−xi), (9)
whichcanbecomputedinlineartimefrom ˆp(·).Itsminimisationis alsostraightforwardusinglinearprogramming[27].Furthermore,
comparedtothemorecommonL2norm
L2(t; ˆp(·))=
[Fpˆ(·)(x)−Fp˜(·)(x;t)] 2 dx (10) ortheL∞norm L∞(t; ˆp(·))= max x∈[0,1]{|Fpˆ(·)(x)−Fp˜(·)(x;t)|}, (11) theL1 normdoesnotplaceasmuchweightonlargedeviations comparedtosmallones,makingitrobustagainstawidevarietyof configurationsof ˆp(·).Thentheidentificationofsignificantedgescanbethoughtof eitherasaleastabsolutedeviationsestimationoranL1approximation oftheform
ˆ
t=argmin
t∈[0,1] L1
(t; ˆp(·)) (12)
followedbytheapplicationofthefollowingrule:
e(i)∈E0⇔pˆ(i)>Fpˆ−1
(·)(ˆt). (13)
Notethat,eventhoughedgesareindividuallyidentifiedas signif-icantornon-significant,theyarenotidentifiedindependentlyof eachotherbecause ˆtisafunctionofthewhole ˆp(·).
Asimpleexampleisillustratedbelow.
Example1. Consideragraphicalmodelbasedonanundirected graphGwithnodesetV={A,B,C,D}.Thesetofpossibleedgesof Gcontains6elements:(A,B),(A,C),(A,D),(B,C),(B,D)and(C,D). Supposethatwehaveestimatedthefollowingconfidencevalues:
ˆ
pAB=0.2242, pACˆ =0.0460, pADˆ =0.8935, pBCˆ =0.3921,
ˆ pBD=0.7689, pCDˆ =0.9439. (14) Then ˆp(·)={0.0460,0.2242,0.3921,0.7689,0.8935,0.9439}and Fpˆ(·)(x)=
⎧
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎨
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎩
0 if x∈(−∞,0.0460) 1 6 if x∈[0.0460,0.2242) 2 6 if x∈[0.2242,0.3921) 3 6 if x∈[0.3921,0.7689) 4 6 if x∈[0.7689,0.8935) 5 6 if x∈[0.8935,0.9439) 1 if x∈[0.9439,+∞) . (15)TheL1normtakestheform L1(t; ˆp(·))=|0−t|(0.0460−0)+
1 6−t
(0.2242−0.0460) +
2 6−t
(0.3921−0.2242) +
3 6−t
(0.7689−0.3921) +
46−t
(0.8935−0.7689) +
56−t
(0.9439−0.8935) +|1−t|(1−0.9439) (16)
and is minimised for ˆt=0.4999816. Therefore, an edge is
deemed significant if its confidence is strictly greater than
Fp−ˆ(1
·)(0.4999816)=0.3921,or,equivalently,ifithasconfidenceof
atleast0.7689;only(A,D),(B,D)and(C,D)satisfythiscondition (Fig.2).
3. Simulationresults
Wetestedtheproposedapproachonsyntheticdatasetsusing threeestablishedperformancemeasures:sensitivity,specificityand
accuracy.Sensitivityisgivenbytheproportionofedgesofthetrue
networkstructurethathavebeencorrectlyidentifiedassignificant.
Specificityisgivenbytheproportionoftheedgesmissingfromthe
truenetworkstructurethathavebeencorrectlyidentifiedas non-significant.Accuracyisgivenbytheproportionofedgescorrectly identifiedaseithersignificantornon-significantoverthesetofall
Fig.2.TheCDFsFpˆ(·)andFp˜(·)(ˆt),respectivelyinblackandgrey(left),andtheL1(t; ˆp(·))norm(right)fromExample1. possibleedges.Tothatend,wegenerated400datasetsofvarying
sizes(100,200,500,1000,2000,5000,10,000and20,000)from
threediscreteBNscommonlyusedasbenchmarks:
•theALARMnetwork[28],anetworkdesignedtoprovideanalarm messagesystemforintensivecareunitpatientmonitoring.Its
truestructure iscomposedby37nodesand 46edges(of666
possibleedges),anditsprobabilitydistributionhas509 parame-ters;
•theHAILFINDER network[29],anetwork designedtoforecast severesummerhailinnortheasternColorado.Itstruestructure iscomposedby56nodesand66edges(of1540possibleedges), anditsprobabilitydistributionhas2656parameters;
•theINSURANCEnetwork[30],anetworkdesignedtoevaluatecar insurancerisks.Itstruestructureiscomposedby27nodesand 52edges(of351possibleedges),anditsprobabilitydistribution has984parameters.
Threedifferentstructurelearningalgorithmswereconsidered: •theIncrementalAssociationMarkovBlanket(IAMB)
constraint-basedalgorithm[31].IAMBwasusedtolearntheMarkovblanket ofeachnodeasapreliminarysteptoreducethenumberofits candidateparentsandchildren;anetworkstructuresatisfying theseconstraintsisthenidentifiedasintheGrow–Shrink algo-rithm[32].Conditionalindependencetestswereperformedusing ashrinkagemutualinformationtest[33]with˛=0.05.Suchatest, unlikethemorecommonasymptotic2mutualinformationtest, isvalidandhasbeenshowntoworkreliablyevenonsmall sam-ples.An˛=0.01wasalsoconsidered;however,theresultswere notsignificantlydifferentfrom˛=0.05andwillnotbediscussed separatelyinthispaper;
•theHillClimbing(HC)score-basedalgorithmwiththeBayesian Dirichletequivalentuniform(BDeu)scorefunction,theposterior distributionofthenetworkstructurearisingfromauniformprior distribution[7].Theequivalentsamplesizewassetto10.Thisis thesameapproachdetailedinFriedmanetal.[16],althoughthey consideredonly100(insteadof500)bootstrapsamplesforeach scenario;
•theMax–MinHillClimbing(MMHC)hybridalgorithm[13],which
combinestheMax–MinParentsandChildren(MMPC)andHC.
TheconditionalindependencetestusedinMMPCandthescore
functionsusedin HCare theones illustratedin theprevious points.
Theperformancemeasureswereestimatedforeachcombinationof network,samplesizeandstructurelearningalgorithmasfollows:
1.asampleoftheappropriatesizewasgeneratedfromeitherthe
ALARM,theHAILFINDERortheINSURANCEnetwork;
2.weestimatedtheconfidencevalues ˆpforallpossibleedgesfrom 200and500nonparametricbootstrapsamples.Sinceresultsare verysimilar,theywillbediscussedtogether;
3.we estimated theconfidence threshold ˆt, and identified
sig-nificantand non-significant edgesin the network. Note that
thedirectionoftheedgespresentinthenetworkstructureis effectivelyignored,becausetheproposedapproachfocusesonly thoseedges’presence.Significantedgeswerethenusedtobuild anaveragednetworkstructure;
4.wecomputedsensitivity,specificityandaccuracycomparingthe averagednetworkstructuretothetrueone,whichisknownfrom theliterature.
Thesestepswererepeated50timesinordertoestimateboththe performancemeasuresandtheirvariability.
Allthesimulationsand thethresholdsestimation were
per-formed with the bnlearn package [34,35] for R [36], which
implementsseveralmethodsforstructurelearning,parameter esti-mationandinferenceonBNs(includingtheapproachproposedin Section2).
Theaveragevaluesofsensitivity,specificity,accuracyand ˆtfor thenetworksacrossvarioussamplesizes(n)areshowninFigs.3
(IAMB),4(HC)and5(MMHC).Sincethenumberofparametersis non-constantacrossthenetworks,anormalisedratioofthesizeof thegeneratedsampletothenumberofparametersofthenetwork (i.e.n/p)isusedasareferenceinsteadoftherawsamplesize(i.e.n). Intuitively,asampleofsizeofn=1000maybelargeenoughto esti-matereliablyasmallnetworkwithfewparameters,sayp=100,but itmaybetoosmallforalargernetworkwithp=10,000.Onarelated note,densernetworks(i.e.networkswithalargenumberofedges
comparedtothenumberofnodes)usuallyhaveahighernumber
ofparametersthansparserones(i.e.networkswithfewedges). Severalinterestingtrendsemergefromtheestimated quanti-ties.Asexpected,sensitivityincreasesasthesamplesizegrows. ThisprovidesanempiricalverificationthatthecombinationofHC andBDeisindeedconsistent,asprovedbyChickering[23].No anal-ogousresultexistsforIAMBorMMHC,althoughintuitivelytheir sensitivityshouldimproveaswellwiththesamplesizeduetothe consistencyoftheconditionalindependencetestsusedbythose algorithms.Moreover,evenwhenn/pisextremelylowa substan-tialproportionofthenetworkstructurecanbecorrectlyidentified. Whenn/pisatleast0.2(i.e.1observationevery5parameters),HC
successfullyrecoversfromabout50%(forALARMandINSURANCE)
to75%(forHAILFINDER)ofthetruenetworkstructure.Incontrast,
IAMBandMMHCsuccessfullyrecoverfromabout45%to50%of
HAILFINDER,butonlyabout26%to40%ofALARMand19%to30%
n
p
Accur
ac
y
Specificity
Sensitivity
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(g)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(h)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(i)
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(d)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(e)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(f)
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(a)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(b)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(c)
Fig.3.Averagesensitivity,specificityandaccuracyofIAMBfortheALARM,HAILFINDERandINSURANCEnetworksovern/p.Barsrepresent95%confidenceintervals,and
thedottedverticallineisn=p.
thesparsity-inducingeffectofshrinkagetests[37],whichincrease specificityatthecostofsensitivity.Forvaluesofn/pgreaterthan1 (i.e.moreobservationsthanparameters)theincreasein sensitiv-ityslowsdownforallcombinationsofnetworksandalgorithms, reachingaplateau.
Overall, sensitivity seems to have an hyperbolic behaviour,
growingveryrapidlyforn/p1andthenconverging asymptoti-callyto1forn/p>1.Thusweexpectittoincreaselinearlyona log(n/p)scale.Theslowerconvergencerateobservedforthe INSUR-ANCEnetworkcomparedtotheothertwonetworksislikelytobea consequenceofitshighedgedensity(1.92edgespernode)relative
toALARM(1.24)andHAILFINDER(1.17).Slowerconvergencemay
alsobeanoutcomeofinherentlimitationsofstructurelearning algorithmsinthecaseofdensenetworks[1,38].
Furthermore,bothspecificityandaccuracyarecloseto1forall thenetworksandthesamplesizesconsideredintheanalysis,even
atverylown/pratios.Suchhighvaluesarearesultofthelow
num-beroftrueedgesinALARM,HAILFINDERandINSURANCEcompared
totherespectivenumbersofpossibleedges.Thisistruein
partic-ularfortheALARMandHAILFINDERnetworks.Thelowervalues
observedfortheINSURANCEnetworkcanbeattributedagaintothe inherentlimitationsofstructurelearningalgorithmsinmodelling densenetworks.Thesparsity-inducingeffectofshrinkagetestsis againevidentforbothIAMBandMMHC;bothspecificityand accu-racyactuallydecreaseslightlyasn/pgrowsandtheinfluenceof shrinkagedecreases.
Itisalsoimportanttonotethat,asshowninFig.6,theaverage valueoftheconfidencethreshold ˆt doesnot exhibitany appar-enttrendasafunctionofn/p.Inaddition,itsvariabilitydoesnot appeartodecreaseasn/pgrows.Thissuggeststhattheoptimal ˆt
dependsstronglyonthespecificsampleusedintheestimationof theconfidencevalues ˆp,evenforrelativelylargesamples.However,
n
p
Accur
ac
y
Specificity
Sensitivity
ALARM
HAILFINDE
R
INSURANCE
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40
(g)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(h)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(i)
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(d)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(e)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(f)
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(a)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(b)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(c)
Fig.4. Averagesensitivity,specificityandaccuracyofHCfortheALARM,HAILFINDERandINSURANCEnetworksovern/p.Barsrepresent95%confidenceintervals,andthe
dottedverticallineisn=p.
specificity,sensitivityandaccuracyestimatesappearontheother handtobeverystable(allconfidenceintervalsshowninFigs.3–5
areverysmall).
From Fig.6, it is alsoapparent that the threshold estimate ˆ
t canbesignificantly lowerthan 1even forhighvalues of n/p. Thisbehaviourisobservedconsistentlyacrossthethreenetworks (ALARM,HAILFINDER,INSURANCE).Theseresultsareinsharp con-trastwithadhocthresholdscommonly foundintheliterature, whichareusuallylarge[e.g.0.8in16].Alargethresholdcan cer-tainlybeusefulinexcludingnoisyedges,whichmayresultfrom artefactsatthemeasurementanddynamicallevelsandfromfinite sample-sizeeffects.However,whilealargeadhocthresholdcan certainlyminimisefalsepositives,itisalsoexpectedtoaccentuate falsenegatives.Sucha conservativechoicecanhavea profound impactonthe network topology, resulting in artificiallysparse networks.ThethresholdestimatorintroducedinSection2achieves
a goodtrade-offbetweenincorrectlyidentifyingnoisyedgesas significantanddisregardingsignificantones.Asanexample,the differenceinsensitivity,specificityandaccuracybetweenthe esti-matedthreshold ˆtandseverallarge,adhocones(t=0.70,0.80,0.90, 0.95)forHCisshowninFig.7(thecorrespondingplotsforIAMB andMMHCaresimilar,andareomittedforbrevity).The thresh-old ˆtsystematicallyoutperformstheadhocthresholdsintermsof sensitivity,inparticularforlowvaluesofn/p.Thedifference pro-gressivelyvanishesasn/pgrows.Allthresholdshavecomparable levelsofspecificityandaccuracy.
Onarelatednote,falsenegativesacrossadhocthresholdsmay alsobeattributedtothefactthatedgesareconsideredasseparate, independententitiesasfarasthechoiceofthethresholdis con-cerned–i.e.a0.99thresholdisexpectedtoidentifyassignificant about1in100edgesinthenetwork.However,inabiologicalsetting thestructureofthenetworkisanabstractionfortheunderlying
n
p
Accur
ac
y
Specificity
Sensitivity
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(g)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(h)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(i)
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(d)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(e)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(f)
0.2 0. 4 0. 6 0. 8 1.0 0 10 20 30 40(a)
0.2 0. 4 0. 6 0. 8 1.0 0 2 4 6(b)
0.2 0. 4 0. 6 0. 8 1.0 0 5 10 15 20(c)
Fig.5. Averagesensitivity,specificityandaccuracyofMMHCfortheALARM,HAILFINDERandINSURANCEnetworksovern/p.Barsrepresent95%confidenceintervals,and
thedottedverticallineisn=p.
functional mechanisms; as an example, consider the signalling
pathwaysin atranscriptionalnetwork.Insuchacontext,edges areclearlynotindependent,butappearinconcertalongsignalling pathways.Thisinterdependenceisaccountedforintheproposed approach(thatisbasedonthefullsetˆpofestimatedconfidence
values),but it is not commonly consideredin choosing ad hoc
thresholds.Forinstance,edgesappearing withindividual confi-dencevaluesfarbelowthe[0.80,1]rangemaynotnecessarilybe identifiedassignificantbyanadhocthreshold.However,the pro-posedapproachrecognisestheirinterplayandcorrectlyidentifies themassignificant.Thisaspect,alongwiththestrongdependence betweentheoptimal ˆtandtheactualsamplethenetworkislearned from,maydiscouragetheuseofanapriorioradhocconfidence thresholdinfavourofmorestatisticallymotivatedalternatives.
4. Applicationstomolecularexpressionprofiles
In order to demonstrate the effectiveness of the proposed
approachonexperimentaldatasets, wewillexaminetwogene
expressiondatasetsfromNagarajanetal.[20]andSachsetal.[21]. Alltheanalyseswillbeperformedagainwiththebnlearnpackage. FollowingImotoetal.[39],wewillconsidertheedgesoftheBNs disregardingtheirdirectionwhendeterminingtheirsignificance. Edgesidentifiedassignificantwillthenbeorientedaccordingtothe directionobservedwiththehighestfrequencyinthebootstrapped networksGb.Whilesimplistic,thiscombinedapproachallowsthe
proposedestimatortohandletheedgeswhosedirectioncannot
bedeterminedbythestructurelearningalgorithmpossiblydueto scoreequivalentstructures[40].
n
p
t
^
ALARM
HAILFINDER
INSURANCE
MMHC
HC
IAMB
0.2 0.4 0.6 0.8 0 10 20 30 40(g)
0 2 4 6(h)
0 5 10 15 20(i)
0.2 0.4 0.6 0.8 0 10 20 30 40(d)
0 2 4 6(e)
0 5 10 15 20(f)
0.2 0.4 0.6 0.8 0 10 20 30 40(a)
0 2 4 6(b)
0 5 10 15 20(c)
Fig.6.Averageestimatedsignificancethreshold(ˆt)fortheALARM,HAILFINDERandINSURANCEnetworksovern/p.Barsrepresent95%confidenceintervals.
4.1. Differentiationpotentialofagedmyogenicprogenitors
Inarecentstudy[20]theinterplaybetweencrucialmyogenic (Myogenin,Myf-5,Myo-D1), adipogenic(C/EBP˛,DDIT3,FoxC2, PPAR),andWnt-relatedgenes(Lrp5,Wnt5a)orchestratingaged myogenicprogenitordifferentiationwasinvestigatedby Nagara-janetal.usingclonalgeneexpressionprofilesinconjunctionwith BNstructurelearningtechniques.Theobjectivewastoinvestigate possiblefunctionalrelationshipsbetweenthesediverse differen-tiationprogramsreflectedbytheedgesintheresultingnetworks. TheclonalexpressionprofilesweregeneratedfromRNAisolated
across 34 clones of myogenic progenitors obtained across
24-month-oldmiceandreal-timeRT-PCRwasusedtoquantifythe
geneexpression.Suchanapproachimplicitlyaccommodates inher-entuncertaintyingeneexpressionprofilesandjustifiedthechoice ofprobabilisticmodels.
In the same study, the authors proposed a non-parametric
resampling approach to identify significant functional
relationships. Starting from Friedman’s definition of
confi-dencelevels(Eq.(1)),theycomputedthenoise floordistribution ˆ
f={fˆ1,fˆ2,...,fkˆ}oftheedgesbyrandomlypermutingthe
expres-sionofeach gene andperformingBNstructure learning onthe
resultingdatasets.Anedgeeiwasdeemedsignificantif ˆpi>max
{fˆ l∈ˆf}
ˆ
fl.
In addition to revealing several functional relationships docu-mentedinliterature,thestudyalsorevealednewrelationshipsthat wereimmunetothechoiceofthestructurelearningtechniques. Theseresultswereestablishedacrossclonalexpressiondata
nor-malisedusingthreedifferenthousekeepinggenesandnetworks
learnedwiththreedifferentstructurelearningalgorithms. Theapproachpresentedin[20]hastwoimportantlimitations. First,thecomputationalcostofgeneratingthenoisefloor distri-butionmaydiscourageitsapplicationtolargedatasets.Infact, thegenerationoftherequiredpermutationsofthedataandthe subsequentstructurelearning(inadditiontothebootstrap resam-plingandthesubsequentlearningrequiredfortheestimationof
diff
erence
ALARM
0.0 0.1 0.2 0.3 0 10 20 30 40(a)
0 10 20 30 40(b)
0 10 20 30 40(c)
t = 0.70
t = 0.80
t = 0.90
t = 0.95
diff
erence
HAILFINDER
0.00 0.05 0.10 0.15 0.20 0.25 0 2 4 6(d)
0 2 4 6(e)
0 2 4 6(f)
t = 0.70
t = 0.80
t = 0.90
t = 0.95
n
p
diff
erence
INSURANCE
0.00 0.05 0.10 0.15 0.20 0 5 10 15 20(g)
0 5 10 15 20(h)
0 5 10 15 20(i)
t = 0.70
t = 0.80
t = 0.90
t = 0.95
Fig.7. Differenceinsensitivity,specificityandaccuracybetweentheestimatedthreshold ˆtandseveraladhocones(t=0.70,0.80,0.90,0.95)forHCovern/p.
ˆ
p)essentiallydoublesthecomputationalcomplexityofFriedman’s approach.Second,alargesamplesizemayresultinanextremely lowvalueofmax(ˆf),andthereforeinalargenumberoffalse posi-tives.
Inthepresentstudy,were-investigatethemyogenic progen-itorclonalexpressiondatanormalisedusinghousekeepinggene GAPDHwiththeapproachoutlinedinSection2andtheIAMB algo-rithm.Itisimportanttonotethatthisstrategywasalsousedin theoriginalstudy[20],henceitschoice.Theorderstatistic ˆp(·)was computedfrom500bootstrapsamples.TheempiricalCDFFpˆ(·),the
estimatedthresholdandthenetworkwiththesignificantedgesare showninFig.8.
Alledgesidentifiedassignificantintheearlierstudy[20]across thevariousstructurelearningtechniquesandnormalisation tech-niqueswerealsoidentifiedbytheproposedapproach(seeFig.3D in[20]).IncontrasttoFig.8,theoriginalstudyusingIAMBand nor-malisationwithrespecttoGAPDHalonedetectedaconsiderable
numberofadditionaledges(seeFig.3Ain[20]).Thusitisquite possiblethattheapproachproposedinthispaperreducesthe num-beroffalsepositivesandspuriousfunctionalrelationshipsbetween thegenes.Furthermore,theapplicationoftheproposedapproach inconjunctionwiththealgorithmfromImotoetal.[39]reveals directionalityoftheedges,incontrasttotheundirectednetwork reportedbyNagarajanetal.[20].
4.2. Proteinsignallinginflowcytometrydata
Inalandmarkstudy,Sachsetal.[21]usedBNsforidentifying causal influences in cellular signalling networks from
simulta-neous measurement of multiple phosphorylated proteins and
phospholipidsacrosssinglecells.Theauthorsusedabatteryof per-turbationsinadditiontotheunperturbeddatatoarriveatthefinal
network representation. Agreedy searchscore-based algorithm
Fig.8.TheempiricalCDFFpˆ(·)forthemyogenicprogenitorsdatafromNagarajanetal.[20](ontheleft),andthenetworkstructureresultingfromtheselectionofthe
significantedges(ontheright).TheverticaldashedlineintheplotofFpˆ(·)representsthethresholdF
−1 ˜ p(·)(ˆt).
Fig.9.TheempiricalCDFof ˆp(·)fortheflowcytometrydatafromSachsetal.[21](ontheleft),andthenetworkstructureresultingfromtheselectionofthesignificantedges
(ontheright).TheverticaldashedlineintheplotofFˆp(·)representsthethresholdF
−1 ˜ p(·)(ˆt). accommodatesforvariationsinthejointprobabilitydistribution acrosstheunperturbedandperturbeddatasetswasusedtoidentify theedges[41].Moreimportantly,significantedgeswereselected usinganarbitrarysignificancethresholdof0.85(seeFig.3,[21]).A detailedcomparisonbetweenthelearnednetworkandfunctional relationshipsdocumentedintheliteraturewaspresentedinthe
samestudy.
We investigated theperformance of theproposed approach
inidentifyingsignificantfunctionalrelationshipsfromthesame
experimental data. However, we limit ourselves to the data
recorded without applying any molecular intervention, which
amountto854observationsfor11variables.Wecompareand con-trastourresultstothose obtainedusing anarbitrarythreshold of0.85.Thecombinationofperturbedandnon-perturbed obser-vationsstudiedinSachsetal.[21]cannotbeanalysedwithour approach,becauseeachsubsetofthedatafollowsadifferent prob-abilitydistributionandthereforethereisnosingle“true”network G0.Analysisoftheunperturbeddatausingtheapproachpresented inSection2revealstheedgesreportedintheoriginalstudy.The resultingnetworkisshowninFig.9alongwithFpˆ(·)andthe
esti-matedthreshold.Fromtheplot ofFpˆ(·) we canclearly seethat
significantandnon-significantedgespresentwidelydifferent lev-elsofconfidence,tothepointthatanythresholdbetween0.4and 0.9resultsinthesamenetworkstructure.This,alongwiththevalue oftheestimatedthreshold(ˆp(i)0.93),showsthatthenoisinessof thedatarelativetothesamplesizeislow.Inotherwords,the sam-pleisbigenoughforthestructurelearningalgorithmtoreliably selectthesignificantedges.Theedgesidentifiedbytheproposed
methodwerethesameasthoseidentifiedby[21]usinggeneral
stimulatorycuesexcludingthedatawithinterventions(seeFig. 4Ain[21],Supplementaryinformation).Incontrastto[21],using Imotoetal.[39]approachinconjunctionwiththeproposed thresh-oldingmethodwewereabletoidentifythedirectionsoftheedges inthenetwork.Thedirectionscorrelatedwiththefunctional rela-tionshipsdocumentedinliterature(Table3,[21],Supplementary information)aswellaswiththedirectionsoftheedgesinthe net-worklearnedfrombothperturbedandunperturbeddata(Fig.3,
[21]).
5. Conclusions
Graphicalmodelsandnetworkabstractionshaveenjoyed
con-siderableattentionacrossthebiologicalandmedicalcommunities. Suchabstractionsareespecially usefulin decipheringthe
inter-actions between the entities of interest from high-throughput
observationaldata.Classicaltechniquesforidentifyingsignificant edgesintheresultinggraphrely onadhocthresholdingofthe edgeconfidenceestimatedfromacrossmultipleindependent
real-isationsof networkslearned fromthegiven data.Largead hoc
thresholdvaluesareparticularlycommon,andarechoseninan efforttominimise noisyedgesin theresulting network. While usefulinminimisingfalsepositives,suchachoicecanaccentuate falsenegativeswithpronouncedeffectonthenetworktopology.
The present study overcomesthis caveatby proposing a more
straightforwardandstatisticallymotivatedapproachfor identify-ingsignificantedgesinagraphicalmodel.Theproposedestimator
levelsand theCDFoftheirasymptotic,idealconfiguration.The effectivenessoftheproposedapproachisdemonstratedonthree syntheticdatasets[28–30]andongeneexpressiondatasetsacross twodifferentstudies[20,21].However,theapproachisdefinedina moregeneralsettingandcanbeappliedtomanyclassesofgraphical modelslearnedfromanykindofdata.
Acknowledgements
ThisworkwassupportedbytheUKTechnologyStrategyBoard
(TSB) and Biotechnology & Biological Sciences Research Coun-cil(BBSRC),grantTS/I002170/1(MarcoScutari)andtheNational
LibraryofMedicine,grantR03LM008853(Radhakrishnan
Nagara-jan).MarcoScutariwouldalsoliketothankAdrianaBroginifor proofreadingthepaperandprovidingusefulsuggestions.
References
[1]KollerD,FriedmanN.Probabilisticgraphicalmodels:principlesandtechniques.
Cambridge,MA,USA:MITPress;2009.
[2]PearlJ.Probabilisticreasoninginintelligentsystems:networksofplausible
inference.SanMateo,CA,USA:MorganKaufmann;1988.
[3]WhittakerJ.Graphicalmodelsinappliedmultivariatestatistics.Hoboken,NJ,
USA:Wiley;1990.
[4]EdwardsDI.Introductiontographicalmodelling.2nded.NewYork,USA:
Springer;2000.
[5]NeapolitanRE.LearningBayesiannetworks.EnglewoodCliffs,NJ,USA:Prentice
Hall;2003.
[6]KorbK,NicholsonA.Bayesianartificialintelligence.2nded. London,UK:
Chapman&Hall;2010.
[7]Heckerman D, Geiger D, Chickering DM. Learning Bayesian networks:
the combination of knowledge and statistical data. Machine Learning
1995;20(3):197–243,availableasMicrosoftTechnicalReportMSR-TR-94-09.
[8]GeigerD,HeckermanD.LearningGaussiannetworks,Tech.rep.,Microsoft
Research,Redmond,Washington,MicrosoftTechnicalReportMSR-TR-94-10;
1994.
[9]BrombergF,MargaritisD,HonavarV.EfficientMarkovnetworkstructure
dis-coveryusingindependencetests.JournalofArtificialIntelligenceResearch
2009;35:449–85.
[10]CasteloR,RoveratoA.ArobustprocedureforGaussiangraphicalmodelsearch
from microarraydatawith plarger thann.Journalof MachineLearning
Research2006;7:2621–50.
[11]FriedmanN,Pe’erD,NachmanI.LearningBayesiannetworkstructurefrom
massivedatasets:the“sparsecandidate”algorithm.In:LaskeyKB,PradeH,
editors.Proceedingsof15thconferenceonuncertaintyinartificialintelligence
(UAI).MorganKaufmann;1999.p.206–21.
[12]Larra ˜nagaP,SierraB,GallegoMJ,MichelenaMJ,PicazaJM.LearningBayesian
networksbygeneticalgorithms:acasestudyinthepredictionofsurvivalin
malignantskinmelanoma.In:KeravnouET,GarbayC,BaudRH,WyattJC,
edit-ors.Proceedingsofthe6thconferenceonartificialintelligenceinmedicinein
Europe(AIME).Springer;1997.p.261–72.
[13]TsamardinosI,BrownLE,AliferisCF.TheMax–MinHill-ClimbingBayesian
networkstructurelearningalgorithm.MachineLearning2006;65(1):31–78.
[14]Elidan G.Bayesian networkrepository;2001 http://www.cs.huji.ac.il/site/
labs/compbio/Repository
[15]MurphyP,AhaD.UCImachinelearningrepository;1995http://archive.ics.
uci.edu/ml
[16]FriedmanN,GoldszmidtM,WynerA.DataanalysiswithBayesiannetworks:
abootstrapapproach.In:LaskeyKB,PradeH,editors.Proceedingsofthe15th
annualconferenceonuncertaintyinartificialintelligence(UAI).Morgan
Kauf-mann;1999.p.206–15.
[17]EfronB,TibshiraniR.Anintroductiontothebootstrap.London,UK:Chapman
&Hall;1993.
CambridgeUniversityPress;2008.
[19]HusmeierD.Sensitivityandspecificityofinferringgeneticregulatory
inter-actions from microarray experiments with dynamic Bayesian networks.
Bioinformatics2003;19:2271–82.
[20]NagarajanR,DattaS,ScutariM,BeggsML,NolenGT,PetersonCA.Functional
relationshipsbetweengenesassociatedwithdifferentiationpotentialofaged
myogenicprogenitors.FrontiersinPhysiology2010;1(21):1–8.
[21]SachsK, Perez O, Pe’er D, Lauffenburger DA, NolanGP. Causal
protein-signalingnetworks derivedfrommultiparametersingle-cell data.Science
2005;308(5721):523–9.
[22]LauritzenSL.Graphicalmodels.Oxford,UK:OxfordUniversityPress;1996.
[23]ChickeringDM.Optimalstructureidentificationwithgreedysearch.Journalof
MachineLearningResearch2002;3:507–54.
[24]DeGrootMH,SchervishMJ.Probabilityandstatistics.4thed.Reading,MA,USA:
Addison-Wesley;2011.
[25]KolmogorovAN,FominSV.Elementsofthetheoryoffunctionsandfunctional
analysis.Rochester,NewYork,USA:GraylockPress;1957.
[26]CsiszárI,ShieldsP.Informationtheoryandstatistics:atutorial.Boston,MA,
USA:NowPublishersInc.;2004.
[27]NocedalJ,WrightSJ.Numericaloptimization.NewYork,USA:Springer-Verlag;
1999.
[28]BeinlichIA,SuermondtHJ,ChavezRM,CooperGF.TheALARMmonitoring
system:acasestudywithtwoprobabilisticinferencetechniquesforbelief
networks.In:HunterJ,CooksonJ,WyattJ,editors.Proceedingsofthe2nd
Euro-peanconferenceonartificialintelligenceinmedicine(AIME).Springer-Verlag;
1989.p.247–56.
[29]AbramsonB,BrownJ,EdwardsW,MurphyA,WinklerRL.Hailfinder:aBayesian
systemforforecastingsevereweather.InternationalJournalofForecasting
1996;12(1):57–71.
[30]BinderJ,KollerD,RussellS,KanazawaK.Adaptiveprobabilisticnetworkswith
hiddenvariables.MachineLearning1997;29(2-3):213–44.
[31]TsamardinosI,AliferisCF,StatnikovA.AlgorithmsforlargescaleMarkov
blanketdiscovery.In:RussellI,HallerSM,editors.Proceedingsofthe16th
internationalFloridaartificialintelligenceresearchsocietyconference.AAAI
Press;2003.p.376–81.
[32]MargaritisD.LearningBayesiannetworkmodelstructurefromdata.Ph.D.
the-sis.SchoolofComputerScience,Carnegie-MellonUniversity,Pittsburgh,PA,
AvailableasTechnicalReportCMU-CS-03-153;May2003.
[33]HausserJ,StrimmerK.EntropyinferenceandtheJames–Steinestimator,with
applicationtononlineargeneassociationnetworks.StatisticalApplicationsin
GeneticsandMolecularBiology2009;10:1469–84.
[34]ScutariM.bnlearn:Bayesiannetworkstructurelearning,Rpackageversion2.7;
2011http://www.bnlearn.com/
[35]ScutariM.LearningBayesiannetworkswiththebnlearnRpackage.Journalof
StatisticalSoftware2010;35(3):1–22.
[36]RDevelopmentCoreTeam.R:alanguageandenvironmentforstatistical
computing.Vienna, Austria:RFoundationforStatisticalComputing;2011
http://www.R-project.org
[37]Scutari M, Brogini A. Bayesian network structure learning with
per-mutation tests. Communications in Statistics – Theory and Methods
2012;41(16–17):3233–43.Specialissue“Statisticsforcomplexproblems:
per-mutationtestingmethodsandrelatedtopics”.Proceedingsoftheconference
“statisticsforcomplexproblems:themultivariatepermutationapproachand
relatedtopics”,Padova,June14–15,2010.
[38]DashD,DruzdzelMJ.Ahybridanytimealgorithmfortheconstructionofcausal
modelsfromsparsedata.In:LaskeyKB,PradeH,editors.Proceedingsofthe15th
conferenceonuncertaintyinartificialintelligence(UAI).MorganKaufmann;
1999.p.142–9.
[39]ImotoS,KimSY,ShimodairaH,AburataniS,TashiroK,KuharaS,etal.Bootstrap
analysisofgenenetworksbasedonBayesiannetworksandnonparametric
regression.GenomeInformatics2002;13:369–70.
[40]ChickeringDM.AtransformationalcharacterizationofequivalentBayesian
networkstructures.In:BesnardP,HanksS,editors.Proceedingsofthe11th
conferenceonuncertaintyinartificialintelligence(UAI).MorganKaufmann;
1995.p.87–98.
[41]CooperGF, Yoo C.Causaldiscoveryfroma mixtureofexperimentaland
observationaldata.In:LaskeyKB,PradeH,editors.Proceedingsof15th
confer-enceonuncertaintyinartificialintelligence(UAI).MorganKaufmann;1999.