City, University of London Institutional Repository
Citation:
Overton, C. E., Broom, M. ORCID: 0000-0002-1698-5495, Hadjichrysanthou, C.
and Sharkey, K. (2019). Methods for approximating stochastic evolutionary dynamics on
graphs. Journal of Theoretical Biology, 468, pp. 45-59. doi: 10.1016/j.jtbi.2019.02.009
This is the published version of the paper.
This version of the publication may differ from the final published
version.
Permanent repository link:
http://openaccess.city.ac.uk/id/eprint/21919/
Link to published version:
http://dx.doi.org/10.1016/j.jtbi.2019.02.009
Copyright and reuse: City Research Online aims to make research
outputs of City, University of London available to a wider audience.
Copyright and Moral Rights remain with the author(s) and/or copyright
holders. URLs from City Research Online may be freely distributed and
linked to.
ContentslistsavailableatScienceDirect
Journal
of
Theoretical
Biology
journalhomepage:www.elsevier.com/locate/jtb
Methods
for
approximating
stochastic
evolutionary
dynamics
on
graphs
Christopher
E.
Overton
a,∗,
Mark
Broom
b,
Christoforos
Hadjichrysanthou
c,
Kieran
J.
Sharkey
aaDepartmentofMathematicalSciences,UniversityofLiverpool,MathematicalSciencesBuilding,LiverpoolL697ZL,UK bDepartmentofMathematics,City,UniversityofLondon,NorthamptonSquare,LondonEC1V0HB,UK
c DepartmentofInfectiousDiseaseEpidemiology,SchoolofPublicHealth,ImperialCollegeLondon,StMary’sCampus,NorfolkPlace,LondonW21PG,UK
a
r
t
i
c
l
e
i
n
f
o
Articlehistory: Received20May2018 Revised7February2019 Accepted13February2019 Availableonline14February2019
Keywords:
Evolutionarygraphtheory Momentclosure Fixationprobability Network Markovprocess
a
b
s
t
r
a
c
t
Populationstructurecanhave asignificanteffectonevolution.Forsomesystemswithsufficient
sym-metry,analyticresultscanbederivedwithinthemathematicalframeworkofevolutionarygraphtheory
whichrelatetotheoutcomeoftheevolutionaryprocess.However,formorecomplicatedheterogeneous
structures,computationally intensivemethodsarerequiredsuchas individual-basedstochastic
simula-tions.Byadaptingmethodsfromstatisticalphysics,includingmomentclosuretechniques,wefirstshow
howtoderiveexistinghomogenisedpairapproximationmodelsandthe exactneutraldriftmodel.We
thendevelopnode-levelapproximationstostochasticevolutionaryprocessesonarbitrarilycomplex
struc-turedpopulationsrepresentedbyfinitegraphs,whichcancapturethedifferentdynamicsforindividual
nodes inthepopulation.Using theseapproximations, weevaluatethe fixationprobability ofinvading
mutantsforgiveninitialconditions,wherethedynamicsfollowstandardevolutionaryprocessessuchas
theinvasionprocess.Comparisonswiththeoutputofstochasticsimulationsrevealtheeffectivenessof
ourapproximationsindescribingthestochasticprocessesandinpredictingtheprobabilityoffixationof
mutantsonawiderangeofgraphs.Constructionofthesemodelsfacilitatesasystematicanalysisandis
valuableforagreaterunderstandingoftheinfluenceofpopulationstructureonevolutionaryprocesses.
© 2019TheAuthors.PublishedbyElsevierLtd.
ThisisanopenaccessarticleundertheCCBYlicense.(http://creativecommons.org/licenses/by/4.0/)
1. Introduction
Models of evolutionary dynamics were originally determinis-tic andassumed well-mixed populations in which every individ-ual of a giventype is identical.Stochastic models of thesefinite well-mixedpopulationshavebeenstudied(Moran,1958),however real populations are usually characterisedby a complicated rela-tionship structurebetweenindividuals(Zhangetal.,2007).To ac-countforthis,aclassofmathematicalmodelsknownas evolution-arygraphtheory havebeendevelopedwhichshowthat the pop-ulation structure can significantly influence the outcome of evo-lutionarydynamics (Liebermanetal., 2005; Traulsenand Hauert, 2010).Inthesemodels,structuredpopulations arerepresentedby finitegraphs,whereeachnoderepresentsanindividualinthe pop-ulation and relationshipsbetweenindividuals are represented by the edges ofthe graph. Stochastic evolutionary processes can be considered analytically and precise results can be derived for a
∗ Correspondingauthor.
E-mailaddress:[email protected](C.E. Overton).
number of simple graphs, such as the circle, star and complete graphs(Broometal., 2010;BroomandRychtáˇr,2008; Lieberman etal., 2005), mainly dueto their symmetry. Analytic approaches for investigating evolutionary dynamics on complex graphs have alsobeenproposed.However,suchmethodsareusuallylimitedby assumptionssuchaslargepopulations(Nowaketal.,2010;Ohtsuki etal.,2006)orarespecificallydesignedforinvestigating evolution-aryprocessesunderweakselection(Allenetal.,2017;Zhongetal., 2013),wheretheevolutionarygamehasonlyasmalleffecton re-productivesuccess.
Importantquantitiesofinterestsuchastheexactfixation prob-ability and time can, in principle, be obtained by solving the discrete-time difference equations of the underlying stochastic model (Hindersin et al., 2016), although this is only feasible for very small populations unless there are simplifying symmetries. Individual-based stochastic simulations(Barbosaetal., 2010; Ma-ciejewskietal.,2014)providenumericallyaccuraterepresentations ofthe evolutionary process on arbitrary graphs buthave limited scopeforgeneratingconceptualinsightsintothedynamicsontheir own.Theycanalsobecomputationallyexpensiveonlargergraphs,
https://doi.org/10.1016/j.jtbi.2019.02.009
butasapreciserepresentationoftheunderlyingstochasticmodel, theyallow usto evaluatetheaccuracy ofapproximatemodelsby comparison.
Here we develop approximations to the stochastic model by usinginsights from methods in statistical physics that have also been used extensively for epidemic modelling (Born and Green, 1946; Keeling and Eames, 2005; Kirkwood, 1947; Pellis et al., 2015; Sharkey, 2008; Sharkey et al., 2015). Such methods have been applied to develop pair approximations for evolutionary processes on graphs which satisfy the homogeneity assump-tion that all individuals can be considered identical and inter-changable(Hadjichrysanthouetal.,2012;HauertandSzabó,2005; Morita,2008; Penaetal., 2009; Szabó and Fath, 2007). However, theunderlying assumptions linking thesemodels tothe underly-ingstochastic dynamicsare not always clear.Onecontribution of thisworkistoderivethesemodelsexplicitlybyidentifyingthe re-quiredassumptions. The startingpoint forall ofour approxima-tions is to derive an equation to describe the time-evolution of thestateofanygivenindividualnode.Fromthisequation,various routestoapproximationbecomeapparentbyapplyingdifferent as-sumptions.We then investigate the applicability and accuracy of theresultingapproximationmethods.
Evolutionary graph theory is traditionally explored as a discrete-timestochastic model.While it ispossible to work with thesedynamics, it is easier to work with a continuous-time ap-proximationtotheprocess. Thecontinuous-timesystemis repre-sentedbyamasterequationdescribinghowtheprobabilityof be-ingineachsystemstatechanges.Fromthemasterequationwe ob-tainexactequations(withrespecttothecontinuous-timeprocess) fortheprobabilitiesofthestatesofindividualnodes(Theorem2.1). Theseequationscanthen beapproximated byadopting moment-closuremethods.Wefocusonevaluatingtheprobabilitythatatthe endoftheevolutionaryprocess,aninitialsubsetofmutantsplaced onthegraphwilltakeoverthewholepopulationand‘fixate’. Us-ing this continuous-time system is justified because the fixation probabilityandexpectedtime tofixationare identicalto thoseof theoriginaldiscrete-timeprocess.Withinthisframeworkwestudy whenaccurateapproximationscanbederived.
InSections2.1–2.3weintroducethestochasticevolutionary dy-namicsandthemasterequation, andderive adescriptionofhow node-levelquantitieschangeinthemasterequation.Wethen dis-cussanddevelop various techniquesthat can beused to approx-imate these systems ofequations in Section 3. Within these ap-proximationframeworks we derive the pairapproximation mod-elsusedintheliterature,whichwewillcallthehomogenisedpair approximation,andthe exact neutraldrift model,and build new node level approximation methods. InSection 4we demonstrate howthedifferentmethodscanbeusedtoapproximatethe dynam-icsof the original discrete-time process. Section 4.1 studies how the methods perform when approximating the fixation probabil-ityof asingleinitial mutantplacedon idealisedandon complex graphs.Section 4.2then showshow the methods perform when studyingtheevolutionarygamedynamicsin aHawk–Dove game. InSection5wediscusstheresultsobtainedfromthemethods de-velopedandtheinsightsthesecangive.
2. Thestochasticmodel
2.1.Stochasticevolutionarydynamics
We considera population whoserelationshipstructure is rep-resented by a strongly connected undirected graph (V, E) where
V=
{
1,2,...,N}
isthesetofnodesandEdenotesthesetofedges. Thiscanbe representedby an adjacencymatrixG,whereGi j =1 ifj is connected to i, and Gi j=0 otherwise, with Gii=0 foralli∈V.Weconsiderpopulationsconsistingoftwotypesof
individu-als,typeAandtypeB,eitherofwhichcanbeintheroleof invad-ingmutantinaresidentpopulation.Eachnode isoccupiedby ei-theranAoraBindividual.ThereforewecanletAi =1ifandonly ifnodei isoccupiedby anA individualandAi =0otherwise and letBi denotethesameforindividualsoftype B.SinceBi =1−Ai ,
thestate ofthesystemcan berepresented bythevalues ofAi at anygiven time. Ifthere exists an edge (i, j)∈E between nodesi, j∈V,thentheoffspring oftheindividualinnode jcanreplacethe individual innode iandviceversa.Tostudytheevolutionary dy-namicsbetweenthesetwo typesofindividual werequirea mea-sureof fitness. We can describe the fitness payoff received from interactionsbetweenindividualsbythefollowingpayoff matrix:
A B
A B
a b
c d
,
where an A individual obtains a payoff a when interacting with anotherAindividualandpayoffbwheninteractingwithaB indi-vidual.Similarly,aBindividualobtainspayoffscanddwhen inter-actingwithanAindividualandaBindividualrespectively.
Todefine fitnessbasedon thepayoff,following similar defini-tions in the literature (Hadjichrysanthou et al., 2011; Lieberman et al., 2005; Ohtsuki et al., 2006; Taylor et al., 2004; Traulsen andHauert,2010),thefitnessofeachindividualisassumedtobe
f= fback +wP,wherefback isthebackgroundfitnessofall individ-uals,Pistheaveragepayoff receivedfrominteractionswith neigh-bours,andw∈[0,∞)isa parameterwhichcontrolsthe contribu-tionofthegamepayoff tofitness.
The fitness of an A individual which occupies node j, fA j , is thereforegivenby
fA j =fback +wa
N
i =1Gi j Ai +b
N i =1Gi j Bi
N i =1Gi j
, (1)
andsimilarlythefitnessofaBindividualoccupyingnodejisgiven by
fBj =fback +wc N
i =1Gi j Ai +d
N i =1Gi j Bi
N i =1Gi j
. (2)
Inthespecialcaseofconstantfitness,wherethefitnessof individ-uals remains constantindependent oftheinteractions withother individuals,wetakethepayoff matrixas
A B
A B
r r
1 1
,
sothatAindividualshaverelativepayoff equaltor.
2.2. Themasterequation
Toapproximatethediscrete-timeevolutionaryprocess wefirst translate thediscrete-timesystemto an approximate continuous-timesystem. Todothiswe modeleach(replacement)eventusing a Poissonprocess.Therateatwhicheach eventhappensisequal totheprobabilityofthateventinthediscrete-timemodel. There-forethetotaleventpressurewillbethesumofallsuch probabil-ities,which isequaltoone,so thatthetime untilthenext event followsaPoissonprocesswithrateparameterone.Wethen deter-minewhicheventtakesplaceusingtherelevantprobability.Under thiscontinuous-timesystemthefixationprobability andexpected timetofixationwillbeidenticaltothoseofthediscrete-time sys-tem,sinceweusethesameprobabilitieswheneveraneventoccurs andtheexpectedtime betweeneventsisconstant. Thisis impor-tantbecausethesearethemainquantitiesofinterestin evolution-arydynamics.
We will use this system to build approximation methods to study the original discrete-time process. We choose to use continuous-time becauseit enables ustobuild a system of ordi-narydifferentialequationstoapproximatethedynamics,which al-lowustomakeuseofefficientnumericalsolversandenableusto derivesomeanalyticresults.
Since this evolutionary process is a continuous-time Markov process, we can constructa master equation to describe the dy-namics. Let Si =
(
s1,s2,...,sN)
be a state of the system, wherei∈
{
1,...,2N}
and where sj=1 if node j is a type A individ-ual and sj =0 otherwise. We define S1=(
0,0,...,0)
and S2N=(
1,1,...,1)
to be the statesconsisting of onlyB individuals and onlyAindividuals,respectively.We introduce a vectorp
(
t)
which representsthe probabilities ofeachsystemstateattimet.Thatis,theithentryofp(
t)
,pi (t),is theprobabilitythatthesystemisinstateSi attimet.This Marko-vianevolutionaryprocesshas2N possiblestatesandthetransitions betweenthemare governedby a2N ×2N transitionratematrixRwhoseentriesdependupon thegraphandupdatemechanismwe consider.
Wewritetherateofchangeinthestateprobabilitiesusingthe masterequationoftheMarkovprocess:
dp
dt =Rp. (3)
Suchanequationcanbeconstructedforanygraphundera Marko-vianupdatemechanism.Theabsorbingstatescorrespondtotheall typeBoralltypeAstates,S1 andS2N,soaregivenbyp1 andp2N.
Since we consider a strongly connected adjacency matrix G, provided we have atleast one type A and one type B it is pos-sible to get to either of the absorbing states and therefore from any mixed initial condition the system will always end up dis-tributed betweenthesetwo states.Wedefine the fixation proba-bilityPA
f ix
(
S(
i))
oftypeAfromaninitialstateS(i)tobethe proba-bilityofbeingintheallAabsorbingstate,thatisPA f ix
(
Si)
=limt→∞
(
p2N(
t)
|
pi(
0)
=1)
,wherepi (0)istheprobabilityofbeinginthestateSi attimet=0. SimilarlywedefinethefixationprobabilityoftypeBas
PB f ix
(
Si)
=limt→∞
(
p1(
t)
|
pi(
0)
=1)
.The computationalcost ofimplementingsystem(3)increases ex-ponentiallywith N (Hindersin etal., 2016), andthus the compu-tationofthe fixation probabilitybecomesinfeasible asthe popu-lation size increases.Therefore it is ofinterest to build approxi-mationmethods.Pairapproximationsofthemasterequationhave beendevelopedunderthehomogeneityassumptionthatallnodes ontheunderlyinggraphareidenticalandinterchangeable(Hauert andSzabó,2005;Szabó andFath,2007),whichcangiveinteresting
insightintotheevolutionarydynamics.Howeverthehomogeneity assumptionsmadeintheseapproximationsresultinthelossof in-sightintographandnode-specificdynamics,soweaimtodevelop approximationsofthemasterequationwhichcancapturethis in-formation.
2.3.Nodelevelequations
Weapproximatethemasterequationbyapproximatingthe dy-namicsofthestateprobabilitiesofindividualnodesinthe popula-tion.Thisismotivatedbyapproachesinstatisticalphysicsand epi-demicmodelling(BornandGreen,1946;Kirkwood,1947;Sharkey, 2008;Sharkeyetal., 2015), andfirstrequiresexactequations de-scribinghowtheprobabilityofeachnodebeingoccupiedbya cer-taintypechangeswithtime,whichcanbederivedfromthemaster equation(3).
Definition2.1. Let
χ
(
t j→i|
St)
denotetherateatwhichthe indi-vidualin nodej replaces the individualin node iat timet given thatthesystemisinstate Sattime t; wereferto thisasthe re-placementrate.Definition2.2. Xt
C denotes theeventthat thesetofnodesCisin stateXattimet;forexampleAt {i }istheeventthatnodeiisinthe typeAstateattimet.
Throughoutthispaperwe shallusetheshorthandBt
{i }At{j}X t C to representtheintersectionofeventsBt {i }∩At {j}∩XC t .
Theorem2.1. UnderanyMarkovianupdate mechanism,fora struc-turedpopulation represented bytheadjacency matrixG,therate of changeoftheprobabilitythattheindividualinnodeiisanA individ-ualis
dP
(
At {i })
dt =
N
j=1
X V\{i,j}
Gi j P
(
Bt {i }At {j}XV t \{i,j})
χ
(
t j→i|
Bt {i }A{t j}XV t \{i,j})
−
N
j=1
X V\{i,j}
Gi j P
(
At {i }B{t j}XV t \{i,j})
χ
(
t j→i|
At {i }B{t j}XV t \{i,j})
,(4)
where the sum over XV \{i,j } is over all possible states of the nodes
V\{i,j}.
Proof. SeeAppendixA.
This theorem can be applied to any update mechanism by choosing an appropriate definition for the replacement rate,
χ
(
tj→i
)
,whichweshalldefinefortheinvasionprocessasan ex-ample.Example2.1(Invasionprocess). Theinvasionprocessisan adapta-tionoftheMoranprocess(Moran,1958)tostructuredpopulations. Eacheventisdetermined by selectingan individual to reproduce withprobabilityproportionaltoitsfitness.Itproducesanidentical offspringwhichreplacesoneoftheconnectedindividualswhichis chosenuniformlyatrandom.Thereforetherateatwhichthe indi-vidualinnode jreplaces theindividual innodei attime tunder theinvasionprocessrulesisgivenby
χ
(
t j→i|
S)
=ft j
|
S Ft|
S1
kj , (5)
where ft j isthefitnessoftheindividualoccupyingnodejattimet,
Whencalculating
χ
(
t j→i)
inequation (4),wewillusethe fol-lowingexpressionforthefitnessoftheindividualatagivennodejattimet,
ft j = fback +wP
(
At {j})
a Ni =1Gi j P
(
At{i })
+bN i =1Gi j P(
Bt{i })
Ni =1Gi j
+wP
(
Bt {j})
c Ni =1Gi j P
(
At {i })
+dN
i =1Gi j P
(
Bt {i })
N i =1Gi j
, (6)
which is a sum of equations (1) and (2) weighted by the node probabilities. We use this definition because when we evaluate equation(6)giventhatthesystemisinaparticularstateS,as re-quiredbyequation(4),thevaluesofP
(
At {k })
andP(
Bt {k })
areeither 1or0,whichleadstothefitnessofnodejinthatparticularsystem stateequations(1)and(2).However,bydefiningfitnessintermsof thenode probabilities,thisallowsusto haveadescription of fit-nesswhichwecanapproximate(seeSections3.2and3.3).3. Approximatingthestochasticmodel
Inotherfields,suchasepidemiology,theconstructionof node-level equations such as equation (4) can lead to a hierarchy of momentequationswhereby theseequationsare writteninterms ofpair probabilities, pairs are written in terms of triples andso on, until the full system state size is reached and the hierarchy isclosed.Thisisusefulwhenwecan findappropriate closure ap-proximationsto closethishierarchy ata low order.However, we seethatsuchanapproachcannotbeusedherebecausewe condi-tionagainstthefullsystemstateinequation(4)whichmeansthat thefull systemsizeappears even atthefirst order.We therefore attempttofindothermethodstosimplifythissystemofequations. In this section we will describe three different techniques to derive approximations forthis system. The first technique yields a system of equations which become computationally infeasible insomecircumstances,butbyapplyinghomogeneityassumptions totheunderlying graph,wecan derivethe existingpair approxi-mationmodels currentlyusedin theliterature (Hadjichrysanthou et al., 2012; Hauert and Szabó, 2005; Morita, 2008; Pena et al., 2009;Szabó andFath, 2007)(Section3.1).Toreducecomputation costs,we then develop methodsbased onrestrictingthe number ofstateswhich weconditionagainstinthe replacementrate.We firstobtainamethodwhosecomputational complexityscales lin-early with the population size N and, after an appropriate scal-ing, approximates the fixation probability well on a wide range ofgraphs(Section3.2). Then,inSection 3.3,we obtaina method which,although itscaleswithN2,provides agoodapproximation tothe evolutionarydynamicsover thewhole time seriesfor var-ious graphs, and in particular provides a very accurate approxi-mationto the initial dynamicsofthe evolutionary processon all graphs.
3.1.Derivingthehomogenisedpairapproximationmodel
Onewayofsimplifying(4)istoassumethatthefitness ft j does not needto be normalised by the total fitness Ft in the replace-mentrate(e.g.asinequation(5)fortheinvasionprocess).This ap-proximationisjustifiedbecauseitdoesnotchangethefinalvalue to whichthe exact node-level equationsconverge (and therefore thefixation probability),and will only transformthe time series until fixation. Making this assumption, the node level equations simplifysothatwe onlysumoverthe neighboursofthe individ-ualthatweselectedbasedonfitness.Thatis,whenlookingatthe eventwherenode j replaces node i,if weare selecting ondeath weneedto conditionagainst thestate ofallneighbours ofi,and ifselecting onbirth weneed toconditionagainst thestate ofall
neighboursofj.As an example,weshall assumeherethat selec-tionoccursonbirthsothatwerequireconditioningonthe neigh-bourhoodofnodej,howeverwecanalsomakesimilararguments whenselectingondeath.Using
χ
¯ torepresentthismodificationofχ
in(4)andQtorepresentthenewprobabilitydistributionwith themodifiedtimeseriesweobtaindQ
(
At {i })
dt =
N
j=1
X Nj\{i}
Gi j Q
(
Bt {i }At {j}XNt j\{i })
χ
¯(
tj→i
|
Bt {i }A{t j}XNt j\{i })
−
N
j=1
X Nj\{i}
Gi jQ
(
At {i }Bt {j}XNt j\{i })
χ
¯(
tj→i
|
At {i }B{t j}XNt j\{i })
,(7)
where Nj is the neighbourhood ofnode j; i.e.all nodes that are connectedtoj.Tosolvethissystemexactlyrequiresthe develop-mentofequationsdescribinghowtheprobabilityofeachpossible neighbourhoodofnodeschanges.Thisinturnwouldleadtoa hier-archyofequationswhichiscomputationallysimilartothemaster equation.Howeveritispossibletodevelopapproximationmethods byassumingindependenceattheleveloflower-orderterms,such asindividualsorpairsofnodes,andapproximatingthe neighbour-hoodprobabilitiesasafunctionofthese.
For example, we can make a pair approximation by applying Bayes’Theoremandassumingstatisticalindependenceatthelevel ofpairstorewritetheneighbourhoodprobability intermsofpair probabilities.Applying Bayes’Theorem totheprobabilities onthe righthandsideofequation(7)weget
dQ
(
At {i })
dt =
N
j=1
X Nj\{i}
Gi j Q
(
At {j})
Q(
Bt {i }XNt j\{i }|
At
{j}
)
×
χ
¯(
t j→i|
B{t i }At {j}XNt j\{i })
−
N
j=1
X Nj\{i}
Gi j Q
(
Bt {j})
Q(
At {i }XNt j\{i }|
B t{j}
)
×
χ
¯(
t j→i|
A{t i }Bt {j}XNtj\{i }
)
. (8)If we assume statistical independence ofall nodesin the neigh-bourhoodofj,giventhestateofj,we canrewritethe neighbour-hoodprobabilityQ
(
At {j})
Q(
Bt {i }XNtj\{i }
|
At
{j}
)
asQ
(
At {j})
Q(
Bt {i }XNt j\{i }|
At
{j}
)
≈Q(
At {j})
Q(
Bt {i }|
At {j})
l∈Nj\{i }
Q
(
X{t l}|
At {j})
,whereX{t l}iseventwherenodelisinthesamestateasitisinthe eventXNt
j\{i }.Substitutingthisintoequation(8)gives dQ
(
At {i })
dt ≈
N
j=1
X Nj\{i}
Gi j Q
(
At {j})
Q(
Bt {i }|
At {j})
×
l∈Nj\{i }
Q
(
Xl t|
At {i })
χ
¯(
t j→i|
Bt {i }At {j}XNt j\{i })
−
N
j=1
X Nj\{i}
Gi j Q
(
Bt {j})
Q(
Ai|
Bt {I})
×
l∈Nj\{i }
Q
(
Xl t|
Bt {I})
χ
¯(
t j→i|
At {i }Bt {j}XNt j\{i })
.we can derive exact equationsdescribing pairs.For the probabil-ityP
(
Bt {i}At {j})
weobtaindP
(
Bt {i }At {j})
dt =
N
k =1
X V\{i,j,k}
Gjk P
(
Bt {i }Bt {j}At {k }XV t \{i,j,k })
×
χ
(
tk →j
|
Bt {i }Bt {j}At {k }XV t \{i,j,k })
−
N
k =1
X V\{i,j,k}
Gjk P
(
Bt {i }At {j}Bt {k }XV t \{i,j,k })
×
χ
(
tk →j
|
Bt {i }At {j}Bt {k }XV t \{i,j,k })
+
N
k =1
X V\{i,j,k}
Gik P
(
Bt {k }At {i }At {j}XV t \{i,j,k })
×
χ
(
tk →i
|
Bt {k }At {i }At {j}XV t \{i,j,k })
−
N
k =1
X V\{i,j,k}
GikP
(
At {k }Bt {i }At {j}XV t \{i,j,k })
×
χ
(
tk →i
|
At {k }Bt {i }A{t j}XV t \{i,j,k })
. (9)We can now apply the same assumption regarding total fitness thatweusedforthesinglenodeprobabilitiessothat
dQ
(
Bt {i }At {j})
dt =
N
k =1
X Nk\{i,j}
Gjk Q
(
Bt {i }B{t j}At {k }XNt k\{i,j})
×
χ
¯(
t k →j|
Bt {i }Bt {j}At {k }XNt k\{i,j})
−
N
k =1
X Nk\{i,j}
Gjk Q
(
Bt {i }A{t j}Bt {k }XNt k\{i,j})
×
χ
¯(
t k →j|
Bt {i }At {j}Bt {k }XNt k\{i,j})
+
N
k =1
X Nk\{i,j}
Gik Q
(
Bt {k }A{t i }At {j}XNt k\{i,j})
×
χ
¯(
t k →i|
Bt {k }At {i }At {j}XNt k\{i,j})
−
N
k =1
X Nk\{i,j}
Gik Q
(
At {k }B{t i }At {j}XNt k\{i,j})
×
χ
¯(
t k →i|
At {k }Bt {i }At {j}XNtk\{i,j}
)
. (10)Applying Bayes’ Theorem to the neighbourhood probability
Q
(
Bt {i}Bt {j}At {k}XNtk\{i,j}
)
weobtainQ
(
Bt {i }Bt {j}At {k }XNtk\{i,j}
)
=Q(
B t{j}At {k }
)
Q(
Bt {i }XNt k\{i,j}|
B t{j}At {k }
)
We can now assume statistical independence of the remaining nodesgiventhestateofjandksothat
Q
(
Bt {i }Bt {j}At {k }XNtk\{i,j}
)
≈Q(
B t{j}At {k }
)
Q(
Bt {i }|
Bt {j}At {k })
×
l∈Nk\{i,j}
Q
(
X{t l}|
B{t j}At {k })
.Sinceweknowthatnodei isconnectedtonodej wecanassume thatgiventhestateofnodej,thestateofnodeiisindependentof node k,andsimilarlythestate ofanynode intheneighbourhood ofkisindependentofnodej,whichgivesus
Q
(
Bt{i }Bt{j}At{k }XNtk\{i,j}
)
≈Q(
B t{j}A t
{k }
)
Q(
Bt{i }|
Bt{j})
×
l∈Nk\{i,j}
Q
(
X{t l}|
At {k })
.Substitutingthisintoequation(10)gives
dQ
(
Bt {i }At {j})
dt ≈
N
k =1
X Nk\{i,j}
Gjk Q
(
Bt {j}At {k })
Q(
Bt {i }|
Bt {j})
×
l∈Nk\{i,j}
Q
(
X{t l}|
A{t k })
χ
¯(
t k →j|
Bt {i }Bt {j}At {k }XNt k\{i,j})
−
N
k =1
X Nk\{i,j}
Gjk Q
(
At {j}Bt {k })
Q(
Bt {i }|
At {j})
×
l∈Nk\{i,j}
Q
(
X{t l}|
B{t k })
χ
¯(
t k →j|
Bt {i }At {j}Bt {k }XNt k\{i,j})
+
N
k =1
X Nk\{i,j}
Gik Q
(
At {i }Bt {k })
Q(
At {j}|
At {i })
×
l∈Nk\{i,j}
Q
(
X{t l}|
B{t k })
χ
¯(
t k →i|
At {i }At {j}Bt {k }XNt k\{i,j})
−
N
k =1
X Nk\{i,j}
Gik Q
(
Bt{i }At{k })
Q(
At{j}|
B t{i }
)
×
l∈Nk\{i,j}
Q
(
X{t l}|
A{t k })
χ
¯(
t k →i|
Bt {i }At {j}At {k }XNt k\{i,j})
.Whilethissystemisclosed,itscomputationalcomplexityincreases exponentiallywiththemaximum nodedegree ofthegraph,so it isnotnumericallyfeasibleforgraphswithhighlyconnectednodes. Whilethiscouldpotentiallybe addressedby introducing approxi-mations fornodes with high degree and this maylead to accu-ratemodels,herewecontinuetowards asimplified model.Todo this,wefollowthesameprocessasinepidemicmodelsandmake a homogeneity assumption by assuming that any pair is equally likelytobeinanygivenstate(Kissetal.,2017;Sharkey,2008);i.e.
Q
(
X{t i }|
Y{t j})
=Q(
Xt|
Yt)
forallpairs(i,j).ThisleadstodQ
(
At {i })
dt ≈
N
j=1
X Nj\{i}
Gi j Q
(
At {j})
Q(
Bt|
At)
k j−n XQ(
At|
At)
n X×
χ
¯(
t j→i|
Bt {i }At {j}XNt j\{i })
−
N
j=1
X Nj\{i}
Gi j Q
(
Bt {j})
Q(
At|
Bt)
n X+1Q(
Bt|
Bt)
k j−n X−1×
χ
¯(
t j→i|
A{t i }Bt {j}XNt j\{i })
,wherekjisthedegreeofnodejandnX isthenumberoftypeA in-dividualsinstateXN
j\{i }.Sincethetransitionrateonlydependson
thenumberoftypeAandtypeBindividualsintheneighbourhood ofnode j and not on their positions, the summand on the right handsideisequalforallstatesXN
j\{i }whichhavethesame
con-figurationofAandBindividuals.Thefrequencyofacertain neigh-bourhoodstate across all possible configurations is given by the binomialcoefficient,sothat
dQ
At {i }dt ≈
N
j=1 k j−1
n =0
Gij
kj −1
n
Q
At {j }QBt|
At k j−nQ
At|
At n×
χ
t j→i
n
−
N
j=1 k j−1
n =0
Gij
kj −1
n
Q
Bt {j }QAt|
Bt n +1QBt|
Bt k j−n −1×
χ
t j→i
n+1
,
where
χ
¯(
t A →B|
n)
isthe rateatwhichweselectone ofthetypearentypeAindividualsandkj−ntypeBindividualsinthe neigh-bourhoodoftheselectednode.
Since wehaveassumedthat anypairisequallylikely,this as-sumptiononly holdswhenevery node inthegraphformsk con-nections, whichare chosen atrandom. Thereforewe requirethat nodeiisequallylikelytobeconnectedtoanyothernode andall nodesare topologicallyequivalent, so that the probability that a givennodeoftypeBisconnectedtoxtypeAneighboursisgiven byabinomial distributionwithn=kand p=Q
(
At|
Bt)
.Therefore theprobabilityofanindividualbeingtypeAchangeswithratedQ
Atdt ≈kQ
At
|
Bt QBt×
k −1
n =0
k−1
n
Q
Bt|
At k−nQAt|
At nχ
t A →B
n
−kQ
(
Bt|
At)
QAt×
k −1
n =0
k−1
n
Q
At|
Bt n +1QBt|
Bt k −n −1χ
B t →A
n+1.
We can also apply theseassumptions to the pair-level equations to obtain a closed system of equations which are efficient to solvenumerically.Theresultingmodelisequivalenttothe model inMorita(2008),whichwasjustifiedbyusingtheassumptionthat thepopulation occupiesaregular graph,such that all individuals havedegreek,andthatallnodesaretopologicallyequivalent,such thateverypairofindividualsisequallylikelytobeconnected.We haveshownthatbyapplyingtheseassumptionstotheexact node-levelequationsequation(4)wecanderivethesemodels.
Similarly we can obtain a pair approximation model for the dynamics where we select on death by conditioning against the state of the neighbours of node i. Applying analogous as-sumptions to the previous example then leads to the model inHadjichrysanthou etal.(2012).Thesemodelshavebeenshown toyieldinteresting qualitativeresultsabouttherelative strengths ofdifferentstrategiesin evolutionarygames on graphs.However, thehomogeneityassumptionsmaderesultinlosingimportant as-pectsofthestructure,suchashowindividualnodesinthesystem canbehavedifferently.Inthenextsectionswewillattemptto de-velopapproximationmethodswhichcancapturethisnode-specific information.
As we alluded to earlier, a natural method would be to use equation (7) as a basis for this. However, difficulties in imple-mentingthismethodongeneralnetworksaswell asthe number of equations that result leads us to a different direction for the presentwork.
3.2.Anunconditionedfitnessapproximationmodel
Herewedevelopamethodwhichremovestheneedtoinclude theprobability ofwhole neighbourhoods byremoving the condi-tioninginthe replacementrate. Thiscausesthe replacementrate toonlydepend onthe marginalprobabilities ofthestate ofeach noderather thanthe full systemstate.This assumptionalso mo-tivated a model in Szabó and Fath (2007) in which the authors constructapopulation-levelapproximationdescribinghowthe ex-pectednumberofindividualsofeachtype changewithtime. Un-derthisassumption,equation(4)becomes
dP
(
At {i })
dt ≈
N
j=1
X V\{i,j}
Gi j P
(
Bt {i }At {j}XV t \{i,j})
χ
(
t j→i)
−
N
j=1
X V\{i,j}
Gi j P
(
At {i }Bt {j}XV t \{i,j})
χ
(
t j→i)
.Since
χ
(
t j→i)
isnowthesameforallsystemstates,dP
(
At {i })
dt ≈
N
j=1
Gi j P
(
Bt {i }A{t j})
χ
(
t j→i)
− N
j=1
Gi j P
(
At {i }B{t j})
χ
(
t j→i)
.AddingandsubtractingN j=1Gi j P
(
At {i }At {j})
χ
(
t j→i)
weobtaindP
(
At {i })
dt ≈
N
j=1
Gi j P¯
(
Bt {i }At {j})
χ
(
t j→i)
+Gi j P(
At {i }A{t j})
χ
(
t j→i)
−
N
j=1
Gi j P
(
At {i }B{t j})
χ
(
t j→i)
+Gi j P¯(
At {i }A{t j})
χ
(
t j→i)
≈
N
j=1
Gi j P
(
At {j})
χ
(
t j→i)
− N
j=1
Gi j P
(
At {i })
χ
(
t j→i)
,whichisaclosedsetofNequationswithatmostNsummandson theright handside.Therefore by defining P¯ asan approximation totheprobabilitydistributionPweobtaintheclosedsystem
dP¯
(
At {i })
dt =
N
j=1
Gi j P¯
(
At {j})
χ
(
t j→i)
− N
j=1
Gi j P¯
(
At {i })
χ
(
t j→i)
, (11)whichiseasytosolvenumericallyforanarbitrarygraph.
Example3.1(Neutraldrift). Inthespecialcaseofneutraldrift,i.e. when allindividuals haveidenticalfitness, theunconditioned fit-nessmodelgivestheexactfixationprobability.Withthedynamics ofthe invasion process underneutral driftwe obtain
χ
(
t j→i)
=1
Nk j,andthereforeequation(11)canbewrittenas
dP¯
(
At {i })
dt =
N
j=1
Gi j P¯
(
At {j})
1Nkj −
N
j=1
Gi j P¯
(
At {i })
1 Nkj ,which is equivalentto the exact node equation (4) forthe inva-sion process underneutral drift (Shakarian etal., 2013). The un-conditionedfitnessmodelisalsoexactforallupdatemechanisms under neutraldrift, butwe donot write the equationsexplicitly here.
As the population size N increases, the solution to equation
(11) moves further away from the exact fixation probability ob-tainedeitherbysolvingthemasterequation(3)orfromtheoutput ofstochasticsimulations.Toobtainareasonableapproximationto thefixationprobabilityfromagiveninitialconditionweconstruct ascalingfactorfortheconstant fitnesscasebycomparingthe ra-tiobetweenthesolutionofequation (11)ona completegraphto theexactfixationprobability onacompletegraph.Wechoosethe completegraphbecause theexactfixation probabilitycan be cal-culatedanalytically inthiscase. Whilst weconsider the constant fitnesscase,itmayalsobepossibletofindasuitablescalingfactor inthefrequencydependentfitnesscase,howeverusingacomplete graphmaynolongerbeappropriate becausetherelativestrength of different strategies in some games is strongly affected by the averagedegreeofthegraph(Ohtsukietal.,2006).
Example3.2(Invasionprocess). Forconstantfitnessunderthe dy-namicsoftheinvasionprocess,theexactfixationprobabilityform
initial mutantA individualson acomplete graphis equivalentto theMoranprobability(Liebermanetal.,2005):
ρ
=1−1 r m 1− 1 r N
.
thetwo.Intheconstantfitnesscasethiscanbedoneanalytically, withthescalingfactorforminitialmutantsgivenby
ρ
lim t→∞Ac
(
t)
=
1−1 rm 1−1 rN
1 r−1
−1+
1+m (r 2−1) N , (12)whereAc
(
t)
= 1 NN
j=1 ¯
P
(
At {j})
.Thederivationofthiscanbefound inAppendixB.
We can now define two methods for predicting the fixation probabilityunderanyMarkovianupdatemechanism.
• Method1(Unconditionedfitnessmodel)Solveequation(11)to providean approximationtothedynamicsoftheevolutionary process. (MATLAB code for solving the unconditioned fitness modelisprovidedassupplementarymaterial.)
• Method2(Scaledunconditionedfitnessmodel)Solveequation
(11)andthenuseascalingfactor,theratiooftheexactfixation probability andthesolution toequation (11) forthecomplete graph,to providean approximation tothe fixation probability fromagiveninitialcondition.
InSection4weinvestigatethenumericalperformanceofthese two methods. Note that for the purpose of this paper we have foundthescalingfactorequation (12)forMethod2underthe in-vasion process.However, themethodcanbe appliedtoother up-datemechanisms,such asdeath–birthwithselectionon birth,by findinganappropriatescalingfactor,whichcanbedonebysolving equation(11)(eitheranalyticallyornumerically)andcomparingto theexactfixationprobabilityonthecompletegraph.Forexample, seeHindersinandTraulsen(2015)fortheexactfixationprobability onacompletegraphundertheDB-Bdynamics.
3.3. Acontactconditioningapproximationmodel
In Section 3.2 we restricted the conditioning so that we only require the marginal probabilities of the individual nodes. How-ever, this removes a significant amount of information from the dynamics. In the evolutionary process, when considering a re-placementeventthetwo nodesofmostinterest arethenode se-lected forbirth and the node selected for death. Therefore, here we followa similarmethod butretainconditioning onthestates of these two key nodes. Since we restrict the conditioning to only thestatesofthe relevantcontact,when lookingatthe term
χ
(
tj→i
|
Bt {i }At {j}XV t \{i,j})
in equation (4) we condition only on the statesofthenodesiandjandobtainχ
(
tj→i
|
Bt {i }At {j}XV t \{i,j})
≈χ
(
t j→i|
B{t i }At {j})
. Undertheabovecondition,equation(4)becomesdP
(
At {i })
dt ≈
N
j=1
X V\{i,j}
Gi j P
(
Bt {i }At {j}XV t \{i,j})
χ
(
t j→i|
Bt {i }At {j})
−
N
j=1
X V\{i,j}
Gi j P
(
At {i }Bt {j}XV t \{i,j})
χ
(
t j→i|
A{t i }Bt {j})
. (13)To see the effect of this assumption on the rates, consider
χ
(
tj→i
|
Bt {i }At {j})
.Here we conditiononly against node i beingin state B and node j being in state A rather than against the en-tiresystemstate.Consequentlyinthefitnessequation(6)wehaveP
(
Bt {i })
=1andP(
At {j})
=1givingft j
|
Bt {i }At {j}=fback +wbTi j +a
l=i
Gjl P
(
At {l})
+bl=i
Gjl P
(
Bt {l})
N
l=1
Gjl
.
Inequation (13),the chance ofselecting node j is now indepen-dentofthestateXV t \{i,j}oftheremainingnodeswhichenablesthe equationtobereducedto
dP
(
At {i })
dt ≈
N
j=1
Gi j P
(
Bt {i }At {j})
χ
(
t j→i|
Bt {i }At {j})
−
N
j=1
Gi j P
(
At {i }Bt {j})
χ
(
t j→i|
At {i }Bt {j})
. (14)This gives an approximate equation for individuals in terms of pairs.Wethenneedtobuildequationstodescribepair-level prob-abilities. Similar methodologies have been followed to describe epidemicspropagatedonnetworks(Sharkey,2008;Sharkeyetal., 2015).
Applyingthesameconditioningtotheexactpairlevelequation
(9)weobtain
dP
(
Bt {i }At {j})
dt ≈
N
k =1
Gjk P
(
Bt {i }Bt {j}At {k })
χ
(
t k →j|
Bt {j}At {k })
−
N
k =1
Gjk P
(
Bt {i }At {j}Bt {k })
χ
(
t k →j|
A{t j}Bt {k })
+
N
k =1
Gik P
(
Bt {k }At {i }At {j})
χ
(
t k →i|
B{t k }At {i })
−
N
k =1
Gik P
(
At {k }Bt {i }At {j})
χ
(
t k →i|
At {k }Bt {i })
. (15)Similarformulaecanbe constructedforallpossiblepairs,writing pairs interms oftriples. In a similar way, triples can be written interms ofquadsandso on,up to thefull systemsize N which isthenclosed.Therefore,whenusingthismethodweobtaina hi-erarchysimilartotheBBGKY(Bogoliubov–Born–Green–Kirkwood– Yvon)hierarchy(BornandGreen,1946;Kirkwood,1947)in statis-ticalphysics. However, herethe hierarchyonly representsan ap-proximationto theoriginal dynamics.Solvingthis systemexactly isnosimplerthanevaluatingequation(3)sinceevaluatingthe hi-erarchyinfulliscomparableinnumericalcomplexity,sowewish tofindapproximationmethodstoreducethis.
Withthishierarchy,wecanapplytechniquesdevelopedin sta-tisticalphysics toapproximatehigher-order termsasfunctionsof lower-orderterms.Inparticularwecanclosethesystemof equa-tions(14)and(15)atthelevelofpairsbyapproximatingalltriples inequation (15) interms of pair-level andindividual-level prob-abilities. Similar techniqueshave beenapplied for many stochas-ticprocessesincludinginepidemiology(KeelingandEames,2005; Kiss etal., 2017; Sharkey, 2008; Sharkey etal., 2015) and evolu-tionarydynamics (Hauertand Szabó, 2005; Ohtsukiet al., 2006; Szabó andFath,2007)leadingtomodelswhichcanbenumerically evaluated.
Toclosethesystem,we requireafunctionalformthatcan ap-proximatetripleprobabilitiesintermsofindividualandpair prob-abilities. One method isto approximate a triple P
(
At {i }Bt {j}C{t k })
as theproductofallpossiblepairsamongthesenodesdividedbythe productofallindividuals,i.e.P
(
At {i }Bt {j}C{t k })
≈P(
At
{i }Bt {j}