Methods for approximating stochastic evolutionary dynamics on graphs

(1)

City, University of London Institutional Repository

Citation:

Overton, C. E., Broom, M. ORCID: 0000-0002-1698-5495, Hadjichrysanthou, C.

and Sharkey, K. (2019). Methods for approximating stochastic evolutionary dynamics on

graphs. Journal of Theoretical Biology, 468, pp. 45-59. doi: 10.1016/j.jtbi.2019.02.009

This is the published version of the paper.

This version of the publication may differ from the final published

version.

Permanent repository link:

http://openaccess.city.ac.uk/id/eprint/21919/

Link to published version:

http://dx.doi.org/10.1016/j.jtbi.2019.02.009

Copyright and reuse: City Research Online aims to make research

outputs of City, University of London available to a wider audience.

Copyright and Moral Rights remain with the author(s) and/or copyright

holders. URLs from City Research Online may be freely distributed and

linked to.

(2)

ContentslistsavailableatScienceDirect

Journal

of

Theoretical

Biology

journalhomepage:www.elsevier.com/locate/jtb

Methods

for

approximating

stochastic

evolutionary

dynamics

on

graphs

Christopher

E. Overton

a,∗

_,

_Mark

_Broom

b

_,

_Christoforos

_{Hadjichrysanthou}

c

_,

Kieran

J. Sharkey

a

a_Department_of_Mathematical_Sciences,_University_of_Liverpool,_Mathematical_Sciences_Building,_Liverpool_L69_7ZL,_UK b_Department_of_Mathematics,_City,_University_of_London,_Northampton_Square,_London_EC1V_0HB,_UK

c _Department_of_Infectious_Disease_{Epidemiology,}_School_of_Public_Health,_Imperial_College_London,_St_Mary’s_Campus,_Norfolk_Place,_London_W2_1PG,_UK

a

r

t

i

c

l

e

i

n

f

o

Articlehistory: Received20May2018 Revised7February2019 Accepted13February2019 Availableonline14February2019

Keywords:

Evolutionarygraphtheory Momentclosure Fixationprobability Network Markovprocess

a

b

s

t

r

a

c

t

Populationstructurecanhave asigniﬁcanteffectonevolution.Forsomesystemswithsuﬃcient

sym-metry,analyticresultscanbederivedwithinthemathematicalframeworkofevolutionarygraphtheory

whichrelatetotheoutcomeoftheevolutionaryprocess.However,formorecomplicatedheterogeneous

structures,computationally intensivemethodsarerequiredsuchas individual-basedstochastic

simula-tions.Byadaptingmethodsfromstatisticalphysics,includingmomentclosuretechniques,weﬁrstshow

howtoderiveexistinghomogenisedpairapproximationmodelsandthe exactneutraldriftmodel.We

thendevelopnode-levelapproximationstostochasticevolutionaryprocessesonarbitrarilycomplex

struc-turedpopulationsrepresentedbyﬁnitegraphs,whichcancapturethedifferentdynamicsforindividual

nodes inthepopulation.Using theseapproximations, weevaluatethe ﬁxationprobability ofinvading

mutantsforgiveninitialconditions,wherethedynamicsfollowstandardevolutionaryprocessessuchas

theinvasionprocess.Comparisonswiththeoutputofstochasticsimulationsrevealtheeffectivenessof

ourapproximationsindescribingthestochasticprocessesandinpredictingtheprobabilityofﬁxationof

mutantsonawiderangeofgraphs.Constructionofthesemodelsfacilitatesasystematicanalysisandis

valuableforagreaterunderstandingoftheinﬂuenceofpopulationstructureonevolutionaryprocesses.

ThisisanopenaccessarticleundertheCCBYlicense.(http://creativecommons.org/licenses/by/4.0/)

1. Introduction

Models of evolutionary dynamics were originally determinis-tic andassumed well-mixed populations in which every individ-ual of a giventype is identical.Stochastic models of thesefinite well-mixedpopulationshavebeenstudied(Moran,1958),however real populations are usually characterisedby a complicated rela-tionship structurebetweenindividuals(Zhangetal.,2007).To ac-countforthis,aclassofmathematicalmodelsknownas evolution-arygraphtheory havebeendevelopedwhichshowthat the pop-ulation structure can significantly influence the outcome of evo-lutionarydynamics (Liebermanetal., 2005; Traulsenand Hauert, 2010).Inthesemodels,structuredpopulations arerepresentedby finitegraphs,whereeachnoderepresentsanindividualinthe pop-ulation and relationshipsbetweenindividuals are represented by the edges ofthe graph. Stochastic evolutionary processes can be considered analytically and precise results can be derived for a

∗ _{Corresponding}_author.

E-mailaddress:[email protected](C.E. Overton).

number of simple graphs, such as the circle, star and complete graphs(Broometal., 2010;BroomandRychtáˇr,2008; Lieberman etal., 2005), mainly dueto their symmetry. Analytic approaches for investigating evolutionary dynamics on complex graphs have alsobeenproposed.However,suchmethodsareusuallylimitedby assumptionssuchaslargepopulations(Nowaketal.,2010;Ohtsuki etal.,2006)orarespeciﬁcallydesignedforinvestigating evolution-aryprocessesunderweakselection(Allenetal.,2017;Zhongetal., 2013),wheretheevolutionarygamehasonlyasmalleffecton re-productivesuccess.

Importantquantitiesofinterestsuchastheexactﬁxation prob-ability and time can, in principle, be obtained by solving the discrete-time difference equations of the underlying stochastic model (Hindersin et al., 2016), although this is only feasible for very small populations unless there are simplifying symmetries. Individual-based stochastic simulations(Barbosaetal., 2010; Ma-ciejewskietal.,2014)providenumericallyaccuraterepresentations ofthe evolutionary process on arbitrary graphs buthave limited scopeforgeneratingconceptualinsightsintothedynamicsontheir own.Theycanalsobecomputationallyexpensiveonlargergraphs,

https://doi.org/10.1016/j.jtbi.2019.02.009

(3)

butasapreciserepresentationoftheunderlyingstochasticmodel, theyallow usto evaluatetheaccuracy ofapproximatemodelsby comparison.

Here we develop approximations to the stochastic model by usinginsights from methods in statistical physics that have also been used extensively for epidemic modelling (Born and Green, 1946; Keeling and Eames, 2005; Kirkwood, 1947; Pellis et al., 2015; Sharkey, 2008; Sharkey et al., 2015). Such methods have been applied to develop pair approximations for evolutionary processes on graphs which satisfy the homogeneity assump-tion that all individuals can be considered identical and inter-changable(Hadjichrysanthouetal.,2012;HauertandSzabó,2005; Morita,2008; Penaetal., 2009; Szabó and Fath, 2007). However, theunderlying assumptions linking thesemodels tothe underly-ingstochastic dynamicsare not always clear.Onecontribution of thisworkistoderivethesemodelsexplicitlybyidentifyingthe re-quiredassumptions. The startingpoint forall ofour approxima-tions is to derive an equation to describe the time-evolution of thestateofanygivenindividualnode.Fromthisequation,various routestoapproximationbecomeapparentbyapplyingdifferent as-sumptions.We then investigate the applicability and accuracy of theresultingapproximationmethods.

Evolutionary graph theory is traditionally explored as a discrete-timestochastic model.While it ispossible to work with thesedynamics, it is easier to work with a continuous-time ap-proximationtotheprocess. Thecontinuous-timesystemis repre-sentedbyamasterequationdescribinghowtheprobabilityof be-ingineachsystemstatechanges.Fromthemasterequationwe ob-tainexactequations(withrespecttothecontinuous-timeprocess) fortheprobabilitiesofthestatesofindividualnodes(Theorem2.1). Theseequationscanthen beapproximated byadopting moment-closuremethods.Wefocusonevaluatingtheprobabilitythatatthe endoftheevolutionaryprocess,aninitialsubsetofmutantsplaced onthegraphwilltakeoverthewholepopulationand‘fixate’. Us-ing this continuous-time system is justified because the fixation probabilityandexpectedtime tofixationare identicalto thoseof theoriginaldiscrete-timeprocess.Withinthisframeworkwestudy whenaccurateapproximationscanbederived.

InSections2.1–2.3weintroducethestochasticevolutionary dy-namicsandthemasterequation, andderive adescriptionofhow node-levelquantitieschangeinthemasterequation.Wethen dis-cussanddevelop various techniquesthat can beused to approx-imate these systems ofequations in Section 3. Within these ap-proximationframeworks we derive the pairapproximation mod-elsusedintheliterature,whichwewillcallthehomogenisedpair approximation,andthe exact neutraldrift model,and build new node level approximation methods. InSection 4we demonstrate howthedifferentmethodscanbeusedtoapproximatethe dynam-icsof the original discrete-time process. Section 4.1 studies how the methods perform when approximating the ﬁxation probabil-ityof asingleinitial mutantplacedon idealisedandon complex graphs.Section 4.2then showshow the methods perform when studyingtheevolutionarygamedynamicsin aHawk–Dove game. InSection5wediscusstheresultsobtainedfromthemethods de-velopedandtheinsightsthesecangive.

2. Thestochasticmodel

2.1.Stochasticevolutionarydynamics

We considera population whoserelationshipstructure is rep-resented by a strongly connected undirected graph (V, E) where

V=

{

1,2,...,N

}

isthesetofnodesandEdenotesthesetofedges. Thiscanbe representedby an adjacencymatrixG,whereG_{i j}=1 ifj is connected to i, and Gi j=0 otherwise, with Gii=0 forall

i∈V.Weconsiderpopulationsconsistingoftwotypesof

individu-als,typeAandtypeB,eitherofwhichcanbeintheroleof invad-ingmutantinaresidentpopulation.Eachnode isoccupiedby ei-theranAoraBindividual.ThereforewecanletA_i=1ifandonly ifnodei isoccupiedby anA individualandA_i₌0otherwise and letB_idenotethesameforindividualsoftype B.SinceB_i=1−A_i,

thestate ofthesystemcan berepresented bythevalues ofA_iat anygiven time. Ifthere exists an edge (i, j)_∈E between nodesi, j∈V,thentheoffspring oftheindividualinnode jcanreplacethe individual innode iandviceversa.Tostudytheevolutionary dy-namicsbetweenthesetwo typesofindividual werequirea mea-sureof ﬁtness. We can describe the ﬁtness payoff received from interactionsbetweenindividualsbythefollowingpayoff matrix:

A B

a b

c d

,

where an A individual obtains a payoff a when interacting with anotherAindividualandpayoffbwheninteractingwithaB indi-vidual.Similarly,aBindividualobtainspayoffscanddwhen inter-actingwithanAindividualandaBindividualrespectively.

Todefine fitnessbasedon thepayoff,following similar defini-tions in the literature (Hadjichrysanthou et al., 2011; Lieberman et al., 2005; Ohtsuki et al., 2006; Taylor et al., 2004; Traulsen andHauert,2010),thefitnessofeachindividualisassumedtobe

f= f_back+wP,wherefback isthebackgroundﬁtnessofall individ-uals,Pistheaveragepayoff receivedfrominteractionswith neigh-bours,andw∈[0,∞)isa parameterwhichcontrolsthe contribu-tionofthegamepayoff toﬁtness.

The ﬁtness of an A individual which occupies node j, f_Aj , is thereforegivenby

f_Aj =fback +wa

N

i =1Gi j Ai +b

N i =1Gi j Bi

N i =1Gi j

, (1)

andsimilarlytheﬁtnessofaBindividualoccupyingnodejisgiven by

f_Bj =f_back+wc N

i =1Gi j Ai +d

N i =1Gi j Bi

N i =1Gi j

. (2)

Inthespecialcaseofconstantﬁtness,wheretheﬁtnessof individ-uals remains constantindependent oftheinteractions withother individuals,wetakethepayoff matrixas

A B

r r

1 1

,

sothatAindividualshaverelativepayoff equaltor.

(4)

2.2. Themasterequation

Toapproximatethediscrete-timeevolutionaryprocess wefirst translate thediscrete-timesystemto an approximate continuous-timesystem. Todothiswe modeleach(replacement)eventusing a Poissonprocess.Therateatwhicheach eventhappensisequal totheprobabilityofthateventinthediscrete-timemodel. There-forethetotaleventpressurewillbethesumofallsuch probabil-ities,which isequaltoone,so thatthetime untilthenext event followsaPoissonprocesswithrateparameterone.Wethen deter-minewhicheventtakesplaceusingtherelevantprobability.Under thiscontinuous-timesystemthefixationprobability andexpected timetofixationwillbeidenticaltothoseofthediscrete-time sys-tem,sinceweusethesameprobabilitieswheneveraneventoccurs andtheexpectedtime betweeneventsisconstant. Thisis impor-tantbecausethesearethemainquantitiesofinterestin evolution-arydynamics.

We will use this system to build approximation methods to study the original discrete-time process. We choose to use continuous-time becauseit enables ustobuild a system of ordi-narydifferentialequationstoapproximatethedynamics,which al-lowustomakeuseofeﬃcientnumericalsolversandenableusto derivesomeanalyticresults.

Since this evolutionary process is a continuous-time Markov process, we can constructa master equation to describe the dy-namics. Let S_i=

(

s1,s2,...,sN

)

be a state of the system, where

i∈

{

1,...,2N

}

and where sj=1 if node j is a type A individ-ual and s_j=0 otherwise. We deﬁne S1=

(

0,0,...,0

)

and S2N=

(

1,1,...,1

)

to be the statesconsisting of onlyB individuals and onlyAindividuals,respectively.

We introduce a vectorp

(

t

)

which representsthe probabilities ofeachsystemstateattimet.Thatis,theithentryofp

(

t

)

,p_i(t),is theprobabilitythatthesystemisinstateS_iattimet.This Marko-vianevolutionaryprocesshas2N possiblestatesandthetransitions betweenthemare governedby a2N ×2N transitionratematrixR

whoseentriesdependupon thegraphandupdatemechanismwe consider.

Wewritetherateofchangeinthestateprobabilitiesusingthe masterequationoftheMarkovprocess:

dp

dt =Rp. (3)

Suchanequationcanbeconstructedforanygraphundera Marko-vianupdatemechanism.Theabsorbingstatescorrespondtotheall typeBoralltypeAstates,S1 andS2N,soaregivenbyp1 andp2N.

Since we consider a strongly connected adjacency matrix G, provided we have atleast one type A and one type B it is pos-sible to get to either of the absorbing states and therefore from any mixed initial condition the system will always end up dis-tributed betweenthesetwo states.Wedeﬁne the ﬁxation proba-bilityPA

f ix

(

S

(

i

))

oftypeAfromaninitialstateS(i)tobethe proba-bilityofbeingintheallAabsorbingstate,thatis

PA _{f ix}

(

Si

)

=lim

t→∞

(

p2N

(

t

)

|

pi

(

0

)

=1

)

,

wherep_i(0)istheprobabilityofbeinginthestateS_iattimet=0. SimilarlywedeﬁnetheﬁxationprobabilityoftypeBas

PB _{f ix}

(

S_i

)

=lim

t→∞

(

p1

(

t

)

|

pi

(

0

)

=1

)

.

The computationalcost ofimplementingsystem(3)increases ex-ponentiallywith N (Hindersin etal., 2016), andthus the compu-tationofthe ﬁxation probabilitybecomesinfeasible asthe popu-lation size increases.Therefore it is ofinterest to build approxi-mationmethods.Pairapproximationsofthemasterequationhave beendevelopedunderthehomogeneityassumptionthatallnodes ontheunderlyinggraphareidenticalandinterchangeable(Hauert andSzabó,2005;Szabó andFath,2007),whichcangiveinteresting

insightintotheevolutionarydynamics.Howeverthehomogeneity assumptionsmadeintheseapproximationsresultinthelossof in-sightintographandnode-speciﬁcdynamics,soweaimtodevelop approximationsofthemasterequationwhichcancapturethis in-formation.

2.3.Nodelevelequations

Weapproximatethemasterequationbyapproximatingthe dy-namicsofthestateprobabilitiesofindividualnodesinthe popula-tion.Thisismotivatedbyapproachesinstatisticalphysicsand epi-demicmodelling(BornandGreen,1946;Kirkwood,1947;Sharkey, 2008;Sharkeyetal., 2015), andﬁrstrequiresexactequations de-scribinghowtheprobabilityofeachnodebeingoccupiedbya cer-taintypechangeswithtime,whichcanbederivedfromthemaster equation(3).

Deﬁnition2.1. Let

χ

(

t _j_→_i

|

St

)

denotetherateatwhichthe indi-vidualin nodej replaces the individualin node iat timet given thatthesystemisinstate Sattime t; wereferto thisasthe re-placementrate.

Deﬁnition2.2. Xt

C denotes theeventthat thesetofnodesCisin stateXattimet;forexampleAt _{_i_}istheeventthatnodeiisinthe typeAstateattimet.

Throughoutthispaperwe shallusetheshorthandBt

{i }At{j}X t C to representtheintersectionofeventsBt _{_i_}_∩At _{_j_}_∩X_Ct .

Theorem2.1. UnderanyMarkovianupdate mechanism,fora struc-turedpopulation represented bytheadjacency matrixG,therate of changeoftheprobabilitythattheindividualinnodeiisanA individ-ualis

dP

(

At _{_i_}

)

dt =

N

j=1

X V\{i,j}

Gi j P

(

Bt _{i }At {j}XV t \{i,j}

)

χ

(

t j→i

|

Bt {i }A{t j}XV t \{i,j}

)

−

N

j=1

X V\{i,j}

G_{i j}P

(

At _{_i_}B_{t _j_}X_Vt _\{_i,j_}

)

χ

(

t _j_→_i

|

At _{_i_}B_{t _j_}X_Vt _\{_i,j_}

)

,

(4)

where the sum over X_V\{i,j } is over all possible states of the nodes

V\{i,j}.

Proof. SeeAppendixA.

This theorem can be applied to any update mechanism by choosing an appropriate deﬁnition for the replacement rate,

χ

(

t

j→i

)

,whichweshalldeﬁnefortheinvasionprocessasan ex-ample.

Example2.1(Invasionprocess). Theinvasionprocessisan adapta-tionoftheMoranprocess(Moran,1958)tostructuredpopulations. Eacheventisdetermined by selectingan individual to reproduce withprobabilityproportionaltoitsﬁtness.Itproducesanidentical offspringwhichreplacesoneoftheconnectedindividualswhichis chosenuniformlyatrandom.Thereforetherateatwhichthe indi-vidualinnode jreplaces theindividual innodei attime tunder theinvasionprocessrulesisgivenby

χ

(

t j→i

|

S

)

=

ft _j

|

S Ft

|

S

1

kj , (5)

where ft _jistheﬁtnessoftheindividualoccupyingnodejattimet,

(5)

Whencalculating

χ

(

t _j_→_i

)

inequation (4),wewillusethe fol-lowingexpressionfortheﬁtnessoftheindividualatagivennode

jattimet,

ft _j= f_back+wP

(

At _{_j_}

)

a N

i =1Gi j P

(

At_{i }

)

+bN i =1Gi j P

(

Bt_{i }

)

_N

i =1Gi j

+wP

(

Bt _{_j_}

)

c _N

i =1Gi j P

(

At _{_i_}

)

+d

_N

i =1Gi j P

(

Bt _{_i_}

)

N i =1Gi j

, (6)

which is a sum of equations (1) and (2) weighted by the node probabilities. We use this deﬁnition because when we evaluate equation(6)giventhatthesystemisinaparticularstateS,as re-quiredbyequation(4),thevaluesofP

(

At _{_k_}

)

andP

(

Bt _{_k_}

)

areeither 1or0,whichleadstothefitnessofnodejinthatparticularsystem stateequations(1)and(2).However,bydefiningfitnessintermsof thenode probabilities,thisallowsusto haveadescription of fit-nesswhichwecanapproximate(seeSections3.2and3.3).

3. Approximatingthestochasticmodel

Inotherfields,suchasepidemiology,theconstructionof node-level equations such as equation (4) can lead to a hierarchy of momentequationswhereby theseequationsare writteninterms ofpair probabilities, pairs are written in terms of triples andso on, until the full system state size is reached and the hierarchy isclosed.Thisisusefulwhenwecan findappropriate closure ap-proximationsto closethishierarchy ata low order.However, we seethatsuchanapproachcannotbeusedherebecausewe condi-tionagainstthefullsystemstateinequation(4)whichmeansthat thefull systemsizeappears even atthefirst order.We therefore attempttofindothermethodstosimplifythissystemofequations. In this section we will describe three different techniques to derive approximations forthis system. The first technique yields a system of equations which become computationally infeasible insomecircumstances,butbyapplyinghomogeneityassumptions totheunderlying graph,wecan derivethe existingpair approxi-mationmodels currentlyusedin theliterature (Hadjichrysanthou et al., 2012; Hauert and Szabó, 2005; Morita, 2008; Pena et al., 2009;Szabó andFath, 2007)(Section3.1).Toreducecomputation costs,we then develop methodsbased onrestrictingthe number ofstateswhich weconditionagainstinthe replacementrate.We firstobtainamethodwhosecomputational complexityscales lin-early with the population size N and, after an appropriate scal-ing, approximates the fixation probability well on a wide range ofgraphs(Section3.2). Then,inSection 3.3,we obtaina method which,although itscaleswithN2_,_provides _a_good_{approximation} tothe evolutionarydynamicsover thewhole time seriesfor var-ious graphs, and in particular provides a very accurate approxi-mationto the initial dynamicsofthe evolutionary processon all graphs.

3.1.Derivingthehomogenisedpairapproximationmodel

Onewayofsimplifying(4)istoassumethatthefitness ft _jdoes not needto be normalised by the total fitness Ft in the replace-mentrate(e.g.asinequation(5)fortheinvasionprocess).This ap-proximationisjustifiedbecauseitdoesnotchangethefinalvalue to whichthe exact node-level equationsconverge (and therefore thefixation probability),and will only transformthe time series until fixation. Making this assumption, the node level equations simplifysothatwe onlysumoverthe neighboursofthe individ-ualthatweselectedbasedonfitness.Thatis,whenlookingatthe eventwherenode j replaces node i,if weare selecting ondeath weneedto conditionagainst thestate ofallneighbours ofi,and ifselecting onbirth weneed toconditionagainst thestate ofall

neighboursofj.As an example,weshall assumeherethat selec-tionoccursonbirthsothatwerequireconditioningonthe neigh-bourhoodofnodej,howeverwecanalsomakesimilararguments whenselectingondeath.Using

χ

¯ torepresentthismodiﬁcationof

χ

in(4)andQtorepresentthenewprobabilitydistributionwith themodiﬁedtimeseriesweobtain

dQ

(

At _{_i_}

)

dt =

N

j=1

X Nj\{i}

Gi j Q

(

Bt _{i }At {j}XNt j\{i }

)

χ

¯

(

t

j→i

|

Bt {i }A{t j}XNt j\{i }

)

−

N

j=1

X Nj\{i}

Gi jQ

(

At _{_i_}Bt _{j}XNt j\{i }

)

χ

¯

(

t

j→i

|

At {i }B{t j}XNt j\{i }

)

,

(7)

where Nj is the neighbourhood ofnode j; i.e.all nodes that are connectedtoj.Tosolvethissystemexactlyrequiresthe develop-mentofequationsdescribinghowtheprobabilityofeachpossible neighbourhoodofnodeschanges.Thisinturnwouldleadtoa hier-archyofequationswhichiscomputationallysimilartothemaster equation.Howeveritispossibletodevelopapproximationmethods byassumingindependenceattheleveloflower-orderterms,such asindividualsorpairsofnodes,andapproximatingthe neighbour-hoodprobabilitiesasafunctionofthese.

For example, we can make a pair approximation by applying Bayes’Theoremandassumingstatisticalindependenceatthelevel ofpairstorewritetheneighbourhoodprobability intermsofpair probabilities.Applying Bayes’Theorem totheprobabilities onthe righthandsideofequation(7)weget

dQ

(

At _{_i_}

)

dt =

N

j=1

X Nj\{i}

G_{i j}Q

(

At _{_j_}

)

Q

(

Bt _{_i_}X_Nt j\{i }

|

A

t

{j}

)

×

χ

¯

(

t _j_→_i

|

B_{t _i_}At _{_j_}X_Nt j\{i }

)

−

N

j=1

X Nj\{i}

Gi j Q

(

Bt _{j}

)

Q

(

At {i }XNt j\{i }

|

B t

{j}

)

×

χ

¯

(

t _j_→_i

|

A_{t _i_}Bt _{_j_}X_Nt

j\{i }

)

. (8)

If we assume statistical independence ofall nodesin the neigh-bourhoodofj,giventhestateofj,we canrewritethe neighbour-hoodprobabilityQ

(

At _{_j_}

)

Q

(

Bt _{_i_}X_Nt

j\{i }

|

A

t

{j}

)

as

Q

(

At _{_j_}

)

Q

(

Bt _{_i_}X_Nt j\{i }

|

A

t

{j}

)

≈Q

(

At {j}

)

Q

(

Bt {i }

|

At {j}

)

l∈Nj\{i }

Q

(

X_{t _l_}

|

At _{_j_}

)

,

whereX_{t _l_}iseventwherenodelisinthesamestateasitisinthe eventX_Nt

j\{i }.Substitutingthisintoequation(8)gives dQ

(

At _{_i_}

)

dt ≈

N

j=1

X Nj\{i}

G_{i j}Q

(

At _{_j_}

)

Q

(

Bt _{_i_}

|

At _{_j_}

)

×

l∈Nj\{i }

Q

(

X_lt

|

At _{_i_}

)

χ

¯

(

t _j_→_i

|

Bt _{_i_}At _{_j_}X_Nt j\{i }

)

−

N

j=1

X Nj\{i}

Gi j Q

(

Bt _{j}

)

Q

(

Ai

|

Bt {I}

)

×

l∈Nj\{i }

Q

(

X_lt

|

Bt _{_I_}

)

χ

¯

(

t _j_→_i

|

At _{_i_}Bt _{_j_}X_Nt j\{i }

)

.

(6)

we can derive exact equationsdescribing pairs.For the probabil-ityP

(

Bt _{_i_}At _{_j_}

)

weobtain

dP

(

Bt _{_i_}At _{_j_}

)

dt =

N

k =1

X V\{i,j,k}

G_jkP

(

Bt _{_i_}Bt _{_j_}At _{_k_}X_Vt _\{_i,_j,k_}

)

×

χ

(

t

k →j

|

Bt {i }Bt {j}At {k }XV t \{i,j,k }

)

−

N

k =1

X V\{i,j,k}

G_jkP

(

Bt _{_i_}At _{_j_}Bt _{_k_}X_Vt _\{_i,_j,k_}

)

×

χ

(

t

k →j

|

Bt {i }At {j}Bt {k }XV t \{i,j,k }

)

+

N

k =1

X V\{i,j,k}

Gik P

(

Bt _{k }At {i }At {j}XV t \{i,j,k }

)

×

χ

(

t

k →i

|

Bt {k }At {i }At {j}XV t \{i,j,k }

)

−

N

k =1

X V\{i,j,k}

GikP

(

At _{_k_}Bt _{_i_}At _{j}XV t \{i,j,k }

)

×

χ

(

t

k →i

|

At {k }Bt {i }A{t j}XV t \{i,j,k }

)

. (9)

We can now apply the same assumption regarding total ﬁtness thatweusedforthesinglenodeprobabilitiessothat

dQ

(

Bt _{_i_}At _{_j_}

)

dt =

N

k =1

X Nk\{i,j}

G_jkQ

(

Bt _{_i_}B_{t _j_}At _{_k_}X_Nt k\{i,j}

)

×

χ

¯

(

t _k_→_j

|

Bt _{_i_}Bt _{_j_}At _{_k_}X_Nt k\{i,j}

)

−

N

k =1

X Nk\{i,j}

G_jkQ

(

Bt _{_i_}A_{t _j_}Bt _{_k_}X_Nt k\{i,j}

)

×

χ

¯

(

t _k_→_j

|

Bt _{_i_}At _{_j_}Bt _{_k_}X_Nt k\{i,j}

)

+

N

k =1

X Nk\{i,j}

G_ikQ

(

Bt _{_k_}A_{t _i_}At _{_j_}X_Nt k\{i,j}

)

×

χ

¯

(

t _k_→_i

|

Bt _{_k_}At _{_i_}At _{_j_}X_Nt k\{i,j}

)

−

N

k =1

X Nk\{i,j}

G_ikQ

(

At _{_k_}B_{t _i_}At _{_j_}X_Nt k\{i,j}

)

×

χ

¯

(

t _k_→_i

|

At _{_k_}Bt _{_i_}At _{_j_}X_Nt

k\{i,j}

)

. (10)

Applying Bayes’ Theorem to the neighbourhood probability

Q

(

Bt _{_i_}Bt _{_j_}At _{_k_}X_Nt

k\{i,j}

)

weobtain

Q

(

Bt _{_i_}Bt _{_j_}At _{_k_}X_Nt

k\{i,j}

)

=Q

(

B t

{j}At {k }

)

Q

(

Bt {i }XNt k\{i,j}

|

B t

{j}At {k }

)

We can now assume statistical independence of the remaining nodesgiventhestateofjandksothat

Q

(

Bt _{_i_}Bt _{_j_}At _{_k_}X_Nt

k\{i,j}

)

≈Q

(

B t

{j}At {k }

)

Q

(

Bt {i }

|

Bt {j}At {k }

)

×

l∈Nk\{i,j}

Q

(

X_{t _l_}

|

B_{t _j_}At _{_k_}

)

.

Sinceweknowthatnodei isconnectedtonodej wecanassume thatgiventhestateofnodej,thestateofnodeiisindependentof node k,andsimilarlythestate ofanynode intheneighbourhood ofkisindependentofnodej,whichgivesus

Q

(

Bt_{_i_}Bt_{_j_}At_{_k_}X_Nt

k\{i,j}

)

≈Q

(

B t

{j}A t

{k }

)

Q

(

Bt{i }

|

Bt{j}

)

×

l∈Nk\{i,j}

Q

(

X_{t _l_}

|

At _{_k_}

)

.

Substitutingthisintoequation(10)gives

dQ

(

Bt _{_i_}At _{_j_}

)

dt ≈

N

k =1

X Nk\{i,j}

G_jkQ

(

Bt _{_j_}At _{_k_}

)

Q

(

Bt _{_i_}

|

Bt _{_j_}

)

×

l∈Nk\{i,j}

Q

(

X_{t _l_}

|

A_{t _k_}

)

χ

¯

(

t _k_→_j

|

Bt _{_i_}Bt _{_j_}At _{_k_}X_Nt k\{i,j}

)

−

N

k =1

X Nk\{i,j}

G_jkQ

(

At _{_j_}Bt _{_k_}

)

Q

(

Bt _{_i_}

|

At _{_j_}

)

×

l∈Nk\{i,j}

Q

(

X_{t _l_}

|

B_{t _k_}

)

χ

¯

(

t _k_→_j

|

Bt _{_i_}At _{_j_}Bt _{_k_}X_Nt k\{i,j}

)

+

N

k =1

X Nk\{i,j}

Gik Q

(

At _{i }Bt {k }

)

Q

(

At {j}

|

At {i }

)

×

l∈Nk\{i,j}

Q

(

X_{t _l_}

|

B_{t _k_}

)

χ

¯

(

t _k_→_i

|

At _{_i_}At _{_j_}Bt _{_k_}X_Nt k\{i,j}

)

−

N

k =1

X Nk\{i,j}

Gik Q

(

Bt_{i }At{k }

)

Q

(

At{j}

|

B t

{i }

)

×

l∈Nk\{i,j}

Q

(

X_{t _l_}

|

A_{t _k_}

)

χ

¯

(

t _k_→_i

|

Bt _{_i_}At _{_j_}At _{_k_}X_Nt k\{i,j}

)

.

Whilethissystemisclosed,itscomputationalcomplexityincreases exponentiallywiththemaximum nodedegree ofthegraph,so it isnotnumericallyfeasibleforgraphswithhighlyconnectednodes. Whilethiscouldpotentiallybe addressedby introducing approxi-mations fornodes with high degree and this maylead to accu-ratemodels,herewecontinuetowards asimpliﬁed model.Todo this,wefollowthesameprocessasinepidemicmodelsandmake a homogeneity assumption by assuming that any pair is equally likelytobeinanygivenstate(Kissetal.,2017;Sharkey,2008);i.e.

Q

(

X_{t _i_}

|

Y_{t _j_}

)

=Q

(

Xt

|

Yt

)

forallpairs(i,j).Thisleadsto

dQ

(

At _{_i_}

)

dt ≈

N

j=1

X Nj\{i}

Gi j Q

(

At _{j}

)

Q

(

Bt

|

At

)

k j−n XQ

(

At

|

At

)

n X

×

χ

¯

(

t _j_→_i

|

Bt _{_i_}At _{_j_}X_Nt j\{i }

)

−

N

j=1

X Nj\{i}

G_{i j}Q

(

Bt _{_j_}

)

Q

(

At

|

Bt

)

n X+1_Q

(

_Bt

|

_Bt

)

k j−n X−1

×

χ

¯

(

t _j_→_i

|

A_{t _i_}Bt _{_j_}X_Nt j\{i }

)

,

wherekjisthedegreeofnodejandnX isthenumberoftypeA in-dividualsinstateX_N

j\{i }.Sincethetransitionrateonlydependson

thenumberoftypeAandtypeBindividualsintheneighbourhood ofnode j and not on their positions, the summand on the right handsideisequalforallstatesX_N

j\{i }whichhavethesame

con-figurationofAandBindividuals.Thefrequencyofacertain neigh-bourhoodstate across all possible configurations is given by the binomialcoefficient,sothat

dQ

At _{_i_}

dt ≈

N

j=1 k j−1

n =0

Gij

kj −1

n

Q

At _{_j_}

Q

Bt

|

At

k j−n

Q

At

|

At

n

×

χ

t j→i

n

−

N

j=1 k j−1

n =0

Gij

kj −1

n

Q

Bt _{_j_}

Q

At

|

Bt

n +1Q

Bt

|

Bt

k j−n −1

×

χ

t j→i

n+1

,

where

χ

¯

(

t _A_→_B

|

n

)

isthe rateatwhichweselectone ofthetype

(7)

arentypeAindividualsandkj−ntypeBindividualsinthe neigh-bourhoodoftheselectednode.

Since wehaveassumedthat anypairisequallylikely,this as-sumptiononly holdswhenevery node inthegraphformsk con-nections, whichare chosen atrandom. Thereforewe requirethat nodeiisequallylikelytobeconnectedtoanyothernode andall nodesare topologicallyequivalent, so that the probability that a givennodeoftypeBisconnectedtoxtypeAneighboursisgiven byabinomial distributionwithn=kand p=Q

(

At

|

Bt

)

.Therefore theprobabilityofanindividualbeingtypeAchangeswithrate

dQ

At

dt ≈kQ

At

|

Bt

Q

Bt

×

k −1

n =0

k−1

n

Q

Bt

|

At

k−nQ

At

|

At

n

χ

t _A_→_B

n

−kQ

(

Bt

|

At

)

Q

At

×

k −1

n =0

k−1

n

Q

At

|

Bt

n +1Q

Bt

|

Bt

k −n −1

χ

_Bt _→_A

n+1

.

We can also apply theseassumptions to the pair-level equations to obtain a closed system of equations which are eﬃcient to solvenumerically.Theresultingmodelisequivalenttothe model inMorita(2008),whichwasjustiﬁedbyusingtheassumptionthat thepopulation occupiesaregular graph,such that all individuals havedegreek,andthatallnodesaretopologicallyequivalent,such thateverypairofindividualsisequallylikelytobeconnected.We haveshownthatbyapplyingtheseassumptionstotheexact node-levelequationsequation(4)wecanderivethesemodels.

Similarly we can obtain a pair approximation model for the dynamics where we select on death by conditioning against the state of the neighbours of node i. Applying analogous as-sumptions to the previous example then leads to the model inHadjichrysanthou etal.(2012).Thesemodelshavebeenshown toyieldinteresting qualitativeresultsabouttherelative strengths ofdifferentstrategiesin evolutionarygames on graphs.However, thehomogeneityassumptionsmaderesultinlosingimportant as-pectsofthestructure,suchashowindividualnodesinthesystem canbehavedifferently.Inthenextsectionswewillattemptto de-velopapproximationmethodswhichcancapturethisnode-speciﬁc information.

As we alluded to earlier, a natural method would be to use equation (7) as a basis for this. However, diﬃculties in imple-mentingthismethodongeneralnetworksaswell asthe number of equations that result leads us to a different direction for the presentwork.

3.2.Anunconditionedﬁtnessapproximationmodel

Herewedevelopamethodwhichremovestheneedtoinclude theprobability ofwhole neighbourhoods byremoving the condi-tioninginthe replacementrate. Thiscausesthe replacementrate toonlydepend onthe marginalprobabilities ofthestate ofeach noderather thanthe full systemstate.This assumptionalso mo-tivated a model in Szabó and Fath (2007) in which the authors constructapopulation-levelapproximationdescribinghowthe ex-pectednumberofindividualsofeachtype changewithtime. Un-derthisassumption,equation(4)becomes

dP

(

At _{_i_}

)

dt ≈

N

j=1

X V\{i,j}

Gi j P

(

Bt _{i }At {j}XV t \{i,j}

)

χ

(

t j→i

)

−

N

j=1

X V\{i,j}

G_{i j}P

(

At _{_i_}Bt _{_j_}X_Vt _\{_i,j_}

)

χ

(

t _j_→_i

)

.

Since

χ

(

t _j_→_i

)

isnowthesameforallsystemstates,

dP

(

At _{_i_}

)

dt ≈

N

j=1

Gi j P

(

Bt _{i }A{t j}

)

χ

(

t j→i

)

− N

j=1

Gi j P

(

At _{i }B{t j}

)

χ

(

t j→i

)

.

AddingandsubtractingN _j₌₁G_{i j}P

(

At _{_i_}At _{_j_}

)

χ

(

t _j_→_i

)

weobtain

dP

(

At _{_i_}

)

dt ≈

N

j=1

G_{i j}P¯

(

Bt _{_i_}At _{_j_}

)

χ

(

t _j_→_i

)

+G_{i j}P

(

At _{_i_}A_{t _j_}

)

χ

(

t _j_→_i

)

−

N

j=1

Gi j P

(

At _{i }B{t j}

)

χ

(

t j→i

)

+Gi j P¯

(

At _{i }A{t j}

)

χ

(

t j→i

)

≈

N

j=1

Gi j P

(

At _{j}

)

χ

(

t j→i

)

− N

j=1

Gi j P

(

At _{i }

)

χ

(

t j→i

)

,

whichisaclosedsetofNequationswithatmostNsummandson theright handside.Therefore by deﬁning P¯ asan approximation totheprobabilitydistributionPweobtaintheclosedsystem

dP¯

(

At _{_i_}

)

dt =

N

j=1

Gi j P¯

(

At _{j}

)

χ

(

t j→i

)

− N

j=1

Gi j P¯

(

At _{i }

)

χ

(

t j→i

)

, (11)

whichiseasytosolvenumericallyforanarbitrarygraph.

Example3.1(Neutraldrift). Inthespecialcaseofneutraldrift,i.e. when allindividuals haveidenticalfitness, theunconditioned fit-nessmodelgivestheexactfixationprobability.Withthedynamics ofthe invasion process underneutral driftwe obtain

χ

(

t _j_→_i

)

=

1

Nk j,andthereforeequation(11)canbewrittenas

dP¯

(

At _{_i_}

)

dt =

N

j=1

G_{i j}P¯

(

At _{_j_}

)

1

Nk_j−

N

j=1

G_{i j}P¯

(

At _{_i_}

)

1 Nk_j,

which is equivalentto the exact node equation (4) forthe inva-sion process underneutral drift (Shakarian etal., 2013). The un-conditionedﬁtnessmodelisalsoexactforallupdatemechanisms under neutraldrift, butwe donot write the equationsexplicitly here.

As the population size N increases, the solution to equation

(11) moves further away from the exact fixation probability ob-tainedeitherbysolvingthemasterequation(3)orfromtheoutput ofstochasticsimulations.Toobtainareasonableapproximationto thefixationprobabilityfromagiveninitialconditionweconstruct ascalingfactorfortheconstant fitnesscasebycomparingthe ra-tiobetweenthesolutionofequation (11)ona completegraphto theexactfixationprobability onacompletegraph.Wechoosethe completegraphbecause theexactfixation probabilitycan be cal-culatedanalytically inthiscase. Whilst weconsider the constant fitnesscase,itmayalsobepossibletofindasuitablescalingfactor inthefrequencydependentfitnesscase,howeverusingacomplete graphmaynolongerbeappropriate becausetherelativestrength of different strategies in some games is strongly affected by the averagedegreeofthegraph(Ohtsukietal.,2006).

Example3.2(Invasionprocess). Forconstantﬁtnessunderthe dy-namicsoftheinvasionprocess,theexactﬁxationprobabilityform

initial mutantA individualson acomplete graphis equivalentto theMoranprobability(Liebermanetal.,2005):

ρ

=1−

1 r m 1− 1 r N

.

(8)

thetwo.Intheconstantﬁtnesscasethiscanbedoneanalytically, withthescalingfactorforminitialmutantsgivenby

ρ

lim t→∞Ac

(

t

)

=

1−1 rm 1−1 rN

1 r−1

−1+

1+m (r 2₋₁₎ N

, (12)

whereAc

(

t

)

= 1 N

N

j=1 ¯

P

(

At _{_j_}

)

.Thederivationofthiscanbefound in

AppendixB.

We can now deﬁne two methods for predicting the ﬁxation probabilityunderanyMarkovianupdatemechanism.

• Method1(Unconditionedﬁtnessmodel)Solveequation(11)to providean approximationtothedynamicsoftheevolutionary process. (MATLAB code for solving the unconditioned ﬁtness modelisprovidedassupplementarymaterial.)

• Method2(Scaledunconditionedﬁtnessmodel)Solveequation

(11)andthenuseascalingfactor,theratiooftheexactﬁxation probability andthesolution toequation (11) forthecomplete graph,to providean approximation tothe ﬁxation probability fromagiveninitialcondition.

InSection4weinvestigatethenumericalperformanceofthese two methods. Note that for the purpose of this paper we have foundthescalingfactorequation (12)forMethod2underthe in-vasion process.However, themethodcanbe appliedtoother up-datemechanisms,such asdeath–birthwithselectionon birth,by findinganappropriatescalingfactor,whichcanbedonebysolving equation(11)(eitheranalyticallyornumerically)andcomparingto theexactfixationprobabilityonthecompletegraph.Forexample, seeHindersinandTraulsen(2015)fortheexactfixationprobability onacompletegraphundertheDB-Bdynamics.

3.3. Acontactconditioningapproximationmodel

In Section 3.2 we restricted the conditioning so that we only require the marginal probabilities of the individual nodes. How-ever, this removes a signiﬁcant amount of information from the dynamics. In the evolutionary process, when considering a re-placementeventthetwo nodesofmostinterest arethenode se-lected forbirth and the node selected for death. Therefore, here we followa similarmethod butretainconditioning onthestates of these two key nodes. Since we restrict the conditioning to only thestatesofthe relevantcontact,when lookingatthe term

χ

(

t

j→i

|

Bt {i }At {j}XV t \{i,j}

)

in equation (4) we condition only on the statesofthenodesiandjandobtain

χ

(

t

j→i

|

Bt {i }At {j}XV t \{i,j}

)

≈

χ

(

t j→i

|

B{t i }At {j}

)

. Undertheabovecondition,equation(4)becomes

dP

(

At _{_i_}

)

dt ≈

N

j=1

X V\{i,j}

G_{i j}P

(

Bt _{_i_}At _{_j_}X_Vt _\{_i,j_}

)

χ

(

t _j_→_i

|

Bt _{_i_}At _{_j_}

)

−

N

j=1

X V\{i,j}

Gi j P

(

At _{i }Bt {j}XV t \{i,j}

)

χ

(

t j→i

|

A{t i }Bt {j}

)

. (13)

To see the effect of this assumption on the rates, consider

χ

(

t

j→i

|

Bt _{i }At {j}

)

.Here we conditiononly against node i beingin state B and node j being in state A rather than against the en-tiresystemstate.Consequentlyintheﬁtnessequation(6)wehave

P

(

Bt _{_i_}

)

=1andP

(

At _{_j_}

)

=1giving

ft _j

|

Bt _{_i_}At _{_j_}=f_back+w

bT_{i j}+a

l=i

G_jlP

(

At _{_l_}

)

+b

l=i

G_jlP

(

Bt _{_l_}

)

N

l=1

G_jl

.

Inequation (13),the chance ofselecting node j is now indepen-dentofthestateX_Vt _\{_i,j_}oftheremainingnodeswhichenablesthe equationtobereducedto

dP

(

At _{_i_}

)

dt ≈

N

j=1

Gi j P

(

Bt _{i }At {j}

)

χ

(

t j→i

|

Bt {i }At {j}

)

−

N

j=1

G_{i j}P

(

At _{_i_}Bt _{_j_}

)

χ

(

t _j_→_i

|

At _{_i_}Bt _{_j_}

)

. (14)

This gives an approximate equation for individuals in terms of pairs.Wethenneedtobuildequationstodescribepair-level prob-abilities. Similar methodologies have been followed to describe epidemicspropagatedonnetworks(Sharkey,2008;Sharkeyetal., 2015).

Applyingthesameconditioningtotheexactpairlevelequation

(9)weobtain

dP

(

Bt _{_i_}At _{_j_}

)

dt ≈

N

k =1

Gjk P

(

Bt _{i }Bt {j}At {k }

)

χ

(

t k →j

|

Bt {j}At {k }

)

−

N

k =1

G_jkP

(

Bt _{_i_}At _{_j_}Bt _{_k_}

)

χ

(

t _k_→_j

|

A_{t _j_}Bt _{_k_}

)

+

N

k =1

G_ikP

(

Bt _{_k_}At _{_i_}At _{_j_}

)

χ

(

t _k_→_i

|

B_{t _k_}At _{_i_}

)

−

N

k =1

Gik P

(

At _{k }Bt {i }At {j}

)

χ

(

t k →i

|

At {k }Bt {i }

)

. (15)

Similarformulaecanbe constructedforallpossiblepairs,writing pairs interms oftriples. In a similar way, triples can be written interms ofquadsandso on,up to thefull systemsize N which isthenclosed.Therefore,whenusingthismethodweobtaina hi-erarchysimilartotheBBGKY(Bogoliubov–Born–Green–Kirkwood– Yvon)hierarchy(BornandGreen,1946;Kirkwood,1947)in statis-ticalphysics. However, herethe hierarchyonly representsan ap-proximationto theoriginal dynamics.Solvingthis systemexactly isnosimplerthanevaluatingequation(3)sinceevaluatingthe hi-erarchyinfulliscomparableinnumericalcomplexity,sowewish toﬁndapproximationmethodstoreducethis.

Withthishierarchy,wecanapplytechniquesdevelopedin sta-tisticalphysics toapproximatehigher-order termsasfunctionsof lower-orderterms.Inparticularwecanclosethesystemof equa-tions(14)and(15)atthelevelofpairsbyapproximatingalltriples inequation (15) interms of pair-level andindividual-level prob-abilities. Similar techniqueshave beenapplied for many stochas-ticprocessesincludinginepidemiology(KeelingandEames,2005; Kiss etal., 2017; Sharkey, 2008; Sharkey etal., 2015) and evolu-tionarydynamics (Hauertand Szabó, 2005; Ohtsukiet al., 2006; Szabó andFath,2007)leadingtomodelswhichcanbenumerically evaluated.

Toclosethesystem,we requireafunctionalformthatcan ap-proximatetripleprobabilitiesintermsofindividualandpair prob-abilities. One method isto approximate a triple P

(

At _{_i_}Bt _{_j_}C_{t _k_}

)

as theproductofallpossiblepairsamongthesenodesdividedbythe productofallindividuals,i.e.

P

(

At _{_i_}Bt _{_j_}C_{t _k_}

)

≈P

(

A

t

{i }Bt {j}

)

P

(

Bt {j}Ct {k }

)

P

(

A{t i }Ct {k }

)