
Fleming-Viot processes from a Bayesian perspective

Matteo Ruggiero

University of Pavia, Italy

Joint with Stephen G. Walker, University of Kent, UK

Academic year: 2021


Outline

Background

• Bayes nonparametrics and Gibbs sampler
• Fleming-Viot processes

Bayesian construction of Fleming-Viot diffusions

• Basic construction: neutrality
• Extension to selective models


Dirichlet process

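Definition (Ferguson, 1973). Given a finite non-null measure $\alpha$ on $(\mathbb{X}, \mathcal{X})$, a random probability measure $\mu$ on $(\mathbb{X}, \mathcal{X})$ is said to be a Dirichlet process with parameter $\alpha$, denoted $\mu \sim \mathcal{D}_\alpha$, if for every $K = 1, 2, \dots$ and for every measurable partition $(B_1, \dots, B_K)$ of $\mathbb{X}$,

$$\big(\mu(B_1), \dots, \mu(B_K)\big) \sim \mathrm{Dirichlet}\big(\alpha(B_1), \dots, \alpha(B_K)\big).$$

Conjugate posterior process (Ferguson, 1973). Let $\mu$ be a Dirichlet process on $(\mathbb{X}, \mathcal{X})$ with parameter $\alpha$, and let $X_1, \dots, X_n$ be a sample of size $n$ from $\mu$. Then

$$\mu \mid X_1, \dots, X_n \sim \mathcal{D}_{\alpha + \sum_{i=1}^{n} \delta_{X_i}}$$

where $\delta_x$ denotes a point mass at $x$.

Connection with GEM distribution. Sethuraman (1994) proposed a series representation for the Dirichlet process, the so-called stick-breaking construction, with locations $Y_i \overset{iid}{\sim} \alpha/\alpha(\mathbb{X})$ and weights $p_i$ with GEM distribution:

$$\mathcal{D}_\alpha = \mathcal{L}\Big(\sum_{i=1}^{\infty} \underbrace{V_i \prod_{j=1}^{i-1} (1 - V_j)}_{p_i}\, \delta_{Y_i}\Big), \qquad V_i \overset{iid}{\sim} \mathrm{Beta}(1, \alpha(\mathbb{X})), \qquad Y_i \overset{iid}{\sim} \frac{\alpha}{\alpha(\mathbb{X})}.$$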
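As an illustration of the stick-breaking construction, here is a minimal Python sketch that draws a truncated version of the series. The truncation level `n_atoms`, the choice of a Uniform(0, 1) base measure and the names `stick_breaking` and `base_sampler` are assumptions made for this example, not part of the original slides.

```python
import random

def stick_breaking(theta, base_sampler, n_atoms, rng):
    """Truncated stick-breaking draw: returns (weights, locations, leftover mass)."""
    weights, locations = [], []
    remaining = 1.0  # stick length not yet assigned to an atom
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, theta)      # V_i ~ Beta(1, theta), theta = alpha(X)
        weights.append(v * remaining)        # p_i = V_i * prod_{j<i} (1 - V_j)
        locations.append(base_sampler(rng))  # Y_i ~ alpha / alpha(X)
        remaining *= 1.0 - v
    return weights, locations, remaining

rng = random.Random(0)
# Assumed example: theta = 2 and normalised base measure Uniform(0, 1).
weights, locations, leftover = stick_breaking(2.0, lambda r: r.random(), 200, rng)
print(sum(weights) + leftover)  # the weights plus the unbroken remainder sum to about 1
```

The leftover mass shrinks geometrically on average, so a few hundred atoms typically give a close approximation for moderate values of theta.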


A Polya urn for the Dirichlet process

Blackwell and MacQueen (1973)

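Let again $\alpha$ be a finite measure on $(\mathbb{X}, \mathcal{X})$, and let $\{X_n\}_{n \ge 1}$ be such that

$$X_1 \sim \frac{\alpha}{\alpha(\mathbb{X})}, \qquad X_{n+1} \mid X_1, \dots, X_n \sim \frac{\alpha + \sum_{i=1}^{n} \delta_{X_i}}{\alpha(\mathbb{X}) + n}, \quad n \ge 1.$$

The observed colours have a higher probability of being drawn again. If $\alpha$ is non-atomic, there is a continuum of colours. Then $\{X_n\}_{n \ge 1}$ is called a Polya sequence with parameter $\alpha$, and

(a) $\dfrac{\alpha + \sum_{i=1}^{n} \delta_{X_i}}{\alpha(\mathbb{X}) + n} \Rightarrow \mu^*$ a.s., $\mu^*$ being a discrete measure;

(b) $\mu^* \sim \mathcal{D}_\alpha$;

(c) $X_1, X_2, \dots \mid \mu^* \overset{iid}{\sim} \mu^*$.

Rephrasing, we can write the joint law of $X_1, \dots, X_n$, for $n \ge 1$, as

$$P(X_1 \in dx_1, \dots, X_n \in dx_n) = E\Big[\prod_{i=1}^{n} \mu(dx_i)\Big] = \int_{\mathcal{P}(\mathbb{X})} \prod_{i=1}^{n} \mu(dx_i)\, \mathcal{D}_\alpha(d\mu).$$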
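The urn scheme can be simulated directly. The sketch below is one possible Python rendering, assuming $\alpha = \theta\nu_0$ with $\nu_0$ uniform on (0, 1); the function name `polya_sequence` and the parameter values are illustrative choices only.

```python
import random

def polya_sequence(n, theta, base_sampler, rng):
    """Draw X_1, ..., X_n from the Blackwell-MacQueen urn with alpha = theta * nu_0."""
    xs = []
    for k in range(n):
        # With k draws observed: new colour from nu_0 w.p. theta / (theta + k),
        # otherwise repeat one of the k past draws chosen uniformly at random.
        if rng.random() < theta / (theta + k):
            xs.append(base_sampler(rng))
        else:
            xs.append(rng.choice(xs))
    return xs

rng = random.Random(1)
# Assumed example: theta = 2, nu_0 = Uniform(0, 1), n = 500 draws.
sample = polya_sequence(500, 2.0, lambda r: r.random(), rng)
n_distinct = len(set(sample))
print(n_distinct)  # far fewer than 500 distinct colours, reflecting the discreteness of the limit
```

Because a non-atomic $\nu_0$ never produces the same colour twice, every tie in the sample comes from the reinforcement term of the urn.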


Gibbs sampling

Geman and Geman (1984)

Special case of a Metropolis-Hastings algorithm, broadly used in Bayesian inference.

Suppose we want to sample from some joint distribution $p_{X,Y}(x, y)$, but this is infeasible directly. Given the initial value $(x_0, y_0)$, it is usually easier to sample from the (full) conditional distributions

$$X_1 \sim p_{X|Y}(x \mid y_0), \qquad Y_1 \sim p_{Y|X}(y \mid x_1), \qquad X_2 \sim p_{X|Y}(x \mid y_1), \quad \dots$$

and so on. Then $\{(x_n, y_n)\}_{n \ge 1}$ is a Markov chain with stationary distribution $p_{X,Y}(x, y)$. Taking $M$ such chains $\{(x_n^i, y_n^i)\}_{n \ge 1}$, $i = 1, \dots, M$, for sufficiently large $N \ge 1$ we can approximate

$$\frac{1}{M} \sum_{i=1}^{M} f(x_N^i, y_N^i) \approx \int f(x, y)\, p_{X,Y}(x, y)\, dx\, dy.$$

If the coordinates are updated in a random order and visited infinitely often, the chain is also reversible with respect to $p_{X,Y}(x, y)$.
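A small Python sketch of the scheme, under the assumption (not in the slides) that the target is a standard bivariate normal with correlation `rho`, so that both full conditionals are Gaussian. It runs $M$ independent chains for $N$ sweeps and averages $f(x, y) = xy$ over the final states.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_steps, rng, x0=0.0, y0=0.0):
    """Run n_steps Gibbs sweeps and return the final state (x_N, y_N)."""
    sd = math.sqrt(1.0 - rho * rho)  # conditional standard deviation
    x, y = x0, y0
    for _ in range(n_steps):
        x = rng.gauss(rho * y, sd)  # X | Y = y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)  # Y | X = x ~ N(rho * x, 1 - rho^2)
    return x, y

rng = random.Random(2)
rho, M, N = 0.5, 4000, 30  # illustrative values
finals = [gibbs_bivariate_normal(rho, N, rng) for _ in range(M)]
estimate = sum(x * y for x, y in finals) / M  # Monte Carlo estimate of E[XY] = rho
print(estimate)
```

This is a systematic-scan version (x then y in each sweep); the random-scan variant mentioned above would pick the coordinate to update at random at each step.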


Fleming-Viot processes

Fleming and Viot (1979)

A Fleming-Viot process is a probability-measure-valued diffusion which describes the evolution in time of an infinite population subject to mutation, resampling and (possibly) selection and recombination.

Among its main features:

• individuals are labeled by points in a complete separable metric space $\mathbb{X}$, called the type space (for simplicity we assume $\mathbb{X}$ is compact);

• it takes values in the set $\mathcal{P}(\mathbb{X})$ of Borel probability measures;

• it has sample paths in the space $C_{\mathcal{P}(\mathbb{X})}([0, \infty))$ of continuous functions from $[0, \infty)$ to $\mathcal{P}(\mathbb{X})$.

The neutral version has infinitesimal generator

$$A_0\varphi(\mu) = \sum_{i=1}^{m} \langle P_i f, \mu^m \rangle + \frac{1}{2} \sum_{1 \le k \ne i \le m} \langle \Phi_{ki} f - f, \mu^m \rangle, \qquad \langle f, \mu \rangle = \int f\, d\mu,$$

with domain

$$D(A_0) = \big\{ \varphi(\mu) \in B(\mathcal{P}(\mathbb{X})) : \varphi(\mu) = \langle f, \mu^m \rangle,\ f \in C(\mathbb{X}^m),\ m \in \mathbb{N} \big\}$$

where $P$ is the generator of a Feller mutation process on $\mathbb{X}$, $P_i$ acts on $x_i$ in $f(x_1, \dots, x_m)$, and $\Phi_{ki}$ changes $x_k$ to $x_i$ in $f$.


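When

$$Pf(x) = \frac{\theta}{2} \int \big[f(y) - f(x)\big]\, \nu_0(dy), \qquad \nu_0 \text{ non-atomic},\ \theta > 0,$$

its stationary distribution is $\mathcal{D}_\alpha$, with $\alpha = \theta\nu_0$ (Ethier and Kurtz, 1986).

Its transition function is given by

$$P(t, \mu, d\nu) = \sum_{m=0}^{\infty} d_m(t) \int_{\mathbb{X}^m} \mathcal{D}_{\alpha + \sum_{i=1}^{m} \delta_{X_i}}(d\nu)\, \mu(dX_1) \cdots \mu(dX_m)$$

where $d_m(t) = P(D_t = m)$, $D_t$ is a death process starting a.s. from $\infty$, and $\mathcal{D}_{\alpha + \sum_{i=1}^{m} \delta_{X_i}}$ is a posterior Dirichlet process (Ethier and Griffiths, 1993).

If we add selection, then the FV process has generator

$$A_\sigma\varphi(\mu) = \sum_{i=1}^{m} \langle P_i f, \mu^m \rangle + \frac{1}{2} \sum_{1 \le k \ne i \le m} \langle \Phi_{ki} f - f, \mu^m \rangle + \sum_{i=1}^{m} \langle \sigma_i(\cdot) f - \sigma_{m+1}(\cdot) f, \mu^{m+1} \rangle$$

where $\sigma_i(\cdot) = \sigma(x_i)$ is the selection coefficient, and stationary distribution proportional to $e^{2\langle \sigma, \mu \rangle}\, \mathcal{D}_\alpha(d\mu)$.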


Gibbs sampling the Polya urn

Given an exchangeable vector $\mathbf{X}_n = (X_1, \dots, X_n)$, define a Gibbs sampler driven Markov chain $\{\mathbf{X}_n(k)\}_{k \ge 1}$ such that at each transition

• $x_i$ is removed from $\mathbf{x}_n = (x_1, \dots, x_n)$ with probability $1/n$;

• a replacement $X_i'$ is sampled from the Blackwell-MacQueen prediction scheme with $\alpha = \theta\nu_0$, where $\theta = \alpha(\mathbb{X})$ and $\nu_0 = \alpha/\alpha(\mathbb{X})$ is non-atomic, namely

$$X_i' \mid x_{(-i)} \sim \frac{\theta}{\theta + n - 1}\, \nu_0(dx_i') + \frac{1}{\theta + n - 1} \sum_{k \ne i}^{n} \delta_{x_k}(dx_i') \qquad (1)$$

• the arrival state is $(x_1, \dots, x_{i-1}, x_i', x_{i+1}, \dots, x_n)$.

This amounts to performing a Gibbs sampler on $(X_1, \dots, X_n)$, with a random scan (the index $i$ of the updated coordinate is chosen at random) and full conditionals $P^{\alpha}_{X_i | X_{(-i)}}(dx_i \mid x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$ given by (1).

This produces a reversible Markov chain $\{(X_1(k), \dots, X_n(k))\}_{k \ge 1}$ with stationary distribution

$$P^{\alpha}_{X_1, \dots, X_n}(dx_1, \dots, dx_n) = \nu_0(dx_1)\, \frac{\theta\nu_0(dx_2) + \delta_{x_1}(dx_2)}{\theta + 1} \cdots \frac{\theta\nu_0(dx_n) + \sum_{k=1}^{n-1} \delta_{x_k}(dx_n)}{\theta + n - 1}.$$
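One transition of this random-scan Gibbs sampler can be sketched in Python as follows. The function name `urn_gibbs_step` and the Uniform(0, 1) choice for $\nu_0$ are assumptions for illustration.

```python
import random

def urn_gibbs_step(xs, theta, base_sampler, rng):
    """One random-scan Gibbs update of (x_1, ..., x_n) using the predictive law (1)."""
    n = len(xs)
    i = rng.randrange(n)              # coordinate i is removed with probability 1/n
    others = xs[:i] + xs[i + 1:]      # x_(-i)
    if rng.random() < theta / (theta + n - 1):
        new = base_sampler(rng)       # mutation: fresh draw from nu_0
    else:
        new = rng.choice(others)      # resampling: copy one of the other n - 1 particles
    return xs[:i] + [new] + xs[i + 1:]

rng = random.Random(3)
state = [rng.random() for _ in range(10)]  # assumed initial configuration
next_state = urn_gibbs_step(state, 1.0, lambda r: r.random(), rng)
changed = sum(a != b for a, b in zip(state, next_state))
print(changed)  # at most one coordinate differs after a single transition
```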


The particle process

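Embed it in continuous time in $D_{\mathbb{X}^n}[0, \infty)$, with $\mathrm{Exp}(\lambda_n)$ sojourn times, and let $\lambda_n = n(\theta + n - 1)/2$.

Remarks

a) $\lambda_n$ substitutes time rescaling.

b) For $\theta = 0$ there is no mutation, and $\lambda_n$ is the transition rate of Kingman's coalescent.

The generator of the $\mathbb{X}^n$-valued process is

$$A_n f(x) = \sum_{i=1}^{n} \frac{\lambda_n}{n(\theta + n - 1)} \int \big[ f(\eta_i(x|y)) - f(x) \big] \Big( \theta\nu_0 + \sum_{k \ne i}^{n} \delta_{x_k} \Big)(dy)$$

where $\eta_i(x|y) = (x_1, \dots, x_{i-1}, y, x_{i+1}, \dots, x_n)$.

Define the process of empirical measures $\{\mu_n(t)\}_{t \ge 0} := \big\{ \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i(t)} \big\}_{t \ge 0}$ with càdlàg sample paths in $D_{\mathcal{P}(\mathbb{X})}[0, \infty)$.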
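The continuous-time embedding can be sketched in Python by attaching $\mathrm{Exp}(\lambda_n)$ holding times to the urn update and reading off the empirical measure at the final time. The names `simulate_particles` and `empirical_measure`, the Uniform(0, 1) base measure and the parameter values are assumptions for this example.

```python
import random
from collections import Counter

def simulate_particles(xs, theta, base_sampler, t_max, rng):
    """Run the n-particle process up to time t_max; return (final state, number of jumps)."""
    xs = list(xs)
    n = len(xs)
    rate = n * (theta + n - 1) / 2.0     # lambda_n
    t = rng.expovariate(rate)            # first Exp(lambda_n) sojourn time
    jumps = 0
    while t <= t_max:
        i = rng.randrange(n)             # particle to be replaced
        if rng.random() < theta / (theta + n - 1):
            xs[i] = base_sampler(rng)    # mutation from nu_0
        else:
            j = rng.randrange(n - 1)     # uniform index among the others
            if j >= i:
                j += 1
            xs[i] = xs[j]                # resampling
        jumps += 1
        t += rng.expovariate(rate)
    return xs, jumps

def empirical_measure(xs):
    """Return mu_n as a dict mapping each type to its mass."""
    n = len(xs)
    return {x: c / n for x, c in Counter(xs).items()}

rng = random.Random(4)
start = [rng.random() for _ in range(20)]  # assumed initial configuration, n = 20
final, jumps = simulate_particles(start, 1.0, lambda r: r.random(), 0.5, rng)
mu_t = empirical_measure(final)
print(jumps, len(mu_t))  # number of transitions and number of distinct types at t = 0.5
```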


Convergence and stationarity

Neutral diffusion model

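Define

$$\varphi_m(\mu) = \langle f, \mu^{(m)} \rangle, \qquad \mu^{(m)} = \frac{(n - m)!}{n!} \sum_{1 \le i_1 \ne \dots \ne i_m \le n} \delta_{(x_{i_1}, \dots, x_{i_m})}.$$

Then $A_n\varphi_m(\mu) = \langle A_n f, \mu^{(m)} \rangle$ is the generator of the measure-valued process and

$$\| A_n\varphi_m(\mu) - A_0\phi_m(\mu) \| \underset{n \to \infty}{\longrightarrow} 0, \qquad \phi_m(\mu) = \langle f, \mu^m \rangle,$$

where $A_0$ is the generator of a FV process. Since the linear span of functions $\phi_m$ is a core for $A_0$ in $C(\mathcal{P}(\mathbb{X}))$, and both $A_n$ and $A_0$ generate strongly continuous contraction semigroups, this implies $\{\mu_n(t)\} \Rightarrow \{\mu_\infty(t)\}$ in $D_{\mathcal{P}(\mathbb{X})}[0, \infty)$, where $\{\mu_\infty(t)\}$ is a FV process.

From de Finetti's theorem, w.p. 1 we have $\mu_n(t) \Rightarrow \mu_\infty(t)$ for every $t$, and $\mu_\infty(t) \sim \mathcal{D}_{\theta\nu_0}$. From the well-posedness of the martingale problem for $A_0$, it follows that the stationary distribution of $\{\mu_\infty(t)\}$ is a Dirichlet process $\mathcal{D}_{\theta\nu_0}$.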


A generalised Polya urn scheme

Consider the exchangeable law

$$Q^{\alpha, \beta_n}_{X_1, \dots, X_n}(dx_1, \dots, dx_n) \propto P^{\alpha}_{X_1, \dots, X_n}(dx_1, \dots, dx_n)\, \beta_n(x_1) \cdots \beta_n(x_n) \qquad (2)$$

where we assume $\beta_n \in B(\mathbb{X})$ for all $n$.

Remark

It can be shown that $Q^{\alpha, \beta_n}_{X_1, \dots, X_n}$ admits a representation in terms of a Dirichlet process mixture (Lo, 1984), a model widely used for Bayesian density estimation.

From (2) the predictive law for $x_i$ is

$$Q^{\alpha, \beta_n}_{X_i | X_{(-i)}}(dx_i \mid x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \propto \theta\, \beta_n(x_i)\, \nu_0(dx_i) + \sum_{l \ne i}^{n} \beta_n(x_l)\, \delta_{x_l}(dx_i)$$

and it is clear that for $\beta_n(x) \equiv 1$ it reduces to the Blackwell-MacQueen case

$$Q^{\alpha, 1}_{X_1, \dots, X_n}(dx_1, \dots, dx_n) \propto P^{\alpha}_{X_1, \dots, X_n}(dx_1, \dots, dx_n) = \prod_{i=1}^{n} \frac{\theta\nu_0(dx_i) + \sum_{l < i} \delta_{x_l}(dx_i)}{\theta + i - 1}.$$


Gibbs sampling again

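Similarly to the neutral case, define a Markov chain $\{\mathbf{X}_n(k)\}_{k \ge 1}$ such that at each transition

• $x_i$ is removed from $\mathbf{x}_n = (x_1, \dots, x_n)$ with probability $1/n$;

• a replacement is sampled from the generalized Blackwell-MacQueen predictive

$$Q^{\alpha, \beta_n}_{X_i | X_{(-i)}}(dx_i \mid x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \propto \theta\, \beta_n(x_i)\, \nu_0(dx_i) + \sum_{l \ne i}^{n} \beta_n(x_l)\, \delta_{x_l}(dx_i).$$

This produces a chain reversible with respect to $Q^{\alpha, \beta_n}_{X_1, \dots, X_n}(dx_1, \dots, dx_n)$. Embed it in continuous time in $D_{\mathbb{X}^n}[0, \infty)$ with $\mathrm{Exp}(\lambda_{n,i})$ sojourn times such that

$$\lambda_{n,i} = \frac{n}{2} \Big( \theta \int \beta_n(u)\, \nu_0(du) + \sum_{l \ne i} \beta_n(x_l) \Big)$$

and note that $\beta_n \equiv 1 \Rightarrow \lambda_{n,i} = n(\theta + n - 1)/2$.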
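A hedged Python sketch of the weighted update follows. It assumes $\nu_0$ is Uniform(0, 1), $\sigma(x) = x$ and $\beta_n(x) = 1 + 2\sigma(x)/n$, and it samples the mutation term, whose law is proportional to $\beta_n\,\nu_0$, by rejection using an upper bound `beta_max`. The names `weighted_urn_step` and `jump_rate` and the arguments `beta_mean` (standing for $\int \beta_n\, d\nu_0$) and `beta_max` are hypothetical helpers, and the rate uses the prefactor $n/2$ so that $\beta_n \equiv 1$ returns $n(\theta + n - 1)/2$.

```python
import random

def jump_rate(xs, i, theta, beta, beta_mean):
    """lambda_{n,i}, written so that beta = 1 gives n * (theta + n - 1) / 2."""
    n = len(xs)
    others_mass = sum(beta(x) for k, x in enumerate(xs) if k != i)
    return 0.5 * n * (theta * beta_mean + others_mass)

def weighted_urn_step(xs, theta, beta, beta_mean, beta_max, base_sampler, rng):
    """One update from the generalized Blackwell-MacQueen predictive."""
    n = len(xs)
    i = rng.randrange(n)                       # coordinate removed w.p. 1/n
    others = xs[:i] + xs[i + 1:]
    old_weights = [beta(x) for x in others]    # mass beta(x_l) on each delta_{x_l}
    new_mass = theta * beta_mean               # total mass of theta * beta * nu_0
    if rng.random() * (new_mass + sum(old_weights)) < new_mass:
        while True:                            # rejection sampling from beta * nu_0
            y = base_sampler(rng)
            if rng.random() * beta_max <= beta(y):
                break
        new = y
    else:
        new = rng.choices(others, weights=old_weights, k=1)[0]
    return xs[:i] + [new] + xs[i + 1:]

rng = random.Random(5)
n, theta = 10, 1.0
beta = lambda x: 1.0 + 2.0 * x / n             # assumed sigma(x) = x on [0, 1]
state = [rng.random() for _ in range(n)]
next_state = weighted_urn_step(state, theta, beta, 1.0 + 1.0 / n, 1.0 + 2.0 / n,
                               lambda r: r.random(), rng)
neutral_rate = jump_rate(state, 0, theta, lambda x: 1.0, 1.0)
print(neutral_rate)  # 50.0, i.e. n * (theta + n - 1) / 2 for n = 10, theta = 1
```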

Remark


Convergence

Fleming-Viot process with selection

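When the weights in $Q^{\alpha, \beta_n}_{X_1, \dots, X_n}$ have the form $\beta_n(x) = 1 + \frac{2}{n}\sigma(x)$, with $\sigma \in B(\mathbb{X})$, the particle process has generator

$$A_n^{\sigma} f(x) = \sum_{i=1}^{n} \frac{1}{2}\theta \int \big[ f(\eta_i(x|y)) - f(x) \big] \Big\{ 1 + \frac{2}{n}\sigma(y) \Big\}\, \nu_0(dy) + \frac{1}{2} \sum_{1 \le k \ne i \le n} \big[ f(\eta_i(x|x_k)) - f(x) \big] + \frac{1}{n} \sum_{1 \le k \ne i \le n} \sigma(x_k) \big[ f(\eta_i(x|x_k)) - f(x) \big].$$

Remark

$\sigma$ represents the fitness of the offspring, acting as fertility selection.

Since $\langle A_n^{\sigma} f, \mu^{(m)} \rangle \to A_\sigma\phi_m(\mu)$ strongly, it can be shown that the process of empirical measures converges in distribution in $D_{\mathcal{P}(\mathbb{X})}[0, \infty)$ to the FV process with fertility selection:

$$\Big\{ \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i(t)},\ t \ge 0 \Big\} \underset{n \to \infty}{\Longrightarrow} \{ \mu^{\sigma}_{\infty}(t),\ t \ge 0 \} \quad \text{in } D_{\mathcal{P}(\mathbb{X})}[0, \infty)$$

where $\mu^{\sigma}$ …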


Diploid case

For a diploid population, take a bivariate selection function $\beta_n(x, y) \in B_{sym}(\mathbb{X}^2)$ and consider the law, joint with pairings $P_n$,

$$Q^{\alpha, \beta_n}_{X_1, \dots, X_n, P_n}(dx_1, \dots, dx_n, P_n) \propto P^{\alpha}_{X_1, \dots, X_n}(dx_1, \dots, dx_n) \prod_{k} \beta_n(x_k, x_{j_k}).$$

With appropriate modifications, the same procedure leads to a generalized urn scheme with conditional law proportional to

$$\theta \sum_{j \ne i}^{n} \beta_n(x_i, x_j)\, \nu_0(dx_i) + \sum_{k \ne i}^{n} \sum_{j \ne i}^{n} \beta_n(x_i, x_j)\, \delta_{x_k}(dx_i)$$

and a particle process with Poisson rate

$$\lambda_{n,i} = \frac{n}{2} \Big( \theta \int \sum_{j \ne i}^{n} \beta_n(x_i, x_j)\, \nu_0(dx_i) + \sum_{k \ne i}^{n} \sum_{j \ne i}^{n} \beta_n(x_k, x_j) \Big)$$

whose process of empirical measures converges to a FV process with diploid selection. When $\beta_n(x) = \int \beta_n(x, y)\, \mu(dy)$ we recover the haploid case. When $\beta_n(x, y) \equiv 1$ we recover $n(\theta + n - 1)/2$ and $\theta\nu_0(dx_i) + \sum_{k \ne i}^{n} \delta_{x_k}(dx_i)$.


Stationarity

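We exploit the representation of $Q^{\alpha, \beta_n}_{X_1, \dots, X_n}(dx_1, \dots, dx_n)$ in terms of a Dirichlet process mixture model. Given

$$z_i \mid x_i \overset{ind}{\sim} K_n(\cdot \mid x_i), \qquad x_i \mid \mu \overset{iid}{\sim} \mu, \qquad \mu \sim \mathcal{D}_{\theta\nu_0}$$

so that

$$\mathcal{L}(X_1, \dots, X_n \mid z_1, \dots, z_n) \propto K_n(\cdot \mid x_1) \cdots K_n(\cdot \mid x_n)\, P^{\alpha}_{X_1, \dots, X_n}.$$

Assuming $K_n(1 \mid x_i) = \beta_n(x_i)$, we have that $Q^{\alpha, \beta_n}_{X_1, \dots, X_n}$ is the stationary distribution of $(\mathbf{x} \mid \mathbf{z}_n = 1)$. Consider the Gibbs sampler extended to $(x_1, \dots, x_n, \mu \mid \mathbf{z}_n = 1)$, alternating updates to

$$(x_1, \dots, x_n \mid \mu, \mathbf{z}_n = 1) \quad \text{and} \quad (\mu \mid x_1, \dots, x_n, \mathbf{z}_n = 1).$$

Hence $(\mu \mid \mathbf{z}_n = 1)$ is a MV chain with stationary distribution $\mathcal{L}(\mu \mid \mathbf{z}_n = 1)$. From Bayes' theorem we have

$$\mathcal{L}(\mu \mid \mathbf{z}_n = 1) \propto \mathcal{L}(\mathbf{z}_n = 1 \mid \mu)\, \mathcal{D}_{\theta\nu_0}(d\mu) \propto \Big[ \int \beta_n(y)\, \mu(dy) \Big]^n \mathcal{D}_{\theta\nu_0}(d\mu) = \Pi_n(d\mu).$$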


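The limit of $\Pi_n$ will be the de Finetti measure of the sequence $(x_1, x_2, \dots \mid \mathbf{z}_\infty = 1)$, since

$$(x_1, \dots, x_n \mid \mu, \mathbf{z}_n = 1) \overset{iid}{\sim} \mu, \qquad \mu \sim \Pi_n$$

from which $(x_1, \dots, x_n \mid \mathbf{z}_n = 1) \sim Q^{\alpha, \beta_n}_{X_1, \dots, X_n}$ implies

$$\frac{1}{n} \sum_{i=1}^{n} \delta_{x_i} \Longrightarrow \mu \quad \text{a.s.}, \qquad \mu \sim \Pi_\infty \ \text{(if it exists)}. \qquad (3)$$

When $\mathbb{X}$ is compact, $\mathcal{P}(\mathbb{X})$ (with the topology of weak convergence) is compact, hence $\{\Pi_n\}$ is tight, and $\Pi_\infty$ is well defined. If $\beta_n(x) = 1 + \frac{2}{n}\sigma(x)$ we have

$$\Pi_\infty(d\mu) \propto \lim_{n} \Big[ 1 + \frac{2}{n} \int \sigma(y)\, \mu(dy) \Big]^n \mathcal{D}_{\theta\nu_0}(d\mu) \propto e^{2\int \sigma\, d\mu}\, \mathcal{D}_{\theta\nu_0}(d\mu).$$

Since the martingale problem for $A_\sigma$ is well-posed, (3) is enough to conclude that $\Pi_\infty$ is the stationary distribution of the FV process with selection.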
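The limit behind the exponential tilting can be checked numerically. The short Python sketch below fixes a value `s` standing for $\int \sigma\, d\mu$ (the value 0.3 is an arbitrary assumption) and compares $[1 + 2s/n]^n$ with $e^{2s}$ for increasing $n$; the name `tilt_factor` is illustrative.

```python
import math

def tilt_factor(n, s):
    """Finite-n weight [1 + (2/n) * s]^n, with s playing the role of the integral of sigma."""
    return (1.0 + 2.0 * s / n) ** n

s = 0.3  # assumed value of the integral of sigma against mu
approximations = {n: tilt_factor(n, s) for n in (10, 100, 10_000, 1_000_000)}
limit = math.exp(2.0 * s)  # e^{2 <sigma, mu>}

for n, value in approximations.items():
    print(n, value, abs(value - limit))  # the gap to the limit shrinks as n grows
```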


References

Antoniak, C. E. (1974). Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Ann. Statist. 2.

Blackwell, D. and MacQueen, J. B. (1973). Ferguson distributions via Polya urn schemes. Ann. Statist. 1.

Dawson, D. A. and Greven, A. (1999). Hierarchically interacting Fleming-Viot processes with selection and mutation: multiple space time scale analysis and quasi-equilibria. Electron. J. Probab. 4.

Dawson, D. A., Greven, A. and Vaillancourt, J. (1995). Equilibria and quasi-equilibria for infinite collections of interacting Fleming-Viot processes. Trans. Amer. Math. Soc. 347.

Donnelly, P. and Kurtz, T. G. (1996). A countable representation of the Fleming-Viot measure-valued diffusion. Ann. Probab. 24.

Donnelly, P. and Kurtz, T. G. (1999). Genealogical processes for Fleming-Viot models with selection and recombination. Ann. Appl. Probab. 9.

Ethier, S. N. and Griffiths, R. C. (1993). The transition function of a Fleming-Viot process. Ann. Probab. 21.

Ethier, S. N. and Kurtz, T. G. (1986). Markov processes: characterization and convergence. Wiley.

Ethier, S. N. and Kurtz, T. G. (1994). Convergence to the Fleming-Viot process in the weak atomic topology. Stoch. Proc. Appl. 54.

Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1.

Fleming, W. H. and Viot, M. (1979). Some measure-valued processes in population genetics theory. Indiana University Mathematics J. 28.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Patt. Anal. Mach. Intelligence 6.

Lo, A. Y. (1984). On a class of Bayesian nonparametric estimates I: density estimates. Ann. Statist. 12.

Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statist. Sinica 4.
