
7. ARES Joint inference of the three dimensional density field and its power-spectrum

7.3. The Large scale structure Gibbs sampler

The aim of the Bayesian analysis method presented here is to map out the joint posterior distribution of the three dimensional density field and the corresponding matter power-spectrum. In principle, this task could be solved by dividing the parameter space into an equidistant grid and evaluating the posterior at each grid node. However, since the number of grid points required for such an analysis grows exponentially with the number of free parameters, this approach cannot be realized efficiently. In such situations Markov Chain Monte Carlo (MCMC) methods generally outperform any known deterministic approach to this problem (see e.g. Neal 1993, Andrieu et al. 2003, Robert & Casella 2005). The basic idea of MCMC methods is to approximate the target probability distribution $P(\xi)$, defined on a high dimensional space $\chi$, by a set of $N$ independent and identically distributed (i.i.d.) samples $\{\xi^{(i)}\}_{i=1}^{N}$ drawn from the target distribution (Andrieu et al. 2003). Here, the Markov chain, usually realized as a discrete random process, efficiently generates samples $\xi^{(i)}$ while exploring the state space $\chi$. Given such a set of samples it is possible to approximate the target posterior with the following point particle distribution:

\[ P_N(\xi) = \frac{1}{N} \sum_{i=1}^{N} \delta_D\left(\xi - \xi^{(i)}\right), \tag{7.6} \]


with $\delta_D(x)$ being the Dirac delta distribution. Given such a representation of the target density, any statistical summary can be approximated with tractable sums, rather than integrals, as:

\[ I_N(f) = \frac{1}{N} \sum_{i=1}^{N} f\left(\xi^{(i)}\right) \xrightarrow[N \to \infty]{\mathrm{a.s.}} I(f) = \int_{\chi} f(\xi)\, P(\xi)\, \mathrm{d}\xi . \tag{7.7} \]
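As a small illustration of Eq. (7.7), the following sketch replaces the integral $I(f)$ by a sample mean. It is only a toy: the target distribution is assumed to be a one dimensional standard normal, the samples are drawn directly rather than by a Markov chain, and the names `monte_carlo_estimate` and `draw_standard_normal` are hypothetical helpers introduced here.

```python
import random


def monte_carlo_estimate(f, samples):
    """Sample mean I_N(f) = (1/N) * sum_i f(xi_i), cf. Eq. (7.7)."""
    return sum(f(x) for x in samples) / len(samples)


def draw_standard_normal(n, seed=0):
    """Draw n i.i.d. samples from the assumed toy target, a standard normal."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


samples = draw_standard_normal(100_000)

# For a standard normal the exact second moment is I(f) = 1 with f(xi) = xi^2.
second_moment = monte_carlo_estimate(lambda x: x * x, samples)
print(f"I_N(f) = {second_moment:.3f}  (exact value: 1)")
```

For i.i.d. samples the statistical error of such an estimate decreases as $1/\sqrt{N}$, independently of the dimension of $\chi$, which is what makes sums of this kind tractable where grid-based integration is not.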

The estimate $I_N(f)$ is therefore unbiased and will almost surely (a.s.) converge to $I(f)$ by the Strong Law of Large Numbers (Andrieu et al. 2003, Robert & Casella 2005).

The Gibbs sampler, as proposed by Geman & Geman (1984) and Gelfand & Smith (1990), is a special case of the MCMC algorithm. Its importance to Bayesian data analysis arises from the fact that it allows for very efficient parameter space exploration if the problem under consideration can be formulated accordingly. The Gibbs sampler is generally applicable to problems where the joint distribution $P(\xi_1, \xi_2, \ldots, \xi_M)$ of the parameters is not known explicitly, but the conditional distribution $P(\xi_q \,|\, \{\xi_p\} : p \neq q)$ of each individual variable $\xi_q$ is known (Neal 1993). In this case, the theory of Gibbs sampling (Gelfand & Smith 1990, Tanner 1996, O'Hagan 2000) states that iterative random draws from the individual conditional distributions will yield samples from the joint target distribution. More explicitly, the iteration of the following random draws:

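1) $\xi_1^{(j+1)} \curvearrowleft P(\xi_1 \,|\, \xi_2^{(j)}, \ldots, \xi_M^{(j)})$
2) $\xi_2^{(j+1)} \curvearrowleft P(\xi_2 \,|\, \xi_1^{(j+1)}, \xi_3^{(j)}, \ldots, \xi_M^{(j)})$
...
M) $\xi_M^{(j+1)} \curvearrowleft P(\xi_M \,|\, \xi_1^{(j+1)}, \xi_2^{(j+1)}, \ldots, \xi_{M-1}^{(j+1)})$,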

will yield samples from the joint multivariate density $P(\xi_1, \xi_2, \ldots, \xi_M)$. Since the Gibbs sampler is by definition a homogeneous Markov chain and is also ergodic, the sequence of samples provably converges to samples from the desired target distribution (Neal 1993). For a more detailed introduction to MCMC methods and the Gibbs sampler the reader is referred to the literature (see e.g. Kemeny & Snell 1960, Hastings 1970, Neal 1993, Andrieu et al. 2003, Robert & Casella 2005, Gamerman & Lopes 2006).
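The alternating conditional draws described above can be sketched for the simplest non-trivial case, $M = 2$. The example below is a minimal sketch under stated assumptions: the target is assumed to be a zero-mean, unit-variance bivariate Gaussian with correlation `rho`, for which both conditionals are the Gaussians $P(x \,|\, y) = \mathcal{N}(\rho y, 1 - \rho^2)$ and $P(y \,|\, x) = \mathcal{N}(\rho x, 1 - \rho^2)$. The function name `gibbs_bivariate_normal` and the burn-in length are illustrative choices, not taken from the text.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=1):
    """Gibbs-sample a zero-mean, unit-variance bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    cond_std = (1.0 - rho * rho) ** 0.5  # std of x given y (and of y given x)
    x, y = 0.0, 0.0                      # arbitrary starting point
    chain = []
    for j in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_std)  # step 1: x^(j+1) drawn from P(x | y^(j))
        y = rng.gauss(rho * x, cond_std)  # step 2: y^(j+1) drawn from P(y | x^(j+1))
        if j >= burn_in:                  # discard the initial transient
            chain.append((x, y))
    return chain


chain = gibbs_bivariate_normal(rho=0.8, n_samples=50_000)
mean_x = sum(x for x, _ in chain) / len(chain)
mean_xy = sum(x * y for x, y in chain) / len(chain)
print(f"<x> = {mean_x:.3f}, <xy> = {mean_xy:.3f}  (target values: 0 and 0.8)")
```

Note that successive Gibbs samples are correlated, so the effective number of independent samples is smaller than the chain length. The correlation grows as `rho` approaches one.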

7.3.2. Joint power-spectrum and density field inference

The advantages of Gibbs sampling procedures for Bayesian inference of cosmological density fields and power-spectra have previously been demonstrated in the case of the CMB (see e.g. Wandelt et al. 2004, Wandelt 2004, Eriksen et al. 2004, Jewell et al. 2004). Here we extend their approach to three dimensional galaxy surveys. The purpose of the Gibbs sampler in our approach is to efficiently map out the joint posterior distribution $P(\{P(k_i)\}, \{s_i\} \,|\, \{d_i\})$ of the power-spectrum coefficients $P(k_i)$ and the 3D matter density contrast amplitudes $s_i$ given a set of observations $\{d_i\}$ via an MCMC method. The corresponding two step Gibbs sampling procedure can therefore be written as follows:

1) $\{s_i\}^{(j+1)} \curvearrowleft P(\{s_i\} \,|\, \{P(k_i)\}^{(j)}, \{d_i\})$
2) $\{P(k_i)\}^{(j+1)} \curvearrowleft P(\{P(k_i)\} \,|\, \{s_i\}^{(j+1)}, \{d_i\})$, (7.8)

where $P(\{s_i\} \,|\, \{P(k_i)\}^{(j)}, \{d_i\})$ is the conditional probability of the three dimensional density field given a power-spectrum and the data, and $P(\{P(k_i)\} \,|\, \{s_i\}^{(j+1)}, \{d_i\})$ is the conditional probability of the power-spectrum given a previously sampled three dimensional density field and the data. Iteration of the Gibbs sampling steps (7.8) will yield a set of MCMC samples and hence an approximation of the joint posterior:

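\[ P_{N_{\mathrm{Gibbs}}}(\{P(k_i)\}, \{s_i\} \,|\, \{d_i\}) = \frac{1}{N_{\mathrm{Gibbs}}} \sum_{k=1}^{N_{\mathrm{Gibbs}}} \delta_D\left(\{s_i\} - \{s_i\}^{(k)}\right) \delta_D\left(\{P(k_i)\} - \{P(k_i)\}^{(k)}\right), \tag{7.9} \]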

where $N_{\mathrm{Gibbs}}$ is the number of Gibbs samples. The overall Gibbs sampling procedure is depicted in Fig. 7.1. According to the Gibbs sampling procedure (7.8), we first generate a density field realization from the Wiener posterior via the procedure described in section 7.4. Given such a density field sample, a power-spectrum sample is drawn in the second step from an inverse gamma distribution, as discussed in section 7.5.1. Iteration of these sampling steps yields a sampled representation of the joint posterior, based on which any desired statistical summary can be reported. Further, it allows for estimating the normalization factors required for Bayesian model comparisons and the calculation of odds factors. In particular, it is possible to provide an analytic description of the full power-spectrum posterior $P(\{P(k_i)\} \,|\, \{d_i\})$ by marginalizing over the three dimensional density field samples. As will be demonstrated in section 7.5.2, such a procedure yields a Blackwell-Rao estimate for the power-spectrum posterior. Also note that additional sampling steps, such as the joint inference of biases or peculiar velocities, can be added to the Gibbs sampling procedure (7.8), allowing for a fully global analysis of all these quantities.
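To make the structure of the two step procedure (7.8) concrete, the following toy sketch alternates a Wiener-posterior draw of the signal with an inverse gamma draw of the power. It is not the ARES implementation and rests on strong simplifying assumptions that are not part of the text: a single power-spectrum bin with amplitude `power`, `n` independent real modes $d_i = s_i + n_i$, white Gaussian noise of known variance `noise_var`, no survey mask or selection function, and a Jeffreys prior $\propto 1/P$ on the power. All function and variable names are hypothetical.

```python
import random


def toy_gibbs(data, noise_var, n_iter, seed=2):
    """Two-step Gibbs sampler for a toy model with one power-spectrum bin."""
    rng = random.Random(seed)
    n = len(data)
    power = 1.0  # arbitrary initial guess for the power
    power_chain = []
    for _ in range(n_iter):
        # Step 1: draw the signal from the Wiener posterior P(s | power, d).
        # For diagonal signal and noise covariances each mode is independent,
        # with mean weight * d_i and variance weight * noise_var.
        weight = power / (power + noise_var)
        post_std = (weight * noise_var) ** 0.5
        signal = [rng.gauss(weight * d, post_std) for d in data]

        # Step 2: draw the power from P(power | s), an inverse gamma
        # distribution with shape n/2 and scale sum(s^2)/2 (assumed Jeffreys prior).
        shape = 0.5 * n
        scale = 0.5 * sum(s * s for s in signal)
        power = scale / rng.gammavariate(shape, 1.0)
        power_chain.append(power)
    return power_chain


# Mock data: true power 4, noise variance 1, so each d_i has variance 5.
true_power, noise_var, n_modes = 4.0, 1.0, 2000
data_rng = random.Random(3)
data = [data_rng.gauss(0.0, (true_power + noise_var) ** 0.5) for _ in range(n_modes)]

power_chain = toy_gibbs(data, noise_var, n_iter=300)
kept = power_chain[100:]  # discard burn-in samples
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean power = {posterior_mean:.2f}  (input value: {true_power})")
```

In this toy setting the power chain plays the role of the samples $\{P(k_i)\}^{(k)}$ in Eq. (7.9). Further conditional draws, for instance for a bias parameter, would enter as additional steps inside the same loop.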