Approximate Bayesian Computation - New Statistical Techniques

2.4 New Statistical Techniques

2.4.2 Approximate Bayesian Computation

Approximate Bayesian Computation (ABC) offers a “likelihood-free” means of sampling from the posterior distribution when the likelihood is intractable. This alternative approach to parameter inference was first introduced in Rubin (1984), and the algorithm and the name ABC were established in Beaumont et al. (2008).

Figure 2.2: Sample Bayesian hierarchical networks for the SN Ia cosmology problem, reproduced from March et al. (2011, top) and Rubin et al. (2015, bottom). Solid lines indicate probabilistic connections; dashed lines indicate deterministic connections. The UNITY model builds on that of March et al. (2011) and includes the same set of cosmological and standardization parameters and hyperparameters as a subset of their larger

The goal of the ABC algorithm is to simulate samples directly from the posterior distribution p(θ|D) without assuming a particular form for the likelihood. At each proposed point in parameter space θ∗, a simulation of the data is drawn, i.e., D∗ ∼ f(D|θ∗). Sampling the posterior by forward-modeling the data allows for the inclusion of complicated systematics and other survey-specific effects that are not trivial to include in standard χ2-minimization or other likelihood-based techniques (e.g., BHM). The simulated data set is then compared to the data by way of a metric ρ. Simulations that are “close” to the data are accepted, while others are rejected. The criterion for acceptance is determined by a tolerance threshold ε, which is initially large but is decreased at each step as the simulated distribution converges on the true distribution. Proposed parameters θ∗ are accepted if

ρ(D∗ − D) < ε. (2.17)

This form of “rejection sampling” is the most common implementation of the ABC algorithm. The process of adapting the threshold to ensure reasonable acceptance rates and proper convergence is known as Sequential Monte Carlo (SMC). SMC will produce samples from p(θ | ρ(D∗ − D) < ε), which will approximate the posterior if ε is small.
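The accept/reject step of Eq. 2.17 can be sketched on a toy problem. The example below is an illustration only and is not taken from any of the cited analyses: the Gaussian data model, the uniform prior, the mean-based distance, and all function names are assumptions chosen to keep the code short. It runs independent rejection samplers at three fixed tolerances rather than a full SMC scheme, to show how the accepted samples tighten as ε shrinks.

```python
import random
import statistics

def abc_rejection(observed, prior_sample, simulate, distance, epsilon, n_accept, rng):
    """Collect n_accept proposals whose simulated data fall within epsilon of the data."""
    accepted = []
    while len(accepted) < n_accept:
        theta_star = prior_sample(rng)                # propose theta* from the prior
        simulated = simulate(theta_star, rng)         # draw D* ~ f(D | theta*)
        if distance(simulated, observed) < epsilon:   # acceptance test of Eq. 2.17
            accepted.append(theta_star)
    return accepted

# Toy setup (hypothetical): infer the mean of a unit-variance Gaussian.
N_OBS = 50
rng = random.Random(0)
observed = [rng.gauss(1.5, 1.0) for _ in range(N_OBS)]

def prior_sample(rng):
    # Flat prior on the mean, an arbitrary choice for this sketch.
    return rng.uniform(-5.0, 5.0)

def simulate(theta, rng):
    # Forward model: a data set of the same size as the observations.
    return [rng.gauss(theta, 1.0) for _ in range(N_OBS)]

def distance(simulated, observed):
    # Metric rho: absolute difference of sample means (an assumed choice).
    return abs(statistics.fmean(simulated) - statistics.fmean(observed))

# Independent rejection runs at decreasing tolerance (not SMC: no particle reuse).
spreads = {}
for epsilon in (1.0, 0.3, 0.1):
    samples = abc_rejection(observed, prior_sample, simulate, distance,
                            epsilon, 200, rng)
    spreads[epsilon] = statistics.stdev(samples)
    print(f"epsilon={epsilon}: mean={statistics.fmean(samples):.3f}, "
          f"std={spreads[epsilon]:.3f}")
```

The cost of a smaller tolerance is a lower acceptance rate, which is the inefficiency that adaptive schemes such as SMC are meant to reduce.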

High-dimensional data can reduce the acceptance rate and efficiency of the ABC algorithm. In some instances, it may be simpler to use a lower-dimensional summary statistic of the data, e.g., a sample mean or variance. Summary statistics used in this way should be sufficient statistics, meaning that all of the information about the parameters contained in the data is also contained in the summary statistic. Using sufficient statistics ensures that we have not reduced our ability to constrain the parameters of interest.
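As a worked illustration of sufficiency (a standard textbook case, not drawn from the cited works), consider n independent Gaussian measurements x_i with unknown mean θ and known variance σ². The symbols x_i, x̄, h, and g below are introduced only for this example.

```latex
% Likelihood of D = {x_1, ..., x_n}, with x_i ~ N(theta, sigma^2) and sigma known
p(D \mid \theta) = (2\pi\sigma^2)^{-n/2}
  \exp\!\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\theta)^2\right]

% Split the sum of squares about the sample mean \bar{x} = \frac{1}{n}\sum_i x_i
\sum_{i=1}^{n}(x_i-\theta)^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\theta)^2

% The likelihood factorizes into a theta-independent piece h(D) and a piece g
% that depends on the data only through \bar{x}
p(D \mid \theta) =
  \underbrace{(2\pi\sigma^2)^{-n/2}
  \exp\!\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\bar{x})^2\right]}_{h(D)}
  \;\underbrace{\exp\!\left[-\frac{n(\bar{x}-\theta)^2}{2\sigma^2}\right]}_{g(\bar{x},\,\theta)}

% Hence the posterior depends on the data only through the sample mean
p(\theta \mid D) = p(\theta \mid \bar{x})
```

In this case, comparing simulated and observed sample means in the ABC metric loses no information about θ. For realistic survey data an exactly sufficient summary is rarely available, so the choice of summary statistic is part of the approximation.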

2.4.2.1 ABC Parameter Inference with SNe Ia

The development of sophisticated supernova light-curve simulation software, such as the SuperNova ANAlysis package (SNANA; Kessler et al., 2009b), offers an excellent opportunity for ABC SN Ia cosmology analyses. Such analyses have only recently been explored, in works such as Weyant et al. (2013) and Jennings et al. (2016).

Weyant et al. (2013) use the SNANA suite to simulate SNe Ia from the SDSS-SNS (Section 1.3.1) and apply their algorithm to data used in the SDSS-SNS first year cosmology analysis (Kessler et al., 2009a). They choose to fit the simulated light curves with MLCS2k2 (Section 1.2.3) and use the difference between the observed distance modulus and simulated distance modulus as their metric. To evaluate their metric, they smooth the distance moduli as a function of redshift using nonparametric linear regression (loess; Cleveland et al., 1992) and take the difference between the theoretical and observed values at the observed redshifts. They define ρ as the median absolute difference between the smoothed curves.
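A metric of this kind can be sketched as follows. This is a simplified stand-in, not the Weyant et al. (2013) implementation: `loess_smooth` is a basic local linear regression with tricube weights written for this example, the `frac` bandwidth parameter is an assumed choice, and the distance moduli are toy values rather than light-curve fit results.

```python
import math
import statistics

def loess_smooth(x, y, x_eval, frac=0.5):
    """Local linear regression with tricube weights (a simple loess-style smoother)."""
    n = len(x)
    k = min(n, max(3, math.ceil(frac * n)))    # neighbours used in each local fit
    smoothed = []
    for x0 in x_eval:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12              # bandwidth: distance to k-th neighbour
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        xm = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
        ym = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
        sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(ym + slope * (x0 - xm))          # local line evaluated at x0
    return smoothed

def rho_median_abs(z_obs, mu_obs, z_sim, mu_sim, frac=0.5):
    """Median absolute difference of the two smoothed curves at the observed redshifts."""
    s_obs = loess_smooth(z_obs, mu_obs, z_obs, frac)
    s_sim = loess_smooth(z_sim, mu_sim, z_obs, frac)
    return statistics.median(abs(a - b) for a, b in zip(s_obs, s_sim))

# Toy distance moduli (hypothetical values, not survey data).
z = [0.05 + 0.01 * i for i in range(36)]
mu_obs = [5.0 * math.log10(zi) + 43.0 for zi in z]
mu_sim = [m + 0.1 for m in mu_obs]             # simulation offset by 0.1 mag

rho_same = rho_median_abs(z, mu_obs, z, mu_obs)
rho_shift = rho_median_abs(z, mu_obs, z, mu_sim)
print(rho_same, rho_shift)
```

Because the median is used, a small number of badly matched redshift regions has limited influence on ρ, which is one plausible reason to prefer it over a mean in this setting.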

Figure 2.3 compares the uncertainty regions in the inference of w₀ and Ωm using the Weyant et al. (2013) ABC framework and the χ2-minimization analysis described in Kessler et al. (2009a). As the figure shows, the ABC inference recovers an uncertainty region roughly equivalent to that of the χ2-minimization treatment, even when incorporating a complex forward-model simulation of the data.

Figure 2.3: Weyant et al. (2013) comparison of uncertainty regions in the w−Ωm parameter space using ABC and the χ2 method as described in Kessler et al. (2009a).

Jennings et al. (2016) propose alternative ABC metrics using light-curve flux measurements and the SALT2 light-curve fit parameters (Section 1.2.3), applied to a set of SNANA-simulated SNe Ia light curves from the Dark Energy Survey Supernova Program (Kessler et al., 2015). Their analysis includes parameter inference with two distinct metrics, both with and without systematic uncertainties included as parameters in the model. In the “Tripp Metric,” the difference between the observed and theoretical distance moduli is computed for the sets of simulated and observed SNe:

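$$
\Delta_{\rm data} = \frac{1}{N_{\rm data}} \sum_{i}^{N_{\rm data}}
\frac{\left[\mu(z_i^{\rm data}, \theta^*) - \left(m_{b,i}^{\rm data} + \alpha^* x_{1,i}^{\rm data} - \beta^* c_i^{\rm data} - M_0 - \delta M_0^*\right)\right]^2}
{\sigma_{m_b,i}^2 + (\alpha^* \sigma_{x_1,i})^2 + (\beta^* \sigma_{c,i})^2 + \sigma_{\rm int}^2}
$$

$$
\Delta_{\rm sim} = \frac{1}{N_{\rm sim}} \sum_{j}^{N_{\rm sim}}
\frac{\left[\mu(z_j^{\rm sim}, \theta^*) - \left(m_{b,j}^{\rm sim} + \alpha^* x_{1,j}^{\rm sim} - \beta^* c_j^{\rm sim} - M_0 - \delta M_0^*\right)\right]^2}
{\sigma_{m_b,j}^2 + (\alpha^* \sigma_{x_1,j})^2 + (\beta^* \sigma_{c,j})^2 + \sigma_{\rm int}^2}
$$

and the metric is defined as the difference between the two offsets:

$$
\rho_{\rm Tripp} = \left|\Delta_{\rm data} - \Delta_{\rm sim}\right|. \tag{2.18}
$$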
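The two offsets and their absolute difference can be computed as in the sketch below. It is not the Jennings et al. (2016) code: the function names, the dictionary layout, and the numerical values are hypothetical, and the model distance moduli µ(z, θ∗) are assumed to be precomputed for the proposed cosmology and passed in as `mu_model`.

```python
def tripp_offset(mu_model, m_b, x1, c, sig_mb, sig_x1, sig_c,
                 alpha, beta, M0, delta_M0, sigma_int):
    """Mean squared, variance-weighted residual between model and Tripp distance moduli."""
    total = 0.0
    for mu, m, x, col, sm, sx, sc in zip(mu_model, m_b, x1, c, sig_mb, sig_x1, sig_c):
        mu_tripp = m + alpha * x - beta * col - M0 - delta_M0   # standardized modulus
        var = sm ** 2 + (alpha * sx) ** 2 + (beta * sc) ** 2 + sigma_int ** 2
        total += (mu - mu_tripp) ** 2 / var
    return total / len(mu_model)

def rho_tripp(data, sim, **proposed):
    """Absolute difference between the data offset and the simulation offset (Eq. 2.18)."""
    return abs(tripp_offset(**data, **proposed) - tripp_offset(**sim, **proposed))

# Hypothetical values for a single SN in each sample, for illustration only.
data = dict(mu_model=[40.0], m_b=[20.9], x1=[1.0], c=[0.1],
            sig_mb=[0.1], sig_x1=[0.5], sig_c=[0.05])
sim = dict(mu_model=[40.0], m_b=[21.0], x1=[1.0], c=[0.1],
           sig_mb=[0.1], sig_x1=[0.5], sig_c=[0.05])
# Proposed standardization parameters (the starred quantities in the equations).
proposed = dict(alpha=0.14, beta=3.1, M0=-19.3, delta_M0=0.0, sigma_int=0.1)

delta_data = tripp_offset(**data, **proposed)
rho = rho_tripp(data, sim, **proposed)
print(delta_data, rho)
```

Both offsets are evaluated with the same proposed parameters, so ρ is small only when the simulated sample scatters about the model in the same way as the observed sample.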

Rather than use the light-curve fit parameters, the “Light-Curve” metric uses the light-curve fluxes directly and compares the differences in observed fluxes in the griz bands for the simulated and observed SNe Ia. This is done by comparing the difference in fluxes to a “reference difference” distribution that accounts for sampling variance in a fixed cosmology. The metric is defined as

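$$
\rho_{\rm LC} = \sum_{j=0}^{N_{\rm bins}} \chi_j^2, \tag{2.19}
$$

where

$$
\chi_j^2 \equiv \frac{\left(O_{c_T\tilde{c},\,j} - E_{cc,\,j}\right)^2}{E_{cc,\,j}}, \tag{2.20}
$$

$O_{c_T\tilde{c}}$ is the observed distribution of flux differences, and $E_{cc}$ is the expected distribution of flux differences.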
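Equations 2.19 and 2.20 amount to a binned chi-square comparison, sketched below under stated assumptions. The helper names, the bin edges, and the flux-difference values are hypothetical, and skipping bins with zero expected counts is a choice made for this example rather than something specified in the text.

```python
def histogram(values, edges):
    """Count values falling in each half-open bin [edges[j], edges[j+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for j in range(len(counts)):
            if edges[j] <= v < edges[j + 1]:
                counts[j] += 1
                break
    return counts

def rho_lc(observed_counts, expected_counts):
    """Sum over bins of (O_j - E_j)^2 / E_j, skipping bins with E_j = 0 (assumed)."""
    return sum((o - e) ** 2 / e
               for o, e in zip(observed_counts, expected_counts) if e > 0)

# Hypothetical flux differences, binned on a common set of edges.
edges = [0.0, 1.0, 2.0, 3.0]
observed_diffs = [0.1, 0.5, 1.5, 2.5, 2.9]   # data vs. simulation at the proposed point
reference_diffs = [0.2, 0.7, 0.9, 1.1, 2.2]  # simulation vs. simulation, fixed cosmology

O = histogram(observed_diffs, edges)
E = histogram(reference_diffs, edges)
print(O, E, rho_lc(O, E))
```

In practice the reference distribution would be built from many simulated realizations, so that the expected counts per bin are large enough for the ratio to be stable.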

Figure 2.4 presents example 1σ and 2σ contour regions using the “Light-Curve” (top) and “Tripp” (bottom) metrics. Dashed lines in both plots indicate the posterior distributions when systematic uncertainties are included as parameters in the model, and the yellow star indicates the true values of the parameters used to generate the simulated data set. As Figure 2.4 shows, both ABC metrics successfully recover the input value in the 1σ uncertainty region. The 1σ posterior is narrower using the “Tripp” metric than when using the “Light-Curve” metric, yet including systematics tightens the uncertainty region using the “Light-Curve” metric. The bottom figure also includes results using traditional χ2-minimization parameter inference with MCMC (purple contours). The ABC algorithm recovers similar 1σ and 2σ uncertainty regions to those inferred with χ2-minimization while including more complicated survey-specific effects such as weather conditions and spectroscopic selection efficiency.

In document Supernova Cosmology And How To Talk About It: New Approaches To Cosmological Parameter Inference With Type Ia Supernovae And An Assessment Of The Education And Public Outreach Program Of The Dark Energy Survey (Page 73-78)