Parameter and uncertainty estimation - System modelling for Spitzer IRAC data

Chapter 2 Instrumentation and Methods

2.4 System modelling for Spitzer IRAC data

2.4.2 Parameter and uncertainty estimation

2.4.2.1 Figure of merit for parameter optimisation

The fitting of the light curve and radial velocity models to data is carried out using Bayesian methods. Under this scheme one is interested in characterising the posterior distribution - the probability distribution of models given a set of data.

For a particular model, the posterior probability is found by combining the likelihood of obtaining the dataset given the model, and the prior probability of the model:

p(model_|data)_∝p(data_|model)p(model) (2.26) Under the condition that the data uncertainties represent a Gaussian distribution and that their values are accurate [Ford, 2005], the likelihood is given as:

p(data_|model)_∝exp ₋χ

2 2 ! (2.27) χ2=X i (datai−modeli)2 σ_i2 (2.28)

χ2 is found from the model fits to all the inputted data. In this thesis these data are secondary eclipse and radial velocity datasets.

The prior probability,p(model) is used to account for prior knowledge of the system. In this work they are applied as non-uniform probability distributions of system model parameters and/or the derived parameters. Specifically, parameters are restricted using Gaussian distributions:

p(model) = exp  − m X j=1 (pj−pj,0)2 2σ2 j  , (2.29)

wherepj,0is the best estimate for the parameter andσjis the error. These values are

based on estimates from previous studies. Details of the priors used for the IRAC observations, and specifically their use in constraining transit parameters, can be found in Section 2.4.2.3 and in Chapters 3 and 4. For parameters where no prior is explicitly given, a uniform prior is implicitly used. This is equivalent to there being no prior knowledge of the parameter.

The posterior probability distribution is therefore given by:

p(model_|data)_∝exp

 − χ2 2 − m X j=1 (pj−pj,0)2 2σ2 j   (2.30)

Model optimisation and uncertainty estimation is carried out by exploring the parameter space around the maximum of the posterior probability distribution. This is equivalent to the minimum of a figure-of-merit,Q, given by:

Q=χ2+ m X j=1 (pj−pj,0)2 σ2 j (2.31)

2.4.2.2 Markov Chain Monte Carlo

Exploration of model fits to data in the mcmctransit code is carried out using a Markov Chain Monte Carlo (MCMC) technique. As described in Section 2.4.1, models are defined by a set of proposal parameters (see Table 2.1). The MCMC algorithm is used to estimate both the optimal model parameters and their uncertainties. One of the main advantages of MCMC is that it properly accounts for the inter-dependencies of model parameters in the parameter uncertainty estimates.

In an MCMC algorithm a cloud of points is created in the model parameter space, with each point corresponding to a particular instance of the model. The number of points in a region of this parameter space is proportional to the posterior probability in that region. This is achieved inmcmctransitusing the Metropolis- Hastings algorithm, which is as follows:

1. Find a model that is reasonably close to the global minimum in Q (as given in equation 2.31). Use the model, M0, as the starting point for the MCMC

(i= 0, where iis the ith step in the chain).

2. Generate a proposal model,Mi+1,proposal. Each parameter,p, of the model is

altered according to pi+1,proposal = pi +spG(0,1)fi. Here sp is an estimate of the uncertainty in p; G(0,1) is a Gaussian random number (mean of 0, variance of 1); fi is a variable scale factor used to ensure the acceptance rate of proposals is _∼25%.

3. CalculateQi+1,proposal forMi+1,proposal.

4. Apply Metropolis-Hastings rules. If:

Qi+1,proposal< Qi, accept model Mi+1,proposal

Qi+1,proposal>Qi, accept model Mi+1,proposal with probability

exp (₋(Qi+1,proposal−Qi)/2).

6. If rejected return to step 2 (idoes not increase).

7. End the chain wheniequals a specified number (typically 104₋105).

These rules ensure that the cloud of points created in the model parameter space will converge to the desired probability distribution [Ford, 2005].

Information on individual parameters can then be obtained by finding the marginal distribution of the parameter (i.e. by producing its histogram). Uncer- tainties for the parameter are estimated using the 1σ confidence intervals of its marginal distribution, while the optimal value can be estimated from the median of this distribution.

2.4.2.3 Optimal parameter and uncertainty estimation

In the usual implementation ofmcmctransit, transit light curve data are provided and fitted using the model in equation 2.18 [e.g. Collier Cameron et al., 2007b; Pollacco et al., 2008]. In the implementation used in this thesis the transit fitting was bypassed through the use of priors. Constraints that are usually imposed on the parameters enteringλ(φ) by transit data were instead applied using priors on transit light curve model parameters from previous studies. These studies are often quite detailed in their assessments of transit light curves. Consequently I saw the use of priors based on their results as a better option than another analysis of the available transit data. The use of priors still allowed for the dependencies of secondary eclipse parameters on transit light curve parameters to be explored.

For WASP-3 (Chapter 3), transit light curve priors were applied by constraining the derived parameters Rp

a , R⋆

a andi(see Section 3.3.4). These parameters were constrained because they were used for the light curve model of Southworth [2011], which was the follow-up study used for this object. In Chapter 4 for WASP- 21, Rp

a + R⋆

a , Rp

R⋆ andiwere used to provide the transit light curve constraints [Ciceri

et al., 2013, also see Section 4.3.3], while for WASP-28 and WASP-37δ, band T14

were used [Anderson et al., 2014; Simpson et al., 2011, also see Sections 4.4.3 and 4.5.3]. For each of these objects, priors were also placed on the ephemeris parameters T0 andP (which affect the transit model throughφ), and onTeff and [Fe/H] (which

set the value forM⋆). These priors enter theQstatistic in accordance with equation 2.31. For example, the prior constraint on i for WASP-3 is determined as (i−i0)2

σ2

, wherei0±σi0 is the inclination estimate from Southworth [2011].

Secondary eclipse light curves were fitted to the flux data resulting from the aperture photometry analyses (using the ULTRACAM pipeline for WASP-3 and the IDL reduction for WASP-21, 28 and 37). The units of these data were electrons i.e.

they are not scaled or normalised to the flux of a comparison star. There were two stages to the secondary eclipse fitting. First, the secondary eclipse model (equation 2.20) was divided out of the measured fluxes to give an estimate for the stellar flux:

F⋆=

Fmeasured

1 +η(φ)∆F (2.32)

Then, F⋆ was modelled using a subset of one of the equations 2.1–2.4. For example, for IRAC channel 1 data, one could choose to model the intra-pixel sensitivity variations using:

F_⋆,_model = ˆF⋆+a0+axdx+aydy (2.33) The optimal coefficient values can be determined uniquely (in terms of minimising χ2) using a singular-value decomposition (SVD) technique [Press et al., 1992]. This is possible due to the linear nature of the equations used for detrending. A slight complication is thattoff,a2 anda4 (from equations 2.3 and 2.4) cannot be evaluated

using this method, so these were added as MCMC proposal parameters when in use. Radial velocity data were modelled using equation 2.22. The combination of radial velocity and secondary eclipse data provides important constraints on e and ω. In the case of the secondary eclipses, the timing of the centre of the eclipse constrains the quantity ecosω through [Charbonneau et al., 2005]:

∆φ_≡φE−0.5≃

πecosω, (2.34)

whereφEis the observed mid-eclipse phase and 0.5 is the expectation from a circular

orbit.

The starting point for the MCMC proposal parameters came either from an analysis of the SuperWASP light curves used to discover WASP objects (for WASP- 3) or they were derived from the priors (for WASP-21, 28 and 37). A circular orbit was also assumed for the first model (√ecosω=√ecosω= 0) and the eclipse depths were started from 0. The starting point forK1 was found by initially modelling the

radial velocity data using a circular orbit.

These values were each perturbed by 5G(0,1)sp at the beginning, to ensure areas of parameter space other than the expected solution were explored. From this starting point, a ‘burn-in’ phase occurred, to allow the optimal region of parameter space to be found. At least 2000 (accepted) steps were used for this, with the burn- in ending once the χ2 for an accepted step was greater than the median χ2 of all the preceding steps. This ensured that the chain had found the region of parameter space in which the optimal solution lies.

After the burn-in phase, the photometric uncertainties on the secondary eclipse data were scaled such that the reduced χ2 for each set equalled 1. This was done to ensure that realistic uncertainties on the proposal parameters were obtained. The values ofsp were also re-evaluated by producing a chain of 1000 values and determining the standard deviations of the marginal distributions for each parameter.

A production chain of between 104 _{and 10}5_{successful jumps was then made in}

order to fully explore the parameter space around the optimal solution, to account for the dependencies of the proposal parameters on each other and to make an honest assessment of their uncertainties [Anderson et al., 2011]. Optimal parameter estimates (for both proposal and derived parameters) were taken as the median of the marginal distribution for that parameter, while the uncertainties were taken as the 15.9 and 84.1 percentiles (to give 1σ uncertainties).

In document Observations of exoplanet atmospheres (Page 73-77)