
4.2 Computational method

4.2.2 Markov-chain Monte-Carlo simulation

Markov-chain Monte-Carlo (MCMC) methods are a set of Bayesian analysis techniques that use the properties of Monte Carlo integration to characterise an unknown distribution by drawing a large number of samples from it. The strength of MCMC is that each step depends only on the current state of the chain. Most commonly implemented as a random walk algorithm, the basic MCMC method has several variations in the way that the Markov chain is produced and in the way that the sampling is carried out. For my work I use the Metropolis-Hastings method (Metropolis et al., 1953; Hastings, 1970).

The premise of Metropolis-Hastings is that the parameter steps are taken from a known proposal distribution, and matched to the desired posterior distribution; a basic outline of the method is shown in Figure 4.1. The new parameter set, $P_k$, is generated from the previous set, $P_{k-1}$, and evaluated. In my case the proposal distribution for each parameter is taken to be Gaussian, and the evaluation is carried out using the forward integration model described in section 4.2. The $C$ statistic value for the new parameter set is compared to that from the previous set using $\Delta C = C_k - C_{k-1}$; if the fit is better ($\Delta C < 0$) then the new set of parameters is accepted, and the chain moves to that point in parameter space. If the fit is worse, then the step is accepted with probability $e^{-\Delta C/2}$. If the step is rejected then the chain remains where it was, and saves another copy of the last successfully accepted parameter set¹. Another set of new parameters is then generated.

Figure 4.1: A flowchart illustrating the basic working of the MCMC algorithm. When generating new parameters, the scale factor $f$ is of order unity but tuned to provide an acceptance rate of roughly 25 percent. The secondary check for acceptance allows the Markov chain to climb out of local minima, preventing erroneous solutions and poor convergence.
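The accept/reject rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for this work: the function names, the `toy_c_stat` surface (a stand-in for the forward integration model), the proposal widths and the step count are all assumptions made for the example.

```python
import math
import random


def metropolis_hastings(c_stat, start, sigmas, n_steps, f=1.0, seed=0):
    """Run a random-walk Metropolis-Hastings chain on a C-statistic surface.

    c_stat : callable returning the fit statistic for a list of parameters
    start  : initial parameter list
    sigmas : Gaussian proposal width for each parameter
    f      : scale factor applied to every proposal width
    """
    rng = random.Random(seed)
    current = list(start)
    c_current = c_stat(current)
    chain, n_accepted = [], 0
    for _ in range(n_steps):
        # Propose P_k from P_{k-1} with an independent Gaussian step per parameter.
        proposal = [p + f * s * rng.gauss(0.0, 1.0)
                    for p, s in zip(current, sigmas)]
        c_new = c_stat(proposal)
        delta_c = c_new - c_current
        # Accept outright if the fit improves, otherwise with probability exp(-dC/2).
        if delta_c < 0 or rng.random() < math.exp(-delta_c / 2.0):
            current, c_current = proposal, c_new
            n_accepted += 1
        # On rejection the last accepted parameter set is stored again.
        chain.append((list(current), c_current))
    return chain, n_accepted / n_steps


def toy_c_stat(params):
    """Hypothetical stand-in for the forward model: a two-parameter chi-square."""
    x, y = params
    return (x - 1.0) ** 2 / 0.25 + (y + 2.0) ** 2 / 1.0


chain, acceptance = metropolis_hastings(toy_c_stat, [0.0, 0.0], [0.5, 1.0], 5000)
```

Storing the current state on every iteration, including after a rejection, is what produces the repeated copies of the last accepted parameter set mentioned in the text.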

The purpose of the second acceptance check is to try to prevent the chain from becoming trapped in local minima in the $C$-surface. If the steps that the algorithm takes are too small then it could become stuck in such a minimum for some time, leading to a ‘poorly mixed’ chain and slow convergence. The random number comparison allows the chain to climb back out of local minima before it gets stuck, although the success of this approach still relies on the size of the steps that the algorithm is taking. This in turn depends on the choice of proposal distribution, but can be modified through additional scaling. The ideal rate of acceptance is roughly 25 percent (Tegmark et al., 2004).
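To illustrate how a step-size scale factor controls the acceptance rate, the sketch below measures the acceptance rate of short pilot chains and rescales the factor towards a 25 percent target. The update rule (multiplying by the ratio of measured to target rate), the one-parameter `toy_c_stat` surface and all names and run lengths are assumptions chosen for illustration; the tuning procedure actually used for this work is not specified in the text.

```python
import math
import random


def acceptance_rate(c_stat, start, sigma, f, n_steps, rng):
    """Fraction of accepted steps in a short one-parameter pilot chain."""
    x, c_x = start, c_stat(start)
    accepted = 0
    for _ in range(n_steps):
        x_new = x + f * sigma * rng.gauss(0.0, 1.0)
        c_new = c_stat(x_new)
        delta_c = c_new - c_x
        if delta_c < 0 or rng.random() < math.exp(-delta_c / 2.0):
            x, c_x = x_new, c_new
            accepted += 1
    return accepted / n_steps


def tune_scale_factor(c_stat, start, sigma, target=0.25, n_pilot=2000,
                      n_rounds=8, seed=1):
    """Assumed heuristic: rescale f by (measured rate / target rate) each round."""
    rng = random.Random(seed)
    f = 1.0
    for _ in range(n_rounds):
        rate = acceptance_rate(c_stat, start, sigma, f, n_pilot, rng)
        # Too many acceptances means the steps are too small, so f grows,
        # and too few acceptances shrink it.
        f *= max(rate, 0.01) / target
    return f


def toy_c_stat(x):
    """Hypothetical one-parameter fit statistic with a single minimum at x = 0."""
    return x * x


f_tuned = tune_scale_factor(toy_c_stat, start=0.0, sigma=1.0)
check_rate = acceptance_rate(toy_c_stat, 0.0, 1.0, f_tuned, 5000, random.Random(2))
```

The tuned value depends on the width assumed for the proposal distribution, so the number found for this toy surface says nothing about the value appropriate for a real parameter space.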

There are two phases to an MCMC algorithm: the ‘burn-in’ phase, and the ‘production’ phase. Since the posterior distribution is unknown, the chain must be started from an arbitrary combination of parameters; under my optimisation scheme, these coordinates are set to the centre of the parameter space unless this lies within 3σC of the coordinates of the best-fitting initial result from the grid search, in which case the origin coordinates are set to lie outwith the $C$ contour in parameter space. The algorithm then randomly iterates around the available parameter space according to the Metropolis-Hastings decision maker until the chain has converged on the desired distribution. Once this has been achieved, the steps taken up to that point are discarded (they were not drawn from the posterior distribution), and the chain is said to have ‘burnt-in’. The production phase then begins, and the desired number of samples are taken.

¹ This becomes key when calculating uncertainties at the end of the MCMC run, as it affects the shape of the

Figure 4.2: An illustration of the burn-in procedure and evaluation for my MCMC algorithm. The progression of the $C$ statistic for the current step is shown by the solid, black line. The dotted, blue line marks the progression of the median value of $C$. Note the importance of the ≥100 steps criterion in this situation, as without it ‘burn-in’ would have been judged to be complete at the earlier plateau in $C$. Data taken from my analysis of the WASP-18 system.

Determining when burn-in is complete is of the utmost importance. For my work, I use the method of Knutson et al. (2008), judging this phase to be complete when the minimum value of the test statistic from the current integration exceeds the median of all previous minimum $C$ statistic values for the first time, with the proviso that at least 100 successful steps must have been completed (see Figure 4.2). This additional criterion is applied to ensure that the chain has truly converged on the posterior distribution. Following the completion of burn-in I then carry out an intermediate ‘rescaling’ phase of 100 successful steps, using these to calculate new error bars for my parameters that are fed into the algorithm at the parameter generation phase.
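One way to read the burn-in criterion is sketched below: scan the sequence of $C$ values from successful steps and stop at the first step, at or beyond the 100-step minimum, whose value exceeds the median of all earlier values. The function name, the argument names and the synthetic `trace` are hypothetical, and the sketch assumes that one $C$ value is recorded per successful step.

```python
import statistics


def burn_in_length(c_values, min_steps=100):
    """Return the index at which burn-in is judged complete, or None.

    c_values  : C statistic of each successful step, in chain order
    min_steps : number of successful steps required before testing
    """
    for k in range(min_steps, len(c_values)):
        # Compare the current value with the median of everything before it.
        if c_values[k] > statistics.median(c_values[:k]):
            return k
    return None


# Synthetic trace: a steady decline in C followed by an upward fluctuation.
trace = [400.0 - 2.0 * k for k in range(150)] + [101.0, 300.0, 100.0]
n_burn = burn_in_length(trace)
```

Recomputing the median from scratch at every step keeps the sketch short; a long chain would be better served by a running-median structure.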

roughly 40000 steps when accounting for the optimum acceptance rate. The best-fitting parameters are taken to be the median values of the respective posterior probability distributions, with 1σ errors derived from the values that delineate the central 68.3 percent of each distribution. I chose this approach over the absolute minimum, as the latter depends strongly on the precise sampling of the parameter space by any given Markov chain. This set of parameters is then integrated forward in time to provide the best-fitting evolutionary history for the system.
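The median and the central 68.3 percent interval can be extracted from the production samples of a single parameter as in the following sketch. The helper names and the evenly spaced `samples` list are illustrative assumptions; in practice the input would be the production-phase chain for one parameter, including the repeated entries stored after rejected steps.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (0 <= q <= 100) of a pre-sorted list."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def summarise_posterior(samples):
    """Median and 1-sigma errors from the central 68.3 percent of the samples."""
    ordered = sorted(samples)
    low = percentile(ordered, 15.85)     # lower edge of the central 68.3 percent
    median = percentile(ordered, 50.0)
    high = percentile(ordered, 84.15)    # upper edge of the central 68.3 percent
    return median, median - low, high - median


# Hypothetical samples for one parameter, evenly spaced for illustration only.
samples = [float(v) for v in range(101)]
best, err_minus, err_plus = summarise_posterior(samples)
```

Reporting the lower and upper errors separately preserves any asymmetry in the posterior distribution, which a single symmetric error bar would hide.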