Model Distribution and the Likelihood Function

3.5 Afterglow Model Fitting Procedure

3.5.2 Model Distribution and the Likelihood Function

The emission, extinction and absorption models described in §§3.2, 3.3 & 3.4 allow us to calculate, at any given time and frequency, the expected model flux density $F^{\rm mod}_{\nu}(t, \nu; \vartheta_m)$, given a set of $M$ model parameters $\{\vartheta_m\}$. We must now calculate the likelihood of a given model, given a particular set of $N$ filter-integrated flux density measurements $F^{\rm obs}_{\nu,n}$, along with any prior constraints on the model parameters themselves. We denote a given flux measurement $F^{\rm obs}_{\nu,n}(t_n, f_n(\nu))$, where $n = \{1, \ldots, N\}$, $t_n$ is the time since the burst trigger at the middle of a given exposure, and $f_n(\nu)$ is the normalized transmission function of the observation’s photometric filter, multiplied by the response function of the telescope and instrument, when available. Tabulated response curves are read in at the beginning of the fit for each data point $n$ and resampled, using linear interpolation, to create normalized filter response curves of $N_n$ points, evenly spaced in linear frequency, $\nu_n = \{\nu_{n,1}, \ldots, \nu_{n,N_n}\}$. In practice, the bin width $\Delta\nu$ is chosen to give $\sim$10 bins across the width of the narrowest spectral feature in the model; in most cases, this is the UV bump, so that the typical bin width is $\Delta\nu \approx 2.5\times10^{13}\,{\rm Hz}/(1+z_{\rm GRB})$. The filter response-weighted mean model flux density is then approximated as:

$$F^{\rm mod}_{\nu,n}(t_n, f_n(\nu); \vartheta_m) = \sum_{k=1}^{N_n} F^{\rm mod}_{\nu}(t_n, \nu_{n,k}; \vartheta_m) \times f_n(\nu_{n,k}). \tag{3.47}$$
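The resampling and weighted sum of Equation 3.47 can be sketched as follows. This is a minimal illustration, not the fitting code itself: the function names, the triangular filter and the power-law spectrum are hypothetical, and "normalized" is assumed here to mean that the resampled weights sum to one.

```python
from bisect import bisect_right


def interp_linear(x, xs, ys):
    """Linearly interpolate the table (xs, ys) at x; return 0 outside the table."""
    if x < xs[0] or x > xs[-1]:
        return 0.0
    i = min(bisect_right(xs, x), len(xs) - 1)
    slope = (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + slope * (x - xs[i - 1])


def resample_filter(nu_tab, resp_tab, delta_nu):
    """Resample a tabulated response onto an evenly spaced frequency grid.

    The weights are normalized to sum to 1 (an assumed convention).
    """
    span = nu_tab[-1] - nu_tab[0]
    n_points = max(2, round(span / delta_nu) + 1)
    grid = [nu_tab[0] + k * span / (n_points - 1) for k in range(n_points)]
    grid[-1] = nu_tab[-1]  # guard against round-off past the table edge
    weights = [interp_linear(nu, nu_tab, resp_tab) for nu in grid]
    total = sum(weights)
    return grid, [w / total for w in weights]


def filter_weighted_flux(model_flux, t, grid, weights):
    """Equation 3.47: sum of model flux times normalized filter response."""
    return sum(model_flux(t, nu) * w for nu, w in zip(grid, weights))


# Hypothetical triangular filter (Hz) and a power-law model spectrum.
nu_tab = [4.0e14, 5.0e14, 6.0e14]
resp_tab = [0.0, 1.0, 0.0]
grid, weights = resample_filter(nu_tab, resp_tab, delta_nu=2.5e13)
flux = filter_weighted_flux(lambda t, nu: (nu / 5.0e14) ** -1.0, 1.0, grid, weights)
print(len(grid), flux)
```

With this normalization, a spectrally flat model returns its own flux density unchanged, which is a convenient sanity check on any tabulated filter curve.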

Each flux density measurement $F^{\rm obs}_{\nu,n}$ is assigned to a particular calibration group, typically one group per photometric filter per independently prepared data set. Since photometric calibration is typically performed in magnitude space, we assume that all flux measurements within a given calibration group may be offset from their true values by a constant offset $\Delta_{c,n}$ in $\log F$ space. These offsets are parameters in our model, each of which is constrained by a zero-mean Gaussian prior in $\log F$ space whose width is given by our best estimate of the calibration uncertainty, $\sigma_{\log F_{\nu c,n}}$, for that group, i.e., $p(\Delta_{c,n}) = G(\Delta_{c,n}; 0, \sigma_{\log F_{\nu c,n}})$. The model flux density must then be adjusted to compensate for this offset:

$$F^{\rm mod}_{\nu c,n} = F^{\rm mod}_{\nu,n} \times 10^{\Delta_{c,n}}. \tag{3.48}$$
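As a small illustration of Equation 3.48 and its prior, the sketch below applies an offset in dex and evaluates the corresponding prior penalty. The function names and the 0.05 mag calibration uncertainty are hypothetical; the only conversion used is that a shift of 1 mag corresponds to 0.4 dex in flux.

```python
def apply_calibration_offset(flux_model, delta_c):
    """Equation 3.48: shift the model flux density by delta_c dex."""
    return flux_model * 10.0 ** delta_c


def neg2_ln_offset_prior(delta_c, sigma_cal):
    """-2 ln of the zero-mean Gaussian prior on delta_c, up to an additive constant."""
    return (delta_c / sigma_cal) ** 2


sigma_cal = 0.4 * 0.05  # a hypothetical 0.05 mag calibration uncertainty, in dex
delta_c = 0.02          # hypothetical fitted offset, in dex
print(apply_calibration_offset(100.0, delta_c), neg2_ln_offset_prior(delta_c, sigma_cal))
```

An offset equal to the calibration uncertainty therefore costs one unit of $-2\ln p$, while rescaling every model flux in the group by the same factor.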

In addition to calibration group offsets, we assume that there is additional, random scatter in the observed flux that is greater than can be attributed to measurement uncertainties alone, due to unmodeled stochastic physical processes in the emission and absorption of light from the GRB afterglow. This additional scatter, or slop, is assumed to be normally distributed in $\log F_\nu$ space, and is characterized as a Gaussian with width $\sigma_{\log F_\nu}$. While, in principle, this slop could be different for different data groups, in practice we assume that $\sigma_{\log F_\nu}$ is the same for all measurements in the UV-optical-IR region of the spectrum, and we characterize it with a single model parameter.$^{10}$ In this sense, we are fitting not a model curve, but a model distribution to our measured flux densities. The model probability density is normally distributed about the model curve in $\log F_\nu$ space:

$$p^{\rm mod}(\log F_\nu; \log F^{\rm mod}_{\nu c,n}, \sigma_{\log F_\nu}) = \frac{1}{\sqrt{2\pi}\,\sigma_{\log F_\nu}} \exp\left[-\frac{(\log F_\nu - \log F^{\rm mod}_{\nu c,n})^2}{2\sigma^2_{\log F_\nu}}\right]. \tag{3.49}$$

$^{10}$ However, when we include X-ray flux measurements in our data set, we employ a separate slop parameter $\sigma_X$

Since we are comparing the model distribution to flux densities in linear space, it is necessary to transform the probability distribution of the model into that space. As we showed in §2.7, the probability density of the model distribution in linear $F_\nu$ space is obtained by equating the differential probabilities:

$$\begin{aligned}
p^{\rm mod}(F_\nu; F^{\rm mod}_{\nu c,n}, \sigma_{\log F_\nu})\, dF_\nu &= p(\log F_\nu; \log F^{\rm mod}_{\nu c,n}, \sigma_{\log F_\nu})\, d\log F_\nu \\
p^{\rm mod}(F_\nu; F^{\rm mod}_{\nu c,n}, \sigma_{\log F_\nu}) &= \frac{1}{F_\nu \ln 10}\, p(\log F_\nu; \log F^{\rm mod}_{\nu c,n}, \sigma_{\log F_\nu}).
\end{aligned} \tag{3.50}$$

It can be shown analytically that $p^{\rm mod}$ peaks at a value:

$$F^{\max}_{\nu c,n} = 10^{\left(\log F^{\rm mod}_{\nu c,n} - \sigma^2_{\log F_\nu} \ln 10\right)}. \tag{3.51}$$
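The change of variables in Equations 3.49 to 3.51 can be checked numerically. The sketch below, with hypothetical function names and illustrative values, evaluates the linear-space density and compares the analytic peak of Equation 3.51 with a brute-force maximum on a fine grid.

```python
import math

LN10 = math.log(10.0)


def p_log(log_f, log_f_mod, sigma_log):
    """Equation 3.49: Gaussian density in log10(F) space."""
    z = (log_f - log_f_mod) / sigma_log
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma_log)


def p_lin(f, f_mod, sigma_log):
    """Equation 3.50: the same density expressed per unit linear flux."""
    return p_log(math.log10(f), math.log10(f_mod), sigma_log) / (f * LN10)


def peak_flux(f_mod, sigma_log):
    """Equation 3.51: flux at which the linear-space density peaks."""
    return 10.0 ** (math.log10(f_mod) - sigma_log ** 2 * LN10)


f_mod, sigma_log = 100.0, 0.1  # hypothetical values (arbitrary flux units, dex)
f_peak = peak_flux(f_mod, sigma_log)

# Brute-force check: locate the maximum of p_lin on a fine grid.
grid = [80.0 + 0.001 * k for k in range(40001)]
f_best = max(grid, key=lambda f: p_lin(f, f_mod, sigma_log))
print(f_peak, f_best)
```

The peak falls below the model flux because the factor $1/(F_\nu \ln 10)$ weights the density toward lower fluxes.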

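As we discussed in §2.7, it is often convenient to approximate asymmetric probability distributions as skew-normal, or asymmetric, Gaussian distributions. As a reminder, we define the asymmetric Gaussian function:

$$G_A(x; x_0, \sigma_{x\pm}) \equiv \frac{2}{\sqrt{2\pi}\,(\sigma_{x+} + \sigma_{x-})} \times \begin{cases} \exp\left[-\dfrac{1}{2}\dfrac{(x - x_0)^2}{\sigma^2_{x+}}\right] & \text{if } x \geq x_0 \\[2ex] \exp\left[-\dfrac{1}{2}\dfrac{(x - x_0)^2}{\sigma^2_{x-}}\right] & \text{if } x < x_0. \end{cases} \tag{3.52}$$

If $\sigma_{\log F_\nu} \lesssim 0.1$, $p^{\rm mod}(F_\nu)$ can be reasonably well approximated as $G_A(F_\nu; F^{\max}_{\nu c,n}, \sigma_{F_\nu\pm})$, where $\sigma_{F_\nu\pm}$ are approximated by the empirical formulas:

$$\begin{aligned}
\sigma_{F_\nu+} &\approx F^{\max}_{\nu c,n}\left(2.3029\,\sigma_{\log F_\nu} + 2.6293\,\sigma^2_{\log F_\nu} - 3.6945\,\sigma^3_{\log F_\nu}\right) \\
\sigma_{F_\nu-} &\approx F^{\max}_{\nu c,n}\left(2.3027\,\sigma_{\log F_\nu} - 2.6544\,\sigma^2_{\log F_\nu} - 4.0699\,\sigma^3_{\log F_\nu}\right).
\end{aligned} \tag{3.53}$$

… of slop due to scintillation in the ISM, which is expected to be normally distributed in linear $F_\nu$ space;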
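Equations 3.52 and 3.53 translate directly into code. The following is a minimal sketch with hypothetical function names; it evaluates the empirical widths for $\sigma_{\log F_\nu} = 0.1$ and confirms by midpoint-rule integration that the asymmetric Gaussian is normalized.

```python
import math


def asym_gauss(x, x0, sig_plus, sig_minus):
    """Equation 3.52: width sig_plus applies for x >= x0, sig_minus for x < x0."""
    sig = sig_plus if x >= x0 else sig_minus
    norm = 2.0 / (math.sqrt(2.0 * math.pi) * (sig_plus + sig_minus))
    return norm * math.exp(-0.5 * ((x - x0) / sig) ** 2)


def empirical_widths(f_max, sigma_log):
    """Equation 3.53: empirical linear-space widths for a log-space slop sigma_log."""
    s = sigma_log
    sig_plus = f_max * (2.3029 * s + 2.6293 * s**2 - 3.6945 * s**3)
    sig_minus = f_max * (2.3027 * s - 2.6544 * s**2 - 4.0699 * s**3)
    return sig_plus, sig_minus


sig_plus, sig_minus = empirical_widths(f_max=1.0, sigma_log=0.1)

# Midpoint-rule check that the asymmetric Gaussian integrates to ~1.
lo, hi, steps = 1.0 - 8.0 * sig_minus, 1.0 + 8.0 * sig_plus, 20000
h = (hi - lo) / steps
area = h * sum(asym_gauss(lo + (k + 0.5) * h, 1.0, sig_plus, sig_minus) for k in range(steps))
print(sig_plus, sig_minus, area)
```

The leading coefficients in both formulas are close to $\ln 10 \approx 2.3026$, so for small slop the two widths converge to the symmetric value $F^{\max}_{\nu c,n}\,\sigma_{\log F_\nu} \ln 10$.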

As we described in detail in Chapter 2, to find the best-fit model distribution to the observed data, we must define a likelihood function that quantifies the joint probability of a given set of measurements, with their intrinsic measurement uncertainties, and the model distribution described by the set of parameters $\{\vartheta_m\}$, which includes one or more slop parameters $\sigma$. We approximate the intrinsic probability distribution of a given flux density measurement $F_{\nu,n}$ as a (possibly asymmetric) Gaussian:

$$p^{\rm int}_{\nu,n}(F_\nu; F_{\nu,n}, \sigma_{F_{\nu,n}\pm}) = G_A(F_\nu; F_{\nu,n}, \sigma_{F_{\nu,n}\pm}). \tag{3.54}$$

For optical and NIR photometric observations, since a CCD camera is an intrinsically linear flux-measuring device, the flux density measurements we generate ourselves have symmetric error bars, $\sigma_{F_{\nu,n}+} = \sigma_{F_{\nu,n}-} = \sigma_{F_{\nu,n}}$. Published data from other sources, however, tend to quote symmetric error bars in $\log F_\nu$ or magnitude space. In those cases, we transform the data points into linear flux space in the same manner as outlined above for the log-normal model distribution (assuming the quoted error bars are $\lesssim 0.1$ in $\log F_\nu$), with asymmetric error bars $\sigma_{F_{\nu,n}\pm}$. If a published measurement is quoted as a 3σ upper limit, $F^{3\sigma}_{\nu,n}$, we assign that data point a linear flux density $F_{\nu,n} = 0$, and assign it symmetric 1σ error bars $\sigma_{F_{\nu,n}}$ such that:

$$\int_{-\infty}^{F^{3\sigma}_{\nu,n}} G(F_\nu; 0, \sigma_{F_{\nu,n}})\, dF_\nu = 0.99730, \tag{3.55}$$

as we discussed in §2.7.3. If 2σ or 1σ upper limits, $F^{2\sigma}_{\nu,n}$ or $F^{1\sigma}_{\nu,n}$, are quoted instead, the values of $\sigma_{F_{\nu,n}}$ are those for which the integral in Equation 3.55 equals 0.9545 and 0.6827, respectively. By these definitions, $\sigma_{F_{\nu,n}} = F^{3\sigma}_{\nu,n}/2.7822$, $\sigma_{F_{\nu,n}} = F^{2\sigma}_{\nu,n}/1.6901$ or $\sigma_{F_{\nu,n}} = F^{1\sigma}_{\nu,n}/0.4752$.
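The divisors quoted above follow from inverting the one-sided Gaussian integral in Equation 3.55. A minimal sketch using Python's standard-library `statistics.NormalDist` (the function name `sigma_from_upper_limit` is hypothetical):

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit width


def sigma_from_upper_limit(f_limit, n_sigma):
    """Return the 1-sigma error bar implied by an n-sigma upper limit (Equation 3.55)."""
    # Two-sided confidence level of an n-sigma interval: 0.6827, 0.9545, 0.9973.
    confidence = 2.0 * STD_NORMAL.cdf(n_sigma) - 1.0
    # One-sided quantile: the limit lies this many sigma above zero flux.
    divisor = STD_NORMAL.inv_cdf(confidence)
    return f_limit / divisor


for n_sigma in (3, 2, 1):
    print(n_sigma, 1.0 / sigma_from_upper_limit(1.0, n_sigma))
```

The divisors are smaller than 3, 2 and 1 because the confidence level of a two-sided $n\sigma$ interval is applied to a one-sided integral from $-\infty$.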

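The joint probability of the one-dimensional model distribution and a given flux density measurement is:

$$p_n \propto \int_{-\infty}^{\infty} G_A(F_\nu; F^{\max}_{\nu c,n}, \sigma_{F_\nu\pm})\, G_A(F_\nu; F_{\nu,n}, \sigma_{F_{\nu,n}\pm})\, dF_\nu. \tag{3.56}$$

If both distributions were symmetric Gaussians, with widths $\sigma_{F_\nu}$ and $\sigma_{F_{\nu,n}}$, respectively, this joint probability would take a very simple form:

$$p_n \propto G(F_{\nu,n}; F^{\max}_{\nu c,n}, \Sigma_n), \tag{3.57}$$

where $\Sigma_n$ is the quadrature sum of the slop and the intrinsic uncertainty:

$$\Sigma_n \equiv \sqrt{\sigma^2_{F_\nu} + \sigma^2_{F_{\nu,n}}}. \tag{3.58}$$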
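The symmetric result in Equations 3.57 and 3.58 is the standard identity for the overlap integral of two Gaussians, and it is easy to verify numerically. The sketch below uses hypothetical function names and illustrative values, and compares a midpoint-rule evaluation of Equation 3.56 against the closed form.

```python
import math


def gauss(x, mu, sigma):
    """Normalized symmetric Gaussian G(x; mu, sigma)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def joint_numeric(f_max, sig_slop, f_obs, sig_obs, steps=20000):
    """Equation 3.56 for symmetric Gaussians, integrated by the midpoint rule."""
    lo = min(f_max - 10.0 * sig_slop, f_obs - 10.0 * sig_obs)
    hi = max(f_max + 10.0 * sig_slop, f_obs + 10.0 * sig_obs)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        f = lo + (k + 0.5) * h
        total += gauss(f, f_max, sig_slop) * gauss(f, f_obs, sig_obs)
    return h * total


def joint_analytic(f_max, sig_slop, f_obs, sig_obs):
    """Equations 3.57 and 3.58: Gaussian in the data with quadrature-summed width."""
    return gauss(f_obs, f_max, math.hypot(sig_slop, sig_obs))


args = (100.0, 5.0, 108.0, 3.0)  # hypothetical F_max, slop, F_obs, error bar
print(joint_numeric(*args), joint_analytic(*args))
```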

However, when one or both of the distributions is approximated as an asymmetric Gaussian, the result of the probability integral (Equation 3.56) is not, in general, of the form of an asymmetric Gaussian. We have explored such integrals extensively (see §2.7 and Appendix A), and have found that they can be reasonably well approximated as asymmetric Gaussians, provided the peak is allowed to shift by an amount $\delta F_{\nu,n}$. An approximate empirical formulation for $\delta F_{\nu,n}$ is presented in Appendix A. The joint probability can then be approximated as:

$$p_n \propto G_A\left(F_{\nu,n} + \delta F_{\nu,n}; F^{\max}_{\nu c,n}, \Sigma_{n\pm}\right), \tag{3.59}$$

where

$$\begin{aligned}
\Sigma_{n+} &\equiv \sqrt{\sigma^2_{F_\nu-} + \sigma^2_{F_{\nu,n}+}} \\
\Sigma_{n-} &\equiv \sqrt{\sigma^2_{F_\nu+} + \sigma^2_{F_{\nu,n}-}}.
\end{aligned} \tag{3.60}$$

Note the reversal of signs: $\sigma_{F_\nu\mp}$ is added in quadrature to $\sigma_{F_{\nu,n}\pm}$.
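The sign reversal is easy to get wrong in code, so a short sketch of the combined widths may help. The function name and arguments are hypothetical; the pairing of widths follows Equation 3.60 as written.

```python
import math


def combined_widths(sig_slop_plus, sig_slop_minus, sig_obs_plus, sig_obs_minus):
    """Return (Sigma_plus, Sigma_minus) following Equation 3.60."""
    # The model's lower width pairs with the measurement's upper width, and vice versa.
    sigma_plus = math.hypot(sig_slop_minus, sig_obs_plus)
    sigma_minus = math.hypot(sig_slop_plus, sig_obs_minus)
    return sigma_plus, sigma_minus


# With a negligible measurement error, the slop widths simply swap roles.
print(combined_widths(4.0, 3.0, 0.0, 0.0))
# A general case with asymmetric slop and asymmetric error bars.
print(combined_widths(4.0, 3.0, 1.0, 2.0))
```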

The total likelihood of the measurements, given the model, is the product of the individual joint probabilities:

$$\mathcal{L} = \prod_{n=1}^{N} p_n. \tag{3.61}$$

However, some of our model parameters $\{\vartheta_m\}$ themselves have prior constraints on their values. Typically, these are symmetric or asymmetric Gaussian probability distributions about the most likely parameter value. In some cases, one parameter’s prior width or peak may be a function of one or more other parameters. The priors that we use in our GRB afterglow extinction and absorption models are discussed in detail in §§3.3 & 3.4. Including the priors, the posterior probability of the model, given the data, is:

$$P \propto \mathcal{L} \times \prod_{m=1}^{M} p_{\vartheta_m}. \tag{3.62}$$

Finally, we arrive at the best-fit set of model parameters for a given data set by minimizing $-2\ln P$:

$$-2\ln P = \sum_{n=1}^{N} -2\ln p_n + \sum_{m=1}^{M} -2\ln p_{\vartheta_m} + \text{Constant}. \tag{3.63}$$
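To make the structure of Equation 3.63 concrete, the sketch below evaluates $-2\ln P$ for the symmetric special case only, using Equations 3.57 and 3.58 for each $p_n$ and Gaussian priors on the parameters. The function and argument names are hypothetical, the slop is assumed to be already expressed as a symmetric width in linear flux units, and additive constants are dropped.

```python
import math


def neg2_ln_posterior(f_obs, sig_obs, f_peak, sig_slop, theta, theta_hat, sig_theta):
    """-2 ln P (Equation 3.63) for symmetric Gaussians, up to an additive constant."""
    total = 0.0
    for f_n, s_n, f_max in zip(f_obs, sig_obs, f_peak):
        big_sigma = math.hypot(sig_slop, s_n)  # Equation 3.58
        # -2 ln G = 2 ln(Sigma) + chi^2-like term + constant.
        total += 2.0 * math.log(big_sigma) + ((f_n - f_max) / big_sigma) ** 2
    for th, th_hat, s_th in zip(theta, theta_hat, sig_theta):
        # Gaussian prior term; no 2 ln(sigma) contribution is added here.
        total += ((th - th_hat) / s_th) ** 2
    return total


f_obs, sig_obs, f_peak = [10.0, 12.0], [1.0, 1.0], [10.0, 10.0]
for slop in (0.0, math.sqrt(3.0)):
    print(slop, neg2_ln_posterior(f_obs, sig_obs, f_peak, slop, [1.0], [0.0], [0.5]))
```

Increasing the slop shrinks the residual term but raises the logarithmic term, which is what keeps a free slop parameter from growing without bound during minimization.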

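This quantity has the form of a modified, $\chi^2$-like statistic (it is not a “true” $\chi^2$, in that it includes one or more free slop parameters). We find the best-fit model by minimizing:

$$\sum_{m=1}^{M} \frac{(\vartheta_m - \hat{\vartheta}_m)^2}{\sigma^2_{\vartheta,m}} + \sum_{n=1}^{N} \left[ 2\ln(\Sigma_{n+} + \Sigma_{n-}) + \begin{cases} \dfrac{(F_{\nu,n} + \delta F_{\nu,n} - F^{\max}_{\nu c,n})^2}{\Sigma^2_{n-}} & \text{if } F_{\nu,n} + \delta F_{\nu,n} \geq F^{\max}_{\nu c,n}, \\[2ex] \dfrac{(F_{\nu,n} + \delta F_{\nu,n} - F^{\max}_{\nu c,n})^2}{\Sigma^2_{n+}} & \text{if } F_{\nu,n} + \delta F_{\nu,n} < F^{\max}_{\nu c,n}, \end{cases} \right] \tag{3.64}$$

where the first summation is over all parameters with Gaussian priors $p_{\vartheta_m} = G(\vartheta_m; \hat{\vartheta}_m, \sigma_{\vartheta,m})$. Here $\hat{\vartheta}_m$ is the best-fit value of parameter $\vartheta_m$, and $\sigma_{\vartheta,m}$ is its fitted uncertainty. (Note that in some cases, a parameter’s prior probability distribution may also be approximated as an asymmetric Gaussian.) The term $2\ln(\Sigma_{n+} + \Sigma_{n-})$ in the second summation is necessary, since $\Sigma_{n\pm}$ is a function of the free slop parameter(s) $\sigma_{\log F_\nu}$. We do not include a term $2\ln\sigma_{\vartheta,m}$ in the prior summation. In some cases, $\sigma_{\vartheta,m}$ is a pre-determined constant. In other cases, $\sigma_{\vartheta,m}$ may itself be a model parameter, with its own prior constraint. We found that in this latter case, including the term $2\ln\sigma_{\vartheta,m}$ biases the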

Chapter 4 Fit the First: Exercising the Model on GRB 090313

“For the Snark’s a peculiar creature, that won’t Be caught in a commonplace way.

Do all that you know, and try all that you don’t: Not a chance must be wasted to-day!”

— Lewis Carroll, The Hunting of the Snark, Fit the Fourth

In document The gamma-ray burst afterglow modeling project : foundational statistics and absorption & extinction models (Page 168-175)