3.5 Afterglow Model Fitting Procedure
3.5.2 Model Distribution and the Likelihood Function
The emission, extinction and absorption models described in §§3.2, 3.3 & 3.4 allow us to calculate, at any given time and frequency, the expected model flux density
Fmod
ν (t, ν;ϑm), given a set of M model parameters {ϑm}. We now must calculate the
likelihood of a given model, given a particular set of N filter-integrated flux density measurements
Fobs
ν,n , along with any prior constraints on the model parameters them-
selves. We denote a given flux measurement Fobs
ν,n(tn, fn(ν)), where n = {1, . . . , N}, tn
is the time since the burst trigger at the middle of a given exposure, and fn(ν) is
the normalized transmission function of the observation’s photometric filter, multiplied by the response function of the telescope and instrument, when available. Tabulated response curves are read in at the beginning of the fit for each data point n, and re- sampled, using linear interpolation, to create normalized filter response curves of Nn
points, evenly spaced in linear frequenciesνn={νn,1, . . . , νn,Nn}. In practice, we choose
where ∆ν is chosen to give ∼10 bins across the width of the narrowest spectral feature in the model; in most cases, this is the UV bump, so that the typical bin width is ∆ν ≃ 3
10
γ
1+zGRB ×10
14 Hz ≈ 2.5×1013 Hz/(1 +z
GRB). The filter response-weighted
mean model flux density is then approximated as:
Fmod ν,n (tn, fn(ν);θ) = Nn X k=1 Fmod ν (tn, νn,k;ϑm)×fn(νn,k). (3.47)
Each flux density measurement Fobs
ν,n is assigned to a particular calibration group,
typically one group per photometric filter per independently prepared data set. Since photometric calibration is typically performed in magnitude space, we assume that all flux measurements within a given calibration group may be offset from their true values by a constant offset ∆c,n in logF space. These offsets are parameters in our model,
each of which is constrained by a zero-mean log-normal Gaussian prior whose width is given by our best estimate of the calibration uncertainty, σlogFνc,n, for that group,
i.e., p(∆c,n) = G(∆c,n; 0, σlogFνc,n). The model flux density must then be adjusted to
compensate for this offset:
Fνmodc,n =Fνmodc,n ×10∆c,n. (3.48)
In addition to calibration group offsets, we assume that there is additional, random scatter in the observed flux that is greater than can be attributed to measurement uncertainties alone, due to unmodeled stochastic physical processes in the emission and absorption of light from the GRB afterglow. This additional scatter, or slop, is assumed to be normally distributed in logFν space, and is characterized as a Gaussian with
widthσlogFν. While, in principle, this slop could be different for different data groups,
in practice we assume thatσlogFν is the same for all measurements in the UV-optical-IR
region of the spectrum, and we characterize it with a single model parameter.10 In this
10However, when we include X-ray flux measurements in our data set, we employ a separate slop
parameterσX
sense, we are fitting not a model curve, but a model distribution to our measured flux densities. The model probability density is normally distributed about the model curve in logFν space:
pmod(logFν; logFνmodc,n, σlogFν) =
1
√
2πσlogFν
e
(logFν−logFνc,nmod)2
2σ2log
Fν . (3.49)
Since we are comparing the model distribution to flux densities in linear space, it is necessary to transform the probability distribution of the model into that space. As we showed in §2.7, the probability density of the model distribution in linear Fν space
is obtained by equating the differential probabilities:
pmod(Fν;Fνmodc,n, σlogFν)dFν = p(logFν; logF
mod
νc,n, σlogFν)dlogFν
pmod(Fν;Fνmodc,n, σlogFν) =
1
Fνln 10
p(logFν; logFνmodc,n, σlogFν). (3.50)
It can be shown analytically that pmod peaks at a value:
Fνmaxc,n = 10(logFνc,nmod−σ2logFνln 10). (3.51)
As we discussed in§2.7, it is often convenient to approximate asymmetric probability distributions as skew-normal, or asymmetric, Gaussian distributions. As a reminder, we define the asymmetric Gaussian function:
GA(x;x0, σx±)≡ 2 √ 2π(σx++σx−) × e− 1 2 (x−x0)2 σ2 x+ ifx≥x 0 e− 1 2 (x−x0)2 σ2 x− ifx < x0 . (3.52) If σlogFν . 0.1, p mod(F
ν) can be reasonably well approximated as GA(Fν;Fνmaxc,n, σFν±), of slop due to scintillation in the ISM, which is expected to be normally distributed in linearFν space;
whereσFν± are approximated by the empirical formulas: σFν+ ≈ F max νc,n 2.3029σlogFν + 2.6293σ 2 logFν −3.6945σ 3 logFν σFν− ≈ F max νc,n 2.3027σlogFν −2.6544σ 2 logFν −4.0699σ 3 logFν . (3.53)
As we described in detail in Chapter 2, to find the best-fit model distribution to the observed data, we must define a likelihood function that quantifies the joint probability of a given set of measurements, with their intrinsic measurement uncertainties, and the model distribution described by the set of parameters {ϑm}, which includes one
or more slop parameters σ. We approximate the intrinsic probability distribution of a given flux density measurement Fν,n as a (possibly asymmetric) Gaussian:
pintν,n(Fν;Fν,n, σFν,n±) =GA(Fν;Fν,n, σFν,n±). (3.54)
For optical and NIR photometric observations, since a CCD camera is an intrinsically linear flux-measuring device, the flux density measurements we generate ourselves have symmetric error bars σFν,n+ = σFν,n− = σFν,n. Published data from other sources,
however, tend to quote symmetric error bars in logFν or magnitude space. In those
cases, we transform the data points into linear flux space in the same manner as outlined above for the log-normal model distribution (assuming the quoted error bars are.0.1 in logFν), with asymmetric error barsσFν,n±. If a published measurement is quoted as
a 3σ upper limit, we assign that data point a linear flux density Fν,n = 0, and assign it
symmetric 1σ error bars σFν,n such that:
Z Fν,n3σ −∞
G(Fν; 0, σFν,n)dFν = 0.99730, (3.55)
as we discussed in §2.7.3. If 2- or 1σ upper limits, F2σ
of σFν,n are those for which the integral in Equation 3.55 equals 0.9545 and 0.6827,
respectively. By these definitions, σFν,n = F
3σ ν,n/2.7822, σFν,n = F 2σ ν,n/1.6901 or σFν,n = F1σ ν,n/0.4752.
The joint probability of the one-dimensional model distribution and a given flux density measurement is:
pn∝
Z ∞
−∞
GA(Fν;Fνmaxc,n, σFν±)GA(Fν;Fν,n;σFν,n±)dFν. (3.56)
If both distributions were symmetric Gaussians, with widthsσFν andσFν,n, respectively,
this joint probability takes a very simple form:
pn∝G(Fν,n;Fνmaxc,n,Σn), (3.57)
where Σn is the quadrature sum of the slop and the intrinsic uncertainty:
Σn ≡ q σ2 Fν+σ 2 Fν,n. (3.58)
However, when one or both of the distributions is approximated as an asymmetric Gaussian, the result of the probability integral (equation 3.56) is not, in general, of the form of an asymmetric Gaussian. We have explored such integrals extensively (see
§2.7 and Appendix A), and have found that they can be reasonably well approximated as asymmetric Gaussians, provided the peak is allowed to shift by an amount δFν,n.
An approximate empirical formulation for δFν,n is presented in Appendix A. The joint
probability can then be approximated as:
pn∝GA Fν,n+δFν,n;F
max
νc,n,Σn±
where Σn+ ≡ q σ2 Fν−+σ 2 Fν,n+ Σn− ≡ q σ2 Fν++σ 2 Fν,n−. (3.60)
Note the reversal of signs; σFν∓ is added in quadrature to σFν,n±.
The total likelihood of the measurements, given the model, is the product of the individual joint probabilities:
L =
N
Y
n=1
pn. (3.61)
However, some of our model parameters {ϑm} themselves have prior constraints on
their values. Typically, these are symmetric or asymmetric Gaussian probability dis- tributions about the most likely parameter value. In some cases, one parameter’s prior width or peak may be a function of one or more other parameters. The details of the priors that we use in our GRB afterglow extinction and absorption models are discussed in detail in §§3.3 & 3.4. Including the priors, the posterior probability of the model, given the data, is:
P ∝ L ×
M
Y
m=1
pϑm. (3.62)
Finally, we arrive at the best-fit set of model parameters for a given data set by mini- mizing−2 lnP: −2 lnP = N X n=1 −2 lnL+ M X m=1 −2 lnpϑm+ Constant. (3.63)
This quantity has the form of a modified, χ2-like statistic (it is not a “true”χ2, in that
or more free slop parameters). We find the best-fit model by minimizing: M X m=1 (ϑm−ϑˆm)2 σ2 ϑ,m + N X i=1 2 ln(Σn++ Σn−) + (Fν,n+δFν,n−Fνc,nmax)2 Σ2 n− if Fν,n+δFν,n ≥F max νc,n , (Fν,n+δFν,n−Fνc,nmax)2 Σ2 n+ if Fν,n+δFν,n < F max νc,n , (3.64) where the first summation is over all parameters with Gaussian priorspϑm =G(ϑm; ˆϑm, σϑ,m),
where ˆϑm is the best-fit value of parameterϑm, andσϑ,m is its fitted uncertainty. (Note
that in some cases, a parameter’s prior probability distribution may also be approxi- mated as an asymmetric Gaussian.) The term 2 ln(Σn++Σn−) in the second summation
is necessary, since Σn± is a function of the free slop parameter(s)σlogFν. We do not in-
clude a term 2 lnσϑ,m in the prior summation. In some cases, σϑ,m is a pre-determined
constant. In other cases, σϑ,m may itself be a model parameter, with its own prior
constraint. We found that in this latter case, including the term 2 lnσϑ,m biases the
Chapter 4
Fit the First: Exercising the Model
on GRB 090313
“For the Snark’s a peculiar creature, that won’t Be caught in a commonplace way.
Do all that you know, and try all that you don’t: Not a chance must be wasted to-day!”
— Lewis Carroll, The Hunting of the Snark, Fit the Fourth