Chapter 2 Bayesian Model for Calibration of ILI Tools
2.2 Bayesian Methodology
2.2.1 Basic Formulation
The Bayesian approach is an advanced tool to fit a probability model to a set of observations by evaluating the unknown parameters of the model in a probabilistic way (Gelman 2004). The Bayesian method treats the unknown parameters of a physical process as random variables rather than as deterministic values. It incorporates the prior knowledge about the parameters, which may arise from the results of previous studies or experience. The prior knowledge is then updated based on the observed data to obtain the revised opinion about the parameters. The updated belief can be further considered as the prior distribution for future updating when new data are available. Therefore through this iterative process the uncertainty in the parameters is minimized.
Given a set of n observations, X = (x1, x2, …, xn), and a collection of k unknown
parameters, θk) that characterize the physical process underlying the
observed data, one can specify the Bayesian model for the parameters. The objective of the Bayesian analysis is to find the updated opinion about the unknown parameters θ based on Bayes’ theorem given by (Bayes and Price 1763)
(2.1)
where p() denotes the probability density function of . Here p(θ),which is called the
prior distribution of the parameters, characterizes the belief regarding the unknown parameters θ prior to any modeling. The information contained in the data is introduced via the so-called likelihood for the data, p(X|θ), which is the value of the probability
density function associated with the data conditional on the parameters θ. The entity
p(θ|X) is known as the posterior distribution, which reflects the combined information from the data and prior distribution. The quantity p(X) is a normalizing constant that ensures the left hand side of Eq. (2.1) to be a probability distribution; that is, p(θ|X) integrates to unity, and p(X) is known as the marginal likelihood and can be obtained by integrating the numerator on the right hand side of Eq. (2.1) with respect to the unknown parameters θ. Thus,
(2.2)
Taking into consideration the normalizing constant, one can write Eq. (2.1) as
(2.3)
where the symbol “ ” indicates proportionality.
Consider an example to illustrate the Bayesian method. Assume that y = (y1, y2, …, yn) represents a set of n data that follow a Poisson distribution with an unknown rate
parameter . Further assume that the prior distribution of follows a gamma distribution with known shape parameter and scale parameter . Therefore, the likelihood and prior distribution can be written as,
ikelihood Prior distribution:
where () denotes the gamma function.
The posterior distribution of can be evaluated from Eq. (2.3) as follows:
(2.4)
where is the mean value of the observations. Note that the terms that are independent of (such as Γ() and ) in Eq. (2.4) are omitted in the last step because they are part of the proportionality constant. Based on Eq. (2.4), it can be concluded that the posterior distribution of follows a gamma distribution with shape parameter and scale parameters equal to n + and n+, respectively.
2.2.2 Markov Chain Monte Carlo Simulation
The main purpose of the Bayesian analysis is to evaluate the probabilistic characteristics (e.g. mean, variance and quantiles) of the posterior distribution of the unknown model parameters. Therefore, it is necessary to obtain the marginal posterior distribution of each parameter. The marginal distribution of a parameter i of the
parameter vector , p(i|X), and the mean value of i, E(i|X), can be evaluated as
follows:
(2.5)
(2.6)
In some cases, closed-form solutions of the integrals in Eqs. (2.5) and (2.6) are available, or they can be easily computed using numerical methods. But in most applications analytic or direct numerical evaluation of these integrals is very difficult, if not impossible, due to the complexity and high dimensionality of the Bayesian model. To overcome this difficulty, the Markov Chain Monte Carlo (MCMC) simulation technique has been widely used in the Bayesian analysis (Gilks et al. 1996).
MCMC works by sequentially generating random samples of uncertain parameters to form a Markov process whose stationary distribution is the joint posterior distribution of the parameters. The difference between MCMC and the conventional Monte Carlo simulation is that the sample generated in a given sequence in MCMC depends on the sample generated in the previous sequence, whereas the samples generated in different trials of the conventional Monte Carlo simulation are independent.
The MCMC simulation starts by assigning an arbitrary initial value to each of the parameters considered. After an initial set of sequences, i.e. the so-called burn-in period, the subsequent sequences are considered to converge to the joint posterior distribution of the parameters. The samples generated in different sequences in MCMC are typically autocorrelated. To reduce the autocorrelation, the so-called “thinning” technique is employed, where only samples from every kth (k > 1) iteration are stored for the output analysis (Congdon 2006; Link and Eaton 2012). The generated sequences can be approximated as independent samples by choosing the thinning interval (also known as sampling lag) appropriately. The samples generated after the burn-in period using a suitable thinning interval can be used to evaluate the probabilistic characteristics of the
joint posterior distribution or the marginal distribution of a given parameter, just like the conventional Monte Carlo simulation.
There are several standard sampling algorithms to generate MCMC samples from the joint posterior distributions. The most commonly used algorithms are the Metropolis- Hastings algorithm and the Gibbs sampling. These two algorithms are briefly described in Appendix 2A.