3.2 Sequential Monte Carlo Methods
3.2.4 Bayesian Inference with a State Space Model
The preceding sections introduced general inference methods for estimating a target distribution. In this section, these approaches are reconsidered in the context of a state space model, for example a Markov chain model. We first describe the state space model, the underlying Markov process and the relevant notation. Assume that the system is modelled by a homogeneous continuous-time Markov model characterised by a sequence of states $\{x_m\}_{m \ge 0}$, where $x_m \in \chi$.
Assume further that the model is parametrised by $\theta \in \Theta$, which is unknown, and that our interest is in making inference about this parameter given the available data. The Markov process, as described in chapter 2, is associated with the transition kernel $f_\theta(x_m \mid x_{m-1})$, which gives the density of the current state given only the previous one. The observations $\{y_m\}_{m \ge 0}$ are assumed to be conditionally independent given $\{x_m\}_{m \ge 0}$ and governed by the distribution $\pi_\theta(y_m \mid x_m)$. The probabilistic description of the model is:
$$x_0 \sim \pi_\theta(x_0), \tag{3.13}$$
$$x_m \mid x_{m-1} \sim f_\theta(x_m \mid x_{m-1}), \tag{3.14}$$
$$y_m \mid x_m \sim \pi_\theta(y_m \mid x_m). \tag{3.15}$$
The symbol $\sim$ means "is distributed according to", $\pi_\theta(x)$ denotes a probability density function, and $f_\theta(x_m \mid x_{m-1})$ is the probability density of the transition from state $x_{m-1}$ to state $x_m$. The density of the observation given the state is $\pi_\theta(y_m \mid x_m)$ (Doucet and Johansen, 2009).
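To make the three sampling statements concrete, the following minimal sketch simulates states and observations from a state space model indexed by m = 0, 1, ..., M. The linear-Gaussian form, the function name `simulate_ssm` and the parameters `phi`, `sigma_x` and `sigma_y` are illustrative assumptions and are not the model studied in this thesis.

```python
import numpy as np

def simulate_ssm(M, phi=0.9, sigma_x=1.0, sigma_y=0.5, seed=0):
    """Simulate x_{0:M} and y_{0:M} from a hypothetical linear-Gaussian model.

    Assumed example with theta = (phi, sigma_x, sigma_y):
        x_0 ~ N(0, sigma_x^2 / (1 - phi^2))            initial density
        x_m | x_{m-1} ~ N(phi * x_{m-1}, sigma_x^2)    transition density
        y_m | x_m ~ N(x_m, sigma_y^2)                  observation density
    """
    rng = np.random.default_rng(seed)
    x = np.empty(M + 1)
    y = np.empty(M + 1)
    # Draw the initial state and its observation.
    x[0] = rng.normal(0.0, sigma_x / np.sqrt(1.0 - phi**2))
    y[0] = rng.normal(x[0], sigma_y)
    for m in range(1, M + 1):
        # The next state depends only on the previous state (Markov property).
        x[m] = rng.normal(phi * x[m - 1], sigma_x)
        # The observation depends only on the current state.
        y[m] = rng.normal(x[m], sigma_y)
    return x, y

x, y = simulate_ssm(M=50)
```

Only `y` would be available in practice; `x` is the hidden path that the inference methods in this section try to recover.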
In the modelling context and within the Bayesian framework, equations (3.13) and (3.14) define the prior distribution of the process $\{x_m\}_{m \ge 0}$, and equation (3.15) defines the likelihood. We can therefore write:
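$$\pi_\theta(x_{0:m}) = \pi_\theta(x_0) \prod_{s=1}^{m} f_\theta(x_s \mid x_{s-1}), \tag{3.16}$$
$$\pi_\theta(y_{0:m} \mid x_{0:m}) = \prod_{s=0}^{m} \pi_\theta(y_s \mid x_s). \tag{3.17}$$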
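Because the prior and the likelihood factorise, their logarithms are sums of one-step terms. The sketch below evaluates both for a short path, with one likelihood factor per observation in $y_{0:m}$. It assumes a hypothetical linear-Gaussian model; the helper names `norm_logpdf`, `log_prior_path` and `log_likelihood_path` and all parameter values are illustrative only.

```python
import numpy as np

def norm_logpdf(v, mean, sd):
    """Log density of N(mean, sd^2) evaluated at v."""
    return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * ((v - mean) / sd) ** 2

def log_prior_path(x, phi, sigma_x):
    """log pi_theta(x_{0:m}): initial term plus one transition term per step."""
    sd0 = sigma_x / np.sqrt(1.0 - phi**2)  # assumed initial density N(0, sd0^2)
    initial = norm_logpdf(x[0], 0.0, sd0)
    transitions = np.sum(norm_logpdf(x[1:], phi * x[:-1], sigma_x))
    return initial + transitions

def log_likelihood_path(y, x, sigma_y):
    """log pi_theta(y_{0:m} | x_{0:m}): one term per observation y_s."""
    return np.sum(norm_logpdf(y, x, sigma_y))

# Hypothetical path and observations for m = 2.
x = np.array([0.0, 0.5, 0.2])
y = np.array([0.1, 0.4, 0.3])
log_joint = log_prior_path(x, 0.9, 1.0) + log_likelihood_path(y, x, 0.5)
```

Working on the log scale is a design choice made here to avoid numerical underflow when many factors are multiplied.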
In a Bayesian framework, the inference about x0:m given the observation y0:m can
be obtained through the posterior distribution as follows:
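$$\pi_\theta(x_{0:m} \mid y_{0:m}) = \frac{\pi_\theta(x_{0:m}, y_{0:m})}{\pi_\theta(y_{0:m})}, \tag{3.18}$$
where the numerator is:
$$\pi_\theta(x_{0:m}, y_{0:m}) = \pi_\theta(y_{0:m} \mid x_{0:m})\, \pi_\theta(x_{0:m}), \tag{3.19}$$
and the denominator is:
$$\pi_\theta(y_{0:m}) = \int \pi_\theta(x_{0:m}, y_{0:m})\, dx_{0:m} = \int \pi_\theta(y_{0:m} \mid x_{0:m})\, \pi_\theta(x_{0:m})\, dx_{0:m}. \tag{3.20}$$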
The main interest is to approximate the marginal likelihood π(y0:m).
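As a small illustration of what approximating the marginal likelihood means, the integral over the states can be estimated by averaging the likelihood over draws from the prior. The sketch below does this for a single stage (m = 0) of a toy Gaussian model chosen because its exact marginal is known; the model, the function names and the sample size are assumptions made for illustration.

```python
import numpy as np

def norm_pdf(v, mean, sd):
    """Density of N(mean, sd^2) evaluated at v."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def naive_marginal_likelihood(y0, n_samples=200_000, seed=1):
    """Monte Carlo estimate of pi(y_0) = integral of pi(y_0 | x_0) pi(x_0) dx_0.

    Assumed toy model: x_0 ~ N(0, 1) and y_0 | x_0 ~ N(x_0, 1).
    """
    rng = np.random.default_rng(seed)
    x0 = rng.normal(0.0, 1.0, size=n_samples)  # draws from the prior
    return np.mean(norm_pdf(y0, x0, 1.0))      # average of the likelihood values

y0 = 0.7
estimate = naive_marginal_likelihood(y0)
# For this toy model the marginal is available in closed form: y_0 ~ N(0, 2).
exact = norm_pdf(y0, 0.0, np.sqrt(2.0))
```

For a long path $x_{0:m}$ this direct approach tends to have high variance, which is one motivation for the sequential constructions that follow.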
Recall the likelihood defined in equation (3.17) and the prior defined in equation (3.16). The unnormalised posterior distribution $\pi_\theta(x_{0:m}, y_{0:m})$, defined in equation (3.19), satisfies:
$$\pi_\theta(x_{0:m}, y_{0:m}) = \pi_\theta(x_{0:m-1}, y_{0:m-1})\, f_\theta(x_m \mid x_{m-1})\, \pi_\theta(y_m \mid x_m). \tag{3.21}$$
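The step from the factorisations (3.16), (3.17) and (3.19) to this recursion can be spelled out by peeling off the final factor of each product:

```latex
\begin{align*}
\pi_\theta(x_{0:m}, y_{0:m})
  &= \pi_\theta(x_{0:m})\, \pi_\theta(y_{0:m} \mid x_{0:m}) \\
  &= \bigl[\pi_\theta(x_{0:m-1})\, f_\theta(x_m \mid x_{m-1})\bigr]
     \bigl[\pi_\theta(y_{0:m-1} \mid x_{0:m-1})\, \pi_\theta(y_m \mid x_m)\bigr] \\
  &= \pi_\theta(x_{0:m-1}, y_{0:m-1})\, f_\theta(x_m \mid x_{m-1})\, \pi_\theta(y_m \mid x_m).
\end{align*}
```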
Hence, the posterior satisfies the following recursion:
$$\pi_\theta(x_{0:m} \mid y_{0:m}) = \pi_\theta(x_{0:m-1} \mid y_{0:m-1})\, \frac{f_\theta(x_m \mid x_{m-1})\, \pi_\theta(y_m \mid x_m)}{\pi_\theta(y_m \mid y_{0:m-1})}, \tag{3.22}$$
where
$$\pi_\theta(y_m \mid y_{0:m-1}) = \int f_\theta(x_m \mid x_{m-1})\, \pi_\theta(y_m \mid x_m)\, \pi_\theta(x_{m-1} \mid y_{0:m-1})\, dx_m\, dx_{m-1}, \tag{3.23}$$
and $\pi_\theta(x_{m-1} \mid y_{0:m-1})$ is the unknown filtering distribution of the state at stage $m-1$ given the observations available up to that stage (Doucet and Johansen, 2009; Del Moral et al., 2013). The marginal distribution $\pi_\theta(x_m \mid y_{0:m})$ can be obtained by integrating $x_{0:m-1}$ out of equation (3.22):
$$\pi_\theta(x_m \mid y_{0:m}) = \frac{\pi_\theta(y_m \mid x_m)\, \pi_\theta(x_m \mid y_{0:m-1})}{\pi_\theta(y_m \mid y_{0:m-1})} \tag{3.24}$$
Chapter 3. Bayesian Inference Methods 47
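The equation above for $\pi_\theta(x_m \mid y_{0:m})$ is called the updating step, where
$$\pi_\theta(x_m \mid y_{0:m-1}) = \int f_\theta(x_m \mid x_{m-1})\, \pi_\theta(x_{m-1} \mid y_{0:m-1})\, dx_{m-1}. \tag{3.25}$$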
Equation (3.25) is called the prediction step. If $\{\pi_\theta(x_{0:m} \mid y_{0:m})\}$ and $\{\pi_\theta(x_m \mid y_{0:m})\}$ can be computed sequentially, then the marginal likelihood $\pi_\theta(y_{0:m})$ can be evaluated using:
$$\pi_\theta(y_{0:m}) = \pi_\theta(y_0) \prod_{s=1}^{m} \pi_\theta(y_s \mid y_{0:s-1}),$$
where the term $\pi_\theta(y_s \mid y_{0:s-1})$ is defined in equation (3.23) (Doucet and Johansen, 2009).
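The prediction and updating steps can be carried out exactly when the state space is finite, because the integrals become sums. The sketch below illustrates the recursion and the resulting marginal likelihood under that assumption; the two-state model, the discrete observations and the names `forward_filter`, `pi0`, `P` and `E` are hypothetical and chosen only for illustration.

```python
import numpy as np

def forward_filter(y, pi0, P, E):
    """Exact prediction/updating recursion for a finite state space.

    pi0[i]  : initial probability of state i
    P[i, j] : transition probability from state i to state j
    E[i, k] : probability of observing symbol k when in state i
    Returns the filtering distributions and the log marginal likelihood.
    """
    pred = np.asarray(pi0, dtype=float)   # the initial density acts as the first prediction
    filt = np.empty((len(y), len(pi0)))
    log_evidence = 0.0
    for m, obs in enumerate(y):
        joint = E[:, obs] * pred          # likelihood times predictive distribution
        c = joint.sum()                   # pi(y_m | y_{0:m-1}); equals pi(y_0) at m = 0
        filt[m] = joint / c               # updating step
        log_evidence += np.log(c)         # product of predictive terms, on the log scale
        pred = filt[m] @ P                # prediction step (3.25), with a sum for the integral
    return filt, log_evidence

pi0 = np.array([0.6, 0.4])
P = np.array([[0.9, 0.1], [0.2, 0.8]])
E = np.array([[0.7, 0.3], [0.1, 0.9]])
y = [0, 1, 1, 0]
filt, log_evidence = forward_filter(y, pi0, P, E)
```

Each row of `filt` is a filtering distribution $\pi_\theta(x_m \mid y_{0:m})$, and `np.exp(log_evidence)` is the marginal likelihood of the four observations under this toy model.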
Evaluating these distributions analytically can be difficult. Alternatively, they can be estimated with a Monte Carlo sampling technique such as importance sampling. For more details on sampling-based filtering algorithms, see Doucet et al. (2000), West (1996) and Gordon et al. (1993).
In SIS, the desired distribution is approximated by $N$ particles, and IS is used to propagate these particles through the stages using an importance density. At the initial stage $m = 0$, the importance density is assumed to be $q(x_0)$; hence, the importance weight is defined as:
$$w_0^{\prime i} = \frac{\pi_\theta(x_0^i)\, \pi_\theta(y_0 \mid x_0^i)}{q(x_0^i)}.$$
At stage $m \ge 1$, the importance weight is computed as:
$$w_m^{\prime i} = w_{m-1}^i\, \frac{f_\theta(x_m^i \mid x_{m-1}^i)\, \pi_\theta(y_m \mid x_m^i)}{q(x_m^i \mid x_{m-1}^i)}.$$
The weights are then normalised as:
$$w_m^i = \frac{w_m^{\prime i}}{\sum_{j=1}^{N} w_m^{\prime j}}.$$
The particles $\{x_m^i\}$ and the corresponding weights $\{w_m^i\}$ constitute the SIS approximation for the state space model, which is called a particle filter.
Algorithm 6 Sequential Importance Sampling for state space model
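1: For $i = 1, \dots, N$, initialise the sample:
$$x_0^i \sim q(x_0)$$
Assign the initial weight:
$$w_0^{\prime i} = \frac{\pi_\theta(x_0^i)\, \pi_\theta(y_0 \mid x_0^i)}{q(x_0^i)}, \qquad w_0^i = \frac{w_0^{\prime i}}{\sum_{j=1}^{N} w_0^{\prime j}}$$
2: At each subsequent stage $m = 1, \dots, M$ and for $i = 1, \dots, N$, propagate:
$$x_m^i \sim q(x_m \mid x_{m-1}^i)$$
3: Compute the importance weight:
$$w_m^{\prime i} = w_{m-1}^i\, \frac{f_\theta(x_m^i \mid x_{m-1}^i)\, \pi_\theta(y_m \mid x_m^i)}{q(x_m^i \mid x_{m-1}^i)}$$
4: Compute the normalised weight:
$$w_m^i = \frac{w_m^{\prime i}}{\sum_{j=1}^{N} w_m^{\prime j}}.$$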
$$\hat{\pi}_\theta(y_m \mid y_{0:m-1}) \approx \frac{1}{N} \sum_{i=1}^{N} w_m^i.$$
The procedure is described in Algorithm 6. Different proposal densities can be chosen. One option is to use the state transition density $f_\theta(x_m \mid x_{m-1})$ as the proposal, which gives a particular case of the particle filter known as the bootstrap particle filter.
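The following sketch walks through steps 1 to 4 of Algorithm 6 with the state transition as the proposal, so that the ratio in each weight cancels and only the observation density remains. It assumes a hypothetical linear-Gaussian model with the initial density also used as $q(x_0)$; the function names and parameter values are illustrative. No resampling step is included because Algorithm 6 has none, and the weights are handled on the log scale purely as a numerical precaution.

```python
import numpy as np

def norm_logpdf(v, mean, sd):
    """Log density of N(mean, sd^2) evaluated at v."""
    return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * ((v - mean) / sd) ** 2

def log_normalise(logw):
    """Log of the normalised weights, computed stably via log-sum-exp."""
    top = np.max(logw)
    return logw - (top + np.log(np.sum(np.exp(logw - top))))

def sis_transition_proposal(y, N=5000, phi=0.9, sigma_x=1.0, sigma_y=0.5, seed=0):
    """SIS with q = f_theta for an assumed linear-Gaussian state space model."""
    rng = np.random.default_rng(seed)
    M = len(y) - 1
    sd0 = sigma_x / np.sqrt(1.0 - phi**2)   # assumed initial density N(0, sd0^2)
    particles = np.empty((M + 1, N))
    weights = np.empty((M + 1, N))

    # Step 1: sample x_0^i from q(x_0) = pi_theta(x_0); the weight reduces
    # to pi_theta(y_0 | x_0^i), which is then normalised.
    particles[0] = rng.normal(0.0, sd0, size=N)
    logw = log_normalise(norm_logpdf(y[0], particles[0], sigma_y))
    weights[0] = np.exp(logw)

    for m in range(1, M + 1):
        # Step 2: propagate each particle through the transition density.
        particles[m] = rng.normal(phi * particles[m - 1], sigma_x)
        # Step 3: previous normalised weight times pi_theta(y_m | x_m^i).
        logw = logw + norm_logpdf(y[m], particles[m], sigma_y)
        # Step 4: normalise the weights.
        logw = log_normalise(logw)
        weights[m] = np.exp(logw)
    return particles, weights

y = np.array([0.3, 0.1, -0.4, 0.2])      # hypothetical observations y_{0:3}
particles, weights = sis_transition_proposal(y)
filtering_mean = np.sum(weights * particles, axis=1)  # weighted mean of x_m at each stage
```

Column `i` of `particles` is the path $x_{0:M}^i$, and `filtering_mean[m]` is the weighted particle estimate of the mean of $\pi_\theta(x_m \mid y_{0:m})$ under these assumptions.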