Bayesian Inference with a State Space Model

3.2 Sequential Monte Carlo Methods

3.2.4 Bayesian Inference with a State Space Model

We have introduced important inference methods that aim to estimate a target distribution in general. In this section, these approaches are reconsidered in the light of a state space model, e.g. the Markov chain model. We first describe the state space model, including the Markov process and the relevant notation. Assume that we aim to model a system using a homogeneous continuous-time Markov model, which is characterised by a set of states $\{x_m\}_{m>0}$, where $x_m \in \chi$.

Let us further assume that the model is parametrised by $\theta \in \Theta$, which is unknown, and that our interest is in making inference about the model parameter given the available data. The Markov process, as described in Chapter 2, is associated with the transition kernel $f_\theta(x_m \mid x_{m-1})$, which gives the distribution of the current state given only the previous one. The observations $\{y_m\}_{m>0}$ are assumed to be conditionally independent given $\{x_m\}_{m>0}$ and governed by the distribution $\pi_\theta(y_m \mid x_m)$. The probabilistic description of the model can be defined as:

$$x_0 \sim \pi_\theta(x_0), \tag{3.13}$$

$$x_m \mid x_{m-1} \sim f_\theta(x_m \mid x_{m-1}), \tag{3.14}$$

$$y_m \mid x_m \sim \pi_\theta(y_m \mid x_m). \tag{3.15}$$

The symbol $\sim$ means "is distributed according to", $\pi_\theta(x)$ denotes a probability density function, and $f_\theta(x_m \mid x_{m-1})$ is the probability density of the transition from state $x_{m-1}$ to state $x_m$. The density of the observation given the state is $\pi_\theta(y_m \mid x_m)$ (Doucet and Johansen, 2009).
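To make the three model statements concrete, the following is a minimal sketch that simulates states and observations from a state space model. The linear-Gaussian form, the function name `simulate_ssm` and the parameters `phi`, `sigma_x` and `sigma_y` are illustrative assumptions and are not taken from the text.

```python
import random

def simulate_ssm(M, phi=0.9, sigma_x=1.0, sigma_y=0.5, seed=0):
    """Simulate x_0..x_M and y_0..y_M from a toy linear-Gaussian model.

    Assumed (hypothetical) model choices:
      x_0 ~ N(0, sigma_x^2 / (1 - phi^2))          initial density, cf. (3.13)
      x_m | x_{m-1} ~ N(phi * x_{m-1}, sigma_x^2)  transition, cf. (3.14)
      y_m | x_m ~ N(x_m, sigma_y^2)                observation, cf. (3.15)
    """
    rng = random.Random(seed)
    xs, ys = [], []
    # Draw the initial state from its stationary distribution.
    x = rng.gauss(0.0, sigma_x / (1.0 - phi ** 2) ** 0.5)
    for m in range(M + 1):
        if m > 0:
            # The new state depends only on the previous state.
            x = rng.gauss(phi * x, sigma_x)
        xs.append(x)
        # Each observation depends only on the current state.
        ys.append(rng.gauss(x, sigma_y))
    return xs, ys

xs, ys = simulate_ssm(M=50)
```

Fixing `seed` makes the simulated path reproducible, which is convenient when comparing inference methods on the same data.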

Within the modelling context and in the Bayesian framework, equations (3.13) and (3.14) represent the prior distribution of the process $\{x_m\}_{m>0}$, and equation (3.15) represents the likelihood. We can then define the following:

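$$\pi_\theta(x_{0:m}) = \pi_\theta(x_0) \prod_{s=1}^{m} f_\theta(x_s \mid x_{s-1}), \tag{3.16}$$

$$\pi_\theta(y_{0:m} \mid x_{0:m}) = \prod_{s=0}^{m} \pi_\theta(y_s \mid x_s). \tag{3.17}$$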

In a Bayesian framework, inference about $x_{0:m}$ given the observations $y_{0:m}$ can be obtained through the posterior distribution as follows:

$$\pi_\theta(x_{0:m} \mid y_{0:m}) = \frac{\pi_\theta(x_{0:m}, y_{0:m})}{\pi_\theta(y_{0:m})}, \tag{3.18}$$

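where the numerator is:

$$\pi_\theta(x_{0:m}, y_{0:m}) = \pi_\theta(y_{0:m} \mid x_{0:m})\, \pi_\theta(x_{0:m}), \tag{3.19}$$

and the denominator is:

$$\pi_\theta(y_{0:m}) = \int \pi_\theta(x_{0:m}, y_{0:m})\, dx_{0:m} = \int \pi_\theta(y_{0:m} \mid x_{0:m})\, \pi_\theta(x_{0:m})\, dx_{0:m}. \tag{3.20}$$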

The main interest is in approximating the marginal likelihood $\pi_\theta(y_{0:m})$.
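The integral in (3.20) can be read as the expectation of the likelihood under the prior, which suggests a naive Monte Carlo estimate: draw state paths from the prior and average the likelihood of the data. The sketch below assumes a hypothetical linear-Gaussian model; `naive_log_marginal`, `normal_logpdf` and all parameter values are illustrative and not from the text. This estimator degrades quickly as the number of observations grows, which is one motivation for the sequential methods that follow.

```python
import math
import random

def normal_logpdf(v, mean, sd):
    """Log density of N(mean, sd^2) evaluated at v."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((v - mean) / sd) ** 2

def naive_log_marginal(ys, N=2000, phi=0.9, sigma_x=1.0, sigma_y=0.5, seed=0):
    """Estimate log pi(y_{0:m}) by averaging the likelihood over prior paths.

    Assumed (hypothetical) model: x_0 ~ N(0, sigma_x^2 / (1 - phi^2)),
    x_m | x_{m-1} ~ N(phi * x_{m-1}, sigma_x^2), y_m | x_m ~ N(x_m, sigma_y^2).
    """
    rng = random.Random(seed)
    log_liks = []
    for _ in range(N):
        # Sample one path x_{0:m} from the prior and accumulate
        # log pi(y_{0:m} | x_{0:m}) along the way.
        x = rng.gauss(0.0, sigma_x / math.sqrt(1.0 - phi ** 2))
        ll = normal_logpdf(ys[0], x, sigma_y)
        for y in ys[1:]:
            x = rng.gauss(phi * x, sigma_x)
            ll += normal_logpdf(y, x, sigma_y)
        log_liks.append(ll)
    # Log of the sample mean, computed with the log-sum-exp trick.
    top = max(log_liks)
    return top + math.log(sum(math.exp(l - top) for l in log_liks) / N)

ys = [0.3, -0.1, 0.8]
est = naive_log_marginal(ys)
```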

Recall the likelihood defined in equation (3.17) and the prior defined in equation (3.16). The unnormalised posterior distribution $\pi_\theta(x_{0:m}, y_{0:m})$, which appears as the numerator of equation (3.18), satisfies:

$$\pi_\theta(x_{0:m}, y_{0:m}) = \pi_\theta(x_{0:m-1}, y_{0:m-1})\, f_\theta(x_m \mid x_{m-1})\, \pi_\theta(y_m \mid x_m). \tag{3.21}$$
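The intermediate step behind (3.21) can be sketched as follows, assuming only the product forms (3.16) and (3.17): the last factor is split off from each product, and the remaining factors are regrouped into the joint density at stage $m-1$.

```latex
\begin{aligned}
\pi_\theta(x_{0:m}, y_{0:m})
  &= \pi_\theta(x_{0:m})\,\pi_\theta(y_{0:m} \mid x_{0:m}) \\
  &= \big[\pi_\theta(x_{0:m-1})\, f_\theta(x_m \mid x_{m-1})\big]\,
     \big[\pi_\theta(y_{0:m-1} \mid x_{0:m-1})\, \pi_\theta(y_m \mid x_m)\big] \\
  &= \pi_\theta(x_{0:m-1}, y_{0:m-1})\, f_\theta(x_m \mid x_{m-1})\, \pi_\theta(y_m \mid x_m).
\end{aligned}
```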

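Hence, the posterior satisfies the following recursion:

$$\pi_\theta(x_{0:m} \mid y_{0:m}) = \pi_\theta(x_{0:m-1} \mid y_{0:m-1})\, \frac{f_\theta(x_m \mid x_{m-1})\, \pi_\theta(y_m \mid x_m)}{\pi_\theta(y_m \mid y_{0:m-1})}, \tag{3.22}$$

with

$$\pi_\theta(y_m \mid y_{0:m-1}) = \int \pi_\theta(y_m \mid x_m)\, f_\theta(x_m \mid x_{m-1})\, \pi_\theta(x_{m-1} \mid y_{0:m-1})\, dx_{m-1:m}, \tag{3.23}$$

where $\pi_\theta(x_{m-1} \mid y_{0:m-1})$ represents the unknown filtering distribution of the state of the model given the available information (Doucet and Johansen, 2009; Del Moral et al., 2013). The marginal distribution $\pi_\theta(x_m \mid y_{0:m})$ can be obtained by integrating $x_{0:m-1}$ out in equation (3.22):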

$$\pi_\theta(x_m \mid y_{0:m}) = \frac{\pi_\theta(y_m \mid x_m)\, \pi_\theta(x_m \mid y_{0:m-1})}{\pi_\theta(y_m \mid y_{0:m-1})}. \tag{3.24}$$

Chapter 3. Bayesian Inference Methods 47

The preceding equation is called the updating step, where

$$\pi_\theta(x_m \mid y_{0:m-1}) = \int f_\theta(x_m \mid x_{m-1})\, \pi_\theta(x_{m-1} \mid y_{0:m-1})\, dx_{m-1}. \tag{3.25}$$

Equation (3.25) is called the prediction step. If $\{\pi_\theta(x_{0:m} \mid y_{0:m})\}$ and $\{\pi_\theta(x_m \mid y_{0:m})\}$ can be computed sequentially, then the marginal likelihood $\pi_\theta(y_{0:m})$ can be evaluated using:

$$\pi_\theta(y_{0:m}) = \pi_\theta(y_0) \prod_{s=1}^{m} \pi_\theta(y_s \mid y_{0:s-1}),$$

where the term $\pi_\theta(y_s \mid y_{0:s-1})$ is defined in equation (3.23) (Doucet and Johansen, 2009).
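When the state space is finite, the integrals in the prediction and updating steps reduce to sums, so the recursion and the marginal likelihood can be evaluated exactly. The following is a minimal sketch under that assumption; the function `forward_filter`, the two-state chain and the discrete observation probabilities are hypothetical and chosen only for illustration.

```python
def forward_filter(init, trans, emis, ys):
    """Exact prediction/updating recursion on a finite state space.

    init[k]     = pi_theta(x_0 = k)
    trans[j][k] = f_theta(x_m = k | x_{m-1} = j)
    emis[k][y]  = pi_theta(y_m = y | x_m = k)
    Returns the filtering distributions and the marginal likelihood.
    """
    K = len(init)
    predicted = list(init)  # at m = 0 the prior plays the role of the prediction
    filters, marginal = [], 1.0
    for m, y in enumerate(ys):
        if m > 0:
            # Prediction step: sum over the previous state.
            predicted = [sum(trans[j][k] * filters[-1][j] for j in range(K))
                         for k in range(K)]
        # Updating step: weight by the observation density, then normalise.
        unnorm = [emis[k][y] * predicted[k] for k in range(K)]
        cond_lik = sum(unnorm)  # pi(y_m | y_{0:m-1}), or pi(y_0) when m = 0
        filters.append([u / cond_lik for u in unnorm])
        # The marginal likelihood is the product of the conditional terms.
        marginal *= cond_lik
    return filters, marginal

init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emis = [[0.8, 0.2], [0.3, 0.7]]
filters, marginal = forward_filter(init, trans, emis, ys=[0, 1, 1])
```

The normalising constant of each updating step is exactly the conditional term entering the product, so the marginal likelihood comes at no extra cost once the filter has been run.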

Evaluating such distributions analytically can be difficult. Alternatively, they can be estimated through a Monte Carlo sampling technique such as importance sampling. For more details about filtering algorithms based on sampling, see Doucet et al. (2000), West (1996) and Gordon et al. (1993).

In SIS, the desired distribution is approximated by $N$ particles. IS is then employed to propagate these particles through the stages using an importance density. At the initial stage $m = 0$, the importance density is assumed to be $q(x_0)$; hence, the importance weight can be defined as:

$$w_0^{\prime i} = \frac{\pi_\theta(x_0^i)\, \pi_\theta(y_0 \mid x_0^i)}{q(x_0^i)}.$$

At stage $m \geq 1$, the importance weight is computed as:

$$w_m^{\prime i} = \frac{f_\theta(x_m^i \mid x_{m-1}^i)\, \pi_\theta(y_m \mid x_m^i)}{q(x_m^i \mid x_{m-1}^i)}.$$

Then, the weight can be normalised as:

$$w_m^i = \frac{w_m^{\prime i}}{\sum_{j=1}^{N} w_m^{\prime j}}.$$

The set of particles $\{x_m^i\}$ together with the corresponding weights $\{w_m^i\}$ constitutes the SIS approximation for the state space model and is called a particle filter.

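Algorithm 6 Sequential Importance Sampling for state space model

1: For $i = 1, \dots, N$, initialise the sample

$$x_0^i \sim q(x_0)$$

Assign the initial weight:

$$w_0^{\prime i} = \frac{\pi_\theta(x_0^i)\, \pi_\theta(y_0 \mid x_0^i)}{q(x_0^i)}, \qquad w_0^i = \frac{w_0^{\prime i}}{\sum_{j=1}^{N} w_0^{\prime j}}$$

2: At the next stage $m = 1, \dots, M$ and for $i = 1, \dots, N$, propagate:

$$x_m^i \sim q(x_m \mid x_{m-1}^i)$$

3: Compute the importance weight:

$$w_m^{\prime i} = w_{m-1}^i\, \frac{f_\theta(x_m^i \mid x_{m-1}^i)\, \pi_\theta(y_m \mid x_m^i)}{q(x_m^i \mid x_{m-1}^i)}$$

4: Compute the normalised weight:

$$w_m^i = \frac{w_m^{\prime}(x_{0:m}^i)}{\sum_{j=1}^{N} w_m^{\prime}(x_{0:m}^j)}.$$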
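The four steps of Algorithm 6 can be sketched in code as follows. This is an illustration under stated assumptions rather than a definitive implementation: the linear-Gaussian model, the widened Gaussian proposals controlled by `q_scale`, and the names `sis` and `normal_pdf` are all hypothetical and not from the text.

```python
import math
import random

def normal_pdf(v, mean, sd):
    """Density of N(mean, sd^2) evaluated at v."""
    return math.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def sis(ys, N=1000, phi=0.9, sigma_x=1.0, sigma_y=0.5, q_scale=1.5, seed=0):
    """Sequential importance sampling following steps 1-4 of Algorithm 6.

    Assumed (hypothetical) model: x_0 ~ N(0, sigma_x^2 / (1 - phi^2)),
    x_m | x_{m-1} ~ N(phi * x_{m-1}, sigma_x^2), y_m | x_m ~ N(x_m, sigma_y^2).
    The proposals are the same Gaussians with sd inflated by q_scale.
    Returns, for each stage, the particles and their normalised weights.
    """
    rng = random.Random(seed)
    sd0 = sigma_x / math.sqrt(1.0 - phi ** 2)
    q0_sd, q_sd = q_scale * sd0, q_scale * sigma_x
    # Step 1: sample from q(x_0); weight = prior * likelihood / proposal.
    xs = [rng.gauss(0.0, q0_sd) for _ in range(N)]
    raw = [normal_pdf(x, 0.0, sd0) * normal_pdf(ys[0], x, sigma_y)
           / normal_pdf(x, 0.0, q0_sd) for x in xs]
    total = sum(raw)
    ws = [r / total for r in raw]
    history = [(xs, ws)]
    for y in ys[1:]:
        # Step 2: propagate each particle through q(x_m | x_{m-1}).
        new_xs = [rng.gauss(phi * x, q_sd) for x in xs]
        # Step 3: previous weight times transition * likelihood / proposal.
        raw = [w * normal_pdf(xn, phi * x, sigma_x) * normal_pdf(y, xn, sigma_y)
               / normal_pdf(xn, phi * x, q_sd)
               for w, x, xn in zip(ws, xs, new_xs)]
        # Step 4: normalise the weights.
        total = sum(raw)
        xs, ws = new_xs, [r / total for r in raw]
        history.append((xs, ws))
    return history

history = sis(ys=[0.3, -0.1, 0.8])
xs_last, ws_last = history[-1]
filter_mean = sum(w * x for x, w in zip(xs_last, ws_last))
```

Here `filter_mean` is the weighted particle estimate of the mean of the last filtering distribution. For long observation sequences the raw weights can underflow, so a practical version would typically work with log-weights.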

The conditional likelihood term can then be approximated as:

$$\hat{\pi}_\theta(y_m \mid y_{0:m-1}) \approx \frac{1}{N} \sum_{i=1}^{N} w_m^i.$$

The procedure of this approach is described in Algorithm 6. Different types of proposal density can be chosen. One possible option is to use the state transition density $f_\theta(x_m \mid x_{m-1})$ as the proposal; this results in a particular case of the particle filter known as the bootstrap particle filter.
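To see why the state transition is a convenient proposal, substitute it into the weight update of step 3 in Algorithm 6. Under that choice, and assuming no other change to the algorithm, the transition density cancels and the weight update depends only on the observation density:

```latex
q(x_m \mid x_{m-1}) = f_\theta(x_m \mid x_{m-1})
\quad\Longrightarrow\quad
w_m^{\prime i}
= w_{m-1}^{i}\,
  \frac{f_\theta(x_m^i \mid x_{m-1}^i)\,\pi_\theta(y_m \mid x_m^i)}
       {f_\theta(x_m^i \mid x_{m-1}^i)}
= w_{m-1}^{i}\,\pi_\theta(y_m \mid x_m^i).
```

The transition density therefore never has to be evaluated, only sampled from.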

In document Bayesian inference for continuous time Markov chains (Page 67-71)