• No results found

Kalman Filter: A Simple Derivation

N/A
N/A
Protected

Academic year: 2020

Share "Kalman Filter: A Simple Derivation"

Copied!
5
0
0

Loading.... (view fulltext now)

Full text

(1)

Kalman Filter: A Simple Derivation

Ali M. Mosammam

Department of Statistics, University of Zanjan, Zanjan, Iran Corresponding Author: [email protected]

Copyright c2015 Horizon Research Publishing All rights reserved.

Abstract

The Kalman filter is a recursive estimator and plays a fundamental role in statistics for filtering, prediction and smoothing. The key element in any recursive estimator is the estimate of the current state, xk, at timek, based on observations up to and including

observationkand the Kalman filter enables the estimate of the state to be updated as new observations become available. In this paper we have tried to derive the Kalman filter as simple as possible.

Keywords

Kalman Filter, State-space Model, Dynamic System, Gaussian Process

1

Introduction

The analysis of a time series [5] and spatial analysis [2] can be done in one of two ways. Firstly, it could be analysed in a batch. In other words, the whole time se-ries to date is analysed. Secondly, analysis of the time series sequentially. With this approach the current esti-mate of the state of the system is updated over time as new measurements are made. Whilst both methods will give the same answer, the sequential method has two advantages:

1. reduction in the computational cost 2. reduction in the storage capacity.

The disadvantage of sequential methods is appeared when the posterior can not be calculated exactly. In this paper we will concentrate on the state-space approach to modelling dynamic systems. In order to analyze a dynamic system, at least two models are required:

First, a model describing the evolution of the state with time (the state model)

Second, a model relating the noisy measurements to the state (the measurement model).

In the Bayesian approach to dynamic state estimation, one attempts to construct the posterior probability den-sity function of the state based on all available informa-tion, including the set of received measurements. For many problems, an estimate is required every time that a measurement is received. In this case, a recursive filter

is a convenient solution. A recursive filtering approach means that received data can be processed sequentially rather than as a batch so that it is not necessary to store the complete data set nor to reprocess existing data if a new measurement becomes available. There are classes of problems for which the recoursive Bayesian solutions are tractable. The most important of these classes is the set of problems where the state and observation equa-tions are linear, and the distribuequa-tions of the prior, and observation and state noise are Gaussian. In this case no algorithm can ever do better than a Kalman filter [4]. In this paper we have tried to derive the Kalman filter properly. If the state space model is linear with uncor-related Gaussian noise and a Gaussian prior, the prior and posterior distributions at each time step are them-selves Gaussian random variables. A Gaussian random variable’s distribution is uniquely defined by the speci-fication of its mean and variance. The Kalman filter is a recursive estimator and plays a fundamental role in statistics for filtering, prediction and smoothing. The key element in any recursive estimator is the estimate of the current state , xk, at time k, based on observations

up to and including observationkand the Kalman filter enables the estimate of the state to be updated as new observations become available.

2

Basic model

The basic model in order to derive the Kalman filter is described by linear, discrete-time, finite-dimensional state-space equations

xk+1 = axk+wk, k≥0 (state equation) (1)

yk = cxk+vk (measurement model)(2)

Here we shall make the following assumptions:

1. {vk} and {wk} are individually independent,

zero-mean, Gaussian processes with known variances. Ifwk∼N(0, σw2), then we haveE(wk) = 0, var(wk) =

σw2 andE(wkwℓ) =E(wk)E(wℓ) = 0for all=ℓ.

If vk N(0, σv2), we have E(vk) = 0, var(vk) =

E(vk2) =σv2 andE(vkvℓ) = 0for all=ℓ.

2. {vk} and {wk} are independent processes. Hence

(2)

3. At the initial time, k = 0, the initial state, x0

Nx0, P0), further we shall assume that x0 is

inde-pendent ofvk andwk for anyk. Hence

(a) E(xkvℓ) = 0for all k,.

(b) E(xkwℓ) = 0for all k,.

This state-space model has some properties:

1. Sincexkis a linear combination of the jointly

Gaus-sian random variables x0, w0, w1,· · ·, wk−1, it is a

Gaussian variable. 2. {xk}is a Markov process.

3. {yk}is a Gaussian process.

4. {xk}and{yk}are jointly Gaussian.

We have made these assumptions in order to derive the Kalman filter easily. Some of these assumptions can be relaxed in practice.

3

Scalar Kalman filter

We have already introduced the basic models (1, 2) to derive the Kalman filter. We want to show that under some assumptions conditional expectation yields a set of recursive equations which is known the Kalman filter re-cursions. Here we shall denote the following quantities: xˆk|k=E(xk|y0,· · ·, yk)is the estimate of the current

state , xk, at time k, based on observations up to

and including observationk.

xˆk|k−1 =E(xk|y0,· · ·, yk1) is the predictor of the

current state ,xk, at timek, based onk−1

obser-vations.

Σk|k =var(xk|y0,· · ·, yk) is the variance of the

cur-rent state ,xk, at timek, based on observations up

to and including observationk.

Σk|k1=var(xk|y0,· · ·, yk−1) is the variance of the

current state , xk, at time k, based on first k−1

observations.

We might interested in prediction ofE(yk|y0,· · ·, yk−1).

Similarly notations can be established for observation equation. From (2)and the third assumption it can be written

[

x0

y0 ]

∼N2 ([

˜

x0

cx˜0 ]

,

[

P0 cP0

cP0 c2P0+σv2

])

.

Hence the conditional distribution takes the form x0|y0 Nx0|0,Σ0|0), which is completely specified by

following conditional mean and variance:

ˆ

x0|0=E(x0|y0) = ˜x0+c2PcP0+0σ2 v

(y0−cx˜0),

Σ0|0=var(x0|y0) =P0 (cP0) 2

c2P 0+σ2v.

Using state-space equation (1, 2) fork= 0, we get x1=ax0+w0

y1=cx1+v1.

Using,

[

x1

y0 ]

∼N2 ([

a˜x0

cx˜0 ]

,

[

a2P0+σw2 acP0

acP0 c2P0+σ2v

])

[

y1

y0 ]

∼N2 ([

acx˜0

cx˜0 ]

,

[

c2(a2P0+σ2w) +σv2 ac2P0

ac2P0 c2P0+σ2v

])

,

the conditional distributions are themselves normal and completely specified by conditional means and vari-ances

x1|y0 Nx1|0,Σ1|0)

ˆ

x1|0 = ax˜0+

acP0

c2P 0+σv2

(y0−cx˜0) =axˆ0|0

Σ1|0 = a2P0+σw2

(acP0)2

c2P 0+σ2v

=a2Σ0|0+σ2w

y1|y0 N(cxˆ1|0, c 2

Σ1|0+σ2v)

E(y1|y0) = acx˜0+ ac 2

P0

c2P 0+σv2

(y0−cx˜0) =cxˆ1|0

var(y1|y0) = c2(a2P0+σw2) +σw2

(ac2P0)2

c2P 0+σv2

=c1|0+σ2w.

Furthermore,

[

x1

y1 ]

|y0∼N2 ([

ˆ

x1|0

cxˆ1|0

]

,

[

Σ1|0 cΣ1|0

cΣ1|0 c1|0+σ2v

])

.

Hencex1|y0, y1∼Nx1|1,Σ1|1), where its mean and

vari-ance are

ˆ

x1|1=E(x1|y0, y1) = ˆx1|0+

cΣ1|0

c2Σ 1|0+σv2

(y1−cxˆ1|0)

Σ1|1=var(x1|y0, y1) = Σ1|0

(cΣ1|0)2

c2Σ 1|0+σ2v.

Let we have repeated this method and found xk−1|y0,· · ·, yk−1 Nxk1|k1,Σk1|k1), using state

(1,2)we can update equations by

xk =axk−1+wk−1 (state equation)

yk−1=cxk−1+vk−1 (measurement model).

Hence the conditional distributions are xk|y0,· · ·, yk−1∼Nxk|k−1,Σk|k−1)

yk|y0,· · ·, yk1∼N(cxˆk|k−1, ck|k−1+σ2v)

wherexˆk|k1=axˆk1|k1andΣk|k1=ak1|k1+σw2

and [

xk

yk

]

|y0,· · ·, yk−1

∼N2 ([

ˆ

xk|k−1

cxˆk|k1

]

,

[

Σk|k−1 cΣk|k−1

cΣk|k1 ck|k1+σv2

])

Hence xk|y0,· · ·, yk∼Nxk|k,Σk|k), where

ˆ

xk|k=E(xk|y0,· · ·, yk) = ˆxk|k1+

cΣk|k−1

c2Σ k|k−1+σv2

(yk−cxˆk|k1)

Σk|k=var(xk|y0,· · ·, yk) = Σk|k−1

(cΣk|k−1)2

c2Σ

k|k−1+σ2v.

(3) So the Kalman filter equations can be written as

ˆ

xk|k=E(xk|y0,· · ·, yk) = ˆxk|k−1+bk(yk−cxˆk|k−1)

Σk|k=var(xk|y0,· · ·, yk) = Σk|k−1−cbkΣk|k−1.

(3)

where

bk=

cΣk|k−1

c2Σ

k|k−1+σ2v

. (5)

Notice that the conditional variance is independent of y0,· · ·, yk. Alternatively, using xˆk|k−1 = axˆk−1|k−1 the

Kalman filter can be written by

ˆ

xk|k=E(xk|y0,· · ·, yk) =axˆk1|k1+bk(yk−acˆxk1|k1).

Note thatˆxk|kis the estimate of statexkat timekbased

on the previous estimate and only one data at time k. No algorithm can ever do better than a Kalman filter in the linear Gaussian state space. It should be noted that it is possible to derive the same results by minimizing MSE.

4

Kalman filter predictor

We might intrested in E(xk+ℓ|y0,· · ·, yk), which is

known as the Kalman filter predictor. 1. One-step predictor:

The same as the last section we again update xk+1 applying xk+1 = axk + wk. Therefore

xk+1|y0,· · ·, yk ∼N(axˆk|k, ak|k+σ2w)and applying

equation(4), we obtain

ˆ

xk+1|k=axˆk|k =axˆk|k1+abk(yk−cxˆk|k1)

Σk+1|k =ak|k+σw2 =a2(1−cbkk|k−1+σw2.

(6) Note that if we define the innovation process {y˜k}

byyk−cxˆk|k−1and also assume thaty˜0=y0−E(y0),

then we have xˆk+1|k =axˆk|k−1+abky˜k in terms of

˜

yk.

2. Multi-step predictor:

Using (1), and substituting these sequences we can write

xk+=aℓxk+

i=1ai−1wk+ℓ−i.

Then the-step prediction is given by

ˆ

xk+|k =E(xk+ℓ|y0,· · ·, yk) =aℓxˆk|k,

andΣk+|k =var(xk+|k) =a2Σk|k+σw21−a

2

1−a2.Since

yk+=cxk++vk+ we have, then

E(yk+ℓ|y0,· · ·, yk) =cE(xk+ℓ|y0,· · ·, yk) =cxˆk+ℓ|k

var(yk+ℓ|y0,· · ·, yk) =ck+ℓ|k+σ2v.

5

Vector Kalman filter

We have dealt so far with the scalar state equation by a first order autoregressive process. We can write directly the vector Kalman filter which each one is the first-order autoregressive process:

xjk+1=ajxjk+wjk j= 1,2,· · ·, q. (7)

Let Xk = [x1k, x2k,· · ·, xqk]T and Wk =

[w1k, w2k,· · ·, wqk]T.

Theq equations (7) can be written as the first-order vector equation, Xk+1=AXk+Wk, where A is a q×q

matrix, in this case given byA=diag(a1, a2,· · ·, aq)and

Wk N(0, Qk). Assume that observation equations at

timek are

yik=cixik+vik i= 1,· · ·, r r < q.

We define observation vector equation byYk=CXk+Vk,

where

C=

     

c1 0 . . . 0 . . . 0

0 c2 . . . 0 . . . 0

..

. ... . .. ... . .. ...

0 0 . . . cr . . . 0

     

Yk = [y1k, y2k,· · ·, yrk]T , Vk = [v1k, v2k,· · ·, vqk]T and

Vk ∼N(0, Rk).

Hence we can directly transfer the scalar Kalman fil-ter, given by equations (3) to (5) in to the corresponding vector Kalman filter form

ˆ

Xk|k= ˆXk|k1+βk(Yk−CXˆk|k−1)

Σk|k= Σk|k−1−βkCΣk|k−1,

where βk = Σk|k1CT[CΣk|k1CT +Rk]1 andΣk|k1=

AΣk1|k1AT +Qk−1.

We can obtain the following vector and matrix set of equation directly from scalar Kalman predictor(6).

ˆ

Xk+1|k=AXˆk|k1+Aβk(Yk−CXˆk|k−1)

Σk+1|k=A(I−βkCk|k1AT +Qk,

whichβk is already defined.

6

Some properties of Kalman

fil-ter

1. The Kalman filter defined in the last sections was conditional expectation of E(xk|y0,· · ·, yk) which

minimizes the mean square error. It is also max-imum likelihood estimator and the best linear un-biased estimator whose variance is less than any other linear unbiased estimators. Even we drop nor-mality assumption, E(xk|y0,· · ·, yk) also minimizes

the variance. Here E(xk|y0,· · ·, yk) is a function

ofy0,· · ·, yk and without the normality assumption

will not necessarily be linear.

2. LetAY+bbe a linear estimator ofXgivenY, where Ais a fixed matrix andbis a fixed vector. We define a linear minimum variance estimator,Xˆ(Y) =A0Y+

b0, contrast with E(X|Y) the calculation of Xˆ(Y)

does not require the joint probability density. If X and Y are jointly Gaussian, the minimum variance and linear minimum variance estimators coincide. 3. If Xˆ(Y) is a linear minimum variance estimate of

X, then CXˆ(Y) +d is a linear minimum variance estimator ofCX+d, whereCanddare fixed matrix and vector respectively. It is clearly property of linearity of the expectation operator.

4. Xˆ(Y) is unbiased, because E( ˆX(Y)) = E{µx +

ΣxyΣ−yy1(Y−µy)}=µx. Recall thatE(X|Y)is a

con-ditional minimum variance estimator. Infact Xˆ(y)

(4)

5. Kalman filter can be extended to many nonlinear problems [3]. As most real world models are non linear and non Gaussian and many analysis tasks in-volve estimating the state of a dynamic model when only partial or inaccurate observations are available, it is of great interest to develop efficient computa-tional methods to solve this so called Bayesian fil-tering problem numerically. Many approximation methods such as a Extended Kalman filter (EKF) have been developed to cope with this problem. The EKF linearises the state and observation equations, and then uses the Kalman Filter to obtain estimates for the state. For more details see e.g. [1].

7

Smoothing Kalman filter

We already have considered xˆk|k = E(xk|y0,· · ·, yk)

and xˆk+ℓ|k = E(xk+ℓ|y0,· · ·, yk) as classic Kalman

fil-ter and predictor. One might be infil-terested in xˆj|k =

E(xj|y0,· · ·, yk), j < k, which is known as smoothing

Kalman filter. We expect this estimator to be more accurate than E(xk|y0,· · ·, yk), because more

measure-ments are used in producingxˆj|k. The simplest smooth-ing is fixed-point smoothsmooth-ing where, we want to deter-minexˆj|k=E(xj|y0,· · ·, yk)and the associated variance,

Σj|k=E{(xj−ˆxj|k)(xj−xˆj|k)T|y0,· · ·, yk}for some fixed

j and all k > j. Let an augmenting state x∗k is de-fined by x∗k+1 = x∗k, k j with x∗j = xj at the initial

time, then x∗k+1 =xj,∀k ≥j. Hence xˆ∗k+1|k = ˆxj|k and

Σ∗∗k+1|k = Σj|k, whereΣ∗∗k+1|k denotes the variance of er-ror (ˆx∗k+1|k−x∗k+1). The augment state space model is therefore

Xk+1= [

xk+1

x∗k+1

] = [ a 0 0 1 ] [ xk

x∗k

] + [ 1 0 ]

wk=AXk+Wk

Yk=yk=

[

c 0

] [ x

k

x∗k

]

+Vk=CXk+Vk,

with the state vector atk=j satisfying

[

xj

x∗j

] = [ xj xj ] .

Using the Kalman filter predictor

ˆ

Xk+1|k =

[

ˆ

xk+1|k

ˆ

x∗k+1|k

] = [ a 0 0 1 ] [ ˆ

xk|k−1

ˆ

x∗k|k1

] + [ a 0 0 1 ] [ bk

b∗k

] (

Yk−

[

c 0

] [ xˆk |k1

ˆ

x∗k|k1

])

,

where

βk=

[

bk

b∗k

]

=

[

Σk|k1 Σ∗k|k1

Σk|k1 Σ∗∗k|k1

] [

c

0

]

×([ c 0

] [ Σk|k1 Σ

k|k−1

Σk|k1 Σ∗∗k|k1

] [

c

0

]

+σv2

)1

,

and [

Σk+1|k Σ∗k+1|k

Σk+1|k Σ∗∗k+1|k

] = [ a 0 0 1 ] ([ 1 0 0 1 ] [ bk

b∗k

] [

c 0

])

×

[

Σk|k−1 Σ∗k|k1

Σk|k1 Σ∗∗k|k1

] [ a 0 0 1 ] + [

σw2 0

0 1

]

.

It can be written as

[

Σk+1|k Σk+1|k Σk+1|k Σ∗∗k+1|k

]

=

[

a2(1−cbkk|k1+σw2 a(1−cbk∗k|k1

a(1−cb∗kk|k−1 Σk∗∗|k−1−cb∗Σ∗k|k−1 ]

.

Note thatΣk+1|k appearing here is precisely that

associ-ated with the scalar Kalman filter equation, because the first element of the augmented Kalman filter is xˆk|k1.

Hence fork≥j,

ˆ

xj|k=E(xj|y0,· · ·, yk) = ˆxj|k−1+b∗k(yk−cxˆk|k−1),

b∗k=

cΣk|k1

c2Σ

k|k−1+σ2v,

Σk+1|k=a(1−cbk∗k|k−1,

with initial valueΣj|j1= Σj|j1.

The variance of the error term,(xj−ˆxj|k), is given by

Σj|k= Σj|k1−cb∗kΣ∗k|k−1.

The fixed-point smoother has some properties:

1. The fixed point smoother is driven from the innova-tions process,y˜k=yk−cˆxk|k−1 of the Kalman filter

for the nonaugmented model.

2. The smoother is a linear discrete time model of di-mension equal to that of the filter.

3. As in the case of filter,xj|y0,· · ·, ykis Gaussian and

therefore defined by the conditional meanxˆj|k and

Σj|k.

8

General form of the vector

Kalman filter

In general we can write the state-space equation by Xk+1=AkXk+GkWk, k≥0 (state equation)

Yk=CkXk+Vk (measurement model),

where we have all the assumptions which we made in the scalar one. Furthermore we assume that the covariance of the{Vk}is given byE[VkVℓT] =Rkδkℓwhere

δkℓ=

{

1 k=

0 o.w

and also covariance of the{Wk}process isE[WkWℓT] =

Qkδkℓ. Hence {Vk} and {Wk} are zero mean,

(5)

X0 ∼N( ˜X0, P0). Similar to scalar one by using

condi-tional expectation we have

[

X0

Y0 ]

∼N

([

˜

X0

C0X˜0 ]

,

[

P0 P0C0T

C0P0 C0P0C0T+R0 ])

.

Hence Xˆ

0|0 = ˜X0+P0C0T(C0P0C0T +R0)1(Y0−C0X˜0)

and covariance is defined by

Σ0|0=P0−P0C0T(C0P0C0T+R0)1C0P0. We can update

state equation byX1=A0X0+G0W0. SoXˆ1|0=A0Xˆ0|0

andΣ1|0=A0Σ0|0AT+G0Q0GT0. Also fromY1=C1X1+

V1 it follows that Y1 condition on Y0 is Gaussian with

mean, Yˆ

1|0 = C1Xˆ1|0, and covariance, C1Σ1|0C1T +R1.

Hence

[

X1

Y1 ]

|Y0∼N

([ ˆ

X1|0

C1Xˆ1|0 ]

,

[

Σ1|0 Σ1|0C1T

C1Σ1|0 C1Σ1|0C1T+R1 ])

.

ThenX1 given Y0, Y1 has normal distribution by

con-ditional mean and covariance,

ˆ

X1|1= ˆX1|0+ Σ1|0C1T(C1Σ1|0C1T+R1)1(Y1−C1Xˆ1|0)

Σ1|1= Σ1|0Σ1|0C1T(C1Σ1|0C1T+R1)1C1Σ1|0.

Repeating this method yields

ˆ

Xk|k= ˆXk|k−1+Bk(Yk−CkXˆk|k−1)

Σk|k= Σk|k1−BkCkΣk|k1,

where Bk = Σk|k1CkT(CkΣk|k1CkT + Rk)1 and

Σk|k1=Ak−k1|k1ATk−1+Gk−1Qk−1GTk−1.

REFERENCES

[1] B. D. Anderson and J. B. Moore. Optimal filtering, Courier Corporation, 2012.

[2] N. Cressie. Statistics for Spatial Data, Revised edition, John Wiley, New York, 1993.

[3] D. Crisan and B. Rozovskii. The Oxford handbook of nonlinear filtering, Oxford University Press , 2011.

[4] R. E. Kalman. A new approach to linear filtering and prediction problems, Journal of Fluids Engineering, Vol.82, No.1, 35-45, 1960.

References

Related documents

Whereas The Binding of Isaac focuses on real world religion and religious stories, the Dragon Age series focuses on religions specifically created for the world these video

We have therefore developed an image mosaicing algo- rithm, which stitches single images of a PDD bladder video sequence and finally provides an expanded panoramic image of the

The first one is that conversely to earlier results where the percentage of the basic spatial unit covered by agriculture had a negative impact on dwelling rents while

Upon obtaining search data for the above mentioned terms and following approaches used in other research using Google Trends to examine search term fluctuation fol- lowing news

Methods: All patients with HCAs with malignant transformation who underwent hepatectomy at the Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking

Although we have found a detectable increase in the STIM1-TRPC1 interaction upon stimulation with Thr, our findings indicate that it is unlikely that SOCE is enhanced, and,

The BioBio indicators revealed that, with 440 plant species, 171 bee species, 261 spider species, and 16 earthworm species, overall species richness was highest here among the

Smith Alex Scales Mike Carson Terik Brown Darius Wright 1997-98 Jamar Curry Henry Madden Flo Hartenstein Terik Brown Mike McShane 1996-97 Jamar Curry Kyle Milling Rob Ramaker