We may also obtain π via Gaussian elimination, using the block tridiagonal structure of πΏ, as follows. First we form the matrix sequence {πΏπ}:
πΏ0=πΏ0,0
πΏπ+1=πΏπ+1,π+1β Ξ£β1πππΏβ1π π
πΞ£β1, π = 0, . . . , π½ β 1; (14.4)
and we form the vector sequence {π§π}:
π§0 =πΆ0β1π0,
π§π+1=π»πΞβ1π¦π+1β Ξ£β1πππΏβ1π π§π.
We may then read off ππ½ from the equation πΏπ½ππ½ = ππ½ and finally we perform
back-substitution to obtain
πΏπππ = π§πβ πΏπ,π+1ππ+1, π = π½ β 1, .., 1.
Note that ππ½ found this way coincides with the mean of the Kalman filter at π = π½.
Proposition 14.3. The matrices {πΏπ} in (14.4) are positive definite Proof. The proof of this theorem relies on the following two lemmas:
Lemma 14.4. If π := β‘ β’ β’ β’ β£ π1 Γ Γ Γ Γ π2 Γ Γ Γ Γ ... Γ Γ Γ Γ ππ β€ β₯ β₯ β₯ β¦
is positive-definite symmetric then ππ is positive-definite symmetric for all π β {1, . . . , π}.
Lemma 14.5. Let P be a block upper(lower) triangular matrix with identity on the diagonal. Then, P is an invertible matrix.
Using Lemma14.4, we deduce that πΏ0 = πΏ0,0is positive-definite symmetric. Consider
the matrix π β Rπ (π½ +1)Γπ (π½ +1) defined as : π = β‘ β’ β’ β’ β£ πΌ 0 0 βπΏ1,0πΏ(β1)0 πΌ ... ... 0 ... 0 0 πΌ β€ β₯ β₯ β₯ β¦ .
DRAFT
We compute π πΏππ = β‘ β’ β’ β’ β’ β’ β’ β’ β£ πΏ0 0 0 0 πΏ1 πΏ1,2 ... 0 πΏ2,1 πΏ2,2 ... 0 πΏπ½ β1,π½ β1 πΏπ½ β1,π½ 0 πΏπ½,π½ β1 πΏπ½,π½ β€ β₯ β₯ β₯ β₯ β₯ β₯ β₯ β¦ .By using the Lemma14.4,
Λ πΏ = β‘ β’ β’ β’ β’ β’ β£ πΏ1 πΏ1,2 ... 0 πΏ2,1 πΏ2,2 ... πΏπ½ β1,π½ β1 πΏπ½ β1,π½ πΏπ½,π½ β1 πΏπ½,π½ β€ β₯ β₯ β₯ β₯ β₯ β¦
is positive definite, and so is πΏ0.
Lemma 14.4 gives us that πΏ1 positive definite as follows. By Lemma 14.5 the
following matrix, which we will use to transform the above matrix, is invertible.
π2= β‘ β’ β’ β’ β£ πΌ 0 0 βπΏ2,1πΏβ11 πΌ ... ... ... 0 0 πΌ β€ β₯ β₯ β₯ β¦ . Thus we have π2πΏπΛ 2π = β‘ β’ β’ β’ β’ β’ β’ β’ β£ πΏ1 0 0 0 πΏ2 πΏ2,3 ... πΏ3,2 πΏ3,3 0 πΏπ½ β1,π½ β1 πΏπ½ β1,π½ 0 πΏπ½,π½ β1 πΏπ½,π½ β€ β₯ β₯ β₯ β₯ β₯ β₯ β₯ β¦ ,
giving the positive definiteness of πΏ1. We may iterate our reasoning to argue that all
the πΏπ are similarly positive definite.
14.4 Discussion and Bibliography
The subject of Kalman smoothing is overviewed in the text [5]; see also [92]. A link between the standard implementation of the smoother and Gauss-Newton methods for MAP estimation is made in [8]. Fur further details on the Kalman smoother, in both discrete and continuous time, see [70] and [44].
DRAFT
15
Filtering Approach to the Inverse Problem
In this final chapter we demonstrate how the two separate themes that underpin this course, inverse problems and data assimilation, may be linked. This opens up the possibility of transfering ideas from filtering into the setting of quite general inverse problems. In the first section we describe the general, abstract, connection and introduce the revolutionary idea of sequential Monte Carlo methods (SMC). In the second section we analyze the concrete case of applying the EnKF to solve an inverse problem, leading to ensemble Kalman inversion (EKI). In the final section we link the EKI methodology to SMC.
15.1 General Formulation
Recall the inverse problem of finding π’ β π³ from π¦ β Rπ½ where
π¦ = πΊ(π’) + π, π βΌ π (0, Ξ0) (15.1)
and the related loss function
π0(π’) =
1
2|π¦ β πΊ(π’)|
2 Ξ0.
The reason for writing Ξ0, rather than Ξ, will become apparent below and will also be exploited in the third section of this chapter where we study EKI. The reason for considering π³ rather than Rπ as in the earlier chapters on inversion is twofold: (i) we avoid notational conflict with π the number of ensemble members; (ii) we emphasize that the ideas of this chapter apply quite generally when unknown π’ lies in in a Hilbert space.
If we put a prior π0 on the unknown π’ then the posterior takes the form
π(π’) = 1
π exp(βπ0(π’))π0(π’).
Let π½ β N and choose β so that π½ β = 1. Then define the family of pdfs {ππ}π½ π=0 by
ππ(π’) =
1 ππ
exp(βπβπ0(π’))π0(π’).
It follows that ππ½ = π and, furthermore, we may update the sequence {ππ}π½π=0sequentially
using the formula
ππ+1(π’) = ππ ππ+1 exp(οΈ ββπ0(π’))οΈ ππ(π’).
This simply corresponds to application of Bayesβ Theorem to the inverse problem
π¦ = πΊ(π’) + π, π βΌ π (0, Ξ), (15.2)
with Ξ = β1Ξ0 and with prior ππ. With this choice of Ξ the identity (15.2) gives the
likelihood proportional to exp(ββπ0(π’)). The update may be written ππ+1(π’) = πΏππ,
DRAFT
noting that the likelihood operator πΏ is nonlinear, corresponding to multiplication by exp(ββπ0(π’)) and then normalization to a pdf.If we let ππ denote any Markov kernel for which ππ is invariant then we obtain
ππ+1= πΏππππ.
This update formula should be compared with the filtering update formula ππ+1 = πΏππ ππ
introduced and used in earlier chapters.
Whilst the πβdependence in Bayes rule and the Markov kernel is interchanged between the inverse problem and the filtering problem, this makes little material difference to implementation. Firstly, once a Markov kernel π is identified under which π = ππ½ is invariant, it can be easily adapted to find a family of kernels ππ under which
ππ is invariant, simply by rescaling the observation covariance. Secondly the fact that
the data π¦ is fixed, rather than changing at each step, makes little difference to the implementation. This perspective opens up an entire field, known as sequential Monte Carlo, in which ideas from filtering may be transferred to other quite different problems,
including Bayesian inversion, as explained here.
Note however (see the discussion at the end of section13.6) that of the filtering methods we have introduced so far only the bootstrap filter applies directly to the case of nonlinear observation operators; since πΊ is in general nonlinear this means that extra ideas are required to implement the optimal particle filter, 3DVAR, ExKF and EnKF. One approach is to replace the sequential optimization principles by the non-quadratic optimization problem required for nonlinear πΊ; we do not discuss this idea in any detail but it is a viable option. Another approach is to use the linearization technique which we now outline in the context of EKI.