Chapter 4 Bayesian Approach to Bar Code Denoising
5.3 Kullback-Leibler minimisation
5.3.3 Variational Problem
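Recall that
\[
\mathcal{A} = \left\{ N\left(m,\, 2(-\partial_t^2 + B_\varepsilon)^{-1}\right) : (m, A) \in \mathcal{H} \right\},
\]
where $B_\varepsilon = \varepsilon^{-2} A^2 - \varepsilon^{-1} A'$, and that the set $\mathcal{A}_a$ is defined in the same way with $\mathcal{H}$ replaced by $\mathcal{H}_a$ for some $a > 0$. Given the measure $\mu_\varepsilon$ defined by (5.17), i.e. the law of transition paths, we aim to find optimal Gaussian measures $\nu_\varepsilon$ from $\mathcal{A}$ or $\mathcal{A}_a$ minimising the Kullback-Leibler divergence $D_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$. To that end, first in view of (5.40), the constants $\frac{|x_1 - x_0|^2}{4}$ and $\log(Z_{\mu,\varepsilon})$ can be neglected in the minimisation process since they do not depend on the choice of $\nu_\varepsilon$. Hence we are only concerned with minimising the modified Kullback-Leibler divergence $\widetilde{D}_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$. Furthermore, instead of minimising $\widetilde{D}_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$, we consider the variational problem
\[
\inf_{\nu_\varepsilon \in \mathcal{A}} \left( \varepsilon \widetilde{D}_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon) + \varepsilon^{\gamma} \|A\|^2_{H^1(0,1)} \right), \tag{5.42}
\]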
where $\gamma > 0$ and $\mathcal{A}$ is given by (5.31). We will also study the minimisation problem over the set $\mathcal{A}_a$. The problem (5.42) is of interest to us for the following reasons. First, multiplying $\widetilde{D}_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$ by $\varepsilon$ does not change the minimisers. Yet after this scaling the $m$-dependent terms of $\widetilde{D}_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$ (the first two terms on the right-hand side of (5.41)) and the $A$-dependent terms (middle line of (5.41)) are well balanced, since they are all order-one quantities with respect to $\varepsilon$. Moreover, the regularisation term $\varepsilon^{\gamma} \|A\|^2_{H^1(0,1)}$ is necessary because the matrix $B_\varepsilon$, along any infimising sequence for $\varepsilon \widetilde{D}_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$, will only converge weakly, so the infimum may not be attained in $\mathcal{A}$. This issue is illustrated in [187, Example 3.8 and Example 3.9] and a similar regularisation is used there.
Remark 5.3.3. The normalisation constant $Z_{\mu,\varepsilon}$ in (5.40) is dropped in our minimisation problem. This is one of the advantages of quantifying measure approximations by means of the Kullback-Leibler divergence. However, understanding the asymptotic behaviour of $Z_{\mu,\varepsilon}$ in the limit $\varepsilon \to 0$ is quite important, even though it is difficult. In particular, it allows us to study the asymptotic behaviour of the scaled Kullback-Leibler divergence $\varepsilon D_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$, from which quantitative information on the quality of the Gaussian approximation in the small temperature limit can be extracted. In the next section we study the behaviour of the minimisers of the functional defined in (5.42) in the limit $\varepsilon \to 0$; we postpone the study of $\varepsilon D_{\mathrm{KL}}(\nu_\varepsilon \| \mu_\varepsilon)$, which requires analysis of $Z_{\mu,\varepsilon}$ in the limit $\varepsilon \to 0$, to future work.
Remark 5.3.4. We choose the small weight $\varepsilon^{\gamma}$ with some $\gamma > 0$ in front of the regularisation term with the aim of weakening the contribution from the regularisation so that it disappears in the limit $\varepsilon \to 0$. For the study of the $\Gamma$-limit of $F_\varepsilon$, we will consider $\gamma \in (0, \frac{1}{2})$; see Theorem 5.4.5 in the next section.
Remark 5.3.5. The Kullback-Leibler divergence is not symmetric in its arguments. We do not study $\widetilde{D}_{\mathrm{KL}}(\mu_\varepsilon \| \nu_\varepsilon)$ because minimisation of this functional over the class of Gaussian measures leads simply to moment matching, and this is not appropriate for problems with multiple minimisers; see [19, Section 10.7].
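The asymmetry described in Remark 5.3.5 can be seen in a one-dimensional toy problem. The sketch below is only an illustration under stated assumptions: the bimodal target (an equal mixture of $N(-3,1)$ and $N(3,1)$), the grid, and the brute-force search are all hypothetical choices and are not the path-space setting of this chapter. It compares the Gaussian minimising $D_{\mathrm{KL}}(\mu \| \nu)$, which should reproduce the mean and variance of the target, with the Gaussian minimising $D_{\mathrm{KL}}(\nu \| \mu)$, the direction used in this chapter.

```python
import numpy as np

# Hypothetical 1-D target mu: equal mixture of N(-3, 1) and N(3, 1).
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]


def log_gauss(x, mean, std):
    """Log-density of N(mean, std^2); logs avoid underflow in the tails."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))


log_mu = np.log(0.5) + np.logaddexp(log_gauss(x, -3.0, 1.0),
                                    log_gauss(x, 3.0, 1.0))


def kl(log_f, log_g):
    """Riemann-sum approximation of KL(f || g) on the grid."""
    return float(np.sum(np.exp(log_f) * (log_f - log_g)) * dx)


# Brute-force search over Gaussian candidates nu = N(m, s^2).
means = np.linspace(-5.0, 5.0, 101)
stds = np.linspace(0.5, 4.0, 71)


def best_gaussian(direction):
    """Return (value, mean, std) of the grid minimiser for one KL direction."""
    best = (np.inf, None, None)
    for m in means:
        for s in stds:
            log_nu = log_gauss(x, m, s)
            if direction == "mu||nu":
                value = kl(log_mu, log_nu)   # leads to moment matching
            else:
                value = kl(log_nu, log_mu)   # direction used in this chapter
            if value < best[0]:
                best = (value, m, s)
    return best


_, mm_mean, mm_std = best_gaussian("mu||nu")
_, ms_mean, ms_std = best_gaussian("nu||mu")
print("KL(mu||nu) minimiser: mean=%.2f std=%.2f" % (mm_mean, mm_std))
print("KL(nu||mu) minimiser: mean=%.2f std=%.2f" % (ms_mean, ms_std))
```

In this toy setting the first minimiser should sit between the two modes with a standard deviation close to $\sqrt{10}$ (the standard deviation of the mixture), while the second should concentrate on a single mode.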
The following theorem establishes the existence of minimisers for the problem (5.42).
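Theorem 5.3.6. Let $\mu_\varepsilon$ be the measure defined by (5.17) with fixed $\varepsilon > 0$. Then there exists at least one measure $\nu \in \mathcal{A}$ (or $\mathcal{A}_a$) minimising the functional
\[
\nu \mapsto \varepsilon \widetilde{D}_{\mathrm{KL}}(\nu \| \mu_\varepsilon) + \varepsilon^{\gamma} \|A\|^2_{H^1(0,1)} \tag{5.43}
\]
over $\mathcal{A}$ (or $\mathcal{A}_a$).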
Proof. We only prove the theorem for the case where the minimisation problem is posed over $\mathcal{A}_a$, since the other case can be treated in the same manner. First we show that the infimum of (5.43) over $\mathcal{A}_a$ is finite for any fixed $\varepsilon > 0$. To this end, consider $A^* = a \cdot \mathrm{Id}$ with $a > 0$ and let $m^*$ be any fixed function in $H^1_{\pm}(0,1)$. We show that $F(m^*, A^*)$ is finite. By the formula (5.41), we only need to show that
\[
\mathbb{E}^{\nu_\varepsilon} \int_0^1 \Psi_\varepsilon(z(t) + m^*(t))\, dt < \infty.
\]
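Since $A^* = a \cdot \mathrm{Id}$, from (5.28) one can see that $z(t) \sim N(0, 2G_\varepsilon(t,t))$ under the measure $\nu_\varepsilon$. In addition, it follows from (5.71) that $|G_\varepsilon(t,t)|_F \le C\varepsilon$ a.e. on $(0,1)$ for some $C > 0$. Then from the growth condition (A-4) on $\Psi_\varepsilon$ and the fact that $m^* \in L^\infty(0,1)$,
\[
\begin{aligned}
\mathbb{E}^{\nu_\varepsilon} \int_0^1 \Psi_\varepsilon(z(t) + m^*(t))\, dt
&= \int_0^1 \int_{\mathbb{R}^d} \frac{1}{\sqrt{(4\pi)^d \det(G_\varepsilon(t,t))}}\, e^{-\frac{1}{4} x^T G_\varepsilon(t,t)^{-1} x}\, \Psi_\varepsilon(x + m^*(t))\, dx\, dt \\
&= \int_0^1 \int_{\mathbb{R}^d} \frac{1}{(4\pi)^{d/2}}\, e^{-\frac{1}{4}|x|^2}\, \Psi_\varepsilon\left(G_\varepsilon(t,t)^{1/2} x + m^*(t)\right) dx\, dt \\
&\le C_1 \exp\left(\|m^*\|^{\alpha}_{L^\infty(0,1)}\right) \int_{\mathbb{R}^d} e^{-\frac{1}{2}|x|^2 + C_2 \varepsilon^{\alpha} |x|^{\alpha}}\, dx < \infty,
\end{aligned}
\]
since $\alpha \in [0,2)$.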
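The first two equalities in the chain above are the density of $N(0, 2G)$ followed by the change of variables $x \mapsto G^{1/2} x$. A minimal numerical sanity check of that step is sketched below. Everything specific in it is an assumption made for illustration: the dimension $d = 2$, the fixed symmetric positive definite matrix `G` standing in for $G_\varepsilon(t,t)$ at a single time, the shift `m_star`, and the smooth test function `psi` with exponential growth of order one. The $t$-integral is omitted.

```python
import numpy as np

d = 2
G = np.array([[0.5, 0.2], [0.2, 0.3]])   # hypothetical stand-in for G_eps(t, t)
m_star = np.array([0.3, -0.1])           # hypothetical value of m*(t)


def psi(points):
    """Hypothetical smooth integrand growing like exp(|x|)."""
    return np.exp(np.sqrt(1.0 + np.sum(points ** 2, axis=-1)))


# Tensor-product grid on [-12, 12]^2 for a plain Riemann sum.
grid = np.linspace(-12.0, 12.0, 481)
h = grid[1] - grid[0]
X = np.stack(np.meshgrid(grid, grid, indexing="ij"), axis=-1)

# Left-hand side: integrate psi(x + m*) against the N(0, 2G) density.
G_inv = np.linalg.inv(G)
quad_form = np.einsum("...i,ij,...j->...", X, G_inv, X)
density = np.exp(-0.25 * quad_form) / np.sqrt((4.0 * np.pi) ** d * np.linalg.det(G))
mass = float(np.sum(density) * h ** d)
lhs = float(np.sum(density * psi(X + m_star)) * h ** d)

# Right-hand side: substitute x -> G^{1/2} x, leaving the N(0, 2 Id) density.
eigvals, eigvecs = np.linalg.eigh(G)
G_half = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T
std_density = np.exp(-0.25 * np.sum(X ** 2, axis=-1)) / (4.0 * np.pi) ** (d / 2)
rhs = float(np.sum(std_density * psi(X @ G_half + m_star)) * h ** d)

print("total mass of the N(0, 2G) density:", mass)
print("relative difference between the two integrals:", abs(lhs - rhs) / abs(rhs))
```

The symmetric square root is built from an eigendecomposition, so `X @ G_half` applies $G^{1/2}$ to every grid point. The two integrals should agree up to quadrature error.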
Next, we prove that the minimiser exists. By examining the proof of [187, Theorem 3.10], one can see that the theorem is proved if the following statement is valid: if a sequence $\{A_n\} \subset H^1_a(0,1)$ satisfies $\sup_n \|A_n\|_{H^1(0,1)} < \infty$, then the sequence $\{B_n\}$ with $B_n = \varepsilon^{-2} A_n^2 - \varepsilon^{-1} A_n'$, viewed as multiplication operators, contains a subsequence that converges to $B = \varepsilon^{-2} A^2 - \varepsilon^{-1} A'$ in $\mathcal{L}(H^{\beta}, H^{-\beta})$ for some $A \in H^1_a(0,1)$ and some $\beta \in (0,1)$. Hence we only need to show that the latter statement is true. In fact, if $\sup_n \|A_n\|_{H^1(0,1)} < \infty$, then there exists a subsequence $\{A_{n_k}\}$ and some $A \in H^1(0,1)$ such that $A_{n_k} \rightharpoonup A$ in $H^1(0,1)$. By Rellich's compact embedding theorem, $A_{n_k} \to A$ in $L^2(0,1)$, and passing to a further subsequence we may assume that $A_{n_k} \to A$ a.e. on $[0,1]$. This implies that $A$ is symmetric and $A \ge a \cdot \mathrm{Id}$ a.e., and hence $A \in H^1_a(0,1)$. In addition, it is clear that $B_{n_k} \rightharpoonup B$ in $L^2(0,1)$. According to Lemma 5.7.9, for any $\alpha, \beta > 0$ such that $\beta > \max(\alpha, \alpha/2 + 1/4)$, a matrix-valued function in $H^{-\alpha}(0,1)$ can be viewed as a multiplication operator in $\mathcal{L}(H^{\beta}, H^{-\beta})$. Thanks to the compact embedding from $L^2(0,1)$ to $H^{-\alpha}(0,1)$, we obtain $B_{n_k} \to B$ in $\mathcal{L}(H^{\beta}, H^{-\beta})$. The proof is complete.
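The compactness argument in the proof can be illustrated with a scalar toy sequence. The sketch below is an assumption-laden illustration, not part of the proof: it takes $d = 1$, hypothetical values of `eps` and `a`, and the oscillating sequence $A_n(t) = a + (1 + \sin(2\pi n t))/(2\pi n)$, which is bounded in $H^1(0,1)$, satisfies $A_n \ge a$, and converges to $A \equiv a$ in $L^2(0,1)$. As a computable stand-in for a negative-order Sobolev norm it uses the $L^2$ norm of the mean-free primitive. This is a proxy chosen here for simplicity and is not the $H^{-\alpha}$ norm of Lemma 5.7.9.

```python
import numpy as np

eps, a = 0.5, 1.0                    # hypothetical parameter values
t = np.linspace(0.0, 1.0, 20001)
dt = t[1] - t[0]
phi = t * (1.0 - t)                  # a fixed smooth test function


def integrate(f):
    """Composite trapezoidal rule on the uniform grid t."""
    return float(dt * (np.sum(f) - 0.5 * (f[0] + f[-1])))


def primitive(f):
    """Cumulative trapezoidal primitive F(t) = int_0^t f."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1])) * dt))


B_limit = a ** 2 / eps ** 2          # B = eps^-2 A^2 - eps^-1 A' with A = a

rows = []
for n in (1, 4, 16, 64):
    A_n = a + (1.0 + np.sin(2.0 * np.pi * n * t)) / (2.0 * np.pi * n)
    dA_n = np.cos(2.0 * np.pi * n * t)           # exact derivative of A_n
    B_n = A_n ** 2 / eps ** 2 - dA_n / eps
    diff = B_n - B_limit
    strong = np.sqrt(integrate(diff ** 2))       # L^2 distance to B
    pairing = integrate(diff * phi)              # weak pairing with phi
    F = primitive(diff)
    negative = np.sqrt(integrate((F - integrate(F)) ** 2))  # proxy norm
    rows.append((n, strong, pairing, negative))

for n, strong, pairing, negative in rows:
    print("n=%3d  L2=%.4f  pairing=%.2e  primitive-norm=%.2e"
          % (n, strong, pairing, negative))
```

In this example the $L^2$ distance between $B_n$ and $B$ does not tend to zero, because the derivative term keeps oscillating with fixed amplitude, while both the pairing with the test function and the primitive-based norm become small as $n$ grows. This is the reason the proof measures convergence of $B_{n_k}$ in a weaker operator topology rather than in $L^2(0,1)$.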
Remark 5.3.7. Minimisers of (5.43) are not unique in general. The uniqueness issue is outside the scope of this chapter; for further discussion of uniqueness of minimisers of the Kullback-Leibler divergence, see [187, Section 3.4].