Chapter 6 A Bayesian Level Set Method for Geometric Inverse Problems
6.2 Bayesian Level Set Inversion
6.2.3 Well-Posed Bayesian Inverse Problem
We now formulate the Bayesian approach to finding $u$ from $y$ given by (6.5). All quantities are treated as random variables and we seek the posterior probability distribution on $u$ given $y$, given a prior probability distribution on $u$ and an independent probabilistic specification of the noise $\eta$. Let $U$ denote a separable Banach space and define a complete probability space $(U, \Sigma, \mu_0)$ for the unknown $u$. Here $\Sigma$ and $\mu_0$ are the sigma algebra and prior probability measure, respectively. (In our applications $U$ will be the space $C(D;\mathbb{R})$, but we state our main theorem in more generality.) Assume that the noise $\eta$ is a random draw from the centred Gaussian $Q_0 := N(0,\Gamma)$. Allowing for non-Gaussian $\eta$ is also possible, as is dependence between $\eta$ and $u$; however, we elect to keep the presentation simple. We may now define the joint random variable $(u, y) \in U \times \mathbb{R}^J$. The posterior probability distribution $\mu^y$ on the random variable $u|y$ describes our probabilistic knowledge about $u$ on the basis of the measurements $y$ given by (6.5) and the prior information $\mu_0$ on $u$. In the case where the map $G$ is continuous, one can apply an infinite dimensional version of Bayes' theorem [215] to show that the posterior $\mu^y$ exists and has a density with respect to the prior of the form
\[
\frac{d\mu^y}{d\mu_0}(u) = \frac{1}{Z} \exp\bigl(-\Phi(u;y)\bigr),
\]
where $Z$ is the normalisation constant. To extend the theory to allow discontinuous $G$, we now state a set of assumptions on the potential $\Phi$ under which the posterior distribution is well-defined via its density with respect to the prior distribution, and is Lipschitz in the Hellinger metric with respect to the data $y$. These assumptions will be verified for the level set formulation of interest to us.
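Assumptions 6.2.2. The function $\Phi : U \times \mathbb{R}^J \to \mathbb{R}$ and the probability measure $\mu_0$ on the measure space $(U, \Sigma)$ satisfy the following properties:
1. for every $r > 0$ there is a $K = K(r)$ such that, for all $u \in U$ and all $y \in \mathbb{R}^J$ with $|y|_\Gamma < r$,
\[
0 \le \Phi(u;y) \le K;
\]
2. for any fixed $y \in \mathbb{R}^J$, $\Phi(\cdot;y) : U \to \mathbb{R}$ is continuous $\mu_0$-almost surely on the complete probability space $(U, \Sigma, \mu_0)$;
3. for $y_1, y_2 \in \mathbb{R}^J$ with $\max\{|y_1|_\Gamma, |y_2|_\Gamma\} < r$, there exists a $C = C(r)$ such that, for all $u \in U$,
\[
|\Phi(u;y_1) - \Phi(u;y_2)| \le C\,|y_1 - y_2|_\Gamma.
\]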
For our Bayesian level set inverse problem with finite observations and noise $\eta \sim Q_0$, the function $\Phi : U \times \mathbb{R}^J \to \mathbb{R}^+$ has the least squares form
\[
\Phi(u;y) = \frac{1}{2}\,\bigl|y - \mathcal{G}(u)\bigr|_\Gamma^2 \tag{6.9}
\]
with $|\cdot|_\Gamma := |\Gamma^{-1/2}\,\cdot\,|$ and $\mathcal{G} = O \circ G \circ F$. Clearly $\Phi$ defined in (6.9) satisfies the first and the last items of Assumptions 6.2.2. We will show in the next section that, for some model problems, the second item of Assumptions 6.2.2 is also fulfilled by $\Phi$ in (6.9).
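As a small numerical illustration of (6.9), the sketch below evaluates the least squares potential and checks a Lipschitz-type bound in the data. It assumes a diagonal $\Gamma$ and takes the model output $\mathcal{G}(u)$ as a plain list; the names `gamma_norm_sq` and `potential` and all numerical values are hypothetical. The bound follows from the identity $|a|^2 - |b|^2 = \langle a + b, a - b \rangle$ and yields a constant $C(r)$ as in item 3 provided $|\mathcal{G}(u)|_\Gamma$ is bounded uniformly in $u$, which is an assumption made here for the illustration.

```python
import math

def gamma_norm_sq(v, gamma_diag):
    """Squared weighted norm |Gamma^{-1/2} v|^2 for a diagonal Gamma (an assumption)."""
    return sum(x * x / g for x, g in zip(v, gamma_diag))

def potential(y, g_u, gamma_diag):
    """Least squares potential Phi(u; y) = 0.5 * |y - G(u)|_Gamma^2.

    g_u stands for the model output G(u), supplied as a plain list.
    """
    residual = [yi - gi for yi, gi in zip(y, g_u)]
    return 0.5 * gamma_norm_sq(residual, gamma_diag)

gamma_diag = [0.5, 2.0]          # hypothetical noise variances
g_u = [1.0, -1.0]                # hypothetical model output G(u)
y1, y2 = [1.5, 0.0], [1.0, 1.0]  # two hypothetical data vectors

phi1 = potential(y1, g_u, gamma_diag)
phi2 = potential(y2, g_u, gamma_diag)

# |Phi(u;y1) - Phi(u;y2)| <= 0.5 * (|y1 - G(u)| + |y2 - G(u)|) * |y1 - y2|,
# with every norm taken in the Gamma-weighted sense.
r1 = math.sqrt(2.0 * phi1)
r2 = math.sqrt(2.0 * phi2)
dy = math.sqrt(gamma_norm_sq([a - b for a, b in zip(y1, y2)], gamma_diag))
lipschitz_bound = 0.5 * (r1 + r2) * dy
```

The potential is non-negative by construction, which is the lower bound in item 1.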
Recall that the Hellinger distance between $\mu$ and $\mu'$ is defined as
\[
d_{\mathrm{Hell}}(\mu, \mu') = \left( \frac{1}{2} \int_U \left( \sqrt{\frac{d\mu}{d\nu}} - \sqrt{\frac{d\mu'}{d\nu}} \right)^2 d\nu \right)^{1/2}
\]
for any measure $\nu$ with respect to which $\mu$ and $\mu'$ are absolutely continuous. The Hellinger distance is, however, independent of which reference measure $\nu$ is chosen. We have the following:
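The independence from the reference measure can be checked directly on a finite state space, where densities are ratios of point masses. The following is a minimal sketch under that assumption; the function name `hellinger` and the two example measures are hypothetical.

```python
import math

def hellinger(mu, mu_prime, nu):
    """Hellinger distance between two discrete measures, computed via densities w.r.t. nu.

    All three arguments are lists of point masses on the same finite set.
    nu must be positive wherever mu or mu_prime is positive.
    """
    total = 0.0
    for m, mp, n in zip(mu, mu_prime, nu):
        if n == 0.0:
            continue  # absolute continuity forces m = mp = 0 on this point
        total += (math.sqrt(m / n) - math.sqrt(mp / n)) ** 2 * n
    return math.sqrt(0.5 * total)

mu = [0.5, 0.5, 0.0]
mu_prime = [0.25, 0.25, 0.5]

counting = [1.0, 1.0, 1.0]                                # counting measure
mixture = [0.5 * (a + b) for a, b in zip(mu, mu_prime)]   # (mu + mu') / 2

d_counting = hellinger(mu, mu_prime, counting)
d_mixture = hellinger(mu, mu_prime, mixture)
```

Both reference measures give the same value up to rounding, because the factor $\nu$ cancels inside each summand.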
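Theorem 6.2.3. Assume that the least squares function $\Phi : U \times \mathbb{R}^J \to \mathbb{R}$ given by (6.9) and the probability measure $\mu_0$ on the measure space $(U, \Sigma)$ satisfy Assumptions 6.2.2. Then $\mu^y \ll \mu_0$ with Radon-Nikodym derivative
\[
\frac{d\mu^y}{d\mu_0} = \frac{1}{Z} \exp\bigl(-\Phi(u;y)\bigr) \tag{6.10}
\]
where, for $y$ almost surely,
\[
Z := \int_U \exp\bigl(-\Phi(u;y)\bigr)\,\mu_0(du) > 0.
\]
Furthermore, $\mu^y$ is locally Lipschitz with respect to $y$ in the Hellinger distance: for all $y, y'$ with $\max\{|y|_\Gamma, |y'|_\Gamma\} < r$, there exists a $C = C(r) > 0$ such that
\[
d_{\mathrm{Hell}}(\mu^y, \mu^{y'}) \le C\,|y - y'|_\Gamma.
\]
This implies that, for all $f \in L^2_{\mu_0}(U;S)$ for separable Banach space $S$,
\[
\bigl\| \mathbb{E}^{\mu^y} f(u) - \mathbb{E}^{\mu^{y'}} f(u) \bigr\|_S \le C\,|y - y'|. \tag{6.11}
\]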
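The statement can be explored numerically on a toy problem. The sketch below is an illustration only, not the setting of the theorem: it assumes a scalar unknown with prior $\mu_0 = N(0,1)$, a single observation with $\Gamma = 1$, and a hypothetical discontinuous forward map `forward` that returns $+1$ where $u > 0$ and $-1$ otherwise, so the discontinuity set $\{u = 0\}$ has prior probability zero. The constant $Z$ and the Hellinger distance between $\mu^y$ and $\mu^{y'}$ are estimated by Monte Carlo, using the prior as the reference measure.

```python
import math
import random

def forward(u):
    """Hypothetical discontinuous forward map: +1 where u > 0, -1 otherwise."""
    return 1.0 if u > 0 else -1.0

def potential(u, y, gamma=1.0):
    """Phi(u; y) = 0.5 * |y - forward(u)|^2 / gamma^2 for one scalar observation."""
    return 0.5 * ((y - forward(u)) / gamma) ** 2

def normalisation(samples, y):
    """Monte Carlo estimate of Z = E_prior[exp(-Phi(u; y))]."""
    return sum(math.exp(-potential(u, y)) for u in samples) / len(samples)

def hellinger_posteriors(samples, y, y_prime):
    """Estimate d_Hell(mu^y, mu^{y'}) with the prior as reference measure."""
    z = normalisation(samples, y)
    z_prime = normalisation(samples, y_prime)
    total = 0.0
    for u in samples:
        root = math.sqrt(math.exp(-potential(u, y)) / z)
        root_prime = math.sqrt(math.exp(-potential(u, y_prime)) / z_prime)
        total += (root - root_prime) ** 2
    return math.sqrt(0.5 * total / len(samples))

random.seed(0)
prior_samples = [random.gauss(0.0, 1.0) for _ in range(20000)]

y = 0.5
z_hat = normalisation(prior_samples, y)
# For this toy map, Z is available in closed form because P(u > 0) = 1/2.
z_exact = 0.5 * (math.exp(-0.5 * (y - 1.0) ** 2) + math.exp(-0.5 * (y + 1.0) ** 2))

d_near = hellinger_posteriors(prior_samples, y, y + 0.1)
d_far = hellinger_posteriors(prior_samples, y, y + 0.2)
```

In this toy setting the estimated distance shrinks as $y'$ approaches $y$, and the ratio of distance to $|y - y'|$ stays bounded, which is the behaviour the local Lipschitz estimate describes.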
Remarks 6.2.4.
• The interpretation of this result is very natural, linking the Bayesian picture with least squares minimisation: the posterior measure is large on sets where the least squares function is small, and vice versa, all measured relative to the prior $\mu_0$.
• The key technical advance in this theorem over the existing theories overviewed in [61] is that $\Phi(\cdot;y)$ is only continuous $\mu_0$-almost surely; existing theories typically use that $\Phi(\cdot;y)$ is continuous everywhere on $U$ and that $\mu_0(U) = 1$. These existing theories cannot be used in the level set inverse problem, because of discontinuities in the level set map. Once the technical Lemma 6.6.1, which uses $\mu_0$-almost sure continuity to establish measurability, has been established, the proof of the theorem is a straightforward application of existing theory; we therefore defer it to Appendix 1.
• Stability estimates about the distance of level sets can be obtained by choosing $f$ carefully in (6.11). Indeed, consider $f : U \to L^1(D)$ given by
\[
f(u)(x) := 1_{D_i}(x) \tag{6.12}
\]
where $D_i$ is defined in terms of $u$ as in (6.3). Obviously $f \in L^2_{\mu_0}(U; L^1(D))$ since the indicator function is uniformly bounded. Then one can read from (6.11) that the $L^1$-norm of the mean indicator function of the set $D_i$ under the posterior measure is Lipschitz continuous with respect to the data. Note that this does not give exactly the symmetric difference of the two mean level sets, since the indicator functions are averaged first. However, it does reflect stability of geometric reconstructions in an averaged sense.
• What needs to be done to apply this theorem in our level set context is to identify the sets of discontinuity for the map $G$, and hence $\Phi(\cdot;y)$, and then construct prior measures $\mu_0$ for which these sets have measure zero. We study these questions in general terms in the next two subsections, and then, in the next section, demonstrate two test model PDE inverse problems where the general theory applies.
• The consequences of this result are wide-ranging, and we name the two primary ones: firstly, we may apply the mesh-independent MCMC methods overviewed in [57] to sample the posterior distribution efficiently; and secondly, the well-posedness gives desirable robustness which may be used to estimate the effect of other perturbations, such as approximating $G$ by a numerical method, on the posterior distribution [61].
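To make the remarks on the mean indicator function (6.12) and on posterior sampling concrete, the following sketch estimates the posterior mean of an indicator function with a preconditioned Crank-Nicolson (pCN) sampler. Everything in it is an assumption chosen for illustration: the source does not say which MCMC method is used, the prior is taken to be Gaussian through a short cosine expansion with hypothetical decay `SCALES`, the set is taken to be $\{x : u(x) > 0\}$ on $D = (0,1)$ rather than the $D_i$ of (6.3), and the single observation is the fraction of $D$ covered by that set.

```python
import math
import random

GRID = [(i + 0.5) / 32 for i in range(32)]            # cell midpoints of D = (0, 1)
MODES = 6
SCALES = [1.0 / (1 + k) ** 2 for k in range(MODES)]   # hypothetical prior decay

def level_set_function(xi):
    """u(x) = sum_k SCALES[k] * xi[k] * cos(k * pi * x), with xi_k ~ N(0, 1) a priori."""
    return [sum(s * c * math.cos(k * math.pi * x)
                for k, (s, c) in enumerate(zip(SCALES, xi)))
            for x in GRID]

def indicator(xi):
    """Indicator of the set {x : u(x) > 0}, evaluated on the grid."""
    return [1.0 if v > 0 else 0.0 for v in level_set_function(xi)]

def potential(xi, y, gamma):
    """Least squares potential for one observation: the covered fraction of D."""
    area = sum(indicator(xi)) / len(GRID)
    return 0.5 * ((y - area) / gamma) ** 2

def pcn_mean_indicator(y, gamma=0.1, beta=0.3, steps=5000, burn_in=1000, seed=0):
    """Posterior mean of the indicator function, estimated with a pCN chain."""
    rng = random.Random(seed)
    xi = [1.0] + [0.0] * (MODES - 1)   # deterministic start: u = 1 on all of D
    phi = potential(xi, y, gamma)
    mean = [0.0] * len(GRID)
    kept = 0
    for step in range(steps):
        # pCN proposal: leaves the N(0, I) prior on the coefficients invariant.
        proposal = [math.sqrt(1.0 - beta ** 2) * c + beta * rng.gauss(0.0, 1.0)
                    for c in xi]
        phi_prop = potential(proposal, y, gamma)
        # Acceptance probability min(1, exp(Phi(current) - Phi(proposal))).
        if rng.random() < math.exp(min(0.0, phi - phi_prop)):
            xi, phi = proposal, phi_prop
        if step >= burn_in:
            kept += 1
            mean = [m + f for m, f in zip(mean, indicator(xi))]
    return [m / kept for m in mean]

mean_indicator = pcn_mean_indicator(y=0.8)
```

The returned list approximates $x \mapsto \mathbb{E}^{\mu^y} 1_{\{u > 0\}}(x)$ on the grid. Its values lie in $[0, 1]$ rather than $\{0, 1\}$, which reflects the remark that averaging indicator functions gives stability in an averaged sense instead of a single reconstructed set.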