Well-Posed Bayesian Inverse Problem - Bayesian Level Set Inversion

Chapter 6 A Bayesian Level Set Method for Geometric Inverse Problems

6.2 Bayesian Level Set Inversion

6.2.3 Well-Posed Bayesian Inverse Problem

We now formulate the Bayesian approach to finding $u$ from $y$ given by (6.5). All quantities are treated as random variables, and we seek the posterior probability distribution on $u$ given $y$, given a prior probability distribution on $u$ and an independent probabilistic specification of the noise $\eta$. Let $U$ denote a separable Banach space and define a complete probability space $(U, \Sigma, \mu_0)$ for the unknown $u$. Here $\Sigma$ and $\mu_0$ are the sigma algebra and the prior probability measure, respectively. (In our applications $U$ will be the space $C(D; \mathbb{R})$, but we state our main theorem in more generality.) Assume that the noise $\eta$ is a random draw from the centred Gaussian $Q_0 := N(0, \Gamma)$. Allowing for non-Gaussian $\eta$ is also possible, as is dependence between $\eta$ and $u$; however, we elect to keep the presentation simple. We may now define the joint random variable $(u, y) \in U \times \mathbb{R}^J$. The posterior probability distribution $\mu^y$ on the random variable $u|y$ describes our probabilistic knowledge about $u$ on the basis of the measurements $y$ given by (6.5) and the prior information $\mu_0$ on $u$. In the case where the map $\mathcal{G}$ is continuous, one can apply an infinite dimensional version of Bayes' theorem [215] to show that the posterior $\mu^y$ exists and has a density with respect to the prior of the form

$$\frac{d\mu^y}{d\mu_0}(u) = \frac{1}{Z} \exp(-\Phi(u; y)),$$

where $Z$ is the normalisation constant. To extend the theory to allow discontinuous $\mathcal{G}$, we now state a set of assumptions on the potential $\Phi$ under which the posterior distribution is well-defined via its density with respect to the prior distribution and is Lipschitz in the Hellinger metric with respect to the data $y$. These assumptions will be verified for the level set formulation of interest to us.
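As an illustration of the density formula above, the following sketch approximates posterior probabilities by reweighting prior samples with $\exp(-\Phi(u; y))$. The scalar Gaussian prior, the discontinuous map `g`, and the values of `y` and `gamma` are hypothetical choices made for this example only; they are not taken from the text.

```python
import numpy as np

def posterior_weights(phi_values):
    """Self-normalised weights w_i proportional to exp(-Phi(u_i; y)) for prior draws u_i."""
    phi_values = np.asarray(phi_values, dtype=float)
    # Subtracting the minimum avoids underflow; the shift cancels on normalisation.
    unnormalised = np.exp(-(phi_values - phi_values.min()))
    return unnormalised / unnormalised.sum()

def estimate_Z(phi_values):
    """Monte Carlo estimate of Z = E_prior[exp(-Phi(u; y))]."""
    return float(np.mean(np.exp(-np.asarray(phi_values, dtype=float))))

def g(u):
    """Hypothetical discontinuous forward map: +1 where u > 0, -1 elsewhere."""
    return np.where(u > 0, 1.0, -1.0)

# Toy setting (assumed for illustration): u ~ N(0, 1), one observation y, noise variance gamma.
rng = np.random.default_rng(0)
u = rng.standard_normal(10_000)
y, gamma = 0.8, 0.25

phi = 0.5 * (y - g(u)) ** 2 / gamma      # least squares potential for scalar data
w = posterior_weights(phi)
prob_positive = float(np.sum(w[u > 0]))  # posterior probability of the set {u > 0}
Z_hat = estimate_Z(phi)
```

The map `g` is discontinuous only at $u = 0$, a set of prior measure zero, so the potential is continuous almost surely with respect to the prior, which is the situation the assumptions below are designed to cover.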

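Assumptions 6.2.2. The function $\Phi : U \times \mathbb{R}^J \to \mathbb{R}$ and the probability measure $\mu_0$ on the measure space $(U, \Sigma)$ satisfy the following properties:

1. for every $r > 0$ there is a $K = K(r)$ such that, for all $u \in U$ and all $y \in \mathbb{R}^J$ with $|y|_\Gamma < r$,

$$0 \le \Phi(u; y) \le K;$$

2. for any fixed $y \in \mathbb{R}^J$, $\Phi(\cdot\,; y) : U \to \mathbb{R}$ is continuous $\mu_0$-almost surely on the complete probability space $(U, \Sigma, \mu_0)$;

3. for $y_1, y_2 \in \mathbb{R}^J$ with $\max\{|y_1|_\Gamma, |y_2|_\Gamma\} < r$, there exists a $C = C(r)$ such that, for all $u \in U$,

$$|\Phi(u; y_1) - \Phi(u; y_2)| \le C |y_1 - y_2|_\Gamma.$$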

For our Bayesian level set inverse problem with finite observations and noise $\eta \sim Q_0$, the function $\Phi : U \times \mathbb{R}^J \to \mathbb{R}^+$ has the least squares form

$$\Phi(u; y) = \frac{1}{2} |y - \mathcal{G}(u)|_\Gamma^2 \tag{6.9}$$

with $|\cdot|_\Gamma := |\Gamma^{-1/2} \cdot|$ and $\mathcal{G} = \mathcal{O} \circ G \circ F$. Clearly $\Phi$ defined in (6.9) satisfies the first and the last items of Assumptions 6.2.2. We will show in the next section that, for some model problems, the second item of Assumptions 6.2.2 is also fulfilled by $\Phi$ in (6.9).
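A minimal sketch of the least squares potential (6.9) follows, together with a numerical check of the Lipschitz property in the data. The check assumes a bound $|\mathcal{G}(u)|_\Gamma \le M$ (an assumption introduced here for illustration), in which case expanding the squares gives $|\Phi(u; y_1) - \Phi(u; y_2)| \le (r + M)\,|y_1 - y_2|_\Gamma$. The function names, the covariance `gamma`, and the vector `g_u` standing in for $\mathcal{G}(u)$ are hypothetical.

```python
import numpy as np

def potential(y, g_u, gamma):
    """Phi(u; y) = 0.5 * |Gamma^{-1/2} (y - G(u))|^2, with g_u standing in for G(u)."""
    chol = np.linalg.cholesky(gamma)           # Gamma = L L^T, L lower triangular
    whitened = np.linalg.solve(chol, y - g_u)  # L^{-1} (y - G(u)); same norm as Gamma^{-1/2}(...)
    return 0.5 * float(whitened @ whitened)

def gamma_norm(v, gamma):
    """Weighted norm |v|_Gamma = |Gamma^{-1/2} v|."""
    return float(np.sqrt(2.0 * potential(v, np.zeros_like(v), gamma)))

# Hypothetical data: a 2 x 2 noise covariance and one value of the forward map.
rng = np.random.default_rng(1)
gamma = np.array([[0.5, 0.1], [0.1, 0.3]])
g_u = np.array([1.0, -2.0])
y1, y2 = rng.standard_normal(2), rng.standard_normal(2)

r = max(gamma_norm(y1, gamma), gamma_norm(y2, gamma))
m = gamma_norm(g_u, gamma)  # plays the role of the assumed bound M
lhs = abs(potential(y1, g_u, gamma) - potential(y2, g_u, gamma))
rhs = (r + m) * gamma_norm(y1 - y2, gamma)
```

Under the same boundedness assumption, $0 \le \Phi(u; y) \le \frac{1}{2}(r + M)^2$, which is the form of bound required by the first item.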

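Recall that the Hellinger distance between $\mu$ and $\mu'$ is defined as

$$d_{\mathrm{Hell}}(\mu, \mu') = \left( \frac{1}{2} \int_U \left( \sqrt{\frac{d\mu}{d\nu}} - \sqrt{\frac{d\mu'}{d\nu}} \right)^2 d\nu \right)^{1/2}$$

for any measure $\nu$ with respect to which $\mu$ and $\mu'$ are absolutely continuous. The value of the Hellinger distance does not depend on which reference measure $\nu$ is chosen. This is the metric in which Theorem 6.2.3 measures the dependence of the posterior on the data.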
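For measures on a finite set the integral in the definition becomes a sum, which makes the independence from the reference measure easy to check numerically. The sketch below is an illustration only; the probability vectors `p` and `q` are arbitrary choices, not taken from the text.

```python
import numpy as np

def hellinger(p, q, nu):
    """Hellinger distance between measures with masses p and q on a finite set,
    computed through their densities with respect to a reference measure nu > 0."""
    p, q, nu = (np.asarray(a, dtype=float) for a in (p, q, nu))
    dp, dq = p / nu, q / nu  # densities with respect to nu
    return float(np.sqrt(0.5 * np.sum((np.sqrt(dp) - np.sqrt(dq)) ** 2 * nu)))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.1, 0.6, 0.3])

d_counting = hellinger(p, q, np.ones(3))     # reference: counting measure
d_mixture = hellinger(p, q, 0.5 * (p + q))   # reference: the mixture (p + q) / 2
d_disjoint = hellinger([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
```

Both reference measures give the same value, and with the factor $\frac{1}{2}$ in the definition the distance lies in $[0, 1]$, reaching 1 for mutually singular measures such as the last pair.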

Theorem 6.2.3. Assume that the least squares function $\Phi : U \times \mathbb{R}^J \to \mathbb{R}$ given by (6.9) and the probability measure $\mu_0$ on the measure space $(U, \Sigma)$ satisfy Assumptions 6.2.2. Then $\mu^y \ll \mu_0$ with Radon-Nikodym derivative

$$\frac{d\mu^y}{d\mu_0} = \frac{1}{Z} \exp(-\Phi(u; y)) \tag{6.10}$$

where, for $y$ almost surely,

$$Z := \int_U \exp(-\Phi(u; y)) \, \mu_0(du) > 0.$$

Furthermore, $\mu^y$ is locally Lipschitz with respect to $y$ in the Hellinger distance: for all $y, y'$ with $\max\{|y|_\Gamma, |y'|_\Gamma\} < r$, there exists a $C = C(r) > 0$ such that

$$d_{\mathrm{Hell}}(\mu^y, \mu^{y'}) \le C |y - y'|_\Gamma.$$

This implies that, for all $f \in L^2_{\mu_0}(U; S)$ for a separable Banach space $S$,

$$\| \mathbb{E}^{\mu^y} f(u) - \mathbb{E}^{\mu^{y'}} f(u) \|_S \le C |y - y'|. \tag{6.11}$$
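A standard route from the Hellinger bound to (6.11) is sketched below. The text defers the proof to Appendix 1, so this is an outline under the stated assumptions and not necessarily the author's argument. Write $\rho$ and $\rho'$ for the densities of the two posteriors (at the two data values) with respect to a common reference measure $\nu$; this notation is introduced here for the sketch. The Cauchy-Schwarz inequality and $(a + b)^2 \le 2(a^2 + b^2)$ give

$$
\begin{aligned}
\Big\| \int_U f \, (\rho - \rho') \, d\nu \Big\|_S
&\le \int_U \|f\|_S \, |\sqrt{\rho} - \sqrt{\rho'}| \, (\sqrt{\rho} + \sqrt{\rho'}) \, d\nu \\
&\le \left( \int_U (\sqrt{\rho} - \sqrt{\rho'})^2 \, d\nu \right)^{1/2} \left( \int_U \|f\|_S^2 \, (\sqrt{\rho} + \sqrt{\rho'})^2 \, d\nu \right)^{1/2} \\
&\le 2 \left( \int_U \|f\|_S^2 \, \rho \, d\nu + \int_U \|f\|_S^2 \, \rho' \, d\nu \right)^{1/2} \left( \frac{1}{2} \int_U (\sqrt{\rho} - \sqrt{\rho'})^2 \, d\nu \right)^{1/2}.
\end{aligned}
$$

The left-hand side is the difference of the two posterior expectations of $f$, and the last factor is the Hellinger distance between the two posteriors. Since $\Phi \ge 0$, the posterior density with respect to $\mu_0$ is at most $1/Z$, and the first item of Assumptions 6.2.2 gives $Z \ge e^{-K(r)}$. Each posterior second moment of $\|f\|_S$ is therefore at most $e^{K(r)} \int_U \|f\|_S^2 \, d\mu_0$, which is finite for $f \in L^2_{\mu_0}(U; S)$. Combining this with the Hellinger bound of the theorem yields an estimate of the form (6.11), up to the equivalence of norms on $\mathbb{R}^J$.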

Remarks 6.2.4.

• The interpretation of this result is very natural, linking the Bayesian picture with least squares minimisation: the posterior measure is large on sets where the least squares function is small, and vice versa, all measured relative to the prior $\mu_0$.

• The key technical advance in this theorem over the existing theories overviewed in [61] is that $\Phi(\cdot\,; y)$ is only continuous $\mu_0$-almost surely; existing theories typically use that $\Phi(\cdot\,; y)$ is continuous everywhere on $U$ and that $\mu_0(U) = 1$. These existing theories cannot be used in the level set inverse problem because of discontinuities in the level set map. Once the technical Lemma 6.6.1 has been established, which uses $\mu_0$-almost sure continuity to establish measurability, the proof of the theorem is a straightforward application of existing theory; we therefore defer it to Appendix 1.

• Stability estimates about the distance of level sets can be obtained by choosing $f$ carefully in (6.11). Indeed, consider $f : U \to L^1(D)$ given by

$$f(u)(x) := \mathbf{1}_{D_i}(x) \tag{6.12}$$

where $D_i$ is defined in terms of $u$ as in (6.3). Obviously $f \in L^2_{\mu_0}(U; L^1(D))$, since the indicator function is uniformly bounded. One can then read from (6.11) that the mean indicator function of the set $D_i$ under the posterior measure is Lipschitz continuous, in the $L^1$-norm, with respect to the data. Note that this does not give exactly the symmetric difference of the two mean level sets, since the indicator functions are averaged first. However, it does reflect stability of geometric reconstructions in an averaged sense.

• What needs to be done to apply this theorem in our level set context is to identify the sets of discontinuity for the map $\mathcal{G}$, and hence $\Phi(\cdot\,; y)$, and then construct prior measures $\mu_0$ for which these sets have measure zero. We study these questions in general terms in the next two subsections and then, in the next section, demonstrate two test model PDE inverse problems where the general theory applies.

• The consequences of this result are wide-ranging, and we name the two primary ones. Firstly, we may apply the mesh-independent MCMC methods overviewed in [57] to sample the posterior distribution efficiently. Secondly, the well-posedness gives desirable robustness, which may be used to estimate the effect of other perturbations, such as approximating $G$ by a numerical method, on the posterior distribution [61].
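To make the sampling remark concrete, the sketch below implements the preconditioned Crank-Nicolson (pCN) algorithm, one widely used sampler whose acceptance rule involves only $\Phi$. The text does not say which of the methods in [57] is intended, so the choice of pCN is an assumption. The toy forward map `level_set_fraction`, the standard Gaussian prior on a 20-point grid, and the values of `y`, `gamma` and `beta` are likewise hypothetical.

```python
import numpy as np

def pcn(phi, sample_prior, n_steps, beta, rng):
    """pCN sampler for a posterior with density proportional to exp(-phi(u))
    relative to a centred Gaussian prior that sample_prior(rng) draws from."""
    u = sample_prior(rng)
    phi_u = phi(u)
    chain, accepted = [], 0
    for _ in range(n_steps):
        # The proposal leaves the Gaussian prior invariant.
        v = np.sqrt(1.0 - beta**2) * u + beta * sample_prior(rng)
        phi_v = phi(v)
        # Accept with probability min(1, exp(phi(u) - phi(v))).
        if np.log(rng.uniform()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        chain.append(u.copy())
    return np.array(chain), accepted / n_steps

def level_set_fraction(u):
    """Toy discontinuous forward map: fraction of grid points where u > 0."""
    return float(np.mean(u > 0))

# Hypothetical setting: 20 grid values, scalar observation y, noise variance gamma.
rng = np.random.default_rng(2)
d, y, gamma = 20, 0.7, 0.01
phi = lambda u: 0.5 * (y - level_set_fraction(u)) ** 2 / gamma
sample_prior = lambda rng: rng.standard_normal(d)

chain, acceptance_rate = pcn(phi, sample_prior, n_steps=5000, beta=0.5, rng=rng)
posterior_mean_fraction = float(np.mean(chain > 0))
```

Averaging the indicator `chain > 0` over the samples is a discrete analogue of the mean indicator function in (6.12). Under the prior the expected fraction is 0.5, and the data pull it towards the observed value.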

In document Asymptotic analysis and computations of probability measures (Page 173-176)