The structure of optimal parameters for image restoration problems


Contents lists available at ScienceDirect

Journal of Mathematical Analysis and Applications

www.elsevier.com/locate/jmaa


J.C. De Los Reyes^a, C.-B. Schönlieb^b, T. Valkonen^{a,b,*}

^a Research Center on Mathematical Modelling (MODEMAT), Escuela Politécnica Nacional, Ladrón de Guevara E11-253, Quito, Ecuador

^b Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Road, CB3 0WA, Cambridge, United Kingdom

Article info

Article history: Received 8 May 2015. Available online 16 September 2015. Submitted by H. Frankowska.

Keywords: Total variation; Total generalised variation; Bi-level optimisation; Optimality; Parameter choice

Abstract

We study the qualitative properties of optimal regularisation parameters in variational models for image restoration. The parameters are solutions of bilevel optimisation problems with the image restoration problem as constraint. A general type of regulariser is considered, which encompasses total variation (TV), total generalised variation (TGV) and infimal-convolution total variation (ICTV). We prove that under certain conditions on the given data optimal parameters derived by bilevel optimisation problems exist. A crucial point in the existence proof turns out to be the boundedness of the optimal parameters away from 0, which we prove in this paper. The analysis is done on the original (in image restoration typically non-smooth) variational problem as well as on a smoothed approximation set in Hilbert space, which is the one considered in numerical computations. For the smoothed bilevel problem we also prove that it Γ-converges to the original problem as the smoothing vanishes. All analysis is done in function spaces rather than on the discretised learning problem.

© 2015 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

In this paper we consider the general variational image reconstruction problem that, given parameters $\alpha = (\alpha_1, \ldots, \alpha_N)$, $N \ge 1$, aims to compute an image

$$u_\alpha \in \arg\min_{u \in X} J(u; \alpha).$$

The image depends on $\alpha$ and belongs in our setting to a generic function space $X$. Here $J$ is a generic energy modelling our prior knowledge on the image. The quality of the solution of variational imaging

* Corresponding author.

E-mail addresses:[email protected](J.C. De Los Reyes), [email protected](C.-B. Schönlieb), tuomo.valkonen@iki.fi(T. Valkonen).

http://dx.doi.org/10.1016/j.jmaa.2015.09.023

0022-247X/© 2015 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


approaches like this one crucially relies on a good choice of the parameters $\alpha$. We are particularly interested in the case

$$J(u; \alpha) = \Phi(Ku) + \sum_{j=1}^N \alpha_j \|A_j u\|_j,$$

with $K$ a generic bounded forward operator, $\Phi$ a fidelity function, and $A_j$ linear operators acting on $u$. The values $A_j u$ are penalised in the total variation or Radon norm $\|\mu\|_j = \|\mu\|_{\mathcal{M}(\Omega; \mathbb{R}^{m_j})}$, and combined constitute the image regulariser. In this context, $\alpha$ represents the regularisation parameter that balances the strength of regularisation against the fitness $\Phi$ of the solution to the idealised forward model $K$. The size of this parameter depends on the level of random noise and the properties of the forward operator. Choosing it too large results in over-regularisation of the solution and in turn may cause the loss of potentially important details in the image; choosing it too small under-regularises the solution and may result in a noisy and unstable output. In this work we will discuss and thoroughly analyse a bilevel optimisation approach that is able to determine the optimal choice of $\alpha$ in $J(\cdot; \alpha)$.

Recently, bilevel approaches for variational models have gained increasing attention in image processing and inverse problems in general. Based on prior knowledge of the problem in terms of a training set of image data and corresponding model solutions, or knowledge of other model determinants such as the noise level, optimal reconstruction models are conceived by minimising a cost functional, called $F$ in the sequel, constrained to the variational model in question. We will explain this approach in more detail in the next section. Before that, let us give an account of the state of the art of bilevel optimisation for model learning. In machine learning bilevel optimisation is well established. It is a supervised learning method that optimally adapts itself to a given dataset of measurements and desirable solutions. In [41,42,28,29,18,19], for instance, the authors consider bilevel optimisation for finite-dimensional Markov random field (MRF) models. In inverse problems the optimal inversion and experimental acquisition setup is discussed in the context of optimal model design in works by Haber, Horesh and Tenorio [34,32,33], as well as Ghattas et al. [13,8]. Recently parameter learning in the context of functional variational regularisation models also entered the image processing community with works by the authors [23,14], Kunisch, Pock and co-workers [37,17] and Chung et al. [20]. A very interesting contribution can be found in a preprint by Fehrenbach et al. [31], where the authors determine an optimal regularisation procedure introducing particular knowledge of the noise distribution into the learning approach.

Apart from the work of the authors [23,14], all approaches for bilevel learning in image processing so far are formulated and optimised in the discrete setting. Our subsequent modelling, analysis and optimisation will be carried out in function space rather than on a discretisation of the variational model. In this context, a careful analysis of the bilevel problem is of great relevance for its application in image processing. The structure of optimal regularisers is important, among others, for the development of solution algorithms. In particular, if the parameters are bounded and lie in the interior of a closed connected set, then efficient optimisation methods can be used for solving the problem. Previous results on optimal parameters for inverse problems with partial differential equations have been obtained in, e.g., [16].

In this paper we study the qualitative structure of regularisation parameters arising as solutions from the bilevel optimisation of variational models. In our framework the variational models are typically convex but non-smooth and posed in Banach spaces. The total variation and total generalised variation regularisation models are particular instances. Alongside the optimisation of the non-smooth variational model, we also consider a smoothed approximation in Hilbert space which is typically the one considered in numerical computation. Under suitable conditions, we prove that, for both the original non-smooth optimisation problem and the regularised Hilbert space problem, the optimal regularisers are bounded and lie in the interior of the positive orthant. The conditions necessary to prove this turn out to be very natural conditions on the given data in the case of an $L^2$ cost functional $F$. Indeed, for the total variation regularisers with $L^2$-squared cost and fidelity, we will merely require

$$\mathrm{TV}(f) > \mathrm{TV}(f_0),$$

with $f_0$ the ground-truth and $f$ the noisy image. That is, the noisy image should oscillate more, in terms of the total variation functional, than the ground-truth. For second-order total generalised variation [11], we obtain an analogous condition. Apart from the standard $L^2$ costs, we also discuss costs that constitute a smoothed $L^1$ norm of the gradient of the original data (we will call this the Huberised total variation cost in the sequel), typically resulting in optimal solutions superior to the ones minimising an $L^2$ cost. For this case, however, the interior property of optimal parameters could be verified for a finite-dimensional version of the cost only. Eventually, we also show that as the numerical smoothing vanishes the optimal parameters for the smoothed models tend to optimal parameters of the original model.

The results derived in this paper are motivated by problems in image processing. However, their applicability goes well beyond that: they can be generally applied to parameter estimation problems of variational inequalities of the second kind, for instance the parameter estimation problem in Bingham flow [22]. Previous analysis in this context either required box constraints on the parameters in order to prove existence of solutions, or the addition of a parameter penalty to the cost functional [3,6,7,23]. In this paper, we require neither, but rather prove that under certain conditions on the variational model and for reasonable data and cost functional, optimal parameters are indeed positive and guaranteed to be bounded away from $0$ and $\infty$. As we will see later, this is enough for proving existence of solutions and continuity of the solution map. The next step from our work here is deriving numerically useful characterisations of solutions to the ensuing bi-level programs. For the most basic problems considered herein this has been done in [23] under numerical $H^1$ regularisation. We will consider in the follow-up work [24] the optimality conditions for higher-order regularisers and the new cost functionals introduced in this work. For an extensive characterisation of optimality systems for bi-level optimisation in finite dimensions, we point the reader to [27] as a starting point.

Outline of the paper. In Section 2 we introduce the general bilevel learning problem, stating assumptions on the setup of the cost functional $F$ and the lower level problem given by a variational regularisation approach. The bilevel problem is discussed in its original non-smooth form (P) as well as in a smoothed form in a Hilbert space setting $(P^{\gamma,\varepsilon})$ in Section 2.2, which will be the one used in the numerical computations. The bilevel problem is put in context with parameter learning for non-smooth variational regularisation models, typical in image processing, by proving the validity of the assumptions for examples such as TV, TGV and ICTV regularisation. The main results of the paper (existence of positive optimal parameters for $L^2$, Huberised TV and $L^1$ type costs, and the convergence of the smoothed numerical problem to the original non-smooth problem) are stated in Section 3. Auxiliary results, such as coercivity, lower semicontinuity and compactness results for the involved functionals, are the topic of Section 4. Proofs for existence and convergence of optimal parameters are contained in Section 5. The paper finishes with a brief numerical discussion in Section 6.

2. The general problem setup

Let $\Omega \subset \mathbb{R}^m$ be an open bounded domain with Lipschitz boundary. This will be our image domain. Usually $\Omega = (0, w) \times (0, h)$ for $w$ and $h$ the width and height of a two-dimensional image, although no such assumptions are made in this work. Our noisy or corrupted data $f$ is assumed to lie in a Banach space $Y$, which is the dual of $Y_*$, while our ground-truth $f_0$ lies in a general Banach space $Z$. Usually, in


data is just a corrupted version of the original image. Further $d = 1$ for grayscale images, and $d = 3$ for colour images in typical colour spaces. For sub-sampled reconstruction from Fourier samples, we might use a finite-dimensional space $Y = \mathbb{C}^n$; there are however some subtleties with that, and we refer the interested reader to [1].

As our parameter space for regularisation functional weights we take

$$\mathcal{P}_\alpha^\infty := [0, \infty]^N,$$

where $N$ is the dimension of the parameter space, that is $\alpha = (\alpha_1, \ldots, \alpha_N)$, $N \ge 1$. Observe that we allow infinite and zero values for $\alpha$. The reason for the former is that in the case of $\mathrm{TGV}^2$, it is not reasonable to expect that both $\alpha_1$ and $\alpha_2$ are bounded; such conditions would imply that $\mathrm{TGV}^2$ performs better than both $\mathrm{TV}^2$ and $\mathrm{TV}$. But we want our learning algorithm to find out whether that is the case! Regarding zero values, one of our main tasks is proving that for reasonable data, optimal parameters in fact lie in the interior

$$\operatorname{int} \mathcal{P}_\alpha^\infty = (0, \infty]^N.$$

This is required for the existence of solutions and the continuity of the solution map parametrised by additional regularisation parameters needed for the numerical realisation of the model. We also set

$$\mathcal{P}_\alpha := [0, \infty)^N$$

for some occasions when we need a bounded parameter.

Remark 1. In much of our treatment, we could allow for spatially dependent parameters $\alpha$. However, the parameters would need to lie in a finite-dimensional subspace of $C_0(\Omega; \mathbb{R}^N)$ in our theory. Minding our general definition of the functional $J^{\gamma,0}(\cdot; \alpha)$ below, no generality is lost by taking $\alpha$ to be vectors in $\mathbb{R}^N$. We could simply replace the sum in the functional by a larger sum modelling integration over parameters with values in a finite-dimensional subspace of $C_0(\Omega; \mathbb{R}^N)$.

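In our general learning problem, we look for $\alpha = (\alpha_1, \ldots, \alpha_N)$ solving, for some convex, proper, weak* lower semicontinuous cost functional $F : X \to \mathbb{R}$, the problem

$$\min_{\alpha \in \mathcal{P}_\alpha^\infty} F(u_\alpha) \tag{P}$$

subject to

$$u_\alpha \in \arg\min_{u \in X} J(u; \alpha), \tag{D$^\alpha$}$$

with

$$J(u; \alpha) := \Phi(Ku) + \sum_{j=1}^N \alpha_j \|A_j u\|_j.$$

Here we denote for short the total variation norm

$$\|\mu\|_j := \|\mu\|_{\mathcal{M}(\Omega; \mathbb{R}^{m_j})}, \quad (\mu \in \mathcal{M}(\Omega; \mathbb{R}^{m_j})).$$

The following covers our assumptions with regard to $A$, $K$, and $\Phi$. We discuss various specific examples in Section 2.1 immediately after stating the assumptions.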


Assumption A-KA (Operators $A$ and $K$). We assume that $Y$ is a Banach space, and $X$ a normed linear space, both themselves duals of $Y_*$ and $X_*$, respectively. We then assume that the linear operators

$$A_j : X \to \mathcal{M}(\Omega; \mathbb{R}^{m_j}), \quad (j = 1, \ldots, N),$$

and

$$K : X \to Y,$$

are bounded. Regarding $K$, we also assume the existence of a bounded right-inverse $K^\dagger : \mathcal{R}(K) \to X$, where $\mathcal{R}(K)$ denotes the range of the operator $K$. That is, $K K^\dagger = I$ on $\mathcal{R}(K)$. We further assume that

$$\|u\|'_X := \sum_{j=1}^N \|A_j u\|_j + \|Ku\|_Y \tag{1}$$

is a norm on $X$, equivalent to the standard norm. In particular, by the Banach–Alaoglu theorem, any sequence $\{u^i\}_{i=1}^\infty \subset X$ with $\sup_i \|u^i\|_X < \infty$ possesses a weakly* convergent subsequence.

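Assumption A-Φ (The fidelity Φ). We suppose $\Phi : Y \to (-\infty, \infty]$ is convex, proper, weakly* lower semicontinuous, and coercive in the sense that

$$\Phi(v) \to +\infty \quad \text{as} \quad \|v\|_Y \to +\infty. \tag{2}$$

We assume that $0 \in \operatorname{dom} \Phi$, and the existence of $f \in \arg\min_{v \in Y} \Phi(v)$ such that $f = K\bar f$ for some $\bar f \in X$. Finally, we require that either $K$ is compact, or $\Phi$ is continuous and strongly convex.

Remark 2. When $\bar f$ exists, we can choose $\bar f = K^\dagger f$.

Remark 3. Instead of $0 \in \operatorname{dom} \Phi$, it would suffice to assume, more generally, that $\operatorname{dom} \Phi \cap \bigcap_{j=1}^N \ker A_j \ne \emptyset$.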

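We also require the following technical assumption on the relationship of the regularisation terms and the fidelity $\Phi$. Roughly speaking, in most interesting cases, it says that for each $\ell$, we can closely approximate the noisy data $f$ with functions $\bar f_{\delta,\ell}$ of "order $\ell$". The order here is not directly in terms of derivatives, but as defined by the operators $\{A_j\}$ through the conditions (3): $A_j \bar f_{\delta,\ell} = 0$ for $j \ne \ell$, so there is no information at other orders, while $A_\ell \bar f_{\delta,\ell}$ is bounded, so the information at order $\ell$ is bounded.

Assumption A-δ (Order reduction). We assume that for every $\ell \in \{1, \ldots, N\}$ and $\delta > 0$, there exists $\bar f_{\delta,\ell} \in X$ such that

$$\Phi(K \bar f_{\delta,\ell}) < \delta + \inf \Phi, \tag{3a}$$

$$\|A_\ell \bar f_{\delta,\ell}\|_\ell < \infty, \quad \text{and} \tag{3b}$$

$$\sum_{j \ne \ell} \|A_j \bar f_{\delta,\ell}\|_j = 0. \tag{3c}$$

Example 1 (Orders related to $\mathrm{TGV}^2$ and ICTV). For ICTV, as formulated in Example 6, "order 2" means that $f_{\delta,2} \in \mathrm{BV}^2(\Omega)$, while "order 1" means $f_{\delta,1} \in \mathrm{BV}(\Omega)$. For the definition of $\mathrm{BV}^2(\Omega)$, we refer to [26]. For $\mathrm{TGV}^2$, as formulated in Example 5, "order 1" is likewise $\mathrm{BV}(\Omega)$, but "order 2" is the space of functions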


2.1. Specific examples

We now discuss a few examples to motivate the abstract framework above.

Example 2 (Squared $L^2$ fidelity). With $Y = L^2(\Omega)$, $f \in Y$, and

$$\Phi(v) = \frac{1}{2} \|v - f\|_Y^2,$$

we recover the standard $L^2$-squared fidelity, modelling Gaussian noise. On a bounded domain $\Omega$, Assumption A-Φ follows immediately.

With fidelity functionals as in Example 2, the operator $K$ has a double role in $(D^\alpha)$. It models any forward operator $T$, such as a blurring kernel or a sub-sampled Fourier transform, but also acts to extract the interesting components of $u$ in the correct space. In its latter role, the operator is also required within $F$. We therefore introduce the operator $K_0$ for that purpose. Then for denoising we want to take $K = K_0$, but for other image processing tasks, we would combine this with the forward operator $T$, taking $K = T K_0$.

Example 3 (Cost functionals). For the cost functional $F$, given noise-free data $f_0 \in Z = L^2(\Omega)$, we consider in particular the $L^2$ cost

$$F_{L^2_2}(u) := \frac{1}{2} \|f_0 - K_0 u\|^2_{L^2(\Omega)},$$

as well as the Huberised total variation cost

$$F_{L^1_\eta \nabla}(u) := \|D(f_0 - K_0 u)\|_\gamma$$

with noise-free data $f_0 \in Y := \mathrm{BV}(\Omega)$. For the definition of the Huberised total variation, we refer to Section 2.2 on the numerics of the bi-level framework (P). In the numerical counterpart [24] to the present paper, we have observed that the Huberised total variation cost produces improved results compared to the $L^2$ cost when measured with the SSIM [46], while having the advantage of being convex, unlike the SSIM.

Example 4 (Total variation denoising). We take $K_0$ as the embedding of $X = \mathrm{BV}(\Omega) \cap L^2(\Omega)$ into $Z = L^2(\Omega)$, and $A_1 = D$. We equip $X$ with the norm

$$\|u\|_X := \|u\|_{L^2(\Omega)} + \|Du\|_{\mathcal{M}(\Omega; \mathbb{R}^n)}.$$

This makes $K_0$ a bounded linear operator. If the domain $\Omega$ has Lipschitz boundary, and the dimension satisfies $n \in \{1, 2\}$, the space $\mathrm{BV}(\Omega)$ continuously embeds into $L^2(\Omega)$ [2, Corollary 3.49]. Therefore, we may identify $X$ with $\mathrm{BV}(\Omega)$ as a normed space. Otherwise, if $n \ge 3$, without going into the details of constructing $X$ as a dual space,¹ we define weak* convergence in $X$ as combined weak* convergence in $\mathrm{BV}(\Omega)$ and $L^2(\Omega)$. Any bounded sequence in $X$ will then have a weak* convergent subsequence. This is the only property we would use from $X$ being a dual space.

Now, combined with Example 2 and the choice $K = K_0$, $Y = Z$, we get total variation denoising for the sub-problem $(D^\alpha)$. Assumption A-KA holding is immediate from the previous discussion. Assumption A-δ is also easily satisfied, as with $f \in \mathrm{BV}(\Omega) \cap L^2(\Omega)$, we may simply pick $\bar f_{\delta,1} = f$ for the only possible case $\ell = 1$. Observe however that $K$ is not compact unless $n = 1$, see [2, Corollary 3.49], so the strong

1 This can be achieved by allowing φ


convexity of $\Phi$ is crucial here. If the data $f \in L^\infty(\Omega)$, then it is well-known that solutions $\hat u$ to $(D^\alpha)$ satisfy $\|\hat u\|_{L^\infty(\Omega)} \le \|f\|_{L^\infty(\Omega)}$. We could therefore construct a compact embedding by adding some artificial constraints to the data $f$. This changes in the next two examples, as boundedness of solutions for higher-order regularisers is unknown; see also [43].

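Example 5 (Second order total generalised variation denoising). We take

$$X = (\mathrm{BV}(\Omega) \cap L^2(\Omega)) \times \mathrm{BD}(\Omega),$$

the first part with the same topology as in Example 4. We also take $Z = L^2(\Omega)$, denote $u = (v, w)$, and set

$$K_0(v, w) = v, \quad A_1 u = Dv - w, \quad \text{and} \quad A_2 u = Ew$$

for $E$ the symmetrised differential. With $K = K_0$ and $Y = Z$, this yields second-order total generalised variation ($\mathrm{TGV}^2$) denoising [11] for the sub-problem $(D^\alpha)$. Assuming for simplicity that $\alpha_1, \alpha_2 > 0$ are constants, to show Assumption A-KA, we recall from [12] the existence of a constant $c = c(\Omega)$ such that

$$c^{-1} \|v\|'_{\mathrm{BV}(\Omega)} \le \|v\|_{L^1(\Omega)} + \|Dv\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} \le c \|v\|'_{\mathrm{BV}(\Omega)},$$

where the norm

$$\|v\|'_{\mathrm{BV}(\Omega)} := \mathrm{TGV}^2_{(1,1)}(v) + \|v\|_{L^1(\Omega)}.$$

We may thus approximate

$$\begin{aligned}
\|w\|_{L^1(\Omega; \mathbb{R}^n)} &\le \|Dv - w\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} + \|Dv\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} \\
&\le \|Dv - w\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} + c \bigl( \mathrm{TGV}^2_{(1,1)}(v) + \|v\|_{L^1(\Omega)} \bigr) \\
&\le (1 + c) \bigl( \|Dv - w\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} + \|Ew\|_{\mathcal{M}(\Omega; \mathbb{R}^{n \times n})} \bigr) + c \|v\|_{L^1(\Omega)}.
\end{aligned}$$

For some $C > 0$, it follows

$$\begin{aligned}
\|u\|_X &= \|v\|_{L^2(\Omega)} + \|Dv\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} + \|w\|_{L^1(\Omega; \mathbb{R}^n)} + \|Ew\|_{\mathcal{M}(\Omega; \mathbb{R}^{n \times n})} \\
&\le C \bigl( \|Dv - w\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} + \|Ew\|_{\mathcal{M}(\Omega; \mathbb{R}^{n \times n})} + \|v\|_{L^2(\Omega)} \bigr) \\
&= C \Bigl( \sum_{j=1}^N \|A_j u\|_j + \|Ku\|_{L^2(\Omega)} \Bigr).
\end{aligned}$$

This shows $\|u\|_X \le C \|u\|'_X$. The inequality $\|u\|_X \ge \|u\|'_X$ follows easily from the triangle inequality, namely

$$\|u\|_X \ge \|v\|_{L^2(\Omega)} + \|Dv - w\|_{\mathcal{M}(\Omega; \mathbb{R}^n)} + \|Ew\|_{\mathcal{M}(\Omega; \mathbb{R}^{n \times n})}.$$

Thus $\|\cdot\|'_X$ is equivalent to $\|\cdot\|_X$.

Next we observe that clearly $Dv^k - w^k \rightharpoonup^* Dv - w$ in $\mathcal{M}(\Omega; \mathbb{R}^n)$ and $Ew^k \rightharpoonup^* Ew$ in $\mathcal{M}(\Omega; \mathbb{R}^{n \times n})$ if $v^k \rightharpoonup^* v$ weakly* in $\mathrm{BV}(\Omega)$ and $w^k \rightharpoonup^* w$ weakly* in $\mathrm{BD}(\Omega)$. Thus $A_1$ and $A_2$ are weak* continuous. Assumption A-KA follows.

Consider then the satisfaction of Assumption A-δ. If $\ell = 1$, we may then pick $\bar f_{\delta,1} = (f, 0)$, in which case $A_1 \bar f_{\delta,1} = Df$, and $A_2 \bar f_{\delta,1} = 0$. If $\ell = 2$, which is the only other case, we may pick a smooth approximate $f_{\delta,2}$ to $f$ such that

$$\frac{1}{2} \|f - f_{\delta,2}\|^2_{L^2(\Omega)} < \delta.$$

Then we set $\bar f_{\delta,2} := (f_{\delta,2}, \nabla f_{\delta,2})$, yielding $A_1 \bar f_{\delta,2} = 0$ and $A_2 \bar f_{\delta,2} = \nabla^2 f_{\delta,2}$. By $f_{\delta,2}$ being smooth we see that $\|A_2 \bar f_{\delta,2}\|_{\mathcal{M}} < \infty$. Thus Assumption A-δ is satisfied for the squared $L^2$ fidelity $\Phi(v) = \frac{1}{2} \|f - v\|^2_{L^2(\Omega)}$.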

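Example 6 (Infimal convolution TV denoising). Let us take $Z = L^2(\Omega)$, and $X = (\mathrm{BV}(\Omega) \cap L^2(\Omega)) \times X_2$, where

$$X_2 := \{ v \in W^{1,1}(\Omega) \mid \nabla v \in \mathrm{BV}(\Omega; \mathbb{R}^n) \},$$

and $\mathrm{BV}(\Omega) \cap L^2(\Omega)$ again has the topology of Example 4. Setting $u = (v, w)$, as well as

$$K_0(v, w) = v + w, \quad A_1 u = Dv, \quad \text{and} \quad A_2 u = D \nabla w,$$

we obtain for $(D^\alpha)$ with $K = K_0$ and $Y = Z$ the infimal convolution total variation denoising model of [15]. Assumptions A-KA, A-δ, and A-H are verified analogously to $\mathrm{TGV}^2$ in Example 5.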

Example 7 (Deblurring and sub-sampled Fourier transforms for MRI). Let $K_0$ and the $A_j$s be given by one of the regularisers of Example 4 to Example 6. Also take the cost $F = F_{L^2_2}$ or $F_{L^1_\eta \nabla}$ as in Example 3, and $\Phi$ as the squared $L^2$ fidelity of Example 2. However, let us now take $K = T K_0$ for some bounded linear operator $T : Z \to Y$. The operator $T$ could be, for example, a blurring kernel or a (sub-sampled) Fourier transform, in which case we obtain a model for learning the parameters for deblurring or reconstruction from Fourier samples. The latter would be important, for example, for magnetic resonance imaging (MRI) [5,44,45]. Unfortunately, our theory does not extend to many of these cases because we will require, roughly, $K_0^* K_0 \le C K^* K$ for some constant $C > 0$. In other words, the cost functional cannot see what the fidelity does not.
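To see concretely why sub-sampling breaks the requirement $K_0^* K_0 \le C K^* K$, the following finite-dimensional sketch may help. It is only an illustration under assumptions that are not in the text: $K_0$ is taken as the identity on $\mathbb{C}^n$, $T$ is a unitary discrete Fourier transform followed by a hypothetical sampling mask, and the signal length and mask are arbitrary. The vector `u` is built so that its spectrum lives on the discarded frequencies, so $Ku = 0$ while $K_0 u \ne 0$, and no constant $C$ can work.

```python
import numpy as np

n = 16
# Hypothetical sampling pattern: keep the first half of the DFT coefficients.
mask = np.zeros(n, dtype=bool)
mask[: n // 2] = True


def K(u):
    """Sub-sampled unitary DFT, K = S F K0 with K0 = identity (an assumption)."""
    return np.fft.fft(u, norm="ortho")[mask]


# Build a signal whose spectrum is supported only on the discarded frequencies.
spectrum = np.zeros(n, dtype=complex)
spectrum[~mask] = 1.0
u = np.fft.ifft(spectrum, norm="ortho")

norm_K0u = np.linalg.norm(u)    # what the cost functional "sees"
norm_Ku = np.linalg.norm(K(u))  # what the fidelity "sees"
print(norm_K0u, norm_Ku)
```

Any component of $u$ in the null space of the sampling operator is invisible to the fidelity but still penalised by the cost, which is the situation the last sentence of Example 7 describes.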

Example 8 (Parameter estimation in Bingham flow). Bingham fluids are materials that behave as solids if the magnitude of the stress tensor stays below a plasticity threshold, and as liquids if that quantity surpasses the threshold. In a cross-sectional pipe, of section $\Omega$, the behaviour is modelled by the energy minimisation functional

$$\min_{u \in H^1_0(\Omega)} \frac{\mu}{2} \|u\|^2_{H^1_0(\Omega)} - (f | u)_{H^{-1}(\Omega), H^1_0(\Omega)} + \alpha \int_\Omega |\nabla u| \, dx, \tag{4}$$

where $\mu > 0$ stands for the viscosity coefficient, $\alpha > 0$ for the plasticity threshold and $f \in H^{-1}(\Omega)$. In many practical situations, the plasticity threshold is not known in advance and has to be estimated from experimental measurements. One then aims at minimising a least squares term

$$F_{L^2_2} = \frac{1}{2} \|u - f_0\|^2_{L^2(\Omega)}$$

subject to (4).

The bilevel optimisation problem can then be formulated as problem (P)–$(D^\alpha)$, with the choices $X = Y = H^1_0(\Omega)$, $K$ the identity, $A_1 = D$ and

$$\Phi(v) = \frac{\mu}{2} \|v\|^2_{H^1_0(\Omega)} - (f | v)_{H^{-1}(\Omega), H^1_0(\Omega)}.$$

Concentrating in the rest of this paper primarily on image processing applications, we will however briefly return to Bingham flow in Example 13.


2.2. Considerations for numerical implementation

For the numerical solution of the denoising sub-problem, we will in a follow-up work [24] expand upon the infeasible semi-smooth quasi-Newton approach taken in [35] for $L^2$-TV image restoration problems. This depends on Huber-regularisation of the total variation measures, as well as enforcing smoothness through Hilbert spaces. This is usually done by a squared penalty on the gradient, i.e., $H^1$ regularisation, but we formalise this more abstractly in order to simplify our notation and arguments later on. Therefore, we take a convex, proper, and weak* lower-semicontinuous smoothing functional $H : X \to [0, \infty]$, and generally expect it to satisfy the following.

Assumption A-H (Smoothing). We assume that $0 \in \operatorname{dom} H$ and for every $\delta \ge 0$, every $\alpha \in \mathcal{P}_\alpha^\infty$, and every $u \in X$, the existence of $\tau \in X$ satisfying

$$H(\tau) < \infty \quad \text{and} \quad J(\tau; \alpha) \le J(u; \alpha) + \delta. \tag{5}$$

Example 9 ($H^1$ smoothing in $\mathrm{BV}(\Omega)$). Usually, with $H^1(\Omega; \mathbb{R}^m) \cap X \ne \emptyset$, we take

$$H(u) := \begin{cases} \frac{1}{2} \|\nabla u\|^2_{L^2(\Omega; \mathbb{R}^{m \times n})}, & u \in H^1(\Omega; \mathbb{R}^m) \cap X, \\ \infty, & \text{otherwise.} \end{cases}$$

This is in particular the case with Example 4 (TV), where $X = \mathrm{BV}(\Omega) \cap L^2(\Omega) \supset H^1(\Omega)$, and Example 5 ($\mathrm{TGV}^2$), where $X = (\mathrm{BV}(\Omega) \cap L^2(\Omega)) \times \mathrm{BD}(\Omega) \supset H^1(\Omega) \times H^1(\Omega; \mathbb{R}^n)$ on a bounded domain $\Omega$. In both of these cases, weak* lower semicontinuity is apparent; for completeness we record this in Lemma 1 in Section 4.2. In case of Example 4, (5) is immediate from approximating $u$ strictly by functions in $C^\infty(\Omega)$ using standard strict approximation results in $\mathrm{BV}(\Omega)$ [2]. In case of Example 5, this also follows by a simple generalisation of the same argument to TGV-strict approximation, as presented in [43,10].

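For parameters $\varepsilon \ge 0$ and $\gamma \in (0, \infty]$, we then consider the problem

$$\min_{\alpha \in \mathcal{P}_\alpha^\infty} F(u_{\alpha,\gamma,\varepsilon}) \tag{P$^{\gamma,\varepsilon}$}$$

where $u_{\alpha,\gamma,\varepsilon} \in X \cap \operatorname{dom} H$ solves

$$\min_{u \in X} J^{\gamma,\varepsilon}(u; \alpha) \tag{D$^{\gamma,\varepsilon}$}$$

for

$$J^{\gamma,\varepsilon}(u; \alpha) := \varepsilon H(u) + \Phi(Ku) + \sum_{j=1}^N \alpha_j \|A_j u\|_{\gamma,j}.$$

Here we denote for short the Huber-regularised total variation norm

$$\|\mu\|_{\gamma,j} := \|\mu\|_{\gamma, \mathcal{M}(\Omega; \mathbb{R}^{m_j})}, \quad (\mu \in \mathcal{M}(\Omega; \mathbb{R}^{m_j})),$$

as given by the following definition. There we interpret $\gamma = \infty$ to give back the standard unregularised total variation measure. Clearly $J^{\infty,0} = J$, and (D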


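Definition 1. Given $\gamma \in (0, \infty]$, we define for the norm $\|\cdot\|_2$ on $\mathbb{R}^m$ the Huber regularisation

$$|g|_\gamma = \begin{cases} \|g\|_2 - \frac{1}{2\gamma}, & \|g\|_2 \ge 1/\gamma, \\ \frac{\gamma}{2} \|g\|_2^2, & \|g\|_2 < 1/\gamma. \end{cases}$$

We observe that this can equivalently be written using convex conjugates as

$$\alpha |g|_\gamma = \sup \Bigl\{ \langle q, g \rangle - \frac{1}{2 \gamma \alpha} \|q\|_2^2 \Bigm| \|q\|_2 \le \alpha \Bigr\}. \tag{6}$$

Then if $\mu = f \mathcal{L}^n + \mu^s$ is the Lebesgue decomposition of $\mu \in \mathcal{M}(\Omega; \mathbb{R}^m)$ into the absolutely continuous part $f \mathcal{L}^n$ and the singular part $\mu^s$, we set

$$|\mu|_\gamma(V) := \int_V |f(x)|_\gamma \, dx + |\mu^s|(V), \quad (V \subset \Omega \text{ Borel-measurable}).$$

The measure $|\mu|_\gamma$ is the Huber-regularisation of the total variation measure $|\mu|$, and we define its Radon norm as the Huber regularisation of the Radon norm of $\mu$, that is

$$\|\mu\|_{\gamma, \mathcal{M}(\Omega; \mathbb{R}^m)} := \| |\mu|_\gamma \|_{\mathcal{M}(\Omega; \mathbb{R}^m)}.$$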
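The pointwise part of Definition 1 and the dual form (6) can be compared numerically. The sketch below is an illustration only: the function names `huber` and `huber_dual` are ours, and the supremum in (6) is evaluated in closed form by projecting the unconstrained maximiser $\gamma \alpha g$ onto the ball $\|q\|_2 \le \alpha$, which is valid because the objective is a concave isotropic quadratic in $q$.

```python
import numpy as np


def huber(g, gamma):
    """Huber regularisation |g|_gamma of the Euclidean norm (Definition 1).

    Convention of the paper: large gamma means little smoothing.
    """
    norm = float(np.linalg.norm(g))
    if np.isinf(gamma):
        return norm  # gamma = infinity gives back the plain norm
    if norm >= 1.0 / gamma:
        return norm - 1.0 / (2.0 * gamma)
    return 0.5 * gamma * norm**2


def huber_dual(g, gamma, alpha):
    """Value of the supremum in (6) for finite gamma and alpha > 0."""
    g = np.asarray(g, dtype=float)
    q = gamma * alpha * g           # unconstrained maximiser
    q_norm = np.linalg.norm(q)
    if q_norm > alpha:              # project onto the ball ||q|| <= alpha
        q = q * (alpha / q_norm)
    return float(q @ g - (q @ q) / (2.0 * gamma * alpha))


# Compare both forms in the linear regime and in the quadratic regime.
for g in (np.array([0.3, 0.4]), np.array([0.03, 0.04])):
    print(2.0 * huber(g, 10.0), huber_dual(g, 10.0, 2.0))
```

The two printed values in each row should coincide up to rounding, once for $\|g\|_2 \ge 1/\gamma$ and once for $\|g\|_2 < 1/\gamma$.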

Remark 4. The parameter $\gamma$ varies in the literature. In this paper, we use the convention of [23], where large $\gamma$ means small Huber regularisation. In [35], small $\gamma$ means small Huber regularisation. That is, their regularisation parameter is $1/\gamma$ in our notation.

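2.3. Shorthand notation

Writing for notational lightness

$$Au := (A_1 u, \ldots, A_N u), \quad R^\gamma(\mu_1, \ldots, \mu_N; \alpha) := \sum_{j=1}^N \alpha_j \|\mu_j\|_{\gamma,j}, \quad \text{and} \quad R(\cdot; \alpha) := R^\infty(\cdot; \alpha),$$

our problem $(P^{\gamma,\varepsilon})$ becomes

$$\min_{\alpha \in \mathcal{P}_\alpha^\infty} F(u_\alpha) \quad \text{subject to} \quad u_\alpha \in \arg\min_{u \in X} J^{\gamma,\varepsilon}(u; \alpha)$$

for

$$J^{\gamma,\varepsilon}(u; \alpha) := \varepsilon H(u) + \Phi(Ku) + R^\gamma(Au; \alpha).$$

Further, given $\alpha \in \mathcal{P}_\alpha$, we define the "marginalised" regularisation functional

$$T^\gamma(v; \alpha) := \inf_{\bar v \in K^{-1} v} R^\gamma(A \bar v; \alpha). \tag{7}$$

Here $K^{-1}$ stands for the preimage, so the constraint is $\bar v \in X$ with $v = K \bar v$. Then in the case $(\gamma, \varepsilon) = (\infty, 0)$ and $F = F_0 \circ K$, our problem may also be written as

$$\min_{\alpha \in \mathcal{P}_\alpha^\infty} F_0(v_\alpha) \quad \text{subject to} \quad v_\alpha \in \arg\min_{v \in Y} \Phi(v) + T(v; \alpha).$$

This gives the problem a much more conventional flair, as the following examples demonstrate.

Example 10 (TV as a marginal). Consider the total variation regularisation of Example 4. Then $A_1 = D$, $N = 1$, and

$$T(v; \alpha) = R(Av; \alpha) = \alpha \mathrm{TV}(v).$$

Example 11 ($\mathrm{TGV}^2$ as a marginal). In case of the $\mathrm{TGV}^2$ regularisation of Example 5, we have $K(v, w) = v$ and $K^\dagger v = (v, 0)$. Thus $K K^\dagger f = f$, etc., so

$$T(v; \alpha) = \inf_{w \in \mathrm{BD}(\Omega)} R(Dv - w, Ew; \alpha) = \mathrm{TGV}^2_\alpha(v).$$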
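As a small worked consequence of the infimum in Example 11, admissible choices of $w$ give upper bounds on the marginal. The bounds below are not stated in the text; the second one additionally assumes, for illustration only, that $v \in W^{1,1}(\Omega)$ with $\nabla v \in \mathrm{BD}(\Omega)$, so that $w = \nabla v$ is admissible.

```latex
\begin{aligned}
  % choose w = 0: the second-order term vanishes
  T(v;\alpha) &\le R(Dv - 0,\, E0;\, \alpha) = \alpha_1 \|Dv\|_{\mathcal{M}(\Omega;\mathbb{R}^n)} = \alpha_1 \mathrm{TV}(v), \\
  % choose w = \nabla v (assumed admissible): the first-order term vanishes
  T(v;\alpha) &\le R(Dv - \nabla v,\, E\nabla v;\, \alpha) = \alpha_2 \|E \nabla v\|_{\mathcal{M}(\Omega;\mathbb{R}^{n\times n})}.
\end{aligned}
```

So the $\mathrm{TGV}^2$ marginal is never larger than the weighted first-order penalty, and for sufficiently regular $v$ never larger than the weighted second-order penalty, which is consistent with the earlier remark that the learning procedure should be allowed to send one of $\alpha_1, \alpha_2$ to infinity.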

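3. Main results

Our task now is to study the characteristics of optimal solutions, and their existence. Our results, based on natural assumptions on the data and the original problem (P), derive properties of the solutions to this problem and all numerically regularised problems $(P^{\gamma,\varepsilon})$ sufficiently close to the original problem: large $\gamma > 0$ and small $\varepsilon > 0$. We denote by $u_{\alpha,\gamma,\varepsilon}$ any solution to $(D^{\gamma,\varepsilon})$ for any given $\alpha \in \mathcal{P}_\alpha$, and by $\alpha_{\gamma,\varepsilon}$ any solution to $(P^{\gamma,\varepsilon})$. Solutions to $(D^\alpha)$ and (P) we denote, respectively, by $u_\alpha = u_{\alpha,\infty,0}$, and $\hat\alpha = \alpha_{\infty,0}$.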

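3.1. $L^2$-squared cost and $L^2$-squared fidelity

Our main existence result regarding $L^2$-squared costs and $L^2$-squared fidelities is the following.

Theorem 1. Let $Y$ and $Z$ be Hilbert spaces, $F(u) = \frac{1}{2} \|K_0 u - f_0\|_Z^2$, and $\Phi(v) = \frac{1}{2} \|f - v\|_Y^2$ for some $f \in \mathcal{R}(K)$, $f_0 \in Z$, and a bounded linear operator $K_0 : X \to Z$ satisfying

$$\|K_0 u\|_Z \le C_0 \|Ku\|_Y \quad \text{for all } u \in X \tag{8}$$

for some constant $C_0 > 0$. Suppose Assumptions A-KA and A-δ hold. If for some $\bar\alpha \in \operatorname{int} \mathcal{P}_\alpha$ and $t \in (0, 1/C_0]$ holds

$$T(f; \bar\alpha) > T\bigl(f - t (K_0 K^\dagger)^* (K_0 K^\dagger f - f_0); \bar\alpha\bigr), \tag{9}$$

then problem (P) admits a solution $\hat\alpha \in \operatorname{int} \mathcal{P}_\alpha^\infty$.

If, moreover, Assumption A-H holds, then there exist $\bar\gamma \in (0, \infty)$ and $\bar\varepsilon \in (0, \infty)$ such that the problem $(P^{\gamma,\varepsilon})$ with $(\gamma, \varepsilon) \in [\bar\gamma, \infty] \times [0, \bar\varepsilon]$ admits a solution $\alpha_{\gamma,\varepsilon} \in \operatorname{int} \mathcal{P}_\alpha^\infty$, and the solution map

$$S(\gamma, \varepsilon) := \arg\min_{\alpha \in \mathcal{P}_\alpha^\infty} F(u_{\alpha,\gamma,\varepsilon})$$

is outer semicontinuous within $[\bar\gamma, \infty] \times [0, \bar\varepsilon]$.

We prove this result in Section 5. Outer semicontinuity of a set-valued map $S : \mathbb{R}^k \rightrightarrows \mathbb{R}^m$ means [39] that for any convergent sequence $x^k \to x$ and $S(x^k) \ni y^k \to y$, we have $y \in S(x)$. In particular, the outer semicontinuity of $S$ means that as the numerical regularisation vanishes, the optimal parameters for the regularised models $(P^{\gamma,\varepsilon})$ tend to optimal parameters of the original model (P).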

Remark 5. Let $Z = Y$ and $K_0 = K$ in Theorem 1. Then (9) reduces to

$$T(f; \bar\alpha) > T(f_0; \bar\alpha). \tag{10}$$

Also observe that our result requires $\Phi \circ K$ to measure all the data that $F$ measures, in the more precise sense given by (8). If (8) did not hold, an oscillating solution $u_\alpha$ for $\alpha \in \partial \mathcal{P}_\alpha^\infty$ could largely pass through the nullspace of $K$, hence have a low value for the objective $J$ of the inner problem, yet have a large cost given by $F$.

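Corollary 1 (Total variation Gaussian denoising). Suppose $f, f_0 \in \mathrm{BV}(\Omega) \cap L^2(\Omega)$, and

$$\mathrm{TV}(f) > \mathrm{TV}(f_0). \tag{11}$$

Then there exist $\bar\varepsilon, \bar\gamma > 0$ such that any optimal solution $\alpha_{\gamma,\varepsilon}$ to the problem

$$\alpha_{\gamma,\varepsilon} \in \arg\min_{\alpha \ge 0} \frac{1}{2} \|f_0 - u_{\alpha,\gamma,\varepsilon}\|^2_{L^2(\Omega)}$$

with

$$u_{\alpha,\gamma,\varepsilon} \in \arg\min_{u \in \mathrm{BV}(\Omega)} \Bigl( \frac{1}{2} \|f - u\|^2_{L^2(\Omega)} + \alpha \|Du\|_{\gamma,\mathcal{M}} + \frac{\varepsilon}{2} \|\nabla u\|^2_{L^2(\Omega; \mathbb{R}^n)} \Bigr)$$

satisfies $\alpha_{\gamma,\varepsilon} > 0$ whenever $\varepsilon \in [0, \bar\varepsilon]$, $\gamma \in [\bar\gamma, \infty]$.

That is, for the optimal parameter to be strictly positive, the noisy image $f$ should, in terms of the total variation, oscillate more than the noise-free image $f_0$. This is a very natural condition: if the noise somehow had smoothed out features from $f_0$, then we should not smooth it any more by TV regularisation!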

Proof. Assumptions A-KA, A-δ, and A-H we have already verified in Example 4. We then observe that $K_0 = K$, so we are in the setting of Remark 5. Following the mapping of the TV problem to the general framework using the construction in Example 4, we have $K = I$ and $K^\dagger = I$ embeddings with $Y = L^2(\Omega)$. $K^\dagger$ is bounded on $\mathcal{R}(K) = L^2(\Omega) \cap \mathrm{BV}(\Omega)$. Moreover, by Example 10, $T(v; \bar\alpha) = \bar\alpha \mathrm{TV}(v)$. Thus (10) with the choice $t = 1$ reduces to (11). □
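The statement of Corollary 1 can be explored on a one-dimensional toy problem. The sketch below is not the authors' numerical method (which is a semi-smooth quasi-Newton scheme discussed in their follow-up work); it is a minimal stand-in that solves the discretised lower-level problem by plain gradient descent and the upper-level problem by a grid search. The signal, noise level, grid of $\alpha$ values, $\gamma$, $\varepsilon$ and iteration count are all assumptions chosen for illustration.

```python
import numpy as np


def tv(u):
    """Discrete 1-D total variation: sum of absolute forward differences."""
    return float(np.abs(np.diff(u)).sum())


def denoise(f, alpha, gamma=20.0, eps=1e-3, iters=1000):
    """Gradient descent on 0.5|f-u|^2 + alpha*sum|Du|_gamma + 0.5*eps*|Du|^2."""
    u = f.copy()
    # Step size 1/L, with L bounding the Lipschitz constant of the gradient
    # (the forward-difference operator D satisfies ||D||^2 <= 4).
    step = 1.0 / (1.0 + 4.0 * (alpha * gamma + eps))
    for _ in range(iters):
        d = np.diff(u)
        # Derivative of the Huber function is clip(gamma*d, -1, 1).
        p = alpha * np.clip(gamma * d, -1.0, 1.0) + eps * d
        grad = u - f
        grad[:-1] -= p  # apply the transpose of D to p
        grad[1:] += p
        u = u - step * grad
    return u


rng = np.random.default_rng(0)
f0 = np.zeros(128)
f0[40:90] = 1.0                                # piecewise constant ground truth
f = f0 + 0.1 * rng.standard_normal(f0.size)   # noisy data

# Upper-level problem: grid search over alpha for the L2-squared cost.
alphas = np.linspace(0.0, 0.5, 26)
costs = [0.5 * np.sum((f0 - denoise(f, a)) ** 2) for a in alphas]
best_alpha = float(alphas[int(np.argmin(costs))])
print(tv(f), tv(f0), best_alpha)
```

In this setup the noisy signal has a larger discrete total variation than the ground truth, which is the discrete analogue of condition (11), and the grid search selects a strictly positive weight, as the corollary suggests it should.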

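For $\mathrm{TGV}^2$ we also have a very natural condition.

Corollary 2 (Second-order total generalised variation Gaussian denoising). Suppose that the data $f, f_0 \in L^2(\Omega) \cap \mathrm{BV}(\Omega)$ satisfies for some $\alpha_2 > 0$ the condition

$$\mathrm{TGV}^2_{(\alpha_2, 1)}(f) > \mathrm{TGV}^2_{(\alpha_2, 1)}(f_0). \tag{12}$$

Then there exist $\bar\varepsilon, \bar\gamma > 0$ such that any optimal solution $\alpha_{\gamma,\varepsilon} = ((\alpha_{\gamma,\varepsilon})_1, (\alpha_{\gamma,\varepsilon})_2)$ to the problem

$$\alpha_{\gamma,\varepsilon} \in \arg\min_{\alpha \ge 0} \frac{1}{2} \|f_0 - v_{\alpha,\gamma,\varepsilon}\|^2_{L^2(\Omega)}$$

with

$$(v_{\alpha,\gamma,\varepsilon}, w_{\alpha,\gamma,\varepsilon}) \in \arg\min_{\substack{v \in \mathrm{BV}(\Omega) \\ w \in \mathrm{BD}(\Omega)}} \Bigl( \frac{1}{2} \|f - v\|^2_{L^2(\Omega)} + \alpha_1 \|Dv - w\|_{\gamma,\mathcal{M}} + \alpha_2 \|Ew\|_{\gamma,\mathcal{M}} + \frac{\varepsilon}{2} \|(\nabla v, \nabla w)\|^2_{L^2(\Omega; \mathbb{R}^n \times \mathbb{R}^{n \times n})} \Bigr)$$

satisfies $(\alpha_{\gamma,\varepsilon})_1, (\alpha_{\gamma,\varepsilon})_2 > 0$ whenever $\varepsilon \in [0, \bar\varepsilon]$, $\gamma \in [\bar\gamma, \infty]$.

Proof. Assumptions A-KA, A-δ, and A-H we have already verified in Example 5. We then observe that $K_0 = K$, so we are in the setting of Remark 5. By Example 11, $T(v; \bar\alpha) = \mathrm{TGV}^2_{\bar\alpha}(v)$. Finally, similarly to (11), we get for (10) with $t = 1$ the condition (12). □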

Example 12 (Fourier reconstructions). Let $K_0$ be given, for example as constructed in Example 4 or Example 5. If we take $K = \mathcal{F} K_0$ for $\mathcal{F}$ the Fourier transform, or any other unitary transform, then (8) is satisfied and $K^\dagger = K_0^\dagger \mathcal{F}^*$. Thus (9) becomes

$$T(f; \bar\alpha) > T\bigl(f - t (K_0 K_0^\dagger \mathcal{F}^*)^* (K_0 K_0^\dagger \mathcal{F}^* f - f_0); \bar\alpha\bigr).$$

With $\mathcal{F}^* f, f_0 \in \mathcal{R}(K_0)$ and $t = 1$ this just reduces to

$$T(f; \bar\alpha) > T(\mathcal{F} f_0; \bar\alpha).$$

Unfortunately, our results do not cover parameter learning for reconstruction from partial Fourier samples, exactly because of (8). What we can do is to find the optimal parameters if we only know a part of the ground-truth, but have full noisy data.
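The reduction in Example 12 can be followed step by step. The sketch below assumes, beyond what the text states explicitly, that $\mathcal{F}$ is unitary ($\mathcal{F}^* = \mathcal{F}^{-1}$), that the outer operator in the condition is an adjoint as in (13), and that $K_0 K_0^\dagger$ and its adjoint act as the identity on the elements involved, which is how we read the hypothesis $\mathcal{F}^* f, f_0 \in \mathcal{R}(K_0)$.

```latex
\begin{aligned}
  % K_0 K_0^\dagger is the identity on R(K_0), which contains F^* f
  K_0 K_0^\dagger \mathcal{F}^* f - f_0 &= \mathcal{F}^* f - f_0, \\
  % adjoint of a product reverses the order; F^{**} = F and F F^* = I
  (K_0 K_0^\dagger \mathcal{F}^*)^* (\mathcal{F}^* f - f_0)
    &= \mathcal{F} (K_0 K_0^\dagger)^* (\mathcal{F}^* f - f_0)
     = f - \mathcal{F} f_0, \\
  % insert into the second argument of T with t = 1
  f - t\,(f - \mathcal{F} f_0)\big|_{t = 1} &= \mathcal{F} f_0 .
\end{aligned}
```

The second argument of $T$ therefore collapses to $\mathcal{F} f_0$, which gives the reduced condition stated in the example.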

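3.2. Huberised total variation and other $L^1$-type costs with $L^2$-squared fidelity

We now consider the alternative "Huberised total variation" cost functional from Example 3. Unfortunately, we are unable to derive for $F_{L^1_\eta \nabla}$ easily interpretable conditions as for the $F_{L^2_2}$. The workaround is to "discretise" the norm, in the sense of testing only with a finite number of test functions. To be precise, recall that

$$F_{L^1_\eta \nabla}(u) = \|D(f_0 - K_0 u)\|_\eta, \quad \text{where} \quad \|\mu\|_\eta = \sup_{\xi \in C_c^\infty(\Omega; \mathbb{R}^n),\ \|\xi\| \le 1} (\mu | \xi) - \frac{1}{2\eta} \|\xi\|^2_{L^2(\Omega; \mathbb{R}^n)}.$$

The idea is to replace the set of test functions $\xi$ by a finite set

$$V = \{\xi_1, \ldots, \xi_M\} \subset \{\xi \in C_c^\infty(\Omega; \mathbb{R}^n) \mid \|\xi\| \le 1\}.$$

That is, we approximate $F_{L^1_\eta \nabla}$ by

$$F^V_{L^1_\eta \nabla}(u) := \sup_{\xi \in V} (\mu | \xi) - \frac{1}{2\eta} \|\xi\|^2_{L^2(\Omega; \mathbb{R}^n)}, \quad \text{where} \quad \mu = D(f_0 - K_0 u).$$

Choosing $V$ suitably, this amounts to taking the finite-dimensional Huberised 1-norm of a discretised gradient of $f_0 - K_0 u$.

To study bi-level optimisation problems with the cost $F^V_{L^1_\eta \nabla}$, we rewrite the functional using a generic Huberised $L^1$-cost

$$F_{L^1_\eta}(z) := \sup_{\|\lambda\|_{Z^*} \le 1} (\lambda | z - f_0) - \frac{1}{2\eta} \|\lambda\|^2_{Z^*}$$

on a reflexive Banach space $Z$, for some $f_0 \in Z$. Indeed, defining

$$D_V v := \{(\operatorname{div} \xi_1 | v), \ldots, (\operatorname{div} \xi_M | v)\}, \quad (v \in \mathrm{BV}(\Omega)),$$

we have

$$F^V_{L^1_\eta \nabla} := F_{L^1_\eta} \circ D_V \quad \text{where } f_0 := D_V f_0 \text{ and } Z = \mathbb{R}^M \text{ with the } \infty\text{-norm}.$$

Our results on $F^V_{L^1_\eta \nabla}$ are then based on the following general result.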

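Theorem 2. Let $Y$ be a Hilbert space, and $Z$ a reflexive Banach space. Let $F = F_{L^1_\eta} \circ K_0$, and $\Phi(v) = \frac{1}{2} \|f - v\|_Y^2$ for some compact linear operator $K_0 : X \to Z$ satisfying (8) and $f \in \mathcal{R}(K)$. Suppose Assumptions A-KA and A-δ hold. If for some $\bar\alpha \in \operatorname{int} \mathcal{P}_\alpha$ and $t > 0$ holds

$$T(f; \bar\alpha) > T\bigl(f - t (K_0 K^\dagger)^* \lambda; \bar\alpha\bigr), \quad \lambda \in \partial F_{L^1_\eta}(K_0 K^\dagger f), \tag{13}$$

then the problem (P) admits a solution $\hat\alpha \in \operatorname{int} \mathcal{P}_\alpha^\infty$.

If, moreover, Assumption A-H holds, then there exist $\bar\gamma \in (0, \infty)$ and $\bar\varepsilon \in (0, \infty)$ such that the problem $(P^{\gamma,\varepsilon})$ with $(\gamma, \varepsilon) \in [\bar\gamma, \infty] \times [0, \bar\varepsilon]$ admits a solution $\alpha_{\gamma,\varepsilon} \in \operatorname{int} \mathcal{P}_\alpha^\infty$, and the solution map $S$ is outer semicontinuous within $[\bar\gamma, \infty] \times [0, \bar\varepsilon]$.

We prove this result in Section 5.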

Remark 6. If $K_0 = K$, the condition (13) has the much more legible form

$$T(f; \bar\alpha) > T(f - t\lambda; \bar\alpha), \quad \lambda \in \partial F_{L^1_\eta}(f). \tag{14}$$

Also if $K$ is compact, then the compactness of $K_0$ follows from (8). In the following applications, $K$ is however not compact for typical domains $\Omega \subset \mathbb{R}^2$ or $\mathbb{R}^3$, so we have to make $K_0$ compact by making the range finite-dimensional.

Corollary 3 (Total variation Gaussian denoising with discretised Huber-TV cost). Suppose that the data satisfies f,f0BV(Ω)∩L2(Ω) and for some t>0and ξ∈V the condition

TV(f)>TV(f+tdivξ), divξ∈∂FLV1

η∇(f). (15)

Then there exist,¯γ >¯ 0such any optimal solution αγ, to the problem

min α≥0F V L1 η∇(f0−vα) with

(15)

uα∈arg min u∈BV(Ω) 1 2f −u 2 L2(Ω)+αDuγ,M+ 2∇v 2 L2(Ω;Rn) satisfies αγ, >0whenever ∈[0,¯], γ∈γ,∞].

This saysthatfortheoptimalparametertobestrictly positive,thenoisyimagef shouldoscillatemore than the image f +tdivξ in the direction of the (discrete) total variation flow. This is a very natural condition, andweobservethatthenon-discretisedcounterpart of (15)forγ=wouldbe

TV(f)>TV(f+tdivξ), ξ∈Sgn(Df0−Df),

where wedefineforameasureμ∈ M(Ω;Rm) thesign

Sgn(μ) :={ξ∈L1(Ω;μ)=ξ|μ|}.

Thatis,divξ isthetotalvariationflow.

Proof. Analogous to Corollary 1 regarding the $L^2$ cost. Note that in the application of Theorem 2, the operator $D_V$ is absorbed by $K_0$. Precisely, we take $K_0 = D_V$, while $K = I$. □
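To make the condition of Corollary 3 more tangible, the following is a minimal one-dimensional sketch, not the discretisation used in the paper. It assumes a forward-difference operator $D$, a hypothetical oscillating signal `f`, and uses the negative TV subgradient $-D^T\operatorname{sign}(Df)$ as a stand-in for the step $t\operatorname{div}\xi$ (the sign convention is an assumption of this sketch). It checks that a small step in this direction strictly reduces the discrete total variation.

```python
import numpy as np

def tv(f):
    """Discrete 1D total variation: sum of absolute forward differences."""
    return float(np.sum(np.abs(np.diff(f))))

def tv_subgradient(f):
    """A subgradient D^T sign(D f) of the discrete TV at f (D = forward differences)."""
    s = np.sign(np.diff(f))
    g = np.zeros_like(f, dtype=float)
    g[:-1] -= s   # contribution of the -1 entries of D
    g[1:] += s    # contribution of the +1 entries of D
    return g

# Hypothetical oscillating "noisy" signal and a small step size (both assumptions).
f = np.array([0.0, 2.0, 1.0, 3.0, 0.0])
t = 0.1

# One step along the negative TV subgradient, the discrete analogue of f + t*div(xi).
f_step = f - t * tv_subgradient(f)

print("TV before:", tv(f))
print("TV after: ", tv(f_step))
```

For this signal the total variation drops from 8 to roughly 6.6, so the strict inequality in the condition holds for this choice of $t$. A piecewise constant signal with few jumps would leave much less room for such a decrease.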

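For $TGV^2$ we also have an analogous natural condition.

Corollary 4 ($TGV^2$ Gaussian denoising with discretised Huber-TV cost). Suppose that the data $f, f_0 \in L^2(\Omega)$ satisfies for some $t, \alpha_2 > 0$ and $\xi \in V$ the condition

$$TGV^2_{(\alpha_2,1)}(f) > TGV^2_{(\alpha_2,1)}(f + t\operatorname{div}\xi), \quad \operatorname{div}\xi \in \partial F^V_{L^1_\eta\nabla}(f). \tag{16}$$

Then there exist $\bar\epsilon, \bar\gamma > 0$ such that any optimal solution $\alpha_{\gamma,\epsilon} = ((\alpha_{\gamma,\epsilon})_1, (\alpha_{\gamma,\epsilon})_2)$ to the problem

$$\min_{\alpha \ge 0} F^V_{L^1_\eta\nabla}(f_0 - v_\alpha) \quad\text{with}\quad (v_\alpha, w_\alpha) \in \operatorname*{arg\,min}_{v \in BV(\Omega),\, w \in BD(\Omega)} \frac{1}{2}\|f - v\|^2_{L^2(\Omega)} + \alpha_1\|Dv - w\|_{\gamma,\mathcal{M}} + \alpha_2\|Ew\|_{\gamma,\mathcal{M}} + \frac{\epsilon}{2}\|(\nabla v, \nabla w)\|^2_{L^2(\Omega;\mathbb{R}^n \times \mathbb{R}^{n\times n})}$$

satisfies $(\alpha_{\gamma,\epsilon})_1, (\alpha_{\gamma,\epsilon})_2 > 0$ whenever $\epsilon \in [0,\bar\epsilon]$, $\gamma \in [\bar\gamma,\infty]$.

Proof. Analogous to Corollary 2. Note that for the application of Theorem 2, where in the present setting $u = (v,w)$, we take $K_0 u = D_V v$, while $Ku = v$. □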

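4. A few auxiliary results

We record in this section some general results that will be useful in the proofs of the main results. These include the coercivity of the functional $J^{\gamma,0}(\cdot;\lambda,\alpha)$, recorded in Section 4.1. We then discuss some elementary lower semicontinuity facts in Section 4.2. We provide in Section 4.3 some new results for passing from strict convergence to strong convergence.

4.1. Coercivity

Observe that

$$R^\gamma(\mu_1,\ldots,\mu_N;\bar\alpha) = \sup_{\substack{\psi_j \in C_c^\infty(\Omega;\mathbb{R}^{m_j}\times\mathbb{R}^n) \\ \sup_{x\in\Omega}\|\psi_j(x)\|_2 \le 1}} \sum_{j=1}^N \bar\alpha_j\Bigl(\mu_j(\psi_j) - \frac{1}{2\gamma}\|\psi_j\|^2_{L^2(\Omega;\mathbb{R}^n)}\Bigr).$$

Thus

$$R(\mu;\alpha) \ge R^\gamma(\mu;\alpha) \ge R(\mu;\alpha) - \frac{C}{2\gamma}\mathcal{L}^n(\Omega)$$

for some $C = C(\bar\alpha)$. Since $\Omega$ is bounded, it follows that given $\delta > 0$, for large enough $\gamma > 0$ and every $\epsilon \ge 0$ holds

$$J(u;\alpha) - \delta \le J^{\gamma,0}(u;\alpha) \le J^{\gamma,\epsilon}(u;\alpha) \le J^{0,\epsilon}(u;\alpha). \tag{17}$$

We will use these properties frequently. Based on the coercivity and norm equivalence properties in Assumptions A-KA and A-Φ, the following proposition states the important fact that $J^{\gamma,0}$ is coercive with respect to $\|\cdot\|'_X$ and thus also the standard norm of $X$.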

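Proposition 1. Suppose Assumptions A-KA and A-Φ hold, and that $\alpha \in \operatorname{int}\mathcal{P}_\alpha^\infty$. Let $\epsilon \ge 0$ and $\gamma \in (0,\infty]$. Then

$$J^{\gamma,\epsilon}(u;\alpha) \to +\infty \quad\text{as}\quad \|u\|_X \to +\infty. \tag{18}$$

Proof. Let $\{u^i\}_{i=1}^\infty \subset X$, and suppose $\sup_i J^{\gamma,\epsilon}(u^i;\alpha) < \infty$. Then in particular $\sup_i \Phi(Ku^i) < \infty$. By Assumption A-Φ then $\sup_i \|Ku^i\|_Y < \infty$. But Assumption A-KA says

$$\|u^i\|'_X = \sum_{j=1}^N \|A_j u^i\|_j + \|Ku^i\|_Y \le J^{\gamma,\epsilon}(u^i;\alpha) + \|Ku^i\|_Y.$$

This implies $\sup_i \|u^i\|'_X < \infty$. By the equivalence of norms in Assumption A-KA, we immediately obtain (18). □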
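The sandwich bound between $R$ and $R^\gamma$ at the start of this subsection has a simple pointwise analogue for a single vector $g$, namely $|g| \ge |g|_\gamma \ge |g| - \frac{1}{2\gamma}$. The following sketch checks this numerically. It assumes the $\frac{1}{2\gamma}$ convention of the dual formula above; the function names and the test vector are hypothetical, and the supremum is only approximated on a grid.

```python
import numpy as np

def huber_norm(g, gamma):
    """Closed-form Huber-regularised Euclidean norm |g|_gamma (1/(2*gamma) convention)."""
    n = float(np.linalg.norm(g))
    if n >= 1.0 / gamma:
        return n - 1.0 / (2.0 * gamma)
    return 0.5 * gamma * n**2

def huber_dual(g, gamma, samples=2001):
    """Grid approximation of sup_{|q| <= 1} <q, g> - |q|^2 / (2*gamma).

    The maximiser is parallel to g, so it suffices to search q = s * g/|g|, s in [0, 1].
    """
    n = float(np.linalg.norm(g))
    s = np.linspace(0.0, 1.0, samples)
    return float(np.max(s * n - s**2 / (2.0 * gamma)))

g = np.array([0.3, -0.4])  # hypothetical test vector
for gamma in (1.0, 10.0, 100.0):
    h = huber_norm(g, gamma)
    gap = float(np.linalg.norm(g)) - h
    # gap stays between 0 and 1/(2*gamma), so it vanishes as gamma grows
    print(gamma, h, huber_dual(g, gamma), gap)
```

Integrating the pointwise gap over the bounded domain $\Omega$ is what produces the factor $\mathcal{L}^n(\Omega)$ in the bound, which is why large $\gamma$ makes $J^{\gamma,0}$ uniformly close to $J$.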

4.2. Lower semicontinuity

We record the following elementary lower semicontinuity facts that we have already used to justify our examples.

Lemma 1. The following functionals are lower semicontinuous.

(i) $\mu \mapsto \|\nu - \mu\|_{\gamma,\mathcal{M}}$ with respect to weak* convergence in $\mathcal{M}(\Omega;\mathbb{R}^d)$.

(ii) $v \mapsto \|f - v\|^2_{L^2(\Omega;\mathbb{R}^d)}$ with respect to weak convergence in $L^p(\Omega;\mathbb{R}^d)$ for any $1 < p \le 2$ on a bounded domain $\Omega$.

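Proof. In each case, let $\{v^i\}_{i=1}^\infty$ converge to $v$. Denoting by $G$ the involved functional, we write it as a convex conjugate, $G(v) = \sup\{\langle v,\varphi\rangle - G^*(\varphi)\}$. Taking a supremising sequence $\{\varphi^j\}_{j=1}^\infty$ for this functional at any point $v$, we easily see lower semicontinuity by considering the sequences $\{\langle v^i,\varphi^j\rangle - G^*(\varphi^j)\}_{i=1}^\infty$ for each $j$. In case (ii) we use the fact that $\varphi^j \in L^2(\Omega;\mathbb{R}^d) \subset [L^p(\Omega;\mathbb{R}^d)]^*$ when $\Omega$ is bounded.

In case (i), how exactly we write $G(\mu) = \|\nu - \mu\|_{\gamma,\mathcal{M}}$ as a convex conjugate demands explanation. We first of all recall that for $g \in \mathbb{R}^n$, the Huber-regularised norm may be written in dual form as

$$|g|_\gamma = \sup\Bigl\{\langle q, g\rangle - \frac{\gamma}{2}\|q\|_2^2 \Bigm| \|q\|_2 \le 1\Bigr\}.$$

Therefore, we find that

$$G(\mu) = \sup\Bigl\{\mu(\varphi) - \int_\Omega \frac{\gamma}{2}\|\varphi(x)\|_2^2\,dx \Bigm| \varphi \in C_c^\infty(\Omega),\ \sup_{x\in\Omega}\|\varphi(x)\|_2 \le \alpha\Bigr\}.$$

This has the required form. □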

We also show here that the marginal regularisation functional $T^\gamma(\cdot;\bar\alpha)$ is weakly* lower semicontinuous on $Y$. Choosing $K$ as in Example 4 and Example 5, this provides in particular a proof that $TV$ and $TGV^2$ are lower semicontinuous with respect to weak convergence in $L^2(\Omega)$ when $n = 1, 2$.

Lemma 2. Suppose $\bar\alpha \in \operatorname{int}\mathcal{P}_\alpha^\infty$, and Assumption A-KA holds. Then $T^\gamma(\cdot;\bar\alpha)$ is lower semicontinuous with respect to weak* convergence in $Y$, and continuous with respect to strong convergence in $\mathcal{R}(K)$.

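Proof. Let $v^k \overset{*}{\rightharpoonup} v$ weakly* in $Y$. By the Banach–Steinhaus theorem, the sequence is bounded in $Y$. Recall the definition

$$T^\gamma(v;\bar\alpha) := \inf_{\bar v \in K^{-1}v} R^\gamma(A\bar v;\bar\alpha).$$

Therefore, if we pick $\epsilon > 0$ and $\bar v^k \in K^{-1}v^k$ such that

$$R^\gamma(A\bar v^k;\bar\alpha) \le T^\gamma(v^k;\bar\alpha) + \epsilon,$$

then referral to Assumption A-KA yields for some constant $c > 0$ the bound

$$c\|\bar v^k\|_X \le \|v^k\| + R^\gamma(A\bar v^k;\bar\alpha) \le \|v^k\| + T^\gamma(v^k;\bar\alpha) + \epsilon.$$

Without loss of generality, we may assume that

$$\liminf_{k\to\infty} T^\gamma(v^k;\bar\alpha) < \infty,$$

because otherwise there is nothing to prove. Then $\{\bar v^k\}_{k=1}^\infty$ is bounded in $X$, and therefore admits a weakly* convergent subsequence. Let $\bar v$ be the limit of this, unrelabelled, sequence. Since $K$ is continuous, we find that $K\bar v = v$. But $R^\gamma(\cdot;\bar\alpha)$ is clearly weak* lower semicontinuous in $X$; see Lemma 1. Thus

$$T^\gamma(v;\bar\alpha) \le R^\gamma(A\bar v;\bar\alpha) \le \liminf_{k\to\infty} R^\gamma(A\bar v^k;\bar\alpha) \le \liminf_{k\to\infty} T^\gamma(v^k;\bar\alpha) + \epsilon.$$

Since $\epsilon > 0$ was arbitrary, lower semicontinuity follows.

To see continuity with respect to strong convergence in $\mathcal{R}(K)$, we observe that if $v = Ku \in \mathcal{R}(K)$, then by the boundedness of the operators $\{A_j\}_{j=1}^N$ we get

$$T^\gamma(v;\bar\alpha) \le R^\gamma(u;\bar\alpha) \le R(u;\bar\alpha) \le C\|u\|,$$

for some constant $C > 0$. So we know that $T^\gamma(\cdot;\bar\alpha)|_{\mathcal{R}(K)}$ is finite-valued and convex. Therefore it is continuous [30, Lemma I.2.1]. □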

4.3. From Φ-strict to strong convergence

In Proposition 3, forming part of the proof of our main theorems, we will need to pass from “Φ-strict convergence” of $Ku^k$ to $v$ to strong convergence, using the following lemmas. The former means that $\Phi(Ku^k) \to \Phi(v)$ and $Ku^k \overset{*}{\rightharpoonup} v$ weakly* in $Y$. By strong convexity in a Banach space $Y$, we mean the existence of $\gamma > 0$ such that for every $y \in Y$ and $z \in \partial\Phi(y) \subset Y^*$ holds

$$\Phi(y') - \Phi(y) \ge (z|y' - y) + \frac{\gamma}{2}\|y' - y\|_Y^2, \quad (y' \in Y),$$

where $(z|y)$ denotes the dual product, and the subdifferential $\partial\Phi(y)$ is defined by $z$ satisfying the same expression with $\gamma = 0$. With regard to more advanced strict convergence results, we point the reader to [25,38,36].

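Lemma 3. Suppose $Y$ is a Banach space, and $\Phi : Y \to (-\infty,\infty]$ strongly convex. If $v^k \overset{*}{\rightharpoonup} \hat v \in \operatorname{dom}\partial\Phi$ weakly* in $Y$ and $\Phi(v^k) \to \Phi(\hat v)$, then $v^k \to \hat v$ strongly in $Y$.

Remark 7. By standard convex analysis [30], $v \in \operatorname{dom}\partial\Phi$ if $\Phi$ has a finite-valued point of continuity and $v \in \operatorname{int}\operatorname{dom}\Phi$.

Proof. We first of all note that $-\infty < \Phi(\hat v) < \infty$ because $\hat v \in \operatorname{dom}\partial\Phi$ implies $\hat v \in \operatorname{dom}\Phi$. Let us pick $z \in \partial\Phi(\hat v)$. From the strong convexity of $\Phi$, for some $\gamma > 0$ then

$$\Phi(v^k) - \Phi(\hat v) \ge (z|v^k - \hat v) + \frac{\gamma}{2}\|v^k - \hat v\|_Y^2.$$

Taking the limit infimum, we observe

$$0 = \Phi(\hat v) - \Phi(\hat v) \ge \liminf_{k\to\infty} \frac{\gamma}{2}\|v^k - \hat v\|_Y^2.$$

This proves strong convergence. □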
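As a concrete instance of the strong convexity definition used above, consider the squared-distance fidelity $\Phi(v) = \frac{1}{2}\|f - v\|_Y^2$ on a Hilbert space $Y$, the choice made in Theorem 2. The following short expansion is a sketch under the assumption that $(\cdot|\cdot)$ is the inner product of $Y$. It indicates that this $\Phi$ is strongly convex with constant $\gamma = 1$.

```latex
\begin{align*}
\Phi(y') - \Phi(y)
  &= \tfrac{1}{2}\bigl\|(f - y) - (y' - y)\bigr\|_Y^2 - \tfrac{1}{2}\|f - y\|_Y^2 \\
  &= (y - f \,|\, y' - y) + \tfrac{1}{2}\|y' - y\|_Y^2 .
\end{align*}
```

Here $z = y - f$ is the only element of $\partial\Phi(y)$, so the defining inequality holds with equality, and Lemma 3 applies to this fidelity.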

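We now use the lemma to show strong convergence of minimising sequences.

Lemma 4. Suppose $\Phi$ is strongly convex, satisfies Assumption A-Φ, and that $C \subset Y$ is non-empty, closed, and convex with $\operatorname{int}C \cap \operatorname{dom}\Phi \ne \emptyset$. Let

$$\hat v := \operatorname*{arg\,min}_{v \in C} \Phi(v).$$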

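If $\{v^k\}_{k=1}^\infty \subset C$ with $\lim_{k\to\infty}\Phi(v^k) = \Phi(\hat v)$, then $v^k \to \hat v$ strongly in $Y$.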

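Proof. By the strict convexity of $\Phi$, implied by strong convexity, and the assumptions on $C$, $\hat v$ is unique and well-defined. Moreover $\hat v \in \operatorname{dom}\partial\Phi$. Indeed, our assumptions show the existence of a point $v \in \operatorname{int}C \cap \operatorname{dom}\Phi$. The indicator function $\delta_C$ is then continuous at $v$, and so standard subdifferential calculus (see, e.g., [30, Proposition I.5.6]) implies that $\partial(\Phi + \delta_C)(\hat v) = \partial\Phi(\hat v) + \partial\delta_C(\hat v)$. But $0 \in \partial(\Phi + \delta_C)(\hat v)$ because $\hat v \in \operatorname*{arg\,min}_{v\in Y} \Phi(v) + \delta_C(v)$. This implies that $\partial\Phi(\hat v) \ne \emptyset$. Consequently also $\hat v \in \operatorname{dom}\Phi$, and $\Phi(\hat v) \in \mathbb{R}$.

Using the coercivity of $\Phi$ in Assumption A-Φ we then find that $\{v^k\}_{k=1}^\infty$ is bounded in $Y$, at least after moving to an unrelabelled tail of the sequence with $\Phi(v^k) \le \Phi(\hat v) + 1$. Since $Y$ is a dual space, the unit ball is weak* compact, and we deduce the existence of a subsequence, unrelabelled, and $v \in Y$ such that $v^k \overset{*}{\rightharpoonup} v$ weakly* in $Y$. By the weak* lower semicontinuity (Assumption A-Φ), we deduce

$$\Phi(v) \le \liminf_{k\to\infty} \Phi(v^k) = \Phi(\hat v).$$

Since each $v^k \in C$, and $C$ is closed, also $v \in C$. Therefore, by the strict convexity and the definition of $\hat v$, necessarily $v = \hat v$. Therefore $v^k \overset{*}{\rightharpoonup} \hat v$ weakly* in $Y$, and $\Phi(v^k) \to \Phi(\hat v)$. Lemma 3 now shows that $v^k \to \hat v$ strongly in $Y$. □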

5. Proofs of the main results

We now prove the existence, continuity, and non-degeneracy (interior solution) results of Section 3 through a series of lemmas and propositions, starting from general ones that are then specialised to provide the natural conditions presented in Section 3.

5.1. Existence and lower semicontinuity under lower bounds

Our principal tool for proving existence is the following proposition. We will in the rest of this section concentrate on proving the existence of the set $\mathcal{K}$ in the statement. We base this on the natural conditions of Section 3.

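Proposition 2 (Existence on compact parameter domain). Suppose Assumptions A-KA and A-Φ hold. With $\epsilon \ge 0$ and $\gamma \in (0,\infty]$ fixed, if there exists a compact set $\mathcal{K} \subset \operatorname{int}\mathcal{P}_\alpha$ with

$$\inf_{\alpha \in \mathcal{P}_\alpha^\infty \setminus \mathcal{K}} F(u_{\alpha,\gamma,\epsilon}) > \inf_{\alpha \in \mathcal{P}_\alpha^\infty} F(u_{\alpha,\gamma,\epsilon}), \tag{19}$$

then there exists a solution $\alpha_{\gamma,\epsilon} \in \operatorname{int}\mathcal{P}_\alpha^\infty$ to $(P^{\gamma,\epsilon})$. Moreover, the mapping

$$I^{\gamma,\epsilon}(\alpha) := F(u_{\alpha,\gamma,\epsilon}),$$

is lower semicontinuous within $\operatorname{int}\mathcal{P}_\alpha^\infty$.

The proof depends on the following two lemmas that will be useful later on as well.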

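Lemma 5 (Lower semicontinuity of the fidelity with varying parameters). Suppose Assumptions A-KA, A-Φ, and A-H hold. Let $u^k \overset{*}{\rightharpoonup} u$ weakly* in $X$, and $(\alpha^k,\gamma^k,\epsilon^k) \to (\alpha,\gamma,\epsilon) \in \operatorname{int}\mathcal{P}_\alpha^\infty \times (0,\infty] \times [0,\infty)$. Then

$$J^{\gamma,\epsilon}(u;\alpha) \le \liminf_{k\to\infty} J^{\gamma^k,\epsilon^k}(u^k;\alpha^k).$$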

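Proof. Let $\hat\epsilon > 0$ be such that $\alpha_j \ge \hat\epsilon$, $(j = 1,\ldots,N)$. We then deduce for large $k$ and some constant $C$ that

$$\limsup_k J^{\gamma,\epsilon^k}(u^k;\hat\epsilon,\ldots,\hat\epsilon) \le \limsup_k J^{\gamma^k,\epsilon^k}(u^k;\hat\epsilon,\ldots,\hat\epsilon) + C \le \limsup_k J^{\gamma^k,\epsilon^k}(u^k;\alpha^k) + C < \infty. \tag{20}$$

Here we have assumed the final inequality to hold. This comes without loss of generality, because otherwise there is nothing to prove. Observe that this holds even if $\{\alpha^k\}_{k=1}^\infty$ is not bounded. In particular, if $\epsilon > 0$, restricting $k$ to be large, we may assume that

$$C_1 := \sup_k H(u^k) < \infty. \tag{21}$$

We recall that

$$J^{\gamma,\epsilon}(u;\alpha) := \epsilon H(u) + \Phi(Ku) + \sum_{j=1}^N \alpha_j\|A_j u\|_{\gamma,j}. \tag{22}$$

We want to show lower semicontinuity of each of the terms in turn. We start with the smoothing term. If $\epsilon > 0$, using (21), we write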

kH(uk) = (k−)H(uk) +H(uk)(k−)C1

2 +H(u

k).

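By the convergence $\epsilon^k \to \epsilon$, and the weak* lower semicontinuity of $H$, we find that

$$\epsilon H(u) \le \liminf_{k\to\infty} \epsilon^k H(u^k). \tag{23}$$

If $\epsilon = 0$, we have $\epsilon H(u) = 0\cdot\infty = 0$, while still $0 \le \sup_k \epsilon^k H(u^k) < \infty$. Thus (23) follows.

The fidelity term $\Phi \circ K$ is weak* lower semicontinuous by the continuity of $K$ and the weak* lower semicontinuity of $\Phi$. It therefore remains to consider the terms in (22) involving both the regularisation parameters $\alpha$, as well as the Huberisation parameter $\gamma$. Indeed using the dual formulation (6) of the Huberised norm, we have for some constant $C$ depending on $\Omega$ that

$$\begin{aligned}
\|A_j u^k\|_{\gamma^k,j} &= \sup_{\|\varphi(x)\| \le 1} \int_\Omega \varphi\,dA_j u^k - \frac{1}{2\gamma^k}\int_\Omega \|\varphi\|^2\,dx \\
&\ge \sup_{\|\varphi(x)\| \le 1} \int_\Omega \varphi\,dA_j u^k - \frac{1}{2\gamma}\int_\Omega \|\varphi\|^2\,dx - C|\gamma^{-1} - (\gamma^k)^{-1}| \\
&= \|A_j u^k\|_{\gamma,j} - C|\gamma^{-1} - (\gamma^k)^{-1}|.
\end{aligned}$$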

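Thus, if $\alpha \in \mathcal{P}_\alpha$, we get

$$\begin{aligned}
\alpha^k_j\|A_j u^k\|_{\gamma^k,j} &= \alpha_j\|A_j u^k\|_{\gamma^k,j} + (\alpha^k_j - \alpha_j)\|A_j u^k\|_{\gamma^k,j} \\
&\ge \alpha_j\|A_j u^k\|_{\gamma^k,j} - |\alpha^k_j - \alpha_j|\,\|A_j u^k\|_{\gamma^k,j} \\
&\ge \alpha_j\|A_j u^k\|_{\gamma,j} - C|\gamma^{-1} - (\gamma^k)^{-1}| - |\alpha^k_j - \alpha_j|\,\|A_j u^k\|_{\gamma^k,j}.
\end{aligned}$$

It follows from (20) that the sequence $\{\|A_j u^k\|_{\gamma^k,j}\}_{k=1}^\infty$ of norms in $\mathcal{M}(\Omega;\mathbb{R}^{m_j})$ is bounded for each $j = 1,\ldots,N$. Thus

$$\liminf_{k\to\infty} \alpha^k_j\|A_j u^k\|_{\gamma^k,j} \ge \liminf_{k\to\infty} \alpha_j\|A_j u^k\|_{\gamma,j} \ge \alpha_j\|A_j u\|_{\gamma,j},$$

where the final step follows from Lemma 1.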

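It remains to consider the case that $\alpha \in \mathcal{P}_\alpha^\infty \setminus \mathcal{P}_\alpha$, i.e., when $\alpha_j = \infty$ for some $j$. We may pick sequences $\{\beta_j^\ell\}_{\ell=1}^\infty$, $(j = 1,\ldots,N)$, such that $\beta_j^\ell \nearrow \alpha_j$. Further, we may find $\{\beta_j^{k,\ell}\}_{\ell=1}^\infty$ such that $\beta_j^{k,\ell} \le \alpha^k_j$ with $\alpha^k_j = \lim_{\ell\to\infty}\beta_j^{k,\ell}$ and $\beta_j^\ell = \lim_{k\to\infty}\beta_j^{k,\ell}$. Then, by the bounded case studied above

$$\liminf_{k\to\infty} \alpha^k_j\|A_j u^k\|_{\gamma^k,j} \ge \liminf_{k\to\infty} \beta_j^{k,\ell}\|A_j u^k\|_{\gamma^k,j} \ge \beta_j^\ell\|A_j u\|_{\gamma,j}.$$

But $\{\alpha^k_j\|A_j u^k\|_{\gamma^k,j}\}_{k=1}^\infty$ is bounded by (20), and

$$\liminf_{\ell\to\infty} \beta_j^\ell\|A_j u\|_{\gamma,j} \ge \alpha_j\|A_j u\|_{\gamma,j}.$$

Thus lower semicontinuity follows. □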

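Lemma 6 (Convergence of reconstructions away from boundary). Suppose Assumptions A-KA, A-Φ, and A-H hold. Let $(\alpha^k,\gamma^k,\epsilon^k) \to (\alpha,\gamma,\epsilon)$ in $\operatorname{int}\mathcal{P}_\alpha^\infty \times (0,\infty] \times [0,\infty)$. Then we can find $u_{\alpha,\gamma,\epsilon} \in \operatorname*{arg\,min} J^{\gamma,\epsilon}(\cdot;\alpha)$ and extract a subsequence satisfying

$$J^{\gamma^k,\epsilon^k}(u_{\alpha^k,\gamma^k,\epsilon^k};\alpha^k) \to J^{\gamma,\epsilon}(u_{\alpha,\gamma,\epsilon};\alpha), \tag{24a}$$

$$u_{\alpha^k,\gamma^k,\epsilon^k} \overset{*}{\rightharpoonup} u_{\alpha,\gamma,\epsilon} \text{ weakly* in } X, \text{ and} \tag{24b}$$

$$Ku_{\alpha^k,\gamma^k,\epsilon^k} \to Ku_{\alpha,\gamma,\epsilon} \text{ strongly in } Y. \tag{24c}$$

Proof. By Lemma 5, we have

$$\liminf_{k\to\infty} J^{\gamma^k,\epsilon^k}(u_{\alpha^k,\gamma^k,\epsilon^k};\alpha^k) \ge J^{\gamma,\epsilon}(u_{\alpha,\gamma,\epsilon};\alpha) = \min_{u\in X} J^{\gamma,\epsilon}(u;\alpha).$$

We also want the opposite inequality

$$\limsup_{k\to\infty} J^{\gamma^k,\epsilon^k}(u_{\alpha^k,\gamma^k,\epsilon^k};\alpha^k) \le J^{\gamma,\epsilon}(u_{\alpha,\gamma,\epsilon};\alpha). \tag{25}$$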

Letδ >0.If= 0,weuse Assumption A-H onu=uα,γ,,toproduce.Otherwise,weset =uα,γ,.In

eithercase
