Minimization Method - Nonvariational PDEs

Chapter 3 Nonvariational PDEs

3.3 Minimization Method

In the following we look at a derivation for a new method which comes from min- imizing the problem. We will first give an outline of the continuous formulation of the problem, before moving to the discrete case which will be the basis for our numerical method. Specifically the saddle point formulation in section3.3.1is used in the computations.

First of all, in order to set up the problem in general terms, letV be a Hilbert space with inner product a(·,·) : V ×V → _R (i.e. a symmetric, positive definite, bilinear form). We wish to consider two different cases for this method. In the first case we will choose V = L2(Ω) and a(v, w) = R_Ωvwdx and in the second we will choose V =H1(Ω) anda(v, w) =R

Ω∇v· ∇wdx+βh −1R

∂Ωvwds, whereβ >0. Now the Riesz representation theorem says that for every w ∈ V∗, there exists a uniqueu∈V such that,

whereh·,·i:V∗×V →_Ris the dual pairing. From this we can define an invertible projection operatorN :V∗ →V which has the property

a(Nw, v) =hw, vi, ∀v∈V, (3.10) We can additionally define a norm usingN and aas

kwk2_N :=a(Nw,Nw), ∀w∈V∗. (3.11) We note that this norm is defined in the dual ofV, however in the case ofV =L2, they are equivalent.

Using the above definitions, we can reformulate (3.1) as a minimization problem as follows.

Definition 3.3.1 (Continuous Minimization Formulation). Let u ∈ H2(T) ∩

H₀1(Ω)⊂V such that forJ :H2(T)∩H₀1(Ω)→_R+

J(u) := 1 2kA:D

hu+fk2N → min, (3.12)

Then we call (3.12) thecontinuous minimization formulation of (3.1).

HereD2_hu∈L2(Ω) is a piecewise approximation to the Hessian, i.e. D2_hv|_K =

D2v|K for allK ∈ T, which we will use for the remainder of this method.

Remark. We note that it is necessary to haveu∈H2(T)∩H₀1(Ω) for the continuous case, however for the discrete case we will instead be working withu∈Vh.

Now by formulating the Euler-Lagrange equation of the above, by taking the functional derivative, we get a variational version of (3.12).

Definition 3.3.2 (Continuous Euler-Lagrange Formulation). Let u∈W such that

where the right-hand side is defined by

l(ϕ) =−a(Nf,N(A:D_h2ϕ)).

Then (3.13) is thecontinuous Euler-Lagrange formulationof (3.12).

We note that this formulation is written in a general way so that it can take different forms depending on the choices ofa(·,·) (and by extensionV). These different cases will be considered later on.

Now let us consider a discrete version of the problem. Note that the discrete version keeps virtually the same structure, with the only changes being the use of

Vh,Nh, and uh, which are discrete versions of the above.

In particular N_h is a standard Galerkin approximation to N, and takes a similar form to (3.10).

Definition 3.3.3 (Nh projection). Recall Vh = C0(Ω)∩Pk(T). We define Nh :

V∗→Vh by

a(Nhvh, ϕh) = (vh, ϕh)L2, ∀ϕ_h∈V_h. (3.14)

First, we note that we make a standard assumption related to the difference between our non-discrete and discrete projection operatorsN and N_h.

Assumption 3.3.1. For Nv∈Hk, k∈_Nand 0≤m≤k

k(N − N_h)vk_m ≤Chk−mkNvk_k. (3.15) We can also derive the following bound on the discrete projector by the non-discrete version.

Lemma 3.3.1 (N_h bound). Let kvka:=a(v, v) for v∈V. Then

Proof.

kN_hvk2

a=a(Nhv,Nhv),

=a(Nv,N_hv), (due to Galerkin orthogonality ofN_h)

≤CkNvk_akN_hvk_a, (by boundedness ofa(·,·))

Thus we divide through bykN_hvk_a to getkN_hvk_a≤CkNvk_a. Having defined these discretized concepts, let us now consider the minimization formulation which we will use as the basis for our finite element method.

Definition 3.3.4(Euler-Lagrange Formulation). Letuh∈Vh such that

a(N_h(A:D2_huh),Nh(A:Dh2ϕh)) +s(uh, ϕh) =lh(ϕh), ∀ϕh∈Vh, (3.17) where we have added a stabilization term s(·,·), defined by

s(v, w) := Z E β1hpJ∇vK·J∇wKds+β2h q Z Ω A:D_h2vA:D_h2wdx+β3hr Z ∂Ω vwds,

whereβ1, β2, β3 >0 are parameters, his the grid size, p, q, r∈Zand lh(·) is

lh(v) :=−a(Nhf,Nh(A:Dh2v))−β2hq

Ω

f A:D2_hvdx.

Then (3.17) is the discreteEuler-Lagrange formulationof (3.1). For convenience we denote the left hand side of (3.17) by

bh(v, w) :=a(Nh(A:D_h2v),Nh(A:D2_hw)) +s(v, w). (3.18)

Remark. The values ofp, q, rwill vary depending on the choice ofa(·,·), as this will change the h-scaling of the first term inbh, thus we leave them unspecified.

It is quite easy to show Galerkin orthogonality for this problem, which shows that it is consistent with the original one.

solution to (3.1), and bh(·,·) be defined as in (3.18). Then we have bh(u, v) =lh(v), ∀v∈Vh. (3.19) Consequently, bh(u−uh, v) = 0, ∀v∈Vh. Proof. bh(u, v)−lh(v), =a(Nh(A:D2hu),Nh(A:D2hv)) + Z E β1hpJ∇uK·J∇vKds+β2h q₍_A_:_D2 hu, A:D2hv)L2, +β3hr Z ∂Ω uvds+a(Nhf,NhA:D2hv) +β2hq(A:Dh2f, A:D2hv)L2 =(A:D_h2u+f,Nh(A:D2_hv))L2 +β₂hq(A:D_h2u+f, A:D_h2v)_L2 +β3hr Z ∂Ω uvds, (using (3.14) and _J∇u_K= 0), =0,

where we have used the fact −A:D2

hu=f in Ω andu= 0 on ∂Ω. We can also prove the existence of a unique solution fairly easily due to the choice of stabilization term.

Lemma 3.3.3(Existence and Uniqueness). Assume that the original equation (3.1) has a unique solution uh ∈ W. Then for β1, β2, β3 >0, the approximation (3.17)

has a unique solution uh ∈Vh.

Proof. This follows by taking f = 0 and showing the solution uh must be zero. Settingf = 0 in (3.17) and takingϕh=uh gives

0 =a(N_h(A:D2_huh),Nh(A:Dh2uh)) +s(uh, uh), =kNh(A:Dh2uh)k2a+β1hpkJ∇uhKk

E+β2hqkA:Dh2uhk20+β3hrkuhk2∂Ω. wherek · kE denotes the L2 norm over the skeleton of the grid and k · k∂Ω denotes theL2 norm over the boundary.

Since every term in this expression is≥0, they must each be zero. Thus we have J∇uhK= 0 on E, which means uh is smooth across elements, so uh ∈ C

1_(Ω). We already had uh ∈C0(Ω)∩Pk(T), so uh ∈C1(Ω)∩Pk(T), which by extension means uh ∈ H2(Ω). Additionally since uh|∂Ω = 0, we have that uh ∈ W. Next sinceA:D2_huh = 0 (and uh|∂Ω = 0),uh is a piecewise solution to (3.1) whenf = 0. Since at the start we assumed the solution to (3.1) was unique, and uh ∈ W, this

meansuh = 0.

Let us now return to the two specific examples of the method, which come from selecting appropriate choices forV and a(·,·).

Example 3.3.1 (L2 Minimization). Let V =L2(Ω) and a(v, w) = R_Ωvwdx (thus

k · kN is simply theL2 norm). In this case Nh defined in (3.14) is the discrete L2 projectionPh:L2(Ω)→Vh defined by Z Ω Phvϕhdx= Z Ω vϕhdx, ∀ϕh ∈Vh. (3.20) So the discrete problem becomes

Ω

Ph(A:D2huh)Ph(A:Dh2ϕh) dx+s(uh, ϕh) =lh(ϕh), ∀ϕ∈Vh, (3.21) where we choosep=−1, q= 0, r=−3 ins(uh, ϕh).

Remark 1. The formulation in (3.21) is similar to the one presented in Mu and Ye

[2017], with the discrete projectionPh and a different penalty term being the main differences.

Remark 2. The choice ofp, q and r comes from examining how each term scales in magnitude with regards to the grid sizeh. For the main component,

Ω

Ph(A:Dh2uh)Ph(A:D2hϕh) dx

we can state that the D_h2uh and Dh2ϕh each provide approximately O(h−2) scaling, whilst the integral dx itself provides O(h2) scaling, giving O(h−2h−2h2) =O(h−2)

overall. Applying similar logic to thes(uh, ϕh) terms, we find that β1hp Z EJ ∇v_K0·J∇wK0ds≈ O(h p_h−2_h_{) =}_O₍_hp−1₎ β2hq Z Ω A:D2_hv, A:D2_hwdx≈ O(hqh−4h2) =O(hq−2) β3hr Z ∂Ω vwds≈ O(hrh0h) =O(hr+1)

In order to match the originalO(h−2), we thus choosep=−1, q= 0, r=−3.

Example 3.3.2(H−1Minimization). LetV =H1_{(Ω) and}_a₍_{v, w}_{) =}R

Ω∇v·∇wdx+

β3h−1

∂Ωvwds(in which casek·k 2

a=|·|21+h−1k·k2∂Ωwhere|·|1is theH1seminorm). In this case,Nh is the (discrete) Ritz projectionNh :H−1(Ω)→Vh defined by

Z Ω ∇Nhw∇ϕhdx+ β3 h Z ∂Ω Nhwϕhds= Z Ω wϕhdx, ∀ϕh∈Vh. (3.22) This gives us the following approximation.

Z Ω ∇N_h(A:D2_huh)· ∇Nh(A:Dh2ϕh) dx +β3 h Z ∂Ω Nh(A:D_h2uh)Nh(A:D2_hϕh) ds+s(uh, ϕh) =lh(ϕh), ∀ϕh∈Vh, (3.23)

where we choosep= 1, q= 2, r=−1 ins(uh, ϕh) using the same logic as in example

3.3.1, remark 2.

Remark. In this case the original problem is equivalent to the minimization of

A:D_h2uh−f in theH−1 norm.

Remark 2. This form for a(·,·) comes from Nitsche’s method (see Freund and Stenberg[1995]) for solving Poisson’s equation. The second term is added to weakly impose boundary conditions so that a(·,·) is invertible (the alternative would be working in V = H₀1 and not using the term). We note that there exist additional terms R

∂Ω∇v ·nwds−

∂Ω∇w·nvds in the full method that can be added for consistency, however as they do not appear to affect our numerical results we do not use them to simplify the presentation.

3.3.1 Saddle Point Formulation

One potential weakness to the forms (3.21) and (3.23) is that they can be tricky to implement numerically. For one, Ph and Nh require extra effort to implement. Another problem is theD_h2ϕterm present in these equations, since in later sections (specifically we refer to example 3.5.1) we plan to replace D_h2 with H[·], and H[ϕ] lacks an implementation inDune-Fempy.

Thus we look here to rewrite the approximation into the form of a saddle point problem. This is the form we will use later in computational results, and we can also show directly the appeal of example3.3.2in terms of the complexity of the problem. To obtain this form, we introduce an additional variable.

σ=−N_h(A:D_h2uh+f). (3.24) Taking a(·, ϕh) of both sides of (3.24), and substituting σ into (3.17), we get two equations,

a(σ, ϕh) =−a(Nh(A:D2huh+f), ϕh),

a(σ,N_h(A:D2_hϕh))−s(uh, ϕh) =−β2hq(f, A:Dh2ϕh).

Recalling property (3.14), i.e. a(N_hw, v) = (w, v)_L2, and rearranging the first equa-

tion, this can be rewritten as

a(σ, ϕh) + (A:D_h2uh, ϕh) =−(f, ϕh), (σ, A:D2_hϕh)−s(uh, ϕh) =−β2hq(f, A:D2hϕh).

Lettingb(v, w) := (A:D_h2v, w), we can think of the above in terms of bilinear forms instead.

a(σ, ϕh) +b(uh, ϕh) =−(f, ϕh),

We can then rewrite this system in matrix-vector form as   M B BT −S     σ u  =−   f g   (3.25)

where here we can think ofB as representingb(·, ϕh),S as representings(·, ϕh) and

M asa(·, ϕh). The Schur complement of this is then the following.

BTM−1B+S

u=−BTM−1f−g. (3.26) Thus we can see the differences between the two examples in terms of the order of the complexity of the problem. In the case a(·,·) is the L2 inner product, M

is approximately the identity, i.e. of order 0. On the other hand if a(v, w) =

Ω∇v· ∇wdx+β3h −1R

∂Ωvwds, thenM becomes the discrete Laplace operator, so

M−1 is of order -2.

From this we can see that the primary advantage of the H−1 approach in comparison to theL2_{version is that the order of the method is}_O₍_h2_h−2_h2_{) =}_O₍_h2_), in comparison toO(h4).

In document Software for finite element methods and its application to nonvariational problems (Page 96-104)