
arXiv:2108.13767v1 [math.NA] 31 Aug 2021

Stochastic Discontinuous Galerkin Methods for Robust Deterministic Control of Convection Diffusion Equations with Uncertain Coefficients

Pelin Çiloğlu^a, Hamdullah Yücel^{a,∗}

^a Institute of Applied Mathematics, Middle East Technical University, 06800 Ankara, Turkey

Abstract

We investigate the numerical behaviour of a robust deterministic optimal control problem governed by a convection diffusion equation with random coefficients by approximating statistical moments of the solution. A stochastic Galerkin approach, which turns the original stochastic problem into a system of deterministic problems, is used to handle the stochastic domain, whereas a discontinuous Galerkin method is used to discretize the spatial domain due to its better convergence behaviour for convection dominated optimal control problems. A priori error estimates are derived for the state and adjoint in the energy norm and for the deterministic control in the L²–norm. To handle the curse of dimensionality of the stochastic Galerkin method, we take advantage of a low–rank variant of the GMRES method, which reduces both the storage requirements and the computational complexity by exploiting the Kronecker–product structure of the system matrices. The efficiency of the proposed methodology is illustrated by numerical experiments on benchmark problems with and without control constraints.

Keywords: PDE-constrained optimization, uncertainty quantification, stochastic discontinuous Galerkin, error estimates, low–rank approximations

2010 MSC: 35R60, 49J20, 60H15, 60H35

∗Corresponding author

Email addresses: [email protected] (Pelin Çiloğlu), [email protected] (Hamdullah Yücel)


1. Introduction

In many problems in physics and engineering, one wishes to optimize certain parameters of a model in order to obtain a more desirable outcome, e.g., optimal shapes of airplane wings, the temperature control of a melting process, or the locations of the points where the liquid is inserted into the medium. Such real-world problems can be formulated as optimal control problems or optimization problems with PDE constraints. However, in reality, the input parameters of these simulations, such as the wind speed or material properties, are often not known exactly due to a lack of knowledge or inherent variability in the system; see, e.g., [1]. For instance, the highly heterogeneous subsurface properties in groundwater flow simulations can only be measured at relatively few locations, so at other locations these properties are subject to uncertainty; similarly, wind gusts induce unexpected fluctuations in the flow field around an aircraft wing. In the last decade, the idea of uncertainty quantification, i.e., quantifying the effects of uncertainty on the result of a computation, has been the subject of growing interest in the scientific community.

PDE–constrained optimization problems with uncertainty have been studied in various formulations in the literature, such as mean–based control [2, 3], pathwise control [4, 5], average control [6, 7], robust deterministic control [8, 9, 10, 11, 12, 13], and stochastic control [14, 15, 16, 17]. Robust deterministic control is more practical and realistic since randomness in the system is not observable at the time of designing the control. Therefore, we are interested here in a robust deterministic control problem, in which a suitable statistical measure of the objective function is minimized:

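\[
\min_{u \in \mathcal{U}^{ad}} \; \mathcal{J}(y,u) := \frac{1}{2}\|y - y^d\|_{\mathcal{X}}^2 + \frac{\gamma}{2}\|\mathcal{S}(y)\|_{\mathcal{W}}^2 + \frac{\mu}{2}\|u\|_{\mathcal{U}}^2 \tag{1.1}
\]

subject to

\[
\begin{aligned}
\mathcal{A}(x,\omega)\,y(x,\omega) &= f(x) + u(x) && \text{in } \mathcal{D}\times\Omega, && \text{(1.2a)}\\
y(x,\omega) &= 0 && \text{on } \partial\mathcal{D}\times\Omega, && \text{(1.2b)}
\end{aligned}
\]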

where $\mathcal{A} : \mathcal{Y} \to \mathcal{Y}'$ is a linear operator that contains uncertain parameters, $\mathcal{D} \subset \mathbb{R}^2$ is a convex bounded polygonal spatial domain with a Lipschitz boundary $\partial\mathcal{D}$, and $\Omega$ is a sample space of events. The function $\mathcal{J}(y,u)$ is a cost functional of tracking–type, which includes a risk penalization via the standard deviation.

The first term in (1.1) is a measure of the distance between the state variable y and the target function y^d in terms of expectation of y − y^d. Without loss of generality, we assume that the state y ∈ Y is a random field, whereas the desired state y^d ∈ Y is modelled deterministically. The second term measures the standard deviation of y, which is added since it is desirable to have a control for which the state is more accurately known, leading to a risk averse optimum.

The last term corresponds to distributive deterministic control. The constant µ > 0 is a regularization parameter for the penalization of the action of the control u, whereas γ ≥ 0 is the risk–aversion parameter. We note that the objective functional J is a deterministic quantity with uncertain terms. Further, the closed convex admissible set in the control space U is defined by

\[
\mathcal{U}^{ad} := \{u \in \mathcal{U} : u_a \le u(x) \le u_b, \ \forall x \in \mathcal{D}\}, \tag{1.3}
\]

where $u_a, u_b \in \mathbb{R}$ are constants with $u_a \le u_b$.

Finding an approximate solution for the optimization problem (1.1)–(1.2) is extremely challenging and requires far more computational resources than its deterministic counterpart. In the literature, there exist three competing methods to solve such problems: the Monte Carlo (MC) method [5, 18, 19], the stochastic collocation method (SCM) [20, 21, 17, 13], and the stochastic Galerkin method (SGM) [9, 12, 13, 22]. Although the MC method is popular for its simplicity, natural parallelization, and broad applicability, it features slow convergence, which does not depend on the number of uncertain parameters; e.g., the mean value typically converges as $1/\sqrt{N}$, where $N$ is the number of realizations [23, 24]. For the SCM, the crucial issue is how to construct the set of collocation points appropriately because the choice of the collocation points determines the efficiency of the method. In contrast to the MC approach and the SCM, the SGM is a nonsampling approach, which transforms a PDE with random coefficients into a large system of coupled deterministic PDEs. As in the classical (deterministic) Galerkin method, the idea behind the SGM is to seek a solution of the model equation such that the residual is orthogonal to the space of polynomials. An important feature of this technique is the separation of the spatial and stochastic variables, which allows the reuse of established numerical techniques. The results obtained in [13] also show that the SGM generally displays superior performance compared to the SCM for robust deterministic control problems. On the other hand, for the discretization of the spatial domain, we use a discontinuous Galerkin method due to its better convergence behaviour for optimization problems governed by convection dominated problems; see, e.g., [25, 26, 27]. Compared with the discontinuous Galerkin method, the finite difference method cannot handle complex geometries, the finite volume method is not capable of achieving high–order accuracy, and the standard continuous finite element method lacks local mass conservation [28, 29].
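The $1/\sqrt{N}$ Monte Carlo rate quoted above can be observed on a toy quantity. The following is a minimal sketch, not part of the method of this paper: it assumes a single random variable $\xi \sim U(-1,1)$ with known mean zero, and the function name `mc_rmse` and all sample sizes are hypothetical choices made for illustration.

```python
import numpy as np

def mc_rmse(n_samples, n_repeats, rng):
    """Root-mean-square error of the sample mean of xi ~ U(-1, 1) (exact mean 0)."""
    samples = rng.uniform(-1.0, 1.0, size=(n_repeats, n_samples))
    return float(np.sqrt(np.mean(samples.mean(axis=1) ** 2)))

rng = np.random.default_rng(42)           # fixed seed so the run is reproducible
rmse_small = mc_rmse(100, 400, rng)       # N = 100 realizations, repeated 400 times
rmse_large = mc_rmse(10_000, 400, rng)    # N = 10 000 realizations
# A 100-fold increase in N should reduce the error by about sqrt(100) = 10,
# up to the sampling noise of the RMSE estimate itself.
ratio = rmse_small / rmse_large
```

Under these assumptions the ratio should be close to 10, which illustrates why reducing the error by one digit costs roughly one hundred times more samples.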

In spite of these nice properties exhibited by the stochastic discontinuous Galerkin method, the dimension of the resulting linear system increases rapidly, which is known as the curse of dimensionality. We address this issue by applying a low–rank variant of the generalized minimal residual (GMRES) method [30] with suitable preconditioners, which reduces both the storage requirements and the computational complexity by exploiting the Kronecker–product structure of the system matrices; see, e.g., [31, 32, 33]. Low–rank approximations for optimal control problems with uncertain inputs have also been studied in [14, 34, 35, 36] for unconstrained control problems and in [36] for control constrained problems. In the aforementioned studies, randomness is generally defined in the diffusion parameter; here, however, we consider randomness in both the diffusion and the convection parameters, and we use a discontinuous Galerkin method in the spatial domain.
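The saving that a Kronecker–product structure offers can be illustrated with the standard identity $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(B X A^{T})$. The sketch below is only an illustration under stated assumptions: the matrices are random stand-ins rather than the system matrices of this paper, and the sizes, the rank, and the helper name `kron_matvec` are hypothetical.

```python
import numpy as np

def kron_matvec(A, B, X):
    """Return B X A^T, i.e. (A kron B) vec(X) reshaped as a matrix (column-stacking vec)."""
    return B @ X @ A.T

rng = np.random.default_rng(0)
J, n, r = 6, 40, 2                     # hypothetical: stochastic dofs, spatial dofs, rank
A = rng.standard_normal((J, J))        # stand-in for a stochastic Galerkin matrix
B = rng.standard_normal((n, n))        # stand-in for a spatial (DG) matrix
U = rng.standard_normal((n, r))
V = rng.standard_normal((J, r))
X = U @ V.T                            # rank-r iterate stored through its factors

# Reference: form the (nJ) x (nJ) Kronecker matrix explicitly.
full = np.kron(A, B) @ X.reshape(-1, order="F")
fast = kron_matvec(A, B, X).reshape(-1, order="F")
kron_error = float(np.max(np.abs(full - fast)))

# In factored form the product keeps rank r: B X A^T = (B U)(A V)^T.
low_rank_error = float(np.max(np.abs((B @ U) @ (A @ V).T - kron_matvec(A, B, X))))
```

Working with the factors `U` and `V` requires storing $(n + J)\,r$ numbers instead of $nJ$, which is the kind of reduction a low–rank Krylov solver relies on; the truncation step that keeps the rank small between iterations is not shown here.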

The rest of the paper is organized as follows: In the next section, we discuss the existence and uniqueness of the solution for our model problem. In Section 3, we reduce the problem to a finite dimensional setting via the Karhunen–Loève (KL) expansion, the stochastic Galerkin method, and the discontinuous Galerkin method. A priori error estimates are given in Section 4. In Section 5, we construct the matrix formulation of the underlying optimization problem by following the optimize-then-discretize approach and then discuss the implementation of the low–rank GMRES solver. Numerical results are given in Section 6 to show the efficiency of the proposed approach. Finally, we draw some conclusions in Section 7 based on the findings in this paper.

2. Existence and uniqueness of the solution

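Let $\mathcal{D} \subset \mathbb{R}^2$ be a convex bounded polygonal spatial domain with a Lipschitz boundary $\partial\mathcal{D}$, and let the triplet $(\Omega, \mathcal{F}, \mathbb{P})$ denote a complete probability space, where $\Omega$ is a sample space of events, $\mathcal{F} \subset 2^{\Omega}$ denotes a σ–algebra, and $\mathbb{P} : \mathcal{F} \to [0,1]$ is the associated probability measure. A generic random field $\eta$ on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is denoted by $\eta(x,\omega) : \mathcal{D}\times\Omega \to \mathbb{R}$. For a fixed $x \in \mathcal{D}$, $\eta(x,\cdot)$ is a real–valued square integrable random variable, $\eta(x,\cdot) \in L^2(\Omega, \mathcal{F}, \mathbb{P})$, where

\[
L^2(\Omega) := L^2(\Omega, \mathcal{F}, \mathbb{P}) := \Big\{ X : \Omega \to \mathbb{R} \;:\; \int_{\Omega} |X(\omega)|^2 \, d\mathbb{P}(\omega) < \infty \Big\}.
\]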

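Then, the mean $\mathbb{E}[\eta]$, the standard deviation $\mathcal{S}(\eta)$, and the corresponding variance $\mathbb{V}(\eta)$ of any random variable $\eta$ defined on $(\Omega, \mathcal{F}, \mathbb{P})$ are given, respectively, by

\[
\mathbb{E}[\eta] = \int_{\Omega} \eta \, d\mathbb{P}(\omega), \qquad
\mathcal{S}(\eta) = \Big( \int_{\Omega} (\eta - \mathbb{E}[\eta])^2 \, d\mathbb{P}(\omega) \Big)^{1/2}, \qquad
\mathbb{V}(\eta) = [\mathcal{S}(\eta)]^2 = \mathbb{E}[\eta^2] - (\mathbb{E}[\eta])^2.
\]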
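As a quick numerical check of these definitions, the sketch below evaluates the mean, standard deviation, and variance by Gauss–Legendre quadrature. It is illustrative only and assumes a single uniform random variable $\xi \sim U(-1,1)$ and a hypothetical affine random variable $\eta = 1 + \xi/2$, neither of which comes from the paper.

```python
import numpy as np

# Gauss-Legendre rule on [-1, 1]; halving the weights accounts for the
# uniform density rho(xi) = 1/2 (assumed here for illustration).
nodes, weights = np.polynomial.legendre.leggauss(20)
weights = weights / 2.0

def expectation(g):
    """Approximate E[g(xi)] for xi ~ U(-1, 1) by quadrature."""
    return float(np.sum(weights * g(nodes)))

def eta(xi):
    """Hypothetical random variable eta = 1 + xi / 2."""
    return 1.0 + 0.5 * xi

mean = expectation(eta)
variance = expectation(lambda xi: (eta(xi) - mean) ** 2)      # [S(eta)]^2
std = float(np.sqrt(variance))                                # S(eta)
variance_alt = expectation(lambda xi: eta(xi) ** 2) - mean ** 2  # E[eta^2] - E[eta]^2
```

Both expressions for the variance agree, and for this example they equal $\tfrac14\,\mathbb{E}[\xi^2] = 1/12$.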

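Recalling the tensor–product space $H^k(\mathcal{D}) \otimes L^2(\Omega)$, which is endowed with the norm

\[
\|\eta\|_{H^k(\mathcal{D}) \otimes L^2(\Omega)} := \Big( \int_{\Omega} \|\eta(\cdot,\omega)\|_{H^k(\mathcal{D})}^2 \, d\mathbb{P}(\omega) \Big)^{1/2} < \infty, \tag{2.1}
\]

we define the spaces of the state and the control, respectively,

\[
\mathcal{Y} := H_0^1(\mathcal{D}) \otimes L^2(\Omega) \quad \text{and} \quad \mathcal{U} := L^2(\mathcal{D}).
\]

We also set $\mathcal{X} := L^2(\mathcal{D}) \otimes L^2(\Omega)$ and $\mathcal{W} := L^2(\mathcal{D})$. It is noted that $\mathcal{U}, \mathcal{Y} \subset L^2(\mathcal{D}) \otimes L^2(\Omega) = \mathcal{X}$. Further, $\mathcal{Y}'$ is the dual of the space $\mathcal{Y}$.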

In order to show the existence of the solution, we assume that the operator A satisfies the following conditions:

a) $\mathcal{A}$ is coercive: there exists a constant $c > 0$ such that, $\mathbb{P}$-a.s., $(\mathcal{A}v, v) \ge c\,\|v\|_{\mathcal{X}}^2$, $\forall v \in \mathcal{X}$.

b) (Au, v) = (u, A^∗v) ∀u, v ∈ X , where A^∗ is the adjoint of A.

By following the standard arguments in the theory of optimal control, see, e.g., [37, Theorem 1.3] and [38, Theorem 2.14], the existence and uniqueness of an optimal solution for the optimization problem (1.1)–(1.2) can be proved.

With the definitions above, $\mathcal{Y}$ and $\mathcal{U}$ are Hilbert spaces, the functional $\mathcal{J}$ is strictly convex, and the admissible set $\mathcal{U}^{ad}$ is closed and convex. Then, according to Lions' lemma [37, Theorem 1.3], there exists a unique optimal control $\bar u \in \mathcal{U}$, which satisfies the following variational inequality:

\[
\mathcal{J}'(\bar u) \cdot (v - \bar u) \ge 0, \qquad \forall v \in \mathcal{U}^{ad}, \tag{2.2}
\]

where

\[
\mathcal{J}'(\bar u) \cdot w := \lim_{h \to 0^+} \frac{\mathcal{J}(\bar u + h w) - \mathcal{J}(\bar u)}{h} \tag{2.3}
\]

is the directional derivative of $\mathcal{J}$ at $\bar u \in \mathcal{U}^{ad}$ along the direction $w \in \mathcal{U}^{ad}$. Now, we can state the first–order optimality conditions of the robust deterministic optimal control problem (1.1)–(1.2).

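Theorem 2.1. The optimal control problem (1.1)–(1.2) has a unique solution $(y, u)$ if and only if there exists an adjoint variable $p \in \mathcal{Y}$ such that the triplet $(y, u, p)$ satisfies, $\mathbb{P}$-a.s., the following optimality system:

\[
\begin{aligned}
\mathcal{A}\,y(u) &= f(x) + u(x), && y(u) \in \mathcal{Y}, && \text{(2.4a)}\\
\mathcal{A}^{*}p(u) &= y(u) - y^d + \gamma\big(y(u) - \mathbb{E}[y(u)]\big), && p(u) \in \mathcal{Y}, && \text{(2.4b)}\\
\mathbb{E}\Big[\int_{\mathcal{D}} (p(u) + \mu u)\cdot(v - u)\,dx\Big] &\ge 0, && u, v \in \mathcal{U}^{ad}. && \text{(2.4c)}
\end{aligned}
\]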

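Proof. Rewrite the objective functional $\mathcal{J}$ as

\[
\mathcal{J}(u) = \underbrace{\frac{1}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(y(u) - y^d\big)^2 dx\Big]}_{\mathcal{J}_1(u)}
+ \underbrace{\frac{\gamma}{2}\mathbb{E}\Big[\int_{\mathcal{D}} y(u)^2\,dx\Big]}_{\mathcal{J}_2(u)}
- \underbrace{\frac{\gamma}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(\mathbb{E}[y(u)]\big)^2 dx\Big]}_{\mathcal{J}_3(u)}
+ \underbrace{\frac{\mu}{2}\mathbb{E}\Big[\int_{\mathcal{D}} u^2\,dx\Big]}_{\mathcal{J}_4(u)}.
\]

By the definition of the directional derivative (2.3), we obtain

\[
\begin{aligned}
\mathcal{J}_1'(u)\cdot(v-u)
&= \lim_{h\to 0^+} \frac{\mathbb{E}\big[\int_{\mathcal{D}} \big(y(u+h(v-u)) - y^d\big)^2 dx\big] - \mathbb{E}\big[\int_{\mathcal{D}} \big(y(u) - y^d\big)^2 dx\big]}{2h}\\
&= \lim_{h\to 0^+} \frac{\mathbb{E}\big[\int_{\mathcal{D}} \big(y(u+h(v-u))^2 - y(u)^2\big)\,dx\big]}{2h}
 - \lim_{h\to 0^+} \frac{\mathbb{E}\big[\int_{\mathcal{D}} 2\big(y(u+h(v-u)) - y(u)\big)\,y^d\,dx\big]}{2h}\\
&= \mathbb{E}\Big[\int_{\mathcal{D}} \big(y(u) - y^d\big)\, y'(u)\cdot(v-u)\,dx\Big],
\end{aligned}
\]

\[
\begin{aligned}
\mathcal{J}_2'(u)\cdot(v-u)
&= \gamma \lim_{h\to 0^+} \frac{\mathbb{E}\big[\int_{\mathcal{D}} y(u+h(v-u))^2\,dx\big] - \mathbb{E}\big[\int_{\mathcal{D}} y(u)^2\,dx\big]}{2h}\\
&= \gamma\,\mathbb{E}\Big[\int_{\mathcal{D}} y(u)\, y'(u)\cdot(v-u)\,dx\Big],
\end{aligned}
\]

\[
\begin{aligned}
\mathcal{J}_3'(u)\cdot(v-u)
&= \gamma \lim_{h\to 0^+} \frac{\mathbb{E}\big[\int_{\mathcal{D}} \big(\mathbb{E}[y(u+h(v-u))]\big)^2 dx\big] - \mathbb{E}\big[\int_{\mathcal{D}} \big(\mathbb{E}[y(u)]\big)^2 dx\big]}{2h}\\
&= \gamma\,\mathbb{E}\Big[\int_{\mathcal{D}} \mathbb{E}[y(u)]\, y'(u)\cdot(v-u)\,dx\Big],
\end{aligned}
\]

\[
\begin{aligned}
\mathcal{J}_4'(u)\cdot(v-u)
&= \mu \lim_{h\to 0^+} \frac{\mathbb{E}\big[\int_{\mathcal{D}} (u+h(v-u))^2\,dx\big] - \mathbb{E}\big[\int_{\mathcal{D}} u^2\,dx\big]}{2h}\\
&= \mu \lim_{h\to 0^+} \frac{\mathbb{E}\big[\int_{\mathcal{D}} \big(h^2(v-u)^2 + 2hu(v-u)\big)\,dx\big]}{2h}
 = \mu\,\mathbb{E}\Big[\int_{\mathcal{D}} u\cdot(v-u)\,dx\Big].
\end{aligned}
\]

Hence, by combining all terms, we have

\[
\begin{aligned}
\mathcal{J}'(u)\cdot(v-u)
&= \mathbb{E}\Big[\int_{\mathcal{D}} \big(y(u) - y^d\big)\, y'(u)\cdot(v-u)\,dx\Big]
 + \gamma\,\mathbb{E}\Big[\int_{\mathcal{D}} y(u)\, y'(u)\cdot(v-u)\,dx\Big]\\
&\quad - \gamma\,\mathbb{E}\Big[\int_{\mathcal{D}} \mathbb{E}[y(u)]\, y'(u)\cdot(v-u)\,dx\Big]
 + \mu\,\mathbb{E}\Big[\int_{\mathcal{D}} u\cdot(v-u)\,dx\Big].
\end{aligned} \tag{2.5}
\]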

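By the well–posedness of the state equation (1.2), which follows from the Lax–Milgram lemma, the operator $\mathcal{A}$ is invertible and we have

\[
\mathcal{A}\,y(u) = f + u \;\Longrightarrow\; y(u) = \mathcal{A}^{-1}(f + u), \tag{2.6}
\]

so that, by taking the directional derivative of (2.6), one gets

\[
y'(u)\cdot(v-u) = \mathcal{A}^{-1}(v-u) = y(v) - y(u).
\]

Thus, we have from (2.5) that

\[
\mathcal{J}'(u)\cdot(v-u) = \Psi(\gamma) + \mu\,\mathbb{E}\Big[\int_{\mathcal{D}} u\cdot(v-u)\,dx\Big], \tag{2.7}
\]

where

\[
\begin{aligned}
\Psi(\gamma) &= (1+\gamma)\,\mathbb{E}\Big[\int_{\mathcal{D}} y(u)\cdot\big(y(v) - y(u)\big)\,dx\Big]
 - \gamma\,\mathbb{E}\Big[\int_{\mathcal{D}} \mathbb{E}[y(u)]\cdot\big(y(v) - y(u)\big)\,dx\Big]\\
&\quad - \mathbb{E}\Big[\int_{\mathcal{D}} y^d\cdot\big(y(v) - y(u)\big)\,dx\Big].
\end{aligned}
\]

By Lions' lemma [37, Theorem 1.3], the existence and uniqueness of the optimal solution are ensured when the following inequality holds:

\[
\mathcal{J}'(u)\cdot(v-u) \ge 0. \tag{2.8}
\]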

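Next, we introduce the adjoint state $p(u)$ by

\[
\mathcal{A}^{*}p(u) = y(u) - y^d + \gamma\big(y(u) - \mathbb{E}[y(u)]\big), \qquad p(u) \in \mathcal{Y}. \tag{2.9}
\]

Multiplying both sides of (2.9) by $y(v) - y(u)$, integrating over $\mathcal{D}$, and taking the expectation of the resulting expression, we obtain

\[
\begin{aligned}
\mathbb{E}\Big[\int_{\mathcal{D}} \mathcal{A}^{*}p(u)\cdot\big(y(v) - y(u)\big)\,dx\Big]
&= \mathbb{E}\Big[\int_{\mathcal{D}} p(u)\cdot\big(\mathcal{A}y(v) - \mathcal{A}y(u)\big)\,dx\Big]\\
&= \mathbb{E}\Big[\int_{\mathcal{D}} p(u)\cdot(v - u)\,dx\Big] = \Psi(\gamma).
\end{aligned} \tag{2.10}
\]

Inserting (2.10) into (2.7) and combining with (2.8), we obtain

\[
\mathcal{J}'(u)\cdot(v-u) = \mathbb{E}\Big[\int_{\mathcal{D}} \big(p(u) + \mu u\big)\cdot(v-u)\,dx\Big] \ge 0, \tag{2.11}
\]

which is the desired result.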
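For a deterministic control with box constraints, a variational inequality of the form (2.11) is equivalent, pointwise, to the standard projection formula $u = \max\{u_a, \min\{u_b, -\mathbb{E}[p]/\mu\}\}$, since $\mathbb{E}[\int_{\mathcal{D}} (p + \mu u)(v-u)\,dx] = \int_{\mathcal{D}} (\mathbb{E}[p] + \mu u)(v-u)\,dx$ when $u$ and $v$ do not depend on $\omega$. The sketch below checks this on a discrete vector; it is an illustration under stated assumptions, where the nodal values of $\mathbb{E}[p]$ are random placeholders and the names `project_control`, `mean_p`, and all parameter values are hypothetical.

```python
import numpy as np

def project_control(mean_adjoint, mu, ua, ub):
    """Pointwise projection of -E[p]/mu onto the admissible interval [ua, ub]."""
    return np.clip(-mean_adjoint / mu, ua, ub)

rng = np.random.default_rng(1)
mu, ua, ub = 1e-2, 0.0, 0.5                     # hypothetical parameters
mean_p = rng.uniform(-0.02, 0.02, size=200)     # placeholder nodal values of E[p]
u = project_control(mean_p, mu, ua, ub)

# Discrete analogue of (2.11): sum (E[p] + mu*u) * (v - u) >= 0 for admissible v.
trial_controls = rng.uniform(ua, ub, size=(1000, mean_p.size))
worst = min(float(np.sum((mean_p + mu * u) * (v - u))) for v in trial_controls)
```

The quantity `worst` stays nonnegative up to rounding, because each summand is nonnegative: on the inactive set the factor $\mathbb{E}[p] + \mu u$ vanishes, and on the active sets its sign matches the sign of $v - u$.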


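In this study, we consider $\mathcal{A}$ as the convection–diffusion operator

\[
\mathcal{A} := -\nabla\cdot\big(a(x,\omega)\nabla\big) + b(x,\omega)\cdot\nabla, \tag{2.12}
\]

which turns the state equation (1.2) into

\[
\begin{aligned}
-\nabla\cdot\big(a(x,\omega)\nabla y(x,\omega)\big) + b(x,\omega)\cdot\nabla y(x,\omega) &= f(x) + u(x) && \text{in } \mathcal{D}\times\Omega, && \text{(2.13a)}\\
y(x,\omega) &= y_{DB} && \text{on } \partial\mathcal{D}\times\Omega, && \text{(2.13b)}
\end{aligned}
\]

where $a : (\mathcal{D}\times\Omega) \to \mathbb{R}$ and $b : (\mathcal{D}\times\Omega) \to \mathbb{R}^2$ are the random diffusivity and velocity coefficients, respectively, which are assumed to have continuous and bounded covariance functions. Under the following assumptions on the uncertain coefficients:

i) The diffusivity coefficient $a(x,\omega)$ is $\mathbb{P}$–almost surely uniformly positive, that is, there exist constants $a_{\min}, a_{\max}$ such that $0 < a_{\min} \le a_{\max} < \infty$, with

\[
a_{\min} \le a(x,\omega) \le a_{\max} \quad \text{a.e. in } \mathcal{D}\times\Omega. \tag{2.14}
\]

ii) The velocity coefficient $b$ satisfies $b \in \big(L^{\infty}(\mathcal{D})\big)^2$ and $\nabla\cdot b(x,\omega) \in L^{\infty}(\mathcal{D})$,

the well–posedness of the state equation (2.13) can be shown by following the classical Lax–Milgram lemma; see, e.g., [39, 40].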

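Then, the corresponding weak formulation of the optimal control problem (1.1)–(1.2) is

\[
\min_{u \in \mathcal{U}^{ad}} \mathcal{J}(u) = \frac{1}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(y(u) - y^d\big)^2 dx\Big]
+ \frac{\gamma}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(y(u) - \mathbb{E}[y(u)]\big)^2 dx\Big]
+ \frac{\mu}{2}\mathbb{E}\Big[\int_{\mathcal{D}} u^2\,dx\Big] \tag{2.15}
\]

subject to

\[
a[y,v] + b[u,v] = [f,v], \qquad v \in \mathcal{Y}, \tag{2.16}
\]

where

\[
\begin{aligned}
a[y,v] &= \mathbb{E}\Big[\int_{\mathcal{D}} \big(a(x,\omega)\nabla y\cdot\nabla v + b(x,\omega)\cdot\nabla y\, v\big)\,dx\Big], && \forall y, v \in \mathcal{Y},\\
b[u,v] &= -\mathbb{E}\Big[\int_{\mathcal{D}} u v\,dx\Big] \quad\text{and}\quad [f,v] = \mathbb{E}\Big[\int_{\mathcal{D}} f v\,dx\Big], && \forall u \in \mathcal{U},\ v \in \mathcal{Y}.
\end{aligned}
\]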

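Moreover, the optimality system in (2.4) can be stated in the variational form as follows:

\[
\begin{aligned}
a[y,v] + b[u,v] &= [f,v], && v \in \mathcal{Y}, && \text{(2.18a)}\\
a[q,p] &= [y - y^d, q] + \gamma\,\big[y - \mathbb{E}[y], q\big], && q \in \mathcal{Y}, && \text{(2.18b)}\\
\mathbb{E}\Big[\int_{\mathcal{D}} (p + \mu u)\cdot(w - u)\,dx\Big] &\ge 0, && w \in \mathcal{U}^{ad}, && \text{(2.18c)}
\end{aligned}
\]

where the adjoint $p \in \mathcal{Y}$ solves the following convection diffusion equation with random coefficients:

\[
\begin{aligned}
-\nabla\cdot\big(a(x,\omega)\nabla p\big) - b(x,\omega)\cdot\nabla p &= (y - y^d) + \gamma\,(y - \mathbb{E}[y]) && \text{in } \mathcal{D}\times\Omega, && \text{(2.19a)}\\
p(x,\omega) &= 0 && \text{on } \partial\mathcal{D}\times\Omega. && \text{(2.19b)}
\end{aligned}
\]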

In the following, we introduce the techniques, that is, the Karhunen–Loève (KL) expansion, stochastic Galerkin, and discontinuous Galerkin methods, used to reduce the infinite–dimensional model problem (2.15)–(2.16) to a finite dimensional one.

3. Finite dimensional representation

3.1. Finite representation of stochastic fields

To solve (2.15)–(2.16) numerically, the stochastic process has to be reduced to a finite number of mutually uncorrelated, sometimes mutually independent, random variables. Therefore, we assume that the given coefficients a(x, ω) and b(x, ω) can be approximated by a prescribed finite number of uncorrelated components ξi(ω), i = 1, . . . , N ∈ N, which is known as the finite dimensional noise assumption [39, 41].

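Let $\Gamma_i = \xi_i(\Omega) \subset \mathbb{R}$ be a bounded interval and $\rho_i : \Gamma_i \to [0,1]$ be the probability density function of the random variable $\xi_i(\omega)$, $i = 1, \ldots, N \in \mathbb{N}$, with $\omega \in \Omega$.

Then, we can replace the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with $(\Gamma, \mathcal{B}(\Gamma), \rho(\xi)d\xi)$, where $\Gamma = \prod_{n=1}^{N} \Gamma_n$ is the support of such a probability density, $\mathcal{B}(\Gamma)$ denotes the Borel σ–algebra, and $\rho(\xi)d\xi$ is the distribution measure of the vector $\xi$. Moreover, the joint probability density function is denoted by $\rho(\xi)$, $\xi \in \Gamma$. Hence, we can state the tensor–product space $H^k(\mathcal{D}) \otimes L^2(\Gamma)$, which is endowed with the norm

\[
\|\eta\|_{H^k(\mathcal{D}) \otimes L^2(\Gamma)} := \Big( \int_{\Gamma} \|\eta(\cdot,\xi)\|_{H^k(\mathcal{D})}^2\, \rho(\xi)\,d\xi \Big)^{1/2} < \infty. \tag{3.1}
\]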

Following the well–known KL expansion [42, 43], a random field $\eta(x,\omega) : \mathcal{D}\times\Omega \to \mathbb{R}$ with a continuous covariance function

\[
C_{\eta}(x,y) := \big\langle (\eta(x,\cdot) - \bar\eta(x))\,(\eta(y,\cdot) - \bar\eta(y)) \big\rangle \tag{3.2}
\]

admits a proper orthogonal decomposition

\[
\eta(x,\omega) = \bar\eta(x) + \kappa \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,\phi_k(x)\,\xi_k(\omega), \tag{3.3}
\]

where $\bar\eta(x)$ and $\kappa$ are the mean and standard deviation of $\eta$, respectively, and $\xi := \{\xi_1, \xi_2, \ldots\}$ are uncorrelated random variables. The pairs $\{\lambda_k, \phi_k\}$ are the eigenvalues and eigenfunctions of the corresponding covariance operator $C_{\eta}$. We approximate $\eta(x,\omega)$ by truncating its KL expansion:

\[
\eta(x,\omega) \approx \eta_N(x,\omega) := \bar\eta(x) + \kappa \sum_{k=1}^{N} \sqrt{\lambda_k}\,\phi_k(x)\,\xi_k(\omega). \tag{3.4}
\]

The truncated KL expansion (3.4) is a finite representation of the random field $\eta(x,\omega)$ in the sense that the mean-square error of approximation is minimized; see, e.g., [44].
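To make the truncation (3.4) concrete, the following sketch computes approximate KL eigenpairs on a one–dimensional grid and draws one realization of the truncated field. Everything specific here is an assumption made for illustration and is not taken from the paper: the interval $[0,1]$, the exponential kernel $e^{-|x-y|/\ell}$ with $\ell = 1$, the Nyström discretization with midpoint weights, the uniform distribution of $\xi_k$, and the helper name `discrete_kl`.

```python
import numpy as np

def discrete_kl(cov, x, n_terms):
    """Leading KL eigenpairs of a covariance kernel on a uniform 1D midpoint grid."""
    h = x[1] - x[0]
    C = cov(x[:, None], x[None, :])
    lam, phi = np.linalg.eigh(h * C)          # Nystrom discretization of the integral operator
    order = np.argsort(lam)[::-1][:n_terms]   # keep the largest eigenvalues
    lam = lam[order]
    phi = phi[:, order] / np.sqrt(h)          # normalize so that h * sum(phi_k^2) = 1
    return lam, phi

m, n_terms, ell = 200, 5, 1.0                 # hypothetical grid size, truncation, length scale
x = (np.arange(m) + 0.5) / m
h = x[1] - x[0]
lam, phi = discrete_kl(lambda s, t: np.exp(-np.abs(s - t) / ell), x, n_terms)

# The kernel has unit variance, so the eigenvalues of the full operator sum to 1;
# the cumulative sum is the fraction of the variance kept after truncation.
captured = np.cumsum(lam)

rng = np.random.default_rng(0)
xi = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=n_terms)   # zero mean, unit variance
eta_mean, kappa = 1.0, 0.1                                     # hypothetical mean and std
eta_N = eta_mean + kappa * phi @ (np.sqrt(lam) * xi)           # one realization of (3.4)
```

The array `captured` indicates how quickly the eigenvalues decay; a slower decay (smaller correlation length) calls for a larger truncation index $N$.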

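By the finite dimensional noise assumption and the Doob–Dynkin lemma [45], the solution of (2.13) can be described by a finite number of random variables, i.e., $y(x,\omega) = y(x,\xi) = y(x, \xi_1(\omega), \ldots, \xi_N(\omega))$, with the corresponding finite dimensional stochastic space $\mathcal{Y}_{\rho} = L^2(H_0^1(\mathcal{D}); \Gamma)$. Then, the optimization problem (2.15)–(2.16) can be rewritten as follows:

\[
\min_{u \in \mathcal{U}^{ad}} \mathcal{J}(u) = \frac{1}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(y(u) - y^d\big)^2 dx\Big]
+ \frac{\gamma}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(y(u) - \mathbb{E}[y(u)]\big)^2 dx\Big]
+ \frac{\mu}{2}\mathbb{E}\Big[\int_{\mathcal{D}} u^2\,dx\Big] \tag{3.5}
\]

subject to

\[
a[y,v]_{\rho} + b[u,v]_{\rho} = [f,v]_{\rho}, \qquad \forall v \in \mathcal{Y}_{\rho}, \tag{3.6}
\]

where

\[
\begin{aligned}
a[y,v]_{\rho} &= \int_{\Gamma}\int_{\mathcal{D}} \big(a(x,\xi)\nabla y\cdot\nabla v + b(x,\xi)\cdot\nabla y\, v\big)\,dx\,\rho(\xi)\,d\xi, && \forall y, v \in \mathcal{Y}_{\rho}, && \text{(3.7a)}\\
b[u,v]_{\rho} &= -\int_{\Gamma}\int_{\mathcal{D}} u v\,dx\,\rho(\xi)\,d\xi, && \forall u \in \mathcal{U},\ v \in \mathcal{Y}_{\rho}, && \text{(3.7b)}\\
[f,v]_{\rho} &= \int_{\Gamma}\int_{\mathcal{D}} f v\,dx\,\rho(\xi)\,d\xi, && \forall v \in \mathcal{Y}_{\rho}. && \text{(3.7c)}
\end{aligned}
\]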

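A pair $(y, u) \in \mathcal{Y}_{\rho} \times \mathcal{U}^{ad}$ is the solution of (3.5)–(3.6) if and only if there is an adjoint variable $p \in \mathcal{Y}_{\rho}$ such that the triplet $(y, p, u)$ satisfies the following optimality system:

\[
\begin{aligned}
a[y,v]_{\rho} + b[u,v]_{\rho} &= [f,v]_{\rho}, && v \in \mathcal{Y}_{\rho}, && \text{(3.8a)}\\
a[q,p]_{\rho} &= [y - y^d, q]_{\rho} + \gamma\,\big[y - \widetilde{\mathbb{E}}[y], q\big]_{\rho}, && q \in \mathcal{Y}_{\rho}, && \text{(3.8b)}\\
\widetilde{\mathbb{E}}\Big[\int_{\mathcal{D}} (p + \mu u)\cdot(w - u)\,dx\Big] &\ge 0, && w \in \mathcal{U}^{ad}, && \text{(3.8c)}
\end{aligned}
\]

where $\widetilde{\mathbb{E}}[u] = \int_{\Gamma} u\,\rho(\xi)\,d\xi$.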

Next, we present the representation of stochastic solutions, i.e., y(x, ξ), p(x, ξ), by a finite generalized polynomial chaos (PC) approximation.

3.2. Stochastic Galerkin Method

The state solution $y(x,\xi) \in L^2(\Gamma, \mathcal{F}, \mathbb{P})$, as well as the adjoint solution $p(x,\xi) \in L^2(\Gamma, \mathcal{F}, \mathbb{P})$, can be represented by a finite generalized polynomial chaos (PC) approximation as stated in the Cameron–Martin theorem [46],

\[
\begin{aligned}
y(x,\omega) &\approx y_J(x,\xi) = \sum_{i=0}^{J-1} y_i(x)\,\psi_i(\xi), && \text{(3.9a)}\\
p(x,\omega) &\approx p_J(x,\xi) = \sum_{i=0}^{J-1} p_i(x)\,\psi_i(\xi), && \text{(3.9b)}
\end{aligned}
\]

where $y_i(x)$ and $p_i(x)$ are the deterministic modes of the expansion. The total number of PC basis functions is determined by the dimension $N$ of the random vector $\xi$ and the highest order $Q$ of the basis polynomials $\psi_i$:

\[
J = 1 + \sum_{s=1}^{Q} \frac{1}{s!} \prod_{j=0}^{s-1} (N + j) = \frac{(N+Q)!}{N!\,Q!}.
\]
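The growth of $J$ with $N$ and $Q$ is what drives the size of the stochastic Galerkin system. The short sketch below evaluates both forms of the formula above and checks that they agree; the function names are hypothetical and the chosen values of $N$ and $Q$ are only examples.

```python
from math import comb, factorial

def pc_basis_size(N, Q):
    """Closed form J = (N + Q)! / (N! Q!)."""
    return comb(N + Q, Q)

def pc_basis_size_sum(N, Q):
    """Sum form J = 1 + sum_{s=1}^{Q} (1/s!) * prod_{j=0}^{s-1} (N + j)."""
    total = 1
    for s in range(1, Q + 1):
        prod = 1
        for j in range(s):
            prod *= N + j
        # A product of s consecutive integers is divisible by s!, so // is exact.
        total += prod // factorial(s)
    return total

sizes = {(N, Q): pc_basis_size(N, Q) for N in (3, 5, 10) for Q in (2, 3, 4)}
```

For instance, $N = 3$ and $Q = 3$ give $J = 20$, whereas $N = 10$ and $Q = 3$ already give $J = 286$ coupled deterministic problems.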


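By following [47, 48], we then define the stochastic space as

\[
\mathcal{S}_k := \mathrm{span}\{\psi_i(\xi) : i = 0, 1, \ldots, J-1\} \subset L^2(\Gamma). \tag{3.10}
\]

For simplicity, we only deal with the state equation since the procedure for the adjoint equation is similar. By inserting the KL expansions (3.4) of the diffusion coefficient $a(x,\omega)$ and the convection coefficient $b(x,\omega)$, and the solution expression (3.9), into the state equation (3.6) and projecting onto the space spanned by the PC basis functions, we obtain the following linear system, consisting of $J$ deterministic convection diffusion equations for $j = 0, \ldots, J-1$:

\[
\sum_{i=0}^{J-1} \Big[ -\nabla\cdot\big(a_{ij}\nabla y_i(x)\big) + b_{ij}\cdot\nabla y_i(x) \Big] = \langle\psi_j\rangle\, f(x) + \langle\psi_j\rangle\, u(x), \tag{3.11}
\]

where

\[
\begin{aligned}
a_{ij} &= \bar a(x)\,\big\langle \psi_i^2(\xi) \big\rangle\,\delta_{ij} + \kappa_a \sum_{k=1}^{N} \sqrt{\lambda_k^a}\,\phi_k^a(x)\,\big\langle \xi_k\,\psi_i(\xi)\,\psi_j(\xi) \big\rangle,\\
b_{ij} &= \bar b(x)\,\big\langle \psi_i^2(\xi) \big\rangle\,\delta_{ij} + \kappa_b \sum_{k=1}^{N} \sqrt{\lambda_k^b}\,\phi_k^b(x)\,\big\langle \xi_k\,\psi_i(\xi)\,\psi_j(\xi) \big\rangle.
\end{aligned}
\]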
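The stochastic factors $\langle \psi_i^2 \rangle$ and $\langle \xi_k \psi_i \psi_j \rangle$ appearing in $a_{ij}$ and $b_{ij}$ can be precomputed by quadrature. The sketch below does this for the simplest setting, which is an assumption made here for illustration rather than the configuration of the paper: a single random variable ($N = 1$) with $\xi \sim U(-1,1)$ and unnormalized Legendre polynomials as the PC basis. The function name `legendre_galerkin_matrices` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def legendre_galerkin_matrices(J):
    """Return G0[i, j] = <psi_i psi_j> and G1[i, j] = <xi psi_i psi_j> for xi ~ U(-1, 1)."""
    nodes, weights = leggauss(J + 1)     # exact for polynomial degree <= 2J + 1
    weights = weights / 2.0              # uniform density rho = 1/2 on [-1, 1]
    # Row i holds the Legendre polynomial P_i evaluated at the quadrature nodes.
    psi = np.array([legval(nodes, np.eye(J)[i]) for i in range(J)])
    G0 = (psi * weights) @ psi.T
    G1 = (psi * weights * nodes) @ psi.T
    return G0, G1

J = 5
G0, G1 = legendre_galerkin_matrices(J)
```

In this setting `G0` is diagonal with entries $1/(2i+1)$ and `G1` is symmetric and tridiagonal, which reflects the sparsity that makes the coupled system (3.11) tractable.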

Here, we note that the quantity of interest is the statistical moments of the solution y(x, ω) rather than the solution y(x, ω). Next, we introduce the symmetric interior penalty Galerkin in order to discretize the state system (3.11) in the physical space.
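Once the PC modes are available, the first two moments follow directly from the orthogonality of the basis: with $\psi_0 = 1$, the mean is the zeroth mode and the variance is $\sum_{i \ge 1} y_i^2 \langle \psi_i^2 \rangle$. The sketch below checks this against quadrature at a single spatial point. It assumes, for illustration only, one uniform random variable on $[-1,1]$ with a Legendre basis; the coefficient values and the name `pc_moments` are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def pc_moments(coeffs):
    """Mean and variance of sum_i c_i P_i(xi), xi ~ U(-1, 1), from Legendre coefficients."""
    coeffs = np.asarray(coeffs, dtype=float)
    norms = 1.0 / (2.0 * np.arange(coeffs.size) + 1.0)   # <P_i^2> under rho = 1/2
    mean = float(coeffs[0])
    variance = float(np.sum(coeffs[1:] ** 2 * norms[1:]))
    return mean, variance

coeffs = [2.0, 0.5, -0.3, 0.1]          # hypothetical PC modes at one spatial point
mean_pc, var_pc = pc_moments(coeffs)

# Reference values by Gauss-Legendre quadrature (exact for this polynomial).
nodes, weights = leggauss(10)
weights = weights / 2.0
values = legval(nodes, coeffs)
mean_quad = float(np.sum(weights * values))
var_quad = float(np.sum(weights * (values - mean_quad) ** 2))
```

No sampling is needed at this stage: the moments are algebraic functions of the modes, which is one practical advantage of the Galerkin representation.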

3.3. Symmetric interior penalty Galerkin method

Let $\{\mathcal{T}_h\}_h$ be a family of shape-regular simplicial triangulations of $\mathcal{D}$. Each mesh $\mathcal{T}_h$ consists of closed triangles such that $\overline{\mathcal{D}} = \bigcup_{K \in \mathcal{T}_h} K$ holds. We assume that the mesh is regular in the following sense: for different triangles $K_i, K_j \in \mathcal{T}_h$, $i \ne j$, the intersection $K_i \cap K_j$ is either empty or a vertex or an edge, i.e., hanging nodes are not allowed. The diameter of an element $K$ and the length of an edge $E$ are denoted by $h_K$ and $h_E$, respectively. The maximum element diameter is denoted by $h = \max_{K \in \mathcal{T}_h} h_K$.

We split the set of all edges $\mathcal{E}_h$ into the set $\mathcal{E}_h^0$ of interior edges and the set $\mathcal{E}_h^{\partial}$ of boundary edges so that $\mathcal{E}_h = \mathcal{E}_h^0 \cup \mathcal{E}_h^{\partial}$. Let $n$ denote the unit outward normal to $\partial\mathcal{D}$. For a fixed realization $\omega$, the inflow and outflow parts of $\partial\mathcal{D}$ are denoted by $\partial\mathcal{D}^-$ and $\partial\mathcal{D}^+$, respectively,

\[
\partial\mathcal{D}^- = \{x \in \partial\mathcal{D} : b(x,\omega)\cdot n(x) < 0\}, \qquad
\partial\mathcal{D}^+ = \{x \in \partial\mathcal{D} : b(x,\omega)\cdot n(x) \ge 0\}.
\]

Similarly, the inflow and outflow boundaries of an element $K$ are defined by

\[
\partial K^- = \{x \in \partial K : b(x,\omega)\cdot n_K(x) < 0\}, \qquad
\partial K^+ = \{x \in \partial K : b(x,\omega)\cdot n_K(x) \ge 0\},
\]

where $n_K$ is the unit normal vector on the boundary $\partial K$ of an element $K$.

Let the edge $E$ be a common edge of two elements $K$ and $K^e$. For a piecewise continuous scalar function $y$, there are two traces of $y$ along $E$, denoted by $y|_E$ from inside $K$ and $y^e|_E$ from inside $K^e$. The jump and average of $y$ across the edge $E$ are defined by

\[
[\![ y ]\!] = y|_E\, n_K + y^e|_E\, n_{K^e}, \qquad \{\!\{ y \}\!\} = \frac{1}{2}\big( y|_E + y^e|_E \big). \tag{3.12}
\]

Similarly, for a piecewise continuous vector field $\nabla y$, the jump and average across an edge $E$ are given by

\[
[\![ \nabla y ]\!] = \nabla y|_E \cdot n_K + \nabla y^e|_E \cdot n_{K^e}, \qquad \{\!\{ \nabla y \}\!\} = \frac{1}{2}\big( \nabla y|_E + \nabla y^e|_E \big). \tag{3.13}
\]

For a boundary edge $E \in K \cap \partial\mathcal{D}$, we set $\{\!\{ \nabla y \}\!\} = \nabla y$ and $[\![ y ]\!] = y\,n$, where $n$ is the outward normal unit vector on $\partial\mathcal{D}$.

Next, we introduce the discrete space

\[
V_h = \big\{ y \in L^2(\mathcal{D}) : y|_K \in \mathbb{P}(K) \ \ \forall K \in \mathcal{T}_h \big\}, \tag{3.14}
\]

where $\mathbb{P}(K)$ is the set of linear polynomials on $K$. Following the standard discontinuous Galerkin structure in [28, 29], the (bi)–linear forms for a finite dimensional vector $\xi$ can be stated as follows:

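\[
\begin{aligned}
a_h(y,v,\xi) &= \sum_{K \in \mathcal{T}_h} \int_K a(\cdot,\xi)\,\nabla y\cdot\nabla v\,dx
 - \sum_{E \in \mathcal{E}_h^0 \cup \mathcal{E}_h^{\partial}} \int_E \{\!\{ a(\cdot,\xi)\nabla y \}\!\}\,[\![ v ]\!]\,ds\\
&\quad - \sum_{E \in \mathcal{E}_h^0 \cup \mathcal{E}_h^{\partial}} \int_E \{\!\{ a(\cdot,\xi)\nabla v \}\!\}\,[\![ y ]\!]\,ds
 + \sum_{E \in \mathcal{E}_h^0 \cup \mathcal{E}_h^{\partial}} \frac{\sigma}{h_E} \int_E [\![ y ]\!]\cdot[\![ v ]\!]\,ds\\
&\quad + \sum_{K \in \mathcal{T}_h} \int_K b(\cdot,\xi)\cdot\nabla y\, v\,dx
 + \sum_{K \in \mathcal{T}_h} \int_{\partial K^- \setminus \partial\mathcal{D}} b(\cdot,\xi)\cdot n_E\,(y^e - y)\,v\,ds\\
&\quad - \sum_{K \in \mathcal{T}_h} \int_{\partial K^- \cap \partial\mathcal{D}^-} b(\cdot,\xi)\cdot n_E\, y\, v\,ds,\\[4pt]
b_h(u,v,\xi) &= -\sum_{K \in \mathcal{T}_h} \int_K u v\,dx,\\[4pt]
l_h(f,v,\xi) &= \sum_{K \in \mathcal{T}_h} \int_K f v\,dx
 + \sum_{E \in \mathcal{E}_h^{\partial}} \frac{\sigma}{h_E} \int_E y_{DB}\,[\![ v ]\!]\,ds
 - \sum_{E \in \mathcal{E}_h^{\partial}} \int_E y_{DB}\,\{\!\{ a(\cdot,\xi)\nabla v \}\!\}\,ds\\
&\quad - \sum_{K \in \mathcal{T}_h} \int_{\partial K^- \cap \partial\mathcal{D}^-} b(\cdot,\xi)\cdot n_E\, y_{DB}\, v\,ds,
\end{aligned}
\]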

where the constant $\sigma > 0$ is the interior penalty parameter. It has to be chosen sufficiently large, independently of the mesh size, to ensure the stability of the DG discretization. Then, the (bi)–linear forms of the stochastic discontinuous Galerkin (SDG) method are

\[
a_{\xi}[y,v] = \int_{\Gamma} a_h(y,v,\xi)\,\rho(\xi)\,d\xi, \qquad
b_{\xi}[u,v] = \int_{\Gamma} b_h(u,v,\xi)\,\rho(\xi)\,d\xi, \qquad
[f,v]_{\xi} = \int_{\Gamma} l_h(f,v,\xi)\,\rho(\xi)\,d\xi.
\]

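Now, we can state the discrete optimal control problem

\[
\min_{u_h \in \mathcal{U}_h^{ad}} \mathcal{J}(u_h) = \frac{1}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(y_h - y^d\big)^2 dx\Big]
+ \frac{\gamma}{2}\mathbb{E}\Big[\int_{\mathcal{D}} \big(y_h - \mathbb{E}[y_h]\big)^2 dx\Big]
+ \frac{\mu}{2}\mathbb{E}\Big[\int_{\mathcal{D}} u_h^2\,dx\Big] \tag{3.15}
\]

subject to

\[
a_{\xi}[y_h, v_h] + b_{\xi}[u_h, v_h] = [f, v_h]_{\xi}, \qquad \forall v_h \in \mathcal{Y}_h = V_h \otimes \mathcal{S}_k, \tag{3.16}
\]

where the discrete counterpart of the admissible set (1.3) is defined by

\[
\mathcal{U}_h^{ad} := \{u_h \in \mathcal{U}_h : u_a \le u_h(x) \le u_b, \ \text{a.e. } x \in K,\ K \in \mathcal{T}_h\}, \tag{3.17}
\]

with $\mathcal{U}_h^{ad} = \mathcal{U}_h \cap \mathcal{U}^{ad}$ and $\mathcal{U}_h = V_h$. Analogously, the control problem (3.15)–(3.16) has a unique solution pair $(y_h, u_h) \in \mathcal{Y}_h \times \mathcal{U}_h^{ad}$ if and only if there is an adjoint variable $p_h \in \mathcal{Y}_h$ such that $(y_h, p_h, u_h) \in \mathcal{Y}_h \times \mathcal{Y}_h \times \mathcal{U}_h^{ad}$ satisfies the following optimality system:

\[
\begin{aligned}
a_{\xi}[y_h, v_h] + b_{\xi}[u_h, v_h] &= [f, v_h]_{\xi}, && v_h \in \mathcal{Y}_h, && \text{(3.18a)}\\
a_{\xi}[q_h, p_h] &= [y_h - y^d, q_h]_{\xi} + \gamma\,\big[y_h - \widetilde{\mathbb{E}}[y_h], q_h\big]_{\xi}, && q_h \in \mathcal{Y}_h, && \text{(3.18b)}\\
[p_h + \mu u_h, w_h - u_h]_{\xi} &\ge 0, && w_h \in \mathcal{U}_h^{ad}. && \text{(3.18c)}
\end{aligned}
\]

Further, by denoting

\[
\mathcal{J}_h'(u_h)\cdot w_h = [p_h + \mu u_h, w_h]_{\xi}, \qquad \forall w_h \in \mathcal{U}_h^{ad}, \tag{3.19}
\]

one can easily obtain the following expression for the discrete directional derivative of the functional $\mathcal{J}_h(u_h)$:

\[
\mathcal{J}_h'(u_h)\cdot(w_h - u_h) \ge 0, \qquad \forall w_h \in \mathcal{U}_h^{ad}. \tag{3.20}
\]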

4. A Priori Error Estimate

In this section, we derive a priori error estimates for the optimization problem in (2.15)-(2.16), discretized by the stochastic discontinuous Galerkin method.

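Before deriving the corresponding estimates, we first define the associated energy norm on $\mathcal{D}\times\Gamma$ as

\[
\|y\|_{\xi} = \Big( \int_{\Gamma} \|y(\cdot,\xi)\|_e^2\,\rho(\xi)\,d\xi \Big)^{1/2}, \tag{4.1}
\]

where $\|y(\cdot,\xi)\|_e$ is the energy norm on $\mathcal{D}$, given as

\[
\begin{aligned}
\|y(\cdot,\xi)\|_e = \Bigg( &\sum_{K \in \mathcal{T}_h} \int_K a(\cdot,\xi)\,(\nabla y)^2\,dx
 + \sum_{E \in \mathcal{E}_h^0 \cup \mathcal{E}_h^{\partial}} \frac{\sigma}{h_E} \int_E [\![ y ]\!]^2\,ds\\
&+ \frac{1}{2} \sum_{E \in \mathcal{E}_h^{\partial}} \int_E b(\cdot,\xi)\cdot n_E\, y^2\,ds
 + \frac{1}{2} \sum_{E \in \mathcal{E}_h^0} \int_E b(\cdot,\xi)\cdot n_E\,(y^e - y)^2\,ds \Bigg)^{1/2}.
\end{aligned}
\]

By the standard arguments used in the deterministic case, one can easily show the coercivity and continuity of $a_{\xi}[\cdot,\cdot]$ for $y, v \in \mathcal{Y}_h$:

\[
a_{\xi}[y,y] \ge c_{cv}\,\|y\|_{\xi}^2, \qquad a_{\xi}[y,v] \le c_{ct}\,\|y\|_{\xi}\,\|v\|_{\xi}, \tag{4.2}
\]

where the coercivity constant $c_{cv}$ depends on $a_{\min}$, whereas the continuity constant $c_{ct}$ depends on $a_{\max}$.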

Next, we state the estimates on the finite dimensional probability domain $\Gamma$ and the physical domain $K \in \mathcal{T}_h$. Let a partition of the support of the probability density in the finite dimensional space, i.e., $\Gamma = \prod_{n=1}^{N} \Gamma_n$, consist of a finite number of disjoint $\mathbb{R}^N$–boxes, $\gamma = \prod_{n=1}^{N} (r_n^{\gamma}, s_n^{\gamma})$, with $(r_n^{\gamma}, s_n^{\gamma}) \subset \Gamma_n$ for $n = 1, \ldots, N$. The mesh size $k_n$ is defined by $k_n = \max_{\gamma} |s_n^{\gamma} - r_n^{\gamma}|$ for $1 \le n \le N$. For the multi–index $q = (q_1, \ldots, q_N)$, the (discontinuous) finite element approximation space with degree at most $q_n$ in each direction $\xi_n$ is denoted by $\mathcal{S}_k^q \subset L^2(\Gamma)$.

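Then, for $v \in H^{q+1}(\Gamma)$ and $\varphi \in \mathcal{S}_k^q$, we have, see [39],

\[
\min_{\varphi \in \mathcal{S}_k^q} \|v - \varphi\|_{L^2(\Gamma)} \le \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{q_n+1} \frac{\|\partial_{\xi_n}^{q_n+1} v\|_{L^2(\Gamma)}}{(q_n+1)!}. \tag{4.3}
\]

For $v \in H^2(K)$ and $\tilde v \in \mathbb{P}(K)$, where $K \in \mathcal{T}_h$, the following discontinuous Galerkin approximation property [29, Theorem 2.6] also holds:

\[
\|v - \tilde v\|_{H^q(K)} \le C\,h^{2-q}\,|v|_{H^2(K)}, \qquad 0 \le q \le 2, \tag{4.4}
\]

where the constant $C$ is independent of $v$ and $h$.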
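The stochastic approximation bound (4.3) can be checked numerically in a very simple setting. The sketch below is an illustration under assumptions that are not taken from the paper: one random variable ($N = 1$), a single element $\Gamma = [-1,1]$ so that $k_1 = 2$, the uniform density $\rho = 1/2$, and the test function $v(\xi) = e^{\xi}$, whose derivatives all equal $v$. The helper names are hypothetical.

```python
import numpy as np
from math import factorial
from numpy.polynomial.legendre import leggauss, legval

nodes, weights = leggauss(40)
weights = weights / 2.0                       # uniform density rho = 1/2 on [-1, 1]

def l2_projection_error(v, q):
    """L2(Gamma) error of the best polynomial approximation of degree <= q."""
    # Legendre coefficients c_i = <v, P_i> / <P_i, P_i> with <P_i, P_i> = 1 / (2i + 1).
    coeffs = np.array([
        (2 * i + 1) * np.sum(weights * v(nodes) * legval(nodes, np.eye(q + 1)[i]))
        for i in range(q + 1)
    ])
    residual = v(nodes) - legval(nodes, coeffs)
    return float(np.sqrt(np.sum(weights * residual ** 2)))

def bound_rhs(q, k=2.0):
    """Right-hand side of (4.3) for v = exp, whose (q+1)-th derivative is again exp."""
    deriv_norm = np.sqrt(np.sum(weights * np.exp(nodes) ** 2))
    return float((k / 2.0) ** (q + 1) * deriv_norm / factorial(q + 1))

errors = [l2_projection_error(np.exp, q) for q in range(6)]
bounds = [bound_rhs(q) for q in range(6)]
```

In this example the computed errors lie below the bound for every degree and both decay factorially in $q$, which is the behaviour that makes a moderate polynomial order sufficient for smooth dependence on $\xi$.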

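For later use in the rest of the paper, we recall the following projection operators:

• The L²–projection operators $\Pi_n : L^2(\Gamma) \to \mathcal{S}_k^q$ and $\Pi_h : L^2(\mathcal{D}) \to V_h \cap L^2(\mathcal{D})$ are given by

\[
\begin{aligned}
(\Pi_n(\xi) - \xi, \zeta)_{L^2(\Gamma)} &= 0, && \forall \zeta \in \mathcal{S}_k^q,\ \forall \xi \in L^2(\Gamma), && \text{(4.5a)}\\
(\Pi_h(\nu) - \nu, \chi)_{L^2(\mathcal{D})} &= 0, && \forall \chi \in V_h,\ \forall \nu \in L^2(\mathcal{D}), && \text{(4.5b)}
\end{aligned}
\]

with the following estimate

\[
\|\nu - \Pi_h(\nu)\|_{L^2(L^2(\mathcal{D});\Gamma)} \le C h\,\|\nu\|_{L^2(H^1(\mathcal{D});\Gamma)}. \tag{4.6}
\]

• The H¹–projection operator $R_h : H^1(\mathcal{D}) \to V_h \cap H^1(\mathcal{D})$ is defined by

\[
\begin{aligned}
(R_h(\nu) - \nu, \chi)_{L^2(\mathcal{D})} &= 0, && \forall \chi \in V_h,\ \forall \nu \in H^1(\mathcal{D}), && \text{(4.7a)}\\
(\nabla(R_h(\nu) - \nu), \nabla\chi)_{L^2(\mathcal{D})} &= 0, && \forall \chi \in V_h,\ \forall \nu \in H^1(\mathcal{D}). && \text{(4.7b)}
\end{aligned}
\]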

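With the help of the H¹–projection operator in (4.7a), the Cauchy–Schwarz inequality, the L²–projection operator in (4.5b), and the approximation in (4.3), we obtain the approximation property ([49, Theorem 3.2]): for all $v \in L^2(H^2(\mathcal{D}); \Gamma) \cap H^{q+1}(H^1(\mathcal{D}); \Gamma)$ and $\tilde v \in V_h \times \mathcal{S}_k^q$,

\[
\|v - \tilde v\|_{L^2(H^1(\mathcal{D});\Gamma)} \le C h\,\|v\|_{L^2(H^2(\mathcal{D});\Gamma)}
+ \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{q_n+1} \frac{\|\partial_{\xi_n}^{q_n+1} v\|_{L^2(H^1(\mathcal{D});\Gamma)}}{(q_n+1)!}, \tag{4.8}
\]

where the constant $C$ is independent of $v$, $h$, and $k_n$.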

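In order to obtain separate error estimates in $\mathcal{D}$ and $\Gamma$, we introduce a projection operator $P_{hn}$ which maps onto the tensor product space $\mathcal{Y}_h$, given by

\[
P_{hn}\Upsilon = \Pi_h\Pi_n\Upsilon = \Pi_n\Pi_h\Upsilon, \qquad \forall \Upsilon \in L^2(L^2(\mathcal{D}); \Gamma), \tag{4.9}
\]

and the decomposition

\[
\Upsilon - P_{hn}\Upsilon = (\Upsilon - \Pi_h\Upsilon) + \Pi_h(I - \Pi_n)\Upsilon, \qquad \forall \Upsilon \in L^2(L^2(\mathcal{D}); \Gamma). \tag{4.10}
\]

Before deriving the a priori error estimate, we state the following auxiliary problem:

\[
\mathcal{J}_h'(u)\cdot(w - u) = [p_h(u) + \mu u, w - u]_{\xi} \ge 0, \qquad \forall w \in \mathcal{U}^{ad}, \tag{4.11}
\]

where $p_h(u) \in \mathcal{Y}_h$ is the solution of the auxiliary system:

\[
\begin{aligned}
a_{\xi}[y_h(u), v_h] + b_{\xi}[u, v_h] &= [f, v_h]_{\xi}, && v_h \in \mathcal{Y}_h, && \text{(4.12a)}\\
a_{\xi}[q_h, p_h(u)] &= [y_h(u) - y^d, q_h]_{\xi} + \gamma\,\big[y_h(u) - \widetilde{\mathbb{E}}[y_h(u)], q_h\big]_{\xi}, && q_h \in \mathcal{Y}_h. && \text{(4.12b)}
\end{aligned}
\]

Lemma 4.1. With the definition in (4.11), the following estimate holds:

\[
\mathcal{J}_h'(w)\cdot(w - u) - \mathcal{J}_h'(u)\cdot(w - u) \ge \mu\,\|w - u\|_{L^2(L^2(\mathcal{D});\Gamma)}^2. \tag{4.13}
\]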

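Proof. By (4.11), we have

\[
\big(\mathcal{J}_h'(w) - \mathcal{J}_h'(u)\big)\cdot(w - u) = [p_h(w) - p_h(u), w - u]_{\xi} + \mu\,[w - u, w - u]_{\xi}. \tag{4.14}
\]

Now, it follows from (4.12) that

\[
\begin{aligned}
[p_h(w) - p_h(u), w - u]_{\xi} &= a_{\xi}[y_h(w) - y_h(u), p_h(w) - p_h(u)]\\
&= (1+\gamma)\,[y_h(w) - y_h(u), y_h(w) - y_h(u)]_{\xi}
 - \gamma\,\big[\widetilde{\mathbb{E}}[y_h(w) - y_h(u)], y_h(w) - y_h(u)\big]_{\xi}.
\end{aligned} \tag{4.15}
\]

An application of the Cauchy–Schwarz and Young's inequalities yields

\[
-\gamma\,\big[\widetilde{\mathbb{E}}[y_h(w) - y_h(u)], y_h(w) - y_h(u)\big]_{\xi}
\ge -\frac{\gamma}{2}\,\big\|\widetilde{\mathbb{E}}[y_h(w) - y_h(u)]\big\|_{L^2(L^2(\mathcal{D});\Gamma)}^2
 - \frac{1}{2}\,\|y_h(w) - y_h(u)\|_{L^2(L^2(\mathcal{D});\Gamma)}^2.
\]

Since all norms are convex functions, Jensen's inequality gives $\|\widetilde{\mathbb{E}}[u]\| \le \widetilde{\mathbb{E}}\|u\|$; together with $\widetilde{\mathbb{E}}[\widetilde{\mathbb{E}}[u]] = \widetilde{\mathbb{E}}[u]$, we have

\[
-\gamma\,\big[\widetilde{\mathbb{E}}[y_h(w) - y_h(u)], y_h(w) - y_h(u)\big]_{\xi}
\ge -\Big(\frac{1}{2} + \frac{\gamma}{2}\Big)\,\|y_h(w) - y_h(u)\|_{L^2(L^2(\mathcal{D});\Gamma)}^2. \tag{4.16}
\]

Thus, inserting (4.16) into (4.15), we obtain

\[
[p_h(w) - p_h(u), w - u]_{\xi} \ge \underbrace{\Big(\frac{1}{2} + \frac{\gamma}{2}\Big)\,\|y_h(w) - y_h(u)\|_{L^2(L^2(\mathcal{D});\Gamma)}^2}_{\ge 0}. \tag{4.17}
\]

Hence, (4.14) and (4.17) imply that (4.13) holds.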

Next, we derive estimates between the approximate solutions (yh, ph) and the auxiliary solutions (yh(u), ph(u)).

Lemma 4.2. Let $(y_h, p_h)$ and $(y_h(u), p_h(u))$ be the solutions of (3.18) and (4.12), respectively. Then, there exist positive constants $C_1$ and $C_2$ independent of $h$ such that

\[
\begin{aligned}
\|y_h - y_h(u)\|_{\xi} &\le C_1\,\|u - u_h\|_{L^2(L^2(\mathcal{D});\Gamma)}, && \text{(4.18a)}\\
\|p_h - p_h(u)\|_{\xi} &\le C_2\,\|u - u_h\|_{L^2(L^2(\mathcal{D});\Gamma)}. && \text{(4.18b)}
\end{aligned}
\]

Proof. By subtracting (4.12a) from (3.18a) and taking $v_h = y_h - y_h(u)$, we have

\[
a_{\xi}[y_h - y_h(u), y_h - y_h(u)] = [u_h - u, y_h - y_h(u)]_{\xi}.
\]

With the help of the coercivity of $a_{\xi}$ in (4.2) and the Cauchy–Schwarz inequality, we obtain

\[
c_{cv}\,\|y_h - y_h(u)\|_{\xi}^2 \le a_{\xi}[y_h - y_h(u), y_h - y_h(u)] \le \|u_h - u\|_{L^2(L^2(\mathcal{D});\Gamma)}\,\|y_h - y_h(u)\|_{\xi},
\]

which yields the desired result (4.18a).

Analogously, by subtracting (4.12b) from (3.18b) and taking $q_h = p_h - p_h(u)$, we have

\[
a_{\xi}[p_h - p_h(u), p_h - p_h(u)] = (1+\gamma)\,[y_h - y_h(u), p_h - p_h(u)]_{\xi} + \gamma\,\big[\widetilde{\mathbb{E}}[y_h(u) - y_h], p_h - p_h(u)\big]_{\xi}.
\]

An application of the coercivity of $a_{\xi}$, the Cauchy–Schwarz inequality, and Jensen's inequality yields

\[
c_{cv}\,\|p_h - p_h(u)\|_{\xi}^2 \le a_{\xi}[p_h - p_h(u), p_h - p_h(u)] \le (1 + 2\gamma)\,\|p_h - p_h(u)\|_{L^2(L^2(\mathcal{D});\Gamma)}\,\|y_h - y_h(u)\|_{\xi}. \tag{4.19}
\]

We note that the procedure applied in (4.16) is also used in the derivation of (4.19). Hence, by (4.19) and (4.18a), we deduce the desired result (4.18b).

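In order to find an upper bound for the control that takes into account the active and inactive regions of the control $u$, we divide the domain $\mathcal{D}$ into pieces as done in [50]:

\[
\begin{aligned}
\mathcal{D}^+ &= \Big\{ \bigcup K : K \subset \mathcal{D},\ u_a < u|_K < u_b \Big\}, && \text{(4.20a)}\\
\mathcal{D}^{\partial} &= \Big\{ \bigcup K : K \subset \mathcal{D},\ u|_K = u_a \ \text{or}\ u|_K = u_b \Big\}, && \text{(4.20b)}\\
\mathcal{D}^- &= \mathcal{D} \setminus (\mathcal{D}^+ \cup \mathcal{D}^{\partial}). && \text{(4.20c)}
\end{aligned}
\]

It is assumed that these sets are disjoint and $\mathcal{D} = \mathcal{D}^+ \cup \mathcal{D}^{\partial} \cup \mathcal{D}^-$. The set $\mathcal{D}^-$ consists of the elements which lie close to the free boundary between the active and the inactive sets, and we make the following assumption

\[
\mathrm{meas}(\mathcal{D}^-) \le C h \tag{4.21}
\]

on the regularity of $u$ and $\mathcal{T}_h$. This assumption is valid if the boundary of the level set $\mathcal{D}^{\partial}$ consists of a finite number of rectifiable curves [51]. We also set

\[
\mathcal{D}^{*} = \{x \in \mathcal{D} : u_a < u(x) < u_b\},
\]

which satisfies $\mathcal{D}^+ \subset \mathcal{D}^{*}$ [52].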

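Lemma 4.3. Let $(y, p, u)$ and $(y_h, p_h, u_h)$ be the solutions of (2.18) and (3.18), respectively. Then, we have

\[
\begin{aligned}
\|u - u_h\|_{L^2(L^2(\mathcal{D});\Gamma)}^2
&\le C\,\|p - p_h(u)\|_{L^2(L^2(\mathcal{D});\Gamma)} + C h^{3/2}\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D});\Gamma)}\\
&\quad + C\Bigg( h\,\|p\|_{L^2(H^1(\mathcal{D});\Gamma)}^2 + \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{q_n+1} \frac{\|\partial_{\xi_n}^{q_n+1} p\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}}{(q_n+1)!} \Bigg).
\end{aligned} \tag{4.22}
\]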

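Proof. With the help of (4.11), Lemma 4.1, the standard Lagrangian interpolation $\Pi u$, the assumption $\mathcal{D}^+ \subset \mathcal{D}^{*}$, and the notation $p_h = p_h(u_h)$, we obtain

\[
\begin{aligned}
\mu\,\|u - u_h\|_{L^2(L^2(\mathcal{D});\Gamma)}^2
&\le \mathcal{J}_h'(u)\cdot(u - u_h) - \mathcal{J}_h'(u_h)\cdot(u - u_h)\\
&= [\mu u + p_h(u), u - u_h]_{\xi} - [\mu u_h + p_h, u - u_h]_{\xi}\\
&= \underbrace{[\mu u + p, u - u_h]_{\xi}}_{-\mathcal{J}'(u)\cdot(u_h - u)\,\le\, 0} - [p - p_h(u), u - u_h]_{\xi}\\
&\quad + \underbrace{[\mu u_h + p_h, u_h - \Pi u]_{\xi}}_{-\mathcal{J}_h'(u_h)\cdot(\Pi u - u_h)\,\le\, 0} + [\mu u_h + p_h, \Pi u - u]_{\xi}\\
&\le [\mu u_h + p_h, \Pi u - u]_{\xi} + [p_h(u) - p, u - u_h]_{\xi}.
\end{aligned} \tag{4.23}
\]

The first term in (4.23) can be rewritten as follows:

\[
\begin{aligned}
[\mu u_h + p_h, \Pi u - u]_{\xi}
&= [\mu u_h + p_h - \mu u - p, \Pi u - u]_{\xi} + [\mu u + p, \Pi u - u]_{\xi}\\
&= [\mu u_h - \mu u, \Pi u - u]_{\xi} + [\mu u + p, \Pi u - u]_{\xi}\\
&\quad + [p_h - p_h(u), \Pi u - u]_{\xi} + [p_h(u) - p, \Pi u - u]_{\xi}.
\end{aligned} \tag{4.24}
\]

Then, inserting (4.24) into (4.23) and applying the Cauchy–Schwarz and Young's inequalities and Lemma 4.2, we obtain

\[
\begin{aligned}
\mu\,\|u - u_h\|_{L^2(L^2(\mathcal{D});\Gamma)}^2
&\le c_1\,\|p_h(u) - p\|_{L^2(L^2(\mathcal{D});\Gamma)}^2 + c_2\,\|u - u_h\|_{L^2(L^2(\mathcal{D});\Gamma)}^2\\
&\quad + c_3\,\|u - \Pi u\|_{L^2(L^2(\mathcal{D});\Gamma)}^2 + [\mu u + p, \Pi u - u]_{\xi}.
\end{aligned} \tag{4.25}
\]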

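Assume that $\Pi u$ is the standard Lagrangian interpolation satisfying $\Pi u(x) = u(x)$ for any vertex $x$. Then, $\Pi u$ belongs to $\mathcal{U}_h^{ad}$ and we have

\[
\begin{aligned}
\|u - \Pi u\|_{L^2(L^2(\mathcal{D}^+);\Gamma)} &\le C h^2\,\|u\|_{L^2(H^2(\mathcal{D}^+);\Gamma)}, && \text{(4.26a)}\\
\|u - \Pi u\|_{L^2(W^{0,\infty}(\mathcal{D}^-);\Gamma)} &\le C h\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)} && \text{(4.26b)}
\end{aligned}
\]

for $u \in L^2(W^{1,\infty}(\mathcal{D}); \Gamma)$ and $u|_{\mathcal{D}^{*}} \in L^2(H^2(\mathcal{D}^{*}); \Gamma)$. Hence,

\[
\begin{aligned}
\|u - \Pi u\|_{L^2(L^2(\mathcal{D});\Gamma)}^2
&= \|u - \Pi u\|_{L^2(L^2(\mathcal{D}^+);\Gamma)}^2 + \underbrace{\|u - \Pi u\|_{L^2(L^2(\mathcal{D}^{\partial});\Gamma)}^2}_{=0} + \|u - \Pi u\|_{L^2(L^2(\mathcal{D}^-);\Gamma)}^2\\
&\le \|u - \Pi u\|_{L^2(L^2(\mathcal{D}^+);\Gamma)}^2 + C\,\|u - \Pi u\|_{L^2(W^{0,\infty}(\mathcal{D}^-);\Gamma)}^2\,\mathrm{meas}(\mathcal{D}^-)\\
&\le C h^4\,\|u\|_{L^2(H^2(\mathcal{D}^+);\Gamma)}^2 + C h^3\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}^2\\
&\le C h^3 \Big( h\,\|u\|_{L^2(H^2(\mathcal{D}^+);\Gamma)}^2 + \|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}^2 \Big)\\
&\le C h^3 \Big( \|u\|_{L^2(H^2(\mathcal{D}^{*});\Gamma)}^2 + \|u\|_{L^2(W^{1,\infty}(\mathcal{D});\Gamma)}^2 \Big).
\end{aligned} \tag{4.27}
\]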

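By the variational inequality (3.18c) and the definitions of the domains in (4.20), we have

\[
\mu u + p = 0 \ \text{ on } \mathcal{D}^+ \quad \text{and} \quad \Pi u - u = 0 \ \text{ on } \mathcal{D}^{\partial}.
\]

Then,

\[
\begin{aligned}
[\mu u + p, \Pi u - u]_{\xi} &= [\mu u + p - P_{hn}(\mu u + p), \Pi u - u]_{\mathcal{D}^-}\\
&\le [\mu u - P_{hn}(\mu u), \Pi u - u]_{\mathcal{D}^-} + [p - P_{hn}(p), \Pi u - u]_{\mathcal{D}^-}.
\end{aligned} \tag{4.28}
\]

It follows from the definition of the projection operator in (4.9) and the inequalities (4.6) and (4.26b) that

\[
\begin{aligned}
[\mu u - P_{hn}(\mu u), \Pi u - u]_{\mathcal{D}^-}
&\le [\mu u - \Pi_h(\mu u), \Pi u - u]_{\mathcal{D}^-} + \underbrace{[\Pi_h(I - \Pi_n)(\mu u), \Pi u - u]_{\mathcal{D}^-}}_{=0}\\
&\le \mu\,\|u - \Pi_h u\|_{L^2(L^2(\mathcal{D}^-);\Gamma)}\,\|\Pi u - u\|_{L^2(L^2(\mathcal{D}^-);\Gamma)}\\
&\le C h\,\|u\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}\,\|u - \Pi u\|_{L^2(W^{0,\infty}(\mathcal{D}^-);\Gamma)}\,\mathrm{meas}(\mathcal{D}^-)\\
&\le C h^3\,\|u\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)} \le C h^3\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}^2.
\end{aligned} \tag{4.29}
\]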

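Next, with the help of the bounds in (4.6), (4.26b), and (4.3) and the Cauchy–Schwarz and Young's inequalities, we find a bound for the second term in (4.28):

\[
\begin{aligned}
[p - P_{hn}(p), \Pi u - u]_{\mathcal{D}^-}
&\le [p - \Pi_h(p), \Pi u - u]_{\mathcal{D}^-} + [\Pi_h(I - \Pi_n)(p), \Pi u - u]_{\mathcal{D}^-}\\
&\le C_1 h^3\,\|p\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}\\
&\quad + C_2 \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{q_n+1} \frac{\|\partial_{\xi_n}^{q_n+1} p\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}}{(q_n+1)!}\, h^2\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}\\
&\le C_1 \Big( \frac{h^2}{2}\,\|p\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}^2 + \frac{h^4}{2}\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}^2 \Big)\\
&\quad + C_2 \Bigg( \frac{1}{2} \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{2q_n+2} \frac{\|\partial_{\xi_n}^{q_n+1} p\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}^2}{((q_n+1)!)^2} + \frac{h^4}{2}\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}^2 \Bigg).
\end{aligned} \tag{4.30}
\]

Combining (4.29) and (4.30) yields

\[
\begin{aligned}
[\mu u + p, \Pi u - u]_{\xi} &\le C h^3\,\|u\|_{L^2(W^{1,\infty}(\mathcal{D}^-);\Gamma)}^2 + C h^2\,\|p\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}^2\\
&\quad + C \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{2q_n+2} \frac{\|\partial_{\xi_n}^{q_n+1} p\|_{L^2(H^1(\mathcal{D}^-);\Gamma)}^2}{((q_n+1)!)^2}.
\end{aligned} \tag{4.31}
\]

Finally, inserting (4.27) and (4.31) into (4.25) completes the proof of Lemma 4.3.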

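Lemma 4.4. Let $(y, p)$ and $(y_h(u), p_h(u))$ be the solutions of (2.18) and (4.12), respectively. Then, there is a constant $C$ independent of $h$ such that

\[
\|y - y_h(u)\|_{\xi} \le C h\,\|y\|_{L^2(H^2(\mathcal{D});\Gamma)} + \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{q_n+1} \frac{\|\partial_{\xi_n}^{q_n+1} y\|_{L^2(H^1(\mathcal{D});\Gamma)}}{(q_n+1)!} \tag{4.32}
\]

and

\[
\begin{aligned}
\|p - p_h(u)\|_{\xi} &\le C h \Big( \|y\|_{L^2(H^2(\mathcal{D});\Gamma)} + \|p\|_{L^2(H^2(\mathcal{D});\Gamma)} \Big)\\
&\quad + \sum_{n=1}^{N} \Big(\frac{k_n}{2}\Big)^{q_n+1} \frac{\|\partial_{\xi_n}^{q_n+1} y\|_{L^2(H^1(\mathcal{D});\Gamma)} + \|\partial_{\xi_n}^{q_n+1} p\|_{L^2(H^1(\mathcal{D});\Gamma)}}{(q_n+1)!}.
\end{aligned} \tag{4.33}
\]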