
Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs


doi:10.1017/S0962492911000055 Printed in the United Kingdom


Christoph Schwab and Claude Jeffrey Gittelson

Seminar for Applied Mathematics, ETH Zürich, Rämistrasse 101, CH-8092 Zürich, Switzerland


Partial differential equations (PDEs) with random input data, such as random loadings and coefficients, are reformulated as parametric, deterministic PDEs on parameter spaces of high, possibly infinite dimension. Tensorized operator equations for spatial and temporal k-point correlation functions of their random solutions are derived. Parametric, deterministic PDEs for the laws of the random solutions are derived. Representations of the random solutions’ laws on infinite-dimensional parameter spaces in terms of ‘generalized polynomial chaos’ (GPC) series are established. Recent results on the regularity of solutions of these parametric PDEs are presented. Convergence rates of best N-term approximations, for adaptive stochastic Galerkin and collocation discretizations of the parametric, deterministic PDEs, are established. Sparse tensor products of hierarchical (multi-level) discretizations in physical space (and time), and GPC expansions in parameter space, are shown to converge at rates which are independent of the dimension of the parameter space. A convergence analysis of multi-level Monte Carlo (MLMC) discretizations of PDEs with random coefficients is presented. Sufficient conditions on the random inputs for superiority of sparse tensor discretizations over MLMC discretizations are established for linear elliptic, parabolic and hyperbolic PDEs with random coefficients.

Work partially supported by the European Research Council under grant number ERC AdG 247277-STAHDPDE and by the Swiss National Science Foundation under grant number SNF 200021-120290/1.


CONTENTS

Introduction
1 Operator equations with stochastic data
2 Stochastic Galerkin discretization
3 Optimal convergence rates
4 Sparse tensor discretizations

Appendix
A Review of probability
B Review of Hilbert spaces
C Review of Gaussian measures on Hilbert spaces

References

Introduction

The numerical solution of partial differential equation models in science and engineering has today reached a certain maturity, after several decades of progress in numerical analysis, mathematical modelling and scientific computing. While there certainly remain numerous mathematical and algorithmic challenges, for many ‘routine’ problems of engineering interest, today numerical solution methods exist which are mathematically understood and ‘operational’ in the sense that a number of implementations exist, both academic and commercial, which realize, in the best case, algorithms of provably optimal complexity in a wide range of applications. As a rule, the numerical analysis and the numerical solution methods behind such algorithms suppose that a model of the system of interest is described by a well-posed (in the sense of Hadamard) partial differential equation (PDE), and that the PDE is to be solved numerically to prescribed accuracy for one given set of input data.

With the availability of highly accurate numerical solution algorithms for a PDE of interest and one prescribed set of exact input data (such as source terms, constitutive laws and material parameters) there has been increasing awareness of the limited significance of such single, highly accurate ‘forward’ solves. Assuming, as we will throughout this article, that the PDE model of the physical system of interest is correct, this trend is due to two reasons: randomness and uncertainty of input data and the need for efficient prediction of system responses on high-dimensional parameter spaces.

First, the assumption of availability of exact input data is not realistic: often, the simulation’s input parameters are obtained from measurements or from sampling a large, but finite number of specimens or system snapshots which are incomplete or stochastic. This is of increasing importance in classical engineering disciplines, but even more so in emerging models in the life sciences and social sciences. Rather than efficiently producing accurate answers for single instances of exact input data, increasingly the goal of computation in numerical simulations is to efficiently process statistical information on uncertain input data for the PDE of interest. While mathematical formulations of PDEs with random inputs have been developed with an eye towards uncorrelated, or white noise inputs (see, e.g., Holden, Oksendal, Uboe and Zhang (1996), Da Prato and Zabczyk (1992), Da Prato (2006), Lototsky and Rozovskii (2006), Prévôt and Röckner (2007), Dalang, Khoshnevisan, Mueller, Nualart and Xiao (2009) and the references therein), PDEs with random inputs in numerical simulation in science and engineering are of interest in particular in the case of so-called correlated inputs (or ‘coloured noise’).

Second, in the context of optimization, or of risk and sensitivity analysis for complex systems with random inputs, the interest is in computing the systems’ responses efficiently given dependence on several, possibly countably many parameters, thereby leading to the challenge of numerical simulation of deterministic PDEs on high-dimensional parameter spaces.

Often, the only feasible approach in numerical simulation towards these two problems is to solve the forward problem for many instances, or samples, of the PDE’s input parameters; for random inputs, this amounts to Monte Carlo-type sampling of the noisy inputs, and for parametric PDEs, responses of the system are interpolated from forward solves at judiciously chosen combinations of input parameters.

With the cost of one ‘sample’ being the numerical solution of a PDE, it is immediate that, in particular for transient problems in three spatial dimensions with solutions that exhibit multiple spatial and temporal length scales, the computational cost of uniformly sampling the PDE solution on the parameter space (resp. the probability space) is prohibitive. Responding to this by massive parallelism may alleviate this problem, but ultimately, the low convergence rate 1/2 of Monte Carlo (MC) sampling, respectively the so-called ‘curse of dimensionality’ of standard interpolation schemes in high-dimensional parameter spaces, requires advances at the mathematical core of the numerical PDE solution methods: the development of novel mathematical formulations of PDEs with random inputs, the study of the regularity of their solutions, both with respect to the physical variables and with respect to parameters, and the development of novel discretizations and solution methods of these formulations. Importantly, the parameters may take values in possibly infinite-dimensional parameter spaces: for example, in connection with Karhunen–Loève expansions of spatially inhomogeneous and correlated inputs.

The present article surveys recent contributions to the above questions. Our focus is on linear PDEs with random inputs; we present various formulations, new results on the regularity of their solutions and, based on these regularity results, we design, formulate and analyse discretization schemes which allow one to ‘sweep’ the entire, possibly infinite-dimensional input parameter space approximately in a single computation. We also establish, for the algorithms proposed here, bounds on their efficiency (understood as accuracy versus the number of degrees of freedom) that do not deteriorate with respect to increasing dimension of the computational parameter domain, i.e., that are free from the curse of dimensionality. The algorithms proposed here are variants and refinements of the recently proposed stochastic Galerkin and stochastic collocation discretizations (see, e.g., Xiu (2009) and Matthies and Keese (2005) and the references therein for an account of these developments). We exhibit assumptions on the inputs’ correlations which ensure an efficiency of these algorithms which is superior to that of MC sampling. One insight that emerges from the numerical analysis of recently proposed methods is that the numerical resolution in physical space need not be high uniformly on the entire parameter space. The use of ‘polynomial chaos’ type spectral representations (and their generalizations) of the laws of input and output random fields allows a theory of regularity of the random solutions and, based on this, the optimization of numerical methods for their resolution. Here, we have in mind discretizations in physical space and time as well as in stochastic or parameter space, aiming at achieving a prespecified accuracy with minimal computational work. From this broad view, the recently proposed multi-level Monte Carlo methods can also be interpreted as sparse tensor discretizations. Accordingly, we present in this article an error analysis of single- and multi-level MC methods for elliptic problems with random inputs.

As this article’s title suggests, the notion of sparse tensor products of operators and hierarchical sequences of finite-dimensional subspaces pervades our view of numerical analysis of high-dimensional problems. Sparsity in connection with tensorization has become significant in several areas of scientific computing in recent years: in approximation theory as hyperbolic cross approximations (see, e.g., Temlyakov (1993)) and, in finite element and finite difference discretizations, the so-called sparse grids (see Bungartz and Griebel (2004) and the references therein) are particular instances of this concept. We note in passing that the range of applicability of sparse tensor discretizations extends well beyond stochastic and parametric problems (see, e.g., Schwab (2002), Hoang and Schwab (2004/05) and Schwab and Stevenson (2008) for applications to multiscale problems). On the level of numerical linear algebra, the currently emerging hierarchical low-rank matrix formats, which were inspired by developments in computational chemistry, are closely related to some of the techniques developed here.

The present article extends these concepts in several directions. First, on the level of mathematical formulation of PDEs with random inputs: we present deterministic tensorized operator equations for two- and k-point correlation functions of the random system responses. Such equations also arise in the context of moment closures of kinetic models in atomistic-to-continuum transitions. Discretizations for their efficient, deterministic numerical solution may therefore be of interest in their own right. For the spectral discretizations, we review the polynomial chaos representation of random fields and the Wiener–Itô chaos decomposition of probability spaces and of random fields into tensorized Hermite polynomials of a countable number of Gaussians. The spectral representation of random outputs of PDEs allows for a regularity theory of the laws of random fields which goes substantially beyond the mere existence of moments.

According to the particular application, in this article sparsity in tensor discretizations appears in roughly three forms. First, we use sparse tensor products of multi-level finite element spaces in the physical domain $D \subset \mathbb{R}^d$ to build efficient schemes for the Galerkin approximation of tensorized equations for k-point correlation functions. Second, we consider heterogeneous sparse tensor product discretizations of multi-level finite element, finite volume and finite difference discretizations in the physical domain with hierarchical polynomial chaos bases in the probability space. As we will show, the use of multi-level discretizations in physical space actually leads to substantial efficiency gains in MC methods; nevertheless, the resulting multi-level MC methods are comparable in efficiency to sparse tensor discretizations for random outputs with finite second moments. However, as soon as the outputs have additional summability properties (and the examples presented here suggest that this is so in many cases), adaptive sparse tensor discretizations outperform MLMC methods.

The outline of the article is as follows. We first derive tensorized operator equations for deterministic, linear equations with random data. We establish the well-posedness of these tensorized operator equations, and introduce sparse tensor Galerkin discretizations based on multi-level, wavelet-type finite element spaces in the physical domain. We prove, in particular, stability of sparse tensor discretizations in the case of indefinite operators such as those arising in acoustic or electromagnetic scattering. We also give an error analysis of MC discretizations which indicates the dependence of its convergence rate on the degree of summability of the random solution.

Section 2 is devoted to stochastic Galerkin formulations of PDEs with random coefficients. Using polynomial chaos representations of the random inputs, for example in a Karhunen–Loève expansion, we give a reformulation of the random PDEs of interest as deterministic PDEs which are posed on infinite-dimensional parameter spaces. While the numerical solution of these PDEs with standard tools from numerical analysis is foiled by the curse of dimensionality (the raison d’être for the use of sampling methods on the stochastic formulation), we review recent regularity results for these problems which indicate that sparse, adaptive tensorization of discretizations in probability and physical space can indeed produce solutions whose accuracy, as a function of work, is independent of the dimension of the parameter space. We cover both affine dependence, as is typical in Karhunen–Loève representations of the random inputs, and log-normal dependence in inputs. We focus on Gaussian and on uniform measures, where ‘polynomial chaos’ representations use Hermite and Legendre polynomials, respectively (other probability measures give rise to other polynomial systems: see, e.g., Schoutens (2000) and Xiu and Karniadakis (2002b)). Section 3 addresses the regularity of the random solutions in these polynomial chaos representations by an analysis of the associated parametric, deterministic PDE for their laws. The analysis allows us to deduce best N-term convergence rates of polynomial chaos semidiscretizations of the random solutions’ laws.

Section 4 combines the results from the preceding sections with space and time discretizations in the physical domain. The error analysis of fully discrete algorithms reveals that it is crucial for efficiency that the level of spatial and temporal resolution be allowed to depend on the stochastic mode being discretized. Our analysis shows that, in fact, a highly non-uniform level of resolution in physical space should be adopted in order to achieve algorithms that scale favourably with respect to the dimension of the space of stochastic parameters.

As this article and the subject matter draw on tools from numerical analysis, from functional analysis and from probability theory, we provide some background reference material on the latter two items in the Appendix. This is done in order to fix the notation used in the main body of the text, and to serve as a reference for readers with a numerical analysis background. Naturally, the selection of the background material is biased towards the subject matter of the main text. It does not claim to be a reference on these subjects. For a more thorough introduction to tools from probability and stochastic analysis we refer the reader to Bauer (1996), Da Prato (2006), Da Prato and Zabczyk (1992), Prévôt and Röckner (2007) and the references therein.

1. Sparse tensor FEM for operator equations with stochastic data

For the variational setting of linear operator equations with deterministic, boundedly invertible operators, we assume that $X$, $Y$ are separable Hilbert spaces over $\mathbb{R}$ with duals $X'$ and $Y'$, respectively, and $A \in \mathcal{L}(X, Y')$ a linear, boundedly invertible deterministic operator. We denote its associated bilinear form by

$$a(w, v) := {}_{Y'}\langle Aw, v\rangle_{Y}, \qquad w \in X,\ v \in Y. \tag{1.1}$$

Here, and throughout, for $w \in Y'$ and $v \in Y$ the bilinear form ${}_{Y'}\langle w, v\rangle_{Y}$ denotes the $Y' \times Y$ duality pairing. As is well known (see, e.g., Theorem C.20) the operator $A$ from $X$ onto $Y'$ is boundedly invertible if and only if $a(\cdot,\cdot)$ satisfies the following conditions.

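(i) $a(\cdot,\cdot)$ is continuous: there exists $C_1 < \infty$ such that

$$\forall w \in X,\ v \in Y: \quad |a(w, v)| \le C_1 \|w\|_X \|v\|_Y. \tag{1.2}$$

(ii) $a(\cdot,\cdot)$ is coercive: there exists $C_2 > 0$ such that

$$\inf_{0 \ne w \in X}\ \sup_{0 \ne v \in Y} \frac{a(w, v)}{\|w\|_X \|v\|_Y} \ge C_2 > 0. \tag{1.3}$$

(iii) $a(\cdot,\cdot)$ is injective:

$$\forall\, 0 \ne v \in Y: \quad \sup_{0 \ne w \in X} a(w, v) > 0. \tag{1.4}$$

If (1.2)–(1.4) hold, then for every $f \in Y'$ the linear operator equation

$$u \in X: \quad a(u, v) = {}_{Y'}\langle f, v\rangle_{Y} \quad \forall v \in Y \tag{1.5}$$

admits a unique solution $u \in X$ such that

$$\|u\|_X \le C_2^{-1} \|f\|_{Y'}. \tag{1.6}$$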
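The stability bound (1.6) follows from the inf-sup condition (1.3) in one line. As a brief worked step (a sketch with norms and duality pairings written out explicitly, not a replacement for the proof referenced in Theorem C.20), let $u \in X$ solve (1.5); then

```latex
C_2 \, \|u\|_X
  \;\le\; \sup_{0 \ne v \in Y} \frac{a(u, v)}{\|v\|_Y}
  \;=\; \sup_{0 \ne v \in Y} \frac{{}_{Y'}\langle f, v \rangle_{Y}}{\|v\|_Y}
  \;=\; \|f\|_{Y'} ,
```

and dividing by $C_2$ gives (1.6). The same estimate applied to $f = 0$ shows that the solution is unique.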

We consider equation (1.5) with stochastic data: to this end, let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and let $f : \Omega \to Y'$ be a random field, i.e., a measurable map from $(\Omega, \mathcal{F}, \mathbb{P})$ into $Y'$ which is Gaussian (see Appendix C for the definition of Gaussian random fields). Analogous to the characterization of Gaussian random variables by their mean and their (co)variance, a Gaussian random field $f \in L^2(\Omega, \mathcal{F}, \mathbb{P}; Y')$ is characterized by its mean $a_f \in Y'$ and its covariance operator $Q_f \in \mathcal{L}_1^+(Y')$.

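We consider the following linear operator equation with Gaussian data: given $f \in L^2(\Omega, \mathcal{F}, \mathbb{P}; Y')$, find $u \in L^2(\Omega, \mathcal{F}, \mathbb{P}; X)$ such that

$$Au = f \quad \text{in } L^2(\Omega, \mathcal{F}, \mathbb{P}; Y'). \tag{1.7}$$

This problem admits a unique solution $u \in L^2(\Omega, \mathcal{F}, \mathbb{P}; X)$ if and only if the bilinear form $a(\cdot,\cdot)$ of $A$ satisfies (1.2)–(1.4).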

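By Theorem C.31, the unique random solution $u \in L^2(\Omega, \mathcal{F}, \mathbb{P}; X)$ of (1.7) is Gaussian with associated Gaussian measure $N_{a_u, Q_u}$ on $X$ which, in turn, is characterized by the solution’s mean,

$$a_u = \operatorname{mean}(u) = A^{-1} a_f, \tag{1.8}$$

and the solution’s covariance operator $Q_u \in \mathcal{L}_1^+(X)$, which satisfies the (deterministic) equation

$$A Q_u A^* = Q_f, \tag{1.9}$$

with $A^*$ the adjoint of $A$.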

In the Gaussian case, therefore, solving the stochastic problem (1.7) can be reduced to solving the two deterministic problems (1.8) and (1.9). Whereas the mean-field problem (1.8) is one instance of the operator equation (1.7), the covariance equation (1.9) is an equation for the operator $Q_u \in \mathcal{L}_1^+(X)$. As we show in Theorem C.31, this operator is characterized by the so-called covariance kernel $C_u$, which satisfies, in terms of the corresponding covariance kernel $C_f$ of the data, the covariance equation (see (C.50))

$$(A \otimes A)\, C_u = C_f, \tag{1.10}$$

which is understood to hold in the sense of $(Y \otimes Y)' \simeq Y' \otimes Y'$. One approach to the numerical treatment of operator equations $Au = f$, where the data $f$ are random fields, i.e., measurable maps from a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ into the set $Y'$ of admissible data for the operator $A$, is via tensorized equations such as (1.10) for their statistical moments.
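In finite dimensions the covariance equation (1.10) becomes a Kronecker-product system, which can be checked directly. The following is a minimal sketch under that assumption: the 2x2 matrix `A` and the data covariance `C_f` are hypothetical illustration values, and exact rational arithmetic is used so that the residual vanishes exactly. The solution covariance is obtained by applying the inverse once per tensor factor, and the result is then verified against the tensorized equation.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def kron(P, Q):
    """Kronecker product: entry ((i, j), (k, l)) equals P[i][k] * Q[j][l]."""
    return [[P[i][k] * Q[j][l] for k in range(len(P[0])) for l in range(len(Q[0]))]
            for i in range(len(P)) for j in range(len(Q))]

def vec(C):
    """Row-major flattening, matching the index convention used in kron."""
    return [entry for row in C for entry in row]

# Hypothetical data: an invertible 2x2 "operator" A and a data covariance C_f.
A = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]
A_inv = [[Fr(3, 5), Fr(-1, 5)], [Fr(-1, 5), Fr(2, 5)]]
C_f = [[Fr(4), Fr(1)], [Fr(1), Fr(2)]]

# Covariance of u = A^{-1} f: apply the inverse in each of the two tensor factors.
C_u = matmul(matmul(A_inv, C_f), transpose(A_inv))

# Check the tensorized equation kron(A, A) vec(C_u) = vec(C_f).
AA = kron(A, A)
c_u = vec(C_u)
lhs = [sum(AA[r][c] * c_u[c] for c in range(4)) for r in range(4)]
residual_is_zero = lhs == vec(C_f)
print(residual_is_zero)
```

Forming `kron(A, A)` explicitly is only sensible for such toy sizes; the point of the check is that the unknown of (1.10) has as many entries as the square of the mean-field problem size.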

The simplest approach to the numerical solution of the linear operator equation $Au = f$ with random input $f$ is Monte Carlo (MC) simulation, i.e., generating a large number $M$ of i.i.d. data samples $f_j$ and solving, possibly in parallel, for the corresponding solution ensemble $\{u_j = A^{-1} f_j;\ j = 1, \dots, M\}$. Statistical moments and probabilities of the random solution $u$ are then estimated from $\{u_j\}$. As we will prove, convergence of the MC method as the number $M$ of samples increases is ensured (for suitable sampling) by the central limit theorem. We shall see that the MC method allows in general only the convergence rate $O(M^{-1/2})$.
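The rate $O(M^{-1/2})$ can be observed on a toy problem. The sketch below is an illustration under stated assumptions, not the estimator analysed later in the text: the operator is a hypothetical 2x2 matrix, the data are independent Gaussians with assumed mean and standard deviation, and the root-mean-square error of the MC mean is estimated over repeated runs. Increasing the sample size by a factor of 16 should reduce the error by a factor of roughly 4.

```python
import math
import random

def solve2(A, f):
    """Solve the 2x2 system A u = f by Cramer's rule."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d * f[0] - b * f[1]) / det, (a * f[1] - c * f[0]) / det)

def mc_mean(A, mean_f, sigma, M, rng):
    """Monte Carlo estimate of E[u], u = A^{-1} f, from M i.i.d. samples of f."""
    acc = [0.0, 0.0]
    for _ in range(M):
        f = (rng.gauss(mean_f[0], sigma), rng.gauss(mean_f[1], sigma))
        u = solve2(A, f)
        acc[0] += u[0]
        acc[1] += u[1]
    return (acc[0] / M, acc[1] / M)

def rms_error(A, mean_f, sigma, M, reps, rng):
    """Root-mean-square error of the MC mean over `reps` independent runs."""
    exact = solve2(A, mean_f)  # E[u] = A^{-1} E[f] by linearity
    total = 0.0
    for _ in range(reps):
        est = mc_mean(A, mean_f, sigma, M, rng)
        total += (est[0] - exact[0]) ** 2 + (est[1] - exact[1]) ** 2
    return math.sqrt(total / reps)

rng = random.Random(0)            # fixed seed for reproducibility
A = ((2.0, 1.0), (1.0, 3.0))      # hypothetical operator
mean_f = (1.0, -1.0)              # assumed mean of the data
err_small = rms_error(A, mean_f, 1.0, 100, 200, rng)
err_large = rms_error(A, mean_f, 1.0, 1600, 200, rng)
ratio = err_small / err_large     # expected to be close to sqrt(1600 / 100) = 4
print(err_small, err_large, ratio)
```

Each sample costs one solve with `A`, which is why the slow rate matters when a single solve is itself a PDE solution.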

If statistical moments, i.e., mean-field and higher-order moments of the random solution $u$, are of interest, one can exploit the linearity of the equation $Au = f$ to derive a deterministic equation for the $k$th moment of the random solution, similar to the second-moment equation (1.10); this derivation is done in Section 1.1. For the Laplace equation with stochastic data, this approach is due to I. Babuška (1961). We then address the numerical computation of the moments of the solution by either Monte Carlo or by direct, deterministic finite element computation. If the physical problem is posed in a domain $D \subset \mathbb{R}^d$, the $k$th moment of the random solution is defined in the domain $D^k \subset \mathbb{R}^{kd}$; standard finite element (FE) approximations will therefore be inadequate for the efficient numerical approximation of the $k$th moments of the random solution.

The deterministic equation and its efficient FE approximation were investigated in Schwab and Todor (2003a, 2003b) in the case where $A$ is an elliptic partial differential operator. It was shown that the $k$th moment of the solution could be computed in a complexity comparable to that of an FE solution for the mean-field problem by the use of sparse tensor products of standard FE spaces for which a hierarchical basis is available. The use of sparse tensor product approximations is a well-known device in high-dimensional numerical integration going back to Smolyak (1963), in multivariate approximation (Temlyakov 1993), and in complexity theory; see Wasilkowski and Woźniakowski (1995) and the references therein.
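The saving offered by sparse tensor products can be made concrete by counting degrees of freedom. The sketch below assumes a hypothetical dyadic hierarchy in one dimension, with detail spaces of dimension 1 on level 0 and $2^{l-1}$ on level $l \ge 1$, so that the level-$L$ space has $N_L = 2^L$ degrees of freedom; these sizes are an illustration choice and not those of a specific finite element family. The full tensor product keeps all level combinations, whereas the sparse tensor product keeps only those with $l_1 + \cdots + l_k \le L$.

```python
from itertools import product

def increment_dim(level):
    """Assumed size of the detail space W_level: dim W_0 = 1, dim W_l = 2**(l-1)."""
    return 1 if level == 0 else 2 ** (level - 1)

def full_dim(k, L):
    """Dimension of the full k-fold tensor product of the level-L space."""
    return sum(increment_dim(l) for l in range(L + 1)) ** k

def sparse_dim(k, L):
    """Dimension of the sparse tensor product: levels with l_1 + ... + l_k <= L."""
    total = 0
    for levels in product(range(L + 1), repeat=k):
        if sum(levels) <= L:
            dim = 1
            for l in levels:
                dim *= increment_dim(l)
            total += dim
    return total

# Columns: level L, N_L, full second-moment space, sparse second-moment space.
for L in (3, 6, 9):
    print(L, full_dim(1, L), full_dim(2, L), sparse_dim(2, L))
```

For $k = 2$ and this hierarchy the sparse count equals $N_L (1 + L/2)$, that is, $N_L$ times a logarithmic factor, compared with $N_L^2$ for the full tensor product.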

In the present section, we address the case when $A$ is a non-local operator, such as a strongly elliptic pseudodifferential operator, as arises in the boundary reduction of boundary value problems for strongly elliptic partial differential equations. In this case, efficient numerical solution methods require, in addition to Galerkin discretizations of the operator equation, some form of matrix compression (such as the fast multipole method or wavelet-based matrix compression) which introduces additional errors into the Galerkin solution that will also affect the accuracy of second and higher moments. We briefly present the numerical analysis of the impact of matrix compressions on the efficient computation of second and higher moments of the random solution. Therefore, the present section will also apply to strongly elliptic boundary integral equations obtained by reduction to the boundary manifold $D = \partial\mathcal{D}$ of elliptic boundary value problems in a bounded domain $\mathcal{D} \subset \mathbb{R}^{d+1}$, as is frequently done in acoustic and electromagnetic scattering.

For such problems with stochastic data, the boundary integral formulation leads to an operator equation $Au = f$, where $A$ is an integral operator or, more generally, a pseudodifferential operator acting on function spaces on $\partial\mathcal{D}$. The linearity of the operator equation allows, without any closure hypothesis, formulation of a deterministic tensor equation for the $k$-point correlation function of the random solution $u = A^{-1} f$. We show that, as in the case of differential operators, sparse tensor products of standard FE spaces allow deterministic approximation of the $k$th moment of the random solution $u$ with relatively few degrees of freedom. To achieve computational complexity which scales log-linearly in the number of degrees of freedom in a Galerkin discretization of the mean-field problem, however, the Galerkin matrix for the operator $A$ must be compressed.

Accordingly, one purpose of this section is the design and numerical analysis of deterministic and stochastic solution algorithms to obtain the $k$th moment of the random solution of possibly non-local operator equations with random data in log-linear complexity in the number $N$ of degrees of freedom for the mean-field problem.

We illustrate the sparse tensor product Galerkin methods for the numerical solution of Dirichlet and Neumann problems for the Laplace or Helmholtz equation with stochastic data. Using a wavelet Galerkin finite element discretization allows straightforward construction of sparse tensor products of the trial spaces, and yields well-conditioned, sparse representations of stiffness matrices for the operator $A$ as well as for its $k$-fold tensor product, which is the operator arising in the $k$th-moment problem.

We analyse the impact of the operator compression on the accuracy of functionals of the Galerkin solution, such as far-field evaluations of the random potential in a point. For example, means and variances of the potential in a point can be computed with accuracy $O(N^{-p})$ for any fixed order $p$, for random boundary data with known second moments in $O(N)$ complexity, where $N$ denotes the number of degrees of freedom on the boundary.

The outline of this section is as follows. In Section 1.1, we describe the operator equations considered here and derive the deterministic problems for the higher moments, generalizing Schwab and Todor (2003b). We establish the Fredholm property for the tensor product operator and regularity estimates for the statistical moments in anisotropic Sobolev spaces with mixed highest derivative. Section 1.2 addresses the numerical solution of the moment equations, in particular the impact of various matrix compressions on the accuracy of the approximated moments, the preconditioning of the product operator and the solution algorithm. In Section 1.4, we discuss the implementation of the sparse Galerkin and sparse MC methods and estimate their asymptotic complexity. Section 1.5 contains some examples from finite and boundary element methods.

1.1. Operator equations with stochastic data

Linear operator equations

We specialize the general setting (1.1) to the case $X = Y = V$, and consider the operator equation

$$Au = f, \tag{1.11}$$

where $A$ is a bounded linear operator from the separable Hilbert space $V$ into its dual $V'$.

The operator $A$ is a differential or pseudodifferential operator of order $\varrho$ on a bounded $d$-dimensional manifold $D$, which may be closed or have a boundary. Here, for a closed manifold and for $s \ge 0$, $\tilde{H}^s(D) := H^s(D)$ denotes the usual Sobolev space. For $s < 0$, we define the spaces $H^s(D)$ and $\tilde{H}^s(D)$ by duality. For a manifold $D$ with boundary we assume that this manifold can be extended to a closed manifold $\tilde{D}$, and define

$$\tilde{H}^s(D) := \{\, u|_D \;;\; u \in H^s(\tilde{D}),\ u|_{\tilde{D} \setminus D} = 0 \,\}$$

with the induced norm. If $D$ is a bounded domain in $\mathbb{R}^d$ we use $\tilde{D} := \mathbb{R}^d$. We now assume that $V = \tilde{H}^{\varrho/2}(D)$. In the case when $A$ is a second-order differential operator, this means that we have Dirichlet boundary conditions (other boundary conditions can be treated in an analogous way).

The manifold $D$ may be smooth, but we also consider the case when $D$ is a polyhedron in $\mathbb{R}^d$, or the boundary of a polyhedron in $\mathbb{R}^{d+1}$, or part of the boundary of a polyhedron.

For the deterministic operator $A$ in (1.11), we assume strong ellipticity in the sense that there exist $\alpha > 0$ and a compact operator $T : V \to V'$ such that the Gårding inequality

$$\forall v \in V: \quad \langle (A + T)v, v \rangle \ge \alpha \|v\|_V^2 \tag{1.12}$$

holds. For the deterministic algorithm in Section 1.4 we need the slightly stronger assumption that $T$ is smoothing with respect to a scale of smoothness spaces (see (1.63) below). Here and in what follows, $\langle \cdot, \cdot \rangle$ denotes the $V' \times V$ duality pairing. We assume also that $A$ is injective, i.e., that

$$\ker A = \{0\}, \tag{1.13}$$

which implies that for every $f \in V'$, (1.11) admits a unique solution $u \in V$ and, moreover, that $A^{-1} : V' \to V$ is continuous, i.e., there exists $C_A > 0$ such that, for all $f \in V'$,

$$\|u\|_V = \|A^{-1} f\|_V \le C_A \|f\|_{V'}. \tag{1.14}$$

Here $C_A = C_2^{-1}$ with the constant $C_2$ as in (1.3). We shall consider (1.11) in particular for data $f$, which are Gaussian random fields on the data space $V'$. By the linearity of the operator equation (1.11), the solution $u \in V$ is then a Gaussian random field as well. Throughout, we assume that $V$ and $V'$ are separable Hilbert spaces.

Random data

A Gaussian random field $f$ with values in a separable Hilbert space $X$ is a mapping $f : \Omega \to X$ which maps events $E \in \Sigma$ to Borel sets in $X$, and such that the image measure $f_{\#}\mathbb{P}$ on $X$ is Gaussian. In the following, we allow more general random fields. Of particular interest will be their summability properties. We say that a random field $u : \Omega \to X$ is in the Bochner space $L^1(\Omega; X)$ if $\omega \mapsto \|u(\omega)\|_X$ is measurable and integrable so that $\|u\|_{L^1(\Omega;X)} := \int_\Omega \|u(\omega)\|_X \,\mathbb{P}(d\omega)$ is finite. In particular, then the ‘ensemble average’

$$\mathbb{E}u := \int_\Omega u(\omega)\,\mathbb{P}(d\omega) \in X$$

exists as a Bochner integral of $X$-valued functions, and it satisfies

$$\|\mathbb{E}u\|_X \le \|u\|_{L^1(\Omega;X)}. \tag{1.15}$$

Let $k \ge 1$. We say that a random field $u : \Omega \to X$ is in the Bochner space $L^k(\Omega; X)$ if $\|u\|^k_{L^k(\Omega;X)} = \int_\Omega \|u(\omega)\|_X^k \,\mathbb{P}(d\omega)$ is finite. Note that $\omega \mapsto \|u(\omega)\|_X^k$ is measurable due to the measurability of $u$ and the continuity of the norm $\|\cdot\|_X$ on $X$. Also, $L^k(\Omega; X) \supset L^l(\Omega; X)$ for $k < l$.

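Let $B \in \mathcal{L}(X, Y)$ denote a continuous linear mapping from $X$ to another separable Hilbert space $Y$. For a random field $u \in L^k(\Omega; X)$, this mapping defines a random variable $v(\omega) = Bu(\omega)$ taking values in $Y$. Moreover, $v \in L^k(\Omega; Y)$ and we have

$$\|Bu\|_{L^k(\Omega;Y)} \le C \|u\|_{L^k(\Omega;X)}, \tag{1.16}$$

where the constant $C$ is given by $C = \|B\|_{\mathcal{L}(X,Y)}$. In addition, we have

$$B \int_\Omega u \,\mathbb{P}(d\omega) = \int_\Omega Bu \,\mathbb{P}(d\omega). \tag{1.17}$$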

MC estimation of statistical moments

We are interested in statistics of the random solution $u$ of (1.11) and, in particular, in statistical moments. To define them, for a separable Hilbert space $X$ and for any $k \in \mathbb{N}$ we define the $k$-fold tensor product space

$$X^{(k)} = \underbrace{X \otimes \cdots \otimes X}_{k\ \text{times}},$$

and equip it with the natural cross-norm $\|\cdot\|_{X^{(k)}}$. The significance of a cross-norm was emphasized by Schatten. The cross-norm has the property that, for every $u_1, \dots, u_k \in X$,

$$\|u_1 \otimes \cdots \otimes u_k\|_{X^{(k)}} = \|u_1\|_X \cdots \|u_k\|_X \tag{1.18}$$

(see Light and Cheney (1985) and the references therein for more on cross-norms on tensor product spaces). The $k$-fold tensor products of, for example, $X'$ are denoted analogously by $(X')^{(k)}$. For $u \in L^k(\Omega; X)$ we now consider the random field $u^{(k)}$ defined by $u(\omega) \otimes \cdots \otimes u(\omega)$. By Lemma C.9, $u^{(k)} = u \otimes \cdots \otimes u \in L^1(\Omega; X^{(k)})$, and we have the isometry

$$\begin{aligned}
\|u^{(k)}\|_{L^1(\Omega;X^{(k)})} &= \int_\Omega \|u(\omega) \otimes \cdots \otimes u(\omega)\|_{X^{(k)}} \,\mathbb{P}(d\omega) \\
&= \int_\Omega \|u(\omega)\|_X \cdots \|u(\omega)\|_X \,\mathbb{P}(d\omega) = \|u\|^k_{L^k(\Omega;X)}.
\end{aligned} \tag{1.19}$$
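The cross-norm property (1.18), on which the isometry (1.19) rests, is easy to check for Euclidean spaces. The sketch below assumes $X = \mathbb{R}^3$ with the Euclidean norm and three hypothetical vectors; the tensor product is stored as the flat list of all coefficient products, and its norm is the Euclidean norm of that list.

```python
import math

def outer(tensor, vector):
    """Tensor product of a flattened tensor with one more vector factor."""
    return [t * x for t in tensor for x in vector]

def norm(values):
    """Euclidean norm of a flat list of coefficients."""
    return math.sqrt(sum(x * x for x in values))

# Three hypothetical vectors in X = R^3 with norms 3, 7 and 5.
factors = [[1.0, 2.0, 2.0], [2.0, 3.0, 6.0], [0.0, 3.0, 4.0]]

tensor = [1.0]
for u in factors:
    tensor = outer(tensor, u)  # coefficients of the tensor product built so far

lhs = norm(tensor)                         # norm in the tensor product space
rhs = math.prod(norm(u) for u in factors)  # product of the factor norms
print(len(tensor), lhs, rhs)
```

The tensor has $3^3 = 27$ coefficients, while its norm is determined by the three factor norms alone.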

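We define the moment $\mathcal{M}^k u$ as the expectation of $u \otimes \cdots \otimes u$.

Definition 1.1. For $u \in L^k(\Omega; X)$, for some integer $k \ge 1$, the $k$th moment of $u(\omega)$ is defined by

$$\mathcal{M}^k u = \mathbb{E}\big[\underbrace{u \otimes \cdots \otimes u}_{k\ \text{times}}\big] = \int_{\omega \in \Omega} \underbrace{u(\omega) \otimes \cdots \otimes u(\omega)}_{k\ \text{times}} \,\mathbb{P}(d\omega) \in X^{(k)}. \tag{1.20}$$

Note that (1.15) and (1.18) give, with Jensen’s inequality and the convexity of the norm $\|\cdot\|_X : X \to \mathbb{R}$, the bound

$$\|\mathcal{M}^k u\|_{X^{(k)}} = \|\mathbb{E}u^{(k)}\|_{X^{(k)}} \le \mathbb{E}\|u^{(k)}\|_{X^{(k)}} = \mathbb{E}\|u\|_X^k = \|u\|^k_{L^k(\Omega;X)}. \tag{1.21}$$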

Deterministic equation for statistical moments

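We now consider the operator equation $Au = f$, where $f \in L^k(\Omega; V')$ is a given random field. It follows from (1.16), (1.14) and (1.21) that $u \in L^k(\Omega; V)$, and that we have the a priori estimate

$$\|\mathcal{M}^k u\|_{V^{(k)}} \le \|u\|^k_{L^k(\Omega;V)} \le C_A^k \|f\|^k_{L^k(\Omega;V')}. \tag{1.22}$$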

Remark 1.2. One example of a probability measure $\mathbb{P}$ on $X$ is a Gaussian measure; we refer to, e.g., Vakhania, Tarieladze and Chobanyan (1987) and Ledoux and Talagrand (1991) for general probability measures over Banach spaces $X$ and, in particular, to Bogachev (1998) and Janson (1997) for a general exposition of Gaussian measures on function spaces.

Since $A^{-1} : V' \to V$ in (1.11) is bijective, by (1.12) and (1.13), it induces a measure $\tilde{\mathbb{P}} := A^{-1}_{\#}\mathbb{P}$ on the space $V$ of solutions to (1.11). If $\mathbb{P}$ is Gaussian over $V'$ and $A$ in (1.11) is linear, then $\tilde{\mathbb{P}}$ is Gaussian over $V$ by Theorem C.18. We recall that a Gaussian measure is completely determined by its mean and covariance, and hence only $\mathcal{M}^k u$ for $k = 1, 2$ are of interest in this case.

We now consider the tensor product operator $A^{(k)} = A \otimes \cdots \otimes A$ ($k$ times). This operator maps $V^{(k)}$ to $(V')^{(k)}$. For $v \in V$ and $g := Av$, we obtain that $A^{(k)}\, v \otimes \cdots \otimes v = g \otimes \cdots \otimes g$. Consider a random field $u \in L^k(\Omega; V)$ and let $f := Au \in L^k(\Omega; V')$. Then the tensor product $u^{(k)} = u \otimes \cdots \otimes u$ ($k$ times) belongs to the space $L^1(\Omega; V^{(k)})$, and we obtain from (1.17) with $B = A^{(k)}$ that the $k$-point correlations $u^{(k)}$ satisfy $\mathbb{P}$-a.s. the tensor equation

$$A^{(k)} u^{(k)} = f^{(k)},$$

where $f^{(k)} \in L^1(\Omega; (V')^{(k)})$. Now (1.17) implies for linear and deterministic operators $A$ that the $k$-point correlation functions of the random solutions, i.e., the expectations $\mathcal{M}^k u = \mathbb{E}[u^{(k)}]$, are solutions of the tensorized equations

$$A^{(k)} \mathcal{M}^k u = \mathcal{M}^k f. \tag{1.23}$$

In the case $k = 1$ this is just the equation $A\,\mathbb{E}u = \mathbb{E}f$ for the mean field. Note that this equation provides a way to compute the moments $\mathcal{M}^k u$ of the random solution in a deterministic fashion, for example by Galerkin discretization. As mentioned before, with the operator $A$ acting on function spaces $X$, $Y$ in the domain $D \subset \mathbb{R}^d$, the tensor equation (1.23) will require discretization in $D^k$, the $k$-fold Cartesian product of $D$ with itself. Using tensor products of, for instance, finite element spaces in $D$, we find for $k > 1$ a reduction of efficiency in terms of accuracy versus number of degrees of freedom due to the ‘curse of dimensionality’. This mandates sparse tensor product constructions.
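Because the tensor equation holds sample by sample, the moment equation (1.23) is satisfied exactly by empirical moments as well, up to rounding. The sketch below illustrates this for $k = 3$ in a finite-dimensional setting: the 2x2 matrix `A`, the sampling distribution and the helper names `moment` and `apply_tensor_power` are hypothetical choices made for the example. As in the text, samples of $u$ are drawn and $f := Au$ is formed; the tensor power of $A$ is applied one mode at a time, without assembling a Kronecker matrix.

```python
import random
from itertools import product

def matvec(A, x):
    """Apply the matrix A (list of rows) to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def moment(samples, k):
    """Empirical k-th moment: sample mean of the k-fold tensor power, keyed by index tuples."""
    n = len(samples[0])
    result = {}
    for idx in product(range(n), repeat=k):
        total = 0.0
        for x in samples:
            term = 1.0
            for i in idx:
                term *= x[i]
            total += term
        result[idx] = total / len(samples)
    return result

def apply_tensor_power(A, T, k):
    """Apply the k-fold tensor power of A to the tensor T, one mode at a time."""
    n = len(A)
    for mode in range(k):
        T = {idx: sum(A[idx[mode]][j] * T[idx[:mode] + (j,) + idx[mode + 1:]]
                      for j in range(n))
             for idx in product(range(n), repeat=k)}
    return T

rng = random.Random(1)
A = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical stand-in for the operator A
k = 3
u_samples = [[rng.gauss(0.0, 1.0), rng.gauss(1.0, 0.5)] for _ in range(50)]
f_samples = [matvec(A, u) for u in u_samples]  # f := A u, sample by sample

lhs = apply_tensor_power(A, moment(u_samples, k), k)  # tensor power of A applied to M^k u
rhs = moment(f_samples, k)                            # M^k f
max_residual = max(abs(lhs[idx] - rhs[idx]) for idx in rhs)
print(max_residual)
```

No closure assumption enters: the identity uses only the linearity of `A`. The number of tensor entries is $n^k$, which is the growth that sparse tensor product constructions are designed to avoid.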

We will investigate the numerical approximation of the tensor equation (1.23) in Section 1.4. The direct approximation of (1.23) by, for example, Galerkin discretization is an alternative to the Monte Carlo approximation of the moments which will be considered in Section 1.3.
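The deterministic moment equation (1.23) can be illustrated in finite dimensions. The following is a minimal sketch for $k = 2$, assuming a small invertible matrix as a hypothetical stand-in for the operator $A$ and an empirical law for $f$ supported on finitely many equally likely samples; none of the names below come from the text. With row-major vectorization, $A \otimes A$ applied to $\mathrm{vec}(Z)$ equals $\mathrm{vec}(A Z A^\top)$, so solving one Kronecker system reproduces the second moment of $u$ without forming any $u$-sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_samples = 4, 50

# Hypothetical stand-in for a discretized linear operator A (invertible by construction).
A = np.eye(n) * 4.0 + rng.standard_normal((n, n)) * 0.3

# Empirical law: the data f takes the rows of F with equal probability.
F = rng.standard_normal((n_samples, n))
U = np.linalg.solve(A, F.T).T          # u_j = A^{-1} f_j for each sample

# Second moments M^2 f = E[f (x) f] and M^2 u = E[u (x) u], stored as n x n matrices.
M2f = F.T @ F / n_samples
M2u = U.T @ U / n_samples

# Deterministic route: solve (A (x) A) vec(Z) = vec(M^2 f) without using the samples of u.
Z = np.linalg.solve(np.kron(A, A), M2f.reshape(-1)).reshape(n, n)
residual = np.max(np.abs(Z - M2u))
```

The dense Kronecker matrix has $n^2 \times n^2$ entries, which is the finite-dimensional face of the cost growth that motivates sparse tensor constructions.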

In the deterministic approach, explicit knowledge of all joint probability densities of $f$ (i.e., the law of $f$) with respect to the probability measure $\mathbb{P}$ is not required to determine the order-$k$ statistics of the random solution $u$ from order-$k$ statistics of $f$.

Remark 1.3. For nonlinear operator equations, associated systems of moment equations require a closure hypothesis, which must be additionally imposed and verified. For the linear operator equation (1.11), however, a closure hypothesis is not necessary, as (1.23) holds.

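For solvability of (1.23), we consider the tensor product operator $A_1 \otimes A_2 \otimes \cdots \otimes A_k$ for operators $A_i \in \mathcal{L}(V_i, V_i')$, $i = 1, \dots, k$.

Proposition 1.4. For integer $k > 1$, let $V_i$, $i = 1, \dots, k$, be Hilbert spaces with duals $V_i'$, and let $A_i \in \mathcal{L}(V_i, V_i')$ be injective and satisfy a Gårding inequality, i.e., there are compact $T_i \in \mathcal{L}(V_i, V_i')$ and $\alpha_i > 0$ such that

$$\forall v \in V_i: \quad \langle (A_i + T_i)v, v \rangle \ge \alpha_i \|v\|_{V_i}^2, \tag{1.24}$$

where $\langle \cdot, \cdot \rangle$ denotes the $V_i' \times V_i$ duality pairing.

Then the product operator $\mathbf{A} = A_1 \otimes A_2 \otimes \cdots \otimes A_k \in \mathcal{L}(\mathbf{V}, \mathbf{V}')$, where $\mathbf{V} = V_1 \otimes V_2 \otimes \cdots \otimes V_k$ and $\mathbf{V}' = (V_1 \otimes V_2 \otimes \cdots \otimes V_k)' = V_1' \otimes V_2' \otimes \cdots \otimes V_k'$, is injective, and for every $f \in \mathbf{V}'$, the problem $\mathbf{A}u = f$ admits a unique solution $u$ with

$$\|u\|_{\mathbf{V}} \le C \|f\|_{\mathbf{V}'}.$$

Proof. The injectivity and the Gårding inequality (1.24) imply the bounded invertibility of $A_i$ for each $i$. This implies the bounded invertibility of $\mathbf{A}$ on $\mathbf{V} \to \mathbf{V}'$ since we can write

$$\mathbf{A} = (A_1 \otimes I^{(k-1)}) \circ (I \otimes A_2 \otimes I^{(k-2)}) \circ \cdots \circ (I^{(k-1)} \otimes A_k),$$

where $I^{(j)}$ denotes the $j$-fold tensor product of the identity operator on the appropriate $V_i$. Note that each factor in the composition is invertible.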
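The factorization used in the proof has a direct matrix analogue via Kronecker products. The sketch below assumes three small invertible matrices as hypothetical stand-ins for $A_1, A_2, A_3$ (their sizes and entries are illustrative only) and checks that the composition of the single-slot factors reproduces the full tensor product operator, and that its inverse is the tensor product of the inverses.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical invertible matrices standing in for A_1, A_2, A_3 (sizes 2, 3, 2).
sizes = [2, 3, 2]
mats = [np.eye(m) * 3.0 + 0.2 * rng.standard_normal((m, m)) for m in sizes]

def kron_all(factors):
    """Kronecker product of a list of matrices, left to right."""
    out = np.eye(1)
    for f in factors:
        out = np.kron(out, f)
    return out

# Full tensor product operator A = A_1 (x) A_2 (x) A_3.
A_full = kron_all(mats)

# Composition of factors that each act in one slot and as the identity elsewhere.
A_comp = np.eye(A_full.shape[0])
for i, Ai in enumerate(mats):
    slots = [np.eye(m) for m in sizes]
    slots[i] = Ai
    A_comp = A_comp @ kron_all(slots)

# The inverse is the tensor product of the inverses.
A_inv = kron_all([np.linalg.inv(Ai) for Ai in mats])
```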

To apply this result to (1.23), we require the special case

$$A^{(k)} := \underbrace{A \otimes A \otimes \cdots \otimes A}_{k \text{ times}} \in \mathcal{L}(V^{(k)}, (V')^{(k)}) = \mathcal{L}(V^{(k)}, (V^{(k)})'). \tag{1.25}$$

Theorem 1.5. If $A$ in (1.11) satisfies (1.12) and (1.13), then for every $k > 1$ the operator $A^{(k)} \in \mathcal{L}(V^{(k)}, (V')^{(k)})$ is injective on $V^{(k)}$, and for every $f \in L^k(\Omega; V')$, the equation

$$A^{(k)} Z = \mathcal{M}^k f \tag{1.26}$$

has a unique solution $Z \in V^{(k)}$. This solution coincides with the $k$th moment $\mathcal{M}^k u$ of the random field in (1.20):

$$Z = \mathcal{M}^k u.$$

Proof. By (1.21), the assumption $f \in L^k(\Omega; V')$ ensures that $\mathcal{M}^k f \in (V')^{(k)}$. The unique solvability of (1.26) follows immediately from Proposition 1.4 and the assumptions (1.12) and (1.13). The identity $Z = \mathcal{M}^k u$ follows from (1.23) and the uniqueness of the solution of (1.26).

Regularity

The numerical analysis of approximation schemes for (1.26) will require a regularity theory for (1.26). To this end we introduce a smoothness scale $(Y_s)_{s \ge 0}$ for the data $f$ with $Y_0 = V'$ and $Y_s \subset Y_t$ for $s > t$. We assume that we have a corresponding scale $(X_s)_{s \ge 0}$ of 'smoothness spaces' for the solutions with $X_0 = V$ and $X_s \subset X_t$ for $s > t$, so that $A^{-1}: Y_s \to X_s$ is continuous.

When $D$ is a smooth closed manifold of dimension $d$ embedded into Euclidean space $\mathbb{R}^{d+1}$, we choose $Y_s = H^{-\varrho/2+s}(D)$ and $X_s = H^{\varrho/2+s}(D)$. The case of differential operators with smooth coefficients in a manifold $D$ with smooth boundary is also covered within this framework by the choices $Y_s = H^{-\varrho/2+s}(D)$ and $X_s = \tilde{H}^{\varrho/2}(D) \cap H^{\varrho/2+s}(D)$. Note that in other cases (a pseudodifferential operator on a manifold with boundary, or a differential operator on a domain with non-smooth boundary), the spaces $X_s$ can be chosen as weighted Sobolev spaces which contain functions that are singular at the boundary.

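Theorem 1.6. Assume (1.12) and (1.13), and that there is an $s^* > 0$ such that $A^{-1}: Y_s \to X_s$ is continuous for $0 \le s \le s^*$. Then we have for all $k \ge 1$ and for $0 \le s \le s^*$ some constant $C(k, s)$ such that

$$\|\mathcal{M}^k u\|_{X_s^{(k)}} \le C \|\mathcal{M}^k f\|_{Y_s^{(k)}} \le C \|f\|^k_{L^k(\Omega; Y_s)}. \tag{1.27}$$

Proof. If (1.12) and (1.13) hold, then the operator $A^{(k)}$ is invertible, and $\mathcal{M}^k u = (A^{(k)})^{-1} \mathcal{M}^k f = (A^{-1})^{(k)} \mathcal{M}^k f$. Since $\|A^{-1} f\|_{X_s} \le C_s \|f\|_{Y_s}$, $0 \le s \le s^*$, it follows that

$$\|\mathcal{M}^k u\|_{X_s^{(k)}} = \|(A^{-1})^{(k)} \mathcal{M}^k f\|_{X_s^{(k)}} \le C_s^k \|\mathcal{M}^k f\|_{Y_s^{(k)}}, \quad 0 \le s \le s^*.$$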

1.2. Finite element discretization

In order to obtain a finite-dimensional problem, we need to discretize in both $\Omega$ and $D$. For $D$ we will use a nested family of finite element spaces $V_\ell \subset V$, $\ell = 0, 1, \dots$.

Nested finite element spaces

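The Galerkin approximation of (1.11) is based on a sequence $\{V_\ell\}_{\ell=0}^{\infty}$ of subspaces of $V$ of dimension $N_\ell = \dim V_\ell < \infty$ which are dense in $V$, i.e., $V = \overline{\bigcup_{\ell \ge 0} V_\ell}$, and nested, i.e.,

$$V_0 \subset V_1 \subset V_2 \subset \cdots \subset V_\ell \subset V_{\ell+1} \subset \cdots \subset V. \tag{1.28}$$

We assume that for functions $u$ in the smoothness spaces $X_s$ with $s \ge 0$ we have the asymptotic approximation rate

$$\inf_{v \in V_\ell} \|u - v\|_V \le C N_\ell^{-s/d} \|u\|_{X_s}. \tag{1.29}$$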

Finite elements with uniform mesh refinement

We will now describe examples for the subspaces $V_\ell$ which satisfy the assumptions of Section 1.2. We briefly sketch the construction of finite element spaces which are only continuous across element boundaries; see Braess (2007), Brenner and Scott (2002) and Ciarlet (1978) for presentations of the mathematical foundations of finite element methods. These elements are suitable for operators of order $\varrho < 3$. Throughout, we denote by $\mathcal{P}_p(K)$ the linear space of polynomials of total degree $\le p$ on a set $K$.

Let us first consider the case of a bounded polyhedron $D \subset \mathbb{R}^d$. Let $\mathcal{T}_0$ be a regular partition of $D$ into simplices $K$. Let $\{\mathcal{T}_\ell\}_{\ell=0}^{\infty}$ be the sequence of regular partitions of $D$ obtained from $\mathcal{T}_0$ by uniform subdivision: for example, if $d = 2$, we bisect all edges of the triangulation $\mathcal{T}_\ell$ and obtain a new, regular partition of the domain $D$ into possibly curved triangles which belong to finitely many congruency classes. We set

$$V_\ell = S^p(D, \mathcal{T}_\ell) = \{u \in C^0(D) \,;\, u|_K \in \mathcal{P}_p(K) \ \forall K \in \mathcal{T}_\ell\}$$

and let $h_\ell = \max\{\operatorname{diam}(K) \,;\, K \in \mathcal{T}_\ell\}$. Then $N_\ell = \dim V_\ell = O(h_\ell^{-d})$ as $\ell \to \infty$. With $V = \tilde{H}^{\varrho/2}(D)$ and $X_s = H^{\varrho/2+s}(D)$, standard finite element approximation results imply that (1.29) holds for $s \in [0, p + 1 - \varrho/2]$, i.e.,

$$\inf_{v \in V_\ell} \|u - v\|_V \le C N_\ell^{-s/d} \|u\|_{X_s}.$$
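As a minimal numerical sketch of the rate (1.29), assume $d = 1$, $p = 1$ and the $H^1$-seminorm in place of the $V$-norm (an operator of order 2), so that $s = 1$ and the predicted rate is $N_\ell^{-1}$. The target function $\sin(\pi x)$, the dyadic mesh family and the helper name `h1_seminorm_interp_error` are illustrative assumptions, not taken from the text; the interpolation error serves as a computable upper bound for the best approximation error.

```python
import numpy as np

def h1_seminorm_interp_error(level):
    """H^1-seminorm error of the piecewise linear interpolant of sin(pi x) on a uniform mesh."""
    n_el = 2 ** (int(level) + 1)                  # number of elements, h = 2^-(level+1)
    x = np.linspace(0.0, 1.0, n_el + 1)
    h = 1.0 / n_el
    slopes = np.diff(np.sin(np.pi * x)) / h       # derivative of the interpolant per element
    gp, gw = np.polynomial.legendre.leggauss(4)   # Gauss points and weights on [-1, 1]
    mid = 0.5 * (x[:-1] + x[1:])
    err2 = 0.0
    for p, w in zip(gp, gw):
        xq = mid + 0.5 * h * p                    # quadrature point in every element
        err2 += 0.5 * h * w * np.sum((np.pi * np.cos(np.pi * xq) - slopes) ** 2)
    return np.sqrt(err2)

levels = np.arange(2, 8)
N = 2.0 ** (levels + 1) - 1                       # interior degrees of freedom N_l
errors = np.array([h1_seminorm_interp_error(l) for l in levels])
# Observed rates between consecutive levels: log(e_l / e_{l+1}) / log(N_{l+1} / N_l).
rates = np.log(errors[:-1] / errors[1:]) / np.log(N[1:] / N[:-1])
```

On the finest levels the observed rate in `rates` should approach the predicted value $s/d = 1$.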

For the case when $D$ is the boundary $D = \partial\mathcal{D}$ of a polyhedron $\mathcal{D} \subset \mathbb{R}^{d+1}$ we define finite element spaces on $D$ in the same way as above, but now in local coordinates on $D$, and obtain the same convergence rates (see, e.g., Sauter and Schwab (2010)). For a $d$-dimensional domain $D \subset \mathbb{R}^d$ with a smooth boundary we can first divide $D$ into pieces $D_J$, which can be mapped to a simplex $S$ by smooth mappings $\Phi_J: D_J \to S$ (which must be $C^0$-compatible where two pieces $D_J$, $D_{J'}$ touch). Then we can define on $D$ finite element functions which on $D_J$ are of the form $g \circ \Phi_J$, where $g$ is a polynomial.

For a $d$-dimensional smooth surface $D \subset \mathbb{R}^{d+1}$ we can similarly divide $D$ into pieces which can be mapped to simplices in $\mathbb{R}^d$, and again define finite elements using these mappings.

Finite element wavelet basis for $V_\ell$

To facilitate the accurate numerical approximation of moments of order $k \ge 2$ of the random solution and for the efficient numerical solution of the partial differential equations, we use a hierarchical basis for the nested finite element (FE) spaces $V_0 \subset \cdots \subset V_L$.

To this end, we start with a basis $\{\psi_j^0\}_{j=1,\dots,N_0}$ for the finite element space $V_0$ on the coarsest triangulation. We represent on the finer meshes $\mathcal{T}_\ell$ the corresponding FE spaces $V_\ell$, with $\ell > 0$, as a direct sum $V_\ell = V_{\ell-1} \oplus W_\ell$. Since the subspaces are nested and finite-dimensional, this is possible with a suitable space $W_\ell$ for any hierarchy of FE spaces. We assume, in addition, that we are explicitly given basis functions $\{\psi_j^\ell\}_{j=1,\dots,M_\ell}$ of $W_\ell$. Iterating with respect to $\ell$, we have that $V_L = V_0 \oplus W_1 \oplus \cdots \oplus W_L$, and $\{\psi_j^\ell \,;\, \ell = 0, \dots, L,\ j = 1, \dots, M_\ell\}$ is a hierarchical basis for $V_L$, where $M_0 := N_0$.

(W1) Hierarchical basis. $V_L = \operatorname{span}\{\psi_j^\ell \,;\, 1 \le j \le M_\ell,\ 0 \le \ell \le L\}$.

Let us define $N_\ell := \dim V_\ell$ and $N_{-1} := 0$; then we have $M_\ell := N_\ell - N_{\ell-1}$ for $\ell = 0, 1, 2, \dots, L$.

The hierarchical basis property (W1) is in principle sufficient for the formulation and implementation of the sparse MC–Galerkin method and the deterministic sparse Galerkin method. In order to obtain algorithms of log-linear complexity for integrodifferential equations, we impose on the hierarchical basis the additional properties (W2)–(W5) of a wavelet basis. This will allow us to perform matrix compression for non-local operators, and to obtain optimal preconditioning for the iterative linear system solver.

(W2) Small support. $\operatorname{diam} \operatorname{supp}(\psi_j^\ell) = O(2^{-\ell})$.

(W3) Energy norm stability. There is a constant $C_B > 0$ independent of $L \in \mathbb{N} \cup \{\infty\}$, such that, for all $L \in \mathbb{N} \cup \{\infty\}$ and all $v^L = \sum_{\ell=0}^{L} \sum_{j=1}^{M_\ell} v_j^\ell \psi_j^\ell(x) \in V_L$, we have

$$C_B^{-1} \sum_{\ell=0}^{L} \sum_{j=1}^{M_\ell} |v_j^\ell|^2 \le \|v^L\|_V^2 \le C_B \sum_{\ell=0}^{L} \sum_{j=1}^{M_\ell} |v_j^\ell|^2. \tag{1.30}$$

Here, in the case $L = \infty$ it is understood that $V_L = V$.

(W4) Wavelets $\psi_j^\ell$ with $\ell \ge \ell_0$ have vanishing moments up to order $p_0 \ge p - \varrho$:

$$\int \psi_j^\ell(x)\, x^\alpha \, dx = 0, \quad 0 \le |\alpha| \le p_0, \tag{1.31}$$

except possibly for wavelets where the closure of the support intersects the boundary $\partial D$ or the boundaries of the coarsest mesh. In the case of mapped finite elements we require the vanishing moments for the polynomial function $\psi_j^\ell \circ \Phi_J^{-1}$.

(W5) Decay of coefficients for 'smooth' functions in $X_s$. There exists $C > 0$ independent of $L$ such that, for every $u \in X_s$ and every $L$,

$$\sum_{\ell=0}^{L} \sum_{j=1}^{M_\ell} |u_j^\ell|^2\, 2^{2\ell s} \le C L^{\nu} \|u\|_{X_s}^2, \quad \nu = \begin{cases} 0 & \text{for } 0 \le s < p + 1 - \varrho/2, \\ 1 & \text{for } s = p + 1 - \varrho/2. \end{cases} \tag{1.32}$$

By property (W3), wavelets constitute Riesz bases: every function $u \in V$ has a unique wavelet expansion $u = \sum_{\ell=0}^{\infty} \sum_{j=1}^{M_\ell} u_j^\ell \psi_j^\ell$.

We define the projection $P_L: V \to V_L$ by truncating this wavelet expansion of $u$ at level $L$, i.e.,

$$P_L u := \sum_{\ell=0}^{L} \sum_{j=1}^{M_\ell} u_j^\ell \psi_j^\ell. \tag{1.33}$$

Because of the stability (W3) and the approximation property (1.29), we obtain immediately that the wavelet projection $P_L$ is quasi-optimal: with (1.29), for $0 \le s \le s^*$ and $u \in X_s$,

$$\|u - P_L u\|_V \lesssim N_L^{-s/d} \|u\|_{X_s}. \tag{1.34}$$

We remark in passing that the appearance of the factor $1/d$ in the convergence rate $s/d$ in (1.34), when expressed in terms of $N_L$, the total number of degrees of freedom, indicates a reduction of the convergence rate as the dimension $d$ of the computational domain increases. This reduction of the convergence rate with increasing dimension is commonly referred to as the 'curse of dimensionality'; as long as $d = 1, 2, 3$, this is not severe and, in fact, shared by almost all discretizations. If the dimension of the computational domain increases, however, this reduction becomes a severe obstacle to the construction of efficient discretizations. In the context of stochastic and parametric PDEs, the dimension of the computational domain can, in principle, become arbitrarily large, as we shall next explain.

Full and sparse tensor product spaces

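To compute an approximation for

$$\mathcal{M}^k u \in V^{(k)} := \underbrace{V \otimes \cdots \otimes V}_{k \text{ times}}$$

we need a suitable finite-dimensional subspace of $V^{(k)}$. The simplest choice is the tensor product space $V_L \otimes \cdots \otimes V_L = V_L^{(k)}$. However, this full tensor product space has dimension

$$\dim(V_L^{(k)}) = N_L^k = (\dim(V_L))^k, \tag{1.35}$$

which is not practical for $k > 1$. A reduction in cost is possible by sparse tensor products of $V_L$. The $k$-fold sparse tensor product space $\hat{V}_L^{(k)}$ is defined by

$$\hat{V}_L^{(k)} = \sum_{\substack{\boldsymbol{\ell} \in \mathbb{N}_0^k \\ |\boldsymbol{\ell}| \le L}} V_{\ell_1} \otimes \cdots \otimes V_{\ell_k}, \tag{1.36}$$

where we denote by $\boldsymbol{\ell}$ the vector $(\ell_1, \dots, \ell_k) \in \mathbb{N}_0^k$ and its length by $|\boldsymbol{\ell}| = \ell_1 + \cdots + \ell_k$. The sum in (1.36) is not direct in general. However, since the $V_\ell$ are finite-dimensional, we can write $\hat{V}_L^{(k)}$ as a direct sum in terms of the complement spaces $W_\ell$:

$$\hat{V}_L^{(k)} = \bigoplus_{\substack{\boldsymbol{\ell} \in \mathbb{N}_0^k \\ |\boldsymbol{\ell}| \le L}} W_{\ell_1} \otimes \cdots \otimes W_{\ell_k}. \tag{1.37}$$

If a hierarchical basis of the subspaces $V_\ell$ (i.e., satisfying hypothesis (W1)) is available, we can define a sparse tensor quasi-interpolation operator $\hat{P}_L^{(k)}: V^{(k)} \to \hat{V}_L^{(k)}$ by a suitable truncation of the tensor product wavelet expansion: for every $x_1, \dots, x_k \in D$,

$$(\hat{P}_L^{(k)} v)(x) := \sum_{0 \le \ell_1 + \cdots + \ell_k \le L} \ \sum_{\substack{1 \le j_\nu \le M_{\ell_\nu} \\ \nu = 1, \dots, k}} v^{\ell_1 \cdots \ell_k}_{j_1 \cdots j_k}\, \psi^{\ell_1}_{j_1}(x_1) \cdots \psi^{\ell_k}_{j_k}(x_k). \tag{1.38}$$

If a hierarchical basis is not explicitly available, we can still express $\hat{P}_L^{(k)}$ in terms of the projections $Q_\ell := P_\ell - P_{\ell-1}$ for $\ell = 0, 1, \dots$, and with the convention $P_{-1} := 0$ as

$$\hat{P}_L^{(k)} = \sum_{0 \le \ell_1 + \cdots + \ell_k \le L} Q_{\ell_1} \otimes \cdots \otimes Q_{\ell_k}. \tag{1.39}$$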
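The representation (1.39) can be tried out on matrices. The sketch below takes $k = 2$ and, as a loudly hypothetical stand-in for the wavelet truncation projections $P_\ell$, uses orthogonal projectors onto nested coordinate subspaces of $\mathbb{R}^8$ (so that $P_\ell P_m = P_{\min(\ell, m)}$). It checks that summing $Q_{\ell_1} \otimes Q_{\ell_2}$ over the full index set telescopes to $P_L \otimes P_L$, while the sum over $\ell_1 + \ell_2 \le L$ is again a projector of much smaller rank.

```python
import numpy as np

n, L = 8, 3

def P(l):
    """Orthogonal projector onto a hypothetical V_l (first 2^l unit vectors), with P_{-1} = 0."""
    d = np.zeros(n)
    if l >= 0:
        d[: 2 ** l] = 1.0
    return np.diag(d)

Q = [P(l) - P(l - 1) for l in range(L + 1)]    # increments Q_l = P_l - P_{l-1}

# Sum of Q_l1 (x) Q_l2 over the sparse index set l1 + l2 <= L, as in (1.39) with k = 2.
P_sparse = sum(np.kron(Q[l1], Q[l2])
               for l1 in range(L + 1) for l2 in range(L + 1) if l1 + l2 <= L)

# Summing over the full index set l1, l2 <= L telescopes to P_L (x) P_L.
P_full = sum(np.kron(Q[l1], Q[l2])
             for l1 in range(L + 1) for l2 in range(L + 1))

rank_sparse = int(round(np.trace(P_sparse)))   # rank of a projector equals its trace
rank_full = int(round(np.trace(P_full)))
```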

We also note that the dimension of $\hat{V}_L^{(k)}$ is

$$\dim(\hat{V}_L^{(k)}) = O\big(N_L (\log_2 N_L)^{k-1}\big), \tag{1.40}$$

that is, it is a log-linear function of the number $N_L$ of the degrees of freedom used for approximation of the first moment. Given that the sparse tensor product space $\hat{V}_L^{(k)}$ is substantially coarser, one wonders whether its approximation properties are substantially worse than those of the full tensor product space $V_L^{(k)}$. The basis for the use of the sparse tensor product spaces $\hat{V}_L^{(k)}$ is the next result, which indicates that $\hat{V}_L^{(k)}$ achieves, up to logarithmic terms, the same asymptotic rate of convergence, in terms of powers of the mesh width, as the full tensor product space. The approximation property of sparse grid spaces $\hat{V}_L^{(k)}$ was established, for example, in Schwab and Todor (2003b, Proposition 4.2), Griebel, Oswald and Schiekofer (1999), von Petersdorff and Schwab (2004) and Todor (2009).
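To make the gap between the full and the sparse tensor product space concrete, the dimensions can be counted directly from the direct sum over $|\boldsymbol{\ell}| \le L$ of the spaces $W_{\ell_1} \otimes \cdots \otimes W_{\ell_k}$. The sketch below assumes, purely for illustration, one-dimensional hat functions on $(0,1)$ with $N_\ell = 2^{\ell+1} - 1$ interior nodes; the function names are hypothetical.

```python
from itertools import product

def level_dims(L):
    """M_l = N_l - N_{l-1} for 1D hat functions on (0,1) with N_l = 2^(l+1) - 1 interior nodes."""
    N = [2 ** (l + 1) - 1 for l in range(L + 1)]
    return [N[0]] + [N[l] - N[l - 1] for l in range(1, L + 1)]

def sparse_dim(L, k):
    """Dimension of the sparse tensor space: sum of M_l1 * ... * M_lk over l1 + ... + lk <= L."""
    M = level_dims(L)
    total = 0
    for ell in product(range(L + 1), repeat=k):
        if sum(ell) <= L:
            dim = 1
            for l in ell:
                dim *= M[l]
            total += dim
    return total

def full_dim(L, k):
    """Dimension N_L^k of the full tensor product space."""
    return (2 ** (L + 1) - 1) ** k

# Print level, full dimension and sparse dimension for two-point correlations (k = 2).
for L in range(1, 9):
    print(L, full_dim(L, 2), sparse_dim(L, 2))
```

For $k = 1$ both counts coincide with $N_L$, which is a quick consistency check on the index set.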

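Proposition 1.7.

$$\inf_{v \in \hat{V}_L^{(k)}} \|U - v\|_{V^{(k)}} \le C(k) \begin{cases} N_L^{-s/d}\, \|U\|_{X_s^{(k)}} & \text{if } 0 \le s < p + 1 - \varrho/2, \\ N_L^{-s/d}\, L^{\nu(k)}\, \|U\|_{X_s^{(k)}} & \text{if } s = p + 1 - \varrho/2. \end{cases} \tag{1.41}$$

Here, the exponent $\nu(k) = (k-1)/2$ is best possible on account of the $V$-orthogonality of the $V$ best approximation.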

Remark 1.8. The exponent $\nu(k)$ of the logarithmic terms in the sparse tensor approximation rates stated in Proposition 1.7 is best possible for the approximation in the sparse tensor product spaces $\hat{V}_L^{(k)}$ given the regularity $U \in X_s^{(k)}$. In general, these logarithmic terms in the convergence estimate are unavoidable. Removal of all logarithmic terms in the convergence rate estimate as well as in the dimension estimate of $\hat{V}_L^{(k)}$ is possible only if either (a) the norm $\|\cdot\|_{V^{(k)}}$ on the left-hand side of (1.41) is weakened, or if (b) the norm $\|\cdot\|_{X_s^{(k)}}$ on the right-hand side of (1.41) is strengthened. For example, in the context of sparse tensor FEM for the Laplacian in $(0,1)^d$, it was shown by von Petersdorff and Schwab (2004) and Bungartz and Griebel (2004) that all logarithmic terms can be removed; this is due to the observation that the $H^1((0,1)^d)$ norm is strictly weaker than the corresponding tensorized norm $H^1(0,1)^{(d)}$ which appears in the error bound (1.41) in the case of $d$-point correlations of a random field taking values in $H_0^1(0,1)$.

The same effect allows us to slightly coarsen the sparse tensor product space $\hat{V}_L^{(k)}$. This was exploited, for example, by Bungartz and Griebel (2004) and Todor (2009).

The error bound (1.41) is for the best approximation of $U \in X_s^{(k)}$ from $\hat{V}_L^{(k)}$. To achieve the exponent $\nu(k) = (k-1)/2$ in (1.41) for a sparse tensor quasi-interpolant such as (1.38), the multi-level basis $\psi_j^\ell$ of $V$ must be $V$-orthogonal between different levels. Such a multi-level basis can be achieved in $V \subset H^1(D)$, for example, by using so-called spline prewavelets.

Let us also remark that it is even possible to construct $L^2(D)$ orthonormal piecewise polynomial wavelet bases satisfying (W1)–(W5). We refer to Donovan, Geronimo and Hardin (1996) for details.

The stability property (W3) implies the following result (see, e.g., von Petersdorff and Schwab (2004)).

Lemma 1.9. (on the sparse tensor quasi-interpolant $\hat{P}_L^{(k)}$) Assume (W1)–(W5) and that the component spaces $V_\ell$ of $\hat{V}_L^{(k)}$ are $V$-orthogonal between scales and have the approximation property (1.29). Then the sparse tensor projection $\hat{P}_L^{(k)}$ is stable: there exists $C > 0$ (depending on $k$ but independent of $L$) such that, for all $U \in V^{(k)}$,

$$\|\hat{P}_L^{(k)} U\|_{V^{(k)}} \le C \|U\|_{V^{(k)}}. \tag{1.42}$$

For $U \in X_s^{(k)}$ and $0 \le s \le s^*$, if the basis functions $\psi_j^\ell$ satisfy (W1)–(W5) and are $V$-orthogonal between different levels of mesh refinement, we obtain quasi-optimal convergence of the sparse tensor quasi-interpolant $\hat{P}_L^{(k)} U$ in (1.38):

$$\|U - \hat{P}_L^{(k)} U\|_{V^{(k)}} \le C(k)\, N_L^{-s/d} (\log N_L)^{(k-1)/2}\, \|U\|_{X_s^{(k)}}. \tag{1.43}$$

Remark 1.10. The convergence rate (1.43) of the approximation $\hat{P}_L^{(k)} U$ from the sparse tensor subspace is, up to logarithmic terms, equal to the rate obtained for the best approximation of the mean field, i.e., in the case $k = 1$. We observe, however, that the regularity of $U$ required to achieve this convergence rate is quite high: the function $U$ must belong to an anisotropic smoothness class $X_s^{(k)}$ which, in the context of ordinary Sobolev spaces, is a space of functions whose (weak) mixed derivatives of order $s$ belong to $V$. Evidently, this mixed smoothness regularity requirement becomes stronger as the number $k$ of moments increases. By Theorem 1.6, the $k$-point correlations $\mathcal{M}^k u$ of the random solution $u$ naturally satisfy such regularity.

Galerkin discretization

We first consider the discretization of the problem $Au(\omega) = f(\omega)$ for a single realization $\omega$, bearing in mind that in the Monte Carlo method this problem will have to be approximately solved for many realizations of $\omega \in \Omega$.

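The Galerkin discretization of (1.11) reads: find $u_L(\omega) \in V_L$ such that

$$\langle v_L, A u_L(\omega) \rangle = \langle v_L, f(\omega) \rangle \quad \forall v_L \in V_L, \quad \mathbb{P}\text{-a.e. } \omega \in \Omega. \tag{1.44}$$

The injectivity (1.13) of $A$, the Gårding inequality (1.12) and the density in $V$ of the subspace sequence $\{V_\ell\}_{\ell=0}^{\infty}$ imply that there exists $L_0 > 0$ such that, for $L \ge L_0$, problem (1.44) admits a unique solution $u_L(\omega)$. Furthermore, we have the uniform inf-sup condition (see, e.g., Hildebrandt and Wienholtz (1964)): there exists a discretization level $L_0$ and a stability constant $\gamma > 0$ such that, for all $L \ge L_0$,

$$\inf_{0 \ne u \in V_L} \ \sup_{0 \ne v \in V_L} \frac{\langle Au, v \rangle}{\|u\|_V \|v\|_V} \ge \frac{1}{\gamma} > 0. \tag{1.45}$$

The inf-sup condition (1.45) implies quasi-optimality of the approximations $u_L(\omega)$ for $L \ge L_0$ (see, e.g., Babuška (1970/71)): there exist $C > 0$ and $L_0 > 0$ such that

$$\forall L \ge L_0: \quad \|u(\omega) - u_L(\omega)\|_V \le C \inf_{v \in V_L} \|u(\omega) - v\|_V \quad \mathbb{P}\text{-a.e. } \omega \in \Omega. \tag{1.46}$$

From (1.46) and (1.29), we obtain the asymptotic error estimate: define $\sigma := \min\{s^*, p + 1 - \varrho/2\}$. Then there exists $C > 0$ such that for $0 < s \le \sigma$

$$\forall L \ge L_0: \quad \|u(\omega) - u_L(\omega)\|_V \le C N_L^{-s/d} \|u\|_{X_s} \quad \mathbb{P}\text{-a.e. } \omega \in \Omega. \tag{1.47}$$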
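For a single realization, the Galerkin problem (1.44) reduces to one linear system. The following minimal sketch assumes the model problem $-u'' = f$ on $(0,1)$ with homogeneous Dirichlet conditions, continuous piecewise linear elements on a uniform mesh, and a trapezoidal approximation of the load vector; these choices and the function name `galerkin_poisson_1d` are illustrative assumptions rather than the general setting of (1.11).

```python
import numpy as np

def galerkin_poisson_1d(f, n_el):
    """P1 Galerkin solution of -u'' = f on (0,1) with u(0) = u(1) = 0 on a uniform mesh."""
    h = 1.0 / n_el
    x = np.linspace(0.0, 1.0, n_el + 1)[1:-1]          # interior nodes
    n = n_el - 1
    # Stiffness matrix <phi_i', phi_j'> for hat functions: tridiag(-1, 2, -1) / h.
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    # Load vector <phi_i, f>, approximated by the trapezoidal rule (h * f(x_i)).
    b = h * f(x)
    return x, np.linalg.solve(K, b)

# One "realization" of the data, chosen so that the exact solution is u(x) = sin(pi x).
f_sample = lambda x: np.pi ** 2 * np.sin(np.pi * x)
x, u_h = galerkin_poisson_1d(f_sample, 64)
nodal_error = np.max(np.abs(u_h - np.sin(np.pi * x)))
```

In a Monte Carlo loop the stiffness matrix would be assembled (and factorized) once, since only the right-hand side changes between realizations of $f$.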

1.3. Sparse tensor Monte Carlo Galerkin FEM

We next review basic convergence results of the Monte Carlo method for the approximation of expectations of random variables taking values in a separable Hilbert space. As our exposition aims at the solution of operator equations with stochastic data, we shall first consider the MC method without discretization of the operator equation, and show convergence estimates of the statistical error incurred by the MC sampling. Subsequently, we turn to the Galerkin approximation of the operator equation and, in particular, the sparse tensor approximation of the two- and $k$-point correlation functions of the random solution.

Monte Carlo error for continuous problems

For a random variable $Y$, let $Y_1(\omega), \dots, Y_M(\omega)$ denote $M \in \mathbb{N}$ copies of $Y$, i.e., the $Y_i$ are random variables which are mutually independent and identically distributed to $Y(\omega)$ on the same common probability space $(\Omega, \Sigma, \mathbb{P})$. Then the arithmetic average $\bar{Y}_M(\omega)$,

$$\bar{Y}_M(\omega) := \frac{1}{M} \big( Y_1(\omega) + \cdots + Y_M(\omega) \big),$$

is a random variable on $(\Omega, \Sigma, \mathbb{P})$ as well.

The simplest approach to the numerical solution of (1.11) for $f \in L^1(\Omega; V')$ is MC simulation. Let us first consider the situation without discretization of $V$. We generate $M$ draws $f(\omega_j)$, $j = 1, 2, \dots, M$, of $f(\omega)$ and find the solutions $u(\omega_j) \in V$ of the problems

$$A u(\omega_j) = f(\omega_j), \quad j = 1, \dots, M. \tag{1.48}$$

We then approximate the $k$th moment $\mathcal{M}^k u$ with the sample mean $\bar{E}^M[u^{(k)}]$ of $u(\omega_j) \otimes \cdots \otimes u(\omega_j)$:

$$\bar{E}^M[u^{(k)}] := \overline{u \otimes \cdots \otimes u}^{M} = \frac{1}{M} \sum_{j=1}^{M} u(\omega_j) \otimes \cdots \otimes u(\omega_j). \tag{1.49}$$
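The estimator (1.49) translates almost literally into code once $V$ is replaced by a finite-dimensional space. The sketch below assumes a small tridiagonal matrix as a hypothetical stand-in for $A$ and standard normal draws for the data; the helper `mc_moment` is an illustrative name. Each sample is solved as in (1.48) and its $k$-fold outer product is averaged.

```python
import numpy as np
from functools import reduce

def mc_moment(samples, k):
    """Sample mean of the k-fold tensor products u_j (x) ... (x) u_j, cf. (1.49)."""
    acc = 0.0
    for u in samples:
        acc = acc + reduce(np.multiply.outer, [u] * k)
    return acc / len(samples)

rng = np.random.default_rng(2)
# Hypothetical discretized operator A (symmetric positive definite tridiagonal matrix).
A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
f_draws = rng.standard_normal((200, 3))          # M = 200 draws of the data f
u_draws = np.linalg.solve(A, f_draws.T).T        # solve A u_j = f_j for every draw

E1 = mc_moment(u_draws, 1)    # mean field estimate, shape (3,)
E2 = mc_moment(u_draws, 2)    # two-point correlation estimate, shape (3, 3)
E3 = mc_moment(u_draws, 3)    # three-point correlation estimate, shape (3, 3, 3)
```

The storage of the $k$th moment estimate grows like $n^k$ for $n$ spatial degrees of freedom, which is where the sparse tensor spaces of Section 1.2 enter.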

It is well known that the Monte Carlo error decreases as $M^{-1/2}$ in a probabilistic sense provided the variance of $u^{(k)}$ exists. By (1.18), this is the case for $u \in L^{2k}(\Omega; V)$. We have the following convergence estimate.

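Theorem 1.11. Let $k \ge 1$ and assume that in the operator equation (1.11) $f \in L^{2k}(\Omega; V')$. Then, for any number $M \in \mathbb{N}$ of samples for the MC estimator (1.49), we have the error bound

$$\big\| \mathcal{M}^k u - \bar{E}^M[u^{(k)}] \big\|_{L^2(\Omega; V^{(k)})} \le M^{-1/2} \big( C_A \|f\|_{L^{2k}(\Omega; V')} \big)^k. \tag{1.50}$$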

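Proof. We observe that $f \in L^{2k}(\Omega; V')$ implies with (1.22) that $u^{(k)} \in L^2(\Omega; V^{(k)})$. For $i = 1, \dots, M$ we denote by $u_i(\omega)$ the $M$ i.i.d. copies of the random variable $u(\omega) = A^{-1} f(\omega)$, which correspond to the $M$ MC samples $u_i = A^{-1} f_i$.

Using that the $u_i$ are independent and identically distributed, we infer that, for each value of $i$, $u_i(\omega) \in L^{2k}(\Omega; V)$. Therefore

$$\begin{aligned}
\big\| \mathbb{E}[u^{(k)}] - \bar{E}^M[u^{(k)}] \big\|^2_{L^2(\Omega; V^{(k)})}
&= \mathbb{E}\Big[ \big\| \mathbb{E}[u^{(k)}] - \bar{E}^M[u^{(k)}] \big\|^2_{V^{(k)}} \Big] \\
&= \mathbb{E}\Big[ \Big\| \mathbb{E}[u^{(k)}] - \frac{1}{M} \sum_{i=1}^{M} u_i^{(k)} \Big\|^2_{V^{(k)}} \Big] \\
&= \mathbb{E}\Big[ \Big\langle \mathbb{E}[u^{(k)}] - \frac{1}{M} \sum_{i=1}^{M} u_i^{(k)},\ \mathbb{E}[u^{(k)}] - \frac{1}{M} \sum_{j=1}^{M} u_j^{(k)} \Big\rangle \Big] \\
&= \frac{1}{M^2} \sum_{i,j=1}^{M} \mathbb{E}\Big[ \big\langle \mathbb{E}[u^{(k)}] - u_i^{(k)},\ \mathbb{E}[u^{(k)}] - u_j^{(k)} \big\rangle \Big] \\
&= \frac{1}{M^2} \sum_{i=1}^{M} \mathbb{E}\Big[ \big\| \mathbb{E}[u^{(k)}] - u_i^{(k)} \big\|^2_{V^{(k)}} \Big] && (u_i(\omega) \text{ independent}) \\
&= \frac{1}{M}\, \mathbb{E}\Big[ \big\| u^{(k)} - \mathbb{E}[u^{(k)}] \big\|^2_{V^{(k)}} \Big] && (u_i(\omega) \text{ identically distributed}) \\
&= \frac{1}{M}\, \mathbb{E}\Big[ \big\langle u^{(k)} - \mathbb{E}[u^{(k)}],\ u^{(k)} - \mathbb{E}[u^{(k)}] \big\rangle \Big] \\
&= \frac{1}{M} \Big( -\mathbb{E}\Big[ \big\langle u^{(k)} - \mathbb{E}[u^{(k)}],\ \mathbb{E}[u^{(k)}] \big\rangle \Big] + \mathbb{E}\Big[ \big\langle u^{(k)} - \mathbb{E}[u^{(k)}],\ u^{(k)} \big\rangle \Big] \Big) \\
&= \frac{1}{M}\, \mathbb{E}\Big[ \big\| u^{(k)} \big\|^2_{V^{(k)}} \Big] - \frac{1}{M} \big\| \mathbb{E}[u^{(k)}] \big\|^2_{V^{(k)}} \\
&\le M^{-1} \big\| u^{(k)} \big\|^2_{L^2(\Omega; V^{(k)})} = M^{-1} \|u\|^{2k}_{L^{2k}(\Omega; V)}.
\end{aligned}$$

Taking square roots on both sides, and bounding $\|u\|_{L^{2k}(\Omega; V)}$ by $C_A \|f\|_{L^{2k}(\Omega; V')}$ (with $C_A$ as in (1.14)), completes the proof.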
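The key identity in the proof, namely that the mean square error of the sample mean equals the variance divided by $M$, can be verified exactly (without sampling noise) for a discrete law by enumerating all $M$-tuples of outcomes. The sketch below assumes a hypothetical random vector in $\mathbb{R}^2$ taking four equally likely values; the values and the function name are illustrative only.

```python
import numpy as np
from itertools import product

# Hypothetical discrete law: X takes each row of `values` with equal probability.
values = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0], [3.0, -2.0]])
mean = values.mean(axis=0)
variance = np.mean(np.sum((values - mean) ** 2, axis=1))    # E||X - E[X]||^2

def mse_of_sample_mean(M):
    """E||E[X] - X_bar_M||^2, computed exactly by enumerating all i.i.d. M-tuples."""
    total = 0.0
    for idx in product(range(len(values)), repeat=M):
        x_bar = values[list(idx)].mean(axis=0)
        total += np.sum((x_bar - mean) ** 2)
    return total / len(values) ** M

# The two printed columns agree: mean square error of the sample mean versus variance / M.
for M in (1, 2, 3, 4):
    print(M, mse_of_sample_mean(M), variance / M)
```

Taking square roots gives the $M^{-1/2}$ decay of the root mean square error stated in (1.50).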

The previous theorem required that $u^{(k)} \in L^2(\Omega; V^{(k)})$ or (equivalently by (1.18)) that $u \in L^{2k}(\Omega; V)$ (resp. $f \in L^{2k}(\Omega; V')$) in order to obtain the convergence rate $M^{-1/2}$ of the MC estimates (1.49), in $L^2(\Omega; V^{(k)})$.

In the case of weaker summability of $u$, the next estimate shows that the MC method converges in $L^1(\Omega; V^{(k)})$ and at a rate that is possibly lower than $1/2$, as determined by the summability of $u$. We only state the result here and refer to von Petersdorff and Schwab (2006) for the proof.

Theorem 1.12. Let $k \ge 1$. Assume that $f \in L^{\alpha k}(\Omega; V')$ for some $\alpha \in (1, 2]$. For $M \ge 1$ samples we define the sample mean $\bar{E}^M[u^{(k)}]$ as in (1.49). Then there exists $C$ such that, for every $M \ge 1$ and every $0 < \varepsilon < 1$,

$$\mathbb{P}\left( \big\| \mathcal{M}^k u - \bar{E}^M[u^{(k)}] \big\|_{V^{(k)}} \le C\, \frac{\|f\|^k_{L^{\alpha k}(\Omega; V')}}{\varepsilon^{1/\alpha} M^{1 - 1/\alpha}} \right) \ge 1 - \varepsilon. \tag{1.51}$$

The previous results show that one can obtain a rate of up to $M^{-1/2}$ in a probabilistic sense for the Monte Carlo method. Convergence rates beyond $1/2$ are not possible, in general, by the MC method, as is shown by the central limit theorem; in this sense, the rate $1/2$ is sharp.

So far, we have obtained the convergence rate $1/2$ of the MC method essentially in $L^1(\Omega; V^{(k)})$ and in $L^2(\Omega; V^{(k)})$. A $\mathbb{P}$-a.s. convergence estimate of the MC method can be obtained using the separability of the Hilbert space of realizations and the law of the iterated logarithm; see, e.g., Strassen (1964) and Ledoux and Talagrand (1991, Chapter 8) and the references therein for the vector-valued case.

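Lemma 1.13. Assume that $H$ is a separable Hilbert space and that $X \in L^2(\Omega; H)$. Then, with probability 1,

$$\limsup_{M \to \infty} \frac{\|\bar{X}_M - \mathbb{E}(X)\|_H}{(2 M^{-1} \log\log M)^{1/2}} \le \|X - \mathbb{E}(X)\|_{L^2(\Omega; H)}. \tag{1.52}$$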

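For the proof, we refer to von Petersdorff and Schwab (2006). Applying Lemma 1.13 to $X = u^{(k)} = u \otimes \cdots \otimes u$ and with $V^{(k)}$ in place of $H$ gives (with $C_A$ as in (1.14))

$$\|u \otimes \cdots \otimes u\|_{L^2(\Omega; V^{(k)})} = \|u\|^k_{L^{2k}(\Omega; V)} \le C_A^{2k} \|f\|^k_{L^{2k}(\Omega; V')},$$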
