
arXiv:2111.12035v2 [math.ST] 17 Dec 2021

Stochastic Processes Under Linear Differential Constraints: Application to Gaussian Process Regression for the 3 Dimensional Free Space Wave Equation

Iain Henderson, Pascal Noble, Olivier Roustant May 2021

Abstract. Let $L$ be a linear differential operator over $D \subset \mathbb{R}^d$ and $U = (U_x)_{x \in D}$ a second order stochastic process. In the first part of this article, we prove a new necessary and sufficient condition for all the trajectories of $U$ to verify the partial differential equation (PDE) $L(U) = 0$. This condition is formulated in terms of the covariance kernel of $U$. When compared to previous similar results [1], the novelty of this result is that the equality $L(U) = 0$ is understood in the sense of distributions, which is a functional analysis framework particularly adapted to the study of PDEs.

This theorem provides valuable insights during the second part of this article, which is dedicated to performing "physically informed" machine learning on data that are solutions to the homogeneous 3 dimensional free space wave equation. We perform Gaussian process regression (GPR) on this data, which is a kernel based Bayesian approach to machine learning. To do so, we put Gaussian process (GP) priors over the wave equation's initial conditions and propagate them through the wave equation. We obtain explicit formulas for the covariance kernel of the posterior GP; this kernel can then be used for GPR. Our theorem states that this kernel, the trajectories of the corresponding GP and the predictions provided by GPR are all solutions to the wave equation in the sense of distributions. We explore two particular cases: the radial symmetry and the point source. For the former, we derive convolution-free GPR formulas; for the latter, we show a direct link between GPR and the classical triangulation method for point source localization used e.g. in GPS systems. Additionally, this Bayesian framework gives rise to a new answer for the ill-posed inverse problem of reconstructing initial conditions for the wave equation with finite dimensional data, and simultaneously provides a way of estimating physical parameters from this data as in [2]. We finish by showcasing this physically informed GPR on a number of practical examples.

Keywords: Gaussian Process Regression, Partial Derivative Equations, Wave equation, Physical Parameter Estimation, Initial Value Inverse Problems

1 Introduction

Machine learning techniques have repeatedly proved able to provide efficient solutions to difficult problems when field data is available. One key element of a great part of this success is the incorporation of "expert knowledge" in the corresponding statistical models. In a good deal of practical applications, powerful mathematical models are already available (and somewhat well understood) to describe certain phenomena. This is very common when dealing with problems coming from physics such as thermodynamics, continuum mechanics or fluid mechanics, to name a few. In these examples, the mathematical models take the form of Partial Differential Equations (PDEs). The zoology of PDEs is incredibly vast [3] and their applications are ubiquitous. As such, tremendous efforts have been devoted to trying to solve them, both theoretically [3] and numerically [4]. These equations impose very specific structures, simple or complex, on the observed data. These structures may be extremely difficult to capture and mimic with general machine learning models: given only the data, they are difficult for the model to learn. However, given the theoretical and practical knowledge of the corresponding PDEs at our disposal, one should try to incorporate these structures in the machine learning models. How may this be done? Gaussian Process Regression (GPR) [5], which is a type of Bayesian framework for machine learning, provides a possible answer when the PDE is linear. Indeed, Gaussian processes (GPs) are the most "linear" of all random processes: they are stable under (finite) linear combinations of their elements [5]. Although they are very simple mathematical objects when compared to non linear ones, linear PDEs are central in general PDE theory and remain physically pertinent in a number of applications such as acoustics, electromagnetics or quantum mechanics. As a matter of fact, the PDE after which this article is titled plays a fundamental role in all three of the aforementioned domains.

1.1 State of the Art

GPR is a "kernel based" machine learning method, which means that it is built around a positive definite function (see equation (2)) called its kernel. Solving or "learning" linear ODEs and PDEs thanks to GPR is not a very new idea. The first initiatives in that direction probably go back to [6] and have been re-explored ever since in a number of cases. They have been developed in the context of latent forces [7] [8] [9] [10] and were then applied to certain wave equations [11] [12]. Latent force models are concerned with linear PDEs of the form

$$Lu = f \tag{1}$$

where both $u$ and $f$ are defined on the same domain $D \subset \mathbb{R}^d$, with $L$ a linear differential operator. Latent forces put a GP prior on the driving source term $f$. Explicit resolution of (1), thanks to Green's functions, translates this prior as a GP distribution on the solution $u$. Conversely, a second approach [2] [13] rather puts a GP prior on the solution $u$ and straightforwardly translates (1) as a GP distribution on the driving term $f$, avoiding the need for Green's functions and convolutions. Though both of these approaches are "physically informed", they may not account for strict linear equality constraints in the interior domain $D$, which could be exploited for dimension reduction when they are known. Actually, a number of famous PDEs can be studied with no interior source term, in which case initial conditions or more general boundary source terms are provided. This is frequent in the evolution equation literature, which for instance gives rise to the fruitful semi-group theory for PDEs [14]. Equation (1) then writes $Lu = 0$ and while $u$ may be a random object, $0$ should be strictly $0$ and not just a centered stochastic process as would be the case if using any of the two frameworks described above. A first step towards enforcing strict linear constraints in the approximation space probably dates back to [15] where in a deterministic context, divergence-free interpolation spaces were first built. This idea was pursued in [16] where an interpolation space comprised of solutions to Laplace's equation was constructed.

In both cases, these spaces are built upon positive definite kernels (see equation (2)), which makes them easy to transpose in a GPR framework. As such, the kernel from [16] was then used in [17] for performing GPR on Laplace’s equation. Likewise for [18], where GPs are built so that their trajectories are systematically divergence/curl free. This was then taken a step further in [19] [20] [21] [22] in the more general context of Maxwell’s stationary equations. Finally, [23] applies this framework to the 1D heat equation, Laplace’s and Helmholtz’ 2D equations. The matter of enforcing strict homogeneous boundary conditions in the context of GPR has also been addressed in [24] [25]. Enforcing these constraints provides another way of lowering the dimension of the problem. Following [26], [24] builds a PDE-tailored covariance kernel thanks to a Mercer-like expansion in terms of eigenvectors of the differential operator in question.

In contrast with the rest of the literature, [25] raises the question of rigorous proofs and regularity issues regarding the derivations and applications of GPR for PDEs, and resorts to algebraic techniques to justify the different steps of its approach. We raise the same questions here, though we rather make use of a functional analysis framework adapted to PDEs. All of the aforementioned approaches as well as the one presented in this article can be formulated using the theory of stochastic partial differential equations (SPDE), which are PDEs whose source term is a random function.

The general matter of applying physically informed GPR to linear PDEs thanks to an SPDE formulation is tackled in [27], without addressing regularity questions. [28] presents how spatio-temporal GPR can be reformulated as an SPDE problem, enabling the use of Kalman filter theory for computational efficiency. In [29], the variational formulation (see [3], section 6.1.2 for a definition) of certain linear PDEs has been incorporated into a GPR framework thanks to an SPDE reformulation. This approach requires the use of Gaussian generalized stochastic processes (see [30], section 2.2.1.1), or "functional Gaussian processes" following [29]. In [31], covariance kernels on graphs are obtained thanks to an adaptation of SPDEs on graphs. Finally, [32] focuses on the study of stationary stochastic processes that are solutions of a wide class of linear SPDEs, outside of the context of GPR. In particular, [32] provides a description of all the second order stationary stochastic processes that are solutions to the 3D wave equation, a central equation in the present article; this description is done in terms of the covariance kernel of the corresponding stochastic process. Note that as in this article, [32] also makes use of the theory of generalized functions. On a side note, a thorough theoretical study of a stochastic 3D wave equation and the regularity of the associated random paths is presented in [33]. In a much wider framework, general linearly constrained stochastic processes and GPs in particular are thoroughly explored in [1]. Though [1] deals with many different types of linear operators, the application of the corresponding results to linear PDEs is not straightforward, see section 3. Indeed, these results are not phrased using the language of functional analysis, which was in great part designed to deal efficiently with differential equations. This is why we prove in Proposition 2 a theorem resembling those of [1] but specifically adapted to linear PDEs. A recent survey on linearly constrained GPs presents most of these different approaches [34].

In this article, we focus on the so called (3 dimensional, free space, time dependent) wave equation. As it is a time-dependent PDE, we show that performing GPR on wave equation data amounts to reconstructing the corresponding initial conditions of the wave equation from incomplete scattered data. This immediately echoes questions arising in the inverse problem community [35] [36]; the solution we provide in this article entirely falls in the domain of Photo Acoustic Tomography (PAT), which deals with recovering the initial conditions of wave propagation problems for which the initial value problem (37) is the archetype. We refer to [37] and the many references therein on that topic. Also, Bayesian approaches for inverse problem questions involving the wave equation have been set up a number of times, though never really following the GPR methodology presented in this article when it comes down to reconstructing some initial conditions. In these approaches, a probability prior is set on the model's parameter space, typically the wave propagation speed $c$; the goal is then to estimate the model's true parameters. See [35], sections 5.6 to 5.8 for practical examples, or [38] for an inversion of a non uniform propagation speed $c(x)$.

1.2 Contribution of the paper

We tackle the general problem of applying GPR on scattered observations of solutions of the wave equation using "physically informed" GPs. We explore both the theoretical and applied aspects of this task.

We begin by proving a general result that provides a simple necessary and sufficient condition for the trajectories of any second order stochastic process to be solutions to a given linear PDE in the distributional sense (section 3). This condition is formulated in terms of its covariance function; the hypotheses are minimal and the displayed result is concise. This theorem is phrased in a functional analysis framework, using the permissive language of generalized functions.

We describe a general Gaussian process model for the homogeneous 3D wave equation, with the corresponding proofs (section 4). This model is obtained by putting GP priors on the initial conditions of the wave equation, which is a natural thing to do when one views these initial conditions as unknown. In particular we derive the corresponding positive definite covariance kernel which we will directly use for GPR. For short, we denote by WIGPR, as in "Wave Informed GPR", the use of this kernel to perform GPR. The exposed approach enforces strict linear homogeneous PDE constraints in the interior domain, similarly to what was observed in [19] [20] [22] among others for different PDEs; see the state of the art section for more details. More precisely, the trajectories of the corresponding GP all verify the wave equation, as well as the predictions provided by WIGPR. Here, these linear constraints are understood in light of our result from section 3, i.e. in the sense of distributions. The key difference with the kernels presented in [32] is that here, no stationarity assumptions are made on the underlying stochastic process. In particular the spectral measure provided by Bochner's theorem [5], which is the key tool used in [32], is not available anymore. We thus resort to more general integration techniques in the proofs.

We then provide an inverse problem interpretation of the use of WIGPR on wave equation data, as the prediction from WIGPR evaluated at $t = 0$ provides a finite dimensional reconstruction of the real initial conditions corresponding to the observed data. This is a natural thing to do when adopting a Bayesian point of view, where the prior GP distribution is conditioned on the data. The resultant posterior distribution enables the estimation of the parameters on which the priors were set, i.e. the initial conditions. Among other things, the solution to the corresponding inverse problem provided by this approach is implementable in practice.

Two particular cases are investigated: the radial symmetry and the point source. When the initial conditions exhibit radial symmetry, we derive convolution-free covariance formulas and discuss them when the initial conditions are compactly supported. Indeed, these formulas can be directly linked to the finite speed propagation principle for the 3D wave equation, also known as the strong Huygens principle. In the case of the point source, we show numerically (Figure 1) and theoretically that the parameter fitting step from WIGPR naturally reduces to the classic triangulation approach for point source localization, used for instance in GPS systems. Indeed, as in [2], WIGPR can be used to jointly estimate physical parameters such as wave speed, source localization or source size.

Note that the wave equation differs from most of the PDEs mentioned in the introduction, such as the heat equation or Laplace's equation, because these are either parabolic or elliptic. There are known regularization effects for such types of PDEs [3] which mitigate the need for precise mathematical argumentation w.r.t. the derivations of GPR for PDEs. Such regularization effects completely disappear for hyperbolic PDEs such as the wave equation [3]. In that case, precise mathematical argumentation and rigorous derivations become critical [3]. This is one reason why throughout this section, no proofs are omitted and precise arguments are established to justify all the derivations that lead to the exposed formulas.

We showcase a few numerical experiments of WIGPR (section 5). They are performed on numerically generated wave equation data, with radially symmetric compactly supported initial conditions. This data takes the form of a number of noise polluted time series, each of them corresponding to an "artificial" sensor placed in the numerical simulation. We thus use the fast-to-compute covariance expressions derived in the previous section. The tackled questions concern the quality of the estimation of certain physical parameters, the quality of the initial condition reconstruction and the sensitivity of the reconstruction step w.r.t. the sensor location. We display initial condition reconstruction images, in light of the inverse problem interpretation described in the previous section. More complete numerical results are presented in appendix A, showing for each example the quality of the physical parameter estimation as well as $L^2$, $L^1$ and $L^\infty$ relative error estimates in terms of the number of sensors used.

Organization of the paper. The paper is organized as follows. For self-containment, section 2 is dedicated to reminders on GPs, GPR and generalized functions. This section and all the proofs are detailed enough so that this article is accessible both to the analyst and the statistician. In section 3, we state and prove our new necessary and sufficient condition on stochastic processes that are subject to linear differential constraints. Section 4 is dedicated to the study of the wave equation thanks to Gaussian processes and Gaussian process regression. In section 5, we showcase some numerical applications of the previous section on wave equation data. We conclude in section 6.

2 Notations and background

2.1 Notations

Let $m$ be a scalar function defined on an open set $D \subset \mathbb{R}^d$ and $k$ a scalar function defined on $D \times D$. Let $X = (x_1, ..., x_n)^T$ be a column vector in $D^n$. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, over which all the random objects of this article will be defined.

N.1 We denote by $m(X)$ the column vector such that $m(X)_i = m(x_i)$, by $k(X, X)$ the square matrix such that $k(X, X)_{ij} = k(x_i, x_j)$ and, given $x \in D$, by $k(X, x)$ the column vector such that $k(X, x)_i = k(x_i, x)$.

N.2 For any positive definite kernel $k$, $H_k$ denotes the associated Reproducing Kernel Hilbert Space (RKHS) as defined in [30].

N.3 We denote by $L^2(\mathbb{P})$ the Hilbert space of real-valued random variables defined on $\Omega$ with finite second order moment, endowed with the inner product $\langle X, Y \rangle = \mathbb{E}[XY]$. For a GP $(U(x))_{x \in D}$, the trajectory of $U$ at point $\omega \in \Omega$ is the deterministic function $x \mapsto U(x)(\omega)$ and is denoted by $U_\omega$. $L(U) := \mathrm{Span}(U(x), x \in D) \subset L^2(\mathbb{P})$ denotes the Hilbert subspace of $L^2(\mathbb{P})$ induced by $U$. Since $L^2(\mathbb{P})$-limits of Gaussian random variables drawn from the same GP remain Gaussian [39], $L(U)$ only encompasses Gaussian random variables.

N.4 If $X$ is a random variable, then "$X \in A$ a.s." means "$X \in A$ almost surely", or equivalently $\mathbb{P}(X \in A) = 1$. Likewise, if $f$ is a function defined on $D \subset \mathbb{R}^d$, then "$f(x) \in A$ a.e." means "$f(x) \in A$ almost everywhere" or equivalently, $\lambda_d(\{x \in D : f(x) \notin A\}) = 0$ where $\lambda_d$ is the Lebesgue measure on $\mathbb{R}^d$.

N.5 $L^1_{loc}(D)$ denotes the space of measurable scalar functions $f$ defined on $D$ that are locally integrable, i.e. such that $\int_K |f| < +\infty$ for all compact sets $K \subset D$. $\mathcal{D}(D)$ denotes the space of compactly supported infinitely differentiable functions supported on $D$.

N.6 For $k = (k_1, ..., k_d) \in \mathbb{N}^d$, we use the usual notation $|k| = k_1 + ... + k_d$ and $\partial^k := \partial_{x_1}^{k_1} \cdots \partial_{x_d}^{k_d}$ where $\partial_{x_i}^{k_i}$ is the $k_i$-th derivative w.r.t. the $i$-th coordinate $x_i$.

N.7 The variables $(r, \theta, \varphi)$ will always denote spherical coordinates; $S(0, 1)$ denotes the unit sphere of $\mathbb{R}^3$ and we will always write $d\Omega = \sin\theta \, d\theta \, d\varphi$ for its surface differential element; $\gamma = (\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta)^T \in S(0, 1)$ denotes the unit length vector parametrized by $(\theta, \varphi)$.

2.2 Gaussian Processes

We refer to [5] for further details on Gaussian processes.

Definition. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $D \subset \mathbb{R}^d$ an open set. A Gaussian process $(U(x))_{x \in D}$ is a collection of normally distributed random variables defined on $\Omega$ and indexed by $D$ such that for any $(x_1, ..., x_n) \in D^n$, the law of $(U(x_1), ..., U(x_n))^T$ is a multivariate normal distribution. Its trajectories are the deterministic functions $U_\omega : x \mapsto U(x, \omega)$ for any $\omega \in \Omega$. The law of a GP is characterized by its mean and covariance functions, defined by

• $m(x) := \mathbb{E}[U(x)]$

• $k(x, x') = \mathrm{Cov}(U(x), U(x')) = \mathbb{E}[(U(x) - m(x))(U(x') - m(x'))]$

We write $(U(x))_{x \in D} \sim GP(m, k)$.
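As a minimal illustration of this definition, the Python sketch below draws samples of the Gaussian vector (U(x_1), ..., U(x_n))^T, whose law is the multivariate normal distribution with mean m(X) and covariance k(X, X). The squared exponential kernel, its length scale, the one dimensional grid and the small diagonal "jitter" added for numerical stability are illustrative assumptions, not choices made in this article.

```python
import numpy as np

def sq_exp_kernel(x, y, length=0.3):
    """Squared exponential kernel on R (illustrative choice, not the paper's kernel)."""
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / length**2)

def sample_gp(x, mean_fn, kernel_fn, n_samples, rng, jitter=1e-8):
    """Draw samples of (U(x_1), ..., U(x_n))^T ~ N(m(X), k(X, X))."""
    mean = mean_fn(x)
    cov = kernel_fn(x, x) + jitter * np.eye(len(x))  # jitter: numerical safeguard only
    chol = np.linalg.cholesky(cov)                   # cov = chol @ chol.T
    z = rng.standard_normal((len(x), n_samples))     # i.i.d. N(0, 1) entries
    return mean[:, None] + chol @ z                  # one column per sampled vector

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
samples = sample_gp(x, np.zeros_like, sq_exp_kernel, n_samples=3, rng=rng)
```

Each column of `samples` is one trajectory of the centered GP evaluated on the grid `x`; only finitely many values of a trajectory can ever be simulated this way.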


Covariance kernels. The function $m$ can be any function, and is actually often set to zero. On the other hand, the function $k$ has to be positive definite (PD):

$$\sum_{i,j=1}^{n} a_i a_j k(x_i, x_j) \geq 0 \quad \forall (x_1, ..., x_n) \in D^n, \ \forall (a_1, ..., a_n) \in \mathbb{R}^n \tag{2}$$

PD functions verify the Cauchy-Schwarz inequality [5]:

$$\forall x, x' \in D, \quad |k(x, x')| \leq \sqrt{k(x, x)} \sqrt{k(x', x')} \tag{3}$$

The covariance kernel $k$ is the core element that encodes the mathematical properties of the GP. Furthermore, there is a one-to-one correspondence between positive definite kernels and (covariance kernels of) centered GPs [40]. Thus we will focus on the design of positive definite kernels.

Among all covariance kernels, some are said to be stationary, in which case the value at $(x, x')$ only depends on the increment $x - x'$: $k(x, x') = k_S(x - x')$. Common examples are the squared exponential and Matérn kernels [5]; see equation (117) for an example.
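The positive definiteness condition (2), the Cauchy-Schwarz inequality (3) and stationarity can all be checked numerically on a Gram matrix. The sketch below does so for a squared exponential kernel on the real line; the kernel, its length scale and the random point set are assumptions made for illustration only, and the checks are numerical (up to rounding), not proofs.

```python
import numpy as np

def sq_exp_kernel(x, y, length=0.3):
    """Stationary squared exponential kernel: depends only on x - x' (illustrative)."""
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / length**2)

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=15)
K = sq_exp_kernel(x, x)                      # Gram matrix k(X, X)

# (2): the quadratic form sum_ij a_i a_j k(x_i, x_j) for 1000 random vectors a
a = rng.standard_normal((15, 1000))
quad_forms = np.einsum("is,ij,js->s", a, K, a)
min_quad = quad_forms.min()
min_eig = np.linalg.eigvalsh(K).min()        # equivalent check through the spectrum

# (3): |k(x, x')| <= sqrt(k(x, x)) * sqrt(k(x', x'))
diag = np.diag(K)
cauchy_schwarz_ok = bool(np.all(np.abs(K) <= np.sqrt(np.outer(diag, diag)) + 1e-12))

# Stationarity: shifting every point by the same h leaves the Gram matrix unchanged
stationary_ok = bool(np.allclose(K, sq_exp_kernel(x + 2.0, x + 2.0)))
```

The smallest eigenvalue may come out as a tiny negative number because of floating point rounding, which is why comparisons are made with a tolerance.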

Bayesian inference of functions. A Gaussian process $U = (U(x))_{x \in D}$ can also be seen as a random variable that is valued in a space of functions, i.e. a random function. Indeed, $U$ can equivalently be viewed as the following random variable:

$$U : \begin{cases} (\Omega, \mathcal{F}, \mathbb{P}) \longrightarrow (E, \mathcal{T}) \\ \omega \longmapsto U_\omega = [x \mapsto U(x)(\omega)] \end{cases} \tag{4}$$

where $(E, \mathcal{T})$ is a measurable space of functions large enough to contain the trajectories of $U$. If $E$ is a Banach space, $\mathcal{T}$ can for example be set to be the Borel $\sigma$-algebra associated to the normed vector space topology of $E$. This in turn defines a probability distribution over $(E, \mathcal{T})$, which is the associated pushforward measure $\mathbb{P}_U$ defined by $\mathbb{P}_U(A) = \mathbb{P}(U \in A)$ for all $A \in \mathcal{T}$. If a function $u \in E$ is unknown, it can be modelled as a random function, for example $U$. Its a priori probability distribution will be $\mathbb{P}_U$ and we will say that we put a Gaussian process prior over $u$. This is typical of Bayesian inference, where probability distributions are assumed over unknown quantities prior to observing them. Thanks to Bayes' theorem, the prior probability distribution of $u$ can then be updated through probability conditioning when data on $u$ is available. The conditioned probability distribution over $u$ is called the posterior. Statistical indicators can then be derived from the posterior, such as expectation and standard deviation, to estimate the unknown quantity over which the prior was initially set. Bayesian inference is one way of understanding Gaussian process regression (next subsection).

2.3 Gaussian Process Regression

We refer to [5] for further details on Gaussian process regression.

Kriging equations. GPs can be used for function interpolation. Let $u$ be a function defined on $D$ of which we know a small dataset of values $B = \{u(x_1), ..., u(x_n)\}$. Conditioning the law of a GP $(U(x))_{x \in D} \sim GP(m, k)$ on the dataset $B$ yields a second GP $\tilde{U}$ with $\tilde{U}(x) := (U(x) \mid U(x_i) = u(x_i), i = 1, ..., n)$. The law of $\tilde{U}$ is known: $(\tilde{U}(x))_{x \in D} \sim GP(\tilde{m}, \tilde{k})$. $\tilde{m}$ and $\tilde{k}$ are given by the so-called Kriging equations (5) and (6). Let $X = (x_1, ..., x_n)^T$ and suppose that $k(X, X)$ is invertible, then [5]

$$\tilde{m}(x) = m(x) + k(X, x)^T k(X, X)^{-1} (u(X) - m(X)) \tag{5}$$

$$\tilde{k}(x, x') = k(x, x') - k(X, x)^T k(X, X)^{-1} k(X, x') \tag{6}$$

In a Bayesian framework, the initial GP $(U(x))_{x \in D}$ is the prior and the conditioned GP $(\tilde{U}(x))_{x \in D}$ is the posterior; the Kriging mean and covariance are simply the mean and covariance of the posterior. At location $x$, $\tilde{m}(x)$ is the prediction of $u(x)$. By construction, for all $i \in \{1, ..., n\}$, we have that $\tilde{m}(x_i) = u(x_i)$ and $\tilde{k}(x_i, x_i) = 0$. If observing noisy data $U_i = U(x_i) + \varepsilon_i$ with $(\varepsilon_1, ..., \varepsilon_n)^T \sim \mathcal{N}(0, \sigma^2 I_n)$ independent from $U$, one replaces $k(X, X)$ with $k(X, X) + \sigma^2 I_n$ above and leaves the terms $k(X, x)$ unchanged. This amounts to applying Tikhonov regularization on $k(X, X)$ and may also be used to approximate (5) and (6) when $k(X, X)$ is ill-conditioned.
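The Kriging equations (5) and (6), including the noisy variant, translate almost line by line into linear algebra. The following Python sketch is one possible implementation under stated assumptions: the squared exponential kernel, the function name `kriging`, the toy data and the use of `np.linalg.solve` instead of an explicit inverse are all illustrative choices that do not come from this article.

```python
import numpy as np

def sq_exp_kernel(x, y, length=0.3):
    """Squared exponential kernel on R (illustrative choice)."""
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / length**2)

def kriging(x_obs, u_obs, x_new, kernel_fn, noise_var=0.0, mean_fn=np.zeros_like):
    """Posterior mean (5) and covariance (6) at the points x_new.

    With noise_var > 0, k(X, X) is replaced by k(X, X) + sigma^2 I_n,
    while the cross terms k(X, x) are left unchanged.
    """
    K = kernel_fn(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_cross = kernel_fn(x_obs, x_new)                  # columns are k(X, x)
    alpha = np.linalg.solve(K, u_obs - mean_fn(x_obs))  # k(X, X)^{-1} (u(X) - m(X))
    post_mean = mean_fn(x_new) + k_cross.T @ alpha
    post_cov = kernel_fn(x_new, x_new) - k_cross.T @ np.linalg.solve(K, k_cross)
    return post_mean, post_cov

# Toy data: five noiseless observations of a sine function
x_obs = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
u_obs = np.sin(2.0 * np.pi * x_obs)

# Evaluating at the observation points recovers the interpolation property
mean_at_obs, cov_at_obs = kriging(x_obs, u_obs, x_obs, sq_exp_kernel)

# Prediction and posterior variance at new locations
x_new = np.linspace(0.0, 1.0, 11)
mean_new, cov_new = kriging(x_obs, u_obs, x_new, sq_exp_kernel)
```

In the noiseless case, `mean_at_obs` coincides with `u_obs` and the diagonal of `cov_at_obs` vanishes up to rounding, as stated above. Solving linear systems rather than forming the inverse is a common numerical precaution when k(X, X) is ill-conditioned.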

Tuning covariance kernels. For discussions on general kernel construction and selection, we refer to [5]. Usually, a family of kernels $k_\theta$ indexed by $\theta \in \Theta \subset \mathbb{R}^q$ is first selected. The elements of $\theta$ are the hyperparameters of $k_\theta$. One may then try to find the value $\theta^*$ that best fits the observations, which corresponds to maximizing the marginal likelihood. It is the probability density of the Gaussian random vector $(U(x_1), ..., U(x_n))^T$ at point $(u(x_1), ..., u(x_n))^T$, see equation (8). Denoting by $u_{obs} = (u(x_1), ..., u(x_n))^T$ the vector of observations at locations $X = (x_1, ..., x_n)$ and by $p(u_{obs} | \theta)$ the associated marginal likelihood at point $\theta$, we search for $\theta^*$ such that

$$\theta^* = \arg\max_{\theta} \ p(u_{obs} | \theta) \tag{7}$$

Explicitly, assuming that $m \equiv 0$, then $(U(x_1), ..., U(x_n))^T \sim \mathcal{N}(0, k_\theta(X, X))$ and

$$p(u_{obs} | \theta) = \frac{1}{(2\pi)^{n/2} \det k_\theta(X, X)^{1/2}} \, e^{-\frac{1}{2} u_{obs}^T k_\theta(X, X)^{-1} u_{obs}} \tag{8}$$

Set $L(\theta) := -2 \log p(u_{obs} | \theta) - n \log 2\pi$, then (7) is equivalent to

$$\theta^* = \arg\min_{\theta} \ L(\theta) \tag{9}$$

Problem (9) is better behaved numerically. From now on, we call $L(\theta)$ the negative log marginal likelihood and we have, for noiseless observations,

$$L(\theta) = u_{obs}^T k_\theta(X, X)^{-1} u_{obs} + \log \det k_\theta(X, X) \tag{10}$$

and for noisy observations with noise standard deviation $\sigma$,

$$L(\theta, \sigma^2) = u_{obs}^T (k_\theta(X, X) + \sigma^2 I_n)^{-1} u_{obs} + \log \det (k_\theta(X, X) + \sigma^2 I_n) \tag{11}$$

$\sigma$ can be interpreted as an additional hyperparameter and estimated through (9).
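A possible way to evaluate the criterion (11) is through a Cholesky factorization, which yields both the quadratic form and the log determinant. The sketch below assumes a squared exponential kernel with a single hyperparameter (its length scale), a fixed noise variance, synthetic data and a plain grid search for the minimization in (9); all of these are illustrative assumptions, and in practice a gradient based optimizer would typically replace the grid search.

```python
import numpy as np

def sq_exp_kernel(x, y, length):
    """Squared exponential kernel; theta reduces here to the length scale (assumption)."""
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / length**2)

def neg_log_marginal_likelihood(length, noise_var, x_obs, u_obs):
    """Criterion (11): u^T (K + s^2 I)^{-1} u + log det(K + s^2 I), with m = 0."""
    K = sq_exp_kernel(x_obs, x_obs, length) + noise_var * np.eye(len(x_obs))
    chol = np.linalg.cholesky(K)                    # K = chol @ chol.T
    w = np.linalg.solve(chol, u_obs)                # w^T w = u^T K^{-1} u
    data_fit = w @ w
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))   # log det K from the Cholesky factor
    return data_fit + log_det

# Synthetic noisy observations (toy data, not from the paper)
rng = np.random.default_rng(0)
x_obs = np.linspace(0.0, 1.0, 30)
u_obs = np.sin(2.0 * np.pi * x_obs) + 0.1 * rng.standard_normal(30)

# Grid search over the length scale, noise variance held fixed at 0.1^2
lengths = np.linspace(0.05, 1.0, 40)
values = np.array([neg_log_marginal_likelihood(l, 0.01, x_obs, u_obs) for l in lengths])
best_length = lengths[int(np.argmin(values))]
```

Working with the Cholesky factor avoids computing the determinant directly, which can underflow or overflow for moderately large n.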

Scattered data interpolation and the RKHS point of view. Kriging equations (5) and (6) can be encountered without resorting to GPs. Given a positive definite kernel $k$ defined on $D$, one may build a Reproducing Kernel Hilbert Space (RKHS) of functions defined on $D$ which we denote by $H_k$, see N.2. The inner product of $H_k$ verifies the so called reproducing property:

$$\forall x, x' \in D, \quad \langle k(x, \cdot), k(x', \cdot) \rangle_{H_k} = k(x, x') \tag{12}$$

In the meshfree interpolation framework [30] [41], one may formulate the following constrained (interpolation) optimization problem

$$\min_{v \in H_k} \|v\|_{H_k} \quad \text{s.t.} \quad v(x_i) = u(x_i) \quad \forall i \in \{1, ..., n\} \tag{13}$$

Solving (13) leads to the Kriging equation for $\tilde{m}$ in (5); the second equation (6) is what is called the power function in [41]. One may also show [30] that equation (5) can be summarized as

$$\tilde{m} = m + p_F(u - m) \tag{14}$$

where $F$ is the finite dimensional space defined as $F = \mathrm{Span}(k(x_1, \cdot), ..., k(x_n, \cdot)) \subset H_k$ and $p_F$ stands for the orthogonal projection operator on $F$ w.r.t. the inner product of $H_k$. In particular, when $m \equiv 0$, equation (14) amounts to $\tilde{m} = p_F(u)$. Likewise, equation (6) amounts to

$$\tilde{k}(x, \cdot) = P_{F^\perp}(k(x, \cdot)) \quad \text{and} \quad \tilde{k}(x, x) = \|P_{F^\perp} k(x, \cdot)\|^2_{H_k} \leq \|k(x, \cdot)\|^2_{H_k} = k(x, x) \tag{15}$$

One perk of this approach is that the Kriging mean can now be understood as an orthogonal projection over a finite dimensional deterministic space, which is reminiscent of Fourier series or Galerkin reconstruction approaches.


2.4 Generalized functions

We refer to [42] and [43] for further details on generalized functions. In this whole subsection, $D$ is an open set of $\mathbb{R}^d$.

Definitions and properties. Endow $\mathcal{D}(D)$ with its usual LF-space topology, defined for example in [42]. We call generalized function any continuous linear form on $\mathcal{D}(D)$, i.e. any element of $\mathcal{D}(D)'$, the topological dual of $\mathcal{D}(D)$. We will rather denote it by $\mathcal{D}'(D)$ as in [42]. The topology of $\mathcal{D}(D)$ is such that $T \in \mathcal{D}'(D)$ if and only if for all compact sets $K \subset D$,

$$\exists C_K > 0, \ \exists n_K \in \mathbb{N}, \ \forall \varphi \in \mathcal{D}(D) \text{ s.t. } \mathrm{Supp}(\varphi) \subset K, \quad |T(\varphi)| \leq C_K \sum_{|k| \leq n_K} \|\partial^k \varphi\|_\infty \tag{16}$$

Generalized functions are also called "distributions", a terminology we will only use when there is no risk of confusion with probability distributions. The duality bracket will be denoted by $\langle , \rangle$: for $\varphi \in \mathcal{D}(D)$ and $T \in \mathcal{D}'(D)$, we have $\langle T, \varphi \rangle = T(\varphi)$.

• Any function $f \in L^1_{loc}(D)$ can be injectively identified to a generalized function $T_f$ [42] defined as follows

$$\forall \varphi \in \mathcal{D}(D), \quad \langle T_f, \varphi \rangle := \int_D f(x) \varphi(x) dx \tag{17}$$

The map $L^1_{loc}(D) \ni f \mapsto T_f$ is linear and injective. Throughout this article, we will use the abuse of notation $\langle T_f, \varphi \rangle = \langle f, \varphi \rangle$, as if $\langle , \rangle$ were the $L^2$ inner product.

• Any generalized function $T$ can be indefinitely differentiated [42] with the following definition (see N.6)

$$\partial^k T : \varphi \mapsto \langle T, (-1)^{|k|} \partial^k \varphi \rangle \tag{18}$$

which coincides with the definition of weak derivatives when $T$ is a function that admits the corresponding weak derivatives [42].

In particular, (17) and (18) combined provide a flexible definition for the derivatives of any function $f \in L^1_{loc}(D)$ up to any order.
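To make definitions (17) and (18) concrete, the sketch below checks numerically, in dimension one, that the distributional derivative of f(x) = |x| acts on a test function like the sign function does: the pairing of f with minus the derivative of the test function equals the pairing of sign with the test function. The bump test function, its center and radius, the grid and the hand-written trapezoidal rule are all assumptions made for this illustration; the check is a quadrature approximation, not a proof.

```python
import numpy as np

def bump(x, center=0.3, radius=1.0):
    """Smooth test function supported in [center - radius, center + radius]."""
    t = (x - center) / radius
    out = np.zeros_like(x)
    inside = t**2 < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))
    return out

def bump_derivative(x, center=0.3, radius=1.0):
    """Classical derivative of the bump function (chain rule)."""
    t = (x - center) / radius
    out = np.zeros_like(x)
    inside = t**2 < 1.0
    ti = t[inside]
    out[inside] = np.exp(-1.0 / (1.0 - ti**2)) * (-2.0 * ti) / (1.0 - ti**2) ** 2 / radius
    return out

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

x = np.linspace(-2.0, 2.0, 40001)
f = np.abs(x)                                   # f is locally integrable, not differentiable at 0

lhs = -trapezoid(f * bump_derivative(x), x)     # pairing of the derivative of T_f with phi, via (18)
rhs = trapezoid(np.sign(x) * bump(x), x)        # pairing of the sign function with phi, via (17)
```

Up to quadrature error, `lhs` and `rhs` agree, which reflects the fact that the weak derivative of |x| is the sign function even though |x| has no classical derivative at the origin.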

Radon measures. In this paper, we call positive Radon measure any positive measure over $D$ that is Borel regular ([44], Def 1.9) and that has finite mass over any compact subset of $D$. Borel regularity is a standard regularity hypothesis from measure theory. We call real-valued Radon measure, or simply Radon measure, any linear combination of positive Radon measures. In [45], Chapter IX, it is proved that the space of Radon measures over $D$ is isomorphic to the space of continuous linear forms over $C_c(D)$, the space of compactly supported continuous functions on $D$ endowed with its usual LF-space topology described e.g. in [43]. The corresponding isomorphism is given by

$$\mu \longmapsto \begin{cases} C_c(D) \longrightarrow \mathbb{R} \\ f \longmapsto \int_D f(x) \mu(dx) \end{cases} \tag{19}$$

We have the following facts:

• Any signed measure that admits a density $f$ w.r.t. the Lebesgue measure such that $f \in L^1_{loc}(D)$ is a Radon measure ([43], p. 217).

• Mimicking (17) and (19), any Radon measure can be injectively identified to a generalized function with the following identification [43]

$$\forall \varphi \in \mathcal{D}(D), \quad \langle \mu, \varphi \rangle := \int_D \varphi(x) \mu(dx) \tag{20}$$

In particular, Radon measures can be differentiated up to any order through equation (18).

• For any Radon measure $\mu$, there is a unique couple $(\mu^+, \mu^-)$ of positive Radon measures such that $\mu = \mu^+ - \mu^-$ ([45], Chapter IX). We then define $|\mu|$ as

$$|\mu| := \mu^+ + \mu^- \tag{21}$$

• If $\mu$ and $\nu$ are two finite Radon measures over $\mathbb{R}^d$ (i.e. $\int_{\mathbb{R}^d} |\mu|(dx) < \infty$ and likewise for $\nu$), their convolution $\mu * \nu$ is defined as follows: let $\mathcal{B}(\mathbb{R}^d)$ be the Borel $\sigma$-algebra of $\mathbb{R}^d$, then

$$\forall A \in \mathcal{B}(\mathbb{R}^d), \quad (\mu * \nu)(A) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} 1_A(x + y) \mu(dx) \nu(dy) \tag{22}$$

and $\mu * \nu$ is also a Radon measure over $\mathbb{R}^d$. When $\mu$ and $\nu$ have densities $f_\mu$ and $f_\nu$, $\mu * \nu$ has the density $f_\mu * f_\nu$ defined by

$$(f_\mu * f_\nu)(x) = \int_{\mathbb{R}^d} f_\mu(y) f_\nu(x - y) dy$$
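Definition (22) becomes fully explicit for finite combinations of Dirac masses: if $\mu = \sum_i a_i \delta_{x_i}$ and $\nu = \sum_j b_j \delta_{y_j}$, then $\mu * \nu = \sum_{i,j} a_i b_j \delta_{x_i + y_j}$. The Python sketch below implements this special case on the real line; the function names, the atoms and the weights are hypothetical choices made for illustration, and one of the two measures is deliberately signed.

```python
import numpy as np

def convolve_discrete_measures(atoms_mu, weights_mu, atoms_nu, weights_nu):
    """Convolution of two finite sums of Dirac masses on R.

    Returns the atoms x_i + y_j and the weights a_i * b_j of mu * nu.
    """
    atoms = (atoms_mu[:, None] + atoms_nu[None, :]).ravel()
    weights = (weights_mu[:, None] * weights_nu[None, :]).ravel()
    return atoms, weights

def measure_of_interval(atoms, weights, a, b):
    """Mass given by a discrete measure to the closed interval A = [a, b]."""
    return weights[(atoms >= a) & (atoms <= b)].sum()

# mu = delta_0 + 2 delta_1 (positive), nu = 0.5 delta_0 - delta_2 (signed)
atoms_mu, weights_mu = np.array([0.0, 1.0]), np.array([1.0, 2.0])
atoms_nu, weights_nu = np.array([0.0, 2.0]), np.array([0.5, -1.0])

atoms, weights = convolve_discrete_measures(atoms_mu, weights_mu, atoms_nu, weights_nu)
total_mass = weights.sum()                                   # (mu * nu)(R)
mass_interval = measure_of_interval(atoms, weights, 0.5, 1.5)  # (mu * nu)([0.5, 1.5])
```

Taking $A = \mathbb{R}$ in (22) shows that the total mass of $\mu * \nu$ is the product of the total masses of $\mu$ and $\nu$, which `total_mass` reproduces here.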

Remark 1. What is meant behind the terminology of Radon measures varies between authors. [44] calls Radon measure what we call positive Radon measure in this article. [45] proves that continuous linear forms over $C_c(D)$ are differences of Radon measures in the sense of the Radon measures defined in [44], but [45] never uses the term of Radon measures, positive or not. Likewise, [43] calls positive Radon measure any positive linear form over $C_c(D)$ which, thanks to the proof from [45], reduces to Radon measures in the sense of [44].

Finite order generalized functions Let $k$ be a non-negative integer. We consider $C_c^k(D)$, the space of compactly supported functions of class $C^k$, endowed with its usual LF-space topology [42]. We denote by $C_c^k(D)'$ its topological dual. The topologies of $C_c^k(D)$ and $\mathcal{D}(D)$ are such that the canonical injection $\mathcal{D}(D) \to C_c^k(D)$ is continuous [43], which yields $C_c^k(D)' \subset \mathcal{D}'(D)$: continuous linear forms over $C_c^k(D)$, when restricted to $\mathcal{D}(D)$, become continuous linear forms over $\mathcal{D}(D)$, i.e. generalized functions. We then have the following definitions and facts.

• Generalized functions $T \in \mathcal{D}'(D)$ that are restrictions of continuous linear forms over $C_c^k(D)$ are called generalized functions of order $k$. If $T$ is of order $k$ for some $k \in \mathbb{N}$, $T$ is said to be of finite order.

• $T \in \mathcal{D}'(D)$ is at most of order $n$ if, in equation (16), the integer $n_K$ can always be taken equal to $n$, whatever the compact set $K$.

• Let $T$ be a generalized function of order $k$. Then [43] there exists a family of Radon measures $\{\mu_p\}_{|p| \le k}$ over $D$ such that

$$T = \sum_{|p| \le k} \partial^p \mu_p \tag{23}$$

where the equality in (23) holds in $\mathcal{D}'(D)$ and $C_c^k(D)'$. Note that we recover (20) when $k = 0$.

• Among the finite order generalized functions are those that are compactly supported, i.e. those for which the measures $\mu_p$ such that $T = \sum_{|p| \le k} \partial^p \mu_p$ all have compact support. One useful property is that the Fourier transform of any compactly supported generalized function can be defined [42].
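As a worked illustration of the representation (23), which is not taken from the source, consider the derivative of the Dirac mass at $0$ on $D = \mathbb{R}$. By the distributional definition of the derivative, for any test function $\varphi$:

```latex
\langle \partial \delta_0, \varphi \rangle
  = -\langle \delta_0, \varphi' \rangle
  = -\varphi'(0),
\qquad
|\langle \partial \delta_0, \varphi \rangle| \le \|\varphi'\|_\infty .
```

The bound only involves the first derivative of $\varphi$, so $\partial \delta_0$ extends to a continuous linear form over $C_c^1(\mathbb{R})$: it is of order 1, and (23) holds with $\mu_0 = 0$ and $\mu_1 = \delta_0$. Since $\delta_0$ has compact support, $\partial \delta_0$ is also a compactly supported generalized function.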

Convolution with generalized functions Let $k$ be a non-negative integer. As above, we consider $C_c^k(\mathbb{R}^d)$ endowed with its usual topology. Let $f \in C_c^k(\mathbb{R}^d)$ and $T \in C_c^k(\mathbb{R}^d)'$. Denote by $\tau_x f$ the function $y \mapsto f(y - x)$ and by $\check{f}$ the function $y \mapsto f(-y)$. Then [43] one may define the convolution between $T$ and $f$ by

$$T * f : x \mapsto \langle T, (\tau_{-x} f)^{\vee} \rangle \tag{24}$$

where $(\tau_{-x} f)^{\vee}$ is the function $y \mapsto f(x - y)$, and $T * f$ is a function in the classical sense, i.e. defined pointwise. When $T$ lies in $L^1_{loc}(\mathbb{R}^d)$, equation (24) reduces to the usual convolution of functions

$$(T * f)(x) = \int_{\mathbb{R}^d} T(y)\,f(x - y)\,dy$$

through the identification defined in equation (17). More general definitions of the convolution of generalized functions are available [43], but this one is sufficient for our use.
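Two elementary cases may help to read definition (24). They are given here as an illustration only, reading the test function in (24) as $y \mapsto f(x - y)$, which is the reading consistent with the integral formula above. Take $d = 1$, a point $a \in \mathbb{R}$ and $f \in C_c^1(\mathbb{R})$:

```latex
(\delta_a * f)(x) = \langle \delta_a,\; y \mapsto f(x - y) \rangle = f(x - a),
\qquad
(\partial \delta_0 * f)(x)
  = -\Big\langle \delta_0,\; \tfrac{d}{dy} f(x - y) \Big\rangle
  = f'(x).
```

Convolution with a Dirac mass is thus a translation, and convolution with its derivative is a differentiation. Both results are functions defined pointwise, as stated above.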

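Tensor product of generalized functions For two generalized functions $T_1 \in \mathcal{D}'(D_1)$ and $T_2 \in \mathcal{D}'(D_2)$, $T_1 \otimes T_2 \in \mathcal{D}'(D_1 \times D_2)$ denotes their tensor product [43], which is uniquely determined by the following tensor property:

$$\forall \varphi_1 \in \mathcal{D}(D_1),\ \forall \varphi_2 \in \mathcal{D}(D_2), \quad \langle T_1 \otimes T_2, \varphi_1 \otimes \varphi_2 \rangle = \langle T_1, \varphi_1 \rangle \times \langle T_2, \varphi_2 \rangle \tag{25}$$

$T_1 \otimes T_2$ reduces to the tensor product of functions when $T_1$ and $T_2$ are functions through the identification of equation (17), and to the product measure when $T_1$ and $T_2$ are Radon measures through (20).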
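For instance (an illustration that is not part of the source), for two Dirac masses $\delta_a$ on $D_1$ and $\delta_b$ on $D_2$, property (25) gives

```latex
\langle \delta_a \otimes \delta_b, \varphi_1 \otimes \varphi_2 \rangle
  = \varphi_1(a)\,\varphi_2(b)
  = (\varphi_1 \otimes \varphi_2)(a, b)
  = \langle \delta_{(a,b)}, \varphi_1 \otimes \varphi_2 \rangle ,
```

so that $\delta_a \otimes \delta_b = \delta_{(a,b)}$, which is indeed the product measure of $\delta_a$ and $\delta_b$.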

3 Stochastic processes under linear differential constraints

One may wish to force the trajectories of a stochastic process $U = (U(x))_{x \in D}$ to verify linear constraints, i.e. to lie in the kernel of some linear operator. This is a priori an ambitious task as the trajectories of $U$ form a vast set of functions. However, if $U$ is a second order stochastic process (i.e. $\forall x \in D$, $\mathrm{Var}(U(x)) < +\infty$), then in many cases linear constraints on the trajectories of $U$ can be completely translated into linear constraints on the covariance kernel of $U$. In particular, these new linear constraints are imposed on a much smaller set of accessible "explicit" functions. Overall, the resulting constraints on the covariance kernel of $U$ are much easier to handle than the constraints on the trajectories of $U$. This idea was thoroughly explored in [1], where different general frameworks were studied in order to formulate mathematical results on linearly constrained stochastic processes. In Proposition 1, we recall a particular result from [1] that was then applied to the stationary heat equation in the same article.

Denote by $F(D, \mathbb{R})$ the space of real-valued functions defined on $D$. Proposition 1 is based on the so-called Loève isometry [30] between $\mathcal{L}(U)$ and $L^2(\mathbb{P})$ (see N.3 for notations), which in turn leads to the following result.

Proposition 1 (Trajectories of GPs under linear constraints [1]). Let $(U(x))_{x \in D} \sim GP(0, k)$ be a centered GP. For all $x \in D$, denote by $k_x$ the function $y \mapsto k(x, y)$. Let $E$ be a real vector space of functions defined on $D$ that contains the trajectories of $U$ almost surely and let $T : E \to F(D, \mathbb{R})$ be a linear operator. Suppose that for all $x \in D$, $T(U)(x) \in \mathcal{L}(U)$. Then there exists a unique linear operator $\mathcal{T} : H_k \to F(D, \mathbb{R})$ such that for all $x, x' \in D$,

$$\mathbb{E}[T(U)(x)\,U(x')] = \mathcal{T}(k_{x'})(x) \quad \text{and} \quad \forall x \in D,\ \forall h_n \xrightarrow{H_k} h,\ \mathcal{T}(h_n)(x) \to \mathcal{T}(h)(x).$$

Moreover, the following statements are equivalent:

(i) $\mathbb{P}(\{\omega \in \Omega : T(U_\omega) = 0\}) = 1$

(ii) $\forall x \in D$, $\mathcal{T}(k_x) = 0$

(iii) $\mathcal{T}(H_k) = \{0\}$

This result can be applied when $T$ is a differential operator, as discussed in [1]. However, in Proposition 1, the differential operator $T$ of order $n$ has to be valued in the space of (classical, pointwise defined) functions $F(D, \mathbb{R})$; in particular, for $u \in E$, the function $T(u)$ has to be defined pointwise in order to use the Loève isometry. To summarize, in full generality the derivatives in $T$ have to be understood in a classical sense and $E$ has to be contained in $D^n(D)$, the space of $n$ times differentiable functions on $D$. Requiring that $E \subset D^n(D)$ is a very strong assumption on the trajectories of $U$; furthermore, it does not match the usual way of studying PDEs, where derivatives are understood in a weaker sense. We present in Proposition 2 an adaptation of Proposition 1 where we make use of the distributional definition of derivatives and relax the assumptions made on $U$ and its trajectories. In this proposition, $(U(x))_{x \in D}$ is not supposed to be Gaussian and is only required to be second order. We refer to the notation paragraphs N.5 and N.6.

Proposition 2 (Trajectories of stochastic processes under linear differential constraints). Let $D \subset \mathbb{R}^d$ be an open set and let $T = \sum_{|k| \le n} a_k(x)\,\partial^k$ be a linear differential operator with coefficients $a_k \in C^{|k|}(D)$. Let $U = (U(x))_{x \in D}$ be a second order stochastic process with mean function $m(x)$ and covariance kernel $k(x, x')$. For all $x \in D$, denote by $k_x$ the function $y \mapsto k(x, y)$. Suppose that the mean function $m$ lies in $L^1_{loc}(D)$, as well as the standard deviation function $\sigma : x \mapsto \sqrt{k(x, x)}$.

1) Then, on a set of probability 1, the trajectories of $U$ lie in $L^1_{loc}(D)$, as do the functions $k_x$ for all $x \in D$.

2) Suppose that $T(m) = 0$ in the sense of distributions. Then the following statements are equivalent:

(i) $\mathbb{P}(T(U) = 0 \text{ in the sense of distributions}) = 1$

(ii) $\forall x \in D$, $T(k_x) = 0$ in the sense of distributions.

Here we write down precisely what we mean by (i) and (ii). Denote by $T^*$ the formal adjoint of $T$, defined by $T^* u = \sum_{|k| \le n} (-1)^{|k|} \partial^k (a_k u)$. By (i), we mean that

$$\exists A \in \mathcal{F},\ \mathbb{P}(A) = 1,\ \forall \omega \in A,\ \forall \varphi \in \mathcal{D}(D), \quad \langle U_\omega, T^*\varphi \rangle = \int_D U_\omega(x)\,T^*\varphi(x)\,dx = 0 \tag{26}$$

Similarly, (ii) means that

$$\forall x \in D,\ \forall \varphi \in \mathcal{D}(D), \quad \langle k_x, T^*\varphi \rangle = \int_D k_x(y)\,T^*\varphi(y)\,dy = 0 \tag{27}$$

This definition can be found e.g. in [46]. The fact that the functions $x \mapsto U_\omega(x)$ and $y \mapsto k_x(y)$ lie in $L^1_{loc}(D)$ ensures the existence of the integrals in equations (26) and (27) (see point 2 of the proof of Proposition 2), as well as the continuity of the associated linear forms over $\mathcal{D}(D)$, following the definition (17). In every case, the term "in the sense of distributions" can be replaced by "in $\mathcal{D}'(D)$": stating that $T(f) = 0$ in the sense of distributions means that $T(f)$, seen as an element of $\mathcal{D}'(D)$, is equal to the null generalized function $0_{\mathcal{D}'(D)} : \varphi \mapsto 0$.
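The weak formulations (26) and (27) can be checked numerically on a toy case. The sketch below is an illustration under assumptions that are not in the source: $D = \mathbb{R}$, $T = d^2/dx^2$ (so that $T^* = T$), the kernel $k(x, y) = 1 + xy$, and the standard bump function supported in $[-1, 1]$ as test function. The names `bump_dd` and `pairing` are hypothetical. A centered process with this kernel is $U(y) = A + By$ with $A, B$ independent standard Gaussian variables, whose paths are affine.

```python
import numpy as np

def bump_dd(y):
    """Second derivative of the bump phi(y) = exp(-1 / (1 - y^2)) on (-1, 1)."""
    s = 1.0 - y**2                          # positive on the open interval (-1, 1)
    phi = np.exp(-1.0 / s)
    g1 = -2.0 * y / s**2                    # first derivative of -1/s
    g2 = -2.0 / s**2 - 8.0 * y**2 / s**3    # second derivative of -1/s
    return (g2 + g1**2) * phi

# Interior grid of (-1, 1); the integrand vanishes at both end points.
y = np.linspace(-1.0, 1.0, 20001)[1:-1]
h = y[1] - y[0]
Tstar_phi = bump_dd(y)                      # T* phi = phi'' for T = d^2/dx^2

def pairing(f_values):
    """Trapezoidal approximation of <f, T* phi> = int f(y) phi''(y) dy."""
    return h * np.sum(f_values * Tstar_phi)

# Statement (ii): pair k_x(y) = 1 + x*y with T* phi for a few values of x.
kernel_pairings = [pairing(1.0 + x * y) for x in (-2.0, 0.3, 5.0)]

# Statement (i): pair one sample path U(y) = A + B*y with T* phi.
rng = np.random.default_rng(0)
A, B = rng.standard_normal(2)
path_pairing = pairing(A + B * y)
```

All pairings vanish up to quadrature and rounding errors, in agreement with the equivalence between (i) and (ii) for this particular kernel.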

Proof. Suppose first that $U$ is centered, i.e. $m \equiv 0$.

1) We begin by showing that the trajectories of $U$ almost surely lie in $L^1_{loc}(D)$. Note first that, thanks to the Cauchy-Schwarz inequality, $\mathbb{E}[|U(x)|] \le \sigma(x)$. Now, let $(K_n)_{n \in \mathbb{N}}$ be an increasing sequence of compact subsets of $D$ such that $\bigcup_{n \in \mathbb{N}} K_n = D$. Then for any $n \in \mathbb{N}$,

$$\mathbb{E}\Big[ \int_{K_n} |U(x)|\,dx \Big] = \int_{K_n} \mathbb{E}[|U(x)|]\,dx \le \int_{K_n} \sigma(x)\,dx < +\infty \tag{28}$$

since $\sigma \in L^1_{loc}(D)$. Using the property that "$\mathbb{E}[|X|] < +\infty \implies |X| < +\infty$ almost surely", this yields a set $B_n \subset \Omega$ of probability 1 such that the random variable $\omega \mapsto \int_{K_n} |U_\omega(x)|\,dx$ takes finite values over $B_n$. Consider now the set $B = \bigcap_{n \in \mathbb{N}} B_n$, which remains of probability 1. For every compact subset $K \subset D$, there exists an integer $n_K$ such that $K \subset K_{n_K}$ and thus for all $\omega \in B$,

$$\int_K |U_\omega(x)|\,dx \le \int_{K_{n_K}} |U_\omega(x)|\,dx < +\infty$$

which shows that the trajectories of $U$ lie in $L^1_{loc}(D)$ almost surely.

Now, we check that for all $x \in D$, $k_x \in L^1_{loc}(D)$: for any compact set $K$, since $\sigma \in L^1_{loc}(D)$ and because of (3),

$$\int_K |k_x(y)|\,dy = \int_K |k(x, y)|\,dy \le \sigma(x) \int_K \sigma(y)\,dy < \infty$$

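2) Let us check in advance that, whatever $f \in L^1_{loc}(D)$, the map $T(f) : \varphi \mapsto \langle f, T^*\varphi \rangle$ is a continuous linear form over $\mathcal{D}(D)$. Since $a_k \in C^{|k|}(D)$, we can apply Leibniz' rule to $T^*\varphi = \sum_{|k| \le n} (-1)^{|k|} \partial^k (a_k \varphi)$. This yields a family $\{f_k\}_{|k| \le n}$ of continuous functions over $D$ such that

$$\forall \varphi \in \mathcal{D}(D),\ \forall x \in D, \quad T^*\varphi(x) = \sum_{|k| \le n} f_k(x)\,\partial^k \varphi(x) \tag{29}$$

For all $f \in L^1_{loc}(D)$, for every compact set $K \subset D$ and for all $\varphi \in \mathcal{D}(D)$ such that $\mathrm{Supp}(\varphi) \subset K$, (29) yields

$$|\langle f, T^*\varphi \rangle| \le \int_D |f(x)|\,|T^*\varphi(x)|\,dx \le \Big( \int_K |f(x)|\,dx \times \max_{|k| \le n} \sup_{x \in K} |f_k(x)| \Big) \sum_{|k| \le n} \|\partial^k \varphi\|_\infty < +\infty \tag{30}$$

This proves that $T(f) : \varphi \mapsto \langle f, T^*\varphi \rangle$ is a continuous linear form over $\mathcal{D}(D)$ (eq. (16)).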

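(i) $\implies$ (ii): Suppose (i). Let $\varphi \in \mathcal{D}(D)$. There exists a set $A \subset \Omega$ such that $\mathbb{P}(A) = 1$ and such that for all $\omega \in A$,

$$\int_D U_\omega(x)\,T^*\varphi(x)\,dx = 0$$

Multiplying the equation above by $U_\omega(x')$, taking the expectation and formally permuting (for now) the integral and the expectation, we obtain

$$0 = \mathbb{E}\Big[ U(x') \int_D U(x)\,T^*\varphi(x)\,dx \Big] = \int_D T^*\varphi(x)\,\mathbb{E}[U(x)U(x')]\,dx = \int_D T^*\varphi(x)\,k(x, x')\,dx = \langle k_{x'}, T^*\varphi \rangle$$

The permutation of the integral and the expectation is justified by writing the expectation as an integral and using Fubini's theorem, after checking that the quantity below is finite:

$$\begin{aligned}
\mathbb{E}\Big[ \int_D |U(x')\,U(x)\,T^*\varphi(x)|\,dx \Big] &= \int_D |T^*\varphi(x)|\,\mathbb{E}[|U(x)U(x')|]\,dx \quad \text{(Tonelli)} \\
&\le \int_D |T^*\varphi(x)|\,\mathbb{E}[U(x)^2]^{1/2}\,\mathbb{E}[U(x')^2]^{1/2}\,dx \\
&\le \sigma(x') \int_D |T^*\varphi(x)|\,\sigma(x)\,dx < +\infty
\end{aligned}$$

The last integral is finite because of (30) and $\sigma \in L^1_{loc}(D)$. Thus, $\forall x \in D$, $\forall \varphi \in \mathcal{D}(D)$, $\langle k_x, T^*\varphi \rangle = 0$, which proves that (i) $\implies$ (ii).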

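(ii) $\implies$ (i): Suppose (ii). Let $\varphi \in \mathcal{D}(D)$; we have $\langle k_{x'}, T^*\varphi \rangle = 0$ for all $x' \in D$. Multiplying this by $T^*\varphi(x')$ and integrating w.r.t. $x'$ yields

$$0 = \int_D T^*\varphi(x') \int_D T^*\varphi(x)\,k(x, x')\,dx\,dx' = \int_D \int_D T^*\varphi(x)\,T^*\varphi(x')\,\mathbb{E}[U(x)U(x')]\,dx\,dx'$$

Formally permuting the expectation and the integrals (justified in equation (31)) yields

$$0 = \int_D \int_D T^*\varphi(x)\,T^*\varphi(x')\,\mathbb{E}[U(x)U(x')]\,dx\,dx' = \mathbb{E}\Big[ \Big( \int_D T^*\varphi(x)\,U(x)\,dx \Big)^2 \Big] = \mathbb{E}[\langle U, T^*\varphi \rangle^2]$$

and thus $\langle U, T^*\varphi \rangle = 0$ a.s.: there exists $A_\varphi \in \mathcal{F}$ with $\mathbb{P}(A_\varphi) = 1$ such that $\forall \omega \in A_\varphi$, $\langle U_\omega, T^*\varphi \rangle = 0$. We justify the permutation of the expectation and the integrals with the computation below:

$$\int_D \int_D |T^*\varphi(x)\,T^*\varphi(x')|\,\mathbb{E}[|U(x)U(x')|]\,dx\,dx' \le \int_D \int_D |T^*\varphi(x)\,T^*\varphi(x')|\,\sigma(x)\,\sigma(x')\,dx\,dx' \le \Big( \int_D |T^*\varphi(x)|\,\sigma(x)\,dx \Big)^2 < +\infty \tag{31}$$

The last integral is finite because of (30) and $\sigma \in L^1_{loc}(D)$.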

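This does not finish the proof, as we need to find a set $A$ with $\mathbb{P}(A) = 1$, independent of $\varphi$, such that $\forall \omega \in A$, $\forall \varphi \in \mathcal{D}(D)$, $\langle U_\omega, T^*\varphi \rangle = 0$. For this we use the fact that $\mathcal{D}(D)$ is a separable topological space, which we prove at the end of this proof. Let $F \subset \mathcal{D}(D)$ be a countable dense subset of $\mathcal{D}(D)$, let $A := B \cap \bigcap_{\varphi \in F} A_\varphi$ and let $\omega \in A$. Since $U_\omega \in L^1_{loc}(D)$, (30) shows that the map $L_\omega : \varphi \mapsto \langle U_\omega, T^*\varphi \rangle$ is a continuous linear form on $\mathcal{D}(D)$. The continuity of $L_\omega$ implies that $L_\omega(F)$ is a dense subset of $L_\omega(\mathcal{D}(D))$ [47]. But $L_\omega(F) = \{0\}$, which is closed in $\mathbb{R}$, and therefore $L_\omega(\mathcal{D}(D)) = L_\omega(F) = \{0\}$, which shows that

$$\forall \omega \in A,\ \forall \varphi \in \mathcal{D}(D), \quad \langle U_\omega, T^*\varphi \rangle = L_\omega(\varphi) = 0$$

Since $\mathbb{P}(A) = 1$ (as a countable intersection of events of probability 1), this shows that (ii) $\implies$ (i).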

When $U$ is not centered, consider the centered stochastic process $V$ defined by $V(x) = U(x) - m(x)$, to which the above proof can be applied. Since $T$ is linear and $m$ is supposed to verify $T(m) = 0$ in the sense of distributions, the probabilistic events $\{T(U) = 0 \text{ in the sense of distributions}\}$ and $\{T(V) = 0 \text{ in the sense of distributions}\}$ coincide and thus have the same probability. Finally, $U$ and $V$ have the same covariance kernel $k(x, x')$. Thus,

$$\begin{aligned}
\mathbb{P}(T(U) = 0 \text{ in the sense of distributions}) = 1 &\iff \mathbb{P}(T(V) = 0 \text{ in the sense of distributions}) = 1 \\
&\iff \forall x \in D,\ T(k_x) = 0
\end{aligned}$$

which finishes the proof in the general case.

Proof that $\mathcal{D}(D)$ is separable: $\mathcal{D}(D)$ is an LF-space as the inductive limit of the Fréchet spaces $\mathcal{D}_{K_i}(D) := \{\varphi \in C^\infty(D) : \mathrm{Supp}(\varphi) \subset K_i\}$, $i \in \mathbb{N}$, where $K_1 \subset K_2 \subset \dots$ are compact subsets of $D$ such that $\bigcup_i K_i = D$ ([43], p.131-133). As such, $\mathcal{D}(D)$ is separable iff $\mathcal{D}_{K_i}(D)$ is separable for all $i \in \mathbb{N}$ [48], which we now show. The Fréchet topology of $\mathcal{D}_{K_i}(D)$ is the one induced by the usual Fréchet topology of $C^\infty(D)$ when $\mathcal{D}_{K_i}(D)$ is seen as a subspace of $C^\infty(D)$ ([42], section 1.46). As a Fréchet space, $C^\infty(D)$ is metrizable. But $C^\infty(D)$ is also a Montel space ([43], Prop 34.4): as a metrizable Montel space, it is automatically separable ([49], p.195). Thus $\mathcal{D}_{K_i}(D)$ is also separable, as a subspace of the separable metric space $C^\infty(D)$.

Remark 2. Distributional solutions are the weakest types of solutions for PDEs. In general, additional regularity conditions have to be imposed to obtain physically realistic solutions, such as Sobolev regularity or entropy conditions, as for nonlinear hyperbolic PDEs [50]. However, every step in the above proof remains valid when replacing $\varphi \in \mathcal{D}(D)$ with $\varphi \in C_c^n(D)$. Although we have not detailed the usual topology of $C_c^n(D)$ in this article, this is enough to show that the equalities stated in Proposition 2 also hold in $C_c^n(D)'$, the space of generalized functions of order $n$, rather than just in $\mathcal{D}'(D)$. $C_c^n(D)'$ is a smaller space than $\mathcal{D}'(D)$, though less used in PDE theory than $\mathcal{D}'(D)$.

Remark 3. We gave here an elementary proof that

$$\sigma \in L^1_{loc}(D) \implies \text{the trajectories of } U \text{ lie in } L^1_{loc}(D) \text{ almost surely} \tag{32}$$

Similar results on Sobolev regularity of the trajectories of second order stochastic processes are scarce in the literature. Some are available in [51], though the result (32) is actually not covered there: additional continuity hypotheses would be required in the left-hand side of (32) to apply results from [51].

We partially recover Proposition 1 when the trajectories of $U$ lie in $C^n(D)$ and $k \in C^{n,n}(D \times D)$. Indeed, in that case one can show that if $T = \sum_{|k| \le n} a_k(x)\,\partial^k$, then the operator $\mathcal{T}$ on $H_k$ given by Proposition 1 simply coincides with $T$. Additionally, $T(U_\omega)$ and $T(k_x)$ both lie in $F(D, \mathbb{R}) \cap L^1_{loc}(D)$, and for any function $g$ that lies in $L^1_{loc}(D)$, we have

$$g = 0 \text{ in the sense of distributions} \iff g = 0 \text{ a.e.} \tag{33}$$

Equation (33) is just another way of saying that the linear map $f \mapsto T_f$ given in (17) is injective. In that framework, Proposition 1 states that

$$\forall x \in D,\ T(k_x) = 0 \iff \mathbb{P}(T(U) = 0) = 1 \tag{34}$$

where the function equalities of the form $T(f) = 0$ in (34) are valid everywhere on $D$. Following equation (33), Proposition 2 states a slightly weaker result, namely that

$$\forall x \in D,\ T(k_x) = 0 \text{ a.e.} \iff \mathbb{P}(T(U) = 0 \text{ a.e.}) = 1 \tag{35}$$

We can now state the following corollary, which draws the consequences of Proposition 2 when applied to GPR.

Proposition 3 (Heredity of Proposition 2 to conditioned GPs). Let $D$ and $T$ be as defined in Proposition 2. Let $(U(x))_{x \in D} \sim GP(m, k)$ be a Gaussian process that verifies the hypotheses of Proposition 2. Suppose also that

$$T(m) = 0 \quad \text{and} \quad \forall x \in D,\ T(k_x) = 0, \quad \text{both in the sense of distributions} \tag{36}$$

(i) Then, whatever the integer $p$, the vector $u = (u_1, \dots, u_p)^T \in \mathbb{R}^p$ and the vector $X = (x_1, \dots, x_p)^T \in D^p$ such that $k(X, X)$ is invertible, the Kriging mean $\tilde{m}(x)$ and the Kriging standard deviation function $\tilde{\sigma}(x) = \sqrt{\tilde{k}(x, x)}$ both lie in $L^1_{loc}(D)$, and we have

$$T(\tilde{m}) = 0 \quad \text{and} \quad \forall x \in D,\ T(\tilde{k}_x) = 0, \quad \text{both in the sense of distributions}$$

where $\tilde{m}$ and $\tilde{k}$ are defined in equations (5) and (6).

(ii) As such, the trajectories of the conditioned Gaussian process $(\tilde{U}(x))_{x \in D}$ defined by $\tilde{U}(x) = (U(x) \mid U(x_i) = u_i\ \forall i = 1, \dots, p)$ are almost surely solutions of the equation $T(f) = 0$ in the sense of distributions:

$$\mathbb{P}(T(\tilde{U}) = 0 \text{ in the sense of distributions}) = 1$$

Proof. Note first that for all $x \in D$, $\tilde{k}(x, x) \le k(x, x)$, which is immediate from (15). Thus the function $\tilde{\sigma} : x \mapsto \sqrt{\tilde{k}(x, x)}$ also lies in $L^1_{loc}(D)$. Point (i) is then a direct consequence of the definition of $\tilde{m}$ and $\tilde{k}$ in equations (5) and (6), and of the linearity of $T$. Proposition 2 can then be applied together with (i), which yields point (ii) since the mean and covariance functions of the GP $\tilde{U}$ are $\tilde{m}$ and $\tilde{k}$ (see section 2.2).

Proposition 3 shows that when $U$ is a GP, the results of Proposition 2 are inherited by the conditioned posterior process $\tilde{U}$. One weak consequence of Proposition 3 is that if GPR is performed with a kernel $k$ that verifies point (ii) of Proposition 2, then the predictions $\tilde{m}$ provided by GPR are all solutions of the PDE $T(\tilde{m}) = 0$.
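This heredity can be observed on a toy example. The sketch below is an illustration only, under assumptions that are not in the source: $D = \mathbb{R}$, $T = d^2/dx^2$, a zero prior mean, the kernel $k(x, y) = 1 + xy$ (every $k_x$ is affine, so $T(k_x) = 0$), and the standard zero-mean Kriging mean $\tilde{m}(x) = k(x, X)\,k(X, X)^{-1} u$ as the assumed form of equation (5). The observation values are hypothetical.

```python
import numpy as np

def k(x, y):
    """Kernel k(x, y) = 1 + x*y; every k_x is affine, hence k_x'' = 0."""
    return 1.0 + np.multiply.outer(x, y)

X = np.array([0.0, 1.0])                # observation locations (hypothetical)
u = np.array([1.0, 3.0])                # observed values (hypothetical)
K_inv_u = np.linalg.solve(k(X, X), u)   # k(X, X)^{-1} u, with k(X, X) invertible here

def kriging_mean(x):
    # Zero-mean Kriging mean k(x, X) k(X, X)^{-1} u (assumed form of equation (5))
    return k(x, X) @ K_inv_u

grid = np.linspace(-3.0, 3.0, 13)
m_tilde = kriging_mean(grid)
# Second differences: a discrete stand-in for T(m_tilde) = m_tilde''
second_diff = m_tilde[2:] - 2.0 * m_tilde[1:-1] + m_tilde[:-2]
```

Here the Kriging mean is the affine function interpolating the two observations, so its second differences vanish: the prediction inherits the constraint carried by the kernel.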

The goal of the next section is to apply this idea to a special case of the (3 dimensional) wave equation defined in eq. (37), by building an "explicit" positive definite kernel $k$ such that $\forall x' \in D$, $\Box k_{x'} = 0$ in the sense of distributions, where the box symbol $\Box$ classically denotes the linear wave operator, a.k.a. the d'Alembert operator. With this new kernel, we will perform GPR on observations of a function that is a solution to the wave equation and draw a number of related consequences.

4 Gaussian Processes and the 3 Dimensional Wave Equation

4.1 General Solution to the 3 Dimensional Wave Equation

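Denote by $\Delta = \partial^2_{xx} + \partial^2_{yy} + \partial^2_{zz}$ the 3D Laplace operator and by $\Box = \frac{1}{c^2}\partial^2_{tt} - \Delta$ the d'Alembert operator with constant wave speed $c > 0$. We focus on the general initial value problem in the free space $\mathbb{R}^3$

$$\begin{cases}
\Box w = 0 & \forall (x, t) \in \mathbb{R}^3 \times \mathbb{R}_+^* \\
w(x, 0) = u_0(x) & \forall x \in \mathbb{R}^3 \\
(\partial_t w)(x, 0) = v_0(x) & \forall x \in \mathbb{R}^3
\end{cases} \tag{37}$$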

Throughout this paper, we will refer to $u_0$ as the initial position and $v_0$ as the initial speed. The problem (37) is a Cauchy problem with initial conditions (IC) $u_0$ and $v_0$. It admits a unique solution which can be extended to all times $t \in \mathbb{R}$, and is represented as follows [3]:

$$w(x, t) = (F_t * v_0)(x) + (\dot{F}_t * u_0)(x) \quad \forall (x, t) \in \mathbb{R}^3 \times \mathbb{R} \tag{38}$$

where $F_t$ and $\dot{F}_t$ are known generalized functions. Actually, $F_t$ and $\dot{F}_t$ are better known through their Fourier transforms [3], as

$$\mathcal{F}(F_t)(\xi) = \frac{\sin(ct|\xi|)}{c|\xi|} \quad \text{and} \quad \mathcal{F}(\dot{F}_t)(\xi) = \cos(ct|\xi|) \tag{39}$$

where $|\xi|$ is the Euclidean norm of $\xi \in \mathbb{R}^3$. Note that the relation $\dot{F}_t = \partial_t F_t$ can be directly deduced from (39). Additionally, the representation (38) and the Fourier formulas (39) are valid in any dimension, see [3]. Finally, $F_t$ also corresponds to the Green's function of the wave equation [52].
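The link between (37), (38) and (39) can be sketched formally as follows. This short derivation is not taken from the source; it assumes enough regularity to take the spatial Fourier transform of (37), and a Fourier convention for which $\mathcal{F}(f * g) = \mathcal{F}(f)\,\mathcal{F}(g)$. Writing $\hat{w}(\xi, t)$ for the Fourier transform of $w(\cdot, t)$:

```latex
\begin{aligned}
&\frac{1}{c^2}\,\partial^2_{tt} \hat{w}(\xi, t) + |\xi|^2\,\hat{w}(\xi, t) = 0,
\qquad \hat{w}(\xi, 0) = \hat{u}_0(\xi),
\qquad \partial_t \hat{w}(\xi, 0) = \hat{v}_0(\xi), \\
&\hat{w}(\xi, t) = \frac{\sin(ct|\xi|)}{c|\xi|}\,\hat{v}_0(\xi) + \cos(ct|\xi|)\,\hat{u}_0(\xi),
\qquad
\partial_t \left[ \frac{\sin(ct|\xi|)}{c|\xi|} \right] = \cos(ct|\xi|).
\end{aligned}
```

For each fixed $\xi$, the first line is a harmonic oscillator in $t$ whose solution is given on the second line. Reading the products as Fourier transforms of convolutions gives back (38) with the multipliers (39), and the last identity is the Fourier counterpart of $\dot{F}_t = \partial_t F_t$.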

In dimension 3, $F_t$ and $\dot{F}_t$ are compactly supported generalized functions of order 0 and 1 respectively. More explicitly, in dimension 3, $F_t$ and $\dot{F}_t$ are given by

$$F_t = \frac{\sigma_{c|t|}}{4\pi c^2 t} \quad \text{and} \quad \dot{F}_t = \partial_t F_t \tag{40}$$

where $\sigma_R$ is the surface measure of the sphere of center 0 and radius $R$. $\dot{F}_t = \partial_t F_t$ means that

$$\forall f \in C_0^1(\mathbb{R}^3), \quad \langle \dot{F}_t, f \rangle = \partial_t \langle F_t, f \rangle = \partial_t \int_{\mathbb{R}^3} f(x)\,F_t(dx)$$

Suppose that $u_0 \in C^1(\mathbb{R}^3)$ and $v_0 \in C^0(\mathbb{R}^3)$; then $w$ as defined in (38) is a function in the classical sense [43], and in that case an explicit formula for such convolutions is recalled in equation (24) (yet one may actually make sense of (38) when $u_0$ and $v_0$ are only required to be generalized functions [43]). Combining formulas (38) and (40) leads to the Kirchhoff formula [3] (see N.7 for spherical coordinates notations):

$$w(x, t) = \int_{S(0,1)} \Big( t\,v_0(x - c|t|\gamma) + u_0(x - c|t|\gamma) - c|t|\,\gamma \cdot \nabla u_0(x - c|t|\gamma) \Big)\,\frac{d\Omega}{4\pi} \tag{41}$$
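Formula (41) lends itself to a direct numerical check. The sketch below is not the authors' implementation: it evaluates the spherical integral with a product quadrature rule (Gauss-Legendre in $\cos\theta$, uniform in the azimuth), and compares the result with a plane-wave solution chosen for this example. The function names `sphere_rule` and `kirchhoff`, the wave vector `kvec` and the initial conditions are all hypothetical. With $u_0(x) = v_0(x) = \sin(\kappa \cdot x)$, the exact solution is $w(x, t) = \sin(\kappa \cdot x)\,\big(\cos(c|\kappa|t) + \sin(c|\kappa|t)/(c|\kappa|)\big)$.

```python
import numpy as np

def sphere_rule(n_theta=20, n_phi=40):
    """Quadrature nodes on the unit sphere S(0,1) and weights for the mean dOmega/(4*pi)."""
    mu, w_mu = np.polynomial.legendre.leggauss(n_theta)   # mu = cos(theta)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    mu_g, phi_g = np.meshgrid(mu, phi, indexing="ij")
    s = np.sqrt(1.0 - mu_g**2)                            # sin(theta)
    gamma = np.stack([s * np.cos(phi_g), s * np.sin(phi_g), mu_g], axis=-1).reshape(-1, 3)
    weights = np.repeat(w_mu, n_phi) / (2.0 * n_phi)      # the weights sum to 1
    return gamma, weights

def kirchhoff(x, t, u0, grad_u0, v0, c=1.0):
    """Evaluate formula (41) at a single point (x, t) by numerical quadrature."""
    gamma, weights = sphere_rule()
    r = c * abs(t)
    y = x - r * gamma                                     # points x - c|t|gamma, shape (N, 3)
    integrand = t * v0(y) + u0(y) - r * np.sum(gamma * grad_u0(y), axis=1)
    return float(weights @ integrand)

# Hypothetical test data: plane-wave initial conditions with a known solution.
kvec = np.array([1.0, 2.0, 0.5])
c, t = 1.5, 0.7
x = np.array([0.3, -0.2, 0.9])
u0 = lambda y: np.sin(y @ kvec)
grad_u0 = lambda y: np.cos(y @ kvec)[:, None] * kvec
v0 = lambda y: np.sin(y @ kvec)

w_num = kirchhoff(x, t, u0, grad_u0, v0, c)
kn = np.linalg.norm(kvec)
w_exact = np.sin(x @ kvec) * (np.cos(c * kn * t) + np.sin(c * kn * t) / (c * kn))
```

The product rule is chosen because it integrates low-degree spherical harmonics exactly, so a few hundred nodes are enough for smooth initial conditions such as these.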

4.2 Gaussian Process Modelling of the Solution

Suppose now that $u_0$ and $v_0$ are unknown, and only pointwise values of $w$ are observed. In a Bayesian approach, we model $u_0$ and $v_0$ as random functions and put a Gaussian process prior over $u_0$ and $v_0$ as in equation (4). More precisely, we make the following assumptions.

(A1) Suppose that the initial conditions $u_0$ and $v_0$ of Problem (37) are trajectories drawn from two independent Gaussian processes $U^0 \sim GP(0, k_u)$ and $V^0 \sim GP(0, k_v)$: $\exists \omega \in \Omega$, $\forall x \in \mathbb{R}^3$, $u_0(x) = U^0_\omega(x)$ and $v_0(x) = V^0_\omega(x)$.

(A2) Suppose that, almost surely, the trajectories of $U^0$ lie in $C^1(\mathbb{R}^3)$ and those of $V^0$ lie in $C^0(\mathbb{R}^3)$. A sufficient condition for this is given in [39], Thm 1.4.2. This theorem states that, under mild technical assumptions, the paths of $(U(x))_{x \in D} \sim GP(0, k)$ lie in $C^l$ a.s. as soon as $k \in C^{2l}(D \times D)$, which we assume from now on.

We now analyse the consequences of these two assumptions. First, they imply that by solving (37), one obtains a space-time stochastic process $W(x, t)$ defined by

$$W(x, t) : \Omega \ni \omega \mapsto (F_t * V^0_\omega)(x) + (\dot{F}_t * U^0_\omega)(x) \tag{42}$$

Here, $V^0_\omega$ denotes the trajectory of $V^0$ at $\omega \in \Omega$, and likewise for $U^0_\omega$. In particular, thanks to assumption (A2), (42) defines a random variable for all $(x, t)$. Denote by $z = (x, t)$ the space-time variable and introduce the random variables

$$V(z) : \omega \mapsto (F_t * V^0_\omega)(x) \quad \text{and} \quad U(z) : \omega \mapsto (\dot{F}_t * U^0_\omega)(x) \tag{43}$$

that is, $W(z) = U(z) + V(z)$. We show in the next proposition that the stochastic processes $U$, $V$ and $W$ are GPs as well. In particular, we describe their covariance kernels.

Proposition 4. Define the two functions

$$k_v^{\mathrm{wave}}(z, z') = [(F_t \otimes F_{t'}) * k_v](x, x') \quad \text{and} \quad k_u^{\mathrm{wave}}(z, z') = [(\dot{F}_t \otimes \dot{F}_{t'}) * k_u](x, x') \tag{44}$$

(i) Then $U = (U(z))_{z \in \mathbb{R}^3 \times \mathbb{R}}$ and $V = (V(z))_{z \in \mathbb{R}^3 \times \mathbb{R}}$ as defined in (43) are two independent centered GPs with covariance kernels $k_u^{\mathrm{wave}}$ and $k_v^{\mathrm{wave}}$ respectively. Consequently, $(W(z))_{z \in \mathbb{R}^3 \times \mathbb{R}}$ is a centered GP whose covariance kernel is given by

$$k_W(z, z') = k_v^{\mathrm{wave}}(z, z') + k_u^{\mathrm{wave}}(z, z') \tag{45}$$

(ii) Conversely, any centered second order stochastic process with covariance kernel $k_W$ has sample paths that almost surely solve the 3 dimensional wave equation (37).
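To see where the first kernel in (44) comes from, one can compute the covariance of $V$ formally. The computation below is only a sketch that is not taken from the source: the interchange of the expectation and the integrals is assumed rather than justified, $V^0$ denotes the prior process on the initial speed with kernel $k_v$, and the last line uses the explicit form (40) of $F_t$.

```latex
\begin{aligned}
\mathrm{Cov}\big(V(z), V(z')\big)
&= \mathbb{E}\Big[ \int_{\mathbb{R}^3} V^0(x - y)\,F_t(dy) \int_{\mathbb{R}^3} V^0(x' - y')\,F_{t'}(dy') \Big] \\
&= \int_{\mathbb{R}^3} \int_{\mathbb{R}^3} k_v(x - y,\, x' - y')\,F_t(dy)\,F_{t'}(dy')
 = [(F_t \otimes F_{t'}) * k_v](x, x') \\
&= t\,t' \int_{S(0,1)} \int_{S(0,1)} k_v\big(x - c|t|\gamma,\; x' - c|t'|\gamma'\big)\,\frac{d\Omega\,d\Omega'}{(4\pi)^2} .
\end{aligned}
```

The same formal computation with $\dot{F}_t$ in place of $F_t$ and $k_u$ in place of $k_v$ gives the second kernel in (44), and the independence of the two prior processes then gives the sum (45).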

Proof. (i): First we prove that $U$ and $V$ are GPs. Since $U^0$ and $V^0$ are GPs, $\mathcal{L}(U^0)$ and $\mathcal{L}(V^0)$ are only comprised of Gaussian random variables (see N.3).

To prove that $U$ and $V$ are Gaussian processes, we rely on the Kirchhoff formula (41), writing the integrals as limits of Riemann sums. We start with $V$, that is, we focus on the first term in Kirchhoff's formula (41). To show that $V$ is a Gaussian process, we only need to show that for any $z$, $V(z) \in \mathcal{L}(V^0)$, as this will ensure the Gaussian process property. Since the trajectories of $V^0$ are continuous almost surely, there exists a sequence of numbers $a_k^n$ and points $y_k^n$ such that for almost any $\omega \in \Omega$,

$$\begin{aligned}
V(z)(\omega) = (F_t * V^0_\omega)(x) &= t \int_{S(0,1)} V^0(x - c|t|\gamma)(\omega)\,\frac{d\Omega}{4\pi} \\
&= \frac{t}{4\pi} \int_0^{2\pi} \int_0^{\pi} V^0(x - c|t|\gamma(\theta, \phi))(\omega)\,\sin(\theta)\,d\theta\,d\phi = \lim_{n \to \infty} \sum_{k=1}^{n} a_k^n\,V^0(x - y_k^n)(\omega)
\end{aligned}$$

Thus $V(z)$ is an almost sure limit of elements of $\mathcal{L}(V^0)$, which also implies convergence in law. But since $V^0$ is a Gaussian process, convergence in law implies the convergence of the moments of all orders [39] and in particular the convergence in $L^2(\mathbb{P})$. Thus, $V(z) \in \mathcal{L}(V^0)$ and $V$ is a Gaussian process. The convergence of the first moment implies that $V(z)$ is centered.

We apply the same reasoning to $U$, by applying the above steps to the second part of Kirchhoff's formula (41). The possibility of writing the integrals as limits of Riemann sums is ensured when the trajectories of $U^0$ lie in $C^1(\mathbb{R}^3)$.

Finally, since $U^0$ and $V^0$ are independent, $\mathcal{L}(U^0)$ and $\mathcal{L}(V^0)$ are orthogonal in $L^2(\mathbb{P})$:

$$\mathcal{L}(U^0) + \mathcal{L}(V^0) = \mathcal{L}(U^0) \overset{\perp}{\oplus} \mathcal{L}(V^0)$$

Since $\mathcal{L}(U) \subset \mathcal{L}(U^0)$ and likewise for $V$, $U$ and $V$ are independent Gaussian processes (for jointly Gaussian random variables, independence is equivalent to null covariance).
