Throughout, we make the standing Assumptions 3.1, 3.4 without explicit mention. The proof of Proposition 2.2 uses the following result concerning triangular martingale increment arrays. The result is similar to the classical results on triangular arrays of independent increments.
Let $k_N : [0, T] \to \mathbb{Z}_+$ be a sequence of nondecreasing, right-continuous functions indexed by $N$ with $k_N(0) = 0$ and $k_N(T) \ge 1$. Let $\{M^{k,N}, \mathcal{F}^{k,N}\}_{0 \le k \le k_N(T)}$ be an $\mathcal{H}^s$ valued martingale difference array. That is, for $k = 1, \ldots, k_N(T)$, we have $\mathbb{E}(M^{k,N} \mid \mathcal{F}^{k-1,N}) = 0$, $\mathbb{E}(\|M^{k,N}\|_s^2 \mid \mathcal{F}^{k-1,N}) < \infty$ almost surely, and $\mathcal{F}^{k-1,N} \subset \mathcal{F}^{k,N}$. We will make use of the following result.
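PROPOSITION 4.1 ([3], Proposition 5.1). Let $S : \mathcal{H}^s \to \mathcal{H}^s$ be a self-adjoint, positive definite operator with finite trace. Assume that, for all $x \in \mathcal{H}^s$, $\epsilon > 0$ and $t \in [0, T]$, the following limits hold in probability:
\[ \lim_{N\to\infty} \sum_{k=1}^{k_N(T)} \mathbb{E}\bigl(\|M^{k,N}\|_s^2 \mid \mathcal{F}^{k-1,N}\bigr) = T\,\mathrm{trace}(S), \tag{4.1} \]
\[ \lim_{N\to\infty} \sum_{k=1}^{k_N(t)} \mathbb{E}\bigl(\langle M^{k,N}, x\rangle_s^2 \mid \mathcal{F}^{k-1,N}\bigr) = t\,\langle Sx, x\rangle_s, \tag{4.2} \]
\[ \lim_{N\to\infty} \sum_{k=1}^{k_N(T)} \mathbb{E}\bigl(\langle M^{k,N}, x\rangle_s^2\, 1_{|\langle M^{k,N}, x\rangle_s| \ge \epsilon} \mid \mathcal{F}^{k-1,N}\bigr) = 0. \tag{4.3} \]
Define a continuous time process $W^N$ by $W^N(t) = \sum_{k=1}^{k_N(t)} M^{k,N}$ if $k_N(t) \ge 1$ and $k_N(t) > \lim_{r\to 0+} k_N(t - r)$, and by linear interpolation otherwise. Then the sequence of random variables $W^N$ converges weakly in $C([0, T], \mathcal{H}^s)$ to an $\mathcal{H}^s$ valued Brownian motion $W$, with $W(0) = 0$, $\mathbb{E}(W(T)) = 0$, and with covariance operator $S$.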
REMARK 4.2. The first two hypotheses of the above proposition ensure the weak convergence of finite-dimensional distributions of $W^N(t)$ using the martingale central limit theorem in $\mathbb{R}^N$; the last hypothesis is needed to verify the tightness of the family $\{W^N(\cdot)\}$. As noted in [11], the second hypothesis [equation (4.2)] of Proposition 4.1 is implied by
\[ \lim_{N\to\infty} \sum_{k=1}^{k_N(t)} \mathbb{E}\bigl(\langle M^{k,N}, e_n\rangle_s \langle M^{k,N}, e_m\rangle_s \mid \mathcal{F}^{k-1,N}\bigr) = t\,\langle S e_n, e_m\rangle_s \tag{4.4} \]
in probability, where $\{e_n\}$ is any orthonormal basis for $\mathcal{H}^s$. The third hypothesis in (4.3) is implied by the Lindeberg type condition,
\[ \lim_{N\to\infty} \sum_{k=1}^{k_N(T)} \mathbb{E}\bigl(\|M^{k,N}\|_s^2\, 1_{\|M^{k,N}\|_s \ge \epsilon} \mid \mathcal{F}^{k-1,N}\bigr) = 0 \tag{4.5} \]
in probability, for any fixed $\epsilon > 0$.
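The construction of $W^N$ in Proposition 4.1 can be sketched numerically in the simplest setting. The following is a minimal illustration only, assuming a scalar state space, $k_N(t) = \lfloor Nt \rfloor$ and $M^{k,N} = \xi_k/\sqrt{N}$ with i.i.d. standard normal $\xi_k$ (so that $S = 1$); the increments used in the paper are martingale differences, not independent variables, and the function names below are hypothetical.

```python
import math
import random


def w_n_path(n, t_grid, rng, big_t=1.0):
    """Evaluate W^N on t_grid for M^{k,N} = xi_k / sqrt(N), k_N(t) = floor(N t).

    W^N equals the partial sum at the jump times k/N and is linearly
    interpolated in between. The xi_k are i.i.d. N(0, 1): an illustrative
    assumption, not the martingale increments of the paper.
    """
    steps = math.floor(n * big_t) + 1
    partial = [0.0]
    for _ in range(steps):
        partial.append(partial[-1] + rng.gauss(0.0, 1.0) / math.sqrt(n))
    values = []
    for t in t_grid:
        k = math.floor(n * t)
        frac = n * t - k  # position between the jump times k/N and (k+1)/N
        values.append(partial[k] + frac * (partial[k + 1] - partial[k]))
    return values


def terminal_variance(n, big_t, reps, seed=0):
    """Monte Carlo estimate of Var(W^N(T)); expected to be close to T * S with S = 1."""
    rng = random.Random(seed)
    samples = [w_n_path(n, [big_t], rng, big_t)[0] for _ in range(reps)]
    mean = sum(samples) / reps
    return sum((w - mean) ** 2 for w in samples) / (reps - 1)
```

In this toy case condition (4.1) holds exactly, since the conditional variances sum to $\lfloor NT \rfloor / N$, and `terminal_variance(200, 1.0, 2000)` should return a value near $T = 1$.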
Using Proposition 4.1 we now give the proof of Proposition 2.2.
PROOF OF PROPOSITION 2.2. We apply Proposition 4.1 with $k_N(t) \overset{\mathrm{def}}{=} \lfloor Nt \rfloor$, $M^{k,N} \overset{\mathrm{def}}{=} \frac{1}{\sqrt{N}} \Gamma^{k,N}$ and $S \overset{\mathrm{def}}{=} C_s$; the resulting definition of $W^N(t)$ from Proposition 4.1 coincides with that given in (2.22). We set $\mathcal{F}^{k,N}$ to be the sigma algebra generated by $\{x^j, \xi^j\}_{j \le k}$ with $x^0 \sim \pi^N$. Since the chain is stationary, the noise process $\{\Gamma^{k,N}, 1 \le k \le N\}$ is identically distributed, and so are the errors $r^{k,N}$ and $E^{k,N}$ from (2.17) and (2.18), respectively. We now verify the three hypotheses required to apply Proposition 4.1. We generalize the notation $\mathbb{E}^{\xi}_{0}(\cdot)$ from Section 2.6 and set $\mathbb{E}^{\xi}(\cdot \mid \mathcal{F}^{k,N}) = \mathbb{E}^{\xi}_{k}(\cdot)$.
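• Condition (4.1). It is enough to show that
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N} \Bigl| \frac{1}{N} \sum_{k=1}^{\lfloor NT \rfloor} \mathbb{E}^{\xi}_{k-1}\bigl(\|\Gamma^{k,N}\|_s^2\bigr) - T\,\mathrm{trace}(C_s) \Bigr| = 0 \]
and condition (4.1) will follow from Markov's inequality. By (3.12) and (2.2),
\[ \begin{aligned} \mathbb{E}^{\xi}_{0}\bigl(\|\Gamma^{1,N}\|_s^2\bigr) &= \mathbb{E}^{\xi}_{0}\bigl(\|B_s^{1/2}\Gamma^{1,N}\|^2\bigr) = \sum_{j=1}^{N} \mathbb{E}^{\xi}_{0} \langle \Gamma^{1,N}, B_s^{1/2}\varphi_j\rangle^2 \\ &= \sum_{j=1}^{N} \mathbb{E}^{\xi}_{0} \langle B_s^{1/2}\varphi_j, \Gamma^{1,N} \otimes \Gamma^{1,N} B_s^{1/2}\varphi_j\rangle \qquad (4.6) \\ &= \mathrm{trace}(C_s^N) + \frac{1}{2\ell^2\beta} \sum_{j=1}^{N} \langle \varphi_j, E^{1,N}\varphi_j\rangle_s - \frac{N}{2\ell^2\beta} \|\mathbb{E}_0(x^1 - x^0)\|_s^2. \qquad (4.7) \end{aligned} \]
By Proposition 2.1 it follows that $\mathbb{E}^{\pi^N} |\sum_{j=1}^{N} \langle \varphi_j, E^{1,N}\varphi_j\rangle_s| \to 0$. For the third term, notice that by Proposition 2.1 (2.14) we have
\[ \begin{aligned} \mathbb{E}^{\pi^N} \frac{N}{2\ell^2\beta} \|\mathbb{E}_0(x^1 - x^0)\|_s^2 &\le M \frac{1}{N} \mathbb{E}^{\pi^N}\bigl(\|m^N(x^0)\|_s^2 + \|r^{1,N}\|_s^2\bigr) \\ &\le M \frac{1}{N} \bigl(\mathbb{E}^{\pi^N}(1 + \|x^0\|_s)^2 + \mathbb{E}^{\pi^N}\|r^{1,N}\|_s^2\bigr) \to 0, \end{aligned} \tag{4.8} \]
where the second inequality follows from the fact that $C\nabla\Psi$ is globally Lipschitz in $\mathcal{H}^s$. Also $\{E^{k,N}\}$ is a stationary sequence. Therefore,
\[ \begin{aligned} \mathbb{E}^{\pi^N} \Bigl| \frac{1}{N} \sum_{k=1}^{\lfloor NT \rfloor} \mathbb{E}^{\xi}_{k-1}\bigl(\|\Gamma^{k,N}\|_s^2\bigr) - T\,\mathrm{trace}(C_s^N) \Bigr| &\le M\, \mathbb{E}^{\pi^N} \Bigl( \Bigl| \sum_{j=1}^{N} \langle \varphi_j, E^{1,N}\varphi_j\rangle_s \Bigr| + \frac{N}{2\ell^2\beta} \|\mathbb{E}_0(x^1 - x^0)\|_s^2 \Bigr) \\ &\quad + \mathrm{trace}(C_s^N) \Bigl| \frac{\lfloor NT \rfloor}{N} - T \Bigr| \to 0. \end{aligned} \]
Condition (4.1) now follows from the fact that
\[ \lim_{N\to\infty} |\mathrm{trace}(C_s) - \mathrm{trace}(C_s^N)| = 0. \]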
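• Condition (4.2). By Remark 4.2, it is enough to verify (4.4). To show (4.4), using stationarity and arguments similar to those used in verifying condition (4.1), it suffices to show that
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N} \bigl| \mathbb{E}^{\xi}_{0}\bigl(\langle \Gamma^{1,N}, \hat\varphi_n\rangle_s \langle \Gamma^{1,N}, \hat\varphi_m\rangle_s\bigr) - \langle \hat\varphi_n, C_s^N \hat\varphi_m\rangle_s \bigr| = 0, \tag{4.9} \]
where $\{\hat\varphi_k\}$ is as defined in (2.7). We have
\[ \mathbb{E}^{\pi^N} \bigl| \mathbb{E}^{\xi}_{0}\bigl(\langle \Gamma^{1,N}, \hat\varphi_n\rangle_s \langle \Gamma^{1,N}, \hat\varphi_m\rangle_s\bigr) - \langle \hat\varphi_n, C_s^N \hat\varphi_m\rangle_s \bigr| = n^{-s} m^{-s}\, \mathbb{E}^{\pi^N} \bigl| \mathbb{E}^{\xi}_{0}\bigl(\langle \Gamma^{1,N}, \varphi_n\rangle_s \langle \Gamma^{1,N}, \varphi_m\rangle_s\bigr) - \langle \varphi_n, C_s^N \varphi_m\rangle_s \bigr| \]
and therefore, it is enough to show that
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N} \bigl| \mathbb{E}^{\xi}_{0}\bigl(\langle \Gamma^{1,N}, \varphi_n\rangle_s \langle \Gamma^{1,N}, \varphi_m\rangle_s\bigr) - \langle \varphi_n, C_s^N \varphi_m\rangle_s \bigr| = 0. \tag{4.10} \]
Indeed we have
\[ \begin{aligned} \langle \Gamma^{1,N}, \varphi_n\rangle_s \langle \Gamma^{1,N}, \varphi_m\rangle_s &= \langle \Gamma^{1,N}, B_s\varphi_n\rangle \langle \Gamma^{1,N}, B_s\varphi_m\rangle \\ &= \langle B_s\varphi_n, \Gamma^{1,N} \otimes \Gamma^{1,N} B_s\varphi_m\rangle \\ &= \langle \varphi_n, B_s^{1/2}\Gamma^{1,N} \otimes \Gamma^{1,N} B_s^{1/2}\varphi_m\rangle_s \end{aligned} \]
and from (3.12) and Proposition 2.1 we obtain
\[ \begin{aligned} &\langle \varphi_n, B_s^{1/2}\Gamma^{1,N} \otimes \Gamma^{1,N} B_s^{1/2}\varphi_m\rangle_s - \langle \varphi_n, C_s^N \varphi_m\rangle_s \\ &\quad = \langle \varphi_n, B_s^{1/2}\Gamma^{1,N} \otimes \Gamma^{1,N} B_s^{1/2}\varphi_m\rangle_s - \langle \varphi_n, B_s^{1/2} C^N B_s^{1/2}\varphi_m\rangle_s \\ &\quad = n^s m^s \langle \varphi_n, E^{1,N}\varphi_m\rangle_s - \frac{N}{2\ell^2\beta}\, \mathbb{E}_0\bigl(\langle x^1 - x^0, \varphi_n\rangle_s\bigr)\, \mathbb{E}_0\bigl(\langle x^1 - x^0, \varphi_m\rangle_s\bigr). \end{aligned} \]
From Proposition 2.1, it follows that $\lim_{N\to\infty} \mathbb{E}^{\pi^N} |\langle \varphi_n, E^{1,N}\varphi_m\rangle_s| = 0$. Also notice that
\[ \begin{aligned} &N^2 \bigl[\mathbb{E}^{\pi^N} \bigl| \mathbb{E}_0(\langle x^1 - x^0, \varphi_n\rangle_s)\, \mathbb{E}_0(\langle x^1 - x^0, \varphi_m\rangle_s) \bigr|\bigr]^2 \\ &\quad \le M\, \mathbb{E}^{\pi^N}\bigl(N \|\mathbb{E}_0(x^1 - x^0)\|_s^2 \|\varphi_n\|_s^2\bigr)\, \mathbb{E}^{\pi^N}\bigl(N \|\mathbb{E}_0(x^1 - x^0)\|_s^2 \|\varphi_m\|_s^2\bigr) \to 0 \end{aligned} \]
by the calculation done in (4.8). Thus (4.10) holds and since $|\langle \varphi_n, C_s \varphi_m\rangle_s - \langle \varphi_n, C_s^N \varphi_m\rangle_s| \to 0$, equation (4.2) follows from Markov's inequality.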
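• Condition (4.3). From Remark 4.2 it follows that verifying (4.5) suffices to establish (4.3).
To verify (4.5), notice that for any $\epsilon > 0$,
\[ \mathbb{E}^{\pi^N} \Bigl| \frac{1}{N} \sum_{k=1}^{\lfloor NT \rfloor} \mathbb{E}^{\xi}_{k-1}\bigl( \|\Gamma^{k,N}\|_s^2\, 1_{\{\|\Gamma^{k,N}\|_s^2 \ge \epsilon N\}} \bigr) \Bigr| \le \frac{\lfloor NT \rfloor}{N}\, \mathbb{E}^{\pi^N}\bigl( \|\Gamma^{1,N}\|_s^2\, 1_{\{\|\Gamma^{1,N}\|_s^2 \ge \epsilon N\}} \bigr) \to 0 \]
by the dominated convergence theorem since
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N} \|\Gamma^{1,N}\|_s^2 = \mathrm{trace}(C_s) < \infty. \]
Thus (4.5) is verified.
Thus we have verified all three hypotheses of Proposition 4.1, proving that $W^N(t)$ converges weakly to $W(t)$ in $C([0, T]; \mathcal{H}^s)$.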
Recall that $X^R \subset \mathcal{H}^s$ denotes the $R$-dimensional subspace $P^R \mathcal{H}^s$. To prove the second claim of Proposition 2.2, we need to show that $(x^0, W^N(t))$ converges weakly to $(z^0, W(t))$ in $(\mathcal{H}^s, C([0, T]; \mathcal{H}^s))$ as $N \to \infty$, where $z^0 \sim \pi$ and $z^0$ is independent of the limiting noise $W$. For this, it is enough to show that for any $R \in \mathbb{N}$, the pair $(x^0, P^R W^N(t))$ converges weakly to $(z^0, Z_R)$ for every $t > 0$, where $Z_R$ is a Gaussian random variable on $X^R$ with mean zero and covariance $t P^R C_s P^R$, independent of $z^0$. We will prove this statement as a corollary of the following lemma.
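LEMMA 4.3. Let $x^0 \sim \pi^N$ and let $\{\theta^{k,N}\}$ be any stationary martingale sequence adapted to the filtration $\{\mathcal{F}^{k,N}\}$. Furthermore, assume that there exists a stationary sequence $\{U^{k,N}\}$ such that for all $k \ge 1$ and any $u \in X^R$:
(1) $\mathbb{E}^{\xi}_{k-1} |\langle u, P^R \theta^{k,N}\rangle_s|^2 = \langle u, P^R C_s u\rangle_s + U^{k,N}$, $\lim_{N\to\infty} \mathbb{E}^{\pi^N} |U^{1,N}| = 0$.
(2) $\mathbb{E}^{\xi}_{k-1} \|\theta^{k,N}\|_s^3 \le M$.
Then for any $\mathbf{t} \in \mathcal{H}^s$, $u \in X^R$, $R \in \mathbb{N}$ and $t > 0$,
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N}\Bigl( e^{i\langle \mathbf{t}, x^0\rangle_s + (i/\sqrt{N}) \sum_{k=1}^{\lfloor Nt \rfloor} \langle u, P^R \theta^{k,N}\rangle_s} \Bigr) = \mathbb{E}^{\pi}\Bigl( e^{i\langle \mathbf{t}, z^0\rangle_s - (t/2)\langle u, P^R C_s u\rangle_s} \Bigr). \tag{4.11} \]
Note: Here and in Corollary 4.4, $i = \sqrt{-1}$.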
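PROOF OF LEMMA 4.3. We show (4.11) for $t = 1$, since the calculations are nearly identical for an arbitrary $t$ with minor notational changes. Indeed, we have
\[ \mathbb{E}^{\pi^N}\Bigl( e^{i\langle \mathbf{t}, x^0\rangle_s + (i/\sqrt{N}) \sum_{k=1}^{N} \langle u, P^R \theta^{k,N}\rangle_s} \Bigr) = \mathbb{E}^{\pi^N}\Bigl( \mathbb{E}^{\xi}_{N-1}\Bigl( e^{i\langle \mathbf{t}, x^0\rangle_s + (i/\sqrt{N}) \sum_{k=1}^{N} \langle u, P^R \theta^{k,N}\rangle_s} \Bigr) \Bigr). \]
By Taylor's expansion,
\[ \begin{aligned} &\mathbb{E}^{\pi^N}\Bigl( \mathbb{E}^{\xi}_{N-1}\Bigl( e^{i\langle \mathbf{t}, x^0\rangle_s + (i/\sqrt{N}) \sum_{k=1}^{N} \langle u, P^R \theta^{k,N}\rangle_s} \Bigr) \Bigr) \\ &\quad = \mathbb{E}\Bigl[ e^{i\langle \mathbf{t}, x^0\rangle_s + (i/\sqrt{N}) \sum_{k=1}^{N-1} \langle u, P^R \theta^{k,N}\rangle_s} \times \Bigl( 1 - \frac{1}{2N}\, \mathbb{E}^{\xi}_{N-1} |\langle u, P^R \theta^{N,N}\rangle_s|^2 + M \Bigl( \frac{1}{N^{3/2}} V^N \wedge 2 \Bigr) \Bigr) \Bigr], \end{aligned} \tag{4.12} \]
where $|V^N| \le \mathbb{E}^{\xi}_{N-1} |\langle u, P^R \theta^{N,N}\rangle_s|^3 \le M$, since by assumption $\mathbb{E}^{\xi}_{N-1} \|\theta^{N,N}\|_s^3 \le M$. We also have that
\[ \mathbb{E}^{\xi}_{N-1} |\langle u, P^R \theta^{N,N}\rangle_s|^2 = \langle u, P^R C_s u\rangle_s + U^{N,N}, \qquad \lim_{N\to\infty} \mathbb{E}^{\pi^N} |U^{N,N}| = 0. \]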
Thus from (4.12) we deduce that
Thus we have shown that EπN
As a corollary of Lemma 4.3, we obtain the following.
COROLLARY 4.4. The pair $(x^0, W^N)$ converges weakly to $(z^0, W)$ in $C([0, T]; \mathcal{H}^s)$, where $W$ is a Brownian motion with covariance operator $C_s$ and is independent of $z^0$ almost surely.
PROOF. As mentioned before, it is enough to show that for any t∈ Hs, u∈
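Now we verify the conditions of Lemma 4.3 to show (4.14). To verify the first hypothesis of Lemma 4.3, notice that from Proposition 2.1 we obtain that for $k \ge 1$,
\[ \begin{aligned} \mathbb{E}^{\xi}_{k-1} |\langle u, P^R \Gamma^{k,N}\rangle_s|^2 &= \mathbb{E}^{\xi}_{k-1} \langle B_s u, P^R \Gamma^{k,N} \otimes \Gamma^{k,N} B_s u\rangle = \langle u, P^R C_s u\rangle_s + U^{k,N}, \\ |U^{k,N}| &\le \frac{1}{2\ell^2\beta} M \sum_{l,j=1}^{R \wedge N} u_l u_j |\langle \varphi_l, P^M E^{k,N} \varphi_j\rangle_s| + \frac{N}{2\ell^2\beta} \|\mathbb{E}^{\xi}_{k-1}(x^k - x^{k-1})\|_s^2 \|u\|_s^2 \\ &\quad + |\langle u, P^R C_s^N u\rangle_s - \langle u, P^R C_s u\rangle_s|, \end{aligned} \]
where $\{E^{k,N}\}$ is as defined in (2.18). Because $\{\Gamma^{k,N}\}$ is stationary, we deduce that $\{U^{k,N}\}$ is stationary. From Proposition 2.1 we obtain
\[ \lim_{N\to\infty} \sum_{l,j=1}^{R \wedge N} \mathbb{E}^{\pi^N} |\langle \varphi_l, P^M E^{k,N} \varphi_j\rangle_s| = 0 \]
and $\mathbb{E}^{\pi^N} \frac{N}{2\ell^2\beta} \|\mathbb{E}^{\xi}_{k-1}(x^k - x^{k-1})\|_s^2 \to 0$ by the calculation in (4.8). Thus we have shown that $\mathbb{E}^{\pi^N} |U^{1,N}| \to 0$ as $N \to \infty$. The second hypothesis of Lemma 4.3 is easily verified since $\mathbb{E}^{\xi}_{k-1} \|\Gamma^{k,N}\|_s^3 \le M\, \mathbb{E}^{\xi}_{k-1} \|C^{1/2}\xi^k\|_s^3 \le M$. Thus the corollary follows from Lemma 4.3. □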
Thus we have shown that $(x^0, W^N)$ converges weakly to $(z^0, W)$, where $W$ is a Brownian motion in $\mathcal{H}^s$ with covariance operator $C_s$, and by the above corollary we see that $W$ is independent of $x^0$ almost surely. This proves the two claims made in Proposition 2.2, and the proof is complete. □
5. Mean drift and diffusion: Proof of Proposition 2.1. To prove this key proposition we make the standing Assumptions 3.1, 3.4 from Section 3.1 without explicit statement of this fact within the individual lemmas. We start with several preliminary bounds and then consider the drift and diffusion terms, respectively.
5.1. Preliminary estimates. Recall the definitions of $R(x, \xi)$, $R_i(x, \xi)$ and $R_{ij}(x, \xi)$ from equations (2.38), (2.39) and (2.47), respectively. These quantities were introduced so that the term in the exponential of the acceptance probability $Q(x, \xi)$ could be replaced with $R_i(x, \xi)$ and $R_{ij}(x, \xi)$ to take advantage of the fact that, conditional on $x$, $R_i(x, \xi)$ is independent of $\xi_i$ and $R_{ij}(x, \xi)$ is independent of $\xi_i, \xi_j$. In the next lemma, we estimate the additional error due to this replacement of $Q(x, \xi)$. Recall that $\mathbb{E}^{\xi}_{0}$ denotes expectation with respect to $\xi = \xi^0$ as in Section 2.2.
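LEMMA 5.1.
\[ \mathbb{E}^{\xi}_{0} |Q(x, \xi) - R_i(x, \xi)|^2 \le \frac{M}{N} (1 + |\zeta_i|^2), \tag{5.1} \]
\[ \mathbb{E}^{\xi}_{0} \bigl(Q(x, \xi) - R_{ij}(x, \xi)\bigr)^2 \le \frac{M}{N} (1 + |\zeta_i|^2 + |\zeta_j|^2). \tag{5.2} \]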
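PROOF. Since $\xi_j$ are i.i.d. $N(0, 1)$, using (2.1) and (3.1), we obtain that
\[ \mathbb{E} \|C^{1/2}\xi\|_s^4 \le 3 \bigl(\mathbb{E} \|C^{1/2}\xi\|_s^2\bigr)^2 \le M \Bigl( \sum_{j=1}^{\infty} j^{2s - 2\kappa} \Bigr)^2 < \infty \tag{5.3} \]
since $s < \kappa - \frac{1}{2}$.
Starting from (2.40), the estimates in (2.32) and (5.3) imply that
\[ \begin{aligned} \mathbb{E}^{\xi}_{0} |Q(x, \xi) - R_i(x, \xi)|^2 &\le M \Bigl( \mathbb{E}^{\xi}_{0} |r(x, \xi)|^2 + \frac{1}{N} \mathbb{E}^{\xi}_{0} \zeta_i^2 \xi_i^2 + \frac{1}{N^2} \mathbb{E} \xi_i^4 \Bigr) \\ &\le M \Bigl( \frac{1}{N^2} \mathbb{E} \|C^{1/2}\xi\|_s^4 + \frac{1}{N} \zeta_i^2 + \frac{3}{N^2} \Bigr) \\ &\le M \frac{1}{N} (1 + \zeta_i^2), \end{aligned} \]
verifying the first part of the lemma. A very similar argument for the second part finishes the proof. □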
The random variables $R(x, \xi)$, $R_i(x, \xi)$ and $R_{ij}(x, \xi)$ are approximately Gaussian random variables. Indeed it can be readily seen that
\[ R(x, \xi) \approx N\Bigl( -\ell^2, \frac{2\ell^2}{N} \|\zeta\|^2 \Bigr). \]
The next lemma contains a crucial observation. We show that the sequence of random variables $\{\|\zeta\|^2 / N\}$ converges to 1 almost surely under both $\pi_0$ and $\pi$. Thus, for almost every $x$, $R(x, \xi)$ converges weakly to $Z_\ell \overset{\mathrm{def}}{=} N(-\ell^2, 2\ell^2)$, and thus the expected acceptance probability $\mathbb{E}\alpha(x, \xi) = \mathbb{E}(1 \wedge e^{Q(x, \xi)})$ converges to $\beta = \mathbb{E}(1 \wedge e^{Z_\ell})$.
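The limiting acceptance probability $\beta = \mathbb{E}(1 \wedge e^{Z_\ell})$ can be evaluated numerically. The sketch below is an illustration, not part of the proof: it compares a trapezoid-rule approximation of the expectation with the closed form $2\Phi(-\ell/\sqrt{2})$, which is not stated in this section but follows from a direct Gaussian integral for a normal variable with mean $-\sigma^2/2$ and variance $\sigma^2$. The function names and quadrature parameters are our own choices.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def beta_quadrature(ell, half_width=12.0, points=100001):
    """Trapezoid-rule value of E[min(1, exp(Z))] for Z ~ N(-ell^2, 2 ell^2)."""
    mean, sd = -ell ** 2, math.sqrt(2.0) * ell
    lo, hi = mean - half_width * sd, mean + half_width * sd
    h = (hi - lo) / (points - 1)
    total = 0.0
    for k in range(points):
        z = lo + k * h
        density = math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, points - 1) else 1.0  # trapezoid end-point weights
        total += weight * min(1.0, math.exp(z)) * density
    return total * h


def beta_closed_form(ell):
    """Closed form 2 * Phi(-ell / sqrt(2)) of the same expectation."""
    return 2.0 * std_normal_cdf(-ell / math.sqrt(2.0))
```

For example, `beta_closed_form(1.0)` is approximately 0.48, and the two functions should agree to quadrature accuracy for moderate values of `ell`.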
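LEMMA 5.2. As $N \to \infty$ we have
\[ \frac{1}{N} \|\zeta\|^2 \to 1, \quad \pi_0\text{-a.s.} \qquad \text{and} \qquad \frac{1}{N} \|\zeta\|^2 \to 1, \quad \pi\text{-a.s.} \tag{5.4} \]
Furthermore, for any $m \in \mathbb{N}$, $\alpha \ge 2$, $s < \kappa - \frac{1}{2}$ and for any $c \ge 0$,
\[ \limsup_{N\to\infty} \mathbb{E}^{\pi^N} \sum_{j=1}^{N} \lambda_j^{\alpha} j^{2s} |\zeta_j|^m e^{(c/N)\|\zeta\|^2} < \infty. \tag{5.5} \]
Finally, we have
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N} \Bigl( \Bigl| 1 - \frac{1}{N} \|\zeta\|^2 \Bigr|^2 \Bigr) = 0. \tag{5.6} \]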
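PROOF. The proof proceeds by showing the conclusions first in the case when $x \overset{D}{\sim} \pi_0$; this is easier because the finite-dimensional distributions are Gaussian and by Fernique's theorem $x$ has exponential moments. Next we notice that the almost sure properties are preserved under the change of measure $\pi$. To show the convergence of moments, we use our hypothesis that the Radon–Nikodym derivative $\frac{d\pi^N}{d\pi_0}$ is bounded from above independently of $N$, as shown in Lemma 3.5, equation (3.8).
Indeed, first let $x \overset{D}{\sim} \pi_0$. Recall that $\zeta = C^{-1/2}(P^N x) + C^{1/2}\nabla\Psi^N(x)$ and
\[ \|\nabla\Psi^N(x)\|_{-s} \le M_3 (1 + \|x\|_s). \tag{5.7} \]
Using (3.6) and the fact that $s < \kappa - \frac{1}{2}$ so that $-\kappa < -s$, we deduce that
\[ \|C^{1/2}\nabla\Psi^N(x)\| \asymp \|\nabla\Psi^N(x)\|_{-\kappa} \le \|\nabla\Psi^N(x)\|_{-s} \le M (1 + \|x\|_s) \]
uniformly in $N$. Also, since $x$ is Gaussian under $\pi_0$, from (2.4), we may write $C^{-1/2}(P^N x) = \sum_{k=1}^{N} \rho_k \varphi_k$, where $\rho_k$ are i.i.d. $N(0, 1)$. Note that
\[ \begin{aligned} \frac{1}{N} \|\zeta\|^2 &= \frac{1}{N} \|C^{-1/2}(P^N x) + C^{1/2}\nabla\Psi^N(x)\|^2 \\ &= \frac{1}{N} \bigl( \|C^{-1/2}(P^N x)\|^2 + 2 \langle C^{-1/2}(P^N x), C^{1/2}\nabla\Psi^N(x)\rangle + \|C^{1/2}\nabla\Psi^N(x)\|^2 \bigr) \\ &= \frac{1}{N} \bigl( \|C^{-1/2}(P^N x)\|^2 + 2 \langle P^N x, \nabla\Psi^N(x)\rangle + \|C^{1/2}\nabla\Psi^N(x)\|^2 \bigr) \\ &= \frac{1}{N} \sum_{k=1}^{N} \rho_k^2 + \gamma, \end{aligned} \tag{5.8} \]
where
\[ \begin{aligned} |\gamma| &\le \frac{1}{N} \bigl( 2 \|x\|_s \|\nabla\Psi^N(x)\|_{-s} + \|C^{1/2}\nabla\Psi^N(x)\|^2 \bigr) \\ &\le \frac{M}{N} \bigl( 2 \|x\|_s (1 + \|x\|_s) + (1 + \|x\|_s)^2 \bigr). \end{aligned} \tag{5.9} \]
Under $\pi_0$, we have $\|x\|_s < \infty$ a.s. for $s < \kappa - \frac{1}{2}$ and hence, by (5.9), we conclude that $|\gamma| \to 0$ almost surely as $N \to \infty$. Now, by the strong law of large numbers, $\frac{1}{N} \sum_{k=1}^{N} \rho_k^2 \to 1$ almost surely. Hence, from (5.8) we obtain that under $\pi_0$, $\lim_{N\to\infty} \frac{1}{N} \|\zeta\|^2 = 1$ almost surely, proving the first equation in (5.4). Now the second equation in (5.4) follows by noting that almost sure limits are preserved under an (absolutely continuous) change of measure.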
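Next, notice that by (5.8) and the Cauchy–Schwarz inequality, for any $c > 0$,
\[ \begin{aligned} \bigl( \mathbb{E}^{\pi_0} e^{(c/N)\|\zeta\|^2} \bigr)^2 &\le \bigl( \mathbb{E}^{\pi_0} e^{(2c/N) \sum \rho_k^2} \bigr) \bigl( \mathbb{E}^{\pi_0} e^{2c\gamma} \bigr) \\ &\le \bigl( \mathbb{E}^{\pi_0} e^{(2c/N) \sum \rho_k^2} \bigr) \bigl( \mathbb{E}^{\pi_0} e^{(M/N)\|x\|_s^2} \bigr). \end{aligned} \]
Using the fact that $\sum_{k=1}^{N} \rho_k^2$ has chi-squared distribution with $N$ degrees of freedom gives
\[ \bigl( \mathbb{E}^{\pi_0} e^{(c/N)\|\zeta\|^2} \bigr)^2 \le M e^{-(N/2) \log(1 - 4c/N)} \bigl( \mathbb{E}^{\pi_0} e^{(M/N)\|x\|_s^2} \bigr) \le M, \tag{5.10} \]
where the last inequality follows from Fernique's theorem since $\mathbb{E}^{\pi_0} e^{(M/N)\|x\|_s^2} < \infty$ for sufficiently large $N$. Hence, by applying Lemma 3.5, equation (3.8), it follows that $\limsup_{N\to\infty} \mathbb{E}^{\pi^N} e^{(c/N)\|\zeta\|^2} < \infty$. Notice that we also have the bound
\[ |\zeta_k|^m \le M \bigl( |\rho_k|^m + |\lambda_k|^m (1 + \|x\|_s^m) \bigr). \]
Since $s < \kappa - 1/2$, we have that $\sum_{j=1}^{\infty} \lambda_j^2 j^{2s} < \infty$ and therefore, it follows that for $\alpha \ge 2$,
\[ \limsup_{N\to\infty} \sum_{k=1}^{N} \bigl( \mathbb{E}^{\pi^N} \lambda_k^{2\alpha} k^{2s} |\zeta_k|^{2m} \bigr)^{1/2} < \infty. \tag{5.11} \]
Hence the claim in (5.5) follows from applying Cauchy–Schwarz combined with (5.10) and (5.11). Similarly, a straightforward calculation yields that $\mathbb{E}^{\pi_0}(|1 - \frac{1}{N}\|\zeta\|^2|^2) \le \frac{M}{N}$. Hence, again by Lemma 3.5,
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N} \Bigl( \Bigl| 1 - \frac{1}{N} \|\zeta\|^2 \Bigr|^2 \Bigr) = 0, \]
proving the last claim, and the proof is complete. □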
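Two ingredients of the proof of Lemma 5.2 are easy to check numerically in the purely Gaussian case. The following is a minimal sketch, assuming $\Psi = 0$ so that $\zeta = \sum_k \rho_k \varphi_k$ and $\gamma = 0$ in (5.8); the function names are hypothetical. The first function illustrates the strong law behind (5.4), the second evaluates the chi-squared exponential moment appearing in (5.10).

```python
import math
import random


def zeta_norm_sq_over_n(n, rng):
    """(1/N) * sum_k rho_k^2 for i.i.d. rho_k ~ N(0, 1), i.e. (1/N)||zeta||^2 when Psi = 0."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n


def chi_sq_exp_moment(c, n):
    """E exp((2c/N) * chi^2_N) = (1 - 4c/N)^(-N/2), finite only for N > 4c."""
    if n <= 4.0 * c:
        raise ValueError("need N > 4c for a finite moment")
    return math.exp(-0.5 * n * math.log(1.0 - 4.0 * c / n))
```

As $N$ grows, `chi_sq_exp_moment(c, N)` decreases towards $e^{2c}$, which is why the factor $e^{-(N/2)\log(1 - 4c/N)}$ in (5.10) stays bounded in $N$.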
Recall that $Q(x, \xi) = R(x, \xi) - r(x, \xi)$. Thus, from (2.32) and Lemma 5.1 it follows that $R_i(x, \xi)$ and $R_{ij}(x, \xi)$ also are approximately Gaussian. Therefore, the conclusion of Lemma 5.2 leads to the reasoning that, for any fixed realization of $x \overset{D}{\sim} \pi$, the random variables $R(x, \xi)$, $R_i(x, \xi)$ and $R_{ij}(x, \xi)$ all converge to the same weak limit $Z_\ell \sim N(-\ell^2, 2\ell^2)$ as the dimension of the noise $\xi$ goes to $\infty$. In the rest of this subsection, we make this argument rigorous by deriving a Berry–Esseen bound for the weak convergence of $R(x, \xi)$ to $Z_\ell$.
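The approximate Gaussianity of $R(x, \xi)$ for a fixed $x$ can be illustrated by simulation. The sketch below assumes that $R(x, \xi) = -\sqrt{2\ell^2/N} \sum_j \zeta_j \xi_j - (\ell^2/N) \sum_j \xi_j^2$, which is our reading of the expression for $R_i$ used in (5.20) with the restriction $j \ne i$ removed; definition (2.38) itself is not reproduced in this section. Under that assumption the conditional mean is exactly $-\ell^2$ and the conditional variance is $2\ell^2 \|\zeta\|^2 / N + 2\ell^4 / N$. Function names are hypothetical.

```python
import math
import random


def sample_r(zeta, ell, rng):
    """One draw of R(x, xi) = -sqrt(2 l^2/N) <zeta, xi> - (l^2/N) ||xi||^2 (assumed form)."""
    n = len(zeta)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    inner = sum(z * e for z, e in zip(zeta, xi))
    norm_sq = sum(e * e for e in xi)
    return -math.sqrt(2.0 * ell ** 2 / n) * inner - (ell ** 2 / n) * norm_sq


def r_moments(n, ell, reps, seed=0):
    """Empirical mean and variance of R(x, xi) for one fixed zeta with i.i.d. N(0, 1) entries.

    Also returns ||zeta||^2 / N so that the variance can be compared with
    2 l^2 ||zeta||^2 / N + 2 l^4 / N.
    """
    rng = random.Random(seed)
    zeta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    draws = [sample_r(zeta, ell, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
    return mean, var, sum(z * z for z in zeta) / n
```

With `ell = 1` the empirical mean should be near $-1$ and the empirical variance near $2\|\zeta\|^2/N$, in line with the weak limit $Z_\ell$.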
For this purpose, it is natural and convenient to obtain these bounds in the Wasserstein metric. Recall that the Wasserstein distance $\mathrm{Wass}(X, Y)$ between two random variables is defined by
\[ \mathrm{Wass}(X, Y) \overset{\mathrm{def}}{=} \sup_{f \in \mathrm{Lip}_1} \mathbb{E}\bigl( f(X) - f(Y) \bigr), \]
where $\mathrm{Lip}_1$ is the class of 1-Lipschitz functions. The following lemma gives a bound for the Wasserstein distance between $R(x, \xi)$ and $Z_\ell$.
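For real-valued samples the Wasserstein distance defined above can be computed directly. The following is a small sketch, assuming two empirical distributions with the same number of atoms; in one dimension the supremum over 1-Lipschitz functions then equals the average absolute difference of the sorted samples. The function name is our own.

```python
def wasserstein_1(xs, ys):
    """Wasserstein-1 distance between two empirical distributions of equal size.

    In one dimension the optimal coupling matches order statistics, so the
    distance is the mean absolute difference between the sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

For instance, shifting every sample point by a constant $c$ gives a distance of $|c|$, and permuting a sample leaves the distance at zero.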
LEMMA5.3. Almost surely with respect to x∼ π, Wass(R(x, ξ), Z%)≤ M For any 1-Lipschitz function f ,
66Eξ'f (G)− f (R(x, ξ))(66≤ %2Eξ
implying that $\mathrm{Wass}(G, R(x, \xi)) \le M \frac{1}{\sqrt{N}}$. Now, from classical Berry–Esseen estimates (see [26]), we have that
Wass(G, Z%)≤ M 1
Hence the proof of the first claim follows from the triangle inequality. To see the second claim, notice that for any 1-Lipschitz function $f$ we have
\[ \mathbb{E}^{\xi}_{0} |f(R(x, \xi)) - f(R_i(x, \xi))| \le \mathbb{E}^{\xi}_{0} |R(x, \xi) - R_i(x, \xi)| \le M \frac{1}{\sqrt{N}} (1 + |\zeta_i|) \]
and the proof is complete. □
Hence, from equations (5.13) and (5.12), we obtain Wass(Ri(x, ξ ), Z%)
We conclude this section with the following observation which will be used later.
Recall the Kolmogorov–Smirnov (KS) distance between two random variables $(W, Z)$:
\[ \mathrm{KS}(W, Z) \overset{\mathrm{def}}{=} \sup_{t \in \mathbb{R}} |\mathbb{P}(W \le t) - \mathbb{P}(Z \le t)|. \tag{5.15} \]
LEMMA 5.4. If a random variable $Z$ has a density with respect to the Lebesgue measure, bounded by a constant $M$, then
\[ \mathrm{KS}(W, Z) \le \sqrt{4 M\, \mathrm{Wass}(W, Z)}. \tag{5.16} \]
We could not find a reference for the above result in the published literature, so we include a short proof here, taken from the unpublished lecture notes [10].
PROOF OF LEMMA 5.4. Fix $t \in \mathbb{R}$ and $\epsilon > 0$. Define two functions $g_1$ and $g_2$ as follows: $g_1(y) = 1$ for $y \in (-\infty, t]$, $g_1(y) = 0$ for $y \in [t + \epsilon, \infty)$ and linear interpolation in between. Similarly, define $g_2(y) = 1$ for $y \in (-\infty, t - \epsilon]$, $g_2(y) = 0$ for $y \in [t, \infty)$ and linear interpolation in between. Then $g_1$ and $g_2$ form upper and lower envelopes for the function $1_{(-\infty, t]}(y)$. So
\[ \mathbb{P}(W \le t) - \mathbb{P}(Z \le t) \le \mathbb{E} g_1(W) - \mathbb{E} g_1(Z) + \mathbb{E} g_1(Z) - \mathbb{P}(Z \le t). \]
Since $g_1$ is $\frac{1}{\epsilon}$-Lipschitz, we have $\mathbb{E} g_1(W) - \mathbb{E} g_1(Z) \le \frac{1}{\epsilon} \mathrm{Wass}(W, Z)$, and $\mathbb{E} g_1(Z) - \mathbb{P}(Z \le t) \le M\epsilon$ since $Z$ has density bounded by $M$. Similarly, using the function $g_2$, it follows that the same bound holds for the difference $\mathbb{P}(Z \le t) - \mathbb{P}(W \le t)$. Optimizing over $\epsilon$, that is, taking $\epsilon = \sqrt{\mathrm{Wass}(W, Z)/M}$, yields the required bound. □
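The inequality (5.16) can be checked on an example where both distances are available in closed form. The sketch below is an illustration under an assumed setting: $Z \sim N(0, 1)$, whose density is bounded by $1/\sqrt{2\pi}$, and $W = Z + \delta$ for a deterministic shift $\delta$. Then $\mathrm{Wass}(W, Z) = |\delta|$ and the KS distance is attained at $t = \delta/2$. The function names are hypothetical.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def ks_and_bound_for_shift(delta):
    """For Z ~ N(0, 1) and W = Z + delta, return (KS(W, Z), sqrt(4 M Wass(W, Z)))."""
    density_bound = 1.0 / math.sqrt(2.0 * math.pi)      # supremum of the N(0, 1) density
    wass = abs(delta)                                    # a shift moves every quantile by |delta|
    ks = 2.0 * std_normal_cdf(abs(delta) / 2.0) - 1.0    # sup_t |Phi(t) - Phi(t - delta)|
    return ks, math.sqrt(4.0 * density_bound * wass)
```

For small shifts the KS distance is of order $\delta$ while the bound is of order $\sqrt{\delta}$, so the square root in (5.16) is not sharp in this example, but the inequality holds for every shift.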
5.2. Rigorous estimates for the drift: Proof of Proposition 2.1, equation (2.14).
In the following series of lemmas we retrace the arguments from Section 2.6 while deriving explicit bounds for the error terms. Lemma 5.11 at the end of the section gives control of the error terms.
The following lemma shows that $Q(x, \xi)$ is well approximated by $R_i(x, \xi) - \sqrt{\frac{2\ell^2}{N}}\, \zeta_i \xi_i$, as indicated in (2.40).
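LEMMA 5.5.
\[ N \mathbb{E}_0(x_i^1 - x_i) = \lambda_i \sqrt{2\ell^2 N}\, \mathbb{E}^{\xi}_{0}\Bigl( \bigl( 1 \wedge e^{R_i(x, \xi) - \sqrt{2\ell^2/N}\, \zeta_i \xi_i} \bigr) \xi_i \Bigr) + \omega_0(i), \qquad |\omega_0(i)| \le \frac{M}{\sqrt{N}} \lambda_i. \]
PROOF. We have
\[ \begin{aligned} N \mathbb{E}_0(x_i^1 - x_i^0) &= N \mathbb{E}_0\bigl( \gamma^0 (y_i^0 - x_i) \bigr) = N \mathbb{E}^{\xi}_{0}\Bigl( \alpha(x, \xi) \sqrt{\frac{2\ell^2}{N}}\, (C^{1/2}\xi)_i \Bigr) \\ &= \lambda_i \sqrt{2\ell^2 N}\, \mathbb{E}^{\xi}_{0}\bigl( \alpha(x, \xi) \xi_i \bigr) = \lambda_i \sqrt{2\ell^2 N}\, \mathbb{E}^{\xi}_{0}\bigl( (1 \wedge e^{Q(x, \xi)}) \xi_i \bigr). \end{aligned} \]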
Now we observe that
where the last inequality follows from (5.17) and the proof is complete. □
The next lemma takes advantage of the fact that $R_i(x, \xi)$ is independent of $\xi_i$ conditional on $x$. Thus, using the identity (2.36), we obtain the bound for the approximation made in (2.41).
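Now we observe that
\[ \begin{aligned} \mathbb{E}^{\xi^-_i}_{0} e^{R_i(x, \xi) + \ell^2 \zeta_i^2 / N} &= \mathbb{E}^{\xi^-_i}_{0}\Bigl( e^{-\sqrt{2\ell^2/N} \sum_{j=1, j \ne i}^{N} \zeta_j \xi_j - (\ell^2/N) \sum_{j=1, j \ne i}^{N} \xi_j^2 + (\ell^2/N) \zeta_i^2} \Bigr) \\ &\le \mathbb{E}^{\xi^-_i}_{0}\Bigl( e^{-\sqrt{2\ell^2/N} \sum_{j=1, j \ne i}^{N} \zeta_j \xi_j + (\ell^2/N) \zeta_i^2} \Bigr) = e^{(\ell^2/N)\|\zeta\|^2}. \end{aligned} \tag{5.20} \]
Since $\Phi$ is globally Lipschitz, it follows that
\[ \begin{aligned} &\mathbb{E}^{\xi^-_i}_{0} e^{R_i(x, \xi) + \ell^2 \zeta_i^2 / N}\, \Phi\Bigl( \frac{-R_i(x, \xi)}{\sqrt{2\ell^2/N}\, |\zeta_i|} - \sqrt{\frac{2\ell^2}{N}}\, |\zeta_i| \Bigr) \\ &\quad = \mathbb{E}^{\xi^-_i}_{0} e^{R_i(x, \xi) + \ell^2 \zeta_i^2 / N}\, \Phi\Bigl( \frac{-R_i(x, \xi)}{\sqrt{2\ell^2/N}\, |\zeta_i|} \Bigr) + \omega_1(i), \end{aligned} \tag{5.21} \]
\[ |\omega_1(i)| \le M |\zeta_i| \frac{1}{\sqrt{N}}\, \mathbb{E}^{\xi^-_i}_{0} e^{R_i(x, \xi) + \ell^2 \zeta_i^2 / N} \le M |\zeta_i| \frac{1}{\sqrt{N}}\, e^{(\ell^2/N)\|\zeta\|^2}, \]
where the last estimate follows from (5.20). The lemma follows from (5.19) and (5.20). □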
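The Gaussian integral behind the expressions in (5.21) can be verified numerically. The identity (2.36) is not reproduced in this section; the sketch below assumes it is of the standard form $\mathbb{E}[\xi (1 \wedge e^{a\xi + b})] = a\, e^{a^2/2 + b}\, \Phi(-b/|a| - |a|)$ for $\xi \sim N(0, 1)$ and $a \ne 0$, which follows from Gaussian integration by parts and matches (5.21) with $a = -\sqrt{2\ell^2/N}\, \zeta_i$ and $b = R_i(x, \xi)$. Function names and quadrature parameters are our own.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def lhs_quadrature(a, b, half_width=12.0, points=100001):
    """Trapezoid-rule value of E[xi * min(1, exp(a*xi + b))] for xi ~ N(0, 1)."""
    h = 2.0 * half_width / (points - 1)
    total = 0.0
    for k in range(points):
        x = -half_width + k * h
        weight = 0.5 if k in (0, points - 1) else 1.0  # trapezoid end-point weights
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += weight * x * min(1.0, math.exp(a * x + b)) * density
    return total * h


def rhs_closed_form(a, b):
    """a * exp(a^2/2 + b) * Phi(-b/|a| - |a|), for a != 0."""
    return a * math.exp(0.5 * a * a + b) * std_normal_cdf(-b / abs(a) - abs(a))
```

The two functions should agree to quadrature accuracy for moderate values of `a` and `b`, for example `a = 0.3`, `b = -1.0`.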
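The next few lemmas are technical and give quantitative bounds for the approximations in (2.43) and (2.44).
LEMMA 5.7.
\[ \mathbb{E}^{\xi^-_i}_{0} e^{R_i(x, \xi) + \ell^2 \zeta_i^2 / N}\, \Phi\Bigl( \frac{-R_i(x, \xi)}{\sqrt{2\ell^2/N}\, |\zeta_i|} \Bigr) = \mathbb{E}^{\xi^-_i}_{0} e^{R_i(x, \xi) + \ell^2 \zeta_i^2 / N}\, 1_{R_i(x, \xi) < 0} + \omega_2(i), \]
\[ |\omega_2(i)| \le M e^{(2\ell^2/N)\|\zeta\|^2} (|\zeta_i| + 1) \Bigl[ \mathbb{E}^{\xi}_{0} \frac{1}{(1 + |R(x, \xi)| \sqrt{N})^2} \Bigr]^{1/4}. \]
PROOF. We first prove the following lemma needed for the proof.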
LEMMA 5.8. Let $\varphi(\cdot)$ and $\Phi(\cdot)$ denote the pdf and CDF of the standard normal distribution, respectively. Then we have:
(1) for any $x \in \mathbb{R}$, $|\Phi(-x) - 1_{x<0}| = |1 - \Phi(|x|)|$.
(2) for any x > 0 and 6≥ 0, 1 − '(x) ≤x1+6+6.
PROOF. For the first claim, notice that if $x > 0$, $|\Phi(-x) - 1_{x<0}| = |\Phi(-x)| = |1 - \Phi(|x|)|$. If $x < 0$, $|\Phi(-x) - 1_{x<0}| = |1 - \Phi(|x|)|$ and the claim follows.
For the second claim,
We now proceed to the proof of Lemma 5.7. By Cauchy–Schwarz and an estimate similar to (5.20),
where the last two observations follow from the computation done in (5.20) and the fact that|1Ri(x,ξ )<0− '(√−Ri(x,ξ ) The right-hand side of the estimate (5.23) depends on i but we need estimates which are independent of i. In the next lemma, we replace Ri(x, ξ ) by R(x, ξ) and control the extra error term.
LEMMA5.9.
PROOF. We write and the claim follows from (5.25) and (5.26). !
Now, by applying the estimates obtained in (5.22), (5.23) and (5.24), we obtain
|ω2(i)| ≤ Me(2%2/N )-ζ -2(|ζi| + 1)
and the proof is complete. !
The error estimate in $\omega_2$ has $R(x, \xi)$ instead of $R_i(x, \xi)$. This bound can be achieved because the terms $R_i(x, \xi)$ for all $i \in \mathbb{N}$ have the same weak limit as $R(x, \xi)$, and thus the additional error term due to the replacement of $R_i(x, \xi)$ by $R(x, \xi)$ in the expression can be controlled uniformly over $i$ for large $N$.
LEMMA5.10.
PROOF. Set $g(y) \overset{\mathrm{def}}{=} e^{y} 1_{y<0}$. We first need to estimate the following:
\[ \bigl| \mathbb{E}^{\xi}_{0}\bigl( g(R_i(x, \xi)) - g(Z_\ell) \bigr) \bigr|. \]
Notice that the function g(·) is not Lipschitz and therefore, the Wasserstein bounds obtained earlier cannot be used directly. However, we use the fact that the normal distribution has a density which is bounded above. So by Lemma5.3, (5.14) and (5.16), Since g is positive on (−∞, 0], for a real valued continuous random variable X,
E(g(X)) =- 0
Hence, putting the above calculations together and noticing thatE(eZ%1Z%<0)= β/2, we have just shown that
66
where the last bound follows from (5.20), proving the claimed error bound for ω3(i). !
For deriving the error bounds on $\omega_3$, we cannot directly apply the Wasserstein bounds obtained in (5.14), because the function $y \mapsto e^{y} 1_{y<0}$ is not Lipschitz on $\mathbb{R}$.
However, using (5.16), the KS distance between $R_i(x, \xi)$ and $Z_\ell$ is bounded by the square root of the Wasserstein distance. Thus, using the fact that $e^{y} 1_{y<0}$ is bounded and positive, we bound the expectation in Lemma 5.10 by the KS distance.
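Combining all the above estimates, we see that
\[ N \mathbb{E}^{\xi}_{0}[x_i^1 - x_i] = -\ell^2 \beta \bigl( P^N x + C\nabla\Psi(P^N x) \bigr)_i + r_i^N \tag{5.27} \]
with
\[ |r_i^N| \le |\omega_0(i)| + M \lambda_i \bigl( \sqrt{N} |\omega_1(i)| + |\zeta_i| |\omega_2(i)| + |\zeta_i| |\omega_3(i)| \bigr). \tag{5.28} \]
The following lemma gives the control over $r^N$ and completes the proof of (2.14), Proposition 2.1.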
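LEMMA 5.11. For $s < \kappa - 1/2$,
\[ \lim_{N\to\infty} \mathbb{E}^{\pi^N} \|r^N\|_s^2 = \lim_{N\to\infty} \mathbb{E}^{\pi^N} \sum_{i=1}^{N} i^{2s} |r_i^N|^2 = 0. \]
PROOF. By (5.28), we have $|r_i^N| \le |\omega_0(i)| + M \lambda_i ( \sqrt{N} |\omega_1(i)| + |\zeta_i| |\omega_2(i)| + |\zeta_i| |\omega_3(i)| )$. Therefore,
\[ \mathbb{E}^{\pi^N} \sum_{i=1}^{N} i^{2s} |r_i^N|^2 \le M\, \mathbb{E}^{\pi^N} \sum_{i=1}^{N} \Bigl( i^{2s} |\omega_0(i)|^2 + i^{2s} \lambda_i^2 \bigl( N \omega_1(i)^2 + \zeta_i^2 \omega_2(i)^2 + \zeta_i^2 \omega_3(i)^2 \bigr) \Bigr). \tag{5.29} \]
Now we will evaluate each sum on the right-hand side of the above equation and show that they converge to zero.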
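• Since $\sum_{i=1}^{\infty} \lambda_i^2 i^{2s} < \infty$,
\[ \sum_{i=1}^{N} \mathbb{E}^{\pi^N} i^{2s} |\omega_0(i)|^2 \le M \frac{1}{N} \sum_{i=1}^{N} i^{2s} \lambda_i^2 \le M \frac{1}{N} \sum_{i=1}^{\infty} \lambda_i^2 i^{2s} \to 0. \tag{5.30} \]
• By Lemmas 5.6 and 5.2,
\[ N \mathbb{E}^{\pi^N} \sum_{i=1}^{N} \lambda_i^2 i^{2s} |\omega_1(i)|^2 \le M \frac{1}{N} \sum_{i=1}^{N} \mathbb{E}^{\pi^N} \lambda_i^2 i^{2s} |\zeta_i|^4 e^{(2\ell^2/N)\|\zeta\|^2} \to 0. \tag{5.31} \]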
• From Lemma5.7and Cauchy–Schwarz, we obtain
Proceeding similarly as in Lemma 5.2, it follows that
\[ \sum_{i=1}^{N} \Bigl( \mathbb{E}^{\pi^N} e^{(8\ell^2/N)\|\zeta\|^2} \lambda_i^4 i^{4s} (|\zeta_i|^8 + 1) \Bigr)^{1/2} \]
is bounded in N. Since, with x∼ πD 0, R(x, ξ) converges weakly to Z%as N→
∞, by the bounded convergence theorem we obtain lim and thus, by Lemma3.5,
Nlim→∞EπN
• After some algebra we obtain from Lemma5.10that EπN
Similar to the previous calculations, using Lemma 5.2, it is quite straightforward to verify that each of the four terms above converges to 0. Thus we obtain
\[ \lim_{N\to\infty} \sum_{i=1}^{N} \mathbb{E}^{\pi^N} \lambda_i^2 i^{2s} |\zeta_i|^2 |\omega_3(i)|^2 = 0. \tag{5.33} \]
Now the proof of Lemma 5.11 follows from (5.29)–(5.33). □
This completes the proof of Proposition 2.1, equation (2.14).
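The limiting drift coefficient $-\ell^2 \beta$ in (5.27) can be illustrated by a small Monte Carlo experiment. The sketch below is a check under simplifying assumptions, not the authors' argument: it takes $\Psi = 0$ at stationarity, so that $\zeta$ and $\xi$ both have i.i.d. $N(0, 1)$ entries, and estimates the factor $\mathbb{E}[e^{R_i + \ell^2 \zeta_i^2/N}\, \Phi(-R_i/(a|\zeta_i|) - a|\zeta_i|)]$ with $a = \sqrt{2\ell^2/N}$, using the form of $R_i$ from (5.20). The lemmas of this subsection suggest that this factor is close to $\beta/2$, which gives the drift $\lambda_i \sqrt{2\ell^2 N} \cdot (-a \zeta_i) \cdot \beta/2 = -\ell^2 \beta \lambda_i \zeta_i$. The function names are hypothetical.

```python
import math
import random


def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def drift_factor(n, ell, reps, seed=0):
    """Monte Carlo estimate of E[exp(R_i + l^2 zeta_i^2/N) * Phi(-R_i/(a|zeta_i|) - a|zeta_i|)].

    Here a = sqrt(2 l^2 / N), i is the first coordinate, and both zeta and xi
    have i.i.d. N(0, 1) entries (the stationary Gaussian case Psi = 0, an
    assumption made for illustration). The expected limit is beta / 2.
    """
    rng = random.Random(seed)
    a = math.sqrt(2.0 * ell ** 2 / n)
    total = 0.0
    for _ in range(reps):
        zeta_i = rng.gauss(0.0, 1.0)
        r_i = 0.0
        for _ in range(n - 1):  # coordinates j != i
            zeta_j, xi_j = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            r_i += -a * zeta_j * xi_j - (ell ** 2 / n) * xi_j ** 2
        shift = a * abs(zeta_i)  # 0.5 * shift^2 equals l^2 zeta_i^2 / N
        total += math.exp(r_i + 0.5 * shift ** 2) * std_normal_cdf(-r_i / shift - shift)
    return total / reps
```

With `ell = 1` and `n = 100` the estimate should be close to $\Phi(-1/\sqrt{2}) \approx 0.24$, that is, to $\beta/2$ for the closed form $\beta = 2\Phi(-\ell/\sqrt{2})$.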
5.3. Rigorous estimates for the diffusion coefficient: Proof of Proposition 2.1, equation (2.15). Recall that for $1 \le i, j \le N$,
\[ N \mathbb{E}_0[(x_i^1 - x_i^0)(x_j^1 - x_j^0)] = 2\ell^2\, \mathbb{E}^{\xi}_{0}\bigl[ (C^{1/2}\xi)_i (C^{1/2}\xi)_j \bigl( 1 \wedge \exp Q(x, \xi) \bigr) \bigr]. \]
The following lemma quantifies the approximations made in (2.48) and (2.49).
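LEMMA 5.12.
\[ \begin{aligned} \mathbb{E}^{\xi}_{0}\bigl[ (C^{1/2}\xi)_i (C^{1/2}\xi)_j \bigl( 1 \wedge \exp Q(x, \xi) \bigr) \bigr] &= \lambda_i \lambda_j \delta_{ij}\, \mathbb{E}^{\xi^-_{ij}}\bigl[ \bigl( 1 \wedge \exp R_{ij}(x, \xi) \bigr) \bigr] + \theta_{ij}, \\ \mathbb{E}^{\xi^-_{ij}}\bigl[ \bigl( 1 \wedge \exp R_{ij}(x, \xi) \bigr) \bigr] &= \beta + \rho_{ij}, \end{aligned} \]
where the error terms satisfy
\[ |\theta_{ij}| \le M \lambda_i \lambda_j (1 + |\zeta_i|^2 + |\zeta_j|^2)^{1/2} \frac{1}{\sqrt{N}}, \tag{5.34} \]
\[ |\rho_{ij}| \le M \Bigl( \frac{1}{\sqrt{N}} (1 + |\zeta_i| + |\zeta_j|) + \frac{1}{N^{3/2}} \sum_{s=1}^{N} |\zeta_s|^3 + \Bigl| 1 - \frac{\|\zeta\|^2}{N} \Bigr| \Bigr). \tag{5.35} \]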
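PROOF. We first derive the bound for $\theta$. Indeed,
\[ \begin{aligned} |\theta_{ij}| &\le \mathbb{E}^{\xi}_{0}\bigl[ \bigl| (C^{1/2}\xi)_i (C^{1/2}\xi)_j \bigl( (1 \wedge e^{Q(x, \xi)}) - (1 \wedge e^{R_{ij}(x, \xi)}) \bigr) \bigr| \bigr] \\ &\le M \lambda_i \lambda_j\, \mathbb{E}^{\xi}_{0}\bigl[ \bigl| \xi_i \xi_j \bigl( (1 \wedge e^{Q(x, \xi)}) - (1 \wedge e^{R_{ij}(x, \xi)}) \bigr) \bigr| \bigr]. \end{aligned} \]
By the Cauchy–Schwarz inequality,
\[ \begin{aligned} |\theta_{ij}| &\le M \lambda_i \lambda_j \Bigl( \mathbb{E}^{\xi}_{0} \bigl| (1 \wedge e^{Q(x, \xi)}) - (1 \wedge e^{R_{ij}(x, \xi)}) \bigr|^2 \Bigr)^{1/2} \\ &\le M \lambda_i \lambda_j \bigl( \mathbb{E}^{\xi}_{0} |Q(x, \xi) - R_{ij}(x, \xi)|^2 \bigr)^{1/2}, \end{aligned} \]
where the second inequality uses that $y \mapsto 1 \wedge e^{y}$ is 1-Lipschitz. Using the estimate obtained in (5.2),
\[ |\theta_{ij}| \le M \lambda_i \lambda_j (1 + |\zeta_i|^2 + |\zeta_j|^2)^{1/2} \frac{1}{\sqrt{N}}, \]
verifying (5.34).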
Now we turn to verifying the error bound in (5.35). We need to bound
A simple calculation will yield that
\[ \mathrm{Wass}(R_{ij}(x, \xi), R(x, \xi)) \le M (|\zeta_i| + |\zeta_j| + 1) \frac{1}{\sqrt{N}}. \]
Therefore, by the triangle inequality and Lemma 5.3,
Wass(Rij(x, ξ ), Z%)≤ M Hence the estimate in (5.34) follows from the observation made in (5.36). !
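Putting together all the estimates produces
\[ N \mathbb{E}_0[(x_i^1 - x_i^0)(x_j^1 - x_j^0)] = 2\ell^2 \beta \lambda_i \lambda_j \delta_{ij} + E^N_{ij} \tag{5.37} \]
and
\[ |E^N_{ij}| \le M \bigl( |\theta_{ij}| + \lambda_i \lambda_j \delta_{ij} |\rho_{ij}| \bigr). \]
Finally we estimate the error of $E^N_{ij}$.
LEMMA 5.13. We have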
PROOF. From (5.37) we obtain that
&N
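due to the fact that $\sum_{i=1}^{\infty} \lambda_i^2 i^{2s} < \infty$ and Lemma 5.2. Now the second term of (5.38),
\[ \sum_{i=1}^{N} \lambda_i^2 i^{2s}\, \mathbb{E}^{\pi^N} |\rho_{ii}| \le M\, \mathbb{E}^{\pi_0} \sum_{i=1}^{N} \lambda_i^2 i^{2s} \Bigl( \frac{1}{\sqrt{N}} (1 + |\zeta_i|) + \frac{1}{N^{3/2}} \sum_{s=1}^{N} |\zeta_s|^3 + \Bigl| 1 - \frac{\|\zeta\|^2}{N} \Bigr| \Bigr). \]
The first term above goes to zero by (5.39) and the last term converges to zero by the same arguments used in Lemma 5.2. As mentioned in the proof of the estimate for the term $\omega_3$ in Lemma 5.11, the sum $\mathbb{E}^{\pi^N} \frac{1}{N^{3/2}} \sum_{s=1}^{N} |\zeta_s|^3$ goes to zero. Therefore, we have shown that
\[ \lim_{N\to\infty} \sum_{i=1}^{N} \mathbb{E}^{\pi^N} |\langle \varphi_i, E^N \varphi_i\rangle_s| = 0, \]
proving the first claim. Finally, from (5.34) it immediately follows that
\[ \mathbb{E}^{\pi^N} |\langle \varphi_i, E^N \varphi_j\rangle_s| \le \mathbb{E}^{\pi^N} i^s j^s |\theta_{ij}| \to 0, \]
proving the second claim as well. □
Therefore, we have shown
\[ N \mathbb{E}_0[(x_i^1 - x_i^0)(x_j^1 - x_j^0)] = 2\ell^2 \beta \langle \varphi_i, C \varphi_j\rangle + E^N_{ij}, \qquad \lim_{N\to\infty} \sum_{i=1}^{N} \mathbb{E}^{\pi^N} |\langle \varphi_i, E^N \varphi_i\rangle| = 0. \]
This finishes the proof of Proposition 2.1, equation (2.15).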
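The limiting diffusion coefficient can also be illustrated by simulation. The sketch below assumes the Gaussian target $N(0, C)$, that is $\Psi = 0$, for which the log acceptance ratio of the proposal $y = x + \sqrt{2\ell^2/N}\, C^{1/2}\xi$ can be computed directly as $Q = -\sqrt{2\ell^2/N}\, \langle \zeta, \xi\rangle - (\ell^2/N)\|\xi\|^2$ with $\zeta = C^{-1/2}x$. At stationarity $\zeta$ has i.i.d. $N(0, 1)$ entries. The code estimates $\mathbb{E}[\xi_i \xi_j (1 \wedge e^{Q})]$ for $i = j$ and $i \ne j$, which should be close to $\beta \delta_{ij}$; the function name is hypothetical.

```python
import math
import random


def accept_weighted_products(n, ell, reps, seed=0):
    """Estimate E[alpha], E[xi_1^2 * alpha] and E[xi_1 * xi_2 * alpha] at stationarity.

    alpha = min(1, exp(Q)) with Q = -sqrt(2 l^2/N) <zeta, xi> - (l^2/N) ||xi||^2,
    the log acceptance ratio for the Gaussian target N(0, C) (Psi = 0, an
    illustrative assumption), where zeta = C^{-1/2} x has i.i.d. N(0, 1) entries.
    """
    rng = random.Random(seed)
    a = math.sqrt(2.0 * ell ** 2 / n)
    acc = diag = off = 0.0
    for _ in range(reps):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        q = 0.0
        for e in xi:
            # a fresh N(0, 1) draw plays the role of the matching coordinate of zeta
            q += -a * rng.gauss(0.0, 1.0) * e - (ell ** 2 / n) * e * e
        alpha = min(1.0, math.exp(q))
        acc += alpha
        diag += xi[0] ** 2 * alpha
        off += xi[0] * xi[1] * alpha
    return acc / reps, diag / reps, off / reps
```

Multiplying the diagonal estimate by $2\ell^2 \lambda_i^2$ gives an approximation of $N \mathbb{E}_0[(x_i^1 - x_i^0)^2]$, to be compared with the limit $2\ell^2 \beta \lambda_i^2$ in (5.37).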
Acknowledgments. We thank Alex Thiery and an anonymous referee for their careful reading and very insightful comments which significantly improved the clarity of the presentation.
REFERENCES
[1] BÉDARD, M. (2007). Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. Ann. Appl. Probab. 17 1222–1244. MR2344305
[2] BÉDARD, M. (2009). On the optimal scaling problem of Metropolis algorithms for hierarchical target distributions. Preprint.
[3] BERGER, E. (1986). Asymptotic behaviour of a class of stochastic approximation procedures. Probab. Theory Related Fields 71 517–552. MR0833268
[4] BESKOS, A., ROBERTS, G. and STUART, A. (2009). Optimal scalings for local Metropolis–Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Probab. 19 863–898. MR2537193
[5] BESKOS, A., ROBERTS, G., STUART, A. and VOSS, J. (2008). MCMC methods for diffusion bridges. Stoch. Dyn. 8 319–350. MR2444507
[6] BESKOS, A. and STUART, A. M. (2008). MCMC methods for sampling function space. In ICIAM Invited Lecture 2007 (R. Jeltsch and G. Wanner, eds.). European Mathematical Society, Zürich.
[7] BOU-RABEE, N. and VANDEN-EIJNDEN, E. (2010). Pathwise accuracy and ergodicity of Metropolized integrators for SDEs. Comm. Pure Appl. Math. 63 655–696. MR2583309
[8] BREYER, L. A., PICCIONI, M. and SCARLATTI, S. (2004). Optimal scaling of MALA for nonlinear regression. Ann. Appl. Probab. 14 1479–1505. MR2071431
[9] BREYER, L. A. and ROBERTS, G. O. (2000). From Metropolis to diffusions: Gibbs states and optimal scaling. Stochastic Process. Appl. 90 181–206. MR1794535
[10] CHATTERJEE, S. (2007). Stein's method. Lecture notes. Available at http://www.stat.berkeley.edu/~sourav/stat206Afall07.html.
[11] CHEN, X. and WHITE, H. (1998). Central limit and functional central limit theorems for Hilbert-valued dependent heterogeneous arrays with applications. Econometric Theory 14 260–284. MR1629340
[12] COTTER, S. L., DASHTI, M. and STUART, A. M. (2010). Approximation of Bayesian inverse problems. SIAM Journal of Numerical Analysis 48 322–345.
[13] DA PRATO, G. and ZABCZYK, J. (1992). Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications 44. Cambridge Univ. Press, Cambridge. MR1207136
[14] ETHIER, S. N. and KURTZ, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York. MR0838085
[15] HAIRER, M., STUART, A. M. and VOSS, J. (2007). Analysis of SPDEs arising in path sampling. II. The nonlinear case. Ann. Appl. Probab. 17 1657–1706. MR2358638
[16] HAIRER, M., STUART, A. M. and VOSS, J. (2011). Signal processing problems on function space: Bayesian formulation, stochastic PDEs and effective MCMC methods. In The Oxford Handbook of Nonlinear Filtering (D. Crisan and B. Rozovsky, eds.). Oxford Univ. Press, Oxford.
[17] HAIRER, M., STUART, A. M., VOSS, J. and WIBERG, P. (2005). Analysis of SPDEs arising in path sampling. I. The Gaussian case. Commun. Math. Sci. 3 587–603. MR2188686
[18] HASTINGS, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 97–109.
[19] LIU, J. S. (2008). Monte Carlo Strategies in Scientific Computing. Springer, New York. MR2401592
[20] MA, Z. M. and RÖCKNER, M. (1992). Introduction to the Theory of (nonsymmetric) Dirichlet Forms. Springer, Berlin. MR1214375
[21] METROPOLIS, N., ROSENBLUTH, A. W., TELLER, M. N. and TELLER, E. (1953). Equations of state calculations by fast computing machines. J. Chem. Phys. 21 1087–1092.
[22] ROBERT, C. P. and CASELLA, G. (2004). Monte Carlo Statistical Methods, 2nd ed. Springer, New York. MR2080278
[23] ROBERTS, G. O., GELMAN, A. and GILKS, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110–120. MR1428751
[24] ROBERTS, G. O. and ROSENTHAL, J. S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 255–268. MR1625691
[25] ROBERTS, G. O. and ROSENTHAL, J. S. (2001). Optimal scaling for various Metropolis–Hastings algorithms. Statist. Sci. 16 351–367. MR1888450
[26] STROOCK, D. W. (1993). Probability Theory, an Analytic View. Cambridge Univ. Press, Cambridge. MR1267569
[27] STUART, A. M. (2010). Inverse problems: A Bayesian perspective. Acta Numer. 19 451–559. MR2652785

J. C. MATTINGLY
DEPARTMENT OF MATHEMATICS
CENTER FOR THEORETICAL AND MATHEMATICAL SCIENCES
CENTER FOR NONLINEAR AND COMPLEX SYSTEMS
AND DEPARTMENT OF STATISTICAL SCIENCES
DUKE UNIVERSITY
DURHAM, NORTH CAROLINA 27708-0251
USA
E-MAIL: [email protected]

N. S. PILLAI
DEPARTMENT OF STATISTICS
HARVARD UNIVERSITY
CAMBRIDGE, MASSACHUSETTS 02138
USA
E-MAIL: [email protected]

A. M. STUART
MATHEMATICS INSTITUTE
WARWICK UNIVERSITY
CV4 7AL
UNITED KINGDOM
E-MAIL: [email protected]