A diffusion process associated with Fréchet means


DOI:10.1214/14-AAP1066

©Institute of Mathematical Statistics, 2015


BY HUILING LE

University of Nottingham

This paper studies rescaled images, under $\exp_\mu^{-1}$, of the sample Fréchet means of i.i.d. random variables $\{X_k \mid k\ge 1\}$ with Fréchet mean $\mu$ on a Riemannian manifold. We show that, with appropriate scaling, these images converge weakly to a diffusion process. Similar to the Euclidean case, this limiting diffusion is a Brownian motion up to a linear transformation. However, in addition to the covariance structure of $\exp_\mu^{-1}(X_1)$, this linear transformation also depends on the global Riemannian structure of the manifold.

1. Introduction. It has become increasingly common in various research areas for statistical analysis to involve data that lie in non-Euclidean spaces. One such example is the statistical analysis of shape; cf. [4] and [7]. Consequently, many statistical concepts and techniques have been generalised and developed to adapt to such phenomena.

Fréchet means, as a generalisation of Euclidean means, of random variables on a metric space have been widely used for statistical analysis of non-Euclidean data. A point $\mu$ in a metric space $M$ with distance function $\rho$ is called a Fréchet mean of a random variable $X$ on $M$ if it satisfies

$$\mu = \arg\min_{x\in M} E\bigl[\rho(x, X)^2\bigr].$$
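The definition can be made concrete on the simplest curved-looking example, the unit circle with arc-length distance. The following is a minimal sketch, not taken from the paper: the function names `wrap`, `frechet_objective` and `frechet_mean_circle` and the sample angles are illustrative assumptions, and the fixed-point iteration shown only finds a local minimiser of the sample objective.

```python
import math

def wrap(angle):
    """Map an angle (radians) to the interval (-pi, pi]."""
    a = math.fmod(angle, 2.0 * math.pi)
    if a > math.pi:
        a -= 2.0 * math.pi
    elif a <= -math.pi:
        a += 2.0 * math.pi
    return a

def frechet_objective(x, points):
    """Sample version of E[rho(x, X)^2] for the arc-length distance on the circle."""
    return sum(wrap(p - x) ** 2 for p in points) / len(points)

def frechet_mean_circle(points, start, steps=100):
    """Fixed-point iteration x <- x + mean(log_x(p)); returns a local minimiser."""
    x = start
    for _ in range(steps):
        x = wrap(x + sum(wrap(p - x) for p in points) / len(points))
    return x

# Hypothetical data: 6.2 rad lies just below 0 on the circle.
points = [0.1, 0.2, 0.3, 6.2]
mu_hat = frechet_mean_circle(points, start=points[0])
naive = sum(points) / len(points)  # arithmetic mean of the raw angles
```

In this example the sample Fréchet mean stays close to the cluster near angle 0, whereas the arithmetic mean of the raw angles does not, which is one way to see why the minimisation is stated in terms of the distance $\rho$ rather than coordinates.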

Influenced by the structure of the underlying spaces, Fréchet means, unlike their Euclidean counterparts, exhibit many challenging probabilistic and statistical features. Various aspects of Fréchet means have been studied for non-Euclidean spaces, including Riemannian manifolds and certain stratified spaces. Among others, the strong law of large numbers for Fréchet means on general metric spaces was obtained in [11]. The first use of Fréchet means to provide nonparametric statistical inference, such as confidence regions and two-sample tests for discriminating between two distributions, was carried out in [2] and [3] for both extrinsic and intrinsic inference applied to manifolds. When $M$ is a Riemannian manifold with the distance function being that induced by its Riemannian metric, the results on central limit theorems for Fréchet means can be found in [3] and [8]. The results in both papers imply that, since manifolds are locally homeomorphic to Euclidean spaces, the limiting distributions for sample Fréchet means on Riemannian manifolds are usually Gaussian, a phenomenon similar to that for Euclidean means.

Received July 2013; revised September 2014.

MSC2010 subject classifications. 60D05, 60F05.

Key words and phrases. Limiting diffusion, rescaled Fréchet means, weak convergence.


In the case of Euclidean space, the link between the sample means of i.i.d. random vectors and random walks leads to the fact that the rescaled sample means converge weakly to Brownian motion, possibly up to a linear transformation associated with the covariance structure of the random vectors. On the other hand, the authors of [1] constructed a stochastic gradient algorithm from a given sequence of i.i.d. random variables on a Riemannian manifold where, under certain conditions, the random sequence resulting from the algorithm converges almost surely to the Fréchet mean $\mu$ of the given random variables. Moreover, they showed that, if one rescales the images, under $\exp_\mu^{-1}$, of the random walks associated with the algorithm, they converge weakly to an inhomogeneous diffusion process on the tangent space of the manifold at $\mu$. That paper raises the following questions: if one rescales the images, under $\exp_\mu^{-1}$, of the sample Fréchet means of the random variables, will they converge weakly? If they do, do they converge to the same diffusion process as the one given in [1]? If not, what is the limiting diffusion process? This paper addresses these questions. We show that the rescaled images of the sample Fréchet means of i.i.d. random variables $\{X_k \mid k\ge 1\}$ on a Riemannian manifold converge weakly to a diffusion process which is a Brownian motion up to a linear transformation. Moreover, in addition to the covariance structure of $\exp_\mu^{-1}(X_1)$, this linear transformation also depends on the global Riemannian structure of the manifold. For this we first, in the next section, construct a sequence of simpler inhomogeneous Markov processes, each of which is also a martingale, and consider the behaviour of their weak convergence. In addition to their intrinsic interest, the results in that section also form a basis for our investigations of “rescaled” sample Fréchet means in the following section. In particular, we relate the constructed sequence of processes to the “rescaled” sample Fréchet means in such a way that the result for the latter is a direct consequence of the former.

2. An auxiliary weakly convergent sequence of Markov chains. Let $M$ be a complete Riemannian manifold of dimension $d$ with covariant derivative $D$ and Riemannian distance $\rho$, whose sectional curvature is bounded below by $\kappa_0\le 0$ and above by $\kappa_1\ge 0$. For any $x\in M$, we denote by $C_x$ the cut locus of $x$. Note that, for any fixed $x_0$, the squared distance function $\rho(x_0, x)^2$ to $x_0$ is not $C^2$ on $C_{x_0}$.

For a fixed $y\in M$, consider the vector field on $M\setminus C_y$ defined, at $x\notin C_y$, by $\exp_x^{-1}(y)\in\tau_x(M)$, where $\tau_x(M)$ denotes the tangent space of $M$ at $x$, and then define the linear operator $H_{x,y}$ on the tangent space $\tau_x(M)$ by

$$H_{x,y}: v\mapsto -\bigl(D_v \exp_x^{-1}(y)\bigr)(x). \tag{1}$$

The operator $H_{x,y}$ so defined will play an important role in the following study of the asymptotic behaviour of sample Fréchet means on $M$. Note first that $H_{x,y}$ is closely linked with $\operatorname{Hess}(\tfrac12\rho(x, y)^2)$, the Hessian of the function $\tfrac12\rho(x, y)^2$, as follows (cf. [6], page 145):

$$\bigl\langle H_{x,y}(v), u\bigr\rangle = \operatorname{Hess}_x\bigl(\tfrac12\rho(x, y)^2\bigr)(u, v)$$

for any $x\notin C_y$ and any tangent vectors $u, v\in\tau_x(M)$, and so the assumption on the bounds for the sectional curvature of $M$ implies that, for any unit tangent vector $v\in\tau_x(M)$,

$$\sqrt{\kappa_1}\,\rho(x, y)\cot\bigl(\sqrt{\kappa_1}\,\rho(x, y)\bigr) \le \bigl\langle H_{x,y}(v), v\bigr\rangle \le \sqrt{-\kappa_0}\,\rho(x, y)\coth\bigl(\sqrt{-\kappa_0}\,\rho(x, y)\bigr), \tag{2}$$

where we require that, if $\kappa_1>0$, $\sqrt{\kappa_1}\,\rho(x, y) < \pi/2$ for the first inequality to hold; cf. [6], page 203.
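To get a feel for the size of the two comparison bounds in (2), the following minimal sketch evaluates them as functions of the curvature bound and the distance. The function names `lower_bound` and `upper_bound` are illustrative assumptions, and the value 1 at argument zero is the usual limit of $s\cot s$ and $s\coth s$ as $s\to 0$, which covers the flat case $\kappa_0=\kappa_1=0$.

```python
import math

def lower_bound(kappa1, rho):
    """sqrt(kappa1)*rho*cot(sqrt(kappa1)*rho); equals 1 in the limit of zero argument."""
    s = math.sqrt(kappa1) * rho
    if s == 0.0:
        return 1.0
    if s >= math.pi / 2:
        raise ValueError("the bound is stated only for sqrt(kappa1)*rho < pi/2")
    return s / math.tan(s)

def upper_bound(kappa0, rho):
    """sqrt(-kappa0)*rho*coth(sqrt(-kappa0)*rho); equals 1 in the limit of zero argument."""
    s = math.sqrt(-kappa0) * rho
    if s == 0.0:
        return 1.0
    return s / math.tanh(s)

# Hypothetical curvature bounds kappa0 = -1 <= sectional curvature <= kappa1 = 1.
for rho in (0.1, 0.5, 1.0, 1.5):
    print(rho, lower_bound(1.0, rho), upper_bound(-1.0, rho))
```

Both bounds are close to 1 for small distances and spread apart as $\rho(x,y)$ grows, so under these bounds $H_{x,y}$ behaves like the identity near the diagonal.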

In contrast to Euclidean means, there is generally no closed form for Fréchet means. On the other hand, the result of [9] implies that the Euclidean random variable $\exp_\mu^{-1}(X)$ is almost surely defined, where $\mu$ is a Fréchet mean of the random variable $X$ on $M$. Then, since

$$\exp_x^{-1}(y) = -\tfrac12\operatorname{grad}_1\rho(x, y)^2,$$

where $\operatorname{grad}_1$ denotes the gradient operator acting on the first argument of a function on $M\times M$, and since

$$\operatorname{grad} E\bigl[\rho(x, X)^2\bigr]\big|_{x=\mu} = E\bigl[\operatorname{grad}_1\rho(x, X)^2\big|_{x=\mu}\bigr] = 0$$

by the definition of Fréchet means, the Fréchet mean $\mu$ satisfies the condition that

$$E\bigl[\exp_\mu^{-1}(X)\bigr] = 0. \tag{3}$$

Thus $\mu$ is linked to the Euclidean mean in the sense that the origin of the tangent space of $M$ at $\mu$, $\tau_\mu(M)$, is the Euclidean mean of the Euclidean random variable $\exp_\mu^{-1}(X)$.
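Condition (3) can be checked numerically for an empirical distribution on the unit sphere in $\mathbb{R}^3$. The sketch below is an illustration under stated assumptions rather than anything from the paper: the helper names (`sphere_log`, `sphere_exp`, `mean_log`, `frechet_mean_sphere`) and the four sample points are hypothetical, and the iteration $x \leftarrow \exp_x(\text{mean of } \exp_x^{-1}(X_i))$ is a standard fixed-point scheme that is only expected to converge for sufficiently concentrated data.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]

def sphere_log(x, y):
    """Inverse exponential map exp_x^{-1}(y) on the unit sphere (y not antipodal to x)."""
    c = max(-1.0, min(1.0, dot(x, y)))
    theta = math.acos(c)                    # geodesic distance rho(x, y)
    u = [b - c * a for a, b in zip(x, y)]   # component of y orthogonal to x
    norm_u = math.sqrt(dot(u, u))
    if norm_u < 1e-15:
        return [0.0, 0.0, 0.0]
    return [theta * a / norm_u for a in u]

def sphere_exp(x, v):
    """Exponential map exp_x(v) on the unit sphere."""
    t = math.sqrt(dot(v, v))
    if t < 1e-15:
        return list(x)
    return normalize([math.cos(t) * a + math.sin(t) * b / t for a, b in zip(x, v)])

def mean_log(x, points):
    """Sample mean of exp_x^{-1}(X_i), a vector tangent to the sphere at x."""
    logs = [sphere_log(x, p) for p in points]
    return [sum(l[i] for l in logs) / len(points) for i in range(3)]

def frechet_mean_sphere(points, steps=200):
    x = normalize(points[0])
    for _ in range(steps):
        x = sphere_exp(x, mean_log(x, points))
    return x

# Hypothetical, well-concentrated sample points.
points = [normalize(p) for p in ([1, 0.2, 0.1], [1, -0.3, 0.2], [1, 0.1, -0.4], [1, 0.3, 0.3])]
mu_hat = frechet_mean_sphere(points)
residual = mean_log(mu_hat, points)  # sample analogue of E[exp_mu^{-1}(X)]
```

At the computed point the vector `residual` is numerically zero, which is the empirical counterpart of (3).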

Let $\{X_k \mid k\ge 1\}$ be a sequence of i.i.d. random variables on $M$, and for a fixed $x_0\in M$, assume that $E[\rho(x_0, X_1)^2]<\infty$. This assumption ensures the existence of Fréchet means of $X_1$. For simplicity, in the following, we shall assume that the Fréchet mean $\mu$ of $X_1$ is unique. However, we do not require the support of the probability measure of $X_1$ to be contained in any geodesic ball. Note that the result of [9] ensures that $P(X_1\in C_\mu)=0$ under some mild condition on $M$. We further assume that

$$E\bigl[\|H_{\mu,X_1}\|^2\bigr]<\infty \quad\text{and}\quad \bigl(E[H_{\mu,X_1}]\bigr)^{-1},\ \text{the inverse of the linear operator } E[H_{\mu,X_1}],\ \text{exists.} \tag{4}$$

These two assumptions ensure that the linear operator $E[H_{\mu,X_1}]$ is well defined and nonsingular.

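For each fixed $n\ge 1$, consider the time-inhomogeneous Markov chain $\{V_k^n \mid k\ge 0\}$ defined on the tangent space $\tau_\mu(M)$ in terms of $\{X_k \mid k\ge 1\}$ as

$$V_0^n = 0,$$

$$V_1^n = \frac{1}{\sqrt n}\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_1),$$

$$V_{k+1}^n = \frac{1}{\sqrt n}\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_{k+1}) + \Bigl\{\frac{k+1}{k}\,I - \frac{1}{k}\bigl(E[H_{\mu,X_1}]\bigr)^{-1}H_{\mu,X_{k+1}}\Bigr\}V_k^n, \qquad k\ge 1.$$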

One may check that $\{V_k^n \mid k\ge 0\}$ is a martingale. We are interested in the asymptotic behaviour of $\{V^n_{[nt]} \mid t\ge 0\}$ as $n$ tends to infinity. To this end, the following lemma first gives an upper bound for the sequence $\{V^n_{[\varepsilon_0 n]} \mid n\ge 1\}$ for $\varepsilon_0>0$.
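The recursion defining $V_k^n$ can be sketched in a few lines once the tangent space $\tau_\mu(M)$ is identified with $\mathbb{R}^d$. The code below is an illustrative sketch only: the names `mat_vec` and `run_chain` and the inputs `logs`, `hessians`, `EH_inv` are assumptions standing in for $\exp_\mu^{-1}(X_{k+1})$, $H_{\mu,X_{k+1}}$ and $(E[H_{\mu,X_1}])^{-1}$. As a sanity check it uses the Euclidean case, where $\exp_x^{-1}(y)=y-x$ gives $H_{x,y}=I$ and the recursion collapses to the rescaled random walk $V^n_{k+1}=V^n_k+(X_{k+1}-\mu)/\sqrt n$.

```python
import math
import random

def mat_vec(A, v):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def run_chain(n, logs, hessians, EH_inv):
    """Return [V_0, V_1, ..., V_K] for the recursion, with K = len(logs).

    logs[k]      stands for exp_mu^{-1}(X_{k+1}) as a vector in R^d,
    hessians[k]  stands for H_{mu, X_{k+1}} as a d x d matrix,
    EH_inv       stands for (E[H_{mu, X_1}])^{-1}.
    """
    d = len(logs[0])
    path = [[0.0] * d]                      # V_0 = 0
    for k, (e, H) in enumerate(zip(logs, hessians)):
        noise = [c / math.sqrt(n) for c in mat_vec(EH_inv, e)]
        if k == 0:
            V_next = noise                  # V_1
        else:
            V = path[-1]
            HV = mat_vec(EH_inv, mat_vec(H, V))
            V_next = [noise[i] + (k + 1) / k * V[i] - HV[i] / k for i in range(d)]
        path.append(V_next)
    return path

# Euclidean sanity check with hypothetical standard normal increments.
random.seed(0)
n, d = 50, 2
identity = [[1.0, 0.0], [0.0, 1.0]]
logs = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
path = run_chain(n, logs, [identity] * n, identity)
walk_end = [sum(e[i] for e in logs) / math.sqrt(n) for i in range(d)]
```

The martingale property can be read off the same formula: averaging over $X_{k+1}$ replaces the noise term by zero, by (3), and $(E[H_{\mu,X_1}])^{-1}H_{\mu,X_{k+1}}$ by $I$, leaving $\bigl(\tfrac{k+1}{k}-\tfrac1k\bigr)V_k^n=V_k^n$.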

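LEMMA 1. Suppose that the assumptions (4) hold. Then there is a constant $c_0>0$ such that, for any $\varepsilon_0>0$ and $n_0>0$,

$$\sup_{n\ge n_0} E\bigl[|V^n_{[\varepsilon_0 n]}|^2\bigr] \le \alpha\,E\bigl[\rho(\mu, X_1)^2\bigr]\Bigl\{\frac{1}{n_0} + \varepsilon_0 c_0\Bigr\},$$

where $\alpha = \|(E[H_{\mu,X_1}])^{-1}\|^2$.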

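PROOF. Write

$$B = E\Bigl[H_{\mu,X_1}^{\top}\bigl(E[H_{\mu,X_1}]\bigr)^{-\top}\bigl(E[H_{\mu,X_1}]\bigr)^{-1}H_{\mu,X_1}\Bigr].$$

Then, by the definition of $V_k^n$,

$$E\bigl[|V_{k+1}^n|^2 \,\big|\, V_k^n\bigr] = \frac1n E\Bigl[\bigl|\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_1)\bigr|^2\Bigr] + \Bigl\langle V_k^n, \Bigl\{\frac{k^2-1}{k^2}\,I + \frac{1}{k^2}B\Bigr\}V_k^n\Bigr\rangle + \frac{2}{\sqrt n}E\Bigl[\Bigl\langle\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_{k+1}),\, G_{k+1}V_k^n\Bigr\rangle \,\Big|\, V_k^n\Bigr],$$

where

$$G_{k+1} = \frac{k+1}{k}\,I - \frac1k\bigl(E[H_{\mu,X_1}]\bigr)^{-1}H_{\mu,X_{k+1}}.$$

Under the given conditions, there is a constant $\beta\ge 0$ such that, for any $v\in\tau_\mu(M)$, $\langle v, Bv\rangle \le (\beta+1)|v|^2$. Thus, using the facts that $E[\exp_\mu^{-1}(X_1)] = 0$ and that $X_{k+1}$ is independent of $V_k^n$, we have

$$E\bigl[|V_{k+1}^n|^2 \,\big|\, V_k^n\bigr] \le \frac{\alpha}{n}E\bigl[\rho(\mu, X_1)^2\bigr] + \Bigl(1+\frac{\beta}{k^2}\Bigr)|V_k^n|^2 - \frac{2}{\sqrt n\,k}\Bigl\langle E\Bigl[H_{\mu,X_{k+1}}^{\top}\bigl(E[H_{\mu,X_1}]\bigr)^{-\top}\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_{k+1})\Bigr],\, V_k^n\Bigr\rangle.$$

Noting that $\{V_k^n \mid k\ge 0\}$ is a martingale with zero expectation, the above implies that

$$E\bigl[|V_{k+1}^n|^2\bigr] \le \frac{\alpha}{n}E\bigl[\rho(\mu, X_1)^2\bigr] + \Bigl(1+\frac{\beta}{k^2}\Bigr)E\bigl[|V_k^n|^2\bigr].$$

Hence, by induction, we have

$$E\bigl[|V_k^n|^2\bigr] \le \frac{\alpha}{n}E\bigl[\rho(\mu, X_1)^2\bigr]\Bigl\{1+\sum_{i=1}^{k}\prod_{j=i}^{k}\Bigl(1+\frac{\beta}{j^2}\Bigr)\Bigr\}.$$

Since

$$\sum_{i=1}^{k}\prod_{j=i}^{k}\Bigl(1+\frac{\beta}{j^2}\Bigr) \le k\prod_{j=1}^{k}\Bigl(1+\frac{\beta}{j^2}\Bigr),$$

the above implies, in particular, that

$$E\bigl[|V^n_{[\varepsilon_0 n]}|^2\bigr] \le \alpha\,E\bigl[\rho(\mu, X_1)^2\bigr]\Bigl\{\frac1n + \varepsilon_0\prod_{j=1}^{\infty}\Bigl(1+\frac{\beta}{j^2}\Bigr)\Bigr\}.$$

The required result then follows from the fact that $\prod_{j=1}^{\infty}(1+\beta/j^2) < \infty$.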
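The finiteness of the infinite product used in the last step can be seen from the elementary bound $\prod_j(1+\beta/j^2)\le\exp(\beta\sum_j j^{-2})=\exp(\beta\pi^2/6)$, and the product in fact has the closed form $\sinh(\pi\sqrt\beta)/(\pi\sqrt\beta)$ by Euler's product for the hyperbolic sine. The short sketch below, with the illustrative function names `partial_product` and `limit_product`, compares partial products with both expressions; the value $\beta=1$ is an arbitrary choice.

```python
import math

def partial_product(beta, terms):
    """Product of (1 + beta/j^2) for j = 1, ..., terms."""
    prod = 1.0
    for j in range(1, terms + 1):
        prod *= 1.0 + beta / (j * j)
    return prod

def limit_product(beta):
    """Closed form sinh(pi*sqrt(beta)) / (pi*sqrt(beta)) of the infinite product."""
    if beta == 0.0:
        return 1.0
    z = math.pi * math.sqrt(beta)
    return math.sinh(z) / z

beta = 1.0
approx = partial_product(beta, 100000)
exact = limit_product(beta)
crude = math.exp(beta * math.pi ** 2 / 6)  # elementary upper bound
```

Any finite upper bound for the product can serve as the constant $c_0$ in Lemma 1.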

The next lemma gives various bounds on the differences $V_{k+1}^n - V_k^n$ for sufficiently large $n$ and $k$.

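LEMMA 2. In addition to the assumptions in (4), assume that, for some $\delta>0$, $E[\rho(\mu, X_1)^{2+\delta}]<\infty$. Then, for any $\varepsilon_0>0$ and $r>0$, there are constants $c_1$, $c_2$, and $c_3$ depending on $\varepsilon_0$ and $r$ such that, when $n$ is sufficiently large, for $k\ge\varepsilon_0 n$ and for $v\in\tau_\mu(M)$ with $|v|\le r$:

(i) for any $\varepsilon>0$,

$$P\bigl(|V_{k+1}^n - V_k^n| > \varepsilon \,\big|\, V_k^n = v\bigr) \le \begin{cases}\dfrac{c_1}{\varepsilon^{2+\delta}}\,n^{-(1+\min\{1,\delta/2\})}, & \text{if } \varepsilon\le 1,\\[2ex] \dfrac{c_1}{\varepsilon^{2}}\,n^{-(1+\min\{1,\delta/2\})}, & \text{if } \varepsilon>1;\end{cases} \tag{5}$$

(ii)

$$E\bigl[|V_{k+1}^n - V_k^n|\,1_{\{|V_{k+1}^n - V_k^n|>1\}} \,\big|\, V_k^n = v\bigr] \le c_2\,n^{-(1+\min\{1/2,\delta/4\})}; \tag{6}$$

(iii)

$$\Bigl\|E\bigl[(V_{k+1}^n - V_k^n)(V_{k+1}^n - V_k^n)^{\top}\,1_{\{|V_{k+1}^n - V_k^n|>1\}} \,\big|\, V_k^n = v\bigr]\Bigr\| \le \cdots \tag{7}$$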

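PROOF. For any $\varepsilon>0$, write $\varepsilon' = \varepsilon\,\|E[H_{\mu,X_1}]\|$. Then, by the definition of $V_k^n$, we have

$$\begin{aligned} P\bigl(|V_{k+1}^n - V_k^n| > \varepsilon \,\big|\, V_k^n = v\bigr) &\le P\Bigl(\Bigl|\exp_\mu^{-1}(X_1) + \frac{\sqrt n}{k}\bigl(E[H_{\mu,X_1}] - H_{\mu,X_1}\bigr)(v)\Bigr| > \varepsilon'\sqrt n\Bigr)\\ &\le P\Bigl(\bigl|\exp_\mu^{-1}(X_1)\bigr| > \frac{\varepsilon'\sqrt n}{2} \ \text{ or } \ \bigl|\bigl(E[H_{\mu,X_1}] - H_{\mu,X_1}\bigr)(v)\bigr| > \frac{\varepsilon' k}{2}\Bigr)\\ &\le P\Bigl(\bigl|\exp_\mu^{-1}(X_1)\bigr| > \frac{\varepsilon'\sqrt n}{2}\Bigr) + P\Bigl(\bigl\|E[H_{\mu,X_1}] - H_{\mu,X_1}\bigr\| > \frac{\varepsilon' k}{2|v|}\Bigr).\end{aligned}$$

Thus if $v\ne 0$, it follows from Chebyshev’s inequality that

$$\begin{aligned} P\bigl(|V_{k+1}^n - V_k^n| > \varepsilon \,\big|\, V_k^n = v\bigr) &\le E\bigl[\rho(\mu, X_1)^{2+\delta}\bigr]\frac{2^{2+\delta}}{(\varepsilon'\sqrt n)^{2+\delta}} + \operatorname{var}\bigl(H_{\mu,X_1}\bigr)\frac{(2|v|)^2}{(\varepsilon' k)^2}\\ &\le E\bigl[\rho(\mu, X_1)^{2+\delta}\bigr]\frac{2^{2+\delta}}{(\varepsilon'\sqrt n)^{2+\delta}} + E\bigl[\|H_{\mu,X_1}\|^2\bigr]\frac{(2|v|)^2}{(\varepsilon' k)^2},\end{aligned}$$

when $n$ is sufficiently large. Note that the assumption that $v\ne 0$ implies that $k\ge 1$. If $v=0$, a modified argument will show that the above still holds for $k\ge 1$. Hence, (5) follows.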

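Similarly, using the definition of $V_k^n$, we have

$$E\bigl[|V_{k+1}^n - V_k^n|^2 \,\big|\, V_k^n = v\bigr] \le \frac2n E\Bigl[\bigl|\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_1)\bigr|^2\Bigr] + \frac{2}{k^2}E\Bigl[\bigl|\bigl(I - \bigl(E[H_{\mu,X_1}]\bigr)^{-1}H_{\mu,X_1}\bigr)v\bigr|^2\Bigr].$$

Thus, under the given conditions, result (i) also implies that, for any $r>0$ and some constant $c_2$ depending on $\varepsilon_0$ and $r$, we have

$$\begin{aligned} E\bigl[|V_{k+1}^n - V_k^n|\,1_{\{|V_{k+1}^n - V_k^n|>1\}} \,\big|\, V_k^n = v\bigr] &\le E\bigl[|V_{k+1}^n - V_k^n|^2 \,\big|\, V_k^n = v\bigr]^{1/2}\,P\bigl(|V_{k+1}^n - V_k^n| > 1 \,\big|\, V_k^n = v\bigr)^{1/2}\\ &\le c_2\,n^{-(1+\min\{1/2,\delta/4\})},\end{aligned}$$

for $k\ge\varepsilon_0 n$, for sufficiently large $n$ and for all $v\in\tau_\mu(M)$ such that $|v|\le r$, so that (6) holds.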

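To show (7), we note that there are positive constants $a$, $b$, $c$ independent of $n$ and $k$ such that, for given $V_k^n = v$,

$$\bigl\|(V_{k+1}^n - V_k^n)(V_{k+1}^n - V_k^n)^{\top}\bigr\| \le \frac an\bigl|\exp_\mu^{-1}(X_1)\bigr|^2 + \frac{b}{k^2}\bigl(c + \|H_{\mu,X_1}\|^2\bigr)|v|^2.$$

Thus, by result (i),

$$\begin{aligned} &\Bigl\|E\bigl[(V_{k+1}^n - V_k^n)(V_{k+1}^n - V_k^n)^{\top}\,1_{\{|V_{k+1}^n - V_k^n|>1\}} \,\big|\, V_k^n = v\bigr]\Bigr\|\\ &\qquad\le \frac an E\bigl[\bigl|\exp_\mu^{-1}(X_1)\bigr|^2\,1_{\{|V_{k+1}^n - V_k^n|>1\}} \,\big|\, V_k^n = v\bigr] + \frac{b}{k^2}E\bigl[c + \|H_{\mu,X_1}\|^2\bigr]|v|^2\\ &\qquad\le \frac an E\bigl[\bigl|\exp_\mu^{-1}(X_1)\bigr|^{2+\delta}\bigr]^{2/(2+\delta)}\,P\bigl(|V_{k+1}^n - V_k^n| > 1 \,\big|\, V_k^n = v\bigr)^{\delta/(2+\delta)} + \frac{b}{k^2}E\bigl[c + \|H_{\mu,X_1}\|^2\bigr]|v|^2\\ &\qquad\le \frac an E\bigl[\bigl|\exp_\mu^{-1}(X_1)\bigr|^{2+\delta}\bigr]^{2/(2+\delta)}\times\frac{c'}{n^{\delta/(2+\delta)}} + \frac{b}{k^2}E\bigl[c + \|H_{\mu,X_1}\|^2\bigr]|v|^2\end{aligned}$$

for some constant $c'>0$ dependent on $|v|$, so that the required result follows.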

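COROLLARY. Under the assumptions of Lemma 2, for any $\varepsilon_0>0$ and $r>0$, the following limits hold uniformly in $k\ge\varepsilon_0 n$:

(i) for any $\varepsilon>0$,

$$\lim_{n\to\infty}\sup_{|v|\le r} n\,P\bigl(|V_{k+1}^n - V_k^n| > \varepsilon \,\big|\, V_k^n = v\bigr) = 0;$$

(ii)

$$\lim_{n\to\infty}\sup_{|v|\le r} n\,\Bigl|E\bigl[(V_{k+1}^n - V_k^n)\,1_{\{|V_{k+1}^n - V_k^n|\le 1\}} \,\big|\, V_k^n = v\bigr]\Bigr| = 0;$$

(iii)

$$\lim_{n\to\infty}\sup_{|v|\le r}\Bigl\|n\,E\bigl[(V_{k+1}^n - V_k^n)(V_{k+1}^n - V_k^n)^{\top}\,1_{\{|V_{k+1}^n - V_k^n|\le 1\}} \,\big|\, V_k^n = v\bigr] - A\Bigr\| = 0,$$

where $A = (E[H_{\mu,X_1}])^{-1}\,\Sigma\,(E[H_{\mu,X_1}])^{-\top}$ and

$$\Sigma = \operatorname{cov}\bigl(\exp_\mu^{-1}(X_1)\bigr) = E\bigl[\exp_\mu^{-1}(X_1)\otimes\exp_\mu^{-1}(X_1)\bigr]. \tag{8}$$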

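PROOF. By (5), for any $k\ge\varepsilon_0 n$,

$$P\bigl(|V_{k+1}^n - V_k^n| > \varepsilon \,\big|\, V_k^n = v\bigr) \le E\bigl[\rho(\mu, X_1)^{2+\delta}\bigr]\frac{2^{2+\delta}}{(\varepsilon'\sqrt n)^{2+\delta}} + E\bigl[\|H_{\mu,X_1}\|^2\bigr]\frac{(4|v|)^2}{(\varepsilon'\varepsilon_0 n)^2},$$

when $n$ is sufficiently large, where $\varepsilon' = \varepsilon\,\|E[H_{\mu,X_1}]\|$. Thus (i) holds. Noting that $E[V_{k+1}^n \mid V_k^n] = V_k^n$, (ii) follows from (6). Since

$$\operatorname{cov}\bigl(V_{k+1}^n \,\big|\, V_k^n\bigr) = \operatorname{cov}\Bigl(\frac{1}{\sqrt n}\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_1)\Bigr) = \frac1n A,$$

(iii) is equivalent to

$$\lim_{n\to\infty}\sup_{|v|\le r} n\,\Bigl\|E\bigl[(V_{k+1}^n - V_k^n)(V_{k+1}^n - V_k^n)^{\top}\,1_{\{|V_{k+1}^n - V_k^n|>1\}} \,\big|\, V_k^n = v\bigr]\Bigr\| = 0,$$

which follows from (7).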

The properties that we have obtained so far on $\{V_k^n \mid k\ge 0\}$ enable us to prove the weak convergence of $\{V^n_{[nt]} \mid t\ge 0\}$ as follows.

PROPOSITION. In addition to the assumptions in (4), assume that, for some $\delta>0$, $E[\rho(\mu, X_1)^{2+\delta}]<\infty$. Then the sequence of processes $\{V^n_{[nt]} \mid t\ge 0\}$ converges weakly in $D([0,\infty), \tau_\mu(M))$, the space of right continuous functions with left limits on the tangent space of $M$ at $\mu$, to $\{V_t \mid t\ge 0\}$ as $n\to\infty$, where $V_t$ is the solution of the stochastic differential equation

$$dV_t = \Bigl\{\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\,\Sigma\,\bigl(E[H_{\mu,X_1}]\bigr)^{-\top}\Bigr\}^{1/2}dB_t \tag{9}$$

with $V_0 = 0$, $B_t$ a standard Brownian motion in $\mathbb{R}^d$ and $\Sigma$ is defined by (8).
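As an illustration of how the diffusion coefficient in (9) is assembled, the sketch below computes the matrix $A=(E[H_{\mu,X_1}])^{-1}\Sigma(E[H_{\mu,X_1}])^{-\top}$ and its symmetric square root for $d=2$. Everything specific here is an assumption made for the example: the numerical values of `EH` and `Sigma` are hypothetical, the helper names are not from the paper, and the square root uses the closed form $(M+\sqrt{\det M}\,I)/\sqrt{\operatorname{tr}M+2\sqrt{\det M}}$, which is valid only for symmetric positive-definite $2\times 2$ matrices.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def sqrt_spd_2x2(M):
    """Principal square root of a symmetric positive-definite 2x2 matrix."""
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])  # sqrt(det M)
    t = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)            # sqrt(tr M + 2 sqrt(det M))
    return [[(M[0][0] + s) / t, M[0][1] / t], [M[1][0] / t, (M[1][1] + s) / t]]

def diffusion_coefficient(EH, Sigma):
    """Return A = EH^{-1} Sigma EH^{-T} and its square root."""
    EH_inv = inverse_2x2(EH)
    A = mat_mul(mat_mul(EH_inv, Sigma), transpose(EH_inv))
    return A, sqrt_spd_2x2(A)

# Hypothetical values standing in for E[H_{mu,X_1}] and cov(exp_mu^{-1}(X_1)).
EH = [[1.0, 0.0], [0.0, 0.8]]
Sigma = [[0.5, 0.1], [0.1, 0.3]]
A, root_A = diffusion_coefficient(EH, Sigma)
```

In the Euclidean case `EH` is the identity, so `A` reduces to `Sigma` and the limit is the usual Brownian motion with covariance $\Sigma$; any departure of `EH` from the identity reflects the geometry of the manifold.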

PROOF. Let $\tilde V_k^n = (k/n, V_k^n)$. Then $\{\tilde V_k^n \mid k\ge 0\}$ is a time-homogeneous Markov chain. For each $n\ge 1$, write $P_n$ for the transition probability distribution associated with $\{\tilde V_k^n \mid k\ge 0\}$, that is,

$$P_n\Bigl(\Bigl(\frac ln, v\Bigr), B\Bigr) = P\Bigl(\Bigl(\frac{l+1}{n}, V_{l+1}^n\Bigr)\in B \,\Big|\, V_l^n = v\Bigr),$$

where $B$ is any Borel set in $(0,\infty)\times\tau_\mu(M)$.

For any $\varepsilon_0>0$, the result of Lemma 1 implies that $\{\tilde V^n_{[\varepsilon_0 n]} \mid n\ge 1\}$ is tight. Hence, there is a subsequence $\{\tilde V^{n_j}_{[\varepsilon_0 n_j]} \mid j\ge 1\}$ that converges weakly in $\tau_\mu(M)$ to a random variable $\tilde\xi_{\varepsilon_0} = (\varepsilon_0, \xi_{\varepsilon_0})$. Then it follows from Corollary 7.4.2 in [5] (pages 355–356) that the results of the Corollary imply that the sequence of processes $\{\tilde V^{n_j}_{[n_j t]} \mid t\ge\varepsilon_0\}$ converges weakly in $D([\varepsilon_0,\infty), [\varepsilon_0,\infty)\times\tau_\mu(M))$ to a diffusion $\{\tilde V_t \mid t\ge\varepsilon_0\}$, where $\tilde V_t = (t, V_t)$ with the initial condition that $\tilde V_{\varepsilon_0}$ has the same distribution as $\tilde\xi_{\varepsilon_0}$ and where $V_t$ satisfies the stochastic differential equation (9). This implies (cf. [5], page 355) that $\{V^{n_j}_{[n_j t]} \mid t\ge\varepsilon_0\}$ converges weakly in $D([\varepsilon_0,\infty), \tau_\mu(M))$ to $\{V_t \mid t\ge\varepsilon_0\}$, where $V_{\varepsilon_0}$ has the same distribution as $\xi_{\varepsilon_0}$.

To show the required result, it is now sufficient to show that, for any subsequence of $\{V^n_{[nt]} \mid t\ge 0\}$, there is a further subsequence which converges weakly to $\{V_t \mid t\ge 0\}$. Without loss of generality, we may rename the subsequence as $\{V^n_{[nt]} \mid t\ge 0\}$ and apply the above to $\varepsilon_0 = 1/m$. For each $m\ge 1$, this gives a subsequence $\{V^{n_j}_{[n_j t]} \mid t\ge 0\}$, indexed by $m$, of $\{V^n_{[nt]} \mid t\ge 0\}$ […] subsequences, and we then take the diagonal subsequence. The diagonal subsequence converges weakly in $D((0,\infty), \tau_\mu(M))$ to $\{V'_t \mid t>0\}$. However, the result of Lemma 1 shows that $E[|V'_t|^2]\to 0$ as $t\to 0$ and so the required result follows by noting that $\{V'_t \mid t\ge 0\}$ must be equal in law to $\{V_t \mid t\ge 0\}$.

3. The main result. We now return to consider the sample Fréchet means of $\{X_k \mid k\ge 1\}$. For this, we denote by $\mu_k$ a sample Fréchet mean of $X_1,\dots,X_k$ for each $k$, so that $\mu_k$ converges to $\mu$ almost surely (cf. [11]). It follows from (3) that $\mu_k$ satisfies the condition

$$\frac1k\sum_{i=1}^{k}\exp_{\mu_k}^{-1}(X_i) = 0. \tag{10}$$

Thus the origin of the tangent space of $M$ at $\mu_k$, $\tau_{\mu_k}(M)$, is the sample Euclidean mean of the Euclidean random variables $\exp_{\mu_k}^{-1}(X_i)$, $i=1,\dots,k$. Nevertheless, although these relations resemble those for Euclidean means, these conditions are generally imposed on different tangent spaces, which makes it difficult to obtain a usable form of the relation between consecutive sample Fréchet means. Moreover, the usual difference “$\mu_k-\mu$” makes no sense here. However, in the context of manifolds, $\exp_\mu^{-1}(\mu_k)$ plays a similar role to $\mu_k-\mu$ in the Euclidean case. This leads us to consider, for each $n\ge 1$, the rescaled sequence

$$W_k^n = \frac{k}{\sqrt n}\exp_\mu^{-1}(\mu_k), \qquad k\ge 1. \tag{11}$$
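To make the rescaling in (11) tangible, the following sketch computes $W_k^n$ along a sample path on the unit circle, where $\exp_\mu^{-1}(y)$ is the signed angle from $\mu$ to $y$. It is an illustration under assumptions, not the paper's construction: the names `wrap`, `frechet_mean_circle` and `rescaled_means`, the true mean `mu = 0.0` and the four sample angles are all hypothetical. Because the circle is flat, for data concentrated near $\mu$ the sequence $W_k^n$ coincides with the rescaled random walk of the signed angles, in line with the Euclidean picture described in the Introduction.

```python
import math

def wrap(angle):
    """Signed representative of an angle in (-pi, pi], i.e. the log map on the circle."""
    return math.atan2(math.sin(angle), math.cos(angle))

def frechet_mean_circle(points, start, steps=100):
    """Fixed-point iteration x <- x + mean(log_x(p)); returns a local minimiser."""
    x = start
    for _ in range(steps):
        x = wrap(x + sum(wrap(p - x) for p in points) / len(points))
    return x

def rescaled_means(samples, mu, n):
    """Return [W_1^n, ..., W_K^n] with W_k^n = (k / sqrt(n)) * log_mu(mu_k)."""
    out = []
    mu_k = samples[0]
    for k in range(1, len(samples) + 1):
        mu_k = frechet_mean_circle(samples[:k], start=mu_k)  # sample Frechet mean of X_1..X_k
        out.append(k / math.sqrt(n) * wrap(mu_k - mu))
    return out

# Hypothetical sample of angles concentrated near mu = 0, with n = 4.
samples = [0.3, -0.1, 0.2, -0.4]
W = rescaled_means(samples, mu=0.0, n=4)
partial_sums = [sum(samples[:k]) / math.sqrt(4) for k in range(1, 5)]
```

On a curved manifold the sample Fréchet mean is no longer an average in a single tangent space, so no such exact identity with a random walk is available.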

It is clear from (10), which the sample Fréchet means must satisfy, that $\mu_{k+1}$ cannot generally be expected to be determined by $\mu_k$ and $X_{k+1}$ alone so that, in particular, $\{\mu_k \mid k\ge 1\}$, and so $\{W_k^n \mid k\ge 1\}$, is in general not a Markov chain. However, the following result shows that, for sufficiently large $n$ and $k$, the behaviour of $\{W_k^n \mid k\ge 1\}$ is close to that of a Markov chain.

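LEMMA 3. In addition to the assumptions in (4), assume that

$$\lim_{s\to 0}E\Bigl[\sup_{x\in\operatorname{ball}(\mu,s)}\bigl\|\Pi_{x,\mu}H_{x,X_1} - H_{\mu,X_1}\bigr\|\Bigr] = 0,$$

where $\Pi_{x,y}$ denotes the parallel transport from $x$ to $y$ along the geodesic between the two points. Then, for any $\varepsilon_0>0$, $r>0$, and $T>0$,

$$\sup_{\varepsilon_0\le t\le T\wedge\sigma_n^r}\Bigl|\bigl(W^n_{[nt]} - V^n_{[nt]}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\Bigr| \xrightarrow{\;P\;} 0 \quad\text{as } n\to\infty,$$

Note that, when $x$ is sufficiently close to $\mu$, the geodesic between the two points is unique so that the above parallel transport is well defined.

Note also that $\operatorname{Hess}_x(\tfrac12\rho(x, y)^2)$ is, as a mapping from $\tau_x(M)\times\tau_x(M)\to\mathbb{R}$, smooth with respect to $x$ if $y\notin C_x$ and, by (2), it is positive-definite provided $\sqrt{\kappa_1}\,\rho(x, y) < \pi/2$. Thus the relationship between $H_{x,y}$ and $\operatorname{Hess}_x(\tfrac12\rho(x, y)^2)$ ensures that all three assumptions required for Lemma 3 are satisfied if the support for the distribution of $X$ is a compact subset of the open ball $\operatorname{ball}(\mu, \pi/(2\sqrt{\kappa_1}))$.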

PROOF OF LEMMA 3. Define, for each given $k$, the random vector field $U_k$ on $M$ by

$$U_k(x) = \sum_{i=1}^{k}\exp_x^{-1}(X_i),$$

for $x\notin C_{X_1}\cup C_{X_2}\cup\cdots\cup C_{X_k}$. For each fixed $x$, $\frac1k U_k(x)$ is the sample Euclidean mean of the random variables $\exp_x^{-1}(X_1),\dots,\exp_x^{-1}(X_k)$. By hypothesis on $X_i$ and the result of [9], $U_k(\mu)$ is defined almost surely, and it follows from (3) that $E[U_k(\mu)] = 0$. Moreover, $\mu_k$ being a sample Fréchet mean of $X_1,\dots,X_k$ implies that $U_k(\mu_k) = 0$ almost surely. Using these facts and using parallel transport followed by Taylor’s expansion, Kendall and Le [8] show that

$$-\sum_{i=1}^{k}D_{\exp_\mu^{-1}(\mu_k)}\exp_\mu^{-1}(X_i) = U_k(\mu) + \Delta_k(\mu_k; X_1,\dots,X_k)\bigl(\exp_\mu^{-1}(\mu_k)\bigr), \tag{12}$$

where the correction operator $\Delta_k$ satisfies the condition that, for any given $\varepsilon>0$, there exists $s>0$ such that the ball, $\operatorname{ball}(\mu, s)$, that is centred at $\mu$ and with radius $s$ is contained in $M\setminus C_\mu$ and, for any $x$ in that ball,

$$\bigl\|\Delta_k(x; X_1,\dots,X_k)\bigr\| \le d\sum_{i=1}^{k}\Bigl\{(1+2\varepsilon s)\sup_{x\in\operatorname{ball}(\mu,s)}\bigl\|\Pi_{x,\mu}D\exp_x^{-1}(X_i) - D\exp_\mu^{-1}(X_i)\bigr\| + 2\varepsilon\bigl|\exp_\mu^{-1}(X_i)\bigr| + s\bigl\|D\exp_\mu^{-1}(X_i)\bigr\|\Bigr\}.$$

Thus, noting by (1) that

$$-\bigl(D_{\exp_\mu^{-1}(\mu_k)}U_k\bigr)(\mu) = \sum_{i=1}^{k}H_{\mu,X_i}\bigl(\exp_\mu^{-1}(\mu_k)\bigr),$$

we can rewrite (12) as

$$\Bigl\{\sum_{i=1}^{k}H_{\mu,X_i} - \Delta_k(\mu_k; X_1,\dots,X_k)\Bigr\}\bigl(\exp_\mu^{-1}(\mu_k)\bigr) = U_k(\mu), \tag{13}$$

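On the other hand,

$$\begin{aligned} &\Bigl\{\sum_{i=1}^{k+1}H_{\mu,X_i} - \Delta_{k+1}(\mu_{k+1}; X_1,\dots,X_{k+1})\Bigr\}\bigl(\exp_\mu^{-1}(\mu_k)\bigr)\\ &\qquad= \Bigl\{\sum_{i=1}^{k}H_{\mu,X_i} - \Delta_k(\mu_k; X_1,\dots,X_k)\Bigr\}\bigl(\exp_\mu^{-1}(\mu_k)\bigr) + H_{\mu,X_{k+1}}\bigl(\exp_\mu^{-1}(\mu_k)\bigr)\\ &\qquad\quad+ \bigl\{\Delta_k(\mu_k; X_1,\dots,X_k) - \Delta_{k+1}(\mu_{k+1}; X_1,\dots,X_{k+1})\bigr\}\bigl(\exp_\mu^{-1}(\mu_k)\bigr)\\ &\qquad= U_k(\mu) + H_{\mu,X_{k+1}}\bigl(\exp_\mu^{-1}(\mu_k)\bigr) + R(X_1,\dots,X_{k+1})\bigl(\exp_\mu^{-1}(\mu_k)\bigr),\end{aligned}$$

where

$$R(X_1,\dots,X_{k+1}) = \Delta_k(\mu_k; X_1,\dots,X_k) - \Delta_{k+1}(\mu_{k+1}; X_1,\dots,X_{k+1}).$$

This, together with (13), gives

$$\begin{aligned} &\Bigl\{\sum_{i=1}^{k+1}H_{\mu,X_i} - \Delta_{k+1}(\mu_{k+1}; X_1,\dots,X_{k+1})\Bigr\}\bigl(\exp_\mu^{-1}(\mu_{k+1}) - \exp_\mu^{-1}(\mu_k)\bigr)\\ &\qquad= \exp_\mu^{-1}(X_{k+1}) - H_{\mu,X_{k+1}}\bigl(\exp_\mu^{-1}(\mu_k)\bigr) - R(X_1,\dots,X_{k+1})\bigl(\exp_\mu^{-1}(\mu_k)\bigr).\end{aligned}$$

It then follows from the definition of $W_k^n$ that the difference $W_{k+1}^n - W_k^n$ can be expressed as

$$\begin{aligned} W_{k+1}^n - W_k^n &= \frac{k+1}{\sqrt n}\Bigl\{\sum_{i=1}^{k+1}H_{\mu,X_i} - \Delta_{k+1}(\mu_{k+1}; X_1,\dots,X_{k+1})\Bigr\}^{-1}\\ &\quad\times\Bigl\{\exp_\mu^{-1}(X_{k+1}) - H_{\mu,X_{k+1}}\bigl(\exp_\mu^{-1}(\mu_k)\bigr) - R(X_1,\dots,X_{k+1})\bigl(\exp_\mu^{-1}(\mu_k)\bigr)\Bigr\} + \frac{1}{\sqrt n}\exp_\mu^{-1}(\mu_k),\end{aligned}$$

or equivalently as

$$\begin{aligned} W_{k+1}^n - \Bigl(1+\frac1k\Bigr)W_k^n &= \Bigl\{\frac{1}{k+1}\Bigl(\sum_{i=1}^{k+1}H_{\mu,X_i} - \Delta_{k+1}(\mu_{k+1}; X_1,\dots,X_{k+1})\Bigr)\Bigr\}^{-1}\\ &\quad\times\Bigl\{\frac{1}{\sqrt n}\exp_\mu^{-1}(X_{k+1}) - \frac1k H_{\mu,X_{k+1}}W_k^n - \frac1k R(X_1,\dots,X_{k+1})W_k^n\Bigr\}.\end{aligned}$$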

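However, under the given assumptions, we have

$$\frac1k\sum_{i=1}^{k}H_{\mu,X_i} \xrightarrow{\;\text{a.s.}\;} E[H_{\mu,X_1}] \quad\text{and}\quad \frac1k\Delta_k(\mu_k; X_1,\dots,X_k) \xrightarrow{\;P\;} 0$$

(cf. [8]), so that in particular, $\frac1k R(X_1,\dots,X_{k+1}) \xrightarrow{\;P\;} 0$. Hence, it follows that

$$W_{k+1}^n = \frac{1}{\sqrt n}\bigl(E[H_{\mu,X_1}]\bigr)^{-1}\exp_\mu^{-1}(X_{k+1}) + \Bigl\{\frac{k+1}{k}\,I - \frac1k\bigl(E[H_{\mu,X_1}]\bigr)^{-1}H_{\mu,X_{k+1}}\Bigr\}W_k^n + o\bigl(k^{-1}\bigr) \quad\text{a.s.},$$

where $I$ is the identity operator. This implies that, for $t\ge\varepsilon_0$,

$$\begin{aligned} &\bigl(W^n_{[nt]} - V^n_{[nt]}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\\ &\qquad= \bigl(W^n_{[nt]-1} - V^n_{[nt]-1}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\\ &\qquad\quad+ \frac{1}{[nt]-1}\Bigl\{I - \bigl(E[H_{\mu,X_1}]\bigr)^{-1}H_{\mu,X_{[nt]}}\Bigr\}\bigl(W^n_{[nt]-1} - V^n_{[nt]-1}\bigr) + o\bigl([nt]^{-1}\bigr) \quad\text{a.s.}\end{aligned}$$

so that, for $\varepsilon_0\le t\le T\wedge\sigma_n^r$,

$$\Bigl|\bigl(W^n_{[nt]} - V^n_{[nt]}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\Bigr| \le \Bigl|\bigl(W^n_{[nt]-1} - V^n_{[nt]-1}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\Bigr| + o\bigl([nt]^{-1}\bigr).$$

The required result then follows.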

We are now in a position to state and prove the main result of the paper concerning the limiting diffusion associated with the sequences of the rescaled images $\{W_k^n \mid k\ge 0\}$, under $\exp_\mu^{-1}$, of the Fréchet means $\mu_k$ of $X_1,\dots,X_k$.

THEOREM. Under the assumptions of the Proposition and Lemma 3, the sequence of processes $\{W^n_{[nt]} \mid t\ge 0\}$ converges weakly in $D([0,\infty), \tau_\mu(M))$ to $\{V_t \mid t\ge 0\}$, where $W_0^n = 0$; $W_k^n$, $k\ge 1$, is defined by (11), and the $V_t$ are as given in the Proposition.

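PROOF. By the Proposition, we only need to show that, for any $r>0$ and $T>0$,

$$\sup_{0\le t\le T\wedge\sigma_n^r}\bigl|W^n_{[nt]} - V^n_{[nt]}\bigr| \xrightarrow{\;P\;} 0 \quad\text{as } n\to\infty,$$

Since $W_0^n = V_0^n = 0$, we have

$$\begin{aligned} \sup_{0\le t\le T\wedge\sigma_n^r}\bigl|W^n_{[nt]} - V^n_{[nt]}\bigr| &= \lim_{\varepsilon_0\downarrow 0}\sup_{\varepsilon_0\le t\le T\wedge\sigma_n^r}\bigl|W^n_{[nt]} - V^n_{[nt]}\bigr|\\ &\le \lim_{\varepsilon_0\downarrow 0}\sup_{\varepsilon_0\le t\le T\wedge\sigma_n^r}\Bigl\{\Bigl|\bigl(W^n_{[nt]} - V^n_{[nt]}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\Bigr| + \bigl|W^n_{[\varepsilon_0 n]}\bigr| + \bigl|V^n_{[\varepsilon_0 n]}\bigr|\Bigr\}.\end{aligned}$$

Thus, for any $\varepsilon>0$, we have for all sufficiently small $\varepsilon_0>0$,

$$\begin{aligned} P\Bigl(\sup_{0\le t\le T\wedge\sigma_n^r}\bigl|W^n_{[nt]} - V^n_{[nt]}\bigr| > 6\varepsilon\Bigr) &\le P\Bigl(\sup_{\varepsilon_0\le t\le T\wedge\sigma_n^r}\Bigl\{\Bigl|\bigl(W^n_{[nt]} - V^n_{[nt]}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\Bigr| + \bigl|W^n_{[\varepsilon_0 n]}\bigr| + \bigl|V^n_{[\varepsilon_0 n]}\bigr|\Bigr\} > 3\varepsilon\Bigr)\\ &\le P\Bigl(\sup_{\varepsilon_0\le t\le T\wedge\sigma_n^r}\Bigl|\bigl(W^n_{[nt]} - V^n_{[nt]}\bigr) - \bigl(W^n_{[\varepsilon_0 n]} - V^n_{[\varepsilon_0 n]}\bigr)\Bigr| > \varepsilon\Bigr)\\ &\quad+ P\bigl(\bigl|W^n_{[\varepsilon_0 n]}\bigr| > \varepsilon\bigr) + P\bigl(\bigl|V^n_{[\varepsilon_0 n]}\bigr| > \varepsilon\bigr).\end{aligned}$$

The first term on the right tends to zero as $n\to\infty$ by Lemma 3. It follows from the Proposition that the distribution $\nu_{\varepsilon_0}$ of $V_{\varepsilon_0}$ is Gaussian with mean zero and covariance matrix $\varepsilon_0^2\,(E[H_{\mu,X_1}])^{-1}\,\Sigma\,(E[H_{\mu,X_1}])^{-\top}$, where $\Sigma$ is given by (8). This implies that the limiting distribution of $V^n_{[\varepsilon_0 n]}$ is $\nu_{\varepsilon_0}$. Moreover, the result of [8] implies that $\nu_{\varepsilon_0}$ is also the limiting distribution of $W^n_{[\varepsilon_0 n]}$. Thus, as $n\to\infty$, both the second and third terms on the right are bounded above by $\operatorname{var}(|V_{\varepsilon_0}|)/\varepsilon^2$, so that

$$\lim_{n\to\infty}P\Bigl(\sup_{0\le t\le T\wedge\sigma_n^r}\bigl|W^n_{[nt]} - V^n_{[nt]}\bigr| > 6\varepsilon\Bigr) \le \frac{2\operatorname{var}(|V_{\varepsilon_0}|)}{\varepsilon^2}.$$

Since $\lim_{\varepsilon_0\downarrow 0}\operatorname{var}(|V_{\varepsilon_0}|) = 0$, and since the left-hand side above does not depend on $\varepsilon_0$, the required result follows.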


REFERENCES

[1] ARNAUDON, M., DOMBRY, C., PHAN, A. and YANG, L. (2012). Stochastic algorithms for computing means of probability measures. Stochastic Process. Appl. 122 1437–1455. MR2914758

[2] BHATTACHARYA, R. and PATRANGENARU, V. (2003). Large sample theory of intrinsic and extrinsic sample means on manifolds. I. Ann. Statist. 31 1–29. MR1962498

[3] BHATTACHARYA, R. and PATRANGENARU, V. (2005). Large sample theory of intrinsic and extrinsic sample means on manifolds. II. Ann. Statist. 33 1225–1259. MR2195634

[4] DRYDEN, I. L. and MARDIA, K. V. (1998). Statistical Shape Analysis. Wiley, Chichester. MR1646114

[5] ETHIER, S. N. and KURTZ, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York. MR0838085

[6] JOST, J. (2005). Riemannian Geometry and Geometric Analysis, 4th ed. Springer, Berlin. MR2165400

[7] KENDALL, D. G., BARDEN, D., CARNE, T. K. and LE, H. (1999). Shape and Shape Theory. Wiley, Chichester. MR1891212

[8] KENDALL, W. S. and LE, H. (2011). Limit theorems for empirical Fréchet means of independent and non-identically distributed manifold-valued random variables. Braz. J. Probab. Stat. 25 323–352. MR2832889

[9] LE, H. and BARDEN, D. (2014). On the measure of the cut locus of a Fréchet mean. Bull. Lond. Math. Soc. 46 698–708. MR3239607

[10] STROOCK, D. W. and VARADHAN, S. R. S. (1979). Multidimensional Diffusion Processes. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 233. Springer, Berlin. MR0532498

[11] ZIEZOLD, H. (1977). On expected figures and a strong law of large numbers for random elements in quasi-metric spaces. In Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the Eighth European Meeting of Statisticians (Tech. Univ. Prague, Prague, 1974) A 591–602. Reidel, Dordrecht. MR0501230

SCHOOL OF MATHEMATICAL SCIENCES

UNIVERSITY OF NOTTINGHAM

NOTTINGHAM, NG7 2RD

UNITED KINGDOM