Large deviations for exchangeable observations with applications

(1)

PII. S0161171204301444 http://ijmms.hindawi.com © Hindawi Publishing Corp.

LARGE DEVIATIONS FOR EXCHANGEABLE OBSERVATIONS

WITH APPLICATIONS

JINWEN CHEN

Received 7 January 2003

We ﬁrst prove some large deviation results for a mixture of i.i.d. random variables. Com-pared with most of the known results in the literature, our results are built on relaxing some restrictive conditions that may not be easy to be checked in certain typical cases. The main feature in our main results is that we require little knowledge of (continuity of) the compo-nent measures and/or of the compactness of the support of the mixing measure. Instead, we pose certain moment conditions, which may be more practical in applications. We then use the large deviation approach to study the problem of estimating the component and the mixing measures.

2000 Mathematics Subject Classiﬁcation: 60F10, 60G09, 62G05, 62G20.

(2)

Our investigation on the LDP turns out to have some significant implications to the problem of making inference on the component and/or the mixing measures which are usually unknown or contain some unknown key parameters. Suppose we have a sample, (ξ1, . . . , ξn), of exchangeable observations. A typical way to generate such a sample is as follows: first choose a distributionPθat random from a family{Pθ, θ∈Θ} according to certain lawµ, whereΘis a space of parameters; then sample from this distribution. Thus it is easily seen that such a sample is typically suitable for making inference on the component distributionPθ. Suppose a realization of this sample is obtained, one can, by using standard statistical methods, make inference or estimate on the distributionPθfrom which this sample comes. However, from the point of view of Bayesian statistics, it is important to make inference or estimate on the mixing measure or the prior distribution in Bayesian terminology,µ. From such a sample given above, it is not hard to find unbiased, even uniformly minimum variance unbiased estimators of some functionals ofµthat play key roles in statistics. But, by the LLN for exchangeable random variables, such estimators are usually not consistent ones. Actually it is hard to find consistent estimators from a single sample. To construct a consistent estimator, a number of such samples are usually needed. From this point of view, inSection 3we propose estimators forµor its functionals and use a large deviation approach to show their consistency.

InSection 2we formulate the main large deviation results followed by some exam-ples.Section 3is devoted to the proof of these results. InSection 4we study the esti-mate problem.

2. Large deviations. We consider a mixture of general i.i.d. random ﬁelds and start with some notations. First, for a topological spaceY, denote byᏮY the Borelσ-ﬁeld; M1(Y )is the space of all probability measures on(Y ,ᏮY). Now letXbe a locally convex

Hausdorﬀ topological vector space with the conjugateXand letX0⊂Xbe a separable metric space in the relative topology. Let(Ω,Ᏺ)be a given measurable space,Θa Haus-dorﬀ topological space, and{Pθ, θ∈Θ}a family of probability measures on (Ω,Ᏺ)

satisfying that for eachA∈Ᏺ, θ→Pθ(A)is measurable on(Θ,ᏮΘ). Let {ξt, t∈Zd}

(d≥1) be a family ofX0-valued random variables on(Ω,Ᏺ)which are i.i.d. under each

Pθ. Letµbe a Radon probability measure on(Θ,ᏮΘ)and deﬁne

P=

Pθµ(dθ). (2.1)

Finally, let{Λn, n≥1}be a sequence of ﬁnite subsets ofZd_satisfying_Λ

n↑Zd. Denote

by|Λn|the cardinality ofΛn. For eachn≥1, deﬁne

¯

ξn=_Λ1

n

t∈Λn

ξt (2.2)

and letP_θn=Pθ◦_ξ¯−1

n ,Pn=P◦ξ¯n−1which are probability measures on (X0,ᏮX0). The aim of this section is to study the LDP of{Pn_{, n}_≥₁_}_{. The main technique we will use}

(3)

(H) for everya >0,there is a compact set Ca⊂X0,such that

lim sup

n→∞

1

_Λ_nlogPn_Cc a

≤ −a. (2.3)

Our main result is the following.

Theorem2.1. LetPnbe deﬁned as above and let (H) hold. If for eachλ∈X,

Mθ(λ)=

eλ,ξ0_dPθ_<∞_, _µ_-a.s._, _(2.4)

then{Pn_{, n}_≥₁_}_{satisﬁes an LDP with a good rate function, that is, there is a function} I:X0→[0,∞]such that for everyA∈ᏮX0,

− inf

x∈A0I(x)≤lim infn→∞ 1

_Λ_nlogPn(A)≤lim sup

n→∞

1

_Λ_nlogPn(A)≤ −inf

x∈A¯

I(x) (2.5)

andIhas compact level sets, that is, for anya >0,{x:I(x)≤a}is compact inX0, where

A0_and_A_¯_{are the interior and closure of}_A_{, respectively. In particular, if for all}_λ_∈_X_,

M(λ)≡

eλ,ξ0_{dP <}_∞_, _(2.6)

then the above conclusions hold.

Remark_2.2. _{Assumption (H) is not a restrictive one. Indeed, in our case, to obtain} a full LD result with agood rate function, that is, withI having compact level sets, (H) is a necessary condition. IfX0is compact, (H) is trivial. Some practically used suﬃcient condition for (H) to hold can be found in [4,5,6]. In the case whereX=Rd_{, sup}

θMθ(λ) <

∞for someλ >0 is suﬃcient for (H). In the case of a more general Banach space, see Example 2.6.

Remark_2.3. _{As a special case, let}_X=_M(E)_{be the space of all ﬁnite signed}

mea-sures on some Polish spaceE, equipped with the weak topology,X0=M1(E), and let {ηt, t∈Zd_}_{be a family of}_E_{-valued random variables on}₍_Ω_,_Ᏺ₎_{that are i.i.d. under each} Pθ,ξt=δη_t is the Dirac measure onηt. Then under (H),{P ((1/|Λn|)

t∈Λnδηt ∈ ·), n≥ 1}satisfies an LDP, since in this case (2.4) is automatically satisfied. In particular, ifX0 is compact, then all the conditions inTheorem 2.1are satisfied.

The following are some typical examples that satisfy the conditions ofTheorem 2.1.

Example2.4. LetX=X₀=Rand Pθ the Gaussian distribution with meanα and varianceσ2_{, and}_θ₌_{(α, σ}2₎_∈_Θ_⊂_R_×_R

+, whereR+ is the space of strictly positive

real numbers. Then (2.4) is satisﬁed for anyλ∈R. In particular, ifΘis a bounded set, then (H) is also satisﬁed. IfPθis a Poisson distribution with meanθ∈Θ⊂R+, then we

(4)

Example2.5. LetPθ be the Bernoulli distribution with parameterθ∈(0,1). Then all the conditions inTheorem 2.1are satisﬁed.

In the cases wherePθis exponential or geometric, (2.4) is not satisﬁed. These cases will be treated in the next theorem.

Example_2.6. _{The third application is to Banach-valued sequences. In this case,}_X₀=

Bis assumed to be a separable Banach space with norm·and{ξn, n≥1}is anX 0-valued sequence.Sn=n

i=1ξi. Suppose (i) there is aµ-null setΘ0⊂Θsuch that the family{Pθ, θ∈Θc_0}_{is tight, and (ii)}_eαξ1_dP_·_L_∞_(µ)_<∞_{, for all}_{α >}_{0. Note that de} Acosta [4] proved that the family{P (n−1_Sn_{∈ ·}_{), n}_≥_1}_{is exponentially tight. Thus} ourTheorem 2.1and its proof imply that this family satisﬁes a full LDP with the good rate function given by

I(x)=sup

λ∈B

λ, x−L(λ) , (2.7)

whereL(λ)=logexpλ,ξ1_dP_·_L_∞_(µ)_{. This complements [}₄_{, Theorem 5.1] by providing} the accompanying large deviation lower bounds, and thus the full LDP.

As pointed out in [7], some simple examples do not satisfy the moment condition (2.4) or (2.6). For example, letξ0 have exponential distribution with parameter θ >

θ0≥0 underPθ [7, Example 3.3]. Then it is easy to check that (2.4) does not hold. For such cases, it has been proved in [7] that ifΘis compact and {P_θn, n≥1, θ∈Θ}is exponentially continuous (to be stated below), then under (H){Pn_{, n}_≥_1}_{also satisﬁes}

an LDP. Based on this result, we can treat the case whereΘis not compact. To this end we ﬁrst recall that the family{P_θn, n≥1, θ∈Θ}is said to beexponentially continuous if wheneverθn→θ, the family{Pθnn, n≥1}satisﬁes an LDP. We have the following.

Theorem_2.7. _{Assume that (H) holds and that}{_P_θn_{, n}≥₁_{, θ}∈Θ}_{is exponentially} continuous. If for eachλ∈X, there existδ >0and a compact setΘδ⊂Θwithµ(Θδ) <1,

such that

e(1+δ)λ,ξ0_dPθ_<∞ _for_µ_{-almost all}_θ∈Θc

δ, (2.8)

then{Pn_{, n}_≥₁_}_{satisﬁes an LDP with a good rate function.}

It is easy to check that (2.8) is satisﬁed in both the exponential and the geometric cases.

Using the same argument, whenΘis countable, we can prove the following.

Theorem2.8. LetΘbe countable, and for eachn≥1, letµnbe a probability measure on(Θ,Ꮾ_Θ)satisfyinglimn→∞|Λn|−1logµn(θ)=0for eachθ. If the family{Pθn, n≥1, θ∈ Θ}satisfies that(1)for eachθ,{P_θn, n≥1}satisfies an LDP;(2)for anyλ∈X, there existδ >0and a finite setΘδ⊂Θsuch that (2.8) holds; and(3)the sequence defined by

Pn₌ θ∈Θ

Pn

θµn(θ), n≥1, (2.9)

(5)

For all the above results,Xcan be replaced by any of its dense linear subspaces. In the above theorems, the LD rate function should be

I(x)= sup

f∈C_b(X0)

f (x)−L(f ), (2.10)

where

L(f )=lim

n→∞

1

_Λ_nlog

e|Λn|f_dPn_, _(2.11)

which exists for f ∈Cb(X0); see Section 3. At the same time, under the conditions of any of these theorems, for eachθ∈Θ,{P_θn, n≥1}satisﬁes an LDP with the rate function given by

Iθ(x)= sup

f∈Cb(X0)

f (x)−Lθ(f ) , (2.12)

which has compact level sets, whereLθ(f )is deﬁned similarly toL(f )withPn_replaced

byP_θn. Our proofs of the theorems (Section 3) show that for someΘ0⊂Θwithµ(Θ0)=1,

L(f )=supθ∈Θ0Lθ(f ), which suggests that one may consider whether we have

I= inf

θ∈Θ0

Iθ. (2.13)

This turns out to be a minimax problem. Equation (2.13) need not be true in general. There are various conditions for a minimax theorem to hold; see [9] or [10]. Usually three types of conditions for Λ(θ, f )≡f (x)−Lθ(f ) for ﬁxed x ∈X0 are involved: (i) compactness or connectivity of the domains of the two arguments of Λ; and (ii) convexity of Λ; (iii) certain continuity ofΛ. Simple examples show that only two of these conditions are insuﬃcient for (2.13). From the results mentioned in [9,10], we can see that if one has compactness, then the solutions are satisfactory. The situation is very complicated in the noncompact cases. Here we will not give further discussion.

3. Proof of the large deviations. To use the asymptotic value method [1] for a mea-surable functionf on(X0,ᏮX0), deﬁne

Ln(f )=Λn−1log

e|Λn|f_dPn_, _n_≥₁_, _(3.1)

and for a subsetΘ0⊂Θ, deﬁne

LΘ0,n(f )=Λn −1

log

Θ0 µ(dθ)

e|Λn|f_dPn

θ, n≥1. (3.2)

The main task is to prove that for allf∈Cb(X0), the limit limn→∞Ln(f )exists. To this

end, we will use linear functionals to approximate bounded continuous functions. Set

Ꮽ=

min 1≤i≤m

λi,·+ci, λi∈X, ci∈R, m≥1

. (3.3)

(6)

Lemma3.1. LetΘ₀be a measurable subset ofΘwithµ(Θ₀) >0,λi∈X,1≤i≤m. If there existsδ >0such that (2.8) holds for everyλiandθ∈Θ0, then for anyci∈R, 1≤i≤m, andM >0, withg=min1≤i≤m(λi,·+ci), the limit

LΘ0(g∧M)=_n→∞limLΘ0,n(g∧M) (3.4)

exists and is ﬁnite, wherea∧b=min(a, b).

To prove this lemma, we need the following result which was proved in [1] (see [1, Lemma L.6.1 and its proof]).

Lemma3.2. For ﬁxedn≥1andθ∈Θ. Letf₁, . . . , fnbe measurable and such that (1) for any linear combinationf=n

i=1αifi, the limit

Lθ(f )≡lim

n→∞

1

_Λ_nlog

e|Λn|f_dPθ _(3.5)

exists;

(2) (d/dt)Lθ(tf+(1−t)g)|t=0 exists for each pair of convex combinations f =

n

i=1αifiandg=

n i=1βifi.

ThenLθ(min1≤i≤nfi)exists and there is a convex combinationf0=

n

i=1α0ifisuch that

Lθ

min 1≤i≤nfi

=inf

Lθ

n

i=1

αifi

, αi≥0, n

i=1

αi=1

=Lθf0

. (3.6)

Proof ofLemma_3.1. _Write_gi= _λi,·+_ci_{. Then under our assumptions, for each}

θ∈Θ0, the limits

Lθ

m

i=1

αigi+αm+1M

=lim

n→∞

1

_Λ_nlog

e|Λn|(mi=1αigi+αm+1M)_dPn θ

=

m

i=1

αici+αm₊1M+log

emi=1αiλi,ξ0_dPθ

(3.7)

are ﬁnite in some open convex setA1containing

A=

α1, . . . , αm+1

∈Rm+1:αi≥0, m+1

i=1

αi=1

(3.8)

and for any(α1, . . . , αm+1)and(γ1, . . . , γm+1)inA, if we setf=m_i=1αigi+αm+1Mand

h=m

i=1γigi+αm+1M, then(d/dt)Lθ((1−t)f+th)|t=0exists. Hence fromLemma 3.2 we know that there existsgθ0=

m

i=1αigi+αm+1M with(α1, . . . , αm+1)∈A, such that

gθ

0∧g∧M=g∧Mand

h(θ)≡Lθ(g∧M)=lim

n→∞

1

_Λ_nlog

e|Λn|(g∧M)_dPn

θ =Lθ

g0θ

. (3.9)

(7)

continuous onΘk, and we can chooseΘkto be increasing. LetΘ∞=limk→∞Θk. Then µ(Θ∞)=µ(Θ0). DeﬁneΘk,µ=Supp(µ|Θk), whereµ|Θkis the restriction ofµtoΘk. Then

Θk,µ↑andµ(Θk,µ)=µ(Θk). We will show that

L_Θ₀(g∧M)=lim

k→∞θsup∈Θk,µ

Lθ(g∧M)= sup

θ∈∪∞ k=1Θk,µ

h(θ). (3.10)

First notice thatg∧M=gθ

0∧g∧M≤gθ0for eachθ∈Θ0, thus, by (2.4),

lim sup

n→∞ LΘ0,n(g∧M)≤lim supn→∞

1

_Λ_nlog

Θ0 µ(dθ)

e|Λn|g0θ_dPn θ

=lim sup

n→∞

1

_Λ_nlog

Θ0 µ(dθ)

e|Λn|Lθ(g0θ)_dPn θ

=lim sup

n→∞

1

_Λ_nlog

Θ0

e|Λn|h(θ)_µ(dθ)

=lim sup

n→∞ k→∞lim

1

_Λ_n_Θ

k

e|Λn|h(θ)_µ(dθ)_≤_lim

k→∞θsup∈Θk,µ

h(θ).

(3.11)

Notice that under (H) the sequence{Θ0P n

θµ(dθ), n≥1}is also exponentially tight; by

the above inequality and [1, Lemma 5.1] we know thatl=sup_θ∈∪∞

k=1Θk,µh(θ)is ﬁnite. Therefore for anyε >0, we can chooseθ0∈ ∪∞k=1Θk,µ such that

hθ0

=Lθ₀(g∧M) > l−₂ε. (3.12)

Supposeθ0∈Θk,µ. Then by the continuity ofhonΘk, we can choose a neighborhood Vθ₀ofθ0, such that for anyθ∈Vθ₀Θk,

h(θ) > hθ0

−₂ε> l−ε. (3.13)

Thus by Jensen’s inequality, [1, Lemma 5.1], and Fatou’s lemma, we see that for suﬃ-ciently largeN,

lim inf

n→∞ LΘ0,n(g∧M)=lim inf_n→∞ LΘ0,n

(g∧M)∨(−N)

≥lim inf

n→∞

1

_Λ_nlog

Vθ0Θk

µ(dθ)

e|Λn|[(g∧M)∨(−N)]_dPn

θ

≥lim inf

n→∞

1

µVθ₀Θk

1

_Λ_n_V

θ₀Θk

log

e|Λn|[(g∧M)∨(−N)]_dPn θ

µ(dθ)

≥ 1

µVθ₀Θk

Vθ₀Θk

lim inf

n→∞

1

_Λ_nlog

e|Λn|(g∧M)_dPn θ

µ(dθ)

= 1

µVθ₀Θk

V_θ 0Θk

h(θ)µ(dθ) > l−ε.

(3.14)

(8)

Proof ofTheorem2.1. Letg=min1_≤_i_≤m(λi,·+ci)∈ᏭandM >0. Then for any

δ >0, by assumption, there isΘ0⊂Θwithµ(Θ0)=1 such that (2.8) holds for allθ∈Θ0. Thus, byLemma 3.1, the limitL(g∧M)=limn→∞Ln(g∧M)=LΘ0(g∧M)exists and is ﬁnite. Therefore from the proof of [1, Theorem T.1.3], we see that for allf∈Cb(X0), the limit

L(f )=lim

n→∞Ln(f ) (3.15)

exists and is ﬁnite. Hence by [1, Theorem T.1.2],{Pn_{, n}_≥₁_}_{satisﬁes an LDP with the}

good rate function given by

I(x)= sup

f∈Cb(X0)

f (x)−L(f ). (3.16)

Proof ofTheorem_2.7. _{From the proof of}_{Theorem 2.1}_{, we know that it suﬃces} to show that for allg∈Ꮽ andM >0, the limit L(g∧M)exists and is ﬁnite. To do this, letgi = λi,· +ci and g=g1∧ ··· ∧gm. Then under the assumptions of the theorem, there existδ >0 and a compact setΘδ⊂Θwith µ(Θδ) <1, and aΘ0⊂Θcδ

withµ(Θ0)=µ(Θδc)such that (2.8) holds for allθ∈Θ0. Then the lemma implies that

the limitL_Θc

δ(g∧M)=ŁΘ0(g∧M) exists and is ﬁnite. By the exponential continuity of {Pθn, n≥1, θ∈Θ}and [4, Theorems 2.1 and 2.2], we know that the sequence

{ΘδP

n

θµ(dθ), n≥1}satisﬁes an LDP. Hence for anyf∈Cb(X0), the limit

LΘδ(f )=_n→∞lim 1

_Λ_nlog

Θδ

µ(dθ)

e|Λn|f_dPn

θ (3.17)

exists and is ﬁnite. ThereforeLΘδ(g∧M)∨(−N)exists and is ﬁnite for anyN >0. Notice that{ΘδP

n

θµ(dθ), n≥1}is also exponentially tight; again by [1, Lemma 5.1] we know

that for suﬃciently largeN,

lim

n→∞

1

_Λ_nlog

Θδ

µ(dθ)

e|Λn|((g∧M)∨(−N))_dPn θ−log

Θδ

µ(dθ)

e|Λn|(g∧M)_dPn θ

=0.

(3.18)

This means that the limitLΘδ(g∧M)exists and is ﬁnite. Thus

L(g∧M)=lim

n→∞

1

_Λ_nlog

Θδ

µ(dθ)

e|Λn|(g∧M)_dPn

θ+

Θ0 µ(dθ)

e|Λn|(g∧M)_dPn

θ

=L_Θ_δ(g∧M)∨L_Θc δ(g∧M)

(3.19)

exists and is ﬁnite.

Finally we proveTheorem 2.8. Letgi= λi,·+ci, 1≤i≤m,M >0,g=g1∧···∧gm. Then by our assumptions and the proof of the lemma, there is a ﬁnite setΘδ⊂Θ, such

that for anyθ∈Θc

δ, the limit

Lθ(g∧M)=lim

n→∞

1

_Λ_nlog

e|Λn|(g∧M)_dPn

(9)

exists and is ﬁnite, and there is agθ

0 =

m

i=1αigi+αm+1M withαi≥0,m+_i=11αi=1, such thatLθ(g∧M)=Lθ(gθ0),g0θ∧g∧M=g∧M. Thus for eachθ∈Θcδ,

lim inf

n→∞ LΘcδ,n≥lim infn→∞

1

_Λ_nlogµn(θ)

e|Λn|(g∧M)_dPn

θ =Lθ(g∧M); (3.21)

that is,

lim inf

n→∞ LΘcδ,n(g∧M)≥sup

θ∈Θc δ

Lθ(g∧M). (3.22)

On the other hand,

lim sup

n→∞ LΘ

c

δ,n(g∧M)=lim sup_n→∞ 1

_Λ_nlog

θ∈Θc δ

µn(θ)

e|Λn|(gθ0∧g∧M)_dPn θ

≤lim sup

n→∞

1

_Λ_nlog

θ∈Θc δ

µn(θ)

e|Λn|gθ0_dP_θn

=lim sup

n→∞

1

_Λ_nlog

θ∈Θc δ

µn(θ)e|Λn|Lθ(g∧M)≤_sup

θ∈Θc δ

Lθ(g∧M).

(3.23)

Hence the limit

L_Θc

δ(g∧M)=n→∞limLΘcδ,n(g∧M)=sup

θ∈Θc δ

Lθ(g∧M) (3.24)

exists and is ﬁnite. Forθ∈Θδ, since{Pθn, n≥1}is exponentially tight and satisﬁes an

LDP, by [1, lemma 5.1], for largeN,

Lθ(g∧M)=lim

n→∞

1

_Λ_nlog

e|Λn|(g∧M)_dPn

θ (3.25)

exists and is equal toLθ((g∧M)∨(−N)), hence ﬁnite. Therefore, the limit

LΘδ(g∧M)=_nlim_→∞ 1

_Λ_n

θ∈Θδ

µn(θ)

e|Λn|(g∧M)_dPn

θ =max θ∈Θδ

Lθ(g∧M) (3.26)

exists and is ﬁnite and, furthermore,

L(g∧M)=LΘδ(g∧M)

∨L_Θc δ(g∧M)

(3.27)

exists and is ﬁnite, completing the proof.

4. Estimate of the mixing measure. Now we turn to the study of estimating the mixing measure µ or some of its functionals such as expectation and moments. We adopt almost all the setting given inSection 2, except that we replace the set of indices

(10)

simply to make the notations simple. Letµθ=Pθ◦ξ−1

1 be the law ofξ1onX0under

Pθ. LetT be the mapping fromΘtoM1(X0)withT (θ)=µθ, whereM1(X0)is the space of probability measures on X0. DenoteT (µ)=µ◦T−1_{. We consider the problem of} estimatingT (µ), instead ofµ, since we can, if needed, make a certain change of the space of parameters so that the mixing measure isT (µ). Let{ξi,j, 1≤j≤n}, 1≤i≤

n, ben independent copies of{ξi, 1≤i≤n}. We still denote byP the underlying probability law. Deﬁne, forn≥1,

Ln,i=_n1

n

j=1

δξ_i,j, 1≤i≤n, Rn=_n1

n

i=1

δL_n,i. (4.1)

The following result implies thatRnis a (strongly) consistent estimator ofT (µ).

Theorem_4.1. _{Assume (H). Then}_lim_n→∞_Rn=_{T (µ)}_{a.s. in the sense of weak}

conver-gence.

Remark_4.2. _{From this theorem it is easily seen that if}_φ∈_Cb(M₁_(X₀₎₎_{, then}

lim

n→∞

1

n n

i=1

φLn,i=

φdT (µ)=

φµθµ(dθ) a.s. (4.2)

In particular, ifφ∈Cb(X0), then

lim

n→∞

1

n2

n

i,j=1

φξi,j=

µθ(φ)µ(dθ) a.s. (4.3)

This suggests consistent estimators for expectation or more general moments ofµ.

Obviously, the above theorem is a consequence of the following large deviation result.

Theorem_4.3. _{Assume (H). Then}{_Qn=_P◦_R_n−1_{, n}≥1}_{satisﬁes a full LDP on}_M₁_(X₀₎ endowed with the weak topology, with the good rate functionJgiven byJ(R)=H(R, T (µ)), the relative entropy ofRwith respect toT (µ).

The proof ofTheorem 4.3is completed with the following three lemmas.

Lemma4.4. LetLn=(1/n)n_i₌₁δξ

i. Thenlimn→∞P◦L− 1

n =T (µ)weakly.

Proof. For eachθ∈Θ, by applying Sanov’s theorem, it is not hard to prove that limn→∞Pθ◦L−n1=δµθ weakly. Thus for anyf∈Cb(M1(X0)),

lim

n→∞

f dP◦L−1

n =_n→∞lim

µ(dθ)

f dPθ◦L−1

n =

fµθµ(dθ)=

f dT (µ), (4.4)

(11)

Lemma4.5. For anyf∈Cb(M₁(X₀)), the limit

Λ(f )≡lim

n→∞

1

nlog

enf (ν)_Qn(dν) _(4.5)

exists and equalslogef_{dT (µ)}_{. Moreover, for any pair} _f _and _g _in_Cb(M

1(X0)), t→

Λ(tf+(1−t)g)is diﬀerentiable.

Proof. The ﬁrst assertion is a consequence ofLemma 4.4, and thus the second assertion easily follows.

Lemma_4.6. _{Assume (H). Then}{_{Qn, n}≥₁}_{is exponentially tight on}_M₁_(X₀₎_.

Proof. _{For any}_{a >}_{0, from (H) we know that there is a compact subset}_Ca_of_M₁_(X₀₎

such that

PLn∈Cac

≤e−an ∀n≥1. (4.6)

Now deﬁne

Ka=

R∈M1

M1

X0

:RC_akc ≤a_k,∀k≥1

. (4.7)

ThenKa is compact inM1(M1(X0))and, for each 1≤i≤n,

PδL_n,i∈Kac

=PδLn∈K

c a

=P

∃k≥1|δLn

C_akc ≥a_k

=P∃k≥1|Ln∈Cakc

≤₁₋e−an_e₋an

(4.8)

by (4.6). Note thatKais convex. We have

PRn∈Kac

≤

n

i=1

PδL_n,i∈KaC

≤₁ne₋_e−₋anan, (4.9)

which implies the exponential tightness of{Qn, n≥1}.

Now the proof ofTheorem 4.3is completed by applying [1, Theorem T.1.4], with the rate function being given by

J(R)= sup

f∈Cb(M1(X0))

f dR−Λ(f )

= sup

f∈Cb(M1(X0))

f dR−log

efdT (µ)

(4.10)

which is justH(R, T (µ)).

Proof ofTheorem4.1. Letd(·,·)be any metric onM₁(M₁(X₀))that generates the weak topology. Then fromTheorem 4.1we know that for any >0, there is a constant

α>0 such that

PdRn, T (µ)≥≤e−αn ∀_n≥₁_. _(4.11)

(12)

Acknowledgment. This research was supported by the National Natural Science Foundation of China (NSFC 10371063).

References

[1] W. Bryc,Large deviations by the asymptotic value method, Diﬀusion Processes and Related Problems in Analysis, Vol. I (Evanston, Ill, 1989) (M. Pinsky, ed.), Progr. Probab., vol. 22, Birkhäuser Boston, Massachusetts, 1990, pp. 447–472.

[2] T. Daras,Large and moderate deviations for the empirical measures of an exchangeable sequence, Statist. Probab. Lett.36(1997), no. 1, 91–100.

[3] ,Trajectories of exchangeable sequences: large and moderate deviations results, Statist. Probab. Lett.39(1998), no. 4, 289–304.

[4] A. de Acosta,Upper bounds for large deviations of dependent random vectors, Z. Wahrsch. Verw. Gebiete69(1985), no. 4, 551–565.

[5] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, Jones and Bartlett Publishers, Massachusetts, 1993.

[6] J.-D. Deuschel and D. W. Stroock,Large Deviations, Pure and Applied Mathematics, vol. 137, Academic Press, Massachusetts, 1989.

[7] I. H. Dinwoodie and S. L. Zabell,Large deviations for exchangeable random vectors, Ann. Probab.20(1992), no. 3, 1147–1166.

[8] ,Large deviations for sequences of mixtures, Statistics and Probability: A Raghu Raj Bahadur Festschrift (J. K. Ghosh, S. K. Mitra, K. R. Parthasarathy, and B. L. S. Prakasa Rao, eds.), Wiley Eastern, New Delhi, 1993, pp. 171–190.

[9] B. Ricceri and S. Simons (eds.),Minimax Theory and Applications, Nonconvex Optimization and Its Applications, vol. 26, Kluwer Academic Publishers, Dordrecht, 1998. [10] S. Simons,Minimax theorems and their proofs, Minimax and Applications (D.-Z. Du and

P. M. Pardalos, eds.), Nonconvex Optim. Appl., vol. 4, Kluwer Academic Publishers, Dordrecht, 1995, pp. 1–23.