On Conditional Distribution of the Sample Mean for Densities with Singular Logarithmic Derivative


Victor Chulaevsky

Département de Mathématiques, Université de Reims, Moulin de la Housse, B.P. 1039, 51687 Reims Cedex 2, France

Abstract

We study the regularity of the conditional distribution of the empirical mean of a finite sample of IID random variables with a bounded common probability density, conditional on the sample "fluctuations", and extend a prior result, proved for strictly positive smooth densities, to a larger class of smooth densities vanishing at one or more points of their support.

Keywords

Conditional Sample Mean, Eigenvalue Concentration Estimates, Multi-particle Anderson Localization

1 Introduction

1.1 Formulation of the problem

Consider a sample of $n > 1$ IID (independent and identically distributed) random variables (r.v., in short) $X_1, \ldots, X_n$ with Gaussian distribution $\mathcal{N}(0,1)$, and introduce the sample mean $\xi = \xi_n$ and the "fluctuations" $\eta_i$ relative to the mean:

$$\xi_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \eta_i = X_i - \xi_n, \quad i = 1, \ldots, n. \tag{1.1}$$

It is well known from standard courses in probability theory and statistics (cf., e.g., [6]) that $\xi_n$ is independent of the sigma-algebra $\mathfrak{F}_\eta$ generated by $\{\eta_1, \ldots, \eta_n\}$. Therefore, the conditional probability distribution of $\xi_n$ given $\mathfrak{F}_\eta$ coincides with the unconditional one, $\mathcal{N}(0, n^{-1})$, thus $\xi_n$ has a bounded conditional density

$$p_\xi(t) = \frac{e^{-n t^2/2}}{\sqrt{2\pi n^{-1}}} \le \sqrt{\frac{n}{2\pi}}, \tag{1.2}$$

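and for any interval $I \subset \mathbb{R}$ of length $|I|$, we have

$$\mathbb{P}\{\xi_n(\omega) \in I \mid \mathfrak{F}_\eta\} \overset{\text{a.s.}}{=} \mathbb{P}\{\xi_n(\omega) \in I\} \le \sqrt{\frac{n}{2\pi}}\,|I|. \tag{1.3}$$

In other words, the conditional probability distribution function (PDF) of the sample mean, given the fluctuations $\{\eta_i\}$, is Lipschitz continuous. The bound (1.3) also implies the following: for any $\mathfrak{F}_\eta$-measurable r.v. $\lambda$ and any $s \ge 0$,

$$\mathbb{P}\{\xi_n(\omega) \in [\lambda(\omega), \lambda(\omega)+s]\} \le \sqrt{\frac{n}{2\pi}}\, s. \tag{1.4}$$

For the proof, it suffices to use conditioning on $\mathfrak{F}_\eta$: setting $I_s(\omega) = [\lambda(\omega), \lambda(\omega)+s]$, we have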

$$\begin{aligned} \mathbb{P}\{\xi_n(\omega) \in I_s(\omega)\} &= \mathbb{E}\big[\mathbb{P}\{\xi_n(\omega)\in I_s(\omega) \mid \mathfrak{F}_\eta\}\big] \\ &\le \mathbb{E}\Big[\sup_{a\in\mathbb{R}} \mathbb{P}\{\xi_n(\omega)\in[a,a+s] \mid \mathfrak{F}_\eta\}\Big] \le \sqrt{\frac{n}{2\pi}}\,s. \end{aligned} \tag{1.5}$$

It is to be emphasized that the estimate of the probability $\mathbb{P}\{\xi_n(\omega) \in I_s(\omega)\}$ with a non-constant, random interval $I_s(\omega) = [\lambda(\omega), \lambda(\omega)+s]$ is more informative, and more difficult to obtain, than its counterpart for a fixed interval $I_s = [a, a+s]$, $a\in\mathbb{R}$, $s>0$. With a fixed interval $I_s$, a number of classical results of probability theory are available. The regularity of the probability distribution of the sample mean $\xi_n(\omega)$, or simply of the sum of IID r.v. $X_1 + \cdots + X_n$, is at least as good as that of each term in the sum, and the smoothing effect of convolution makes it even better as $n \to +\infty$. Standard textbook examples show that already the sum $X_1 + X_2$ of IID r.v. with a purely singular continuous distribution can have an absolutely continuous probability distribution; see, e.g., [6, v. II, Section V.4], where it is shown that the uniform distribution on $[0,1]$ is a convolution of two singular probability measures of Cantor type.

When $I_s$ is random, the situation becomes more complicated.
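As a numerical sanity check of the random-interval bound (1.4) in the Gaussian case, the following minimal sketch estimates the probability that the sample mean falls into an interval whose left endpoint depends on the fluctuations. The choice $\lambda = \eta_1$, the function name and the parameter values are assumptions made purely for illustration; any other function of the fluctuations could be substituted.

```python
import numpy as np

def gaussian_random_interval_check(n=5, s=0.1, trials=200_000, seed=0):
    """Monte Carlo estimate of P{xi_n in [lambda, lambda + s]} for N(0,1) samples.

    lambda is taken to be eta_1 (an illustrative, fluctuation-measurable choice).
    Returns the estimate together with the bound sqrt(n / (2*pi)) * s from (1.4).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((trials, n))
    xi = x.mean(axis=1)                 # sample mean xi_n
    eta = x - xi[:, None]               # fluctuations eta_i = X_i - xi_n
    lam = eta[:, 0]                     # random left endpoint lambda = eta_1
    hits = (xi >= lam) & (xi <= lam + s)
    bound = np.sqrt(n / (2.0 * np.pi)) * s
    return float(hits.mean()), float(bound)

estimate, bound = gaussian_random_interval_check()
print(f"estimate = {estimate:.4f}, bound = {bound:.4f}")
```

With these settings the estimate should stay below the bound, up to Monte Carlo error; the gap reflects that (1.4) is a worst-case bound over all admissible $\lambda$.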

In the present paper, as in our prior works [3,4], we handle the problem by using the conditional probability distribution of the sample mean, given the sigma-algebra generated by the fluctuations $\eta_1, \ldots, \eta_{n-1}$. The main technical difficulty arising here is that of the $n$ degrees of freedom initially present in the sample mean $\xi_n(\omega) = (X_1(\omega) + \cdots + X_n(\omega))/n$, only one remains after fixing the $n-1$ parameters $\eta_1, \ldots, \eta_{n-1}$, which can be considered as coordinates in the $n$-dimensional sample space. When the r.v. $X_i$ have a density $\rho(\cdot)$ (which we always assume in the present paper), their joint probability distribution also has a density, viz. $p(x_1,\ldots,x_n) = \rho(x_1)\cdots\rho(x_n)$, and the aforementioned conditional distribution has a density given, up to a normalization factor, by the restriction of $p(\cdot)$ to the straight line in the $n$-dimensional sample space selected by the $n-1$ conditions $\eta_1 = c_1, \ldots, \eta_{n-1} = c_{n-1}$. The normalization factor brings up a technical problem, for it can be very large (we address this problem in detail in Sections 4 and 5). Also, if $\rho$ is non-constant, i.e., the marginal distribution of $X_i$ is not a uniform one in some bounded interval, then it takes at least two different values, say, $a < b$, thus the tensor product $p$ takes the values $a^n$ and $b^n$, with the ratio $(b/a)^n \gg 1$ for $n \gg 1$. In the intended applications, this might result in inadequate regularity bounds, so we have to address this issue, too.

A natural question is to what extent the above-mentioned remarkable property of Gaussian IID samples can be generalized to other types of marginal probability measures. Surprisingly, this problem appears to be unexplored in a reasonably general setup (cf. Section 6). The author of these lines would greatly appreciate any constructive feedback from experts in the field that would shed more light on the regularity problem at hand. A particularly challenging case is where the IID r.v. $X_i$ have a purely singular probability distribution, but the (cumulative) probability distribution function (PDF), $t \mapsto F(t) := \mathbb{P}\{X_1(\omega) \le t\}$, $t\in\mathbb{R}$, is, for example, Hölder continuous, or more generally, has an explicitly known continuity modulus

$$\mathfrak{s}(s) := \sup_{a\in\mathbb{R}} \big(F(a+s) - F(a)\big), \quad s > 0. \tag{1.6}$$

In a prior work [4], we studied this problem under the following condition:

(V1): The probability measure $\mu$ has bounded support, $\operatorname{supp}\mu = [a,b]$, and admits on $(a,b)$ a smooth probability density $\rho$ satisfying the following conditions:

$$\forall\, t\in(a,b) \quad \begin{cases} 0 < \rho_* \le \rho(t) \le \overline{\rho} < \infty, \\ |\rho'(t)/\rho(t)| \le C < \infty. \end{cases} \tag{1.7}$$

The prototypical example is the uniform distribution on an interval $[a,b]$. Informally, one can say that the probability distributions satisfying (V1) are "comparable" to the uniform distributions. Under this assumption, we proved Hölder continuity of the conditional distribution of the sample mean for typical conditions.

More precisely, we introduced in [4] a property of the probability measures on $\mathbb{R}$ resembling (1.4) and called Regularity of the Conditional Mean:

(RCM) For all $n \ge 2$ and some $C, a \in (0,+\infty)$, $b \in (0,1]$, for any $\mathfrak{F}_\eta$-measurable random variable $\lambda$, one has

$$\mathbb{P}\{\xi_n(\omega) \in [\lambda(\eta), \lambda(\eta)+s]\} \le C n^{a} s^{b}. \tag{1.8}$$

The above form of conditional regularity of the sample mean for typical conditions is well adapted to the applications (briefly discussed in Section 1.2 below) which served as the principal motivation for our project.

In the present paper, we further develop our approach from [3,4] and extend (RCM) to a much larger class of smooth marginal densities. Specifically, we allow the logarithmic derivative of the common density $\rho$ to have power-law singularities due to vanishing of $\rho$ at a finite number of points of its support.

The principal technical result of this paper, Theorem 1, is proved under the following hypothesis:

(V2): The probability measure $\mu$ has bounded support, $\operatorname{supp}\mu = [a,b]$, and admits on the interior $(a,b)$ a strictly positive smooth probability density $\rho$ satisfying the following condition:

$$\forall\, t \in (a,b) \quad \rho'(t)/\rho(t) \le C \max\left[\frac{1}{t-a},\ \frac{1}{b-t}\right]. \tag{1.9}$$

Moreover, the method of proof of Theorem 1 provides some explicit estimates for the exponents $a$ and $b$ figuring in (1.8).

Note that the condition (1.9) on the logarithmic derivative covers all the cases where the density $\rho$ vanishes at one of the edges of its support at a power-law rate; the upper bound on $(\ln\rho)'$ actually provides some regularity of the decay at the respective edge, but the latter is a substantially milder condition than, for example, an exact decay asymptotics $\rho(t) \sim C(t-a)^{\gamma}$ as $t \searrow a$.

We shall see that Theorem 1 naturally extends from the measures satisfying (V2) to a much larger class of measures, obtained from those satisfying (V2) by the operations of (i) shifts, (ii) convolution, and (iii) randomization.

The role of shifts is clarified in Corollary 1.

Corollary 2 shows that, starting from measures satisfying (V2), one can construct a rich class of measures featuring (RCM), under the condition

(V3): The probability measure $\mu$ on $\mathbb{R}$ is a convolution of the form

$$\mu = \mu_1 * \mu_2, \tag{1.10}$$

where $\mu_1$ fulfills the condition (V2), and $\mu_2$ is an arbitrary probability measure on $\mathbb{R}$.

Corollary 3 derives from Theorem 1 the property (RCM) for measures which fulfill the following condition:

(V4): The probability measure $\mu$ on $\mathbb{R}$ is obtained by randomization of measures $\mu_1, \ldots, \mu_K$ satisfying (V2), i.e., a random variable $X$ with probability distribution $\mu$ admits a decomposition of the form

$$X(\omega) = \sum_{k=1}^{K} \mathbf{1}_{A_k}(\omega)\, X_k(\omega), \tag{1.11}$$

where the r.v. $\{X_k, 1\le k\le K\}$ are independent and fulfill the condition (V2), and $A_k = \{\omega : \zeta(\omega) = k\}$, $k=1,\ldots,K$, are the events generated by some integer-valued r.v. $\zeta : \Omega \to \{1,\ldots,K\}$ independent of the family $\{X_k, 1\le k\le K\}$.
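The randomization in (V4) can be illustrated by a short sampling sketch. It assumes, for illustration only, that each component has a density proportional to $(t-a_k)^{\gamma_k}$ on $[a_k, b_k]$ (whose logarithmic derivative is $\gamma_k/(t-a_k)$); the function names, intervals, exponents and weights below are hypothetical and not taken from the text. The selector is indexed from 0 rather than from 1.

```python
import numpy as np

def sample_power_law(rng, a, b, gamma, size):
    """Inverse-CDF sampling from the density proportional to (t - a)**gamma on [a, b]."""
    u = rng.random(size)
    return a + (b - a) * u ** (1.0 / (gamma + 1.0))

def sample_randomized(rng, components, weights, size):
    """Draw X = sum_k 1_{A_k} X_k with A_k = {zeta = k}, in the spirit of (1.11).

    The selector zeta is drawn independently of the component samples X_k.
    """
    zeta = rng.choice(len(components), size=size, p=weights)
    draws = np.stack([sample_power_law(rng, a, b, g, size) for a, b, g in components])
    return draws[zeta, np.arange(size)], zeta

rng = np.random.default_rng(0)
components = [(0.0, 1.0, 1.0), (2.0, 3.0, 2.0)]  # (a_k, b_k, gamma_k), illustrative values
x, zeta = sample_randomized(rng, components, weights=[0.3, 0.7], size=10_000)
```

Conditionally on the value of the selector, each draw lies in the corresponding interval, which is the structure exploited when the analysis is reduced to a single supporting interval.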

1.2 Motivation: random N-particle Hamiltonians

The regularity properties of the conditional distribution of the sample mean, given the fluctuations $(\eta_i)$, are interesting in themselves, and may prove useful in a fairly broad context of mathematical statistics, but the main motivation for this work actually comes from the spectral theory of multi-particle random Hamiltonians studied in the Anderson [1] localization theory; for an introduction to this relatively new area of mathematical physics of disordered quantum systems, one can recommend a recent monograph [5].

[...] which cannot be derived from the standard, Wegner-type estimates (cf. [7]) used in the single-particle models. More to the point, one needs eigenvalue comparison estimates for pairs of random operators $H_1(\omega)$, $H_2(\omega)$ which are stochastically correlated in a very strong way: the same family of random variables which affects $H_1(\omega)$ also affects $H_2(\omega)$, and vice versa. As a result, no stricto sensu stochastic decoupling is possible in such models. However, the eigenvalue comparison analysis for $H_1(\omega)$ and $H_2(\omega)$ becomes much simpler, once the random field generating the potential in a finite domain (e.g., a finite lattice subset $Q \subset \mathbb{Z}^d$) is decomposed into a sum

$$V(x,\omega) = \xi_Q(\omega)\mathbf{1}_Q(x) + \eta_{Q,x}(\omega), \tag{1.12}$$

where

$$\xi_Q(\omega) = \frac{1}{\operatorname{card} Q} \sum_{x\in Q} V(x,\omega) \tag{1.13}$$

and

$$\eta_{Q,x}(\omega) = V(x,\omega) - \xi_Q(\omega). \tag{1.14}$$

The reason is that after conditioning on the random fluctuation field $\eta_{Q,x}(\omega)$, the random potential $V(x,\omega)$ becomes a sum of a nonrandom background potential $\eta_{Q,x}(\omega)$, ignored in the principal estimates, and of a random but spatially constant potential $\xi_Q(\omega)\mathbf{1}_Q(x)$. As an operator of multiplication by a constant, the latter commutes with all other operators involved, and this considerably simplifies the spectral analysis. See the details in [3]. Naturally, the crucial issue is the regularity of the conditional distribution of the sample mean $\xi_Q(\omega)$ given the fluctuations $\eta_{Q,x}(\omega)$. Our results show that this conditional distribution is Hölder continuous, and such regularity suffices for the methods of spectral theory of random operators; cf., e.g., [5].
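The decomposition (1.12) to (1.14) can be sketched numerically as follows. The potential values on a finite set $Q$ are represented here by a plain array, and the helper name `split_potential` is hypothetical; the point of the sketch is only that a spatially constant shift of the potential changes the mean and leaves the fluctuation field untouched.

```python
import numpy as np

def split_potential(v):
    """Split potential values on a finite set Q into (mean, fluctuations)."""
    v = np.asarray(v, dtype=float)
    xi_q = v.mean()        # spatially constant part, cf. (1.13)
    eta_q = v - xi_q       # fluctuation field, cf. (1.14)
    return xi_q, eta_q

rng = np.random.default_rng(1)
v = rng.random(8)                        # illustrative potential on card Q = 8 sites
xi_q, eta_q = split_potential(v)
shifted_xi, shifted_eta = split_potential(v + 0.25)   # add a constant to the potential
```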

2 Main results

Our goal is to analyze the case where the common probability distribution of the IID random variables $X_j$, $1\le j\le n$, is absolutely continuous, with probability density $\rho$, and the support $S = \operatorname{supp}\rho$ of the density is a finite union of intervals:

$$S \subset \bigcup_{k=1}^{K} J_k, \quad J_k = [a_k, b_k], \quad a_{k+1} \ge b_k. \tag{2.1}$$

With the help of the well-known randomization procedure (cf. [6]), namely, by making use of Corollary 3, one can reduce the analysis of densities supported by a union of intervals to the case of a single supporting interval, and this is what we shall do first.

The class of densities $\rho$ supported on an interval $[a,b]$ which do not vanish on its interior $(a,b)$ and have there a bounded logarithmic derivative,

$$0 < \rho_* \le \rho(t) \le \overline{\rho} < \infty, \qquad |\rho'(t)/\rho(t)| \le C, \tag{2.2}$$

was studied in [4], so we focus on the case where $\rho$ vanishes at one or both endpoints of the support. To cover in one argument all possible situations, one can always decompose each interval $J = [a,b]$ where $\rho(a) = \rho(b) = 0$ into the union $[a,c]\cup[c,b]$, $c = (a+b)/2$, so on each of the resulting sub-intervals $\rho$ vanishes at exactly one endpoint. The two cases are equivalent, since we can make on $[c,b]$ the change of variable $t \mapsto b + c - t$, thus getting a density vanishing at the left edge $c$. Again, randomization allows one to study separately densities with the supports $[a,c]$ and $[c,b]$.

In the next section, we perform such an analysis for a density vanishing at exactly one of the endpoints of the supporting interval. The reader will see, however, that the probabilistic bounds stemming from our analysis become slightly better for densities vanishing at both edges of the support (cf. Remark 2).

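Theorem 1. Assume that the common probability distribution of IID random variables $X_1, \ldots, X_n$ admits a probability density $\rho$, with $\|\rho\|_\infty = \overline{\rho} < +\infty$, satisfying the following conditions:

(i) $\operatorname{supp}\rho$ is a bounded interval:

$$\operatorname{supp}\rho = [a, a+\ell]; \tag{2.3}$$

(ii) $\rho$ vanishes at $a$ and admits the upper bound

$$\rho(t) \le C(t-a)^{\gamma}, \quad \gamma > 0; \tag{2.4}$$

(iii) the logarithmic derivative of $\rho$ admits the upper bound

$$\forall\, t\in(a,a+\ell) \quad \rho'(t)/\rho(t) \le C(t-a)^{-1}. \tag{2.5}$$

Then for any $A>1$ and $\alpha\in(0,1)$ there exist constants $C', C'' \in (0,+\infty)$, depending upon $A$, $\alpha$, $\ell$ and the density $\rho$, such that for any $0 \le s \le C'' n^{-\frac{1}{(A-1)\alpha}}$ and any $\mathfrak{F}_\eta$-measurable random variable $\lambda$, one has, with $I_s(\omega) := [\lambda(\omega), \lambda(\omega)+s]$:

$$\mathbb{P}\{\xi_n(\omega)\in I_s(\omega)\} \le \min\left[C\,\overline{\rho}\, n\, s^{\alpha(2+\gamma)} + C'\sqrt{n}\, s^{1-A\alpha},\ 1\right]. \tag{2.6}$$

The explicit form of the RHS in (2.6) shows that (V2) gives rise to the property (RCM).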

2.1 Optimization of the Hölder exponents

In applications to the eigenvalue concentration estimates for random Hamiltonians, one often has $s \le e^{-n^{\beta}}$, $\beta>0$, so that $s$ decays as $n\to\infty$ much faster than any power-law function $n \mapsto n^{-B}$, $0<B<\infty$. From this perspective, a pre-factor polynomial in $n$ is essentially negligible compared to $s$, and the exponent $b$ of $s$ figuring in (RCM) is of much greater importance. To balance the contributions from the two terms in the RHS of (2.6), let us find the optimal value (if it exists) $\alpha = \alpha_\gamma$ as a solution of the equation

$$1 - A\alpha = \alpha(\gamma+2) \iff \alpha_\gamma = \frac{1}{A+2+\gamma}, \tag{2.7}$$

resulting in a simpler probability bound

$$\mathbb{P}\{\xi\in I_s(\eta)\} \le C''(n)\, s^{1-\frac{A}{A+2+\gamma}} = C''(n)\, s^{\frac{2+\gamma}{A+2+\gamma}} \tag{2.8}$$

with $C'(n)$, $C''(n)$ polynomial in $n$. Comparing to the optimal bound for the probability densities with regular (bounded) logarithmic derivative from [4],

$$\mathbb{P}\{\xi\in I_s(\eta)\} \le C(n)\, s^{2/3}, \tag{2.9}$$

we see that the above exponent $2/3$ can be improved by a judicious choice of $A$, viz.

$$1 < A < A_\gamma = 1 + \frac{1}{2}\gamma, \tag{2.10}$$

and the closer $A$ is to $1$, the better. For example, with $\gamma=1$ (linear decay of the density at the edges) and $A = 4/3$, we obtain

$$\mathbb{P}\{\xi\in I_s(\eta)\} \le C(n)\, s^{9/13} \ll C(n)\, s^{2/3} \tag{2.11}$$

for $s$ small enough.
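The arithmetic of this optimization can be checked with exact rational numbers. The sketch below solves the balance equation (2.7) and evaluates the resulting exponent of $s$; the function names are hypothetical, and only the formulas (2.7) and (2.8) are taken from the text.

```python
from fractions import Fraction

def balanced_alpha(A, gamma):
    """Solution of 1 - A*alpha = alpha*(gamma + 2), cf. (2.7)."""
    return Fraction(1) / (A + 2 + gamma)

def holder_exponent(A, gamma):
    """Exponent of s after balancing the two terms of (2.6), cf. (2.8)."""
    return 1 - A * balanced_alpha(A, gamma)

A, gamma = Fraction(4, 3), Fraction(1)
alpha = balanced_alpha(A, gamma)
b = holder_exponent(A, gamma)
print(alpha, b)  # 3/13 9/13
```

For $\gamma = 1$ the exponent decreases as $A$ grows and equals $2/3$ exactly at $A = 3/2$, so the value $9/13$ obtained for $A = 4/3$ is indeed larger than $2/3$.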

Furthermore, the optimal value $b = \frac{2+\gamma}{A+2+\gamma}$ for the exponent $b$ figuring in (RCM) approaches $1$ as $\gamma \to +\infty$. In view of Remark 2, the final bound becomes even stronger whenever $\rho$ vanishes at both edges of its support.

This corroborates the intuitive idea that vanishing of the density at some edges of its support should enhance the regularity bound for the conditional distribution of the sample mean.

2.2 Extension to the case of random variables with non-identical expectations

By a simple change of variables, the assertion of Theorem 1 can be adapted to independent random variables with different expectations.

Corollary 1. Suppose that the random variables $X_1,\ldots,X_n$ fulfill the hypotheses of Theorem 1 (hence, they have the property (RCM)). Pick any real numbers $a_1,\ldots,a_n$. Then the random variables $\widetilde{X}_i := X_i + a_i$, $i=1,\ldots,n$, also have the property (RCM).

Indeed, arbitrary shifts $X_i \mapsto X_i + a_i$ result only in a translation of the sample mean $\xi_n$, even with conditioning on $\mathfrak{F}_\eta$, but the upper bounds on the logarithmic derivative of the conditional density of $\xi_n$ given $\mathfrak{F}_\eta$ remain unaffected. A direct inspection of the proof of Theorem 1 in Section 5 shows that this suffices for the main assertion to hold true.

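2.3 Sufficiency of the condition (V3)

Corollary 2 ((V3)). The condition (V3) implies (RCM).

Proof. By (V3), one can write $X = X^{(1)} + X^{(2)}$ with independent r.v. $X^{(1)}$, $X^{(2)}$ having the distributions $\mu_1$ and $\mu_2$, respectively. Let $\mathfrak{F}^{(2)}$ be the sigma-algebra generated by $X^{(2)}$. The conditional distribution of $X$ given $X^{(2)}$ differs from that of $X^{(1)}$ (i.e., from the measure $\mu_1$) only by a shift (nonrandom, conditional on $\mathfrak{F}^{(2)}$). Thus we have

$$\begin{aligned}\mathbb{P}\{\xi_n(\omega)\in I_s(\omega)\} &= \mathbb{E}\big[\mathbb{P}\{\xi_n(\omega)\in I_s(\omega)\mid \mathfrak{F}^{(2)}\}\big] \\ &\le \operatorname{ess\,sup}\, \mathbb{P}\{\xi_n(\omega)\in I_s(\omega) \mid \mathfrak{F}^{(2)}\} \le C n^{a} s^{b},\end{aligned} \tag{2.12}$$

where the last inequality follows from Theorem 1, since $\mu_1$ fulfills (V2).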

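2.4 Sufficiency of the condition (V4)

Corollary 3 ((V4)). The condition (V4) implies (RCM).

Proof. Introduce the sigma-algebra $\mathfrak{F}_\zeta$ generated by the random vector $\zeta = \{\zeta_i, 1\le i\le n\}$. The probability distribution of $\{X_i, 1\le i\le n\}$ conditional on $\mathfrak{F}_\zeta$ is concentrated on parallelepipeds

$$J_{\mathbf{k}} = \mathop{\times}\limits_{i=1}^{n} J_{k_i}, \quad J_{k_i} = [a_{k_i}, b_{k_i}], \tag{2.13}$$

with random multi-indices $\mathbf{k}(\omega) = (k_1(\omega),\ldots,k_n(\omega))$ determined by $\{\zeta_i, 1\le i\le n\}$.

Let $\mathbb{P}_{\mathbf{k}}\{\cdot\}$ be the conditional probability measure, given the event $\{\zeta(\omega) = \mathbf{k}\}$, $\mathbb{E}_{\mathbf{k}}[\cdot]$ the respective expectation, and $p_{\mathbf{k}} = \mathbb{P}\{\zeta(\omega) = \mathbf{k}\}$. Then we have

$$\begin{aligned} \mathbb{P}\{\xi_n \in I_s(\eta)\} &= \sum_{\mathbf{k}\in\mathcal{K}} p_{\mathbf{k}}\, \mathbb{P}_{\mathbf{k}}\{\xi_n\in I_s(\eta)\} \\ &= \sum_{\mathbf{k}\in\mathcal{K}} p_{\mathbf{k}}\, \mathbb{E}_{\mathbf{k}}\big[\mathbb{P}_{\mathbf{k}}\{\xi_n\in I_s(\eta) \mid \mathfrak{F}_\eta\}\big] \\ &\le \max_{\mathbf{k}\in\mathcal{K}} \mathbb{E}_{\mathbf{k}}\big[\mathbb{P}_{\mathbf{k}}\{\xi_n\in I_s(\eta) \mid \mathfrak{F}_\eta\}\big] \le C n^{a} s^{b}, \end{aligned} \tag{2.14}$$

since $\sum_{\mathbf{k}} p_{\mathbf{k}} = 1$, each term in the last RHS sum is bounded by $C_{\mathbf{k}} n^{a} s^{b}$, by virtue of Theorem 1 combined with Corollary 1, and the number of terms is finite.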

3 Basic geometrical objects and notations

Let a real number $\ell>0$ and an integer $n\ge2$ be given. Consider a sample of $n$ IID random variables with uniform distribution $\mathrm{Unif}([0,\ell])$, and introduce again the sample mean $\xi=\xi_n$ and the "fluctuations" $\eta_i$ relative to the mean:

$$\xi_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \eta_i = X_i - \xi_n. \tag{3.1}$$

Further, consider the $n$-dimensional Euclidean space of real linear combinations of the random variables $X_i$. Clearly, the variables $\eta_i : \mathbb{R}^n\to\mathbb{R}$ are invariant under the group of translations

$$(X_1,\ldots,X_n) \mapsto (X_1+t,\ldots,X_n+t), \quad t\in\mathbb{R}, \tag{3.2}$$

and so are their differences $\eta_i - \eta_j \equiv X_i - X_j$, $1\le i<j\le n$. Introduce the variables

$$Y_i = \eta_i - \eta_n, \quad 1\le i\le n-1. \tag{3.3}$$

Then the space $\mathbb{R}^n$ is stratified into a union of affine lines of the form

$$\begin{aligned} \mathcal{L}(Y) &:= \{X\in\mathbb{R}^n : \eta_i-\eta_n = Y_i,\ i\le n-1\} \\ &= \{X\in\mathbb{R}^n: X_i - X_n = Y_i,\ i \le n-1\}, \end{aligned} \tag{3.4}$$

labeled by the elements $Y = (Y_1,\ldots,Y_{n-1})$ of the $(n-1)$-dimensional real vector space $\mathbb{Y}^{n-1} \cong \mathbb{R}^{n-1}$ orthogonal to the vector $(1,\ldots,1)$. Denote

$$\begin{aligned}\mathcal{X}(Y) &= \mathcal{L}(Y)\cap[0,\ell]^n \\ &= \{X\in[0,\ell]^n : X_i - X_n = Y_i,\ i\le n-1\}\end{aligned} \tag{3.5}$$

and equip each nonempty interval $\mathcal{X}(Y)\subset\mathbb{R}^n$ with the structure of a probability space inherited from $\mathbb{R}^n$:

• if $|\mathcal{X}(Y)| = r > 0$, then we use the inherited structure of an interval of a one-dimensional affine line and the normalized measure with constant density $r^{-1}$ with respect to the inherited Lebesgue measure on $\mathcal{X}(Y)$.

The transformation $X \mapsto (\xi_n, \eta_1,\ldots,\eta_{n-1})$ is non-degenerate, but not orthogonal. We will have to work with the metric on $\mathcal{X}(Y)$ induced by the standard Riemannian metric in the ambient space $\mathbb{R}^n$; to this end, introduce an orthogonal coordinate transformation in $\mathbb{R}^n$, $X\mapsto(\tilde\xi_n,\tilde\eta_1,\ldots,\tilde\eta_{n-1})$, such that

$$\tilde\xi_n = n^{-1/2}\sum_{i=1}^{n} X_i = n^{1/2}\xi_n; \tag{3.6}$$

the exact form of $\tilde\eta_j$, $j=1,\ldots,n-1$, is of no importance for our analysis, provided that the transformation is orthogonal.

Remark 1. For later use, note that, due to (3.6), each of the re-scaled variables $n^{1/2}X_i$ can serve as the normalized length parameter on the elements $\mathcal{X}(Y)$. Along an element $\mathcal{X}(Y)$, one can simultaneously parameterize $\tilde\xi$ and the variables $X_i$, by setting $\tilde\xi(t) = c_0 + t$, $X_j(t) = c_j + n^{-1/2}t$, with arbitrarily chosen constants $c_j$. Here, $\tilde\xi_n$ is a natural length parameter on $\mathcal{X}(Y)$, since the transformation $X\mapsto(\tilde\xi_n,\tilde\eta_1,\ldots,\tilde\eta_{n-1})$ is orthogonal.
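The parameterization described in Remark 1 can be verified numerically. The sketch below moves a point by Euclidean length $t$ along the direction $(1,\ldots,1)$ and checks how the rescaled mean, the coordinates and the fluctuations respond; the dimension, the step and the helper names are illustrative assumptions.

```python
import numpy as np

def xi_tilde(x):
    """Rescaled mean n^{-1/2} * sum(X_i), cf. (3.6)."""
    return x.sum() / np.sqrt(x.size)

def fluctuations(x):
    """Fluctuations eta_i = X_i - xi_n."""
    return x - x.mean()

n, t = 4, 0.3
rng = np.random.default_rng(2)
x0 = rng.random(n)
direction = np.ones(n) / np.sqrt(n)   # unit vector spanning every line of constant fluctuations
x1 = x0 + t * direction               # displacement of Euclidean length t along the line
```

Under this displacement the rescaled mean increases by exactly $t$, every coordinate increases by $n^{-1/2}t$, and the fluctuations do not change.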

4 Probability of short intervals

In this section, we assume that $\operatorname{supp}\rho = [0,\ell]$ and the logarithmic derivative $\rho'/\rho$ is well-defined on the open interval $(0,\ell)$. In addition, we assume a power-law decay of $\rho$ at the edge $0$:

$$\rho(t)\le C_\gamma t^{\gamma}, \quad \gamma>0. \tag{4.1}$$

Later, we will complement this upper bound with a certain regularity of the edge decay (cf. (2.5)).

By the change of variable $t\mapsto \ell - t$, the obtained bounds will apply to the densities vanishing at the right edge of the support.

In the following preparatory lemma, we use only the boundedness of $\rho$ and the decay condition (4.1); smoothness of the density is irrelevant for this intermediate result.

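Lemma 1. Assume that the random variables $X_1,\ldots,X_n$, $n\ge2$, are IID and admit a bounded density $\rho$ supported by an interval $[0,\ell]$, $\ell>0$, with $\|\rho\|_\infty = \overline{\rho}<\infty$. Furthermore, assume that $\rho$ vanishes at $0$ and satisfies (4.1). Then for all $r\in(0,\ell/2]$ one has

$$\mathbb{P}\{|\mathcal{X}(Y)| < r\} \le C_\gamma\,\overline{\rho}\, n\, r^{2+\gamma}. \tag{4.2}$$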

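Proof. Let

$$\underline{X} = \underline{X}(X) = \min_i X_i, \qquad \overline{X} = \overline{X}(X) = \max_i X_i. \tag{4.3}$$

While $\underline{X}(X)$ and $\overline{X}(X)$ vary along the elements $\mathcal{X}(Y)$, their difference $\overline{X}(X) - \underline{X}(X)$ does not; it is uniquely determined by $\mathcal{X}(Y)$.

According to Remark 1, each variable $n^{1/2}X_i$, $i=1,\ldots,n$, restricted to $\mathcal{X}(Y)$, can serve as a length parameter on $\mathcal{X}(Y)$, compatible with the metric induced by the Euclidean distance in the ambient space $\mathbb{R}^n$. Thus the range of each $n^{1/2}X_i|_{\mathcal{X}(Y)}$ is an interval of length $|\mathcal{X}(Y)|$. One can increase (resp., decrease), e.g., the value of $X_1$, as long as all $\{X_i, 1\le i\le n\}$ are strictly smaller than $\ell$ (resp., strictly positive). Therefore, the maximum increment of $X_1$ (indeed, of any $X_i$) along $\mathcal{X}(Y)$ is given by $\ell - \overline{X}(X)$, and its maximum decrement equals $\underline{X}(X)$, so the range of the normalized length parameter $n^{1/2}X_1$ along $\mathcal{X}(Y(X))$ is an interval of length

$$|\mathcal{X}(Y(X))| = n^{1/2}\big(\ell - \overline{X}(X) + \underline{X}(X)\big). \tag{4.4}$$

Both $\underline{X}(X)$ and $\ell-\overline{X}(X)$ are non-negative, so one has the implication

$$|\mathcal{X}(Y)| = \underline{X} + (\ell - \overline{X}) < t \implies \max\{\underline{X},\ \ell-\overline{X}\} < t. \tag{4.5}$$

With $0\le t\le \ell/2$, $(\ell - X_i < t)$ implies $(X_i > t)$, thus denoting

$$A_{ij}(t) := \{X_i<t\}\cap\{\ell-X_j<t\}, \tag{4.6}$$

we have, for any $i$,

$$A_{ii}(t) = \{X_i<t\}\cap\{\ell - X_i<t\} = \varnothing. \tag{4.7}$$

Therefore,

$$\big\{\max\{\underline{X},\ \ell-\overline{X}\}<t\big\} \subset \bigcup_{i\ne j}\big\{X_i<t,\ \ell-X_j<t\big\}. \tag{4.8}$$

Thus the union $\bigcup_{i\ne j}A_{ij}(t)$ contains all the samples $X$ with $|\mathcal{X}(Y)|<t$.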

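By hypothesis, for any $i\ne j$, the random variables $X_i$, $X_j$ are independent, so

$$\mathbb{P}\{A_{ij}(t)\} = \mathbb{P}\{X_i<t\}\cdot\mathbb{P}\{\ell-X_j<t\}. \tag{4.9}$$

Since $\|\rho\|_\infty=\overline{\rho}$, we have

$$\mathbb{P}\{\ell - X_j<t\}\le\overline{\rho}\, t, \tag{4.10}$$

and by (4.1),

$$\mathbb{P}\{X_i<t\}\le\int_0^t C_\gamma s^{\gamma}\,ds = \frac{C_\gamma t^{\gamma+1}}{\gamma+1}\le C_\gamma t^{\gamma+1}, \tag{4.11}$$

hence

$$\mathbb{P}\{A_{ij}(t)\}\le \overline{\rho}\, t\cdot C_\gamma t^{1+\gamma} = C_\gamma\overline{\rho}\, t^{2+\gamma}. \tag{4.12}$$

Consequently,

$$\begin{aligned}\mathbb{P}\{|\mathcal{X}(Y)|<r\} &= \mathbb{P}\big\{n^{1/2}\big((\ell-\overline{X}(X))+\underline{X}(X)\big)<r\big\} \\ &= \mathbb{P}\big\{(\ell-\overline{X}(X))+\underline{X}(X) < r n^{-1/2}\big\} \\ &\le \sum_{i\ne j}\mathbb{P}\big\{A_{ij}\big(rn^{-1/2}\big)\big\} \le n(n-1)\frac{C_\gamma\overline{\rho}\, r^{2+\gamma}}{(n^{1/2})^{2}} \le C_\gamma\overline{\rho}\, n\, r^{2+\gamma}.\end{aligned} \tag{4.13}$$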

This completes the proof.
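The bound (4.2) can be probed by simulation. The sketch below assumes, for illustration only, the density $\rho(t) = (\gamma+1)t^{\gamma}/\ell^{\gamma+1}$ on $[0,\ell]$, for which one may take $C_\gamma = (\gamma+1)/\ell^{\gamma+1}$ and $\overline{\rho} = (\gamma+1)/\ell$; the fibre length is computed from formula (4.4). Function names and parameter values are hypothetical.

```python
import numpy as np

def fibre_length(x, ell):
    """Length n^{1/2} * (ell - max_i X_i + min_i X_i) of the fibre through x, cf. (4.4)."""
    n = x.shape[-1]
    return np.sqrt(n) * (ell - x.max(axis=-1) + x.min(axis=-1))

def short_fibre_probability(n=3, r=0.3, gamma=1.0, ell=1.0, trials=200_000, seed=0):
    """Monte Carlo estimate of P{|X(Y)| < r} and the right-hand side of (4.2)."""
    rng = np.random.default_rng(seed)
    # Inverse-CDF sampling from the density (gamma+1) * t**gamma / ell**(gamma+1) on [0, ell].
    x = ell * rng.random((trials, n)) ** (1.0 / (gamma + 1.0))
    estimate = np.mean(fibre_length(x, ell) < r)
    c_gamma = (gamma + 1.0) / ell ** (gamma + 1.0)   # constant in rho(t) <= c_gamma * t**gamma
    rho_bar = (gamma + 1.0) / ell                    # sup norm of this density
    bound = c_gamma * rho_bar * n * r ** (2.0 + gamma)
    return float(estimate), float(bound)

estimate, bound = short_fibre_probability()
print(f"estimate = {estimate:.4f}, bound = {bound:.4f}")
```

For this particular density the estimate should fall well below the bound, which is consistent with (4.2) being a union bound with non-optimized constants.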

Remark 2. As was said in Section 2, in the case where $\rho$ vanishes at both edges of its support, $0$ and $\ell$, the probabilistic bounds for short intervals become stronger. For example, if $\rho(t)\le Ct^{\gamma}$ and $\rho(t)\le C'(\ell-t)^{\gamma'}$, we can replace (4.10) by a stronger bound $\mathbb{P}\{\ell-X_j<t\}\le C't^{1+\gamma'}$, and the resulting bound is

[...]
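A sketch of how such a bound could be obtained, under the assumption that the steps (4.9) to (4.13) are repeated with the stronger estimate in place of (4.10). The constants and the exact form below are a reconstruction for illustration, not a formula quoted from the original text:

```latex
\mathbb{P}\{A_{ij}(t)\} \le C t^{1+\gamma} \cdot C' t^{1+\gamma'} = C C'\, t^{2+\gamma+\gamma'},
\qquad
\mathbb{P}\{|\mathcal{X}(Y)| < r\}
  \le n(n-1)\, C C' \big(r n^{-1/2}\big)^{2+\gamma+\gamma'}
  \le C C'\, n\, r^{2+\gamma+\gamma'} .
```

Compared with (4.2), the exponent of $r$ would increase from $2+\gamma$ to $2+\gamma+\gamma'$.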

5 Regularity bounds for densities with singular logarithmic derivative

In order to assess the conditional regularity of the sample mean on typical (viz., not too short) linear elements $\mathcal{X}(Y)$, we need to complement the upper bound (4.1) with some assumption regarding the regularity of the decay of $\rho(t)$ as $t\searrow0$. Having in mind first of all applications to realistic physical models of disorder, we could restrict ourselves to the case where $\rho(t)\sim Ct^{\gamma}$, but it actually turns out that one can treat a more general situation where the logarithmic derivative fulfills the condition (2.5), or even a condition of the form

$$\big\|(\ln\rho)'\,\mathbf{1}_{(0,\ell)}\big\|_\infty \le \frac{C}{t^{B}} \tag{5.1}$$

for some $B\in(0,+\infty)$. For brevity, we consider only the case $B=1$, covering all possible rates $\gamma>0$ of power-law decay at the edge of the support.

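Proof of Theorem 1. Without loss of generality, it suffices to prove the claim in the particular case where $\operatorname{supp}\rho=[0,\ell]$, which we assume below.

Fix some $A>1$ and $\alpha\in(0,1)$.

Further, fix an element $\mathcal{X}(Y)$ with $|\mathcal{X}(Y)|\ge s^{\alpha}$; the probability to have an element $\mathcal{X}(Y)$ with $|\mathcal{X}(Y)|<s^{\alpha}$ is upper-bounded in (5.15), with the help of Lemma 1.

Next, introduce a length parameter $t$ on $\mathcal{X}(Y)$, so we can identify $\mathcal{X}(Y)\cong\tilde J = [0,|\mathcal{X}(Y)|]$. As $t$ runs through $\tilde J$, the $i$-th coordinate of the point $x(t)\in\mathcal{X}(Y)$ runs through an interval $\tilde J_i=[\tilde a_i,\tilde b_i]$, with $\tilde b_i-\tilde a_i = |\mathcal{X}(Y)|\,n^{-1/2}$.

Recall that it is not the sample mean $\xi_n$ but its rescaled counterpart $\tilde\xi_n = n^{1/2}\xi_n$ which gives a normalized length parameter on $\mathcal{X}(Y)$, so we re-write the key probability as follows:

$$\mathbb{P}\{\xi_n(\omega)\in[\lambda(\omega),\lambda(\omega)+s]\} = \mathbb{P}\{\tilde\xi_n(\omega)\in\tilde I_s(\omega)\}, \tag{5.2}$$

where

$$\tilde I_s(\omega) = \big[\tilde\lambda(\omega),\ \tilde\lambda(\omega)+s\sqrt{n}\big], \quad \tilde\lambda(\omega)=\sqrt{n}\,\lambda(\omega), \tag{5.3}$$

so that $|\tilde I_s(\omega)| = s\sqrt{n}$.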

Denote by $p(t)$ the product density $\rho(x_1(t))\cdots\rho(x_n(t))$ at $x(t) = (\tilde b_1-t,\ldots,\tilde b_n-t)\in\mathcal{X}(Y)$. We have

$$p(t)=\prod_{j=1}^{n}\rho(\tilde b_j-t), \tag{5.4}$$

and therefore,

$$\frac{p(t)}{p(0)}=\prod_{j=1}^{n}\frac{\rho(\tilde b_j-t)}{\rho(\tilde b_j)}. \tag{5.5}$$

The function $t\mapsto p(t)$ gives the conditional density induced on $\mathcal{X}(Y)$, up to a normalizing factor $1/Z_Y$, with

$$Z_Y=\int_0^{|\mathcal{X}(Y)|}p(r)\,dr. \tag{5.6}$$

To upper-bound the maximum of the conditional density, we have to lower-bound the integral $Z_Y$.

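Note that for $t$ small enough, the ratio in the RHS of (5.5) is close to $1$; to see this, use the hypothesis

$$(\ln\rho(x))'\le Cx^{-1}. \tag{5.7}$$

Recall that we assumed $|\mathcal{X}(Y)|\ge s^{\alpha}$, and by the hypothesis of the Theorem,

$$s\le n^{-\frac{1}{(A-1)\alpha}}. \tag{5.8}$$

For $x_j=\tilde b_j-t$ with $t\in[0,\frac{1}{2}s^{\alpha}]$, we have $x_j-\tilde a_j\ge\frac{1}{2}s^{\alpha}$, so $|(\ln\rho(x_j))'|\le Cs^{-\alpha}$. In particular, for $t\in[0,\epsilon_s]$, $\epsilon_s=s^{A\alpha}$, $A>1$, the mean value theorem provides a point $c_j\in[\tilde b_j-t,\tilde b_j]$ such that

$$|\ln\rho(\tilde b_j)-\ln\rho(\tilde b_j-t)| = |t|\,\frac{|\rho'(c_j)|}{\rho(c_j)} \le \epsilon_s\cdot\frac{C}{\frac{1}{2}s^{\alpha}}\le C_1s^{\alpha(A-1)}\le C_2n^{-1}. \tag{5.9}$$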

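Further,

$$\begin{aligned}|\ln p(0)-\ln p(t)| &\le n\max_j|\ln\rho(\tilde b_j)-\ln\rho(\tilde b_j-t)| \\ &\le C_3\,n\,s^{\alpha(A-1)} = O(1),\end{aligned} \tag{5.10}$$

hence for such $t$,

$$\left(\frac{p(0)}{p(t)}\right)^{\pm1} = e^{\pm(\ln p(0)-\ln p(t))} = e^{O(1)}\ge C_4>0. \tag{5.11}$$

Therefore,

$$Z_Y\ge C_4\,p(0)\,\epsilon_s\ge\frac{1}{2}\,p(0)\,s^{A\alpha}, \tag{5.12}$$

so the conditional probability density on the segment $\mathcal{X}(Y)$ is uniformly bounded by an $s$-dependent constant:

$$\frac{p(t)}{Z_Y}\le\frac{C_5\,p(0)}{s^{A\alpha}\,p(0)}\le C_5\,s^{-A\alpha}, \tag{5.13}$$

yielding

$$\int_{\tilde I_s(\eta)}\frac{p(t)}{Z_Y}\,dt\le C_5\,s^{-A\alpha}\cdot|\tilde I_s(\eta)| = C_5\,n^{1/2}s^{1-A\alpha}. \tag{5.14}$$

Owing to Lemma 1, we know that

$$\mathbb{P}\{|\mathcal{X}(Y)|<s^{\alpha}\}\le C_6\,\overline{\rho}\,n\,s^{\alpha(2+\gamma)}. \tag{5.15}$$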

Collecting (5.14) and (5.15), the claim follows.
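The final step can be spelled out as follows. This is a sketch of how the two estimates combine, assuming that the event that the fibre has length at least $s^{\alpha}$ is determined by the fluctuations, so that it can be taken out of the conditional probability; the constants are named as in the proof:

```latex
\begin{aligned}
\mathbb{P}\{\xi_n \in I_s\}
 &\le \mathbb{P}\{|\mathcal{X}(Y)| < s^{\alpha}\}
  + \mathbb{E}\Big[\mathbf{1}_{\{|\mathcal{X}(Y)| \ge s^{\alpha}\}}\,
      \mathbb{P}\{\tilde\xi_n \in \tilde I_s \mid \mathfrak{F}_\eta\}\Big] \\
 &\le C_6\,\overline{\rho}\, n\, s^{\alpha(2+\gamma)} + C_5\, n^{1/2} s^{1-A\alpha}.
\end{aligned}
```

Since any probability is at most $1$, taking the minimum with $1$ gives a bound of the form (2.6).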

6 Open problems

It seems to be a very natural conjecture that the property (RCM) holds true for probability measures with bounded compactly supported probability density. However, operating without any additional assumption on the regularity of the bounded probability density (apart from its measurability and boundedness) would require new analytic ideas.

A more challenging problem concerns the validity of (RCM) for more general probability measures with Hölder continuous PDF. The author's discussions with a number of experts in probability and functional analysis seem to indicate that the question of the regularity of the conditional distribution of the sample mean on typical fibers

$$\{(x_1,\ldots,x_n):\ \eta_i=c_i,\ i=1,\ldots,n-1\}\subset\mathbb{R}^n \tag{6.1}$$

is far from obvious when $\mathbb{R}^n$ is endowed with the product measure $\mu^{\otimes n}$ with a Hölder continuous, but possibly purely [...]

7 Conclusion

We have proved that the regularity properties of the sample mean $\xi_n$, conditional on the sample "fluctuations" relative to $\xi_n$, generalizing the well-known property of Gaussian samples and proved earlier for a class of marginal densities with bounded logarithmic derivative, hold true for a much larger class of smooth densities which can vanish at some points in a sufficiently regular way. Moreover, we have shown that vanishing of the marginal density can only result in stronger probabilistic concentration bounds for the values of the sample mean. Intuitively, this seems quite natural, but technically, this was not quite clear from prior works based on global regularity of the logarithmic derivative of the density.

From the point of view of applications, the new result covers a large number of models where the probability distribution comes from physical laws or, more generally, from explicit calculations with elementary functions which are smooth in their domains of definition and usually feature a power-law decay at certain points. In particular, this covers many models of disorder in mathematical physics of multi-particle quantum systems, where it provides a crucial ingredient for the analysis of eigenvalue concentration in the presence of disorder and of a nontrivial interaction between particles (cf. [5] and references therein).

REFERENCES

[1] P. W. Anderson (1958). Absence of diffusion in certain random lattices. Phys. Rev., 109, 1492–1505.

[2] V. Chulaevsky & Y. Suhov (2008). Wegner bounds for a two-particle tight binding model. Commun. Math. Phys., 283, 479–489.

[3] V. Chulaevsky (2011). On resonances in disordered multi-particle systems. C. R. Acad. Sci. Paris, Ser. I, 350, 81–85.

[4] V. Chulaevsky (2015). On the regularity of the conditional distribution of the sample mean. Markov Proc. Rel. Fields, 21:3.

[5] V. Chulaevsky & Y. Suhov (2013). Multi-scale Analysis for Random Quantum Systems with Interaction. Boston: Birkhäuser.

[6] W. Feller (1966). An Introduction to Probability Theory and its Applications. New York: Wiley.