Non Parametric Inference

(1)

Maura Mezzetti

Department of Economics and Finance, Università Tor Vergata

(2)

Outline

1 Inverse distribution function (quantile function)

2 Nonparametric Inference

3 Robust Statistics

(3)

Inverse distribution function

Theorem: Let $U$ be a uniform random variable on $(0, 1)$. Let $X$ be a continuous random variable with cumulative distribution function (cdf) $F(x)$. Let $Y$ be defined as $Y = F^{-1}(U)$. Then $Y$ has cdf equal to $F$.
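The theorem is the basis of inverse transform sampling. The sketch below illustrates it for one distribution chosen here purely as an example, the Exponential(rate) distribution, whose quantile function is $F^{-1}(u) = -\ln(1 - u)/\text{rate}$. The function names and the seed are assumptions made for this illustration, not part of the original slides.

```python
import math
import random


def exponential_quantile(u, rate=1.0):
    """Inverse cdf of the Exponential(rate) distribution: F^{-1}(u) = -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate


def inverse_transform_sample(quantile, n, rng):
    """Draw n values Y = F^{-1}(U), with U uniform on [0, 1)."""
    return [quantile(rng.random()) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, used only for reproducibility
    ys = inverse_transform_sample(exponential_quantile, 100_000, rng)
    # The Exponential(1) mean is 1, so the sample mean should be close to 1.
    print(sum(ys) / len(ys))
```

Any other continuous distribution can be sampled the same way by passing its own quantile function in place of `exponential_quantile`.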

(6)

Why nonparametric statistics?

While in many situations parametric assumptions are reasonable (e.g. assumption of Normal distribution for the background noise), we often have no prior knowledge of the underlying distributions.

In such situations, the use of parametric statistics can give misleading or even wrong results. We need statistical procedures which are insensitive to the model assumptions in the sense that the procedures retain their properties in the neighborhood of the model assumptions.

(7)

What is the nonparametric inference?

The basic idea of nonparametric inference is to use data to infer an unknown quantity while making as few assumptions as possible.

Usually, this means using statistical models that are infinite-dimensional. Indeed, a better name for nonparametric inference might be infinite-dimensional inference. But it is difficult to give a precise definition of nonparametric inference. For the purposes of this course, we will use the phrase nonparametric inference to refer to a set of modern statistical methods that aim to keep the underlying assumptions as few and as weak as possible.

(8)

What is the advantage of nonparametric statistics?

The rapid and continuous development of nonparametric statistical procedures over the past six decades is due to the following advantages enjoyed by nonparametric techniques:

They require few assumptions about the underlying populations from which the data are obtained.

They enable the user to obtain exact p-values for tests, exact coverage probabilities for confidence regions, and exact experimentwise error rates for multiple comparison procedures.

They are (often) easy to understand.

Usually they are only slightly less efficient than their normal-theory competitors when the underlying populations are normal, and they can be mildly or wildly more efficient than these competitors when the underlying populations are not normal.

They are relatively insensitive to outliers.

(9)

What is the advantage of nonparametric statistics?

Because many nonparametric approaches require just the ranks of the observations, rather than the actual magnitude of the observations, they are applicable in many situations where normal theory procedures cannot be utilized.

(10)

The empirical distribution function

We will begin with the problem of estimating a CDF (cumulative distribution function)

Suppose $X \sim F$, where $F(x) = P(X \le x)$ is a distribution function

The empirical distribution function, $\hat F$, is the CDF that puts mass $1/n$ at each data point $x_i$:

$$\hat F(x) = \frac{1}{n} \sum_{i=1}^{n} I(x_i \le x)$$

where $I$ is the indicator function
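The definition of the empirical distribution function translates almost directly into code. The following is a minimal sketch; the name `ecdf` and the use of `bisect_right` on a sorted copy of the data are choices made for this illustration.

```python
from bisect import bisect_right


def ecdf(data):
    """Return the empirical distribution function of data as a callable."""
    xs = sorted(data)
    n = len(xs)

    def f_hat(x):
        # bisect_right counts the observations x_i with x_i <= x.
        return bisect_right(xs, x) / n

    return f_hat


if __name__ == "__main__":
    f_hat = ecdf([3, 1, 2])
    print(f_hat(0), f_hat(2), f_hat(3))
```

Sorting once and then using binary search keeps each evaluation at logarithmic cost, which matters when the function is evaluated at many points.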

(11)

Properties of $\hat F$

At any fixed value of $x$,

$$E(\hat F(x)) = F(x), \qquad \mathrm{Var}(\hat F(x)) = \frac{1}{n} F(x)(1 - F(x))$$

Note that these two facts imply (by Chebyshev's inequality) that $\hat F(x) \xrightarrow{P} F(x)$

An even stronger form of convergence is given by the Glivenko-Cantelli Theorem:

$$\sup_x |\hat F(x) - F(x)| \xrightarrow{a.s.} 0$$
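The mean and variance formulas can be checked by simulation. The sketch below assumes Uniform(0, 1) data, so that $F(x) = x$ on the unit interval; the function names, sample size, and number of replications are illustrative choices.

```python
import random


def ecdf_at(sample, x):
    """Value of the empirical distribution function at x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)


def simulate_ecdf_moments(x, n, reps, rng):
    """Monte Carlo mean and variance of F_hat(x) for Uniform(0, 1) samples of size n."""
    values = [ecdf_at([rng.random() for _ in range(n)], x) for _ in range(reps)]
    mean = sum(values) / reps
    var = sum((v - mean) ** 2 for v in values) / (reps - 1)
    return mean, var


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed, used only for reproducibility
    mean, var = simulate_ecdf_moments(x=0.3, n=50, reps=5000, rng=rng)
    # Theory: mean F(x) = 0.3 and variance F(x)(1 - F(x))/n = 0.3 * 0.7 / 50.
    print(mean, var, 0.3 * 0.7 / 50)
```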

(12)

Non parametric test

In order to be able to employ the test proposed below, we have to make the supplementary (but mild) assumption that $F$ is continuous. Thus the hypothesis to be tested here is

$H_0: F(x) = F_0(x)$ for all $x$, where $F_0$ is a given continuous d.f., against the alternative $H_1: F(x) \neq F_0(x)$ (in the sense that $F(x) \neq F_0(x)$ for at least one $x$). Define the random variable $D_n$ as

$$D_n = \sup_x |\hat F(x) - F_0(x)|$$
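Because the empirical distribution function is a step function and the hypothesized d.f. is continuous and nondecreasing, the supremum is attained at (or just before) one of the order statistics, so it can be computed exactly from the sorted sample. The sketch below assumes the reference d.f. is supplied as a Python callable; `ks_statistic` and `uniform_cdf` are names introduced for this illustration.

```python
def ks_statistic(sample, cdf):
    """Kolmogorov distance between the empirical d.f. of sample and a continuous d.f. cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # F_hat jumps from (i - 1)/n to i/n at x_(i); check both sides of the jump.
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


def uniform_cdf(x):
    """d.f. of the Uniform(0, 1) distribution, used here as an example reference d.f."""
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    print(ks_statistic([0.7, 0.1, 0.4], uniform_cdf))
```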

(13)

Kolmogorov test

Idea: If the difference between the sample and the theoretical distribution functions is large, the null hypothesis $H_0$ is rejected.

Statistic: The probability distribution of $D_n$ is not one of the well-known models. Its probabilities are given in a specific table for small $n$, while an asymptotic result is applied for large $n$.

Rule: Critical region of the form $D_n \ge k$, where the constant $k$ is determined by the chosen significance level.

(14)

Kolmogorov One-sample test

In order for this determination to be possible, we would have to know the distribution of $D_n$ under $H_0$, or of some known multiple of it. It has been shown in the literature that

$$P(\sqrt{n}\, D_n \le x \mid H_0) \xrightarrow[n \to \infty]{} \sum_{j=-\infty}^{\infty} (-1)^j e^{-2 j^2 x^2}, \quad x > 0$$

Thus for large $n$, the right-hand side of the previous equation may be used to determine the critical region. The test employed above is known as the Kolmogorov one-sample test.
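The limiting series can be evaluated numerically and inverted to obtain an asymptotic critical value. The sketch below truncates the series at a fixed number of terms and solves for the critical value by bisection; the truncation length, the bisection bracket, and the function names are assumptions made for this illustration, and the bracket is only suitable for conventional significance levels.

```python
import math


def kolmogorov_cdf(x, terms=100):
    """Truncated limiting d.f. of sqrt(n) * D_n under H0."""
    if x <= 0:
        return 0.0
    # The two-sided sum over j equals 1 + 2 * sum_{j >= 1} (-1)^j exp(-2 j^2 x^2).
    tail = sum((-1) ** j * math.exp(-2.0 * j * j * x * x) for j in range(1, terms + 1))
    return 1.0 + 2.0 * tail


def asymptotic_critical_value(alpha, lo=0.3, hi=3.0, tol=1e-10):
    """Solve kolmogorov_cdf(c) = 1 - alpha by bisection; reject H0 when sqrt(n) * D_n >= c."""
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kolmogorov_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    c = asymptotic_critical_value(0.05)
    # For a sample of size n, the approximate critical region is D_n >= c / sqrt(n).
    print(c, c / math.sqrt(100))
```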

(15)

Kolmogorov-Smirnov Two sample test

The hypothesis-testing problem just described is of limited practical importance. What arises naturally in practice are problems of the following type: Let $X_i$, $i = 1, \ldots, m$, be i.i.d. r.v.'s with continuous but unknown d.f. $F$, and let $Y_j$, $j = 1, \ldots, n$, be i.i.d. r.v.'s with continuous but unknown d.f. $G$. The two random samples are assumed to be independent, and the hypothesis of interest here is

$H_0: F = G$. One possible alternative is the following:

$H_1: F \neq G$
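The slides do not show the test statistic at this point. A common choice for this problem, assumed here, is the largest absolute difference between the two empirical distribution functions, $\sup_t |\hat F_m(t) - \hat G_n(t)|$. The sketch below computes it; the function name is hypothetical.

```python
from bisect import bisect_right


def ks_two_sample_statistic(xs, ys):
    """sup_t |F_hat_m(t) - G_hat_n(t)|, evaluated over the pooled observations."""
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    m, n = len(xs_sorted), len(ys_sorted)
    d = 0.0
    # Both empirical d.f.s are step functions that change only at data points,
    # so it is enough to compare them at every pooled observation.
    for t in xs_sorted + ys_sorted:
        f_m = bisect_right(xs_sorted, t) / m
        g_n = bisect_right(ys_sorted, t) / n
        d = max(d, abs(f_m - g_n))
    return d


if __name__ == "__main__":
    print(ks_two_sample_statistic([1, 3], [2, 4]))
```

Large values of the statistic speak against $H_0: F = G$, in the same spirit as the one-sample rule.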

(18)

Robustness

Any statistical procedure should possess the following desirable features:

It has reasonably good efficiency under the assumed model.

It is robust in the sense that small deviations from the model assumptions should impair the performance only slightly.

Somewhat larger deviations from the model should not cause a catastrophe.

(19)

Robustness

In addition to the classical concept of efficiency, new concepts are introduced to describe

the local stability of a statistical procedure (the influence function and derived quantities)

its global reliability or safety (the breakdown point).

(20)

Sample median

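$x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ denotes a sample in ascending order.

Definition. The (sample or empirical) median, denoted by Me, is given by

$$\mathrm{Me} = \begin{cases} x_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left( x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right) & \text{if } n \text{ is even} \end{cases}$$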
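The definition of the sample median in terms of order statistics can be sketched as follows. For even $n$ the code averages the two middle order statistics, which is the usual convention and is assumed here; the function name is chosen for this illustration.

```python
def sample_median(data):
    """Sample median computed from the order statistics of data."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]  # x_((n+1)/2) in 1-based indexing
    return (xs[mid - 1] + xs[mid]) / 2  # average of x_(n/2) and x_(n/2+1)


if __name__ == "__main__":
    clean = [1, 2, 3, 4, 5]
    contaminated = [1, 2, 3, 4, 1000]  # one gross outlier
    # The median is unchanged by the outlier, while the mean moves substantially.
    print(sample_median(clean), sample_median(contaminated))
    print(sum(clean) / len(clean), sum(contaminated) / len(contaminated))
```

The comparison with the mean in the usage lines illustrates the insensitivity to outliers that motivates robust procedures.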