
11.3 Sampling Distribution of $\bar{X}$

11.3.7 Sampling from Non-Normal distributions

In this case, the sampling distribution of $\bar{X}$ will not be normal. However, if we imagine that the sample size $n$ is allowed to increase without bound, so that $n \to \infty$, we can appeal to a famous theorem (more accurately, a collection of theorems) in probability theory called the Central Limit Theorem. This states that

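if $\bar{X}$ is obtained from a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$, then, irrespective of the distribution sampled,

$$\frac{\bar{X} - \mu}{SE(\bar{X})} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \to N(0, 1) \quad \text{as } n \to \infty.$$

That is, the probability distribution of $(\bar{X} - \mu) / SE(\bar{X})$ approaches the standard normal distribution as $n \to \infty$.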

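We interpret this as saying that

$$\frac{\bar{X} - \mu}{SE(\bar{X})} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1), \quad \text{approximately, for finite } n.$$

An alternative is to say that

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{approximately, for finite } n.$$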

The rate at which the standard normal distribution is approached influences the quality of the approximation. This is expected to improve as $n$ increases, and textbooks usually claim that the approximation is good enough if

$$n > 20 \quad \text{or} \quad n > 30.$$

The idea that $\bar{X}$ has an approximate normal distribution as $n \to \infty$ is often described as the large sample normality of the sample mean. The textbook claim here is that a “large” sample is one of at least 20 observations. This is not really reasonable as a general rule, but is adequate for use in a course like this.
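A rough way to see how the quality of the approximation depends on $n$ is to simulate it. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with rate 1 (so $\mu = 1$ and $\sigma = 1$), and the function name, seed and number of replications are hypothetical choices, not part of the text. It estimates $\Pr\left((\bar{X} - \mu)/(\sigma/\sqrt{n}) \le 0\right)$ and compares it with the standard normal value $0.5$.

```python
import random
from statistics import NormalDist


def prob_standardized_mean_below(z, n, reps=10_000, seed=0):
    """Estimate Pr((Xbar - mu) / (sigma / sqrt(n)) <= z) by simulation.

    ASSUMPTION: the population is exponential with rate 1, a skewed
    non-normal distribution with mu = 1 and sigma = 1.
    """
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0
    hits = 0
    for _ in range(reps):
        # Sample mean of n independent draws from the population.
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        if (xbar - mu) / (sigma / n ** 0.5) <= z:
            hits += 1
    return hits / reps


# Value the standardized mean would give if it were exactly N(0, 1).
target = NormalDist().cdf(0.0)

# Absolute approximation error at z = 0 for several sample sizes.
errors = {
    n: abs(prob_standardized_mean_below(0.0, n) - target)
    for n in (5, 30, 100)
}

for n, err in errors.items():
    print(f"n = {n:3d}: |simulated - normal| = {err:.4f}")
```

For a population this skewed, the error at $n = 5$ should be visibly larger than at $n = 100$; how large $n$ must be in practice depends on how far the population is from normal.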

So, in the IQ example above, we can argue that $\Pr(\bar{X} < 90) = 0.0062$ approximately if in fact IQ's are not normally distributed, but do have population mean 100 and population variance 400.
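The figure of 0.0062 can be reproduced with the normal approximation $\bar{X} \sim N(\mu, \sigma^2/n)$. The sample size is not restated in this section, so the sketch below assumes $n = 25$; that value is an assumption, chosen because it gives $SE(\bar{X}) = 20/5 = 4$ and a standardized value of $-2.5$, which matches the quoted probability.

```python
from math import sqrt
from statistics import NormalDist

mu = 100.0       # population mean (from the text)
sigma2 = 400.0   # population variance (from the text)
n = 25           # ASSUMPTION: sample size, not restated in this section

se = sqrt(sigma2 / n)                 # standard error of the sample mean
z = (90.0 - mu) / se                  # standardized value of 90
prob = NormalDist(mu, se).cdf(90.0)   # approximate Pr(Xbar < 90)

print(f"SE = {se}, z = {z}, Pr(Xbar < 90) = {prob:.4f}")
```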

Chapter 12

POINT ESTIMATION

It was stated in Section 10.2.1 that the objective of statistics is to learn about population characteristics or parameters. The values of these parameters are unknown to us, and generally, we will never discover their values exactly. The objective can then be restated as making (statistical) inferences on (unknown) population parameters, using data from a random sample from the population. We now know that this is equivalent to drawing a random sample from the probability distribution of a random variable $X$, producing sample random variables $X_1, \ldots, X_n$ which are mutually independent and have the same probability distribution as $X$. We can construct sample statistics like $\bar{X}$, and conceptually find their sampling distributions, and the characteristics or parameters of these sampling distributions.

12.1 Estimates and Estimators

12.1.1 Example

The basic principle of estimation of population parameters is very simple, and is best motivated by an example. Suppose that a random sample of size 3 is drawn from a population with mean $\mu$ and variance $\sigma^2$, both parameters being unknown. The values in the sample are

$$x_1 = 5, \quad x_2 = 20, \quad x_3 = 11$$

with

$$\bar{x} = \frac{5 + 20 + 11}{3} = 12.$$

Then, the (point) estimate of $\mu$ is $\bar{x} = 12$.


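Similarly, we can calculate the sample variance $s^2$ from

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right).$$

Here,

$$\sum_{i=1}^{n} x_i^2 = 25 + 400 + 121 = 546,$$

so that

$$s^2 = \frac{1}{2} \left( 546 - (3)(144) \right) = \frac{114}{2} = 57.$$

Then, the (point) estimate of $\sigma^2$ is $s^2 = 57$.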
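The arithmetic in this example can be checked in a few lines. The sketch below computes the sample variance both from the deviations and from the computational formula with $\sum x_i^2$; the variable names are illustrative only.

```python
sample = [5, 20, 11]   # sample values from the example
n = len(sample)

xbar = sum(sample) / n                         # (5 + 20 + 11) / 3 = 12
sum_sq = sum(x ** 2 for x in sample)           # 25 + 400 + 121 = 546

# Definition: sum of squared deviations divided by n - 1.
s2_definition = sum((x - xbar) ** 2 for x in sample) / (n - 1)

# Computational form: (sum of squares - n * xbar^2) / (n - 1).
s2_shortcut = (sum_sq - n * xbar ** 2) / (n - 1)

print(xbar, sum_sq, s2_definition, s2_shortcut)   # 12.0 546 57.0 57.0
```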

12.1.2 Principle of Estimation

The principle appears to be that of estimating a population characteristic by its corresponding sample version. However, there is another important principle here. We are calculating numerical estimates of the population parameters using the sample values $\bar{x}$ and $s^2$ of what have previously been called sample statistics like $\bar{X}$ and $S^2$. The latter are random variables, and have probability distributions; $\bar{x}$ and $s^2$ are values of these random variables. Additional terminology is required to make sure that this distinction is preserved when using the language of estimation of population parameters:

an estimator is the sample statistic;

an estimator is the random variable which is a function of the sample random variables $X_1, \ldots, X_n$;

an estimate is the value of the sample statistic;

an estimate is the statistic or number calculated from the sample values $x_1, \ldots, x_n$.

Another aspect is that

an estimator is a random variable and hence has a probability distribution


In this course, we have encountered a number of population parameters: mean, variance, proportion, covariance, correlation. The following table displays the parameters, their estimators and their estimates.

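| Population Parameter | Estimator | Estimate |
|---|---|---|
| mean: $\mu$ | $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ |
| variance: $\sigma^2$ | $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ | $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ |
| covariance: $\sigma_{XY}$ | $S_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$ | $s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ |
| correlation: $\rho$ | $R = \frac{S_{XY}}{S_X S_Y}$ | $r = \frac{s_{XY}}{s_X s_Y}$ |
| proportion: $\pi$ | $P$ | $p$ |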

Most of the quantities in this table have appeared in the discussion of descriptive statistics in Section 2.3. The (sample) correlation coefficient $R$ is shown explicitly as a function of the sample covariance and the sample variances to emphasise that the estimator of the population correlation coefficient is derived from other estimators. Out of the quantities in this table, we shall be concerned almost exclusively with the behaviour of the sample mean $\bar{X}$. The sample variance $S^2$ will also appear in a minor role, but none of the other quantities will concern us further.

One other small detail to explain in this table is the absence of an expression for the estimator or estimate of the population proportion. This is because the sample mean is the required quantity, when sampling from a population of zeros and ones or equivalently from the distribution of a Bernoulli random variable - see Section 11.2.1 above. The possible sample values for $\bar{x}$ are

$$0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}, 1;$$

these are precisely the proportions of 1's in the sample, and are thus values of the sample proportion.
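A small sketch makes the point concrete: for a sample of zeros and ones, the sample mean and the proportion of 1's are the same number. The particular sample below is hypothetical, and exact fractions are used so that the comparison does not depend on floating-point rounding.

```python
from fractions import Fraction

sample = [1, 0, 1, 1, 0]   # hypothetical 0/1 sample, n = 5
n = len(sample)

xbar = Fraction(sum(sample), n)    # sample mean
p = Fraction(sample.count(1), n)   # proportion of 1's in the sample

# Every value the sample mean can take when n = 5: 0, 1/5, ..., 4/5, 1.
possible_values = [Fraction(k, n) for k in range(n + 1)]

print(xbar, p, xbar in possible_values)   # 3/5 3/5 True
```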

12.1.3 Point Estimation

The adjective “point” seems to play no role in the discussion. It serves mainly to distinguish the ideas from another concept of estimation called interval estimation, which will be discussed shortly. The distinction is simply that a point estimate is a single number, whilst an interval estimate is an interval of numbers.


12.2 Properties of Estimators

The discussion of the sampling distribution of the sample mean $\bar{X}$ in Section 11.3 can be carried over to the case where $\bar{X}$ is considered as an estimator of $\mu$. More generally, if $\theta$ is some population parameter which we estimate by a sample statistic $U$, then we expect that $U$ will have a sampling distribution. We can use this sampling distribution to obtain information on how good $U$ is as an estimator of $\theta$. After all, there may be a number of possible estimators of $\theta$, and it is natural to want to use the estimator that is best in some sense, or at least, avoid using estimators that have undesirable properties. This raises the issue of what desirable properties an estimator should possess.

The relevance of the sampling distribution of an estimator for this issue can be motivated in the following way. Different samples of data will generate different numerical values for $u$ - that is, different values of the estimator $U$. A value of $u$ will only equal the population parameter $\theta$ by chance. Because population parameters are generally unknown, we will not know when this happens anyway. But the sampling distribution of $U$ represents, intuitively, the “chance” of such an occurrence, and we can calculate from it, in principle, the appropriate probabilities.

12.2.1 Unbiased Estimators

Following this rather intuitive argument, if we cannot detect whether the estimate is actually correct, we could resort to demanding that the estimate be correct “on average”. Here, the appropriate concept of averaging is that embodied in finding the expected value of $U$. We can then say that

if $E[U] = \theta$, then $U$ is an unbiased estimator of $\theta$;

an unbiased estimator is correct on average;

if $E[U] \neq \theta$, then $U$ is a biased estimator of $\theta$;

a biased estimator is incorrect on average.

It is clear that unbiasedness is a desirable property for an estimator, whilst bias is an undesirable property.

So, to show that an estimator is unbiased, we have to find its expected value, and show that this is equal to the population parameter being estimated.
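For a very small population the expected value of an estimator can be found by brute force, listing every possible sample. The sketch below does this for the sample mean. The population (values 1, 2 and 6, equally likely) and the sample size $n = 2$ are hypothetical choices, made so that no single estimate equals the population mean even though the estimator is correct on average.

```python
from fractions import Fraction
from itertools import product

# ASSUMPTION: a hypothetical population with three equally likely values.
population = [Fraction(1), Fraction(2), Fraction(6)]
n = 2

mu = sum(population) / len(population)   # population mean, equal to 3

# All 9 equally likely samples of size n (independent draws).
samples = list(product(population, repeat=n))

# The estimate (value of the sample mean) produced by each sample.
estimates = [sum(s) / n for s in samples]

# Expected value of the estimator: average over its sampling distribution.
expected_value = sum(estimates) / len(estimates)

# Number of samples whose estimate is exactly equal to mu.
exact_hits = sum(1 for u in estimates if u == mu)

print(mu, expected_value, exact_hits)   # 3 3 0
```

Here $E[\bar{X}] = \mu$, so the sample mean is unbiased, yet none of the nine possible estimates is exactly equal to $\mu$.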

12.2.2 Examples

In Section 11.3.1, we showed that in sampling from a population with mean $\mu$ and variance $\sigma^2$,

$$E[\bar{X}] = \mu.$$

So, without any further conditions, the sample mean is unbiased for $\mu$. It is also true that the sample variance $S^2$ is unbiased for $\sigma^2$:

$$E[S^2] = E\left[ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right] = \sigma^2.$$

One can guess from this that the use of the divisor $n - 1$ rather than $n$ is important in obtaining this property. Given this result, we can see that

$$E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right] = E\left[ \frac{n-1}{n} S^2 \right] = \frac{n-1}{n} \sigma^2,$$

so that this alternative definition of sample variance produces a biased estimator of $\sigma^2$. On the other hand, the systematic underestimation of $\sigma^2$ implied by the nature of the bias will disappear as the sample size $n$ increases, since $(n-1)/n \to 1$. The estimator $S^2$ is used simply because it is unbiased.
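Both expectations can be verified exactly for a small case by listing all possible samples. The sketch below assumes a hypothetical population taking the values 1, 2 and 3 with equal probability (so $\sigma^2 = 2/3$) and a sample size of $n = 3$; neither choice comes from the text.

```python
from fractions import Fraction
from itertools import product

# ASSUMPTION: a hypothetical population with three equally likely values.
values = [Fraction(1), Fraction(2), Fraction(3)]
n = 3

mu = sum(values) / len(values)
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)   # 2/3

# All 27 equally likely samples of size n (independent draws).
samples = list(product(values, repeat=n))


def sum_sq_dev(sample):
    """Sum of squared deviations of the sample values from their mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)


# Expected value of S^2, which uses the divisor n - 1.
e_s2 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

# Expected value of the alternative estimator, which uses the divisor n.
e_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(sigma2, e_s2, e_div_n)   # 2/3 2/3 4/9
```

The divisor $n$ version has expected value $\frac{n-1}{n} \sigma^2 = \frac{2}{3} \cdot \frac{2}{3} = \frac{4}{9}$, which is below $\sigma^2$, while $S^2$ has expected value exactly $\sigma^2$.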

It is possible to show, but not in this course, that the sample covariance $S_{XY}$ is unbiased for the population covariance $\sigma_{XY}$.

The sample correlation coefficient $R$ is in general biased for the population correlation coefficient $\rho$, because it is a ratio of random variables. This does not seem to prevent its widespread practical use, however.
