
MATH 10: Elementary Statistics and Probability Chapter 7: The Central Limit Theorem


Academic year: 2021


(1)

Chapter 7: The Central Limit Theorem

Tony Pourmohamad

Department of Mathematics De Anza College

(2)

Objectives

By the end of this set of slides, you should be able to:

1 Understand what the central limit theorem is

2 Recognize the central limit theorem problems

(3)

The Central Limit Theorem

• The Central Limit Theorem (CLT) is one of the most powerful and useful ideas in all of statistics

• For this class, we will consider two applications of the CLT:

1 CLT for means (or averages) of random variables

2 CLT for sums of random variables

• Let’s start with an example, courtesy of Professor Mo Geraghty

http://nebula2.deanza.edu:16080/~mo/holistic/clt.swf

• Try exploring the following website to better understand the CLT

http://spark.rstudio.com/minebocek/CLT_mean/

(4)

The Central Limit Theorem

• So what is happening in the CLT video?

[Figure: histograms of the sample means after 10, 100, 1,000, and 10,000 samples; vertical axis: Frequency]

(5)

The Central Limit Theorem -- Basic Idea

• Imagine there is some population with a mean $\mu$ and standard deviation $\sigma$

• We can collect samples of size n where the value of n is "large enough"

• We can then calculate the mean of each sample

• If we create a histogram of those means, then the resulting

histogram will look close to being a normal distribution

• It does not matter what the distribution of the original population is, or whether you even know it. The important fact is that the distribution of the sample means tends to follow the normal distribution
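The basic idea can be checked with a small simulation. The sketch below is only an illustration: the exponential population, the sample size of 50, and the 5,000 repetitions are assumptions chosen for the example, not values from the slides.

```python
import random
import statistics

def sample_means(draw, n, num_samples, seed=0):
    """Return the means of num_samples samples, each of size n, drawn with draw(rng)."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(num_samples)]

# Assumed population: exponential with rate 1 (strongly right-skewed, mean 1, sd 1).
means = sample_means(lambda rng: rng.expovariate(1.0), n=50, num_samples=5000)

center = statistics.fmean(means)
spread = statistics.stdev(means)
# For a roughly normal shape, about 68% of the values fall within one sd of the center.
within_one_sd = sum(abs(m - center) < spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means:   {spread:.3f}")
print(f"share within 1 sd:    {within_one_sd:.3f}")
```

Even though the individual values are far from normal, the sample means should cluster around the population mean with a share close to 68% within one standard deviation.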

(6)

The Central Limit Theorem -- More Formally

• Suppose that we have a large population with mean $\mu$ and standard deviation $\sigma$

• Suppose that we select random samples of size n items from this population

• Each sample taken from the population has its own average $\bar{X}$

• The sample average for any specific sample may not equal the population average

(7)

The Central Limit Theorem -- More Formally Continued

• The sample averages $\bar{X}$ follow a probability distribution of their own

• The average of the sample averages is the population average: $\mu_{\bar{x}} = \mu$

• The standard deviation of the sample averages equals the population standard deviation divided by the square root of the sample size: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

• The distribution of the sample averages $\bar{X}$ is approximately normal if the sample size is large enough

• The larger the sample size, the closer the shape of the distribution

of sample averages becomes to the normal distribution

• This is the Central Limit Theorem!
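The two formulas for the mean and standard deviation of the sample averages can be derived in a few lines. This is a sketch under the assumption (not stated on the slide) that the observations $X_1, \dots, X_n$ are independent draws from the population, each with mean $\mu$ and finite standard deviation $\sigma$:

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
\mu_{\bar{x}} = E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\cdot n\mu = \mu \\
\sigma_{\bar{x}}^{2} = \operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{n\sigma^{2}}{n^{2}} = \frac{\sigma^{2}}{n} \\
\sigma_{\bar{x}} &= \frac{\sigma}{\sqrt{n}}
\end{align*}
```

These two results hold for any sample size; the Central Limit Theorem adds the statement about the approximately normal shape.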

(8)

The Central Limit Theorem -- Case 1

• IF a random sample of any size n is taken from a population with a normal distribution with mean $\mu$ and standard deviation $\sigma$

• THEN the distribution of the sample mean has a normal distribution with: $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, and $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$

(9)

The Central Limit Theorem -- Case 1

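[Figure: density curves of the population $X \sim N(10, 2)$ and of the sample mean $\bar{X} \sim N(10, 2/\sqrt{50})$, both centered at $\mu$]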

(10)

The Central Limit Theorem -- Case 2

• IF a random sample of sufficiently large size n is taken from a population with ANY distribution with mean $\mu$ and standard deviation $\sigma$

• THEN the distribution of the sample mean has approximately a normal distribution with: $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, and $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$

(11)

The Central Limit Theorem -- Case 2

[Figure: density curves labeled $X \sim N(10, 2)$ and $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, both centered at $\mu$]

(12)

The Central Limit Theorem -- Recap

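• 3 important results for the distribution of $\bar{X}$

1 The mean stays the same: $\mu_{\bar{x}} = \mu$

2 The standard deviation gets smaller: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

3 If n is sufficiently large, $\bar{X}$ has a normal distribution where $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$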

(13)

What is Large n?

• How large does the sample size n need to be in order to use the Central Limit Theorem?

• The value of n needed to be a "large enough" sample size depends on the shape of the original distribution of the individuals in the population

• If the individuals in the original population follow a normal

distribution, then the sample averages will have a normal distribution, no matter how small or large the sample size is

• If the individuals in the original population do not follow a normal distribution, then the sample averages $\bar{X}$ become more normally distributed as the sample size grows larger. In this case the sample averages $\bar{X}$ do not follow the same distribution as the original population

(14)

What is Large n? Continued

• The more skewed the original distribution of individual values, the

larger the sample size needed

• If the original distribution is symmetric, the sample size needed can

be smaller

• Many statistics textbooks use the rule of thumb $n \geq 30$, considering 30 as the minimum sample size to use the Central Limit Theorem. But in reality there is no universal minimum sample size that works for all distributions; the sample size needed depends on the shape of the original distribution

• In this class, we will assume the sample size is large enough for the Central Limit Theorem to apply
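One way to see why skewed populations need larger samples is to measure how lopsided the distribution of sample means still is for different n. The sketch below is an illustration only: the exponential population, the chosen sample sizes, and the helper names `skewness` and `skew_of_sample_means` are assumptions made for the example.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def skew_of_sample_means(n, num_samples=10000, seed=1):
    """Skewness of num_samples sample means, each from n exponential(1) draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]
    return skewness(means)

results = {n: skew_of_sample_means(n) for n in (2, 10, 30, 100)}
for n, g in results.items():
    print(f"n = {n:>3}: skewness of sample means = {g:.2f}")
```

The skewness shrinks toward 0 as n grows, but for this heavily skewed population it is still noticeable at n = 30, which is why the rule of thumb is not universal.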

(15)

Calculating Probabilities from a Normal Distribution

• Here is the general procedure to calculate probabilities from the distribution of the sample mean $\bar{X}$

1 You are given an interval in terms of $\bar{x}$, i.e. $P(\bar{X} < \bar{x})$

2 Convert to a z score by using $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

3 Look up the probability in the z table that corresponds to the z score, i.e. $P(Z < z)$

• This is just the same idea we used in Chapter 6!
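The three steps above can be sketched in Python, with the standard normal CDF standing in for the z table. The numbers (population mean 10, population standard deviation 2, sample size 50, cutoff 9.5) and the function name `prob_sample_mean_below` are hypothetical, chosen only to illustrate the procedure.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_below(x_bar, mu, sigma, n):
    """Return (z, P(X-bar < x_bar)) using z = (x_bar - mu) / (sigma / sqrt(n))."""
    z = (x_bar - mu) / (sigma / sqrt(n))
    # NormalDist() is the standard normal; its cdf plays the role of the z table.
    return z, NormalDist().cdf(z)

# Hypothetical example values, not taken from the slides.
z, p = prob_sample_mean_below(9.5, mu=10, sigma=2, n=50)
print(f"z = {z:.2f}, P(X-bar < 9.5) = {p:.4f}")
```

Note that the denominator is the standard deviation of the sample mean, $\sigma/\sqrt{n}$, not the population standard deviation $\sigma$.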

(16)

Examples

(17)

Percentile Calculations Based on the Normal Distribution

• Here is the general procedure to calculate the value $\bar{x}$ that corresponds to the P-th percentile

1 You are given a probability or percentile desired

2 Look up the z score in the table that corresponds to the probability

3 Convert to $\bar{x}$ by the following formula: $\bar{x} = \mu + z\left(\frac{\sigma}{\sqrt{n}}\right)$

• Examples: Look at Handout #5 on the website
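As a rough sketch of the percentile procedure, the inverse of the standard normal CDF can replace the table lookup in step 2. The values used here (population mean 10, population standard deviation 2, sample size 50, 90th percentile) and the function name `sample_mean_percentile` are assumptions for illustration and do not come from Handout #5.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_percentile(p, mu, sigma, n):
    """Value x_bar with P(X-bar < x_bar) = p, via x_bar = mu + z * (sigma / sqrt(n))."""
    z = NormalDist().inv_cdf(p)  # step 2: z score for the given probability
    return mu + z * (sigma / sqrt(n))  # step 3: convert z back to x_bar

# Hypothetical example values.
x90 = sample_mean_percentile(0.90, mu=10, sigma=2, n=50)
print(f"90th percentile of the sample mean: {x90:.4f}")

# Round trip: the probability below x90 should come back as 0.90.
check = NormalDist(10, 2 / sqrt(50)).cdf(x90)
```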

(18)

Using Your Calculator

• If you have a graphing calculator, your calculator can calculate

all of these probabilities without using a z table

• If you want to calculate $P(a < \bar{X} < b)$ follow these steps:

1 Push 2nd, then DISTR

2 Select normalcdf() and then push ENTER

3 Then enter the following: normalcdf(a, b, µ, σ/√n)

• Question: If $\bar{X} \sim N(0, 1)$, what is the probability $P(-1 < \bar{X} < 1)$?

• Solution: normalcdf(-1, 1, 0, 1) = 0.6827 ≈ 68%

• Question: If $\bar{X} \sim N(10, 2)$, what is the probability $P(7 < \bar{X} < 9)$?
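For readers without a graphing calculator, the same interval probabilities can be sketched in Python. The function `normalcdf` below is a hypothetical helper written to mirror the calculator's argument order; it is not a library function, and the second result is simply what this sketch computes, not a solution taken from the slides.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sd=1.0):
    """P(lower < X < upper) for X ~ N(mu, sd), in the calculator's argument order."""
    dist = NormalDist(mu, sd)
    return dist.cdf(upper) - dist.cdf(lower)

p1 = normalcdf(-1, 1, 0, 1)   # X-bar ~ N(0, 1)
p2 = normalcdf(7, 9, 10, 2)   # X-bar ~ N(10, 2)

print(round(p1, 4))  # 0.6827
print(round(p2, 4))  # 0.2417
```

The fourth argument is the standard deviation of the sample mean, so when starting from a population standard deviation it should be passed as σ/√n.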

(19)

Using Your Calculator

• If you want to calculate $P(\bar{X} < a)$ follow these steps:

1 Push 2nd, then DISTR

2 Select normalcdf() and then push ENTER

3 Then enter the following: normalcdf(-10^99, a, µ, σ/√n)

• Question: If $\bar{X} \sim N(10, 2)$, what is the probability $P(\bar{X} < 8)$?

• Solution: normalcdf(-10^99, 8, 10, 2) = 0.158655

• If you want to calculate $P(\bar{X} > a)$ follow these steps:

1 Push 2nd, then DISTR

2 Select normalcdf() and then push ENTER

3 Then enter the following: normalcdf(a, 10^99, µ, σ/√n)

• Question: If $\bar{X} \sim N(10, 2)$, what is the probability $P(\bar{X} > 9)$?

• Solution: normalcdf(9, 10^99, 10, 2) = 0.691462
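The very large bound in these one-sided calculations only stands in for infinity. A minimal Python sketch, assuming the same $\bar{X} \sim N(10, 2)$ as in the questions above, shows that the tail probabilities agree with and without that workaround:

```python
from statistics import NormalDist

x_bar = NormalDist(mu=10, sigma=2)  # distribution of the sample mean

left = x_bar.cdf(8)        # P(X-bar < 8), directly from the CDF
right = 1 - x_bar.cdf(9)   # P(X-bar > 9), as the complement

# Calculator-style workaround: a huge bound replaces minus or plus infinity.
left_big_bound = x_bar.cdf(8) - x_bar.cdf(-1e99)
right_big_bound = x_bar.cdf(1e99) - x_bar.cdf(9)

print(round(left, 6), round(right, 6))  # 0.158655 0.691462
```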

(20)

Using Your Calculator

• If you want to calculate the value of $\bar{X}$ that gives you the P-th percentile then follow these steps:

1 Push 2nd, then DISTR

2 Select invNorm() and then push ENTER

3 Then enter the following: invNorm(percentile, µ, σ/√n)

• Question: If $\bar{X} \sim N(10, 2)$, what value of $\bar{X}$ gives us the 25th percentile?

• Solution: invNorm(0.25, 10, 2) = 8.65102

• Recall: We used the formula $\bar{x} = \mu + z\,\sigma/\sqrt{n}$, so with $\sigma/\sqrt{n} = 2$ we get $\bar{x} = 10 + (-0.67)(2) = 8.66$
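The small gap between 8.65102 and 8.66 comes from rounding the z score to two decimals, as a z table does. The following sketch, which uses Python's `NormalDist.inv_cdf` as a stand-in for the calculator's invNorm, reproduces both numbers:

```python
from statistics import NormalDist

# Direct inverse CDF of X-bar ~ N(10, 2), analogous to invNorm(0.25, 10, 2).
exact = NormalDist(10, 2).inv_cdf(0.25)

# Table-style route: z score rounded to two decimals, then x_bar = mu + z * sd.
z = NormalDist().inv_cdf(0.25)
table_style = 10 + round(z, 2) * 2

print(round(exact, 5))        # 8.65102
print(round(table_style, 2))  # 8.66
```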
