Chapter 7: The Central Limit Theorem
Tony Pourmohamad
Department of Mathematics, De Anza College
Objectives
By the end of this set of slides, you should be able to:
1 Understand what the central limit theorem is
2 Recognize central limit theorem problems
The Central Limit Theorem
• The Central Limit Theorem (CLT) is one of the most powerful and
useful ideas in all of statistics
• For this class, we will consider two applications of the CLT:
1 CLT for means (or averages) of random variables
2 CLT for sums of random variables
• Let’s start with an example, courtesy of Professor Mo Geraghty
http://nebula2.deanza.edu:16080/~mo/holistic/clt.swf
• Try exploring the following website to better understand the CLT
http://spark.rstudio.com/minebocek/CLT_mean/
The Central Limit Theorem
• So what is happening in the CLT video?
[Figure: histograms of the sample means for 10, 100, 1,000, and 10,000 samples; vertical axis: Frequency]
The Central Limit Theorem -- Basic Idea
• Imagine there is some population with a mean $\mu$ and standard deviation $\sigma$
• We can collect samples of size n where the value of n is "large
enough"
• We can then calculate the mean of each sample
• If we create a histogram of those means, then the resulting
histogram will look close to being a normal distribution
• It does not matter what the distribution of the original population is, or whether you even know it. The important fact is that the distribution of the sample means tends to follow the normal distribution
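The basic idea can be tried out with a short simulation. The sketch below is only an illustration under stated assumptions: the population is a hypothetical right-skewed one (exponential with mean 1), and the sample size n = 50 is assumed to be "large enough". It checks how many sample means fall within 1 and 2 standard deviations of their center, which for a normal curve is about 68% and 95%.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

def draw_population_value():
    # Assumption: a hypothetical skewed population, exponential with mean 1.
    return rng.expovariate(1.0)

n = 50               # size of each sample (assumed "large enough")
num_samples = 10000  # how many samples we collect

# Calculate the mean of each sample.
sample_means = [
    statistics.fmean(draw_population_value() for _ in range(n))
    for _ in range(num_samples)
]

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

def fraction_within(k):
    """Fraction of sample means within k standard deviations of their center."""
    return sum(abs(m - center) <= k * spread for m in sample_means) / num_samples

print("within 1 sd:", fraction_within(1))  # a normal curve gives about 0.68
print("within 2 sd:", fraction_within(2))  # a normal curve gives about 0.95
```

Even though the individual values are far from normal, the two fractions should land close to the normal-curve values.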
The Central Limit Theorem -- More Formally
• Suppose that we have a large population with mean $\mu$ and standard deviation $\sigma$
• Suppose that we select random samples of size n from this population
• Each sample taken from the population has its own average $\bar{X}$
• The sample average for any specific sample may not equal the population average $\mu$ exactly
The Central Limit Theorem -- More Formally Continued
• The sample averages $\bar{X}$ follow a probability distribution of their own
• The average of the sample averages is the population average:
$$\mu_{\bar{x}} = \mu$$
• The standard deviation of the sample averages equals the population standard deviation divided by the square root of the sample size:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
• The distribution of the sample averages $\bar{X}$ is approximately normal if the sample size is large enough
• The larger the sample size, the closer the shape of the distribution
of sample averages becomes to the normal distribution
• This is the Central Limit Theorem!
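The two formulas for the mean and standard deviation of the sample averages can be checked numerically. This is a minimal sketch, assuming a hypothetical population that is uniform on [0, 12] (so $\mu = 6$ and $\sigma = 12/\sqrt{12}$) and a sample size of n = 36; none of these numbers come from the slides.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumption: hypothetical population, uniform on [0, 12].
mu = 6.0
sigma = 12 / math.sqrt(12)  # standard deviation of a uniform on [0, 12]
n = 36                      # sample size
num_samples = 20000         # number of samples drawn

# One sample average per sample.
means = [
    statistics.fmean(rng.uniform(0, 12) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
predicted_sd = sigma / math.sqrt(n)

print("mean of sample averages:", round(mean_of_means, 3), "predicted:", mu)
print("sd of sample averages:  ", round(sd_of_means, 3), "predicted:", round(predicted_sd, 3))
```

The simulated values should agree with $\mu$ and $\sigma/\sqrt{n}$ up to small random error.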
The Central Limit Theorem -- Case 1
• IF a random sample of any size n is taken from a population with a normal distribution with mean $\mu$ and standard deviation $\sigma$
• THEN the distribution of the sample mean is normal with:
$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$$
The Central Limit Theorem -- Case 1
[Figure: density curves labeled $X \sim N(10, 2)$ and $\bar{X} \sim N(10, 2/\sqrt{50})$, each centered at $\mu$]
The Central Limit Theorem -- Case 2
• IF a random sample of sufficiently large size n is taken from a population with ANY distribution with mean $\mu$ and standard deviation $\sigma$
• THEN the distribution of the sample mean has approximately a normal distribution with:
$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$$
The Central Limit Theorem -- Case 2
[Figure: density curves labeled $X \sim N(10, 2)$ and $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, each centered at $\mu$]
The Central Limit Theorem -- Recap
• 3 important results for the distribution of $\bar{X}$
1 The mean stays the same: $\mu_{\bar{x}} = \mu$
2 The standard deviation gets smaller: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
3 If n is sufficiently large, $\bar{X}$ has approximately a normal distribution, where $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$
What is Large n?
• How large does the sample size n need to be in order to use the
Central Limit Theorem?
• The value of n needed to be a "large enough" sample size
depends on the shape of the original distribution of the individuals in the population
• If the individuals in the original population follow a normal
distribution, then the sample averages will have a normal distribution, no matter how small or large the sample size is
• If the individuals in the original population do not follow a normal distribution, then the sample averages $\bar{X}$ become more normally distributed as the sample size grows larger. In this case the sample averages $\bar{X}$ do not follow the same distribution as the original population
What is Large n? Continued
• The more skewed the original distribution of individual values, the
larger the sample size needed
• If the original distribution is symmetric, the sample size needed can
be smaller
• Many statistics textbooks use the rule of thumb $n \geq 30$, considering 30 as the minimum sample size to use the Central Limit Theorem. But in reality there is no universal minimum sample size that works for all distributions; the sample size needed depends on the shape of the original distribution
• In this class, we will assume the sample size is large enough for the Central Limit Theorem to be used
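To see why skewed populations need larger samples, one possible experiment is to measure the skewness of the sample averages for different n. The sketch below assumes two hypothetical populations (a symmetric uniform and a right-skewed exponential) and uses a hand-written `skewness` helper; a skewness near 0 suggests a symmetric, bell-like histogram.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: near 0 for a symmetric histogram, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def skew_of_sample_means(draw, n, num_samples=4000, seed=7):
    """Skewness of the means of num_samples samples of size n, each value from draw(rng)."""
    rng = random.Random(seed)
    means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(num_samples)]
    return skewness(means)

# Assumption: two hypothetical populations, one symmetric and one strongly right-skewed.
populations = {
    "uniform (symmetric)": lambda rng: rng.uniform(0, 1),
    "exponential (skewed)": lambda rng: rng.expovariate(1.0),
}

for name, draw in populations.items():
    for n in (2, 30, 200):
        print(f"{name:22s} n={n:3d} skewness={skew_of_sample_means(draw, n):+.2f}")
```

For the symmetric population the skewness should stay near 0 at every n, while for the skewed population it should shrink toward 0 only as n grows.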
Calculating Probabilities from a Normal Distribution
• Here is the general procedure to calculate probabilities from the distribution of the sample mean $\bar{X}$
1 You are given an interval in terms of $\bar{x}$, i.e. $P(\bar{X} < \bar{x})$
2 Convert to a z score by using
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
3 Look up the probability in the z table that corresponds to the z score, i.e. $P(Z < z)$
• This is just the same idea we used in Chapter 6!
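The three steps above can be sketched in Python, with the standard library's `statistics.NormalDist` standing in for the z table. The function name and the numbers (mean 10, standard deviation 2, n = 16) are made up for illustration.

```python
import math
from statistics import NormalDist

def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(X-bar < x_bar) when X-bar ~ N(mu, sigma / sqrt(n))."""
    z = (x_bar - mu) / (sigma / math.sqrt(n))  # step 2: convert to a z score
    return NormalDist().cdf(z)                 # step 3: replaces the z-table lookup

# Assumption: hypothetical numbers, population mean 10, population sd 2, n = 16.
# Here z = (9.5 - 10) / (2 / 4) = -1.
p = prob_sample_mean_below(x_bar=9.5, mu=10, sigma=2, n=16)
print(round(p, 4))  # 0.1587
```

Note that the z score divides by $\sigma/\sqrt{n}$, not by $\sigma$, because the probability is about the sample mean rather than a single individual.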
Examples
Percentile Calculations Based on the Normal Distribution
• Here is the general procedure to calculate the value $\bar{x}$ that corresponds to the Pth percentile
1 You are given a probability or percentile desired
2 Look up the z score in the table that corresponds to the probability
3 Convert to $\bar{x}$ by the following formula:
$$\bar{x} = \mu + z\frac{\sigma}{\sqrt{n}}$$
• Examples: Look at Handout #5 on the website
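The percentile procedure can be sketched the same way, with `NormalDist().inv_cdf` playing the role of the reverse z-table lookup. The function name and the example numbers (mean 10, standard deviation 2, n = 16, 90th percentile) are assumptions for illustration and are not taken from Handout #5.

```python
import math
from statistics import NormalDist

def sample_mean_percentile(p, mu, sigma, n):
    """Value x_bar with P(X-bar < x_bar) = p when X-bar ~ N(mu, sigma / sqrt(n))."""
    z = NormalDist().inv_cdf(p)            # step 2: z score for the given probability
    return mu + z * sigma / math.sqrt(n)   # step 3: convert the z score back to x_bar

# Assumption: hypothetical numbers, population mean 10, population sd 2, n = 16.
x_bar_90 = sample_mean_percentile(0.90, mu=10, sigma=2, n=16)
print(round(x_bar_90, 4))

# Sanity check: the probability below x_bar_90 should come back as 0.90.
print(round(NormalDist(10, 2 / math.sqrt(16)).cdf(x_bar_90), 4))
```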
Using Your Calculator
• If you have a graphing calculator, your calculator can calculate
all of these probabilities without using a z table
• If you want to calculate $P(a < \bar{X} < b)$ follow these steps:
1 Push 2nd, then DISTR
2 Select normalcdf() and then push ENTER
3 Then enter the following: normalcdf(a, b, µ, σ/√n)
• Question: If $\bar{X} \sim N(0, 1)$, what is the probability $P(-1 < \bar{X} < 1)$?
• Solution: normalcdf(−1, 1, 0, 1) = 0.6827 ≈ 68%
• Question: If $\bar{X} \sim N(10, 2)$, what is the probability $P(7 < \bar{X} < 9)$?
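For readers without a graphing calculator, the same interval probabilities can be computed in Python. The helper below is a hypothetical stand-in that mimics the argument order of the calculator's normalcdf; it is not the calculator's own implementation, and it assumes the last argument is already the standard deviation of $\bar{X}$, that is, $\sigma/\sqrt{n}$.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean, sd):
    """P(lower < X < upper) for X ~ N(mean, sd); hypothetical stand-in for the calculator function."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

first = normalcdf(-1, 1, 0, 1)   # the worked example with N(0, 1)
second = normalcdf(7, 9, 10, 2)  # the question with N(10, 2)

print(round(first, 4))   # 0.6827
print(round(second, 4))
```

The second value corresponds to z scores of −1.5 and −0.5, so it can also be checked against a z table.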
Using Your Calculator
• If you want to calculate $P(\bar{X} < a)$ follow these steps:
1 Push 2nd, then DISTR
2 Select normalcdf() and then push ENTER
3 Then enter the following: normalcdf(−10^99, a, µ, σ/√n)
• Question: If $\bar{X} \sim N(10, 2)$, what is the probability $P(\bar{X} < 8)$?
• Solution: normalcdf(−10^99, 8, 10, 2) = 0.158656
• If you want to calculate $P(\bar{X} > a)$ follow these steps:
1 Push 2nd, then DISTR
2 Select normalcdf() and then push ENTER
3 Then enter the following: normalcdf(a, 10^99, µ, σ/√n)
• Question: If $\bar{X} \sim N(10, 2)$, what is the probability $P(\bar{X} > 9)$?
• Solution: normalcdf(9, 10^99, 10, 2) = 0.691462
Using Your Calculator
• If you want to calculate the value of $\bar{X}$ that gives you the Pth percentile then follow these steps:
1 Push 2nd, then DISTR
2 Select invNorm() and then push ENTER
3 Then enter the following: invNorm(percentile, µ, σ/√n)
• Question: If $\bar{X} \sim N(10, 2)$, what value of $\bar{X}$ gives us the 25th percentile?
• Solution: invNorm(.25, 10, 2) = 8.65102
• Recall: We used the formula $\bar{x} = \mu + z\sigma/\sqrt{n}$, so $\bar{x}$