
Confidence Interval for a Population. Normal (z) Statistic


(1)
(2)

Confidence Interval for a Population Mean:

Normal (z) Statistic

(3)

Estimation Process

[Figure: A random sample is drawn from a population whose mean, µ, is unknown. The sample mean is $\bar{x}$ = 50, which leads to the statement "I am 95% confident that µ is between 40 and 60."]

(4)

Confidence Interval

According to the Central Limit Theorem, the sampling distribution of the sample mean is approximately normal for large samples. Let us calculate the interval estimator:

$$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$

That is, we form an interval from 1.96 standard deviations below the sample mean to 1.96 standard deviations above the mean. Prior to drawing the sample, what are the chances that this interval will enclose µ, the population mean?

(5)

Confidence Interval

If sample measurements yield a value of $\bar{x}$ that falls between the two lines on either side of µ, then the interval $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ will contain µ.

The area under the normal curve between these two boundaries is exactly .95. Thus, the probability that a randomly selected interval $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ will contain µ is equal to .95.
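The .95 figure can be checked by simulation. The sketch below is only an illustration: the population values (µ = 50, σ = 10), the sample size (n = 36), and the number of repetitions are assumptions chosen for the demo and do not come from the slides.

```python
import math
import random


def coverage_rate(mu, sigma, n, reps, z=1.96, seed=0):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)  # z times the standard error of x_bar
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / reps


# Hypothetical population (assumed for illustration): mu = 50, sigma = 10.
rate = coverage_rate(mu=50.0, sigma=10.0, n=36, reps=5000)
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to .95; it will not equal .95 exactly because only a finite number of intervals is simulated.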

(6)

95% Confidence Level

If our confidence level is 95%, then in the long run, 95% of our confidence intervals will contain µ and 5% will not.

For a confidence level of 95%, the area in the two tails is .05. To choose a different confidence level we increase or decrease the area (call it α) assigned to the tails. If we place α/2 in each tail and $z_{\alpha/2}$ is the z-value that cuts off an area of α/2 in the upper tail, the confidence interval with coefficient (1 – α) is

$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}.$$
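As a quick illustration of how the tail area α determines the z-value, the sketch below looks up $z_{\alpha/2}$ for a few common confidence levels using Python's standard library. The function name `z_critical` is a hypothetical helper introduced here, not something from the slides.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2}: the z-value with area alpha/2 to its right."""
    alpha = 1.0 - confidence
    # inv_cdf takes the area to the LEFT, so use 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The values come out at roughly 1.645, 1.960, and 2.576, matching the usual normal table entries.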

(7)

Conditions Required for a Valid Large-Sample Confidence Interval for µ

1. A random sample is selected from the target population.

2. The sample size n is large (i.e., n ≥ 30). Due to the Central Limit Theorem, this condition guarantees that the sampling distribution of $\bar{x}$ is approximately normal. Also, for large n, s will be a good estimator of σ.

(8)

Large-Sample 100(1 – α)% Confidence Interval for µ

$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where $z_{\alpha/2}$ is the z-value with an area α/2 to its right in the standard normal distribution. The parameter σ is the standard deviation of the sampled population, and n is the sample size.

Note: When σ is unknown and n is large (n ≥ 30), the confidence interval is approximately equal to

$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

where s is the sample standard deviation.





(9)

Thinking Challenge

You’re a Q/C inspector for Gallo.

The σ for 2-liter bottles is .05 liters. A random sample of 100 bottles showed $\bar{x}$ = 1.99 liters.

What is the 90% confidence interval estimate of the true mean amount in 2-liter bottles?


(10)

Confidence Interval Solution*

$$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

$$1.99 - 1.645\cdot\frac{.05}{\sqrt{100}} \le \mu \le 1.99 + 1.645\cdot\frac{.05}{\sqrt{100}}$$

$$1.982 \le \mu \le 1.998$$

We can be 90% confident that the mean amount in 2-liter bottles is between 1.982 and 1.998 liters. Our confidence is derived from the fact that 90% of the intervals formed in repeated applications of this procedure would contain µ.
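The arithmetic in this solution can be reproduced in a few lines. This is a minimal sketch using the numbers from the bottle example (x̄ = 1.99, σ = .05, n = 100, 90% confidence); the function name `z_interval` is a hypothetical helper, not part of the slides.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Interval x_bar +/- z_{alpha/2} * sigma / sqrt(n) for a known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.645 for 90%
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


low, high = z_interval(x_bar=1.99, sigma=0.05, n=100, confidence=0.90)
print(f"{low:.3f} <= mu <= {high:.3f}")  # prints "1.982 <= mu <= 1.998"
```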

(11)

Exercise

• A random sample of 70 observations from a normally distributed population possesses a sample mean equal to 26.2 and a sample standard deviation equal to 4.1.

• A) Find an approximate 95% confidence interval for µ.

• B) What do you mean when you say that a confidence level is 95%?

• C) Find an approximate 99% confidence interval for µ.

• D) What happens to the width of a confidence interval as the value of the confidence coefficient is increased while the sample size is held fixed?
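One way to check answers to parts A, C, and D is to compute the large-sample interval with s in place of σ at both confidence levels and compare the widths. The sketch below assumes the exercise values (n = 70, x̄ = 26.2, s = 4.1); the helper name `approx_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_interval(x_bar, s, n, confidence):
    """Large-sample interval x_bar +/- z_{alpha/2} * s / sqrt(n), s standing in for sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


widths = {}
for level in (0.95, 0.99):
    low, high = approx_interval(x_bar=26.2, s=4.1, n=70, confidence=level)
    widths[level] = high - low
    print(f"{level:.0%}: ({low:.2f}, {high:.2f}), width = {widths[level]:.2f}")
```

Comparing the two printed widths shows how the interval responds when the confidence coefficient rises while n stays fixed.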

(12)
(13)
(14)

Confidence Interval for a Population Mean:

Student’s t-Statistic

(15)

Small sample size problem for inference about µ

• The use of a small sample in making inferences about µ presents two problems when we attempt to use the standard normal z as a test statistic.

(16)

Problem 1

• The shape of the sampling distribution of the sample mean now depends on the shape of the population sampled.

• We can no longer assume that the sampling distribution of the sample mean is approximately normal, because the Central Limit Theorem ensures normality only for samples that are sufficiently large.

(17)

Solution to Problem 1

• We know that if our sample comes from a normally distributed population, the sampling distribution of the sample mean will be normal regardless of the sample size.

(18)

Problem 2

• The population standard deviation σ is almost always unknown. For small samples, the sample standard deviation s provides a poor approximation of σ.

(19)

Solution to Problem 2

(Small Sample with σ Unknown)

Instead of using the standard normal statistic

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

use the t-statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

in which the sample standard deviation, s, replaces the population standard deviation, σ.
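To make the substitution of s for σ concrete, the sketch below computes the t-statistic from raw data. The sample values and the value of µ are hypothetical, chosen only for illustration, and `t_statistic` is a helper name introduced here.

```python
import math
import statistics


def t_statistic(sample, mu):
    """t = (x_bar - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # uses the n - 1 divisor
    return (x_bar - mu) / (s / math.sqrt(n))


# Hypothetical measurements (not from the slides) and a hypothetical mu.
sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3]
t = t_statistic(sample, mu=10.0)
print(f"t = {t:.3f}")
```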

(20)

Conditions Required for a Valid Small-Sample Confidence Interval for µ

• A random sample is selected from the target population.

• The population has a relative frequency distribution that is approximately normal.

(21)

Small Sample with σ Known

Use the standard normal statistic

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

(22)

Student’s t-Statistic

The t-statistic has a sampling distribution very much like that of the z-statistic: mound-shaped, symmetric, with mean 0.

The primary difference between the sampling distributions of t and z is that the t-statistic is more variable than the z-statistic.

(23)

Degrees of Freedom

The actual amount of variability in the sampling distribution of t depends on the sample size n. A convenient way of expressing this dependence is to say that the t-statistic has (n – 1) degrees of freedom (df).

(24)

Student’s t Distribution

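[Figure: The standard normal (z) curve compared with Student’s t curves for df = 13 and df = 5. All curves are bell-shaped and symmetric about 0; the t curves have ‘fatter’ tails, most visibly for df = 5.]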

The smaller the degrees of freedom for t-statistic, the more variable will be its sampling distribution.
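The extra variability at low degrees of freedom can be seen in a small simulation. The sketch below draws repeated samples from a standard normal population (an assumption made for the demo), computes the t-statistic for each, and reports how often |t| exceeds 1.96, a cutoff that a z-statistic exceeds about 5% of the time. The sample sizes 6 and 14 are chosen to match df = 5 and df = 13.

```python
import math
import random


def tail_fraction(n, reps, cutoff=1.96, seed=1):
    """Share of simulated t-statistics (df = n - 1) with |t| > cutoff."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true mean is 0
        x_bar = sum(sample) / n
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        t = x_bar / (s / math.sqrt(n))
        if abs(t) > cutoff:
            extreme += 1
    return extreme / reps


frac_df5 = tail_fraction(n=6, reps=20000)    # df = 5
frac_df13 = tail_fraction(n=14, reps=20000)  # df = 13
print(f"P(|t| > 1.96), df = 5:  {frac_df5:.3f}")
print(f"P(|t| > 1.96), df = 13: {frac_df13:.3f}")
```

Both shares should come out above .05, with the df = 5 share the larger of the two, which is the 'fatter tails' behavior in simulated form.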

(25)
(26)


(27)

• We have a random sample of 15 cars of the same model. Assume that the gas mileage for the population is normally distributed with a standard deviation of 5.2 miles per gallon.

• A) Identify the bounds for a 90% confidence interval for the mean given a sample mean of 22.8 miles per gallon.

• B) The car manufacturer of this particular model claims that the average gas mileage is 26 miles per gallon. Discuss the validity of this claim using the 90% confidence interval calculated in A.

• C) Let a and b represent the lower and upper boundaries of the 90% confidence interval for the mean of the population. Is it correct to conclude that there is a 90% probability that the true population mean lies between a and b?

(28)

Thinking Challenge

• We have a random sample of customer order totals with an average of $78.25 and a population standard deviation of $22.5.

• A) Calculate a 90% confidence interval for the mean given a sample size of 40 orders.

• B) Calculate a 90% confidence interval for the mean given a sample size of 75 orders.

• C) Explain the difference in the 90% confidence intervals calculated in A and B.

• Calculate the minimum sample size needed to identify a 90% confidence interval for the mean assuming a $5 margin of error.
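For the last part, one possible approach (not stated in the slides, so treat it as an assumption) is to set the margin of error E equal to $z_{\alpha/2}\,\sigma/\sqrt{n}$ and solve for n, giving $n = (z_{\alpha/2}\,\sigma/E)^2$ rounded up to the next whole number. The sketch below applies this with σ = 22.5 and E = 5, and also prints the margins for n = 40 and n = 75 as a check on parts A to C. The helper names are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Half-width z_{alpha/2} * sigma / sqrt(n) of the interval for a known sigma."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return z * sigma / math.sqrt(n)


def min_sample_size(sigma, max_margin, confidence):
    """Smallest n with margin <= max_margin, from n = (z * sigma / E)^2 rounded up."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z * sigma / max_margin) ** 2)


for n in (40, 75):
    print(f"n = {n}: margin = {margin_of_error(22.5, n, 0.90):.2f}")

n_needed = min_sample_size(sigma=22.5, max_margin=5.0, confidence=0.90)
print(f"Minimum sample size for a $5 margin: {n_needed}")
```

Rounding up rather than to the nearest integer is deliberate: rounding down would leave the margin slightly above the target.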
