1. Forty-two percent of today’s 15-year-old girls will get pregnant in their teens.
2. A 1993 survey conducted by the Richmond Times-Dispatch one week before Election Day asked voters which candidate for state Attorney General they would vote for. Thirty-seven percent of the respondents said they would vote for the Democratic candidate. On Election Day, 41% actually voted for the Democratic candidate.
3. The National Center for Health Statistics reports that the mean systolic blood pressure for males 35 to 44 years of age is 128 and the standard deviation is 15. The medical director of a large company looks at the medical records of 72 executives in this age group and finds that the mean systolic blood pressure for these executives is 126.07.
4. Which statistic has the largest bias among these three? Justify your answer.
NOTE: A population consists of the entire group of people or objects of interest to an investigator, while a sample refers to the part of the population that the investigator actually studies. Also remember that a parameter is a numerical characteristic of a population and that a statistic is a numerical characteristic of a sample.
In certain contexts a population can also refer to a process (such as flipping a coin) that in principle can be repeated indefinitely. With this interpretation of population, a sample is a specific collection of process outcomes.
We will be very careful to use different symbols to denote
parameters and statistics. For example, we use the following symbols to denote proportions, means, and standard deviations (note that we consistently use Greek letters for parameters):
Identify each of the following as a parameter or a statistic, indicate the symbol used to represent it, and specify its value.
the proportion of men in the entire 1999 U.S. Senate
the proportion of Democrats among the following five
the mean years of service among these five Senators
the standard deviation of the years of service in the
Chapter 9 Section 2
Sample Proportions
Review: The parameter p is the population proportion. In practice, this value is always unknown. (If we knew p, there would be no need for a sample.) The statistic $\hat{p}$ is the sample proportion. We use $\hat{p}$ to estimate the value of p. The value of the statistic changes as the sample changes.
How can we describe the sampling model for $\hat{p}$?
If our sample is an SRS of size n, then the following statements describe the sampling model for $\hat{p}$:
The shape is approximately Normal.
Assumption: Sample Size Is Sufficiently Large
Condition: $np \ge 10$ and $n(1 - p) \ge 10$
If our sample is an SRS of size n, then the following statements describe the sampling model for $\hat{p}$:
The mean is exactly p.
The standard deviation is $\sqrt{\dfrac{p(1 - p)}{n}}$.
Assumption: Sample Size Is Sufficiently Large
Condition: Population is at least 10 times as large as the sample, that is, Population ≥ 10n
Where did those formulas come from?
Suppose the sample size is n and the actual population proportion is p. We learned in Section 8.1 that if X is a binomial random variable, then the mean and standard deviation of the sampling distribution of X are given by
$\mu_X = np$ AND $\sigma_X = \sqrt{np(1 - p)}$.
Here, however, we are interested in the proportion rather than in the number of successes. We remember (hopefully) that when we multiply or divide every element by a constant, we multiply or divide both the mean and standard deviation by the same constant. In this case, to change the number of successes to the proportion of successes, we divide by n:
$\mu_{\hat{p}} = \dfrac{np}{n} = p$ AND $\sigma_{\hat{p}} = \dfrac{\sqrt{np(1 - p)}}{n} = \sqrt{\dfrac{p(1 - p)}{n}}$.
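One way to check these formulas is by simulation. The sketch below draws many samples of size n from a population with success probability p and compares the mean and standard deviation of the resulting sample proportions with p and sqrt(p(1 - p)/n). The values p = 0.3 and n = 50, the number of samples, and the function name are illustrative assumptions, not values from this section.

```python
import math
import random


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n; return each sample's proportion of successes."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


# Illustrative values only (assumed, not from the text).
p, n = 0.3, 50
p_hats = simulate_sample_proportions(p, n, num_samples=20000)

# Mean and standard deviation of the simulated sample proportions.
sim_mean = sum(p_hats) / len(p_hats)
sim_sd = math.sqrt(sum((x - sim_mean) ** 2 for x in p_hats) / len(p_hats))

# Values predicted by the formulas for the sampling model of p-hat.
theory_mean = p
theory_sd = math.sqrt(p * (1 - p) / n)

print(f"simulated mean {sim_mean:.4f}  vs  formula {theory_mean:.4f}")
print(f"simulated SD   {sim_sd:.4f}  vs  formula {theory_sd:.4f}")
```

With this many simulated samples, the simulated and formula values should agree closely, though not exactly, because the simulation is random.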
Example #1: Opinion polls in 2002 showed that about 70% of the population had a favorable opinion of President Bush. That same year, a simple random sample of 600 adults living in the San Francisco Bay Area found that only 65% had a favorable opinion of President Bush. What is the probability of getting a rating of 65% or less in a random sample of this size if the true population proportion is 0.70?
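A possible way to work Example #1 with the Normal model is sketched below. It assumes the true proportion is p = 0.70 as the question states, checks the np and n(1 - p) condition, and then standardizes the observed 65%. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p = 0.70         # assumed true population proportion (from the question)
n = 600          # sample size
observed = 0.65  # sample proportion of interest

# Normal model condition: both expected counts are at least 10.
assert n * p >= 10 and n * (1 - p) >= 10

# Standard deviation of the sampling model for p-hat.
sd_p_hat = math.sqrt(p * (1 - p) / n)

# Standardize and look up the lower-tail probability.
z = (observed - p) / sd_p_hat
prob = NormalDist().cdf(z)

print(f"SD of p-hat = {sd_p_hat:.4f}, z = {z:.2f}, P(p-hat <= 0.65) = {prob:.4f}")
```

Here z comes out near -2.67, so under this model a sample result of 65% or less would be quite unusual if the true proportion were 0.70.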
Example #2: Trina Forest fails to study for her statistics final. The final has 100 multiple choice
Example #3: It is estimated that 48% of all motorists use their seat belts. If a police officer observes 400 cars go by in an hour, what is the probability that the proportion of drivers wearing seat belts is between 45% and 55%?
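Example #3 can be approached the same way, this time with two cutoffs. The sketch below treats the 400 observed cars as a random sample of motorists with p = 0.48, which is an assumption the problem leaves implicit.

```python
import math
from statistics import NormalDist

p = 0.48              # estimated proportion of motorists using seat belts
n = 400               # cars observed, treated here as a random sample
low, high = 0.45, 0.55

# Normal model condition: both expected counts are at least 10.
assert n * p >= 10 and n * (1 - p) >= 10

sd_p_hat = math.sqrt(p * (1 - p) / n)

# Standardize both cutoffs, then take the area between them.
z_low = (low - p) / sd_p_hat
z_high = (high - p) / sd_p_hat
std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"z_low = {z_low:.2f}, z_high = {z_high:.2f}, P(0.45 < p-hat < 0.55) = {prob:.4f}")
```

The two z-scores are roughly -1.20 and 2.80, giving a probability of about 0.88.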
Example #5: A brake inspection station reports that 15% of all cars tested have brakes in need of
replacement pads. For a sample of 20 cars that come to the inspection station,
What is the probability that exactly 3 cars have defective brakes?
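Because Example #5 asks about an exact count in a small sample, the binomial formula from Section 8.1 is the natural tool rather than the Normal model (here np = 3, which is below 10). A minimal sketch, assuming the 20 cars are independent with p = 0.15:

```python
import math

n = 20    # cars in the sample
p = 0.15  # proportion of cars needing replacement pads
k = 3     # number of cars with defective brakes

# Binomial probability: C(n, k) * p^k * (1 - p)^(n - k)
prob_exactly_k = math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Expected count of successes; the Normal condition needs at least 10.
expected_successes = n * p

print(f"P(X = {k}) = {prob_exactly_k:.4f}")
print(f"np = {expected_successes:.1f}")
```

The probability works out to about 0.243.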
Example #6: In a large class of introductory Statistics
students, the professor has each person toss a coin 16 times and calculate the proportion of his or her tosses that were heads. The students then report their results, and the
professor plots a histogram of these several proportions.
What shape would you expect this histogram to be? Why?
Where do you expect the histogram to be centered?
How much variability would you expect among these proportions?
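For Example #6, the exact distribution of the proportion of heads in 16 tosses can be computed directly from the binomial probabilities, which gives a reference point for the histogram the class would produce. This sketch assumes a fair coin (p = 0.5) and independent tosses.

```python
import math

n = 16   # tosses per student
p = 0.5  # assumed probability of heads for a fair coin

# Exact probability of each possible sample proportion k/n.
pmf = {k / n: math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# Center and spread of the sampling distribution of p-hat.
mean_p_hat = sum(value * prob for value, prob in pmf.items())
sd_p_hat = math.sqrt(sum((value - mean_p_hat) ** 2 * prob for value, prob in pmf.items()))

print(f"mean of p-hat = {mean_p_hat:.3f}")
print(f"SD of p-hat   = {sd_p_hat:.3f}")
print(f"np = {n * p:.0f}, n(1 - p) = {n * (1 - p):.0f}")
```

The center is 0.5 and the standard deviation is 0.125, matching p and sqrt(p(1 - p)/n). Note that np = 8 falls short of 10, so the Normal model is only a rough description, although the distribution is symmetric because p = 0.5.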
Example #7: The candy company claims that 10% of the M&M’s it produces are green. Suppose that the candies are packaged at random in small bags containing about 50 M&M’s. A class of elementary school students learning about percents opens several bags, counts the various colors of the candies, and calculates the proportion that are green.
If we plot a histogram showing the proportions of green candies in the various bags, what shape would you expect it to have?
Can that histogram be approximated by a Normal model? Explain.
Where should the center of the histogram be?
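The questions in Example #7 come down to checking the conditions for the sampling model. The helper below is a hypothetical convenience function (its name and return format are assumptions) that returns the model's mean, its standard deviation, and whether the Normal condition holds. It treats each bag as exactly 50 candies, although the problem says "about 50".

```python
import math


def sampling_model(p, n):
    """Return (mean, SD, normal_ok) for the sampling model of p-hat."""
    mean = p
    sd = math.sqrt(p * (1 - p) / n)
    # Normal condition: expected successes and failures are both at least 10.
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return mean, sd, normal_ok


# Example #7: claimed 10% green, bags assumed to hold exactly 50 candies.
mean, sd, normal_ok = sampling_model(p=0.10, n=50)

print(f"center = {mean:.2f}, SD = {sd:.4f}, Normal model appropriate: {normal_ok}")
```

With only about 5 green candies expected per bag, the condition fails, which suggests a histogram skewed to the right and centered near 0.10 rather than a Normal shape.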