STATISTICS!
Chapter 1
We encounter statistics in our daily lives more often than we probably realize and from many different
sources, like the news. Typically, when you read a newspaper article or watch a television news program, you are given sample information. With this information, you make a decision
about the correctness of a statement, claim, or “fact”. Statistical methods can help us make the “best educated guess”.
Statistics
Main Goal
The common and important goal of statistics:
collect data from a small part of a larger group
so that we can learn something about the larger
group
Definitions
Data - observations that you or someone else records; numeric or nonnumeric
Examples: measurements, genders, song titles stored on iTunes
Statistics – the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting,
analyzing, interpreting, and drawing conclusions based on the data
Population – a collection of ALL persons, or ALL things, or ALL objects under study
Census – collection of data from every member of a population
Sample – a portion (subset) of the population
Voluntary Response Sample – A sample in which the respondents themselves decide whether to be included.
Ex: Radio host asks listeners to call in to give their opinion on a topic.
Self-Interest Study – A study in which the sponsor can enjoy monetary or other gains from the results.
Ex: Hershey performs a study on whether chocolate is good for you.
Statistically Significant – Results are statistically significant if they are unlikely to occur by chance.
Practical Significance – Results are practically significant if the difference they show is large enough to justify acting on it (performing the test, undergoing the treatment, etc.).
Potential Pitfalls
Misleading Conclusions – We may find that a correlation exists between two data sets, but that doesn’t mean that one is caused by the other.
(Correlation does not imply causation.)
Reported Results – Take measurements instead of conducting a survey. (If you ask someone their weight, they may tell you a desired weight instead of an actual weight.)
Small Samples – We’ll talk about what constitutes a large enough sample size later. (If I survey 3 students in the class, and they all say math is their favorite subject, can I conclude that math is the favorite subject of all students in the class?)
Example 2
A hip hop radio show broadcast in the city of Puddelton asked people to call in and express their opinions on the new mayor.
a) Are the results likely to be representative of all adults in Puddelton?
b) Of all listeners to the hip hop show?
Answer:
a) No, the results are not representative of all adults in Puddelton because a hip hop radio show is likely to attract a younger audience.
b) No, the results are not representative of all listeners to the show because this is a voluntary response sample: listeners themselves choose whether to respond, and those with stronger opinions are more likely to do so.
Example 3
In a test of the Atkins weight loss program, 40 subjects
using the program had a mean weight loss of 4.6 lb after
one year.
Answer
The diet was effective because the participants lost weight. The study could be shown to have statistical significance.
The study does not have practical significance, however, because the weight loss is probably not large enough to justify following the diet program for one year.
Example 4
Big Smile Dental’s billboard ad states that their dental rinse will
“reduce plaque by over 300%”. What is wrong with this statement?
(Textbook page 15 1.2 #36)
Answer:
A reduction of 100% would eliminate all plaque, so it’s not
possible to reduce it by more than 100%.
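The arithmetic behind this answer can be checked with a short sketch. The function name and the starting amount of 10 units are illustrative assumptions, not part of the textbook problem.

```python
def plaque_remaining(initial_amount: float, percent_reduction: float) -> float:
    """Amount left after reducing initial_amount by percent_reduction percent."""
    return initial_amount * (1 - percent_reduction / 100)

# Hypothetical starting amount: 10 units of plaque.
print(plaque_remaining(10.0, 50))   # 5.0  (half is left)
print(plaque_remaining(10.0, 100))  # 0.0  (nothing is left)
print(plaque_remaining(10.0, 300))  # -20.0 (a negative amount, which is impossible)
```

A reduction above 100% always gives a negative amount, so the billboard claim cannot be taken literally.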
Definitions
Parameter – a number that represents a property of the population (ALL)
Statistic – a number that represents a property of the sample
(SUBSET)
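To make the distinction concrete, here is a minimal sketch that computes the same kind of number (a mean) once for a whole population and once for a sample. The population of 500 quiz scores is made-up data generated only for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so reruns give the same numbers

# Hypothetical population: quiz scores for ALL 500 students in a school.
population = [rng.randint(0, 100) for _ in range(500)]

# A sample: 25 of those students chosen at random, without replacement.
sample = rng.sample(population, 25)

parameter = mean(population)  # describes the population (ALL)
statistic = mean(sample)      # describes the sample (SUBSET)

print(f"parameter (population mean) = {parameter:.1f}")
print(f"statistic (sample mean)     = {statistic:.1f}")
```

The two values are usually close but not equal, which is why a statistic is treated as an estimate of the parameter.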
Example 5
Determine whether the given value is a parameter or statistic.
a) In a poll of 1010 adults in the United States, 55% of the respondents said that they used local TV stations daily as a source of news.
Answer: STATISTIC
b) Among the flights included in the sample, 21% arrived late.
Answer: STATISTIC
c) The average atomic weight of all elements in the periodic table is 134.355 amu.
Answer: PARAMETER
d) After inspecting all of 55,000 kg of meat stored at the Wurst Sausage Company, it was found that 45,000 kg of the meat was spoiled.
Answer: PARAMETER
(Textbook page 21 1.3 #8, 10)
Definitions
Quantitative data (quantity = numerical) – are the result of counting or measuring attributes of a population
Examples: amount of money, pulse rate, weight, number of people living in your city, the weight of supermodels, the age of respondents
Qualitative data (quality = categorical) – are the result of categorizing or describing attributes of a population
Examples: hair color, blood type, ethnic group, the car a person drives, street a person lives on, gender of professional athletes, shirt numbers on professional athletes’ uniforms as a substitute for names
Quantitative data can be further described by distinguishing between discrete and continuous types
Discrete data – all data that are a result of counting
Example: the number of eggs that a hen lays
Example: If you count the number of phone calls you receive for each day of the week, you might get values such as zero, one, two, or three
Example: the number of books in your backpack
Continuous data – all data that are a result of measuring
Example: measuring angles in radians
Example: the weight of your backpack
Example: the area of lawns on your street
Example 6
Determine the correct data type – qualitative or quantitative (discrete/continuous):
a) the number of pairs of shoes you own Answer: quantitative, discrete
b) the type of car you drive Answer: qualitative
c) where you go on vacation Answer: qualitative
d) the distance it is from your home to the nearest grocery store Answer: quantitative, continuous
e) the number of classes you take per school year Answer: quantitative, discrete
f) the cost of your fall classes Answer: quantitative, continuous
g) the type of calculator you use Answer: qualitative
h) movie rating Answer: qualitative
i) political party preference Answer: qualitative
j) weights of sumo wrestlers Answer: quantitative, continuous
k) amount of money (in dollars) won playing poker Answer: quantitative, discrete
l) number of correct answers on a multiple choice quiz Answer: quantitative, discrete
m) people’s attitudes toward the government Answer: qualitative
Random vs Simple Random Sample
Random Sample
Members from the population are selected in such a way that each
individual member in the population has an equal chance of being selected.
Simple Random Sample
A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.
Example 7
Lisa wants to form a four-person study group (herself and three other people) from her pre-calculus class of 31 students (including Lisa). How can her selection of groupmates be random?
One way would be to put everyone else’s name in a hat, and pick out three names.
Another way is to alphabetize the other 30 names, assign each a number from 1 to 30, and use a random number generator to pick three numbers.
TI83 step by step:
1. press MATH
2. arrow over to PRB
3. press 5 to get the randInt tool and enter 1, 30, 3 close parenthesis (if a number repeats, generate again)
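The same selection can be sketched in Python for anyone without a calculator. This assumes Lisa's 30 classmates have already been alphabetized and numbered 1 to 30; the variable names are illustrative.

```python
import random

# Lisa's 30 classmates, numbered 1-30 after alphabetizing (names omitted).
classmate_numbers = list(range(1, 31))

# random.sample draws without replacement, so no classmate is picked twice
# and every possible group of three is equally likely.
groupmates = random.sample(classmate_numbers, 3)
print(sorted(groupmates))
```

Because the draw is without replacement, there is no need to regenerate when a number repeats, unlike the calculator's randInt.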
Always remember!!!
It is important to select the sample of subjects in such a way that
the sample is likely to be representative of the larger population.
Other sampling methods
Systematic: select some starting point and then select every kth (such as every 7th) element of the population
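A minimal sketch of systematic sampling follows, assuming the population is an ordered list. The roster of 100 ID numbers and the function name are hypothetical.

```python
import random

def systematic_sample(population, k, rng=random):
    """Pick a random start among the first k elements, then take every kth element."""
    start = rng.randrange(k)  # index 0 .. k-1
    return population[start::k]

students = list(range(1, 101))  # hypothetical roster of 100 ID numbers
picked = systematic_sample(students, k=7)
print(picked)
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the spacing k.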
Convenience – use results that are easy to get
Stratified – divide the population into groups called strata
and then take a proportionate number from each stratum
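Stratified sampling with a proportionate draw can be sketched as below. The strata sizes, the 10% fraction, and the function name are assumptions chosen for illustration.

```python
import random

def stratified_sample(strata, fraction, rng=random):
    """Draw the same fraction from each stratum (proportionate allocation)."""
    sample = {}
    for name, members in strata.items():
        n = round(len(members) * fraction)
        sample[name] = rng.sample(members, n)  # random draw within the stratum
    return sample

# Hypothetical strata: student ID numbers grouped by class year.
strata = {
    "freshman": list(range(0, 400)),
    "sophomore": list(range(400, 700)),
    "junior": list(range(700, 900)),
    "senior": list(range(900, 1000)),
}
chosen = stratified_sample(strata, fraction=0.10)
print({name: len(ids) for name, ids in chosen.items()})
# {'freshman': 40, 'sophomore': 30, 'junior': 20, 'senior': 10}
```

Every stratum is represented in the sample, in proportion to its size in the population.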
Cluster – divide the population into clusters (groups)
and then randomly select some of the clusters
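Cluster sampling differs from stratified sampling in that whole groups are chosen at random and everyone in a chosen group is included. Here is a sketch under that reading; the four clusters of 50 students each are made-up data.

```python
import random

def cluster_sample(clusters, num_clusters, rng=random):
    """Randomly choose whole clusters and keep every member of each chosen one."""
    chosen_names = rng.sample(sorted(clusters), num_clusters)
    members = [m for name in chosen_names for m in clusters[name]]
    return chosen_names, members

# Hypothetical clusters: four class years of 50 students each.
clusters = {year: [f"{year}-{i}" for i in range(50)] for year in (1, 2, 3, 4)}
years, students = cluster_sample(clusters, num_clusters=2)
print(years, len(students))
```

The randomness is applied to the clusters, not to the individuals inside them.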
Example 9 What is the sampling method?
a) A sample of 100 undergraduate students is taken by organizing the students’ names by classification (freshman, sophomore, junior, senior) and then selecting 25 students from each.
Answer: stratified
b) A random number generator is used to select a student from the alphabetical listing of all undergraduate students in the fall semester. Starting with that student, every 50th student is chosen until 75 students are included in the sample.
Answer: systematic
c) A completely random method is used to select 75 students. Each undergraduate student in the fall semester has the same probability of being chosen at any stage of the sampling process.
Answer: simple random
d) The freshman, sophomore, junior, and senior years are numbered one, two, three, and four, respectively. A random number generator is used to pick two of those years. All students in those two years are in the sample.
Answer: cluster
e) An administrative assistant is asked to stand in front of the library one Wednesday and to ask the first 100
undergraduate students he encounters what they paid for tuition the fall semester. Those 100 students are the sample.
Answer: convenience
We typically obtain data from two sources.
In an observational study, we observe and measure specific characteristics but we don’t attempt to modify the subjects being studied
In an experiment, we apply some treatment and then proceed
to observe its effects on the subjects
Example 10
Determine if the given description corresponds to an observational study or an experiment.
a) In a clinical trial of the cholesterol drug Lipitor, 188 subjects were given 20-mg doses of the drug, and 3.7% of them experienced nausea.
Answer: this is an experiment because the subjects were given a treatment consisting of Lipitor.
b) In a study sponsored by Coca-Cola, 1,250 people were asked what contributes most to their happiness, and 77% of the respondents said that it was their family or partner.
Answer: this is an observational study because the survey subjects were not given any treatment; their responses were simply observed.
(Textbook page 32)
Example 11
Often, experiments are better than observational studies because experiments reduce the chance of lurking variables. A lurking
variable is one that is not part of the study but affects the variables that are actually part of the study.
Little Billy is on an airplane and tells his mom “I wish they would stop
turning on the seat belt sign because it makes everything so bumpy.”
Answer:
Because one event consistently follows the other, the boy assumes causation.
In reality the two events are correlated, but both are responses to a third variable.
Turbulence = lurking variable
Types of Observational Studies
Retrospective (or case-control) study
Data are collected from a past time period by going back in time through examination of records, interviews, and so on.
Cross-sectional study
Collect information about individuals at a specific point in time.
Cohort (or prospective) study
Go forward in time and observe groups sharing common factors, such as smokers and nonsmokers.
The cohort is observed over a long period of time.
Example 12 Identify the type of observational study used.
a) In order to study the seriousness of drinking and driving, a researcher obtains records from past car crashes. Drivers are partitioned into a group that had no alcohol consumption and another group that did have evidence of alcohol consumption at the time of the crash.
Answer: retrospective study
b) Researchers at the National Cancer Institute studied meat consumption and its relationship to mortality. Approximately one-half million people were surveyed, and they were then followed for a period of 10 years.
Answer: cohort study
(Textbook page 34)
Three very important considerations in the design of experiments are the following:
1. Use randomization to assign subjects to different groups.
2. Use replication by repeating the experiment on enough
subjects so that effects of treatment or other factors can be clearly seen.
3. Control the effects of variables by using such techniques as
blinding and a completely randomized experimental design.
Experimental designs
Bad experimental design – treat all women subjects and give the men a placebo
Problem – we don’t know if effects are due
to sex or to treatment
Completely randomized design – use randomness to determine who gets the treatment and who gets the placebo
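One way to carry out a completely randomized design is to shuffle the subject list and split it in half. This is a sketch under that assumption; the 20 subject IDs and the function name are hypothetical.

```python
import random

def completely_randomized(subjects, rng=random):
    """Shuffle the subjects and split them into treatment and placebo groups."""
    shuffled = list(subjects)  # copy so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 21)]  # hypothetical IDs
treatment, placebo = completely_randomized(subjects)
print(len(treatment), len(placebo))  # 10 10
```

Chance alone decides who is treated, so characteristics such as sex tend to be spread across both groups.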
Randomized block design –
1) Form a block of women and a block of men
2) Within each block, randomly select subjects to be treated
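The two steps above can be sketched as follows: the blocks are formed first, and the random assignment happens separately inside each block. The subject IDs and the half-and-half split are illustrative assumptions.

```python
import random

def randomized_block(blocks, rng=random):
    """Within each block, randomly assign half to treatment and half to placebo."""
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        assignment[block_name] = {
            "treatment": shuffled[:half],
            "placebo": shuffled[half:],
        }
    return assignment

# Hypothetical subject IDs, blocked by sex.
blocks = {
    "women": [f"W{i}" for i in range(1, 11)],
    "men": [f"M{i}" for i in range(1, 11)],
}
result = randomized_block(blocks)
for name, groups in result.items():
    print(name, len(groups["treatment"]), len(groups["placebo"]))
```

Because each block contributes to both groups, a difference between treatment and placebo cannot be explained by sex alone.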
Matched pairs design – get measurements from the same subjects before and after some treatment, or from pairs of closely matched subjects (such as twins) who receive different treatments
Example 14 Identify the experiment design
a) A clinical trial of aspirin treatments is being planned to determine whether the rate of myocardial infarctions (heart attacks) is different for men and women.
Answer: randomized block design
b) The HIV Trials Network is conducting a study to test the effectiveness of two different experimental HIV vaccines. Subjects will consist of 80 pairs of twins. For each pair of twins, one of the subjects will be treated with the DNA vaccine and the other will be treated with the adenoviral vector vaccine.
Answer: matched pairs design
c) Currently, there is no approved vaccine for the prevention of West Nile virus. A clinical trial of a possible vaccine is being planned to include subjects treated with the vaccine while other subjects are given a placebo.
Answer: completely randomized design
(Textbook page 34)
Example 15
In “Cardiovascular Effects of Intravenous Triiodothyronine in Patients Undergoing Coronary Artery Bypass Graft Surgery” the authors explain that patients were assigned to one of three groups:
a group treated with triiodothyronine
a group treated with normal saline bolus and dopamine
a placebo group given normal saline
The authors summarize the sample design as a “prospective, randomized, double-blind, placebo-controlled trial.” Describe the meaning of each of those terms in the context of this study.
(textbook page 34)
Example 15
Prospective: the experiment was begun and the results were followed forward in time
Randomized: subjects were assigned to the different groups through a process of random selection whereby they had the same chance of
belonging to each group
Double-blind: the subjects didn’t know which of the three groups they were in and the people who evaluated results did not know either