Math and Science Bridge Program
Year 1: Statistics and Probability
Dr. Tamara Pearson
Assistant Professor of Mathematics
Research Paperwork
Informed Consent
Pre-Survey
After you complete the survey please
write in your journal about the following:
What do you expect to learn about statistics
and/or teaching from this series of professional development workshops?
What fears (if any) do you have about
statistics?
What personal experiences have you had with
learning statistics?
Session 1
Introduction to Statistics
Agenda
9:00am-9:30am What is Statistics?
9:30am-10:30am The Importance of Statistical Literacy
10:30am-11:30am The 4-Step Process
11:30am-12:30pm LUNCH
12:30pm-2:00pm Types of Studies
2:00pm-3:00pm Types and Levels of Data
3:00pm-3:30pm Wrap-Up and Reflection
Professional Development Website
faculty.clayton.edu/tpearson5
Handouts
PowerPoint Presentations
Websites
What is statistics?
Write a sentence describing what the word “statistics” means to you.
Statistics
Statistics is the science of planning studies and experiments, obtaining data, and then organizing,
summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
Population
Population is the complete collection of all individuals (scores, people,
measurements, and so on) to be studied; the collection is complete in the sense that it includes all of the individuals to be studied.
Census vs. Sample
Census:
Collection of data from every member of a population
Sample:
Subcollection of members selected from a population
Data must be collected in an appropriate way. Otherwise the data may be useless.
So You Wanna Lose Weight
Which is the “best” method for losing 5 pounds by June?
Weight Watchers http://youtu.be/hvohHi69o9w
LA Fitness http://youtu.be/fcjP0tjhxss
Lipozene http://www.lipozene.com
THE IMPORTANCE OF
STATISTICAL LITERACY
Key Concept
You cannot rely on blind acceptance of mathematical calculations. You should consider these factors:
Context of the data
Source of the data
Sampling method
Conclusions
Practical implications
Context
40 46 42 60 60 75 45 51 50 98 36 50 55 120 40 105 52 70 29 53
What questions do you have about this data?
Context
What do the values represent?
Where did the data come from?
Why were they collected?
An understanding of the context
will directly affect the statistical
procedure used.
Context
MEDIAN INCOME POTENTIAL (thousands)
MAJORS GROUP                     LOW   HIGH
Arts                              40     46
Biology and Life Science          42     60
Business                          60     75
Communications and Journalism     45     51
Computers and Mathematics         50     98
Education                         36     50
Engineering                       55    120
Health                            40    105
Physical Sciences                 52     70
Psychology and Social Work        29     53
The Economic Value of a Bachelor’s Degree
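Once the context is known, one simple question to ask of the table above is how far apart the low and high medians are within each majors group. The sketch below uses only the values shown on the slide. The names `MEDIAN_INCOME` and `income_gaps` are illustrative assumptions, not part of the workshop materials.

```python
# Median income potential (thousands) by majors group, copied from the slide.
MEDIAN_INCOME = {
    "Arts": (40, 46),
    "Biology and Life Science": (42, 60),
    "Business": (60, 75),
    "Communications and Journalism": (45, 51),
    "Computers and Mathematics": (50, 98),
    "Education": (36, 50),
    "Engineering": (55, 120),
    "Health": (40, 105),
    "Physical Sciences": (52, 70),
    "Psychology and Social Work": (29, 53),
}


def income_gaps(table):
    """Return {group: high - low} for each majors group."""
    return {group: high - low for group, (low, high) in table.items()}


gaps = income_gaps(MEDIAN_INCOME)

# List the groups from the widest gap to the narrowest.
for group, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{group}: {gap}")
```

A wide gap is a prompt for more questions about context (which majors sit at each end of the group?), not a conclusion in itself.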
Source of Data
Is the source objective?
Is the source biased?
Is there some incentive to distort or
spin results to support some self-serving position?
Is there something to gain or lose by
distorting results?
Be vigilant and skeptical of studies from
sources that may be biased.
The source of the previous data is:
"What’s It Worth? The Economic Value of College Majors"
Sampling Method
Does the method chosen greatly
influence the validity of the conclusion?
Voluntary response (or self-selected)
samples often have bias (those with special interest are more likely to participate). These samples’ results are not necessarily valid.
Other methods are more likely to
Conclusions
Make statements that are clear to those
without an understanding of statistics and its terminology.
Counseling Psychology majors make median earnings of $29,000 per year, compared to $120,000 for Petroleum Engineering majors.
Avoid making statements not justified
by the statistical analysis.
It would be unwise to conclude from the data presented that obtaining a degree in engineering means you will make $120,000 a year.
Practical Implications
State practical implications of the
results.
What are the implications of the results shown previously?
There may exist some statistical
significance yet there may be NO
practical significance.
Common sense might suggest that the
finding does not make enough of a difference to justify its use or to be practical.
Statistical Significance
Consider the likelihood of getting
the results by chance.
If results could easily occur by
chance, then they are not
statistically significant.
If the likelihood of getting the
results is small, then the results
are statistically significant.
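The idea of "getting the results by chance" can be made concrete with a coin-flip calculation. The sketch below is an illustration under assumed numbers (100 flips of a fair coin, with 52 or 60 heads observed); these figures and the function name `prob_at_least` are not from the workshop.

```python
from math import comb


def prob_at_least(k, n):
    """Probability of k or more heads in n flips of a fair coin."""
    # Count the outcomes with at least k heads, then divide by all 2**n outcomes.
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return favorable / 2 ** n


p_52 = prob_at_least(52, 100)
p_60 = prob_at_least(60, 100)
print(f"P(at least 52 heads in 100 flips) = {p_52:.3f}")
print(f"P(at least 60 heads in 100 flips) = {p_60:.3f}")
```

A result of 52 heads could easily occur by chance, so it gives little reason to doubt the coin is fair. A result of 60 heads is much less likely under chance alone, which is the sense in which a result is called statistically significant.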
Misuses of Statistics
We should learn to distinguish between statistical conclusions that are likely to be valid and those that are seriously flawed.
Graphs
To correctly interpret a graph, you must analyze the numerical information given in the graph, so as not to be misled by the graph’s shape.
Correlation vs. Causation
Concluding that one variable causes the
other when in fact the variables are only linked
Two variables may seem linked, such as guns
and crimes; this relationship is called correlation.
We cannot conclude that one causes the
other.
Correlation does not imply causation.
Small Samples
Conclusions should not be based
on samples that are far too
small.
Example: Basing a school
suspension rate on a sample of
only three students
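One way to see why three students are too few: a sample of size n can only produce n + 1 different rates. The sketch below lists them. The function name `possible_rates` and the comparison size of 30 are illustrative assumptions.

```python
def possible_rates(sample_size):
    """All suspension rates a sample of the given size can produce."""
    return [suspended / sample_size for suspended in range(sample_size + 1)]


for n in (3, 30):
    rates = possible_rates(n)
    step = rates[1] - rates[0]
    print(f"n={n}: {len(rates)} possible rates, in steps of {step:.1%}")
```

With three students, the estimate can only be 0, 1/3, 2/3, or 1, so a single student changes the reported rate by about 33 percentage points.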
Nonresponse
Occurs when someone either refuses
to respond to a survey question or is unavailable.
People who refuse to talk to pollsters
have a view of the world around them that is markedly different than those who will let poll-takers into their homes.
Missing Data
Can dramatically affect results.
Subjects may drop out for reasons
unrelated to the study.
People with low incomes are less
likely to report their incomes.
US Census suffers from missing
people (tend to be homeless or low income).
Self-Interest Study
Some parties with interest to
promote will sponsor studies.
Be wary of a survey in which the
sponsor can enjoy monetary gain from the results.
When assessing validity of a study,
always consider whether the sponsor
The 4-Step Process
Formulate questions
Collect data
Analyze data
Interpret results
The 4-Step Process
Formulate questions
Formulate questions and determine how data can be collected and analyzed to provide an answer
Collect data
Analyze data
Interpret results
The 4-Step Process
Formulate questions
Collect data
Design and implement a data collection plan for statistical studies, including observational studies, sample surveys, and experiments.
Analyze data
Interpret results
The 4-Step Process
Formulate questions
Collect data
Analyze data
Identify appropriate ways to summarize numerical or categorical data using tables, graphical displays, and numerical summary statistics.
Interpret results
The 4-Step Process
Formulate questions
Collect data
Analyze data
Interpret results
Understand the meaning of statistical significance and the difference between statistical significance and practical significance.
Advertising in the Yellow Pages
“Yes, phone books are still around. And while they eventually may succumb to the Internet, they’re not going down without a fight. Publishers threw 422 million directories on America’s lawns and doorsteps last year, according to research firm Simba Information. And businesses paid a collective $6.9 billion for ads in them, according to the market research firm BIA/Kelsey.” Bloomberg Business Week, March 2012, “The Golden Allure of the Yellow Pages”
Advertising in the Yellow Pages
What is the average number of ads on a YP page?
Use a “representative sample” of 30
pages.
Complete all parts of the 4-step
process
Create a group poster that conveys
your findings.
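One possible way to choose the 30 pages is simple random sampling of page numbers, sketched below. This is only one reasonable approach, not necessarily the one intended for the activity. The directory length of 400 pages and the names `choose_pages` and `average_ads` are assumptions for illustration.

```python
import random


def choose_pages(total_pages, sample_size=30, seed=None):
    """Pick page numbers by simple random sampling, without repeats."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_pages + 1), sample_size))


def average_ads(ad_counts):
    """Mean number of ads per sampled page."""
    return sum(ad_counts) / len(ad_counts)


# total_pages=400 is a made-up directory length; use your book's real page count.
pages = choose_pages(total_pages=400, seed=1)
print(pages)
```

After counting the ads on each chosen page, pass the list of counts to `average_ads` to get the sample mean.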
TYPES OF STUDIES
Basics of Collecting Data
Statistical methods are driven by the data that we collect. We typically obtain data from two distinct sources: observational studies and experiments.
Observational Study vs. Experiment
Observational study: Observing and measuring specific characteristics without attempting to modify the subjects being studied.
Experiment: Apply some treatment
and then observe its effects on the subjects (subjects in experiments are called experimental units)
Sample Survey: used to estimate or
make decisions about characteristics of populations
Types of Studies
For observational studies:
Cross sectional study: data are observed,
measured, and collected at one point in time
Retrospective (or case control) study: data
are collected from the past by going back in time (examine records, interviews, …)
Prospective (or longitudinal or cohort)
study: data are collected in the future from
groups sharing common factors (called cohorts)
Experiment Design
• Randomization is used when subjects are
assigned to different groups through a process of random selection.
• The goal is to use chance as a way to create two groups that are similar.
• Found to be an extremely effective method
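Random assignment can be sketched in a few lines: shuffle the subjects, then split the shuffled list in half. The subject labels, the group names, and the function name `randomize` are hypothetical, chosen only for illustration.

```python
import random


def randomize(subjects, seed=None):
    """Shuffle the subjects, then split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject labels.
subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomize(subjects, seed=42)
print("Treatment:", treatment)
print("Control:", control)
```

Because chance alone decides who lands in each group, neither the experimenter nor the subjects can steer particular people toward the treatment.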
Blinding
Blinding is a technique in which the
subject doesn’t know whether he or she is receiving a treatment or a placebo. Blinding allows us to determine whether the treatment effect is significantly different from a placebo effect, which occurs when an untreated subject reports improvement in symptoms.
Double-Blind occurs at two levels:
The subject doesn’t know whether he or she
is receiving the treatment or a placebo
The experimenter does not know whether he
or she is administering the treatment or placebo
Sample Survey
Must be aware of the survey design
Loaded Questions
Survey questions can be “loaded” or intentionally worded
to elicit a desired response.
Too little money is being spent on “welfare” versus too
little money is being spent on “assistance to the poor.”
Results: 19% versus 63%
Order of Questions
Questions are unintentionally loaded by such factors as
the order of the items being considered.
Would you say traffic contributes more or less to air
pollution than industry?
Results: traffic - 45%; industry - 27%
When the order is reversed:
Results: industry - 57%; traffic - 24%
Errors
No matter how well you plan and execute the sample collection process, there is likely to be some error in the results.
• Sampling error: the difference between a
sample result and the true population result; such an error results from chance sample fluctuations
• Nonsampling error: sample data
incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly)
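Sampling error can be illustrated with a small simulation: draw many random samples from a known population and measure how far each sample mean lands from the true mean. The population (the whole numbers 1 to 1000), the sample sizes, and the function names are assumptions made for this sketch. Nonsampling error is not simulated.

```python
import random


def mean(values):
    return sum(values) / len(values)


def average_sampling_error(population, n, repeats=200, seed=0):
    """Average absolute gap between a sample mean and the population mean."""
    rng = random.Random(seed)
    true_mean = mean(population)
    errors = [
        abs(mean(rng.sample(population, n)) - true_mean)
        for _ in range(repeats)
    ]
    return mean(errors)


# Hypothetical population: the whole numbers 1 through 1000.
population = list(range(1, 1001))
small = average_sampling_error(population, n=5)
large = average_sampling_error(population, n=200)
print(f"Average error with n=5:   {small:.1f}")
print(f"Average error with n=200: {large:.1f}")
```

Every sample is collected correctly here, so the gaps come only from chance fluctuation, and they shrink as the sample grows.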
Smart Phones Apps
Is there a relationship between a person’s age and the number of
apps on their smart phone?
Count the number of apps on your
phone.
When asked, give the number of
apps and your age.
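Once the ages and app counts are collected, one common way to summarize the relationship is a correlation coefficient. The sketch below computes Pearson's r by hand. Choosing Pearson's r is an assumption of this example, and the numbers are made-up placeholders, not workshop data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: fails with a division by zero if either list is constant.
    return sxy / sqrt(sxx * syy)


# Made-up placeholder values, NOT data from the workshop.
ages = [25, 32, 41, 48, 56]
apps = [60, 45, 38, 30, 22]
r = pearson_r(ages, apps)
print(f"r = {r:.2f}")
```

A value of r near -1 or 1 indicates a strong linear relationship, and a value near 0 indicates a weak one. Even a strong correlation would not show that age causes the number of apps.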
Word Memorization
Does grouping affect letter memorization?
You will be given one minute to
study the list of letters on your paper.
You will then have 30 seconds to
write down as many letters (in order) as you can remember.
Caffeine vs. Sleep
Is there a relationship between caffeine intake and sleep?
How many ounces of caffeinated
beverages do you drink in a typical day?
How many hours do you sleep in a
typical night?
Teacher Satisfaction
Teachers are less satisfied with their careers.
The percentage of teachers who say they are very or fairly likely to leave the profession has increased by 12 points since 2009, from 17% to 29%.
The percentage of teachers who do not feel their job is secure has grown since 2006 from 8% to 34%.
A majority of teachers (63%) report that class sizes have increased in the last year.
Taken from: Executive Summary of MetLife Survey of the American Teacher
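When reading figures like these, it helps to separate a change in percentage points from a relative (percent) change. The sketch below applies both calculations to the survey figures quoted above. The function names are illustrative assumptions.

```python
def point_change(old_pct, new_pct):
    """Change measured in percentage points."""
    return new_pct - old_pct


def relative_change(old_pct, new_pct):
    """Change measured as a percent of the starting value."""
    return (new_pct - old_pct) / old_pct * 100


# Figures quoted from the survey summary above (percent of teachers).
likely_to_leave = (17, 29)
job_not_secure = (8, 34)

for label, (old, new) in [("Likely to leave", likely_to_leave),
                          ("Job not secure", job_not_secure)]:
    print(f"{label}: {point_change(old, new)} points, "
          f"{relative_change(old, new):.1f}% relative increase")
```

A rise from 17% to 29% is 12 percentage points, but the share of teachers likely to leave grew by roughly 71% relative to where it started. Which figure a report leads with can shape the conclusion a reader draws.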
Student Performance
Read the Georgia Department of
Education press release.
Discuss what conclusions you would
draw from the information given.
What incorrect conclusions might
the public come to after reading this
press release?
TYPES OF DATA
Data
Data are collections of observations (such as measurements, genders,
survey responses)
Quantitative vs. Categorical Data
Quantitative (or numerical) data
consists of numbers representing counts or measurements
Age, weight, GPA, etc.
Categorical (or qualitative) data
consists of names or labels (representing categories)
Discrete vs. Continuous Data
Quantitative data can be further
described as discrete or continuous
Discrete data result when the number
of possible values is either a finite number or a ‘countable’ number.
Continuous (numerical) data result
from infinitely many possible values that correspond to some continuous scale that covers a range of values, without gaps, interruptions or jumps.
Levels of Measurement
Nominal: Data that cannot be arranged
in an ordering scheme (low to high)
Ordinal: Data that can be ordered but
differences between values (by subtraction) are meaningless
Interval: Like ordinal except the
difference between the values is
meaningful, but there is no ‘natural’ zero starting point and ratios are meaningless
Ratio: Like interval with the additional
property that there is also a natural zero starting point and ratios are meaningful
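The difference between the interval and ratio levels can be checked with temperatures, a standard textbook example that is not taken from this workshop. Celsius has no natural zero, so it is interval data. Kelvin starts at absolute zero, so it is ratio data. The specific temperatures below are assumptions for illustration.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# Differences are meaningful on both scales and agree with each other.
print(f"Difference: {warm_c - cool_c:.1f} C, {warm_k - cool_k:.1f} K")

# The Celsius ratio is not meaningful; only the Kelvin ratio is.
celsius_ratio = warm_c / cool_c
kelvin_ratio = warm_k / cool_k
print(f"Celsius ratio: {celsius_ratio:.2f}")
print(f"Kelvin ratio:  {kelvin_ratio:.3f}")
```

The Celsius ratio suggests that 20 degrees is "twice as hot" as 10 degrees, but the Kelvin ratio shows the warmer temperature is only a few percent higher.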
What Do You Want To Know?
Now that you have been introduced to the four-step process, think of a research question(s) you are interested in studying.
Formulate a research question(s).
What type of study (experimental or
observational) might be the best approach for your research question?
What is the population of interest?
What type of data will you need to collect
and how will you gather this data?
REFLECTION
Reflection
In your journals, write about the following:
What did you like most about
today’s professional development session?
What did you like the least?
What would you like to see in future