Statistics Chapter
Statistics
The process of attempting to organize and understand raw (unorganized) information.
Descriptive Statistics
Goals
Learn some terminology used in basic statistics.
Understand the purpose of basic statistical measures.
Learn how to compute and use basic statistical measures.
Wisconsin Pay Data
Statistics in Graphical Form
Frequency Distribution
1-way distance driven from home to SWTC (miles):
0.2 54 30 11
24 30 32 20
11 32 12 24
30 35 29

Miles Driven    Frequency
0 – 9           1
9 – 18          3
18 – 27         3
27 – 36         7
36 – 45         0
45 – 54         1
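A frequency distribution like this one can be tallied by hand or with a short script. The sketch below is one possible way to do it in Python; the function name and the rule for values that land exactly on a class boundary are assumptions, since the slide does not say which class a boundary value belongs to.

```python
# Commuting distances (miles) as listed on the slide.
distances = [0.2, 54, 30, 11, 24, 30, 32, 20, 11, 32, 12, 24, 30, 35, 29]

# Class boundaries from the slide: 0-9, 9-18, ..., 45-54.
edges = [0, 9, 18, 27, 36, 45, 54]


def frequency_distribution(data, edges):
    """Count how many values fall in each class.

    Assumption: a value equal to a shared boundary goes in the upper class,
    except the final boundary, which belongs to the last class.
    """
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for value in data:
        for i in range(len(counts)):
            lower, upper = edges[i], edges[i + 1]
            if lower <= value < upper or (i == last and value == upper):
                counts[i] += 1
                break
    return counts


frequencies = frequency_distribution(distances, edges)
for i, count in enumerate(frequencies):
    print(f"{edges[i]:>2} - {edges[i + 1]:<2}  {count}")
```

The six counts add up to 15, the number of students surveyed.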
Statistical Tools
Definition
A measure of central tendency is used to:
Represent or describe an entire group of data with
a single number.
Indicate the middle of a collection of data.
My Blueprint Reading Test Scores
82 76 75 90
71 72 80 83
Measures of Central Tendency
3 types…
1. mean
2. median
3. mode
Survey of 15 students:
1-way distance driven from home to SWTC:
0.2 55 30 11
24 30 32 20
11 32 12 23.5
30 35 29
Formula to compute the mean ($\bar{x}$):
$\bar{x} = \dfrac{\text{Sum of the data}}{n}$
Note:
In statistics:
n = the quantity of data you are working with.
0.2 55 30 11
24 30 32 20
11 32 12 23.5
30 35 29
Commuting Distance: mean ($\bar{x}$)
$\bar{x} = \dfrac{\text{Sum of the data}}{n} = \dfrac{374.7}{15} \approx 25$ miles
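The same arithmetic can be checked with a few lines of Python. This is only a sketch of the "sum of the data divided by n" formula applied to the 15 commuting distances; the variable names are illustrative.

```python
# 1-way commuting distances (miles) from the survey of 15 students.
commute_miles = [0.2, 55, 30, 11, 24, 30, 32, 20, 11, 32, 12, 23.5, 30, 35, 29]

n = len(commute_miles)      # quantity of data
total = sum(commute_miles)  # sum of the data
x_bar = total / n           # mean = sum of the data / n

print(f"n = {n}, sum = {total:.1f}, mean = {x_bar:.2f} miles")
```

Rounded to the nearest mile, the mean is 25 miles.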
Like any measure of central tendency, the mean is:
• a single value that is intended to represent the entire set of data.
• an attempt to show the middle of the data set.
Practice
Practice Set 1 – Compute the mean
The mean in practice…
Carpenters earn an average (mean) annual wage of $44,520…
Practice – Compute the Mean
Royal Oakes Subdivision
Assessed home prices on eight homes:
$125,000 $135,000 $140,000 $110,000 $150,000 $380,000 $127,000 $148,000
What is the mean home value? ($\bar{x}$ = Sum of the data ÷ n)
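Median – middle of an ordered set
88 72 71 92 60 83 75
Ordered set (lowest → highest): 60 71 72 75 83 88 92
Median = 75
If n is even…
88 72 71 92 60 83
Ordered set (lowest → highest): 60 71 72 83 88 92
Median = (72 + 83) ÷ 2 = 77.5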
Practice – Compute the Median
Royal Oakes Subdivision
Assessed home prices on eight homes:
$125,000 $135,000 $140,000 $110,000 $150,000 $380,000 $127,000 $148,000
What is the median home value?
Ordered set: $110,000 $125,000 $127,000 $135,000 $140,000 $148,000 $150,000 $380,000
Median = ($135,000 + $140,000) ÷ 2 = $137,500
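The median procedure (order the data, then take the middle value, or average the two middle values when n is even) can be sketched in Python as follows. The function name is illustrative; the mean is computed alongside only to show how far the single $380,000 home pulls it away from the median.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Royal Oakes Subdivision: assessed prices on eight homes.
home_prices = [125_000, 135_000, 140_000, 110_000, 150_000, 380_000, 127_000, 148_000]

median_price = median(home_prices)
mean_price = sum(home_prices) / len(home_prices)

print(f"median = ${median_price:,.0f}")
print(f"mean   = ${mean_price:,.0f}")
```

The same function also handles an odd number of values, such as the seven-value example on the earlier median slide.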
0.2 55 30 11
24 30 32 20
11 32 12 23.5
30 35 29
mean = 25 miles
median = 29 miles
Median
Ordered set: 0.2 11 11 12 20 23.5 24 29 30 30 30 32 32 35 55
The 8th of the 15 ordered values is the median: 29 miles.
Practice
Practice Set 2 – Compute the median
Mode – the most frequently occurring value.
Soft Drinks
0.2 55 30 11
24 30 32 20
11 32 12 23.5
30 35 29
mean = 25 miles
median = 29 miles
mode = 30 miles
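Finding the mode amounts to counting how often each value occurs. A minimal Python sketch, assuming the 15 commuting distances from the survey; it returns a list because a data set can have more than one mode.

```python
from collections import Counter

# 1-way commuting distances (miles) from the survey of 15 students.
commute_miles = [0.2, 55, 30, 11, 24, 30, 32, 20, 11, 32, 12, 23.5, 30, 35, 29]

counts = Counter(commute_miles)   # value -> number of occurrences
highest = max(counts.values())    # the largest count
modes = [value for value, count in counts.items() if count == highest]

print(f"mode(s): {modes}, each occurring {highest} times")
```

Here 30 miles occurs three times, more often than any other distance.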
Practice
Practice Set 3 – Compute the mode.
Review of Section 1
Review all three measures of central tendency
Practice Set 4
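Weighted Average
Average Price per Item (Welding Gloves)
Number of Items Purchased x Price per Item = Cost
40 x $4.85 = $194.00
10 x $6.50 = $65.00
5 x $6.75 = $33.75
80 x $4.95 = $396.00
90 x $5.05 = $454.50
Totals: 225 items, $1143.25
Average Price = ? The simple average of the five prices is $5.62, but it ignores how many items were bought at each price.
Weighted average = $1143.25 ÷ 225 items ≈ $5.08 per item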
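Weighted Average
Grade Point Average
Grade: Grade as a Number x Number of Credits = Grade Points
A: 4 x 5 = 20
A: 4 x 2 = 8
C: 2 x 3 = 6
B: 3 x 3 = 9
D: 1 x 1 = 1
Totals: 14 credits, 44 gr. pts.
Average Grade = ? The simple average of the five grade numbers is 2.8, but it ignores the number of credits for each grade.
Weighted average = 44 gr. pts. ÷ 14 credits = 3.14 GPA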
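Both weighted-average examples follow the same pattern: multiply each value by its weight, add the products, and divide by the total weight. A small Python sketch of that pattern is shown below; the function name is illustrative, and the grade numbers (A = 4, B = 3, C = 2, D = 1) are assumed from the grade points on the slide.

```python
def weighted_average(values, weights):
    """Sum of (value x weight) divided by the sum of the weights."""
    total = sum(value * weight for value, weight in zip(values, weights))
    return total / sum(weights)


# Welding gloves: price per item, weighted by number of items purchased.
prices = [4.85, 6.50, 6.75, 4.95, 5.05]
items = [40, 10, 5, 80, 90]
average_price = weighted_average(prices, items)

# GPA: grade as a number, weighted by number of credits.
grades = [4, 4, 2, 3, 1]
credits = [5, 2, 3, 3, 1]
gpa = weighted_average(grades, credits)

print(f"average price per item = ${average_price:.2f}")
print(f"GPA = {gpa:.2f}")
```

In both cases the weighted result differs from the simple average because the larger weights (80 and 90 items, 5 credits) count for more.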
Practice
Assignment List
Statistics
Section 2
Introduction
Making decisions based on central tendency.
Mean (average) depth = 4.2 ft
Moral of the story…
Even though knowing mean, median, and
Variability Defined
Variability refers to how different the data values are from one another.
In this group of numbers: 5, 5, 6, 6, 6, 6, 8
The variability is very low because the numbers are very similar in size.
You can also say that this collection of numbers is very
In this group of numbers: 2, 13, 14, 29, 36, 60, 91
The variability is higher because the numbers are not similar in size.
You can also say that this collection of numbers is not
Introduction to Variability
Who is the better bowler? Who is the more consistent bowler?
Bowler A: 128, 150, 103, 161, 117
Mean Score = 132
Scores are “all over the road”.
Bowler B: 141, 148, 151, 143, 149
Mean Score = 146
Scores are very similar.
Are the scores similar?
70 68 67 69 65
67 61 62 77 74
72 68 66 69 67
69 70 70 69 69
67 73 69 72 74
72 69 68 71 68
73 69 67 71 71
71 73 63 63 69
68 70 74 73 71
73 66 72 74 71
61 68 64 66 77
70 76 67 69 71
67 74 77 69 70
68 69 63 72 65
67 72 73 68 77
70 72 66 69
63 70 70 69 69
74 69 69 74 70
70 72 68 74 70
65 71 68 68 74
73 76 70 69 75
67 73 71 73 69
71 68 72 73 70
75 65 73 72 72
71 68 71 66 71
68 70 67 69 68
66 70 76 75 71
71 75 68 71 65
69 70 75 72 77
74 73 69 70 73
68 78 69 75 69
76 68 74 69 69
68 68 65 69 68
66 67 67 68 71
70 67 70
Steve Stricker 2009 PGA Data
Ben Crane 2009 PGA Data
Which golfer is more consistent?
Tools to Measure Variability
Range
Range = Highest Number - Lowest Number
A small range means the data values are all very similar.
A large range means the data values are dissimilar.*
Standard Deviation (later…)
Range = High – Low
Range = $5.329 – $2.509
Range = $2.82
Variability: Range
Who is the more consistent welder based on number of defective welds per week?
Compute the range to decide.
Welder A: 5, 2, 2, 1, 4, 4
Range = 5 – 1 = 4
Welder B: 8, 8, 6, 7, 7, 6
Range = 8 – 6 = 2
Since the range is smaller for Welder B, he/she appears to be the more consistent welder…
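The range comparison for the two welders can be sketched in Python like this. The function name is illustrative; it applies Range = Highest Number - Lowest Number to each list.

```python
def data_range(values):
    """Range = highest number - lowest number."""
    return max(values) - min(values)


# Defective welds per week.
welder_a = [5, 2, 2, 1, 4, 4]
welder_b = [8, 8, 6, 7, 7, 6]

range_a = data_range(welder_a)
range_b = data_range(welder_b)

print(f"Welder A range = {range_a}")
print(f"Welder B range = {range_b}")
```

A smaller range points to more consistency, although it says nothing about which welder produces fewer defects overall.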
Welding I Pay Data
Median annual pay for 17 cities in the upper Midwest as of October 2013:
median = $33,495
If you are willing to relocate anywhere in the region, how choosy do you have to be in terms of picking a city?
Compute the range to help answer this question:
Range = High – Low = $6,945
Practice
Practice Set 5 – Range, pages 17 & 18
Variability: Range Drawbacks
Which blueprint reading student has the least variability in their test scores?
Compute the range to decide.
Student A (test scores): 95, 94, 93, 65, 92, 89
Range = 95 – 65 = 30
Student B (test scores): 80, 61, 88, 65, 70, 58
Range = 88 – 58 = 30
Both ranges are 30, even though Student A's scores are very similar apart from one outlier (65).
Review
Measures of Central Tendency
mean median mode
Measures of Variability
range
standard deviation
Test Scores
75, 78, 60, 82, 85, 70, 70
mean = 74.3
median = 75
mode = 70
range = 25
standard deviation = 8.5
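All five summary values for these test scores can be reproduced with Python's standard `statistics` module. This is a sketch rather than the course's method; it assumes the 8.5 on the slide is the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes.

```python
import statistics

test_scores = [75, 78, 60, 82, 85, 70, 70]

mean_score = statistics.mean(test_scores)
median_score = statistics.median(test_scores)
mode_score = statistics.mode(test_scores)
score_range = max(test_scores) - min(test_scores)
st_dev = statistics.stdev(test_scores)  # sample standard deviation (n - 1)

print(f"mean = {mean_score:.1f}")
print(f"median = {median_score}")
print(f"mode = {mode_score}")
print(f"range = {score_range}")
print(f"standard deviation = {st_dev:.1f}")
```

The first three values describe the middle of the scores; the last two describe how spread out they are.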
How “good” is this student?
How variable are the test scores ?
Are the scores similar to one another or are they more
Variability: Standard Deviation
“Casual Definition”: Standard Deviation describes the average "distance" a typical piece of data is from the middle of the data set.
Student A: 80, 61, 88, 65, 70, 58
Middle (mean) = 70.3
Ave. Distance = 9.1
Student B: 77, 71, 66, 67, 68, 73
Middle (mean) = 70.3
Ave. Distance = 3.3
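The "average distance" figures for the two students can be reproduced by averaging how far each score sits from the mean. The Python sketch below does exactly that; the function name is illustrative. Note that this casual measure is not the same calculation a calculator or spreadsheet performs for standard deviation (which squares the distances first), so those tools will report somewhat different numbers.

```python
def average_distance(values):
    """Average of the absolute distances between each value and the mean."""
    center = sum(values) / len(values)
    return sum(abs(value - center) for value in values) / len(values)


student_a = [80, 61, 88, 65, 70, 58]
student_b = [77, 71, 66, 67, 68, 73]

distance_a = average_distance(student_a)
distance_b = average_distance(student_b)

print(f"Student A: ave. distance = {distance_a:.1f}")
print(f"Student B: ave. distance = {distance_b:.1f}")
```

Both students share the same mean, so the smaller average distance for Student B reflects less variability, not a different middle.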
Compute Standard Deviation
Calculator
Basic model (p. 21) Scientific
Computer
Spreadsheet: Excel Online
easycalculation.com
App:
Stats CalculatorFree (Android)
Practice
Standard Deviation
Statistics
Data Set “shapes”
Frequency Distributions allow you to “see”
the data you are working with:
SWTC Student Ages in 2004
Class (years old)   Frequency
15-21               1863
22-28               636
29-35               417
36-42               452
43-49               397
50-56               251
57-63               109
64-70               45
71-77               26
78-84               9
[Bar chart: Ages of SWTC Students in 2004]
[Histogram: Joey Logano NASCAR 2009 Season Finish Results; x-axis: Finish Result (classes 1 - 8, 8 - 15, 15 - 22, 22 - 29, 29 - 36, 36 - 43); y-axis: Frequency of Occurrence]
Normally Distributed Data
Many data collections, when graphed, produce a “bell-shaped” curve…
IQ Test
mean ($\bar{x}$) = 100
st. dev. ($s$) = 15
[Dot plot of IQ scores forming a bell shape; axis marks at 55, 70, 85, 100, 115, 130, 145]
Examples
Information that usually is normally distributed:
IQ Scores
Heights
Measurements taken of a production run of parts.
Service life of a product (car batteries, tires)
Normally Distributed Data
Basic Characteristics
symmetry (“bell-shape”)
mean ($\bar{x}$), which is also the median and mode
standard deviation ($s$) or ($\sigma$)
[Two bell curves: a narrow curve labeled small st. dev. and a wide curve labeled large st. dev.]
Practice: Reading Information
Given the normal distribution below, determine the mean _____
[Normal curve; axis marks at 33, 36, 39, 42, 45, 48, 51]
Practice: Reading Information
Both normal distributions shown have a mean of 20.
Which one, A or B, has the larger standard deviation?
[Two normal curves, A and B, each centered at 20]
NORMAL DISTRIBUTIONS
Normal Distributions and Percents
Area under the curve…
All of the data is located under the curve:
Half of the data is located to the left of the mean:
Half of the data is located to the right of the mean:
In your notes…
Write:
1 – 68%
2 – 95%
3 – 99.7%
Normally-distributed Data: Characteristics
For any data that is normally distributed:
68% of the data is located within plus or minus 1 standard deviation from the mean.
Example – 3.5 L V-6 engine
mean horsepower = 275 hp
standard deviation = 3 hp
68% of the engines built should have a peak horsepower between 272 and 278 (275 – 3 = 272 and 275 + 3 = 278).
Normally-distributed Data: Characteristics
For any data that is normally distributed:
95% of the data is located within plus or minus 2 standard deviations from the mean.
Example – 3.5 L V-6 engine
mean horsepower = 275 hp
standard deviation = 3 hp
95% of the engines built should have a peak horsepower between 269 and 281 (275 – 3 – 3 = 269 and 275 + 3 + 3 = 281).
Normally-distributed Data: Characteristics
For any data that is normally distributed:
99.7% of the data is located within +/- 3 standard deviations from the mean.
Example – 3.5 L V-6 engine
mean horsepower = 275 hp
standard deviation = 3 hp
99.7% of the engines built should have a peak horsepower between 266 and 284 (275 – 9 = 266 and 275 + 9 = 284).
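The three intervals for the engine example all come from the same step: move 1, 2, or 3 standard deviations to each side of the mean. A short Python sketch of that step is shown below; the function name and the dictionary layout are illustrative choices.

```python
def empirical_rule_intervals(mean, st_dev):
    """Return {k: (low, high, percent)} for k = 1, 2, 3 standard deviations."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}  # percent of data within k st. dev.
    return {
        k: (mean - k * st_dev, mean + k * st_dev, percent)
        for k, percent in coverage.items()
    }


# 3.5 L V-6 engine: mean horsepower = 275 hp, standard deviation = 3 hp.
intervals = empirical_rule_intervals(275, 3)

for k, (low, high, percent) in intervals.items():
    print(f"{percent}% of engines: between {low} and {high} hp (+/- {k} st. dev.)")
```

These percentages only apply when the data is normally distributed.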
Practice
Test of ultimate strength of welds in 2210-T87 aluminum alloy (1/4” through 1” plate)
[Normal curve; axis marks at 33.0, 35.2, 37.4, 39.6, 41.8, 44.0, 46.2; n = 304; units of measure: kpsi]
What is the mean for this data? 39.6 kpsi
What is the standard deviation for this data? 2.2 kpsi
Practice
Introduction to Normal Distributions
Statistics
Using Normally Distributed Data
Data on the Heights of U.S. Women
[Normal curve; axis marks at 60.5, 63, 65.5, 68, 70.5]
Normal Distribution: Applications
72 month guarantee.
Using Normally Distributed Data
Car Battery Service Life
[Normal curve: Service Life (Months) of SureStart Car Batteries; axis marks at 44, 46, 48, 50, 52, 54, 56]
• How long does the average SureStart Battery last? 50 months
• Would it be better for this company to set a 54 month warranty or a 44 month warranty?
Sample Problems
Data: Lifespan of 60w equiv. light bulbs
[Normal curve; axis marks (hours) at 1550, 1700, 1850, 2000, 2150, 2300, 2450]
As a manufacturer, what would be the safest claim to make about the bulbs?
a) Our bulbs are guaranteed to last 2000 hrs.
b) Our bulbs are guaranteed to last 2300 hrs.
c) Our bulbs are guaranteed to last 1550 hours.
Sample Problems
Establish ad campaigns to target products.
[Normal curve of customer ages; axis marks at 13, 15, 17, 19, 21, 23, 25]
a) What is the average age of your customers? 19 yrs
b) What percent of your customers are between the ages of 15 and 23? 95%
c) What percent of your customers are over 25 years old? 99.7% are between 13 and 25. 100% - 99.7% = 0.3%, and 0.3% ÷ 2 = 0.15%
d) What percent of your customers are 15 years old or less? 95% are between 15 and 23. 100% - 95% = 5%, and 5% ÷ 2 = 2.5%
e) If you sold to 15,000 customers last year, how many were between the ages of 17 and 21? 68% are between 17 and 21. 68% of 15,000 = .68 x 15,000 = 10,200
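The tail and headcount answers for the customer-age problems rely on the symmetry of the normal curve: whatever percent lies outside an interval is split evenly between the two tails. A Python sketch of that reasoning follows; the mean of 19 and the standard deviation of 2 are read off the axis marks (13, 15, ..., 25), and the variable names are illustrative.

```python
mean_age = 19
st_dev = 2

# Percent of data within 1, 2, and 3 standard deviations of the mean.
within_1, within_2, within_3 = 68.0, 95.0, 99.7

# c) Over 25 (more than 3 st. dev. above the mean): half of what lies outside 13-25.
pct_over_25 = (100 - within_3) / 2

# d) 15 or less (more than 2 st. dev. below the mean): half of what lies outside 15-23.
pct_15_or_less = (100 - within_2) / 2

# e) Between 17 and 21 (within 1 st. dev.) out of 15,000 customers.
customers_17_to_21 = round(within_1 / 100 * 15_000)

print(f"over 25: {pct_over_25:.2f}%")
print(f"15 or less: {pct_15_or_less:.1f}%")
print(f"between 17 and 21: {customers_17_to_21:,} customers")
```

Dividing by 2 works only because the curve is symmetric about the mean.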
Practice
Normal Distribution Applications
Statistics
Correlation
Is there a link (relationship of cause and effect) between two things?
Time spent studying…
Welding Current
Hardness
Weld Ultimate Tensile Strength
Some more examples:
Temperature…
Sample Problem
Procedure – Step 1
Step 1: Create a scatter graph
Size (L) MPG
1.6 26
1.8 27
1.8 26
1.9 26
2.0 25
Procedure – Step 2
[Scatter graph: Engine Size vs. Mileage (SUV's); x-axis: Engine Size (liters), 1.5 to 7.5; y-axis: Mileage (mpg), 10 to 28]
Each “dot” represents a vehicle… its engine size and mileage.
Example dot: Engine Size: 4.0 L, Mileage: 20.5 mpg
Example dot: Engine Size: 2.0 L, Mileage: 23.5 mpg
Procedure – Step 3
[Scatter graph: Engine Size vs. Mileage (SUV's), divided into Quadrants I, II, III, and IV; x-axis: Engine Size (liters); y-axis: Mileage (mpg)]
Procedure – Step 4
Quadrant I: _____ Quadrant II: _____ Quadrant III: _____ Quadrant IV: _____
[Scatter graph: Engine Size vs. Mileage (SUV's), divided into Quadrants I, II, III, and IV]
Procedure – Step 5
Quadrant I + Quadrant III = 2
Quadrant II + Quadrant IV = 12
Procedure – Step 6
• Positive Correlation: There is a cause and effect relationship between the two variables. As one variable increases, so does the variable associated with it.
• Negative Correlation: There is a cause and effect relationship between the two variables. As one variable increases, the variable associated with it decreases.
• No Correlation: There appears to be no relationship.
Procedure – Step 6
Positive Correlation
[Scatter graph: Temperature vs. Crime Rate]
Negative Correlation
[Scatter graph; x-axis: Caffeine Consumption (mg)]
No Correlation
Correlation Criteria
Correlation Rules (textbook page 55)
Positive: Sum of Quadrants I and III is more than twice the sum of Quadrants II and IV.
Negative: Sum of Quadrants II and IV is more than twice the sum of Quadrants I and III.
No Correlation: When neither of the above occurs.
Quadrant I + Quadrant III = 2
Quadrant II + Quadrant IV = 12
[Scatter graph: Engine Size vs. Mileage (SUV's); x-axis: Engine Size (liters)]
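The quadrant rule can be written as a short decision function. The Python sketch below takes the two quadrant sums as inputs and applies the "more than twice" test in each direction; the function name and return strings are illustrative, and the sums of 2 and 12 are the ones given for the engine size vs. mileage graph.

```python
def classify_correlation(sum_i_iii, sum_ii_iv):
    """Apply the quadrant-count rule to the two diagonal sums.

    sum_i_iii: dots in Quadrant I plus dots in Quadrant III
    sum_ii_iv: dots in Quadrant II plus dots in Quadrant IV
    """
    if sum_i_iii > 2 * sum_ii_iv:
        return "positive correlation"
    if sum_ii_iv > 2 * sum_i_iii:
        return "negative correlation"
    return "no correlation"


# Engine Size vs. Mileage (SUV's): Quadrants I + III = 2, Quadrants II + IV = 12.
result = classify_correlation(2, 12)
print(result)
```

Since 12 is more than twice 2, the rule reports a negative correlation: as engine size increases, mileage decreases.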