stat4b_chapter09.pptx

Academic year: 2020

Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 9, Slide 1

Chapter 9

Re-expressing the Data:


Straight to the Point

• We cannot use a linear model unless the relationship between the two variables is linear. Often re-expression can save the day, straightening bent relationships so that we can fit and use a simple linear model.

• Two simple ways to re-express data are with logarithms and reciprocals.

• Re-expressions can be seen in everyday life
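A minimal sketch of how a reciprocal can straighten a bent relationship. The data are hypothetical (y is constructed as 60 / x), and the `correlation` helper is written here only for illustration; it is not taken from the slides.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y falls off like 60 / x, so the raw scatterplot is bent.
x = [1, 2, 3, 4, 5, 6, 8, 10]
y = [60 / v for v in x]

r_raw = correlation(x, y)                                # bent, so |r| is well below 1
r_reciprocal = correlation(x, [1 / v for v in y])        # 1/y = x/60 is a straight line
r_log = correlation(x, [math.log10(v) for v in y])       # log helps, but less here

print(f"raw: {r_raw:.3f}  reciprocal: {r_reciprocal:.3f}  log: {r_log:.3f}")
```

A correlation close to 1 or -1 is only a rough signal of straightness; a residuals plot is still the better check.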


Straight to the Point (cont.)

• The relationship between fuel efficiency (in miles per gallon) and weight (in pounds) for late model


Straight to the Point (cont.)

• We can re-express fuel efficiency as gallons per


Straight to the Point (cont.)

• A look at the residuals plot for the new model


Goals of Re-expression

• Goal 1: Make the distribution of a variable (as


Goals of Re-expression (cont.)

• Goal 2: Make the spread of several groups (as


Goals of Re-expression (cont.)

• Goal 3: Make the form of a scatterplot more


Goals of Re-expression (cont.)

• Goal 4: Make the scatter in a scatterplot spread out evenly rather than thickening at one end.

• This can be seen in the two scatterplots we


The Ladder of Powers

• There is a family of simple re-expressions that move data toward our goals in a consistent way. This collection of re-expressions is called the Ladder of Powers.

• The Ladder of Powers orders the effects that the


The Ladder of Powers

Power   Name                          Comment

2       Square of data values         Try with unimodal distributions that are skewed to the left.

1       Raw data                      Data with positive and negative values and no bounds are less likely to benefit from re-expression.

½       Square root of data values    Counts often benefit from a square root re-expression.

“0”     We’ll use logarithms here     Measurements that cannot be negative often benefit from a log re-expression.

–1/2    Reciprocal square root        An uncommon re-expression, but sometimes useful.

–1      The reciprocal of the data    Ratios of two quantities (e.g., mph) often benefit from a reciprocal.
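One way to sketch the ladder in code is a small function that applies a chosen rung to a list of values. The function name `re_express`, the use of base-10 logs for the "0" rung, and the sample counts are all assumptions made for illustration. The guards mirror the usual restrictions: logs need positive data, negative powers cannot take zeros, and fractional powers cannot take negative values.

```python
import math

def re_express(values, power):
    """Apply one rung of the Ladder of Powers to a list of numbers.

    power 0 stands for the logarithm (base 10 here, an arbitrary choice).
    """
    if power == 0:
        if min(values) <= 0:
            raise ValueError("logarithms need strictly positive data")
        return [math.log10(v) for v in values]
    if power < 0 and any(v == 0 for v in values):
        raise ValueError("negative powers cannot handle zeros")
    if power != int(power) and min(values) < 0:
        raise ValueError("fractional powers cannot handle negative data")
    return [v ** power for v in values]

# Hypothetical right-skewed counts, walked down the ladder one rung at a time.
counts = [1, 4, 9, 25, 100]
for rung in (2, 1, 0.5, 0, -0.5, -1):
    print(rung, [round(v, 3) for v in re_express(counts, rung)])
```

Moving down the ladder pulls in the large values more and more strongly, which is why the lower rungs are tried on right-skewed data.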


Plan B: Attack of the Logarithms

• When none of the data values is zero or negative, logarithms can be a helpful ally in the search for a useful model.

• Try taking the logs of both the x- and y-variable.

• Then re-express the data using some


Example: Using Models pg. 238 #2

For each of the models listed below, predict y when x = 2.

[Models a) through e): the equations were garbled in extraction. Each relates ŷ and x using the constants 1.2 and 0.8; the surviving fragments include a log term and an x² term.]
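To illustrate predicting from a re-expressed model, the sketch below uses three hypothetical model forms built from the constants 1.2 and 0.8. These forms are assumptions chosen for illustration and are not a reconstruction of the exercise's models a) through e). The point is that the equation predicts the re-expressed response, so the re-expression has to be undone to get ŷ.

```python
x = 2
rhs = 1.2 + 0.8 * x            # right-hand side of each hypothetical model: 2.8

# Hypothetical form: log10(y-hat) = 1.2 + 0.8x. Undo the log with a power of 10.
y_from_log = 10 ** rhs

# Hypothetical form: sqrt(y-hat) = 1.2 + 0.8x. Undo the square root by squaring.
y_from_sqrt = rhs ** 2

# Hypothetical form: 1 / y-hat = 1.2 + 0.8x. Undo the reciprocal with a reciprocal.
y_from_reciprocal = 1 / rhs

print(round(y_from_log, 2), round(y_from_sqrt, 2), round(y_from_reciprocal, 3))
```

The same right-hand side gives very different predictions depending on which re-expression sits on the left, so reporting 2.8 itself as the prediction would be wrong in all three cases.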


Example: Zurich Zoo

The following data are the shoulder-hip length and the vertical thickness of the bodies of some quadrupeds at the zoo in Zurich, Switzerland. Predict the vertical thickness of a giraffe if the shoulder-hip length is 145 cm.

Animal          Length (cm)   Height (cm)

Ermine               12             4

Dachshund            35            12

Indian Tiger         90            45

Llama               122            73
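A hedged sketch of one way to approach this prediction: fit a line to log(height) against log(length), then back-transform. The log-log form is an assumption (the slide does not say which re-expression it uses), and only the four rows that survive in this text are used, so the result is illustrative and may differ from an answer based on the full table. The helper `fit_line` is hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# The four rows visible in the table above.
length_cm = [12, 35, 90, 122]
height_cm = [4, 12, 45, 73]

log_length = [math.log10(v) for v in length_cm]
log_height = [math.log10(v) for v in height_cm]
b0, b1 = fit_line(log_length, log_height)

# Predict on the log scale, then undo the log to return to centimeters.
predicted_height = 10 ** (b0 + b1 * math.log10(145))
print(f"log(height) = {b0:.3f} + {b1:.3f} log(length)")
print(f"predicted height at 145 cm: {predicted_height:.1f} cm")
```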


Example: Pressure and Volume

We attempt to find how the volume of a gas depends on the temperature and pressure of the gas. If temperature is held constant at 300 K, the following results are obtained. Predict the volume if the pressure is 325.

Pressure: 200 250 300 350 400


Example: Soil Erosion

The problem of soil erosion is faced by farmers all over the world. The following data were from a study in western India. Predict the amount of erosion if the wind velocity is 24 km/hr.

Velocity (km/hr): 13.5 13.5 14 15 17.5 19 20 21 22 23 25 25 26 27


Example: Female Heights and Weights

Consider the data on x = height (in.) and y = average weight (lb.) for American females aged 30-39. Predict the weight of a female who is 64.5 inches tall.

X    58   59   60   61   62   63   64   65   66

Y   113  115  118  121  124  128  131  134  137
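A sketch of how the prediction might be computed from the nine data pairs. It fits a straight line to the raw data and, for comparison, a line to log(weight); which model the slide intends is not stated, so both are shown as assumptions. The helper `fit_line` is hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

height_in = [58, 59, 60, 61, 62, 63, 64, 65, 66]
weight_lb = [113, 115, 118, 121, 124, 128, 131, 134, 137]

# Model 1: weight against height, no re-expression.
a0, a1 = fit_line(height_in, weight_lb)
pred_linear = a0 + a1 * 64.5

# Model 2: log10(weight) against height, back-transformed after predicting.
c0, c1 = fit_line(height_in, [math.log10(w) for w in weight_lb])
pred_log = 10 ** (c0 + c1 * 64.5)

print(f"linear model: {pred_linear:.1f} lb")
print(f"log model:    {pred_log:.1f} lb")
```

Comparing the residuals of the two fits, rather than the predictions alone, is what would indicate whether the re-expression is worth using.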


Example: Shoes!

Cyrus Tist was trying to determine how the pressure exerted on the floor by the heel of a shoe depends on the width of the heel and the weight of the person wearing the shoe. He started by measuring the pressure (in psi) exerted by several people wearing a shoe with a heel width of 3.5 inches. The data are summarized below. Predict the pressure exerted on the heel with a width of 3.5 inches if the person weighs 175 pounds.


Why Not Just Use a Curve?

• If there’s a curve in the scatterplot, why not just fit


Why Not Just Use a Curve? (cont.)

• The mathematics and calculations for “curves of best fit” are considerably more difficult than “lines of best fit.”

• Besides, straight lines are easy to understand.

• We know how to think about the slope and the


What Can Go Wrong?

• Don’t expect your model to be perfect.

• Don’t stray too far from the ladder.

• Don’t choose a model


What Can Go Wrong? (cont.)

• Beware of multiple modes.

  • Re-expression cannot pull separate modes together.

• Watch out for scatterplots that turn around.

  • Re-expression can straighten many bent


What Can Go Wrong? (cont.)

• Watch out for negative data values.

  • It’s impossible to re-express negative values by any power that is not a whole number on the Ladder of Powers or to re-express values that are zero for negative powers.

• Watch for data far from 1.

  • Data values that are all very far from 1 may not be much affected by re-expression unless the range is very large. If all the data values are large (e.g., years), consider subtracting a
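The "far from 1" point can be checked numerically. In the sketch below, the years and the shift by 1949 are hypothetical choices, and the `correlation` helper is written only for illustration. If log(value) is almost perfectly correlated with the value itself, the log is acting like a straight-line rescaling and cannot change the shape of the data.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: calendar years, all very far from 1 with a small relative range.
years = list(range(1950, 2001, 5))
# The same values shifted so that they start at 1 (the shift of 1949 is an assumption).
shifted = [year - 1949 for year in years]

r_years = correlation(years, [math.log10(v) for v in years])
r_shifted = correlation(shifted, [math.log10(v) for v in shifted])

print(f"years vs log(years):     r = {r_years:.6f}")
print(f"shifted vs log(shifted): r = {r_shifted:.6f}")
```

On the raw years the log barely bends anything; on the shifted values it has a visible effect.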


What have we learned?

• When the conditions for regression are not met, a simple re-expression of the data may help.

• A re-expression may make the:

  • Distribution of a variable more symmetric.

  • Spread across different groups more similar.

  • Form of a scatterplot straighter.

  • Scatter around the line in a scatterplot more


What have we learned? (cont.)

• Taking logs is often a good, simple starting point.

• To search further, the Ladder of Powers or the log-log approach can help us find a good re-expression.

• Our models won’t be perfect, but re-expression


AP Tips

• Make sure that you can make accurate predictions using a transformed equation.

• Make sure that your descriptions use the transformed variable names, not the original variables, as appropriate.

  • For example, “89.6% of the variation in log(weight)…”

• Don’t get lost in the technology. Most AP
