
Academic year: 2022


CHAPTER 13

SIMPLE LINEAR REGRESSION

Prem Mann, Introductory Statistics, 8/E Copyright © 2013 John Wiley & Sons. All rights reserved.

Opening Example


SIMPLE LINEAR REGRESSION

• Simple Regression
• Linear Regression

Simple Regression

Definition

A regression model is a mathematical equation that describes the relationship between two or more variables. A simple regression model includes only two variables: one independent and one dependent. The dependent variable is the one being explained, and the independent variable is the one used to explain the variation in the dependent variable.


Linear Regression

Definition

A (simple) regression model that gives a straight-line relationship between two variables is called a linear regression model.


Figure 13.1 Relationship between food expenditure and income. (a) Linear relationship. (b) Nonlinear relationship.


Figure 13.2 Plotting a linear equation.


Figure 13.3 y-intercept and slope of a line.


SIMPLE LINEAR REGRESSION ANALYSIS


Definition

In the regression model y = A + Bx + ε, A is called the y-intercept or constant term, B is the slope, and ε is the random error term. The dependent and independent variables are y and x, respectively.


Definition

In the model ŷ = a + bx, a and b, which are calculated using sample data, are called the estimates of A and B, respectively.

Table 13.1 Incomes (in hundreds of dollars) and Food Expenditures of Seven Households

Scatter Diagram

Definition

A plot of paired observations is called a scatter diagram.


Figure 13.4 Scatter diagram.


Figure 13.5 Scatter diagram and straight lines.


Figure 13.6 Regression Line and random errors.


Error Sum of Squares (SSE)

The error sum of squares, denoted SSE, is

$$\text{SSE} = \sum e^2 = \sum (y - \hat{y})^2$$

The values of a and b that give the minimum SSE are called the least squares estimates of A and B, and the regression line obtained with these estimates is called the least squares line.

The Least Squares Line

For the least squares regression line ŷ = a + bx,

$$b = \frac{SS_{xy}}{SS_{xx}} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$

where

$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} \quad \text{and} \quad SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$

and SS stands for “sum of squares.” The least squares regression line ŷ = a + bx is also called the regression of y on x.

Example 13-1

Find the least squares regression line for the data on incomes and food expenditures of the seven households given in Table 13.1. Use income as the independent variable and food expenditure as the dependent variable.

Table 13.2


Example 13-1: Solution

$$\sum x = 386, \qquad \sum y = 108$$

$$\bar{x} = \frac{\sum x}{n} = \frac{386}{7} = 55.1429, \qquad \bar{y} = \frac{\sum y}{n} = \frac{108}{7} = 15.4286$$

$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 6403 - \frac{(386)(108)}{7} = 447.5714$$

$$SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 23{,}058 - \frac{(386)^2}{7} = 1772.8571$$

$$b = \frac{SS_{xy}}{SS_{xx}} = \frac{447.5714}{1772.8571} = .2525$$

$$a = \bar{y} - b\bar{x} = 15.4286 - (.2525)(55.1429) = 1.5050$$

Thus, our estimated regression model is ŷ = 1.5050 + .2525 x
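The arithmetic of Example 13-1 can be checked with a short script. The following is a minimal Python sketch that works only from the sums quoted in the solution, so the raw rows of Table 13.1 are not needed. The function name `least_squares_line` is illustrative and does not come from the text.

```python
# Summary sums quoted in Example 13-1 (n = 7 households).
n = 7
sum_x, sum_y = 386, 108
sum_xy, sum_x2 = 6403, 23058


def least_squares_line(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b, ss_xy, ss_xx) for the line y-hat = a + b*x."""
    ss_xy = sum_xy - sum_x * sum_y / n   # SSxy
    ss_xx = sum_x2 - sum_x ** 2 / n      # SSxx
    b = ss_xy / ss_xx                    # slope
    a = sum_y / n - b * sum_x / n        # intercept: y-bar - b * x-bar
    return a, b, ss_xy, ss_xx


a, b, ss_xy, ss_xx = least_squares_line(n, sum_x, sum_y, sum_xy, sum_x2)
print(f"SSxy = {ss_xy:.4f}, SSxx = {ss_xx:.4f}")
print(f"b = {b:.4f}, a = {a:.4f}")
```

Carried at full precision, the intercept comes out near 1.5073 rather than 1.5050. The difference arises because the solution rounds b to .2525 before computing a.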

Figure 13.7 Error of prediction.


Interpretation of a and b

Interpretation of a

• Consider a household with zero income. Using the estimated regression line obtained in Example 13-1,
    ŷ = 1.5050 + .2525(0) = $1.5050 hundred.
• Thus, we can state that a household with no income is expected to spend $150.50 per month on food.
• The regression line is valid only for the values of x between 33 and 83. Because x = 0 lies outside this range, the prediction for a household with zero income should be treated with caution.

Interpretation of b

• The value of b in the regression model gives the change in y (dependent variable) due to a change of one unit in x (independent variable).
• We can state that, on average, a $100 (or $1) increase in income of a household will increase the food expenditure by $25.25 (or $.2525).

Figure 13.8 Positive and negative linear relationships between x and y.


Case Study 13-1 Regression of Weights on Heights for NFL Players


Assumptions of the Regression Model

Assumption 1: The random error term ε has a mean equal to zero for each x

Assumption 2: The errors associated with different observations are independent

Assumption 3: For any given x, the distribution of errors is normal

Assumption 4: The distribution of population errors for each x has the same (constant) standard deviation, which is denoted $\sigma_\varepsilon$

Figure 13.11 (a) Errors for households with an income of $4000 per month.

Figure 13.11 (b) Errors for households with an income of $7500 per month.

Figure 13.12 Distribution of errors around the population regression line.


Figure 13.13 Nonlinear relations between x and y.


STANDARD DEVIATION OF ERRORS AND COEFFICIENT OF DETERMINATION

Degrees of Freedom for a Simple Linear Regression Model

The degrees of freedom for a simple linear regression model are

df = n – 2

Figure 13.14 Spread of errors for x = 40 and x = 75.


The standard deviation of errors is calculated as

$$s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n - 2}}$$

where

$$SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$$

Example 13-2

Compute the standard deviation of errors $s_e$ for the data on monthly incomes and food expenditures of the seven households given in Table 13.1.

Table 13.3


Example 13-2: Solution

$$SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 1792 - \frac{(108)^2}{7} = 125.7143$$

$$s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n - 2}} = \sqrt{\frac{125.7143 - .2525(447.5714)}{7 - 2}} = 1.5939$$
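As a check on Example 13-2, here is a minimal Python sketch of the same calculation. It assumes the rounded slope b = .2525 and the value of SSxy from Example 13-1, mirroring the hand arithmetic; the variable names are illustrative.

```python
import math

# Quantities taken from Examples 13-1 and 13-2.
n = 7
sum_y, sum_y2 = 108, 1792
ss_xy = 447.5714
b = 0.2525  # slope, rounded as in Example 13-1

ss_yy = sum_y2 - sum_y ** 2 / n   # SSyy
sse = ss_yy - b * ss_xy           # error sum of squares
s_e = math.sqrt(sse / (n - 2))    # standard deviation of errors, df = n - 2

print(f"SSyy = {ss_yy:.4f}, s_e = {s_e:.4f}")
```

Using the unrounded slope instead of .2525 changes the result slightly in the third decimal place.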

COEFFICIENT OF DETERMINATION

Total Sum of Squares (SST)

The total sum of squares, denoted by SST, is calculated as

$$\text{SST} = \sum y^2 - \frac{(\sum y)^2}{n}$$

Note that this is the same formula that we used to calculate $SS_{yy}$.

Figure 13.15 Total errors.


Table 13.4


Figure 13.16 Errors of prediction when regression model is used.


Regression Sum of Squares (SSR)

The regression sum of squares, denoted by SSR, is

$$\text{SSR} = \text{SST} - \text{SSE}$$

Coefficient of Determination

The coefficient of determination, denoted by r², represents the proportion of SST that is explained by the use of the regression model. The computational formula for r² is

$$r^2 = \frac{b\,SS_{xy}}{SS_{yy}}$$

and 0 ≤ r² ≤ 1.

Example 13-3

For the data of Table 13.1 on monthly incomes and food expenditures of seven households, calculate the coefficient of determination.

Example 13-3: Solution

From earlier calculations made in Examples 13-1 and 13-2,

b = .2525, $SS_{xy}$ = 447.5714, $SS_{yy}$ = 125.7143

$$r^2 = \frac{b\,SS_{xy}}{SS_{yy}} = \frac{(.2525)(447.5714)}{125.7143} = .90$$
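The relationship between SST, SSR, SSE, and r² can be illustrated with the numbers from Examples 13-1 to 13-3. This is a small Python sketch under the assumption that the rounded values from those examples are used; it relies on SSE = SSyy − b·SSxy, so that SSR = b·SSxy.

```python
# Rounded quantities from Examples 13-1 and 13-2.
b = 0.2525
ss_xy = 447.5714
ss_yy = 125.7143

sst = ss_yy            # total sum of squares (same formula as SSyy)
ssr = b * ss_xy        # regression sum of squares
sse = sst - ssr        # error sum of squares, so SSR = SST - SSE
r_squared = ssr / sst  # proportion of SST explained by the regression

print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, r^2 = {r_squared:.2f}")
```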

INFERENCES ABOUT B

• Sampling Distribution of b
• Estimation of B
• Hypothesis Testing About B

Sampling Distribution of b

Mean, Standard Deviation, and Sampling Distribution of b

Because of the assumption of normally distributed random errors, the sampling distribution of b is normal. The mean and standard deviation of b, denoted by $\mu_b$ and $\sigma_b$, respectively, are

$$\mu_b = B \quad \text{and} \quad \sigma_b = \frac{\sigma_\varepsilon}{\sqrt{SS_{xx}}}$$

Estimation of B

Confidence Interval for B

The (1 – α)100% confidence interval for B is given by

$$b \pm t\,s_b$$

where

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}}$$

and the value of t is obtained from the t distribution table for α/2 area in the right tail of the t distribution and n – 2 degrees of freedom.

Example 13-4

Construct a 95% confidence interval for B for the data on incomes and food expenditures of seven households given in Table 13.1.


Example 13-4: Solution

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}} = \frac{1.5939}{\sqrt{1772.8571}} = .0379$$

df = n – 2 = 7 – 2 = 5

α/2 = (1 – .95)/2 = .025

t = 2.571

$$b \pm t\,s_b = .2525 \pm 2.571(.0379) = .2525 \pm .0974 = .155 \text{ to } .350$$
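The interval in Example 13-4 can be reproduced with a few lines of Python. This is a sketch, assuming the critical value t = 2.571 is read from the t table as in the example rather than computed; the variable names are illustrative.

```python
import math

# Quantities from Examples 13-1, 13-2 and 13-4.
b = 0.2525
s_e = 1.5939
ss_xx = 1772.8571
t_crit = 2.571  # t table value for .025 in the right tail, df = 5

s_b = s_e / math.sqrt(ss_xx)  # standard deviation of b
margin = t_crit * s_b
lower, upper = b - margin, b + margin

print(f"s_b = {s_b:.4f}")
print(f"95% CI for B: {lower:.3f} to {upper:.3f}")
```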

Hypothesis Testing About B

Test Statistic for b

The value of the test statistic t for b is calculated as

$$t = \frac{b - B}{s_b}$$

The value of B is substituted from the null hypothesis.

Example 13-5

Test at the 1% significance level whether the slope of the regression line for the example on incomes and food expenditures of seven households is positive.


Example 13-5: Solution

Step 1:
H₀: B = 0 (The slope is zero)
H₁: B > 0 (The slope is positive)

Step 2:
$\sigma_\varepsilon$ is not known. Hence, we will use the t distribution to make the test about B.

Step 3:
α = .01
Area in the right tail = α = .01
df = n – 2 = 7 – 2 = 5
The critical value of t is 3.365.

Figure 13.17

Step 4:

$$t = \frac{b - B}{s_b} = \frac{.2525 - 0}{.0379} = 6.662$$

(B = 0 is taken from H₀.)

Step 5:
The value of the test statistic t = 6.662
• It is greater than the critical value of t = 3.365
• It falls in the rejection region
Hence, we reject the null hypothesis.
We conclude that x (income) determines y (food expenditure) positively.
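The decision rule of Example 13-5 can be written out as a short Python sketch. It assumes the rounded values b = .2525 and s_b = .0379 from the earlier examples and the table critical value 3.365; the names are illustrative.

```python
# Right-tailed test of H0: B = 0 against H1: B > 0 at the 1% level.
b = 0.2525      # sample slope (Example 13-1)
s_b = 0.0379    # standard deviation of b (Example 13-4)
B0 = 0.0        # value of B under the null hypothesis
t_crit = 3.365  # t table value for .01 in the right tail, df = 5

t_stat = (b - B0) / s_b
reject_h0 = t_stat > t_crit  # rejection region lies to the right of t_crit

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```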

LINEAR CORRELATION

• Linear Correlation Coefficient
• Hypothesis Testing About the Linear Correlation Coefficient

Linear Correlation Coefficient

Value of the Correlation Coefficient

The value of the correlation coefficient always lies in the range of –1 to 1; that is,

-1 ≤ ρ ≤ 1 and -1 ≤ r ≤ 1


Figure 13.18 Linear correlation between two variables. (a) Perfect positive linear correlation, r = 1. (b) Perfect negative linear correlation, r = -1. (c) No linear correlation, r ≈ 0.

Figure 13.19 Linear correlation between variables.

Linear Correlation Coefficient


The simple linear correlation coefficient, denoted by r, measures the strength of the linear relationship between two variables for a sample and is calculated as

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$$

Example 13-6

Calculate the correlation coefficient for the example on incomes and food expenditures of seven households.

Example 13-6: Solution

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{447.5714}{\sqrt{(1772.8571)(125.7143)}} = .95$$

Hypothesis Testing About the Linear Correlation Coefficient

Test Statistic for r

If both variables are normally distributed and the null hypothesis is H₀: ρ = 0, then the value of the test statistic t is calculated as

$$t = r\sqrt{\frac{n - 2}{1 - r^2}}$$

Here n – 2 are the degrees of freedom.

Example 13-7

Using the 1% level of significance and the data from Example 13-1, test whether the linear correlation coefficient between incomes and food expenditures is positive. Assume that the populations of both variables are normally distributed.


Example 13-7: Solution

Step 1:
H₀: ρ = 0 (The linear correlation coefficient is zero)
H₁: ρ > 0 (The linear correlation coefficient is positive)

Step 2:
The population distributions for both variables are normally distributed. Hence, we can use the t distribution to perform this test about the linear correlation coefficient.

Step 3:
Area in the right tail = .01
df = n – 2 = 7 – 2 = 5
The critical value of t = 3.365

Figure 13.20

Step 4:

$$t = r\sqrt{\frac{n - 2}{1 - r^2}} = .9481\sqrt{\frac{7 - 2}{1 - (.9481)^2}} = 6.667$$

Step 5:
The value of the test statistic t = 6.667
• It is greater than the critical value of t = 3.365
• It falls in the rejection region
Hence, we reject the null hypothesis.
We conclude that there is a positive relationship between incomes and food expenditures.
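Examples 13-6 and 13-7 can be checked together. The sketch below, in Python, computes r from the sums of squares found earlier and then the test statistic; rounding r to four decimals before the second step is an assumption made here to mirror the hand calculation.

```python
import math

# Sums of squares from Examples 13-1 and 13-2 (n = 7).
n = 7
ss_xy, ss_xx, ss_yy = 447.5714, 1772.8571, 125.7143
t_crit = 3.365  # t table value for .01 in the right tail, df = 5

r = round(ss_xy / math.sqrt(ss_xx * ss_yy), 4)   # linear correlation coefficient
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))   # test statistic for H0: rho = 0
reject_h0 = t_stat > t_crit

print(f"r = {r:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```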

REGRESSION ANALYSIS: A COMPLETE EXAMPLE

Example 13-8

A random sample was selected of eight drivers from a small city who are insured with a company and have similar minimum required auto insurance policies. The following table lists their driving experiences (in years) and monthly auto insurance premiums (in dollars).

(a) Does the insurance premium depend on the driving experience or does the driving experience depend on the insurance premium? Do you expect a positive or a negative relationship between these two variables?

(b) Compute $SS_{xx}$, $SS_{yy}$, and $SS_{xy}$.

(c) Find the least squares regression line by choosing appropriate dependent and independent variables based on your answer in part a.

(d) Interpret the meaning of the values of a and b calculated in part c.


(e) Plot the scatter diagram and the regression line.

(f) Calculate r and r² and explain what they mean.

(g) Predict the monthly auto insurance premium for a driver with 10 years of driving experience.

(h) Compute the standard deviation of errors.

(i) Construct a 90% confidence interval for B.

(j) Test at the 5% significance level whether B is negative.

(k) Using α = .05, test whether ρ is different from zero.

Example 13-8: Solution

(a) Based on theory and intuition, we expect the insurance premium to depend on driving experience.
• The insurance premium is the dependent variable
• The driving experience is the independent variable

Table 13.5


(b)

$$\bar{x} = \frac{\sum x}{n} = \frac{90}{8} = 11.25, \qquad \bar{y} = \frac{\sum y}{n} = \frac{474}{8} = 59.25$$

$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 4739 - \frac{(90)(474)}{8} = -593.5000$$

$$SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 1396 - \frac{(90)^2}{8} = 383.5000$$

$$SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 29{,}642 - \frac{(474)^2}{8} = 1557.5000$$

(c)

$$b = \frac{SS_{xy}}{SS_{xx}} = \frac{-593.5000}{383.5000} = -1.5476$$

$$a = \bar{y} - b\bar{x} = 59.25 - (-1.5476)(11.25) = 76.6605$$

$$\hat{y} = 76.6605 - 1.5476x$$

(d) The value of a = 76.6605 gives the value of ŷ for x = 0; that is, it gives the monthly auto insurance premium for a driver with no driving experience.

The value of b = -1.5476 indicates that, on average, for every extra year of driving experience, the monthly auto insurance premium decreases by $1.55.

Figure 13.21 Scatter diagram and the regression line.

(e) The regression line slopes downward from left to right.

(f)

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{-593.5000}{\sqrt{(383.5000)(1557.5000)}} = -.77$$

$$r^2 = \frac{b\,SS_{xy}}{SS_{yy}} = \frac{(-1.5476)(-593.5000)}{1557.5000} = .59$$

The value of r = -0.77 indicates that the driving experience and the monthly auto insurance premium are negatively related. The (linear) relationship is strong but not very strong.

The value of r² = 0.59 states that 59% of the total variation in insurance premiums is explained by years of driving experience and 41% is not.

(g) Using the estimated regression line, we find the predicted value of y for x = 10 is

ŷ = 76.6605 – 1.5476(10) = $61.18

Thus, we expect the monthly auto insurance premium of a driver with 10 years of driving experience to be $61.18.

(h)

$$s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n - 2}} = \sqrt{\frac{1557.5000 - (-1.5476)(-593.5000)}{8 - 2}} = 10.3199$$

(i)

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}} = \frac{10.3199}{\sqrt{383.5000}} = .5270$$

α/2 = .5 – (.90/2) = .05

df = n – 2 = 8 – 2 = 6

t = 1.943

$$b \pm t\,s_b = -1.5476 \pm 1.943(.5270) = -1.5476 \pm 1.0240 = -2.57 \text{ to } -.52$$

(j)

Step 1:
H₀: B = 0 (B is not negative)
H₁: B < 0 (B is negative)

Step 2: Because the standard deviation of the error is not known, we use the t distribution to make the hypothesis test.

Step 3:
Area in the left tail = α = .05
df = n – 2 = 8 – 2 = 6
The critical value of t is -1.943

Figure 13.22

Step 4:

$$t = \frac{b - B}{s_b} = \frac{-1.5476 - 0}{.5270} = -2.937$$

(B = 0 is taken from H₀.)

Step 5:
The value of the test statistic t = -2.937
• It falls in the rejection region
Hence, we reject the null hypothesis and conclude that B is negative.
The monthly auto insurance premium decreases with an increase in years of driving experience.

(k)

Step 1:
H₀: ρ = 0 (The linear correlation coefficient is zero)
H₁: ρ ≠ 0 (The linear correlation coefficient is different from zero)

Step 2: Assuming that variables x and y are normally distributed, we will use the t distribution to perform this test about the linear correlation coefficient.

Step 3:
Area in each tail = .05/2 = .025
df = n – 2 = 8 – 2 = 6
The critical values of t are -2.447 and 2.447

Figure 13.23

Step 4:

$$t = r\sqrt{\frac{n - 2}{1 - r^2}} = -.7679\sqrt{\frac{8 - 2}{1 - (-.7679)^2}} = -2.936$$

Step 5:
The value of the test statistic t = -2.936
• It falls in the rejection region
Hence, we reject the null hypothesis.
We conclude that the linear correlation coefficient between driving experience and auto insurance premium is different from zero.
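Parts (b) through (k) of Example 13-8 can be traced end to end in one script. The following Python sketch works from the sums quoted in part (b) and keeps full precision throughout, so some results differ from the hand calculations in the last decimal place. The critical values are the t table values quoted in the solution, and the variable names are illustrative.

```python
import math

# Sums quoted in part (b) for the n = 8 drivers.
n = 8
sum_x, sum_y = 90, 474
sum_xy, sum_x2, sum_y2 = 4739, 1396, 29642

# (b) sums of squares
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

# (c) least squares line
b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n

# (f) correlation and coefficient of determination
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r2 = b * ss_xy / ss_yy

# (g) predicted premium for 10 years of experience
y_hat_10 = a + b * 10

# (h), (i) standard deviation of errors and 90% CI for B (t = 1.943, df = 6)
s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))
s_b = s_e / math.sqrt(ss_xx)
ci_B = (b - 1.943 * s_b, b + 1.943 * s_b)

# (j) left-tailed test of B, (k) two-tailed test of rho
t_b = (b - 0) / s_b
t_r = r * math.sqrt((n - 2) / (1 - r ** 2))
reject_j = t_b < -1.943
reject_k = abs(t_r) > 2.447

print(f"y-hat = {a:.4f} + ({b:.4f})x, r = {r:.2f}, r^2 = {r2:.2f}")
print(f"s_e = {s_e:.4f}, 90% CI for B: {ci_B[0]:.2f} to {ci_B[1]:.2f}")
print(f"t for B = {t_b:.3f}, t for rho = {t_r:.3f}")
```

In simple linear regression the test statistics for B = 0 and for ρ = 0 are algebraically the same quantity, so at full precision they agree. The values -2.937 and -2.936 in steps (j) and (k) differ only because of rounding in the intermediate values.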

USING THE REGRESSION MODEL

• Using the Regression Model for Estimating the Mean Value of y
• Using the Regression Model for Predicting a Particular Value of y

Figure 13.24 Population and sample regression lines.


Using the Regression Model for Estimating the Mean Value of y

Confidence Interval for $\mu_{y|x}$

The (1 – α)100% confidence interval for $\mu_{y|x}$ for $x = x_0$ is

$$\hat{y} \pm t\,s_{\hat{y}_m}$$

where the value of t is obtained from the t distribution table for α/2 area in the right tail of the t distribution curve and df = n – 2.

The value of $s_{\hat{y}_m}$ is calculated as follows:

$$s_{\hat{y}_m} = s_e\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$$

Example 13-9

Refer to Example 13-1 on incomes and food expenditures.

Find a 99% confidence interval for the mean food expenditure for all households with a monthly income of $5500.


Example 13-9: Solution

• Using the regression line estimated in Example 13-1, we find the point estimate of the mean food expenditure for x = 55
    ŷ = 1.5050 + .2525(55) = $15.3925 hundred
• Area in each tail = α/2 = (1 – .99)/2 = .005
• df = n – 2 = 7 – 2 = 5
• t = 4.032

$$s_e = 1.5939, \quad \bar{x} = 55.1429, \quad \text{and} \quad SS_{xx} = 1772.8571$$

$$s_{\hat{y}_m} = s_e\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}} = (1.5939)\sqrt{\frac{1}{7} + \frac{(55 - 55.1429)^2}{1772.8571}} = .6025$$

Hence, the 99% confidence interval for $\mu_{y|55}$ is

$$\hat{y} \pm t\,s_{\hat{y}_m} = 15.3925 \pm 4.032(.6025) = 15.3925 \pm 2.4293 = 12.9632 \text{ to } 17.8218$$
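The interval for the mean in Example 13-9 can be sketched in Python as follows. The inputs are the rounded values quoted in the solution, and the critical value 4.032 is taken from the t table as in the example; the variable names are illustrative.

```python
import math

# Quantities quoted in Example 13-9.
n = 7
a, b = 1.5050, 0.2525
s_e, x_bar, ss_xx = 1.5939, 55.1429, 1772.8571
t_crit = 4.032  # t table value for .005 in the right tail, df = 5
x0 = 55         # income of $5500, in hundreds of dollars

y_hat = a + b * x0
# Standard deviation of y-hat as an estimate of the mean of y at x0.
s_ym = s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_xx)
margin = t_crit * s_ym
lower, upper = y_hat - margin, y_hat + margin

print(f"s = {s_ym:.4f}, 99% CI for the mean: {lower:.4f} to {upper:.4f}")
```

Because s is not rounded to .6025 before multiplying by t, the endpoints can differ from the hand calculation in the fourth decimal place.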

Using the Regression Model for Predicting a Particular Value of y

Prediction Interval for $y_p$

The (1 – α)100% prediction interval for the predicted value of y, denoted by $y_p$, for $x = x_0$ is

$$\hat{y} \pm t\,s_{\hat{y}_p}$$

where the value of t is obtained from the t distribution table for α/2 area in the right tail of the t distribution curve and df = n – 2.

The value of $s_{\hat{y}_p}$ is calculated as follows:

$$s_{\hat{y}_p} = s_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$$

Example 13-10

Refer to Example 13-1 on incomes and food expenditures.

Find a 99% prediction interval for the predicted food expenditure for a randomly selected household with a monthly income of $5500.


Example 13-10: Solution

• Using the regression line estimated in Example 13-1, we find the point estimate of the predicted food expenditure for x = 55
    ŷ = 1.5050 + .2525(55) = $15.3925 hundred
• Area in each tail = α/2 = (1 – .99)/2 = .005
• df = n – 2 = 7 – 2 = 5
• t = 4.032

$$s_e = 1.5939, \quad \bar{x} = 55.1429, \quad \text{and} \quad SS_{xx} = 1772.8571$$

$$s_{\hat{y}_p} = s_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}} = (1.5939)\sqrt{1 + \frac{1}{7} + \frac{(55 - 55.1429)^2}{1772.8571}} = 1.7040$$

Hence, the 99% prediction interval for $y_p$ for x = 55 is

$$\hat{y} \pm t\,s_{\hat{y}_p} = 15.3925 \pm 4.032(1.7040) = 15.3925 \pm 6.8705 = 8.5220 \text{ to } 22.2630$$
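The prediction interval of Example 13-10 differs from the interval for the mean only by the extra 1 under the square root. A minimal Python sketch, assuming the same rounded inputs and table value t = 4.032 as the example (variable names are illustrative):

```python
import math

# Quantities quoted in Example 13-10.
n = 7
a, b = 1.5050, 0.2525
s_e, x_bar, ss_xx = 1.5939, 55.1429, 1772.8571
t_crit = 4.032  # t table value for .005 in the right tail, df = 5
x0 = 55         # income of $5500, in hundreds of dollars

y_hat = a + b * x0
# The leading 1 accounts for the variation of an individual y around its mean.
s_yp = s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
margin = t_crit * s_yp
lower, upper = y_hat - margin, y_hat + margin

print(f"s = {s_yp:.4f}, 99% prediction interval: {lower:.4f} to {upper:.4f}")
```

The resulting interval is much wider than the confidence interval for the mean at the same x, which is what the formula implies.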

TI-84

Minitab

Excel