Ch. 9 Correlation and Regression
9.1 Correlation
1 Interpret Scatter Plots and Correlations
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Provide an appropriate response.
1) Given the length of a human's femur, x, and the length of a human's humerus, y, would you expect a positive correlation, a negative correlation, or no correlation?
A) positive correlation B) negative correlation C) no correlation
2) Given the supply of a commodity, x, and the price of a commodity, y, would you expect a positive correlation, a negative correlation, or no correlation?
A) negative correlation B) positive correlation C) no correlation
3) Given the size of a human's brain, x, and their score on an IQ test, y, would you expect a positive correlation, a negative correlation, or no correlation?
A) no correlation B) positive correlation C) negative correlation
2 Identify the Explanatory and Response Variables SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Identify the explanatory variable and the response variable. 1) An agricultural business wants to determine if the rainfall in inches can be used to predict the yield per acre on a wheat farm. 2) A college counselor wants to determine if the number of hours spent studying for a test can be used to predict the grades on a test. 3 Construct a Scatter Plot and Determine Correlation SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Provide an appropriate response. 1) The data below are the gestation periods, in months, of randomly selected animals and their corresponding life spans, in years. Construct a scatter plot for the data. Determine whether there is a positive linear correlation, a negative linear correlation, or no linear correlation. Gestation, x 8 2.1 1.3 1 11.5 5.3 3.8 24.3 Life span, y 30 12 6 3 25 12 10 40 2) Construct a scatter plot for the given data. Determine whether there is a positive linear correlation, negative linear correlation, or no linear correlation. x y -5 11 -3 6 4 -6 1 -1 -1 3 -2 4 0 1 2 -4 3 -5 -4 8
3) Construct a scatter plot for the given data. Determine whether there is a positive linear correlation, negative linear correlation, or no linear correlation.
x y -5 11 -3 -6 4 8 1 -3 -1 -2 -2 1 0 5 2 -5 3 6 -4 7
4) The data below are the final exam scores of 10 randomly selected statistics students and the number of hours they studied for the exam. Construct a scatter plot for the data.
Hours, x Scores, y 3 65 5 80 2 60 8 88 2 66 4 78 4 85 5 90 6 90 3 71
5) The data below are the temperatures on randomly chosen days during a summer class and the number of absences on those days. Construct a scatter plot for the data.
Temperature, x Number of absences, y 72 3 85 7 91 10 90 10 88 8 98 15 75 4 100 15 80 5
6) The data below are the ages and systolic blood pressures (measured in millimeters of mercury) of 9 randomly selected adults. Construct a scatter plot for the data.
Age, x Pressure, y 38 116 41 120 45 123 48 131 51 142 53 145 57 148 61 150 65 152
7) The data below are the number of absences and the final grades of 9 randomly selected students from a statistics class. Construct a scatter plot for the data.
Number of absences, x Final grade, y 0 98 3 86 6 80 4 82 9 71 2 92 15 55 8 76 5 82
8) A manager wishes to determine the relationship between the number of miles (in hundreds of miles) the manager's sales representatives travel per month and the amount of sales (in thousands of dollars) per month. Construct a scatter plot for the data.
Miles traveled, x Sales, y 2 31 3 33 10 78 7 62 8 65 15 61 3 48 1 55 11 120
9) In order for applicants to work for the foreign-service department, they must take a test in the language of the country where they plan to work. The data below show the relationship between the number of years that applicants have studied a particular language and the grades they received on the proficiency exam. Construct a scatter plot for the data.
Number of years, x Grades on test, y 3 61 4 68 4 75 5 82 3 73 6 90 2 58 7 93 3 72
10) In an area of the Midwest, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Construct a scatter plot for the data. Rain fall (in inches), x Yield (bushels per acre), y 10.5 50.5 8.8 46.2 13.4 58.8 12.5 59.0 18.8 82.4 10.3 49.2 7.0 31.9 15.6 76.0 16.0 78.8 4 Perform a Hypothesis Test for a Population Correlation Coefficient MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) Calculate the correlation coefficient, r, for the data below. x y -10 -12 -8 -10 -1 7 -4 -1 -6 -4 -7 -8 -5 -3 -3 1 -2 4 -9 -10 A) 0.990 B) 0.881 C) 0.819 D) 0.792 2) Calculate the correlation coefficient, r, for the data below. x y -10 2 -8 -3 -1 -15 -4 -10 -6 -6 -7 -5 -5 -8 -3 -13 -2 -14 -9 -1 A)-0.995 B)-0.671 C)-0.778 D)-0.885 3) Calculate the correlation coefficient, r, for the data below. x y -1 12 1 -5 8 9 5 -2 3 -1 2 2 4 6 6 -4 7 7 0 8 A) -0.104 B) -0.132 C) -0.549 D) -0.581 SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 4) The data below are the gestation periods, in months, of randomly selected animals and their corresponding life spans, in years. Calculate the correlation coefficient r. Gestation, x 8 2.1 1.3 1 11.5 5.3 3.8 24.3 Life span, y 30 12 6 3 25 12 10 40 5) The data below are the average monthly temperatures, in °F, and the monthly natural gas consumption, in ccf, for a household in northwestern Pennsylvania. Calculate the correlation coefficient, r. Temperature 47 35 21 27 39 48 61 65 70 Consumption 34 169 248 134 137 100 19 34 12
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 6) The data below are the final exam scores of 10 randomly selected statistics students and the number of hours they studied for the exam. Calculate the correlation coefficient r. Hours, x Scores, y 8 72 10 87 7 67 13 95 7 73 9 85 9 92 10 97 11 97 8 78 A) 0.847 B) 0.991 C) 0.761 D) 0.654 7) The data below are the temperatures on randomly chosen days during a summer class and the number of absences on those days. Calculate the correlation coefficient, r. Temperature, x Number of absences, y 74 5 87 9 93 12 92 12 90 10 100 17 77 6 102 17 82 7 A) 0.980 B) 0.890 C) 0.881 D) 0.819 8) The data below are the ages and systolic blood pressures (measured in millimeters of mercury) of 9 randomly selected adults. Calculate the correlation coefficient, r. Age, x Pressure, y 41 111 44 115 48 118 51 126 54 137 56 140 60 143 64 145 68 147 A) 0.960 B) 0.998 C) 0.890 D) 0.908 9) The data below are the number of absences and the final grades of 9 randomly selected students from a statistics class. Calculate the correlation coefficient, r. Number of absences, x Final Grade, y 2 100 5 88 8 82 6 84 11 73 4 94 17 57 10 78 7 84 A) -0.991 B) -0.888 C) -0.918 D) -0.899 10) A manager wishes to determine the relationship between the number of miles (in hundreds of miles) the managerʹs sales representatives travel per month and the amount of sales (in thousands of dollars) per month. Calculate the correlation coefficient, r. Miles traveled, x Sales, y 5 41 6 43 13 88 10 72 11 75 18 71 6 58 4 65 14 130 A) 0.632 B) 0.561 C) 0.717 D) 0.791 11) In order for applicants to work for the foreign-service department, they must take a test in the language of the country where they plan to work. The data below shows the relationship between the number of years that applicants have studied a particular language and the grades they received on the proficiency exam. Calculate the correlation coefficient, r. 
Number of years, x Grades on test, y 4 62 5 69 5 76 6 83 4 74 7 91 3 59 8 94 4 73 A) 0.934 B) 0.911 C) 0.891 D) 0.902
12) In an area of the Midwest, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Calculate the correlation coefficient, r. Rain fall (in inches), x Yield (bushels per acre), y 10.7 47.5 9 43.2 13.6 55.8 12.7 56 19 79.4 10.5 46.2 7.2 28.9 15.8 73 16.2 75.8 A) 0.981 B) 0.998 C) 0.900 D) 0.899 13) Given a sample with r = 0.823, n = 10, and α = 0.05, determine the standardized test statistic t necessary to test the claim ρ = 0. Round answers to three decimal places. A) 4.098 B) 3.816 C) 2.891 D) 1.782 14) Given a sample with r = -0.541, n = 20, and α = 0.01, determine the standardized test statistic t necessary to test the claim ρ = 0. Round answers to three decimal places. A)-2.729 B)-5.132 C)-4.671 D)-3.251 15) Given a sample with r = 0.321, n = 30, and α = 0.10, determine the standardized test statistic t necessary to test the claim ρ = 0. Round answers to three decimal places. A) 1.793 B) 3.198 C) 2.354 D) 2.561 16) Given a sample with r = -0.765, n = 22, and α = 0.02, determine the standardized test statistic t necessary to test the claim ρ = 0. Round answers to three decimal places. A)-5.312 B)-4.392 C)-3.783 D)-2.653
17) Given a sample with r = 0.823, n = 10, and α = 0.05, determine the critical values t0 necessary to test the claim ρ = 0.
A) ± 2.306 B) ± 2.821 C) ± 1.833 D) ± 1.383
18) Given a sample with r = -0.541, n = 20, and α = 0.01, determine the critical values t0 necessary to test the claim ρ = 0.
A) ± 2.878 B) ± 1.729 C) ± 2.093 D) ± 2.540
19) Given a sample with r = 0.321, n = 30, and α = 0.10, determine the critical values t0 necessary to test the claim ρ = 0.
A) ± 1.701 B) ± 1.311 C) ± 2.462 D) ± 0.683
20) Given a sample with r = -0.765, n = 22, and α = 0.02, determine the critical values t0 necessary to test the claim ρ = 0.
A) ± 2.528 B) ± 2.080 C) ± 2.831 D) ± 1.721
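The standardized test statistic used in exercises 13-20 is t = r / sqrt((1 - r^2) / (n - 2)), with n - 2 degrees of freedom. The Python sketch below is one way to check a hand calculation. The function name t_statistic is illustrative, and the critical value is assumed to be read from a t table rather than computed.

```python
import math

def t_statistic(r, n):
    """Standardized test statistic for the claim rho = 0 (n - 2 degrees of freedom)."""
    return r / math.sqrt((1 - r**2) / (n - 2))

# Values from exercise 13: r = 0.823, n = 10.
t = t_statistic(0.823, 10)

# Assumption: 2.306 is the two-tailed t-table value for alpha = 0.05
# and 10 - 2 = 8 degrees of freedom.
t_crit = 2.306

# Reject the null hypothesis when t falls in the rejection region.
reject_h0 = abs(t) > t_crit
print(round(t, 3), reject_h0)
```

The comparison with the critical value is the decision step of the hypothesis test; for a one-tailed claim the table value and the direction of the comparison would change.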
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
21) Given a sample with r = 0.823 and n = 10, test the significance of the correlation r using α = 0.05 and the claim ρ = 0.
22) Given a sample with r = -0.541, n = 20, test the significance of the correlation r using α = 0.01 and the claim ρ = 0.
23) Given a sample with r = 0.321 and n = 30, test the significance of the correlation r using α = 0.10 and the claim ρ = 0.
24) Given a sample with r = -0.765 and n = 22, test the significance of the correlation r using α = 0.02 and the claim ρ = 0. 25) For the data below, test the significance of the correlation coefficient using α = 0.05 and the claim ρ = 0. x y -9 -14 -7 -12 0 5 -3 -3 -5 -6 -6 -10 -4 -5 -2 -1 -1 2 -8 -12 26) For the data below, test the significance of the correlation coefficient using α= 0.01 and the claim ρ = 0. x y -15 7 -13 2 -6 -10 -9 -5 -11 -1 -12 0 -10 -3 -8 -8 -7 -9 -14 4 27) For the data below, test the significance of the correlation coefficient using α = 0.10 and the claim ρ = 0. x y -9 10 -7 -7 0 7 -3 -4 -5 -3 -6 0 -4 4 -2 -6 -1 5 -8 6 28) The data below are the gestation periods, in months, of randomly selected animals and their corresponding life spans, in years. Test the significance of the correlation coefficient using α = 0.01 and the claim ρ > 0. Gestation, x 8 2.1 1.3 1 11.5 5.3 3.8 24.3 Life span, y 30 12 6 3 25 12 10 40 29) The data below are the final exam scores of 10 randomly selected statistics students and the number of hours they studied for the exam. Test the significance of the correlation coefficient using α = 0.05 and the claim ρ = 0. Hours, x Scores, y 6 72 8 87 5 67 11 95 5 73 7 85 7 92 8 97 9 97 6 78 30) The data below are the temperatures on randomly chosen days during a summer class and the number of absences on those days. Test the significance of the correlation coefficient using α = 0.02, and the claim ρ = 0. Temperature, x Number of absences, y 71 2 84 6 90 9 89 9 87 7 97 14 74 3 99 14 79 4 31) The data below are the ages and systolic blood pressures (measured in millimeters of mercury) of 9 randomly selected adults. Test the significance of the correlation coefficient using α = 0.05 and the claim ρ = 0. Age, x Pressure, y 42 118 45 122 49 125 52 133 55 144 57 147 61 150 65 152 69 154
32) The data below are the number of absences and the final grades of 9 randomly selected students from a statistics class. Test the significance of the correlation coefficient using α = 0.05 and the claim ρ = 0. Number of absences, x Final Grade, y 4 99 7 87 10 81 8 83 13 72 6 93 19 56 12 77 9 83 33) A manager wishes to determine the relationship between the number of miles (in hundreds of miles) the managerʹs sales representatives travel per month and the amount of sales (in thousands of dollars) per month. Test the significance of the correlation coefficient using α = 0.01 and the claim ρ = 0. Miles traveled, x Sales, y 4 27 5 29 12 74 9 58 10 61 17 57 5 44 3 51 13 116 34) In order for applicants to work for the foreign-service department, they must take a test in the language of the country where they plan to work. The data below shows the relationship between the number of years that applicants have studied a particular language and the grades they received on the proficiency exam. Test the significance of the correlation coefficient using α = 0.10 and the claim ρ = 0. Number of years, x Grades on test, y 7 58 8 65 8 72 9 79 7 70 10 87 6 55 11 90 7 69 35) In an area of the Midwest, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Test the significance of the correlation coefficient using α = 0.01 and the claim ρ = 0. Rain fall (in inches), x Yield (bushels per acre), y 9.1 54.5 7.4 50.2 12 62.8 11.1 63 17.4 86.4 8.9 53.2 5.6 35.9 14.2 80 14.6 82.8 36) The data below are the average monthly temperatures, in °F, and the monthly natural gas consumption, in ccf, for a household in northwestern Pennsylvania. Test the significance of the correlation coefficient using α = 0.05 and the claim ρ < 0. Temperature 47 35 21 27 39 48 61 65 70 Consumption 34 169 248 134 137 100 19 34 12 5 Calculate the Correlation Coefficient with Interchanged x and y SHORT ANSWER. 
Write the word or phrase that best completes each statement or answers the question. Provide an appropriate response. 1) Calculate the coefficient of correlation, r, letting Row 1 represent the x-values and Row 2 represent the y-values. Now calculate the coefficient of correlation, r, letting Row 2 represent the x -values and Row 1 represent the y-values. What effect does switching the explanatory and response variables have on the correlation coefficient? Row 1 Row 2 -8 -18 -6 0 1 1 -2 -7 -4 -10 -5 -14 -3 -9 -1 -5 0 -2 -7 0
6 Concepts
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Provide an appropriate response.
1) Explain the difference between ∑x² and (∑x)².
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
2) If Data A has a correlation coefficient of r = -0.991, and Data B has a correlation coefficient of r = 0.991, which statement is correct?
A) Data A and Data B have the same strength in linear correlation. B) Data A has a stronger linear correlation than Data B. C) Data A has a weaker linear correlation than Data B.
3) Which of the following values could not represent a correlation coefficient?
A) 1.032 B) 0 C) 0.927 D) -1
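The computational formula for r uses both the sum of the squared values and the square of the summed values, so the two quantities are easy to confuse. The Python sketch below keeps them apart and evaluates r for the four data points of exercise 4 in Section 9.3 (x = 1, 2, 3, 4 and y = 3, 5, 7, 9). The function name pearson_r is illustrative, not taken from the text.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient from the computational (sums-based) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)  # square each value, then add
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # sum_x**2 is the square of the sum: add the values, then square
    denominator = math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
    return numerator / denominator

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

sum_of_squares = sum(x * x for x in xs)  # 1 + 4 + 9 + 16 = 30
square_of_sum = sum(xs) ** 2             # (1 + 2 + 3 + 4)^2 = 100

r_xy = pearson_r(xs, ys)  # the points lie exactly on y = 2x + 1
r_yx = pearson_r(ys, xs)  # same formula with the roles of x and y exchanged
print(sum_of_squares, square_of_sum, r_xy, r_yx)
```

Because the formula treats the two lists symmetrically, calling it with the arguments exchanged is a quick way to experiment with the interchanged-variables exercise.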
9.2 Linear Regression
1 Find the Equation of a Regression Line
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Provide an appropriate response.
1) Find the equation of the regression line for the given data.
x y -5 -10 -3 -8 4 9 1 1 -1 -2 -2 -6 0 -1 2 3 3 6 -4 -8
A) y^ = 2.097x - 0.552 B) y^ = 0.522x - 2.097 C) y^ = 2.097x + 0.552 D) y^ = -0.552x + 2.097
2) Find the equation of the regression line for the given data.
x y -5 11 -3 6 4 -6 1 -1 -1 3 -2 4 0 1 2 -4 3 -5 -4 8
A) y^ = -1.885x + 0.758 B) y^ = 0.758x + 1.885 C) y^ = -0.758x - 1.885 D) y^ = 1.885x - 0.758
3) Find the equation of the regression line for the given data.
x y -5 11 -3 -6 4 8 1 -3 -1 -2 -2 1 0 5 2 -5 3 6 -4 7
A) y^ = -0.206x + 2.097 B) y^ = 2.097x - 0.206 C) y^ = 0.206x - 2.097 D) y^ = -2.097x + 0.206
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
4) The data below are the gestation periods, in months, of randomly selected animals and their corresponding life spans, in years. Find the equation of the regression line for the given data.
Gestation, x 8 2.1 1.3 1 11.5 5.3 3.8 24.3
Life span, y 30 12 6 3 25 12 10 40
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
5) The data below are the final exam scores of 10 randomly selected statistics students and the number of hours they studied for the exam. Find the equation of the regression line for the given data.
Hours, x Scores, y 3 65 5 80 2 60 8 88 2 66 4 78 4 85 5 90 6 90 3 71
A) y^ = 5.044x + 56.113 B) y^ = 56.113x - 5.044 C) y^ = -56.113x - 5.044 D) y^ = -5.044x + 56.113
6) The data below are the temperatures on randomly chosen days during a summer class and the number of absences on those days. Find the equation of the regression line for the given data.
Temperature, x Number of absences, y 72 3 85 7 91 10 90 10 88 8 98 15 75 4 100 15 80 5
A) y^ = 0.449x - 30.27 B) y^ = 30.27x - 0.449 C) y^ = 0.449x + 30.27 D) y^ = 30.27x + 0.449
7) The data below are ages and systolic blood pressures (measured in millimeters of mercury) of 9 randomly selected adults. Find the equation of the regression line for the given data.
Age, x Pressure, y 38 116 41 120 45 123 48 131 51 142 53 145 57 148 61 150 65 152
A) y^ = 1.488x + 60.461 B) y^ = 60.461x - 1.488 C) y^ = 1.448x - 60.461 D) y^ = 60.461x + 1.488
8) The data below are the number of absences and the final grades of 9 randomly selected students from a statistics class. Find the equation of the regression line for the given data.
Number of absences, x Final grade, y 0 98 3 86 6 80 4 82 9 71 2 92 15 55 8 76 5 82
A) y^ = -2.755x + 96.139 B) y^ = 96.139x - 2.755 C) y^ = -2.755x - 96.139 D) y^ = -96.139x + 2.755
9) A manager wishes to determine the relationship between the number of miles (in hundreds of miles) the manager's sales representatives travel per month and the amount of sales (in thousands of dollars) per month. Find the equation of the regression line for the given data.
Miles traveled, x Sales, y 2 31 3 33 10 78 7 62 8 65 15 61 3 48 1 55 11 120
A) y^ = 3.529x + 37.916 B) y^ = 37.916x - 3.529 C) y^ = 3.529x - 37.916 D) y^ = 37.916x + 3.529
10) In order for applicants to work for the foreign-service department, they must take a test in the language of the country where they plan to work. The data below show the relationship between the number of years that applicants have studied a particular language and the grades they received on the proficiency exam. Find the equation of the regression line for the given data.
Number of years, x Grades on test, y 3 61 4 68 4 75 5 82 3 73 6 90 2 58 7 93 3 72
A) y^ = 6.910x + 46.261 B) y^ = 6.910x - 46.261 C) y^ = 46.261x - 6.910 D) y^ = 46.261x + 6.910
11) In an area of the Midwest, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Find the equation of the regression line for the given data.
Rainfall (in inches), x Yield (bushels per acre), y 10.5 50.5 8.8 46.2 13.4 58.8 12.5 59.0 18.8 82.4 10.3 49.2 7.0 31.9 15.6 76.0 16.0 78.8
A) y^ = 4.379x + 4.267 B) y^ = -4.379x + 4.267 C) y^ = 4.267x + 4.379 D) y^ = 4.267x - 4.379
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
12) The data below are the average monthly temperatures, in °F, and the monthly natural gas consumption, in ccf, for a household in northwestern Pennsylvania. Find the equation of the regression line for the given data.
Temperature 47 35 21 27 39 48 61 65 70
Consumption 34 169 248 134 137 100 19 34 12
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
13) Given the equation of a regression line is y^ = 4x - 6, what is the best predicted value for y given x = 10? Assume that the variables x and y have a significant correlation. A) 34 B) 46 C) 56 D) 8 14) Given the equation of a regression line is y^ = -4.5x- 3.4, what is the best predicted value for y given x = 9.5? Assume that the variables x and y have a significant correlation. A) -46.15 B) -39.35 C) 39.35 D) 46.15
15) Given the equation of a regression line is y^ = 3.5x - 5.4, what is the best predicted value for y given x = -1.2? Assume that the variables x and y have a significant correlation.
A) -9.6 B) 12.3 C) -6.9 D) -12.3
16) Use the regression equation to predict the value of y for x = -1.3. Assume that the variables x and y have a significant correlation.
x y -5 -10 -3 -8 4 9 1 1 -1 -2 -2 -6 0 -1 2 3 3 6 -4 -8
A) -3.278 B) -2.174 C) 2.815 D) 1.379
17) Use the regression equation to predict the value of y for x = 1.1. Assume that the variables x and y have a significant correlation.
x y -5 11 -3 6 4 -6 1 -1 -1 3 -2 4 0 1 2 -4 3 -5 -4 8
A) -1.315 B) 2.832 C) -1.051 D) 2.719
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
18) The data below are the gestation periods, in months, of randomly selected animals and their corresponding life spans, in years. Use the regression equation to predict the life span, y, for a gestation period of 6 months, x. Assume the variables x and y have a significant correlation.
Gestation, x 8 2.1 1.3 1 11.5 5.3 3.8 24.3
Life span, y 30 12 6 3 25 12 10 40
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
19) The data below are the final exam scores of 10 randomly selected statistics students and the number of hours they studied for the exam. What is the best predicted value for y given x = 4? Assume that the variables x and y have a significant correlation.
Hours, x Scores, y 3 65 5 80 2 60 8 88 2 66 4 78 4 85 5 90 6 90 3 71
A) 76 B) 75 C) 74 D) 77
20) The data below are the temperatures on randomly chosen days during a summer class and the number of absences on those days. What is the best predicted value for y given x = 86? Assume that the variables x and y have a significant correlation.
Temperature, x Number of absences, y 72 3 85 7 91 10 90 10 88 8 98 15 75 4 100 15 80 5
A) 8 B) 9 C) 10 D) 11
21) The data below are the ages and systolic blood pressures (measured in millimeters of mercury) of 9 randomly selected adults. What is the best predicted value for y given x = 62? Assume that the variables x and y have a significant correlation. Age, x Pressure, y 38 116 41 120 45 123 48 131 51 142 53 145 57 148 61 150 65 152 A) 153 B) 155 C) 151 D) 149 22) The data below are the number of absences and the final grades of 9 randomly selected students from a statistics class. What is the best predicted value for y given x = 13? Assume that the variables x and y have a significant correlation. . Number of absences, x Final grade, y 0 98 3 86 6 80 4 82 9 71 2 92 15 55 8 76 5 82 A) 60 B) 61 C) 62 D) 59 23) In order for applicants to work for the foreign-service department, they must take a test in the language of the country where they plan to work. The data below show the relationship between the number of years that applicants have studied a particular language and the grades they received on the proficiency exam. What is the best predicted value for y given x = 3.5? Assume that the variables x and y have a significant correlation. Number of years, x Grades on test, y 3 61 4 68 4 75 5 82 3 73 6 90 2 58 7 93 3 72 A) 70 B) 68 C) 66 D) 72 24) In an area of the Midwest, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Which is the best predicted value for y given x = 18.4? Assume that the variables x and y have a significant correlation. Rain fall (in inches), x Yield (bushels per acre), y 10.5 50.5 8.8 46.2 13.4 58.8 12.5 59.0 18.8 82.4 10.3 49.2 7.0 31.9 15.6 76.0 16.0 78.8 A) 84.8 B) 85.1 C) 84.6 D) 85.3 SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 25) The data below are the average monthly temperatures, in °F, and the monthly natural gas consumption, in ccf, for a household in northwestern Pennsylvania. 
What is the best-predicted value for the gas consumption, y, given x = 50°F? Assume that the variables x and y have a significant correlation. Temperature 47 35 21 27 39 48 61 65 70 Consumption 34 169 248 134 137 100 19 34 12
26) A calculus instructor is interested in finding the strength of a relationship between the final exam grades of students enrolled in Calculus I and Calculus II at his college. The data (in percentages) are listed below. Calculus I Calculus II 88 81 78 80 62 55 75 78 95 90 91 90 83 81 86 80 98 100 a) Graph a scatter plot of the data. b) Find an equation of the regression line. c) Determine if there is a significant correlation between the data. Use α = 0.01. d) Predict a Calculus II exam score for a student who receives an 80 in Calculus I. Is your answer a valid prediction? 2 Find the Equation of a Regression Line with Interchanged x and y SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Provide an appropriate response. 1) Find the equation of the regression line by letting Row 1 represent the x-values and Row 2 represent the y-values. Now find the equation of the regression line letting Row 2 represent the x-values and Row 1 represent the y-values. What effect does switching the explanatory and response variables have on the regression line? Row 1 Row 2 -5 -10 -3 -8 4 9 1 1 -1 -2 -2 -6 0 -1 2 3 3 6 -4 -8 3 Concepts MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) Given the equation of a regression line is y^ = -1.04x + 50.3, determine whether there is a positive linear correlation or a negative linear correlation. A) negative linear correlation B) positive linear correlation 2) Given the equation of a regression line is y^ = 0.00014x + 2.53, determine whether there is a positive linear correlation or a negative linear correlation. A) positive linear correlation B) negative linear correlation
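The regression exercises in this section rest on the least-squares formulas m = (n∑xy - ∑x∑y) / (n∑x² - (∑x)²) and b = ȳ - m·x̄. The Python sketch below applies them to the temperature and gas-consumption data of exercise 12. The function name regression_line is illustrative, and the sketch is meant for checking hand work rather than as a prescribed method.

```python
def regression_line(xs, ys):
    """Slope m and intercept b of the least-squares line y^ = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
    b = sum_y / n - m * sum_x / n  # b = y-bar - m * x-bar
    return m, b

# Data from exercise 12: average monthly temperature (°F) and gas consumption (ccf).
temperature = [47, 35, 21, 27, 39, 48, 61, 65, 70]
consumption = [34, 169, 248, 134, 137, 100, 19, 34, 12]

m, b = regression_line(temperature, consumption)
print(f"y^ = {m:.3f}x + {b:.3f}")

# Predicted consumption at 50 °F, which lies inside the observed
# temperature range of 21 to 70 °F.
predicted = m * 50 + b
```

The printed coefficients can be compared with the regression equation quoted for the same data in Section 9.3. A prediction is only meaningful when x and y are significantly correlated and the chosen x is within, or close to, the range of the data.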
9.3 Measures of Regression and Prediction Intervals
1 Find Types of Variations and the Coefficient of Determination
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Provide an appropriate response.
1) Calculate the coefficient of determination, given that the linear correlation coefficient, r, is 0.837. What does this tell you about the explained variation and the unexplained variation of the data about the regression line?
2) Calculate the coefficient of determination, given that the linear correlation coefficient, r, is -0.625. What does this tell you about the explained variation and the unexplained variation of the data about the regression line?
3) Calculate the coefficient of determination, given that the linear correlation coefficient, r, is 1. What does this tell you about the explained variation and the unexplained variation of the data about the regression line?
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
4) Find the standard error of estimate, se, for the data below, given that y^ = 2x + 1.
x y 1 3 2 5 3 7 4 9
A) 0 B) 1 C) 2 D) 3
5) Find the standard error of estimate, se, for the data below, given that y^ = -2.5x.
x y -1 2 -2 6 -3 7 -4 10
A) 0.866 B) 0.675 C) 0.532 D) 0.349
6) Find the standard error of estimate, se, for the data below, given that y^ = 2.097x - 0.552.
x y -5 -10 -3 -8 4 9 1 1 -1 -2 -2 -6 0 -1 2 3 3 6 -4 -8
A) 0.976 B) 0.990 C) -0.990 D) 0.980
7) Find the standard error of estimate, se, for the data below, given that y^ = -1.885x + 0.758.
x y -5 11 -3 6 4 -6 1 -1 -1 3 -2 4 0 1 2 -4 3 -5 -4 8
A) 0.613 B) 0.981 C) 0.312 D) 0.011
8) Find the standard error of estimate, se, for the data below, given that y^ = -0.206x + 2.097.
x y -5 11 -3 -6 4 8 1 -3 -1 -2 -2 1 0 5 2 -5 3 6 -4 7
A) 6.306 B) 3.203 C) 5.918 D) 8.214
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
9) The data below are the gestation periods, in months, of randomly selected animals and their corresponding life spans, in years. Find the standard error of estimate, se, given that y^ = 1.523x + 6.343. Gestation, x 8 2.1 1.3 1 11.5 5.3 3.8 24.3 Life span, y 30 12 6 3 25 12 10 40
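The standard error of estimate is se = sqrt(∑(y - y^)² / (n - 2)), and the coefficient of determination is r². The Python sketch below evaluates both, using the data and regression line of exercise 5 (x = -1, -2, -3, -4; y = 2, 6, 7, 10; y^ = -2.5x) and the value r = 0.837 from exercise 1. The function name standard_error is illustrative.

```python
import math

def standard_error(xs, ys, predict):
    """Standard error of estimate: sqrt(sum of squared residuals / (n - 2))."""
    residual_ss = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(residual_ss / (len(xs) - 2))

# Data and regression line from exercise 5.
xs = [-1, -2, -3, -4]
ys = [2, 6, 7, 10]
s_e = standard_error(xs, ys, lambda x: -2.5 * x)
print(round(s_e, 3))

# Coefficient of determination for r = 0.837 (exercise 1): the fraction of
# the variation in y that is explained by the regression line.
r_squared = 0.837 ** 2
print(round(r_squared, 3))
```

The first result can be compared with the value of se that the prediction-interval exercises quote for the same data. The unexplained fraction of the variation is 1 - r².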
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
10) The data below are the final exam scores of 10 randomly selected statistics students and the number of hours they studied for the exam. Find the standard error of estimate, se, given that y^ = 5.044x + 56.11.
Hours, x Scores, y 3 65 5 80 2 60 8 88 2 66 4 78 4 85 5 90 6 90 3 71
A) 6.305 B) 7.913 C) 8.912 D) 9.875
11) The data below are the temperatures on randomly chosen days during a summer class and the number of absences on those days. Find the standard error of estimate, se, given that y^ = 0.449x - 30.27.
Temperature, x Number of absences, y 72 3 85 7 91 10 90 10 88 8 98 15 75 4 100 15 80 5
A) 0.935 B) 1.162 C) 1.007 D) 0.815
12) The data below are the ages and systolic blood pressures (measured in millimeters of mercury) of 9 randomly selected adults. Find the standard error of estimate, se, given that y^ = 1.488x + 60.46.
Age, x Pressure, y 38 116 41 120 45 123 48 131 51 142 53 145 57 148 61 150 65 152
A) 4.199 B) 6.981 C) 5.572 D) 3.099
13) The data below are the number of absences and the final grades of 9 randomly selected students from a statistics class. Find the standard error of estimate, se, given that y^ = -2.75x + 96.14.
Number of absences, x Final grade, y 0 98 3 86 6 80 4 82 9 71 2 92 15 55 8 76 5 82
A) 1.799 B) 4.531 C) 3.876 D) 2.160
14) A manager wishes to determine the relationship between the number of miles (in hundreds of miles) the manager's sales representatives travel per month and the amount of sales (in thousands of dollars) per month. Find the standard error of estimate, se, given that y^ = 3.53x + 37.92.
Miles traveled, x Sales, y 2 31 3 33 10 78 7 62 8 65 15 61 3 48 1 55 11 120
A) 22.062 B) 15.951 C) 10.569 D) 5.122
15) In order for applicants to work for the foreign-service department, they must take a test in the language of the country where they plan to work. The data below show the relationship between the number of years that applicants have studied a particular language and the grades they received on the proficiency exam. Find the standard error of estimate, se, given that y^ = 6.91x + 46.26.
Number of years, x Grades on test, y 3 61 4 68 4 75 5 82 3 73 6 90 2 58 7 93 3 72
A) 4.578 B) 3.412 C) 5.192 D) 6.713
16) In an area of the Midwest, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Find the standard error of estimate, se, given that y^ = 4.379x + 4.267.
Rainfall (in inches), x Yield (bushels per acre), y 10.5 50.5 8.8 46.2 13.4 58.8 12.5 59.0 18.8 82.4 10.3 49.2 7.0 31.9 15.6 76.0 16.0 78.8
A) 3.529 B) 4.759 C) 2.813 D) 1.332
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
17) The data below are the average monthly temperatures, in °F, and the monthly natural gas consumption, in ccf, for a household in northwestern Pennsylvania. Find the standard error of estimate, se, given that y^ = -4.310x + 296.352.
Temperature 47 35 21 27 39 48 61 65 70
Consumption 34 169 248 134 137 100 19 34 12
2 Construct and Interpret Prediction Intervals
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Provide an appropriate response.
1) Construct a 95% prediction interval for y given x = 3.5, y^ = 2x + 1 and se = 0.
x y 1 3 2 5 3 7 4 9
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
2) Construct a 95% prediction interval for y given x = 2.5, y^ = -2.5x and se = 0.866. Round interval to three decimal places.
x y -1 2 -2 6 -3 7 -4 10
A) -15.566 < y < 3.066 B) -12.594 < y < 0.094 C) -16.156 < y < 3.656 D) -8.244 < y < -4.256
3) Construct a 95% prediction interval for y given x = -3.5, y^ = 2.097x - 0.552 and se = 0.976. Round interval to two decimal places.
x -5 -3 4 1 -1 -2 0 2 3 -4
y -10 -8 9 1 -2 -6 -1 3 6 -8
A) -10.31 < y < -5.47 B) -3.19 < y < -2.15 C) -4.60 < y < -1.99 D) -12.14 < y < -6.48
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
4) The data below are the gestation periods, in months, of randomly selected animals and their corresponding life spans, in years. Construct a 95% prediction interval for y, the life span, given x = 10 months, y^ = 1.523x + 6.343, and se = 5.618. Round interval to two decimal places.
Gestation, x 8 2.1 1.3 1 11.5 5.3 3.8 24.3
Life span, y 30 12 6 3 25 12 10 40
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
5) The data below are the scores of 10 randomly selected students from a statistics class and the number of hours they studied for the exam. Construct a 95% prediction interval for y, the score on the final exam, given x = 7 hours, y^ = 5.044x + 56.11 and se = 6.305. Round interval to two decimal places.
Hours, x 3 5 2 8 2 4 4 5 6 3
Scores, y 65 80 60 88 66 78 85 90 90 71
A) 74.54 < y < 108.30 B) 55.43 < y < 78.19 C) 77.21 < y < 110.45 D) 79.16 < y < 112.34
6) The data below are the temperatures on randomly chosen days during a summer class and the number of absences on those days. Construct a 95% prediction interval for y, the number of days absent, given x = 95 degrees, y^ = 0.449x - 30.27 and se = 0.934. Round interval to three decimal places.
Temperature, x 72 85 91 90 88 98 75 100 80
Number of absences, y 3 7 10 10 8 15 4 15 5
A) 9.957 < y < 14.813 B) 3.176 < y < 5.341 C) 4.321 < y < 6.913 D) 6.345 < y < 8.912
7) In order for applicants to work for the foreign-service department, they must take a test in the language of the country where they plan to work. The data below show the relationship between the number of years that applicants have studied a particular language and the grades they received on the proficiency exam. Construct a 95% prediction interval for y given x = 2.5, y^ = 6.91x + 46.26, and se = 4.578. Round interval to two decimal places.
Number of years, x 3 4 4 5 3 6 2 7 3
Grades on test, y 61 68 75 82 73 90 58 93 72
8) In an area of the Midwest, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Construct a 95% prediction interval for y, the yield, given x = 11 inches, y^ = 4.379x + 4.267 and se = 3.529. Round interval to two decimal places.
Rainfall (in inches), x 10.5 8.8 13.4 12.5 18.8 10.3 7.0 15.6 16.0
Yield (bushels per acre), y 50.5 46.2 58.8 59.0 82.4 49.2 31.9 76.0 78.8
A) 43.56 < y < 61.32 B) 41.68 < y < 63.21 C) 40.54 < y < 64.15 D) 39.86 < y < 65.98
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
9) The data below are the average monthly temperatures, in °F, and the monthly natural gas consumption, in ccf, for a household in northwestern Pennsylvania. Construct a 90% prediction interval for y, the monthly gas consumption, given x = 50°F. Round interval to two decimal places.
Temperature 47 35 21 27 39 48 61 65 70
Consumption 34 169 248 134 137 100 19 34 12
10) A private organization conducted a survey in 9 regions of the country to determine the average weekly spending in dollars per person on tobacco products and alcoholic beverages. The data are listed below.
Region Alcohol spending, x Tobacco spending, y
1 $12.80 $8.50
2 $13.20 $7.60
3 $9.50 $6.90
4 $10.30 $6.80
5 $9.80 $6.80
6 $11.70 $5.70
7 $10.00 $6.50
8 $8.90 $4.90
9 $11.60 $7.00
a) Construct a scatter plot of the data letting x represent spending on alcohol and y represent spending on tobacco.
b) Find the regression line.
c) Find the coefficient of determination. What can you conclude?
d) Find the standard error of estimate, se.
e) Construct a 95% prediction interval for the weekly spending on tobacco when the amount spent on alcohol is $9.50.
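As a rough illustration of the two calculations this section asks for, the sketch below works through the exam-score data (problem 10 in part 1 and problem 5 in part 2). It assumes the usual textbook formulas, se = sqrt(sum of (y - y^)^2 / (n - 2)) and margin E = tc * se * sqrt(1 + 1/n + n(x0 - xbar)^2 / (n * sum of x^2 - (sum of x)^2)). The function names are illustrative only, and the critical value tc = 2.306 (95%, 8 degrees of freedom) is read from a t-table rather than computed.

```python
import math

def standard_error(xs, ys, slope, intercept):
    """Standard error of estimate about a given regression line (n - 2 d.f.)."""
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))

def prediction_interval(xs, x0, slope, intercept, se, t_crit):
    """Prediction interval for y at x = x0, given se and a t critical value."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    x_bar = sum_x / n
    margin = t_crit * se * math.sqrt(
        1 + 1 / n + n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    )
    y_hat = slope * x0 + intercept
    return y_hat - margin, y_hat + margin

# Exam-score data from the problems above.
hours = [3, 5, 2, 8, 2, 4, 4, 5, 6, 3]
scores = [65, 80, 60, 88, 66, 78, 85, 90, 90, 71]

se = standard_error(hours, scores, 5.044, 56.11)
# t_crit = 2.306 is an assumed table value for 95% confidence with n - 2 = 8 d.f.
low, high = prediction_interval(hours, 7, 5.044, 56.11, 6.305, 2.306)
print(f"se = {se:.3f}")
print(f"{low:.2f} < y < {high:.2f}")
```

Under these assumptions the printed values should agree with choice A in both problems. For other data sets, the critical value has to be looked up again for the matching confidence level and degrees of freedom.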
9.4 Multiple Regression
1 Use a Multiple Regression Equation to Predict y-values MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response.
1) A multiple regression equation is y^ = -35,000 + 130x1 + 20,000x2, where x1 is a person's age, x2 is the person's grade point average in college, and y is the person's income. Predict the income for a person who is 26 years old and had a college grade point average of 2.3.
A) $14,380 B) $84,380 C) $49,380 D) $485,299
2) A researcher found a significant relationship between a student's IQ, x1, grade point average, x2, and the score, y, on the verbal section of the SAT test. The relationship can be represented by the multiple regression equation y^ = 250 + 1.5x1 + 80x2. Predict the SAT verbal score of a student whose IQ is 103 and grade point average is 3.7.
A) 701 B) 451 C) 601 D) 651
3) A researcher found a significant relationship between a person's age, x1, the number of hours a person works per week, x2, and the number of accidents, y, the person has per year. The relationship can be represented by the multiple regression equation y^ = -3.2 + 0.012x1 + 0.23x2. Predict the number of accidents per year (to the nearest whole number) for a person whose age is 41 and who works 31 hours per week.
A) 4 B) 5 C) 6 D) 3
2 Find a Multiple Regression Equation, Standard Error of Estimate, and Coefficient of Determination SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Provide an appropriate response.
1) A researcher at a local law university wishes to see whether a student's grade point average and age are related to a student's score on the state bar exam. Six students are randomly selected. The data are given below.
Student GPA Age Score
1 3.5 23 530
2 2.8 28 550
3 3.9 22 690
4 3.4 27 620
5 2.3 21 430
6 3.3 26 580
a) Find a multiple regression equation for the data.
b) What is the standard error of estimate?
c) What is the coefficient of determination?
d) Interpret the results in (c).
e) Predict the state bar exam score for a 25-year-old student with a grade point average of 3.0.
2) A medical researcher wishes to see whether there is a relationship between a person's age, cholesterol level, and systolic blood pressure. Eight people are randomly selected. The data are listed below.
Person Age Cholesterol level Blood Pressure
1 38 220 116
2 41 225 120
3 45 200 123
4 48 190 131
5 51 250 142
6 53 215 145
7 57 200 148
8 61 170 150
a) Find a multiple regression equation for the data.
b) What is the standard error of estimate?
c) What is the coefficient of determination?
d) Interpret the results in (c).
e) If a person 50 years old with a cholesterol reading of 220 is selected, what is that person's predicted blood pressure reading?
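The two tasks in this section, evaluating a given multiple regression equation and fitting one with two predictors, can be sketched as follows. This is only a minimal illustration: the helper names `predict` and `fit_two_predictors` are made up for the example, and the fit solves the two-predictor normal equations directly instead of using statistical software, which is presumably what the problems intend.

```python
def predict(intercept, coefs, values):
    """Evaluate y^ = b + m1*x1 + m2*x2 + ... for one observation."""
    return intercept + sum(m * x for m, x in zip(coefs, values))

def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y^ = b + m1*x1 + m2*x2 via the normal equations."""
    n = len(y)
    mean1, mean2, mean_y = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - mean1 for v in x1]
    d2 = [v - mean2 for v in x2]
    dy = [v - mean_y for v in y]
    # Sums of squares and cross products of the centered data.
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    m1 = (s1y * s22 - s12 * s2y) / det
    m2 = (s11 * s2y - s12 * s1y) / det
    b = mean_y - m1 * mean1 - m2 * mean2
    return b, m1, m2

# Part 1, problem 1: age 26, grade point average 2.3.
income = predict(-35000, [130, 20000], [26, 2.3])

# Part 2, problem 1: bar exam data (x1 = GPA, x2 = age).
gpa = [3.5, 2.8, 3.9, 3.4, 2.3, 3.3]
age = [23, 28, 22, 27, 21, 26]
score = [530, 550, 690, 620, 430, 580]
b, m1, m2 = fit_two_predictors(gpa, age, score)

print(round(income))
print(round(b, 1), round(m1, 1), round(m2, 1))
```

The fitted coefficients should agree with the answer key to one decimal place. Predictions made from unrounded coefficients can differ slightly from ones made with the rounded equation.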
3 Find the Adjusted Coefficient of Determination
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Provide an appropriate response.
1) A researcher at a local law university wishes to see whether a studentʹs grade point average and age are related to a studentʹs score on the state bar exam. Six students are randomly selected. The data are given below.
Student GPA Age Score
1 3.5 23 530
2 2.8 28 550
3 3.9 22 690
4 3.4 27 620
5 2.3 21 430
6 3.3 26 580
Calculate the adjusted coefficient of determination, r2adj.
2) A medical researcher wishes to see whether there is a relationship between a person's age, cholesterol level, and systolic blood pressure. Eight people are randomly selected. The data are listed below.
Person Age Cholesterol level Blood Pressure
1 38 220 116
2 41 225 120
3 45 200 123
4 48 190 131
5 51 250 142
6 53 215 145
7 57 200 148
8 61 170 150
Calculate the adjusted coefficient of determination, r2adj.
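For reference, the adjusted coefficient of determination is commonly defined as r2adj = 1 - (1 - r2)(n - 1) / (n - k - 1), where n is the number of data rows and k is the number of predictors. The sketch below assumes that definition and plugs in the r2 values reported in the answer key for these two data sets (0.82 and 0.984); the function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted coefficient of determination for n data rows and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# r2 values are taken from the answer key, not recomputed here.
bar_exam = adjusted_r2(0.82, n=6, k=2)          # six students, GPA and age
blood_pressure = adjusted_r2(0.984, n=8, k=2)   # eight people, age and cholesterol

print(round(bar_exam, 2), round(blood_pressure, 3))
```

Because the penalty factor (n - 1) / (n - k - 1) is larger for small samples, the six-student data set loses noticeably more from the adjustment than the eight-person one.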
Ch. 9 Correlation and Regression
Answer Key
9.1 Correlation
1 Interpret Scatter Plots and Correlations 1) A 2) A 3) A
2 Identify the Explanatory and Response Variables 1) explanatory variable: rainfall in inches; response variable: yield per acre 2) explanatory variable: hours studying; response variable: grades on the test
3 Construct a Scatter Plot and Determine Correlation 1) There appears to be a positive linear correlation. 2) There appears to be a negative linear correlation. 3) There appears to be no linear correlation. 4) [figure not reproduced] 6) [figure not reproduced] 7) [figure not reproduced] 9) [figure not reproduced] 10) [figure not reproduced]
4 Perform a Hypothesis Test for a Population Correlation Coefficient 1) A 2) A 3) A 4) 0.916 5) -0.909 6) A 7) A 8) A 9) A 10) A 11) A 12) A 13) A 14) A 15) A 16) A 17) A 18) A 19) A 20) A 21) critical value t0 = ± 2.306; standardized test statistic t ≈ 4.098; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 22) critical value t0 = ± 2.878; standardized test statistic t ≈ -2.729; fail to reject H0; There is not sufficient evidence to conclude that a significant correlation exists.
23) critical value t0 = ± 1.701; standardized test statistic t ≈ 1.793; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 24) critical value t0 = ± 2.528; standardized test statistic t ≈ -5.312; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 25) critical value t0 = ± 2.306; standardized test statistic t ≈ 19.85; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 26) critical value t0 = ± 3.355; standardized test statistic t ≈ -28.18; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 27) critical value t0 = ± 1.860; standardized test statistic t ≈ 0.296; fail to reject H0; There is not sufficient evidence to conclude that a significant correlation exists.
28) standardized test statistic t ≈ 5.593; critical value t0 = 3.143; reject H0; There is sufficient evidence to conclude that a significant positive correlation exists. 29) critical value t0 = ± 2.306; standardized test statistic t ≈ 4.51; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 30) critical value t0 = ± 2.998; standardized test statistic t ≈ 13.03; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 31) critical value t0 = ± 2.365; standardized test statistic t ≈ 9.07; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 32) critical value t0 = ± 2.365; standardized test statistic t ≈ -19.59; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 33) critical value t0 = ± 3.499; standardized test statistic t ≈ 2.16; fail to reject H0; There is not sufficient evidence to conclude that a significant correlation exists. 34) critical value t0 = ± 1.895; standardized test statistic t ≈ 6.92; reject H0; There is sufficient evidence to conclude that a significant correlation exists. 35) critical value t0 = ± 3.499; standardized test statistic t ≈ 13.38; reject H0; There is sufficient evidence to conclude that a significant correlation exists.
36) standardized test statistic t ≈ -5.770; critical value t0 = -1.895; reject H0; There is sufficient evidence to conclude that a significant negative correlation exists. 5 Calculate the Correlation Coefficient with Interchanged x and y 1) The correlation coefficient remains unchanged. 6 Concepts 1)
∑x² means square each x-value and then add the squares, and (∑x)² means add the x-values and then square the sum. 2) A 3) A
9.2 Linear Regression
1 Find the Equation of a Regression Line 1) A 2) A 3) A 4) y^ = 1.523x + 6.343 5) A 6) A 7) A 8) A 9) A 10) A 11) A 12) y^ = -4.310x + 296.352 15) A 16) A 17) A 18) About 15 years. 19) A 20) A 21) A 22) A 23) A 24) A 25) About 81 ccf. 26) a) See graph below. b) y^ = 1.044x - 5.990 c) critical value t0 = ± 3.499; test statistic t = 7.64; reject H0; There is sufficient evidence to conclude that a significant correlation exists. d) When x = 80, y = 78. This is a valid prediction as there is a significant correlation between the data.
2 Find the Equation of a Regression Line with Interchanged x and y 1) The sign of m is unchanged, but the values of m and b change.
3 Concepts 1) A 2) A
9.3 Measures of Regression and Prediction Intervals
1 Find Types of Variations and the Coefficient of Determination 1) The coefficient of determination, r2, = 0.701. That is, 70.1% of the variation is explained and 29.9% of the variation is unexplained. 2) The coefficient of determination, r2, = 0.391. That is, 39.1% of the variation is explained and 60.9% of the variation is unexplained. 3) The coefficient of determination, r2, = 1. That is, 100% of the variation is explained and there is no variation that is unexplained. 4) A 5) A 6) A 7) A 8) A 9) 5.622 10) A 11) A 12) A 13) A 14) A 15) A 16) A 17) 35.899
2 Construct and Interpret Prediction Intervals 1) Since se = 0, there is no interval for x = 3.5. 2) A 3) A 4) 6.87 < y < 36.27 5) A 6) A 7) A 8) A 9) 8.91 < y < 152.79 10) a) b) y^ = 0.449x + 1.865 c) r2 = 0.437. This means that about 43.7% of the variation can be explained. About 56.3% of the variation is unexplained and is due to chance or other variables. d) se = 0.8247 e) 3.987 < y < 8.284
9.4 Multiple Regression
1 Use a Multiple Regression Equation to Predict y-values 1) A 2) A 3) A
2 Find a Multiple Regression Equation, Standard Error of Estimate, and Coefficient of Determination 1) a) y^ = -37.5 + 134.7x1 + 7.1x2 b) se = 48.4 c) r2 = 0.82 d) The multiple regression model explains 82% of the variation in y. e) 544 2) a) y^ = 14.2 + 1.87x1 + 0.13x2 b) se = 2.02 c) r2 = 0.984 d) The multiple regression equation explains 98.4% of the variation in y.
3 Find the Adjusted Coefficient of Determination
1) r2adj. = 0.70 2) r2adj. = 0.978