Chapter 17
Correlation and Regression
True/False Questions
The product moment correlation, r, is an index used to determine whether a linear, or
straight-line, relationship exists between X and Y. It indicates the degree to which the variation in one variable, X, is related to the variation in another variable, Y.
(True, easy, page 497)
When determining the correlation coefficient, r, it does matter which variable is considered to be the dependent variable and which the independent.
(False, difficult, page 499)
The test statistic is used when determining the statistical significance of the relationship between two variables measured by using r.
(True, moderate, page 499)
A correlation matrix indicates the coefficient of correlation between each pair of
variables.
(True, easy, page 500)
The order associated with a partial correlation indicates how many variables are being
adjusted or controlled. (True, moderate, page 501)
The partial correlation coefficient is a measure of the correlation between Y and X
when the linear effects of the other independent variables have been removed from X but not from Y.
(False, moderate, page 501)
In the absence of ties, Kendall’s τ yields a closer approximation to the Pearson product moment correlation coefficient, ρ, than Spearman’s ρs.
(False, difficult, page 502)
Regression analysis is concerned with the nature and degree of association between
variables and does not imply or assume any causality. (True, easy, page 503)
The product moment correlation helps us determine the strength of the association
between two metric variables. Regression analysis helps us determine which variables cause a change in other variables.
In bivariate regression, the null hypothesis is that no linear relationship exists between X and Y, or H0: β1 = 0.
(True, moderate, page 504)
Standardized variables have a mean of 1 and a variance of zero.
(False, easy, page 507)
The term beta coefficient or beta weight is used to denote the standardized regression
coefficient.
(True, moderate, page 507)
The statistical significance of the linear relationship between X and Y may be tested
by examining the hypotheses: H0: 0 ; H1: 0. (False, moderate, page 508)
The formula for the coefficient of determination is r2 = SSreg/SSy. (True, moderate, page 509)
The hypotheses for the test for significance of the coefficient of determination are: H0: R2pop = 0; H1: R2pop > 0. (True, difficult, page 510)
The standard error of estimate, SEE, is the standard deviation of the actual Y values
from the predicted Ŷ values. (True, moderate, page 510)
The general form of the multiple regression model is:
Y=β0+ β1 X1+ β2 X2+ β3X3+…+ βkXk+e (True, moderate, page 512)
The coefficient of multiple determination is adjusted for the number of dependent
variables and the sample size to account for diminishing returns. (False, difficult, page 513)
The multiple correlation coefficient, R, can also be viewed as the simple correlation
coefficient, r, between Y and Ŷ. (True, moderate, page 515)
R2 cannot decrease as more independent variables are added to the regression equation.
(True, easy, page 515)
A residual is the difference between the observed value of Yi and the value predicted
by the regression equation, Ŷi. (True, easy, page 517)
If a variable explains a significant proportion of the residual variation, it should be considered for inclusion in the regression equation.
(True, moderate, page 518)
If an examination of the residuals indicates that the assumptions underlying linear
regression are not met, the researcher can transform the variables in an attempt to satisfy the assumptions.
(True, difficult, page 518)
The purpose of stepwise regression is to select, from a large number of predictor
variables, a small subset of variables that account for most of the variation in the dependent or criterion variable.
(True, easy, page 519)
Stepwise procedures result in regression equations that are optimal, in the sense of
producing the largest R2, for a given number of predictors. (False, easy, page 520)
Multicollinearity arises when intercorrelations among the predictors are very low.
(False, easy, page 521)
When coding for dummy variables, c – 1 codes are needed for the c categories of an
independent variable. (True, easy, page 523)
In regression with dummy variables, the predicted Ŷ for each category is the mean of
Y for each category.
(True, moderate, page 524)
Regression in which a single independent variable has been recoded into dummy
variables is equivalent to one-way analysis of variance. (True, moderate, page 524)
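As an illustration of the two preceding statements, the following is a minimal Python sketch, not taken from the text, that fits a dummy-variable regression by ordinary least squares. The helper `ols`, the three categories, and the Y values are all hypothetical. With c = 3 categories, c - 1 = 2 dummy variables are used, and the fitted value for each category should equal that category's mean of Y.

```python
def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X)b = X'y.

    `rows` holds the predictor values for each observation; an intercept
    column is added automatically. Hypothetical helper for illustration.
    """
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(obs[i] * obs[j] for obs in X) for j in range(k)]
         + [sum(obs[i] * yi for obs, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [v - factor * p for v, p in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data: three categories, two observations each.
# Category A is the base category (D1 = 0, D2 = 0).
dummies = [(0, 0), (0, 0), (1, 0), (1, 0), (0, 1), (0, 1)]
y = [2, 4, 5, 7, 9, 11]

a, b1, b2 = ols(dummies, y)

# Predicted value for each category: a + b1*D1 + b2*D2.
pred_A = a
pred_B = a + b1
pred_C = a + b2

# Category means of Y, computed directly for comparison.
mean_A = (2 + 4) / 2
mean_B = (5 + 7) / 2
mean_C = (9 + 11) / 2
```

In this sketch the intercept plays the role of the base-category mean, and each dummy coefficient is the difference between its category mean and the base-category mean.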
Multiple Choice Questions
31. _____ is best to use to determine how strongly sales are related to advertising expenditures.
a. Regression analysis
b. Partial correlation coefficient
c. ANOVA
d. Product moment correlation (r)
(d, moderate, page 497)
32. The _____ is a statistic summarizing the strength of association between two metric variables.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(d, moderate, page 497)
33. The equation for r involves dividing the _____ by _____.
a. COVxy; the product of the variances of X and Y (Sx2Sy2)
b. the product of the standard deviations of X and Y (SxSy); COVxy
c. COVxy; the product of the standard deviations of X and Y (SxSy)
d. the product of the variances of X and Y (Sx2Sy2); COVxy
(c, difficult, page 497)
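The formula in question 33 can be sketched in Python as follows. This is an illustrative example only: the function name `pearson_r` and the advertising and sales figures are hypothetical and do not come from the text. It also shows that swapping X and Y leaves r unchanged.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation: COVxy divided by (Sx * Sy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and standard deviations (n - 1 in the denominator).
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    return cov_xy / (s_x * s_y)


# Hypothetical data: advertising expenditure (X) and sales (Y).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pearson_r(advertising, sales)
r_swapped = pearson_r(sales, advertising)  # roles of X and Y reversed
r_squared = r ** 2                         # proportion of shared variation
```

For these made-up figures r is roughly 0.77 and r2 is 0.6.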
34. The equation for r is represented as:
a. COVxy/Sx2Sy2
b. SxSy/COV
c. COVxy/SxSy
d. Sx2Sy2/COV
(c, moderate, page 497)
35. r2 measures:
a. the proportion of variation in one variable that is explained by the other.
b. the proportion of error variation.
c. the proportion of variation in Y related to the variation of the categories of X.
d. the proportion of variation in Y due to the variation within each of the categories of X.
(a, moderate, page 499)
36. r = 0 indicates:
a. X and Y have a relationship
b. X and Y don’t have a linear relationship
c. X and Y are unrelated
d. X and Y have a linear relationship
(b, moderate, page 499)
37. Which statement about the correlation coefficient, r, is true?
a. The calculation of r assumes that X and Y are metric variables whose distributions have the same shape.
b. The correlation coefficient computed for a population is denoted by ρ(rho).
c. Data obtained by using rating scales with a small number of categories tends to deflate r.
d. All of the statements are true. (d, difficult, page 499)
38. Which statement is not true about correlation matrices?
a. Usually only the lower portion of the matrix is considered.
b. The diagonal elements all equal 0.
c. A correlation matrix indicates the coefficient of correlation between each pair of variables.
d. The upper triangular portion of the matrix is a mirror image of the lower triangular portion.
(b, easy, page 500)
39. The _____ is a measure of the association between two variables after controlling or adjusting for the effects of one or more additional variables.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(b, moderate, page 500)
40. The question of ‘How strongly are sales related to advertising expenditures when the effect of price is controlled?’ is best answered via _____.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(b, moderate, page 500)
41. Which statement is not correct about the partial correlation coefficient?
a. Partial correlations can be helpful for detecting spurious relationships.
b. The partial correlation coefficient is generally viewed as more important than the
part correlation coefficient.
c. The partial correlation coefficient represents the correlation between Y and X
when the linear effects of the other independent variables have been removed from X but not from Y.
d. The partial correlation coefficient can be calculated by a knowledge of the simple
correlations alone. (c, moderate, pages 500-502)
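Option d in question 41 refers to computing a partial correlation from simple correlations alone. A minimal Python sketch of the standard first-order formula is given below. The function name `partial_r` and the correlation values are hypothetical, chosen so that the association between X and Y disappears once Z is controlled, which is the spurious-relationship case mentioned in option a.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical simple correlations.
# Case 1: the X-Y correlation is fully accounted for by Z.
spurious = partial_r(r_xy=0.2, r_xz=0.5, r_yz=0.4)

# Case 2: part of the X-Y correlation remains after controlling for Z.
remaining = partial_r(r_xy=0.6, r_xz=0.5, r_yz=0.4)
```

In the first case the partial correlation is zero even though the simple correlation is 0.2. In the second case it is about 0.50.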
42. Which of the following is a measure of non-metric correlation?
a. Pearson product moment correlation
b. Spearman’s rho
c. Kendall’s tau
d. both b and c
(d, easy, page 502)
43. _____ is a statistical procedure for analyzing associative relationships between a metric dependent variable and one or more independent variables.
a. regression analysis
b. partial correlation coefficient
c. ANOVA
d. product moment correlation
(a, moderate, page 502)
44. _____ is a procedure for deriving a mathematical relationship, in the form of an equation, between a single metric dependent variable and a single metric independent variable.
a. Chi-square
b. Part correlation
c. Multiple regression
d. Bivariate regression
(d, moderate, page 503)
45. Which of the following marketing questions would be best answered by bivariate regression?
a. Are consumers’ perceptions of quality related to their perceptions of prices when the effect of brand image is controlled?
b. Can the variation in market share be accounted for by the size of the sales force?
c. Do retailers, wholesalers, and agents differ in their attitudes toward the firm’s distribution policies?
d. How do advertising levels (high, medium, and low) interact with price levels (high, medium, and low) to influence a brand’s sale?
(b, difficult, page 503)
46. A technique for fitting a straight line to a scattergram by minimizing the square of the vertical distances of all the points from the line is known as the _____.
a. least-squares procedure
b. scatter diagram plot
c. sum of square errors procedure
d. maximum residual procedure
(a, easy, page 505)
47. The bivariate regression model that accounts for the probabilistic or stochastic nature of the relationship between X and Y is _____.
a. Ŷ = a+b1X1+b2X2
b. Y = β0+β1Xi
c. Yi = β0+β1Xi+ei
d. Ŷi = a+bXi
(c, easy, page 506)
48. What is the bivariate regression equation if sample observations are used to predict Y?
a. Ŷ = a+b1X1+b2X2
b. Y = β0+β1Xi
c. Yi = β0+β1Xi+ei
d. Ŷi = a+bXi
(d, easy, page 506)
49. Which statement is not true about the constant b in the bivariate regression equation
Ŷi = a+bXi?
a. It is usually referred to as the non-standardized regression coefficient.
b. It is the slope of the regression line and it indicates the expected change in Y when
X is changed by one unit.
c. It is the intercept of the regression line and it indicates the value of Y when X is
zero.
d. It may be computed as b = COVxy/Sx2
(c, moderate, page 506)
50. Which equation depicts the relationship between the standardized and non-standardized regression coefficients?
a. Byx = byx(S2x/S2y)
b. B2yx = byx(Sx/Sy)
c. Byx = byx(Sx/Sy)
d. B2yx = byx(S2x/S2y)
(c, moderate, page 507)
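Questions 49 and 50 can be tied together with a short worked example. The Python sketch below is illustrative only and uses hypothetical data. It computes the slope as b = COVxy/Sx2, the intercept from the two means, and the standardized coefficient as b(Sx/Sy). In the bivariate case the standardized coefficient should coincide with r.

```python
from math import sqrt

# Hypothetical data, invented for this example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance and variances (n - 1 in the denominator).
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
var_y = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)

slope = cov_xy / var_x                 # non-standardized coefficient b
intercept = mean_y - slope * mean_x    # a, the value of Y-hat when X is zero

# Standardized (beta) coefficient: Byx = byx * (Sx / Sy).
beta = slope * sqrt(var_x) / sqrt(var_y)

# Simple correlation, for comparison with beta.
r = cov_xy / (sqrt(var_x) * sqrt(var_y))
```

With these figures the slope is 0.6, the intercept is 2.2, and beta and r are both about 0.77.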
51. The standard deviation of b, or the standard error, is denoted as:
a. SEb
b. SDb
c. SSYb
d. None of the above
(a, moderate, page 508)
52. In bivariate regression, which statement is true concerning the coefficient of determination, r2?
a. r2 is the square of the simple correlation coefficient obtained by correlating the
two variables.
b. r2 varies between 0 and 1.
c. r2 signifies the proportion of the total variation in Y accounted for by the variation in X.
d. All are correct. (d, easy, page 508)
53. _____ is the appropriate test statistic to use to determine the significance of the coefficient of determination in bivariate regression.
a. F statistic
b. T statistic
c. Z statistic
d. ω2
(a, moderate, page 510)
54. To estimate the accuracy of predicted values, Ŷ, found in bivariate regression, it is useful to calculate the _____, the standard deviation of the actual Y values from the predicted Ŷ values.
a. coefficient of determination
b. standard error of the estimate
c. covariance
d. standard error
(b, difficult, page 510)
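The quantities in questions 52 to 54 can be computed together. The following Python sketch is a hedged illustration on hypothetical data. It assumes the usual bivariate formulas: r2 = SSreg/SSy, SEE as the square root of SSres/(n - 2), and an F statistic with 1 and n - 2 degrees of freedom. Variable names are invented for the example.

```python
from math import sqrt

# Hypothetical data, invented for this example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for Y-hat = a + bX.
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

predicted = [a + b * xi for xi in x]

# Decomposition of the total variation: SSy = SSreg + SSres.
ss_y = sum((yi - mean_y) ** 2 for yi in y)
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
ss_reg = ss_y - ss_res

r_squared = ss_reg / ss_y               # coefficient of determination
see = sqrt(ss_res / (n - 2))            # standard error of estimate
f_stat = ss_reg / (ss_res / (n - 2))    # F with 1 and n - 2 degrees of freedom
```

For these figures r2 is 0.6, SEE is about 0.89, and F is 4.5.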
55. _____ is a statistical technique that simultaneously develops a mathematical relationship between two or more independent variables and an interval-scaled dependent variable.
a. Chi-square
b. The least-squares procedure
c. Multiple regression
d. Bivariate regression
(c, moderate, page 511)
56. The general form of the multiple regression model is estimated by which equation?
a. Ŷi = a+bXi
b. Yi = β0+β1Xi+ei
c. Ŷ = a+b1X1+b2X2+b3X3+…+bkXk
d. Ŷ = a+b1X1+b2X2
(c, easy, page 512)
57. Which statistic is associated only with multiple regression and not with bivariate regression?
a. adjusted R2
b. F test
c. estimated or predicted value (Ŷ)
d. both a and b
(d, moderate, page 513)
58. The _____ denotes the change in the predicted value, Ŷ, per unit change in X1 when the other independent variables, X2 to Xk, are held constant.
a. partial regression coefficient
b. partial correlation coefficient
59. Which statement is not true about partial regression coefficients?
a. The combined effects of X1 and X2 on Y are additive. In other words, if X1 and X2
are each changed by one unit, the expected change in Y would be (b1 + b2).
b. The beta coefficients are the partial regression coefficients obtained when all the
variables (Y, X1, X2… Xk) have been standardized to a mean of 0 and a variance of 1
before estimating the regression equation.
c. Partial regression coefficients have an order associated with them.
d. Both a and b are not true.
(c, moderate, pages 513-514)
60. In multiple regression, if the overall null hypothesis is rejected:
a. the mean value of the dependent variable will be different for different categories of the independent variable.
b. the means of the independent variables are not equal.
c. there is an association between the independent variables.
d. one or more population partial regression coefficients have a value different from 0.
(d, moderate, page 516)
61. In multiple regression, if the overall null hypothesis is rejected, which statement is true?
a. We know which specific β s are nonzero.
b. We can use t = b/SEb to determine which β s are nonzero.
c. We do not know which β s are nonzero.
d. Both b and c are correct.
(d, moderate, page 516)
62. _____ is a regression procedure in which the predictor variables enter or leave the regression equation one at a time.
a. Multiple regression
b. Bivariate regression
c. Dummy variable regression
d. Stepwise regression
(d, easy, page 519)
63. Which of the following is not a problem associated with multicollinearity?
a. The partial regression coefficients may not be estimated precisely. The standard errors are likely to be high.
b. It becomes difficult to assess the relative importance of the independent variables in explaining the variation in the dependent variables.
c. Predictor variables may be incorrectly included or removed in stepwise regression.
d. It becomes difficult to compute the correct test statistic. (d, difficult, page 521)
64. _____ variables may be used as predictors or independent variables by coding them as dummy variables.
a. interval
b. categorical
c. ratio
d. all of the above
(b, easy, page 523)
65. The regression equation for a categorical variable with four categories would be modeled as:
a. Ŷi = a+b1D1+b2D2+b3D3
b. Ŷi = a+b1D1+b2D2+b3D3+b4D4
c. Y = a+b1D1+b2D2+b3D3
d. Y = a+b1D1+b2D2+b3D3+b4D4
(a, moderate, page 523)
Essay Questions
66. In what ways can regression analysis be used?
Answer
1. Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
2. Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.
3. Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
4. Predict the values of the dependent variable.
5. Control for other independent variables when evaluating the contributions of a specific variable or set of variables.
(moderate, page 503)
67. Briefly explain how a scatter diagram benefits the researcher.
Answer
A scatter diagram is useful for determining the form of the relationship between the variables. A plot can alert the researcher to patterns in the data, or to possible problems. Any unusual combinations of the two variables can be easily identified.
(easy, pages 504-505)
68. What are the assumptions made by the regression model in estimating the parameters and in significance testing?
Answer
1. The error term is normally distributed. For each fixed value of X, the distribution of Y is normal.
2. The means of all these normal distributions of Y, given X, lie on a straight line with slope b.
3. The mean of the error term is 0.
4. The variance of the error term is constant. This variance does not depend on the values assumed by X.
5. The error terms are uncorrelated. In other words, the observations have been drawn independently.
(difficult, page 511)
69. Given the multiple regression equation, Ŷ = a+b1X1+b2X2, and the bivariate equation
Ŷ = a+bX, why is the partial regression coefficient, b1, different from the regression coefficient, b, obtained by regressing Y on only X1?
Answer
This happens because X1 and X2 are usually correlated. In bivariate regression, X2 was
not considered and any variation in Y that was shared by X1 and X2 was attributed to
X1. However, in the case of multiple independent variables, this is no longer true.
(difficult, page 513)
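The answer to question 69 can be illustrated numerically. The Python sketch below is not from the text. The helper `ols` and the data are hypothetical, with Y constructed as exactly 1 + 2·X1 + 3·X2 and X2 correlated with X1. Regressing Y on X1 alone should then give a coefficient larger than 2, because variation shared by X1 and X2 is credited to X1.

```python
def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X)b = X'y.

    `rows` holds the predictor values for each observation; an intercept
    column is added automatically. Hypothetical helper for illustration.
    """
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(obs[i] * obs[j] for obs in X) for j in range(k)]
         + [sum(obs[i] * yi for obs, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [v - factor * p for v, p in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data: X2 tends to rise with X1, so the two are correlated.
x1 = [1, 2, 3, 4, 5]
x2 = [1, 1, 2, 3, 3]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]

# Multiple regression: Y-hat = a + b1*X1 + b2*X2.
a_multi, b1, b2 = ols(list(zip(x1, x2)), y)

# Bivariate regression: Y-hat = a + b*X1, with X2 left out.
a_simple, b = ols([(u,) for u in x1], y)
```

Here the partial regression coefficient b1 recovers 2, while the bivariate coefficient b is 3.8, which equals 2 plus 3 times the slope of X2 on X1 (0.6).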