multiple linear regression models and determination of leverage values
Here the reduced model is one with just an intercept and no predictor variables (i.e. $\beta_1 = \beta_2 = \dots = \beta_j = \dots = 0$). Interpretation of $r^2$ in multiple linear regression must be done carefully. Just as in simple regression, $r^2$ is not directly comparable between models based on different transformations (Anderson-Sprecher 1994; Chapter 5). Additionally, $r^2$ is not a useful measure of fit when comparing models with different numbers of, or combinations of, predictor variables (e.g. interaction terms, see Section 6.1.12). As more predictors are added to a model, $r^2$ cannot decrease, so models with more predictors will always appear to fit the data at least as well. Comparing the fit of models with different numbers of predictors should use alternative measures (see Section 6.1.15).
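The statement that $r^2$ cannot decrease as predictors are added can be checked numerically. The following is a minimal Python sketch rather than part of the original analysis: the data values are made up, and the helper names `ols_fit`, `ss_residual` and `r_squared` are hypothetical. The small normal-equations solver is written out only so that the example runs with the standard library alone.

```python
def ols_fit(X, y):
    """Return [intercept, b1, ..., bp] for the least-squares fit of y on the rows of X."""
    rows = [[1.0] + list(r) for r in X]              # prepend the intercept column
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def ss_residual(X, y):
    """Residual sum of squares for the least-squares fit of y on the rows of X."""
    b = ols_fit(X, y)
    fitted = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], r)) for r in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def r_squared(X, y):
    """r2 = 1 - (Full SSResidual / Reduced SSResidual), reduced model = intercept only."""
    ss_total = ss_residual([[] for _ in y], y)       # intercept-only model gives SSTotal
    return 1.0 - ss_residual(X, y) / ss_total


# Made-up data: y depends on x1; 'noise' is an arbitrary column with no intended link to y.
y = [3.1, 4.0, 5.2, 5.9, 7.1, 8.2, 8.8, 10.1]
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4, -0.9, 0.2]

r2_one = r_squared([[a] for a in x1], y)
r2_two = r_squared([[a, b] for a, b in zip(x1, noise)], y)
print(f"r2 with x1 only:      {r2_one:.6f}")
print(f"r2 with x1 and noise: {r2_two:.6f}")
```

Even though the extra column carries no intended information about the response, the second $r^2$ is never smaller than the first, which is why $r^2$ alone cannot be used to choose between models with different numbers of predictors.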
6.1.6 Which predictors are important?
Once we have fitted our multiple linear regression model, we usually want to determine the relative importance of each predictor variable to the response variable. There are a number of related approaches for measuring relative importance of each predictor variable in multiple linear regression models.
Tests on partial regression slopes
The simplest way of assessing the relative importance of the predictors in a linear regression model is to use the F or t statistics, and their associated P values, from the tests of the null hypotheses that each $\beta_j$ equals zero. These tests are straightforward to interpret but only tell us the probability of observing our sample observations, or ones more extreme, for these variables if the $H_0$ for a given predictor is true. Also, some statisticians (Neter et al. 1996, Rawlings et al. 1998) have argued that we are testing null hypotheses about a number of regression coefficients simultaneously from a single data set, so we should adjust the significance level for each test to limit the overall probability of at least one Type I error among all our tests to $\alpha$. Such an adjustment will reduce the power of individual tests and, as we discussed in Chapter 3, seems unnecessarily harsh. If you deem such an adjustment necessary, however, one of the sequential Bonferroni procedures is appropriate.
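To make the adjustment concrete, the sketch below applies Holm's step-down procedure, which is one commonly used sequential Bonferroni method. The choice of this particular variant, the function name `holm_reject` and the P values are assumptions made for illustration only; they do not come from a fitted model in this chapter.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm's sequential Bonferroni: return True for each H0 that is rejected.

    P values are tested from smallest to largest against alpha / (m - rank),
    and testing stops at the first non-significant result.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break                                   # all larger P values are retained too
    return reject


# Hypothetical P values from t tests of three partial regression slopes.
p_slopes = [0.001, 0.04, 0.03]
decisions = holm_reject(p_slopes)
print(decisions)   # [True, False, False]
```

Without adjustment all three slopes would be judged significant at 0.05; with the sequential adjustment only the first is, which illustrates the loss of power mentioned above.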
Change in explained variation
The change in variation explained between the model with all predictors and the model with a specific predictor omitted is also a measure of the importance of that predictor. This basically compares the fit of two models to the data; because the number of predictors differs between the two models, the choice of measure of fit is critical and will be discussed further when we consider model selection in Section 6.1.15. The proportional reduction in the variation in $Y$ when a predictor variable $X_j$ is added to a model already including the other predictors ($X_1$ to $X_p$ except $X_j$) is simply:
$$r^2_{X_j} = \frac{SS_{\mathrm{Extra}}}{\mathrm{Reduced}\ SS_{\mathrm{Residual}}} \tag{6.14}$$
where $SS_{\mathrm{Extra}}$ is the increase in $SS_{\mathrm{Regression}}$, or the decrease in $SS_{\mathrm{Residual}}$, when $X_j$ is added to the model and Reduced $SS_{\mathrm{Residual}}$ is the unexplained SS from the model including all predictor variables except $X_j$. This $r^2_{X_j}$ is termed the coefficient of partial determination for $X_j$, and its square root is the partial correlation coefficient between $Y$ and $X_j$ holding the other predictor variables constant (i.e. already including them in the model).
A related approach is hierarchical partitioning (Chevan & Sutherland 1991, Mac Nally 1996), which quantifies the independent correlation of each predictor variable with the response variable. It works by partitioning any measure of explained variance (e.g. $r^2$) into components measuring the independent contribution of each predictor. It is an important tool for multivariate inference, especially in multiple regression models, and we will describe it in more detail in Section 6.1.16.
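The coefficient of partial determination (Equation 6.14) is the extra sum of squares gained by adding $X_j$, divided by the residual sum of squares of the model without $X_j$. A minimal Python sketch of that calculation follows. The data are made up, and the helper names (`ols_fit`, `ss_residual`, `partial_r2`) are hypothetical; the solver is included only to keep the example self-contained.

```python
def ols_fit(X, y):
    """Return [intercept, b1, ..., bp] for the least-squares fit of y on the rows of X."""
    rows = [[1.0] + list(r) for r in X]              # prepend the intercept column
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def ss_residual(X, y):
    """Residual sum of squares for the least-squares fit of y on the rows of X."""
    b = ols_fit(X, y)
    fitted = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], r)) for r in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def partial_r2(columns, y, j):
    """Coefficient of partial determination for predictor j (0-based index)."""
    n = len(y)
    full_rows = [[c[i] for c in columns] for i in range(n)]
    reduced_rows = [[c[i] for idx, c in enumerate(columns) if idx != j]
                    for i in range(n)]
    reduced_ss = ss_residual(reduced_rows, y)        # model without X_j
    ss_extra = reduced_ss - ss_residual(full_rows, y)
    return ss_extra / reduced_ss


# Made-up data with two predictors.
y = [4.9, 5.1, 10.2, 5.8, 11.9, 8.1, 13.2, 11.8]
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 3, 8, 1, 7, 2, 6, 4]

r2_x1 = partial_r2([x1, x2], y, 0)
r2_x2 = partial_r2([x1, x2], y, 1)
print(f"partial r2 for x1 given x2: {r2_x1:.4f}")
print(f"partial r2 for x2 given x1: {r2_x2:.4f}")
```

Each value lies between 0 and 1 and describes how much of the variation left unexplained by the other predictor is accounted for by the predictor in question.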
Standardized partial regression slopes
The sizes of the individual regression slopes are difficult to compare if the predictor variables are measured in different units (see Chapter 5). We can calculate standardized regression slopes by regressing the standardized response variable against the standardized predictor variables or, alternatively, calculate for predictor $X_j$:
$$b_j^* = b_j \frac{s_{X_j}}{s_Y} \tag{6.15}$$
where $s_{X_j}$ and $s_Y$ are the sample standard deviations of $X_j$ and $Y$. These standardized regression slopes are comparable independently of the scales on which the predictors are measured. Note that the regression model based on standardized variables does not include an intercept, because its OLS (and ML) estimate will always be zero. Note also that if the predictor variables are not correlated with each other, then the standardized regression slopes relating $Y$ to each $X_j$ are the same as the correlation coefficients relating $Y$ to $X_j$.
For model 6.3, standardized regression slopes would not assist interpretation because both predictors (latitude and longitude) are in the same units (centesimal degrees). However, if we included mean annual temperature (°C) and mean annual precipitation (mm) in the model, then the magnitudes of the unstandardized regression slopes would not be comparable because of the different units, so standardization would help.
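The two routes to standardized slopes described above (rescaling each slope by the ratio of standard deviations, or refitting the model to standardized variables) can be compared directly. The sketch below uses made-up temperature and precipitation values and a hypothetical response; the names `ols_fit` and `zscores` are illustrative, and the solver is written out only so that the example needs nothing beyond the standard library.

```python
import statistics


def ols_fit(X, y):
    """Return [intercept, b1, ..., bp] for the least-squares fit of y on the rows of X."""
    rows = [[1.0] + list(r) for r in X]              # prepend the intercept column
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def zscores(v):
    """Standardize a variable to mean 0 and sample standard deviation 1."""
    m, s = statistics.mean(v), statistics.stdev(v)
    return [(vi - m) / s for vi in v]


# Made-up data: predictors in very different units, plus a hypothetical response.
temp = [12.1, 14.3, 15.0, 16.8, 18.2, 19.5, 21.0, 22.4]    # degrees C
precip = [850, 620, 910, 700, 540, 780, 430, 600]           # mm
y = [14.2, 12.9, 17.1, 15.8, 15.1, 18.9, 16.0, 19.2]

# Route 1: fit on raw variables, then rescale each slope by s_Xj / s_Y.
b = ols_fit(list(zip(temp, precip)), y)
s_y = statistics.stdev(y)
b_star = [b[1] * statistics.stdev(temp) / s_y,
          b[2] * statistics.stdev(precip) / s_y]

# Route 2: fit on standardized response and predictors.
b_z = ols_fit(list(zip(zscores(temp), zscores(precip))), zscores(y))

print("unstandardized slopes:", b[1], b[2])
print("standardized (route 1):", b_star)
print("standardized (route 2):", b_z[1:], "intercept:", b_z[0])
```

The two sets of standardized slopes agree to within rounding error, and the intercept of the standardized model is zero apart from floating-point noise.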
Bring (1994) suggested that the size of each standardized slope should relate to the reduction in explained variation when each predictor is omitted from the full model (see Equation 6.14). He argued that standardization should be based on partial standard deviations rather than ordinary standard deviations, so that the size of $b_j^*$ relates to the reduction in $r^2$ when that $X_j$ is omitted from the model. The partial standard deviation of predictor variable $j$ ($X_j$) is:
$$s^*_{X_j} = \frac{s_{X_j}}{\sqrt{VIF_j}} \sqrt{\frac{n-1}{n-p}} \tag{6.16}$$
VIF is the variance inflation factor and will be defined in Section 6.1.11 when we examine the problem of multicollinearity. This partial standard deviation can then be used in place of $s_{X_j}$ in the formula for the standardized regression slope (Equation 6.15).
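A small numerical sketch of the partial standard deviation (Equation 6.16) may help. It rests on two assumptions that are not stated in this section: the variance inflation factor is computed with its usual definition, $VIF_j = 1/(1 - r_j^2)$, where $r_j^2$ comes from regressing $X_j$ on the other predictors, and $p$ is taken to be the number of predictors. With only two predictors, $r_j^2$ is simply the squared correlation between them. The data and the function names `pearson_r` and `partial_sd` are hypothetical.

```python
import math
import statistics


def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def partial_sd(x, vif, n_predictors):
    """Partial standard deviation: (s_X / sqrt(VIF)) * sqrt((n - 1) / (n - p))."""
    n = len(x)
    return (statistics.stdev(x) / math.sqrt(vif)
            * math.sqrt((n - 1) / (n - n_predictors)))


# Made-up predictor values.
temp = [12.1, 14.3, 15.0, 16.8, 18.2, 19.5, 21.0, 22.4]    # degrees C
precip = [850, 620, 910, 700, 540, 780, 430, 600]           # mm

# With two predictors the VIF is the same for both (assumed definition, see above).
r12 = pearson_r(temp, precip)
vif = 1.0 / (1.0 - r12 ** 2)

s_temp_partial = partial_sd(temp, vif, 2)
s_precip_partial = partial_sd(precip, vif, 2)
print(f"VIF: {vif:.4f}")
print(f"temp:   ordinary SD {statistics.stdev(temp):.4f}, partial SD {s_temp_partial:.4f}")
print(f"precip: ordinary SD {statistics.stdev(precip):.4f}, partial SD {s_precip_partial:.4f}")
```

The stronger the correlation between the predictors, the larger the VIF and the more the partial standard deviation shrinks relative to the ordinary one.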
Regressions on standardized variables will produce coefficients (except for the intercept) that are the same as the standardized coefficients described above. The hypothesis tests on individual standardized coefficients will be identical to those on unstandardized coefficients. Standardization might be useful if the variables are on very different scales and the magnitude of coefficients for variables with small values may not indicate their relative importance in influencing the response variable. However, it is the predictor variables that are important here and standardizing the response variable may not be necessary and will make predicted values from the model more difficult to interpret. Regression models using standardized (or simply centered) predictors are very important for detecting and treating multicollinearity and interpreting interactions between predictors (Sections 6.1.11 and 6.1.12).
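The point about leaving the response variable unstandardized can be illustrated with a short sketch. When only the predictors are standardized, the predicted values stay in the original units of $Y$, the intercept becomes the mean of $Y$, and each slope is the change in $Y$ per standard deviation of the predictor. The data and the names `ols_fit`, `zscores` and `fitted` are hypothetical and serve only this illustration.

```python
import statistics


def ols_fit(X, y):
    """Return [intercept, b1, ..., bp] for the least-squares fit of y on the rows of X."""
    rows = [[1.0] + list(r) for r in X]              # prepend the intercept column
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def zscores(v):
    """Standardize a variable to mean 0 and sample standard deviation 1."""
    m, s = statistics.mean(v), statistics.stdev(v)
    return [(vi - m) / s for vi in v]


def fitted(b, rows):
    """Predicted values from coefficients b = [intercept, slopes...]."""
    return [b[0] + sum(bj * xj for bj, xj in zip(b[1:], r)) for r in rows]


# Made-up data, as in a model with temperature and precipitation as predictors.
temp = [12.1, 14.3, 15.0, 16.8, 18.2, 19.5, 21.0, 22.4]    # degrees C
precip = [850, 620, 910, 700, 540, 780, 430, 600]           # mm
y = [14.2, 12.9, 17.1, 15.8, 15.1, 18.9, 16.0, 19.2]

rows_raw = list(zip(temp, precip))
rows_std = list(zip(zscores(temp), zscores(precip)))       # response left untouched

b_raw = ols_fit(rows_raw, y)
b_std = ols_fit(rows_std, y)

print("raw-scale coefficients:         ", b_raw)
print("standardized-predictor version: ", b_std)
print("largest difference in predictions:",
      max(abs(a - c) for a, c in zip(fitted(b_raw, rows_raw), fitted(b_std, rows_std))))
```

Both fits give the same predictions, so standardizing the predictors alone changes how the coefficients are read without changing the scale on which the response is predicted.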