
Box 6.5 Collinearity

6.1.12 Interactions in multiple regression

The multiple regression model we have been using so far is an additive one, i.e. the effects of the predictor variables on Y are additive. In many biological situations, however, we would anticipate interactions between the predictors (Aiken & West 1991, Jaccard et al. 1990) so that their effects on Y are multiplicative. Let’s just consider the case with two predictors, $X_1$ and $X_2$. The additive multiple linear regression model is:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i \tag{6.17}$$

This assumes that the partial regression slope of Y on $X_1$ is independent of $X_2$ and vice versa. The multiplicative model including an interaction is:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1} x_{i2} + \varepsilon_i \tag{6.18}$$

The new term ($\beta_3 x_{i1} x_{i2}$) in model 6.18 represents the interactive effect of $X_1$ and $X_2$ on Y. It measures the dependence of the partial regression slope of Y against $X_1$ on the value of $X_2$ and the dependence of the partial regression slope of Y against $X_2$ on the value of $X_1$. The partial slope of the regression of Y against $X_1$ is no longer independent of $X_2$ and vice versa. Equivalently, the partial regression slope of Y against $X_1$ is different for each value of $X_2$. Using the data from Paruelo & Lauenroth (1996), model 6.2 indicates that we expect no interaction between latitude and longitude in their effect on the relative abundance of C3 plants. But what if we allow the relationship between C3 plants and latitude to vary for different longitudes? Then we are dealing with an interaction between latitude and longitude and our model becomes:

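$$(\text{relative abundance of C}_3\text{ grasses})_i = \beta_0 + \beta_1(\text{latitude})_i + \beta_2(\text{longitude})_i + \beta_3(\text{latitude})_i \times (\text{longitude})_i + \varepsilon_i \tag{6.19}$$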
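As a rough illustration of the difference between models 6.17 and 6.18, the sketch below fits both by ordinary least squares with NumPy. The data are synthetic and the coefficient values are arbitrary assumptions chosen for the example; they are not the Paruelo & Lauenroth (1996) data, and `fit_ols` is a hypothetical helper name.

```python
import numpy as np

def fit_ols(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic data (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 200
x1 = rng.uniform(30, 50, n)    # stand-in for latitude
x2 = rng.uniform(95, 120, n)   # stand-in for longitude
y = 2.0 + 0.5 * x1 - 0.3 * x2 + 0.02 * x1 * x2 + rng.normal(0, 1.0, n)

b_add = fit_ols([x1, x2], y)            # additive model (6.17): b0, b1, b2
b_int = fit_ols([x1, x2, x1 * x2], y)   # multiplicative model (6.18): b0, b1, b2, b3

print("additive coefficients:      ", b_add)
print("multiplicative coefficients:", b_int)
```

The interaction is simply an extra column formed as the product of the two predictors, so any linear regression routine can fit it.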

Table 6.2 Expected values of mean squares from analysis of variance for a multiple linear regression model with two predictor variables

Mean square        Expected value
MS_Regression      $\sigma_\varepsilon^2 + \dfrac{\beta_1^2 \sum_{i=1}^{n}(x_{i1}-\bar{x}_1)^2 + \beta_2^2 \sum_{i=1}^{n}(x_{i2}-\bar{x}_2)^2 + 2\beta_1\beta_2 \sum_{i=1}^{n}(x_{i1}-\bar{x}_1)(x_{i2}-\bar{x}_2)}{2}$
MS_Residual        $\sigma_\varepsilon^2$

One of the difficulties with including interaction terms in multiple regression models is that lower-order terms will usually be highly correlated with their interactions, e.g. $X_1$ and $X_2$ will be highly correlated with their interaction $X_1 X_2$. This results in all the computational problems and inflated variances of estimated coefficients associated with collinearity (Section 6.1.11). One solution to this problem is to rescale the predictor variables by centering, i.e. subtracting their mean from each observation, so the interaction is then the product of the centered values (Aiken & West 1991, Neter et al. 1996; see Box 6.1 and Box 6.2). If $X_1$ and $X_2$ are centered then neither will be strongly correlated with their interaction. Predictors can also be standardized (subtract the mean from each observation and divide by the standard deviation), which has an identical effect in reducing collinearity.
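The effect of centering on collinearity can be checked numerically. The following sketch uses simulated predictors with assumed means and standard deviations (not data from any of the worked examples) and compares the correlation between a predictor and its product term before and after centering and standardizing; `corr_with_product` is a hypothetical helper.

```python
import numpy as np

def corr_with_product(a, b):
    """Pearson correlation between predictor a and the product term a*b."""
    return np.corrcoef(a, a * b)[0, 1]

# Simulated predictors whose means are far from zero (assumed values).
rng = np.random.default_rng(0)
x1 = rng.normal(50.0, 5.0, 500)
x2 = rng.normal(100.0, 10.0, 500)

raw = corr_with_product(x1, x2)

# Centering: subtract the mean from each observation.
c1, c2 = x1 - x1.mean(), x2 - x2.mean()
centered = corr_with_product(c1, c2)

# Standardizing: center, then divide by the sample standard deviation.
z1, z2 = c1 / x1.std(ddof=1), c2 / x2.std(ddof=1)
standardized = corr_with_product(z1, z2)

print(f"raw: {raw:.3f}  centered: {centered:.3f}  standardized: {standardized:.3f}")
```

Because a correlation is unchanged by rescaling, the centered and standardized versions give the same value, which is why the two rescalings reduce collinearity equally.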

When interaction terms are not included in the model, centering the predictor variables does not change the estimates of the regression slopes nor hypothesis tests that individual slopes equal zero. Standardizing the predictor variables does change the value of the regression slopes, but not their hypothesis tests because the standardization affects the coefficients and their standard errors equally. When interaction terms are included, centering does not affect the regression slope for the highest-order interaction term, nor the hypothesis test that the interaction equals zero. Standardization changes the value of the regression slope for the interaction but not the hypothesis test. Centering and standardization change all lower-order regression slopes and hypothesis tests that individual slopes equal zero but make them more interpretable in the presence of an interaction (see below). The method we will describe for further examining interaction terms using simple slopes is also unaffected by centering but is affected by standardizing predictor variables.
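A small numerical check of these statements about centering, under assumed simulated data: the interaction slope and its t statistic should be the same for raw and centered predictors, while the lower-order slope for $X_1$ changes. The helper names `ols` and `interaction_design` are hypothetical.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and standard errors for a design matrix X (intercept included)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # residual mean square
    return b, np.sqrt(np.diag(XtX_inv) * sigma2)

def interaction_design(x1, x2):
    """Columns: intercept, X1, X2, X1*X2 (model 6.18)."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Simulated data with assumed parameter values.
rng = np.random.default_rng(42)
n = 100
x1 = rng.normal(5.0, 2.0, n)
x2 = rng.normal(8.0, 3.0, n)
y = 1.0 + 0.5 * x1 + 0.2 * x2 + 0.3 * x1 * x2 + rng.normal(0.0, 2.0, n)

b_raw, se_raw = ols(interaction_design(x1, x2), y)
b_cen, se_cen = ols(interaction_design(x1 - x1.mean(), x2 - x2.mean()), y)
t_raw, t_cen = b_raw / se_raw, b_cen / se_cen

print("interaction slope (raw, centered):", b_raw[3], b_cen[3])
print("interaction t     (raw, centered):", t_raw[3], t_cen[3])
print("slope for X1      (raw, centered):", b_raw[1], b_cen[1])
```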

We support the recommendation of Aiken & West (1991) and others that multiple regression models with interaction terms should be fitted to data with centered predictor variables. Standardization might also be used if the variables have very different variances, but note that calculation and tests of simple slopes must then be based on analyzing standardized variables but using the unstandardized regression coefficients (Aiken & West 1991).

Probing interactions

Even in the presence of an interaction, we can still interpret the partial regression slopes for other terms in model 6.18. The estimate of $\beta_1$ determined by the OLS fit of this regression model is actually the regression slope of Y on $X_1$ when $X_2$ is zero. If there is an interaction ($\beta_3$ does not equal zero), this slope will obviously change for other values of $X_2$; if there is not an interaction ($\beta_3$ equals zero), then this slope will be constant for all levels of $X_2$. In the presence of an interaction, the estimated slope for Y on $X_1$ when $X_2$ is zero is not very informative because zero is not usually within the range of our observations for any of the predictor variables. If the predictors are centered, however, then the estimate of $\beta_1$ is now the regression slope of Y on $X_1$ for the mean of $X_2$, a more useful piece of information. This is another reason why variables should be centered before fitting a multiple linear regression model with interaction terms.
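The reason centering moves the reference point from zero to the mean can be seen by writing out the fitted model in both forms. In this sketch the starred coefficients denote estimates from the centered fit; the star notation is introduced here for illustration only.

```latex
\begin{aligned}
\text{Raw: }\quad
\hat{y}_i &= b_0 + b_1 x_{i1} + b_2 x_{i2} + b_3 x_{i1} x_{i2},
 &\text{slope on } X_1 &= b_1 + b_3 x_{i2}
 \quad (= b_1 \text{ when } x_{i2} = 0) \\
\text{Centered: }\quad
\hat{y}_i &= b_0^{*} + b_1^{*}(x_{i1}-\bar{x}_1) + b_2^{*}(x_{i2}-\bar{x}_2)
             + b_3 (x_{i1}-\bar{x}_1)(x_{i2}-\bar{x}_2),
 &\text{slope on } X_1 &= b_1^{*} + b_3 (x_{i2}-\bar{x}_2)
 \quad (= b_1^{*} \text{ when } x_{i2} = \bar{x}_2) \\
\text{Hence: }\quad
b_1^{*} &= b_1 + b_3 \bar{x}_2
\end{aligned}
```

Expanding the centered product and collecting the terms in $x_{i1}$ gives the last line; the interaction coefficient $b_3$ is the same in both fits.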

However, if the fit of our model indicates that interactions between two or more predictors are important, we usually want to probe these interactions further to see how they are structured. Let’s express our multiple regression model as relating the predicted $y_i$ to two predictor variables and their interaction using sample estimates:

$$\hat{y}_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + b_3 x_{i1} x_{i2} \tag{6.20}$$

This can be algebraically re-arranged to:

$$\hat{y}_i = (b_1 + b_3 x_{i2}) x_{i1} + (b_2 x_{i2} + b_0) \tag{6.21}$$

We now have $(b_1 + b_3 x_{i2})$, the simple slope of the regression of Y on $X_1$ for any particular value of $X_2$ (indicated as $x_{i2}$). We can then choose values of $X_2$ and calculate the estimated simple slope, for either plotting or significance testing. Cohen & Cohen (1983) and Aiken & West (1991) suggested using three different values of $X_2$: $\bar{x}_2$, $\bar{x}_2 + s$ and $\bar{x}_2 - s$, where s is the sample standard deviation of $X_2$. We can calculate simple regression slopes by substituting these values of $X_2$ into the equation for the simple slope of Y on $X_1$.
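The substitution is a one-line calculation. The sketch below evaluates the simple slope $b_1 + b_3 x_2$ at the mean of $X_2$ and at one sample standard deviation either side of it. The coefficient values and the $X_2$ observations are invented for illustration, and the function names are hypothetical.

```python
import numpy as np

def simple_slope(b1, b3, x2_value):
    """Simple slope of Y on X1 at a chosen value of X2: b1 + b3 * x2."""
    return b1 + b3 * x2_value

def probe_values(x2):
    """Mean of X2 and the mean plus or minus one sample standard deviation."""
    m, s = np.mean(x2), np.std(x2, ddof=1)
    return {"mean - s": m - s, "mean": m, "mean + s": m + s}

# Invented estimates and observations, for illustration only.
b1, b3 = 1.2, -0.4
x2_obs = np.array([2.0, 3.5, 4.0, 5.5, 7.0])

for label, value in probe_values(x2_obs).items():
    print(f"X2 = {label:8s} ({value:.2f}): simple slope = {simple_slope(b1, b3, value):.3f}")
```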

The $H_0$ that the simple regression slope of Y on $X_1$ for a particular value of $X_2$ equals zero can also be tested. The standard error for the simple regression slope is:

$$\sqrt{s_{11}^2 + 2 x_2 s_{13}^2 + x_2^2 s_{33}^2} \tag{6.22}$$

where $s_{11}^2$ and $s_{33}^2$ are the variances of $b_1$ and $b_3$ respectively, $s_{13}^2$ is the covariance between $b_1$ and $b_3$ and $x_2$ is the value of $X_2$ chosen. The variance and covariances are obtained from a covariance matrix of the regression coefficients, usually standard output for regression analyses with most software. Then the usual t test is applied (simple slope divided by standard error of simple slope). Fortunately, simple slope tests can be done easily with most statistical software (Aiken & West 1990, Darlington 1990). For example, we use the following steps to calculate the simple slope of Y on $X_1$ for a specific value of $X_2$, such as $\bar{x}_2 + s$.

1. Create a new variable (called the conditional value of $X_2$, say CVX2), which is $x_{i2}$ minus the specific value chosen.

2. Fit a multiple linear regression model for Y on $X_1$, CVX2, $X_1$ by CVX2.

3. The partial slope of Y on $X_1$ from this model is the simple slope of Y on $X_1$ for the specific value of $X_2$ chosen.

4. The statistical program then provides a standard error and t test.

This procedure can be followed for any conditional value. Note that we have calculated simple slopes for Y on $X_1$ at different values of $X_2$. Conversely, we could have easily calculated simple slopes for Y on $X_2$ at different values of $X_1$.
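The four-step conditional-value procedure and the standard error in equation 6.22 should give the same answer, and this can be verified directly. The sketch below is one possible implementation with NumPy on simulated, centered predictors (all data values are assumptions); `ols`, `slope_by_formula` and `slope_by_recentering` are hypothetical names.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and the covariance matrix of the coefficients."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return b, sigma2 * XtX_inv

def slope_by_formula(x1, x2, y, x2_value):
    """Simple slope b1 + b3*x2 with its standard error from equation 6.22."""
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    b, cov = ols(X, y)
    slope = b[1] + b[3] * x2_value
    se = np.sqrt(cov[1, 1] + 2 * x2_value * cov[1, 3] + x2_value**2 * cov[3, 3])
    return slope, se

def slope_by_recentering(x1, x2, y, x2_value):
    """Same quantities from the conditional-value refit."""
    cvx2 = x2 - x2_value                                          # step 1
    X = np.column_stack([np.ones_like(x1), x1, cvx2, x1 * cvx2])  # step 2
    b, cov = ols(X, y)
    return b[1], np.sqrt(cov[1, 1])                               # steps 3 and 4

# Simulated data with assumed parameter values; predictors are centered first.
rng = np.random.default_rng(7)
n = 80
x1 = rng.normal(0.0, 1.0, n)
x2 = rng.normal(0.0, 2.0, n)
x1, x2 = x1 - x1.mean(), x2 - x2.mean()
y = 3.0 + 1.0 * x1 - 0.5 * x2 + 0.6 * x1 * x2 + rng.normal(0.0, 1.0, n)

x2_value = x2.mean() + x2.std(ddof=1)   # one standard deviation above the mean
slope_f, se_f = slope_by_formula(x1, x2, y, x2_value)
slope_r, se_r = slope_by_recentering(x1, x2, y, x2_value)
print("formula:     slope, SE, t =", slope_f, se_f, slope_f / se_f)
print("recentering: slope, SE, t =", slope_r, se_r, slope_r / se_r)
```

The t statistic in each case is the simple slope divided by its standard error, with the residual degrees of freedom of the fitted model.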

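If we have three predictor variables, we can have three two-way interactions and one three-way interaction:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i1} x_{i2} + \beta_5 x_{i1} x_{i3} + \beta_6 x_{i2} x_{i3} + \beta_7 x_{i1} x_{i2} x_{i3} + \varepsilon_i \tag{6.23}$$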

In this model, $\beta_7$ is the regression slope for the three-way interaction between $X_1$, $X_2$ and $X_3$ and measures the dependence of the regression slope of Y on $X_1$ on the values of different combinations of both $X_2$ and $X_3$. Equivalently, the interaction is the dependence of the regression slope of Y on $X_2$ on values of different combinations of $X_1$ and $X_3$ and the dependence of the regression slope of Y on $X_3$ on values of different combinations of $X_1$ and $X_2$. If we focus on the first interpretation, we can determine simple regression equations for Y on $X_1$ at different combinations of $X_2$ and $X_3$ using sample estimates:

$$\hat{y}_i = (b_1 + b_4 x_{i2} + b_5 x_{i3} + b_7 x_{i2} x_{i3}) x_{i1} + (b_2 x_{i2} + b_3 x_{i3} + b_6 x_{i2} x_{i3} + b_0) \tag{6.24}$$

Now we have $(b_1 + b_4 x_{i2} + b_5 x_{i3} + b_7 x_{i2} x_{i3})$ as the simple slope for Y on $X_1$ for specific values of $X_2$ and $X_3$ together. Following the logic we used for models with two predictors, we can substitute values for $X_2$ and $X_3$ into this equation for the simple slope. Aiken & West (1991) suggested using $\bar{x}_2$ and $\bar{x}_3$ and the four combinations of $\bar{x}_2 \pm s_{x_2}$ and $\bar{x}_3 \pm s_{x_3}$. Simple slopes for Y on $X_2$ or $X_3$ can be calculated by just reordering the predictor variables in the model. Using the linear regression routine in statistical software, simple slopes, their standard errors and t tests for Y on $X_1$ at specific values of $X_2$ and $X_3$ can be calculated.

1. Create two new variables (called the conditional values of $X_2$ and $X_3$, say CVX2 and CVX3), which are $x_{i2}$ and $x_{i3}$ minus the specific values chosen.

2. For each combination of specific values of $X_2$ and $X_3$, fit a multiple linear regression model for Y on $X_1$, CVX2, CVX3, $X_1$ by CVX2, $X_1$ by CVX3, CVX2 by CVX3, and $X_1$ by CVX2 by CVX3.

3. The partial slope of Y on $X_1$ from this model is the simple slope of Y on $X_1$ for the chosen specific values of $X_2$ and $X_3$.
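The three steps for the three-predictor case can be sketched in the same way as for two predictors. The code below assumes simulated, centered predictors with invented parameter values and refits the model at the means and at the four combinations of the means plus or minus one standard deviation; `fit`, `three_way_design` and `simple_slope_x1` are hypothetical names.

```python
import numpy as np
from itertools import product

def fit(X, y):
    """OLS coefficients and the covariance matrix of the coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return coef, sigma2 * np.linalg.inv(X.T @ X)

def three_way_design(x1, u, v):
    """Columns in the order of model 6.23: 1, X1, U, V, X1*U, X1*V, U*V, X1*U*V."""
    return np.column_stack([np.ones_like(x1), x1, u, v,
                            x1 * u, x1 * v, u * v, x1 * u * v])

def simple_slope_x1(x1, x2, x3, y, x2_value, x3_value):
    """Simple slope of Y on X1, its standard error and t, at chosen X2 and X3."""
    cvx2, cvx3 = x2 - x2_value, x3 - x3_value                # step 1
    coef, cov = fit(three_way_design(x1, cvx2, cvx3), y)     # step 2
    slope, se = coef[1], np.sqrt(cov[1, 1])                  # step 3
    return slope, se, slope / se

# Simulated data with assumed parameter values; predictors are centered.
rng = np.random.default_rng(3)
n = 150
x1, x2, x3 = (rng.normal(0.0, 1.0, n) for _ in range(3))
x1, x2, x3 = x1 - x1.mean(), x2 - x2.mean(), x3 - x3.mean()
y = (1.0 + 0.8 * x1 + 0.3 * x2 - 0.2 * x3 + 0.5 * x1 * x2
     + 0.4 * x1 * x2 * x3 + rng.normal(0.0, 1.0, n))

s2, s3 = x2.std(ddof=1), x3.std(ddof=1)
combos = [(x2.mean(), x3.mean())] + [
    (x2.mean() + i * s2, x3.mean() + j * s3) for i, j in product((-1, 1), repeat=2)
]
results = {c: simple_slope_x1(x1, x2, x3, y, *c) for c in combos}
for (v2, v3), (slope, se, t) in results.items():
    print(f"X2={v2:6.2f}, X3={v3:6.2f}: slope={slope:.3f}, SE={se:.3f}, t={t:.2f}")
```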

With three or more predictor variables, the number of interactions becomes large and they become more complex (three-way interactions and higher). Incorporating all possible interactions in models with numerous predictors becomes unwieldy and we would need a very large sample size because of the number of terms in the model. There are two ways we might decide which interactions to include in a linear regression model, especially if our sample size does not allow us to include them all. First, we can use our biological knowledge to predict likely interactions and only incorporate this subset. For the data from Loyn (1987), we might expect the relationship between bird density and grazing to vary with area (grazing effects more important in small fragments?) and years since isolation (grazing more important in new fragments?), but not with distance to any forest or larger fragments. Second, we can plot the residuals from an additive model against the possible interaction terms (new variables formed by simply multiplying the predictors) to see if any of these interactions are related to variation in the response variable.
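The second, residual-based screening approach can be roughed out numerically. The sketch below fits the additive model, forms each pairwise product of the predictors, and reports the correlation between the residuals and each product as a numerical stand-in for inspecting the plots. Using a correlation instead of a plot, centering the predictors before multiplying, and the simulated data are all assumptions made for this example; the function name is hypothetical.

```python
import numpy as np
from itertools import combinations

def residual_product_correlations(X, y):
    """Correlate additive-model residuals with each pairwise product of centered predictors."""
    Xc = X - X.mean(axis=0)                               # center each predictor
    design = np.column_stack([np.ones(len(y)), Xc])       # additive model only
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return {(j, k): np.corrcoef(resid, Xc[:, j] * Xc[:, k])[0, 1]
            for j, k in combinations(range(X.shape[1]), 2)}

# Simulated data in which only predictors 0 and 1 interact (assumed values).
rng = np.random.default_rng(11)
n = 300
X = rng.normal(0.0, 1.0, (n, 3))
y = (1.0 + X[:, 0] + X[:, 1] + X[:, 2]
     + 0.8 * X[:, 0] * X[:, 1] + rng.normal(0.0, 1.0, n))

corrs = residual_product_correlations(X, y)
for pair, r in corrs.items():
    print(f"predictors {pair}: correlation of residuals with product = {r:.3f}")
```

A product term whose values track the additive-model residuals is a candidate interaction to include; in practice the scatterplots themselves should still be examined, since a correlation only picks up a linear pattern.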

There are two take-home messages from this section. First, we should consider interactions between continuous predictors in multiple linear regression models because such interactions may be common in biological data. Second, these interactions can be further explored and interpreted using relatively straightforward statistical techniques with most linear regression software.
