
Nonlinearities – Interactions Michael Malcolm

In document 351_metrics_f12.pdf (Page 87-94)

4 Model Selection

Unit 5.4: Nonlinearities – Interactions
Michael Malcolm

January 26, 2011

1 Motivation

Interaction terms are useful when the estimated impact of X on Y depends on a third variable. For example, one additional year of education X might be associated with a different increase in wage Y for men than for women. We would say that gender "interacts" with education in determining its effect on wage. Another example is that one additional year of experience X might be associated with a different increase in wage Y for workers with different levels of education.

The way to incorporate interactions in a regression model is to include the product of the interacted variables as a regressor. If X1 and X2 interact in their impact on Y, then the model is:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \cdot X_2) + u\]

As with polynomial expansions, it is important to note that a model should never include only an interaction term. If you include an interaction X1 ∗ X2, then you need to include X1 and X2 in the model as well, even if their coefficients are not significant.

2 Interactions with Two Dummy Variables

Suppose that Y = wage and that X1 and X2 are dummies:

X1 = 1 if male, 0 if female

X2 = 1 if college grad, 0 otherwise

For a baseline, consider the model without interactions:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u\]

In this case, β2 is the increase in wage associated with college graduation status. The problem is that, in this model, this increase β2 is the same for men and for women.

Consider now interacting the two:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \cdot X_2) + u\]

• Expected wage for a woman who did not graduate from college: β0
• Expected wage for a woman who graduated from college: β0 + β2
• Expected wage for a man who did not graduate from college: β0 + β1
• Expected wage for a man who graduated from college: β0 + β1 + β2 + β3

For women, the increase in wage from graduating from college is β2. For men, it is β2 + β3. Thus β3 is the interaction: if β3 > 0, then men expect a larger increase in their wages from graduating from college; if β3 < 0, then women expect a larger increase; and if β3 = 0, then the effect of college graduation is the same for men and for women.
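
The four expected wages listed above can be checked with a short Python sketch. The coefficient values are hypothetical and chosen only for illustration; they are not estimates from any data set.

```python
# Hypothetical coefficients for illustration only (not estimated from data)
b0, b1, b2, b3 = 10.0, 2.0, 3.0, 1.5

def expected_wage(male, college):
    """Expected wage from Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2)."""
    return b0 + b1 * male + b2 * college + b3 * (male * college)

# College premium within each gender
premium_women = expected_wage(0, 1) - expected_wage(0, 0)  # equals b2
premium_men = expected_wage(1, 1) - expected_wage(1, 0)    # equals b2 + b3

print("premium for women:", premium_women)
print("premium for men:", premium_men)
print("difference (the interaction):", premium_men - premium_women)
```

The difference between the two premiums is exactly b3, which is why the interaction coefficient is read as the extra college premium for men.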

3 Difference-in-Differences

Suppose that a law change in 1984 affected only some workers, and a researcher is investigating whether the law affected the wages of these workers. One simple estimator is to subtract the average wage in 1983 from the average wage in 1984 for the affected workers:

\[\bar{Y}_{\text{affected},1984} - \bar{Y}_{\text{affected},1983}\]

If we define a dummy variable equal to 1 in 1984 and equal to 0 in 1983, then this estimator can be parameterized as a difference estimator as discussed in unit 5.1.

The problem is that maybe everyone's wages increased in 1984, regardless of the law change. We might then think of subtracting out the change in wages experienced by workers who were not affected by the law. That is, if non-affected workers' wages rose by $1.25, but those of affected workers rose by $1.75, then we can conclude that the effect of the law change was $0.50. Written out formally, this is called a difference-in-differences (DD) estimator:

\[DD = \left(\bar{Y}_{\text{affected},1984} - \bar{Y}_{\text{affected},1983}\right) - \left(\bar{Y}_{\text{nonaffected},1984} - \bar{Y}_{\text{nonaffected},1983}\right)\]

It turns out that this can also be parameterized as a regression model. Define the following dummies:

X1 = 1 if affected by the law, 0 otherwise

X2 = 1 if observation is in 1984, 0 if observation is in 1983

Then consider the following model:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \cdot X_2) + u\]

Estimating by OLS, it turns out that the estimate of β3 is numerically equal to the DD estimator from above. Again, the OLS formulation is useful because we get all the regression statistics along with the parameter estimate, particularly an idea of the significance of the effect.
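
The numerical equality between the DD estimator and the OLS interaction coefficient can be checked on a toy data set. The sketch below uses only the Python standard library. The eight wage observations are made up so that the group changes match the $1.75 and $1.25 figures above, and `ols` is a small hypothetical helper that solves the normal equations; it is not a substitute for a regression package.

```python
# Made-up wages for illustration only: (affected, year_1984, wage)
data = [
    (1, 0, 10.0), (1, 0, 11.0),   # affected, 1983
    (1, 1, 12.0), (1, 1, 12.5),   # affected, 1984
    (0, 0, 9.0),  (0, 0, 10.0),   # not affected, 1983
    (0, 1, 10.5), (0, 1, 11.0),   # not affected, 1984
]

def mean_wage(affected, year_1984):
    wages = [w for a, t, w in data if a == affected and t == year_1984]
    return sum(wages) / len(wages)

# DD estimator computed directly from the four group means
dd = (mean_wage(1, 1) - mean_wage(1, 0)) - (mean_wage(0, 1) - mean_wage(0, 0))

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Regressors: constant, X1 (affected), X2 (1984), X1*X2
X = [[1.0, a, t, a * t] for a, t, _ in data]
y = [w for _, _, w in data]
beta = ols(X, y)

print("DD from group means:", dd)
print("OLS interaction coefficient:", round(beta[3], 10))
```

The two printed numbers should agree up to floating-point rounding. The remaining coefficients reproduce the other group comparisons: the constant is the 1983 mean for non-affected workers, and the coefficient on X2 is the change for non-affected workers.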

Figure 1: Model with no interactions: X1 continuous and X2 qualitative

4 Interactions with One Dummy and One Continuous Variable

Again let Y measure wage. Let X1 measure experience and X2 be our college graduation dummy from earlier.

X2 = 1 if college grad, 0 otherwise

Consider first the model with no interactions:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u\]

Here, β1 is the increase in wage associated with one more year of experience, while β2 is the increase in wage associated with college graduation status. However, note that the effect of one more year of experience in this model is β1 regardless of college graduation status. That is, the model does not allow for the effect of experience on wages to differ between college graduates and non-college graduates.

As a simple way to visualize this, consider figure 1.

β2 is the increase in wage for college graduates. However, in the model without interactions, the slope β1 (the impact of experience on wages) is constrained to be the same for college graduates and non-college graduates. On the basis of economic theory, though, it is reasonable to think that a college graduate would expect a larger increase in wage from additional experience than a non-college graduate.

Consider instead the model with an interaction:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \cdot X_2) + u\]

Now, for a non-college grad (i.e. X2 = 0), the impact of one more year of experience on wage is β1. However, for a college graduate (i.e. X2 = 1), wage is expected to increase by β1 + β3 as a result of one more year of experience X1. Thus, β3 is the extra increase in wage from one more year of experience for college grads, over and above the increase in wage from one more year of experience for non-college grads.

Again, to visualize the difference, consider figure 2.

A model with interactions allows different intercepts and different slopes. We are allowing for the possibility that one more year of experience has a different impact on the wages of college grads and non-college grads.
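
One consequence of different slopes is that the wage gap between the two groups is no longer a constant: it equals β2 + β3 X1 and so changes with experience. The Python sketch below illustrates this with hypothetical coefficient values chosen only for illustration.

```python
# Hypothetical coefficients for illustration only (not estimated from data)
b0, b1, b2, b3 = 5.0, 0.25, 2.0, 0.125

def expected_wage(exper, college):
    """Expected wage from Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2)."""
    return b0 + b1 * exper + b2 * college + b3 * (exper * college)

def college_gap(exper):
    """Expected wage gap between grads and non-grads: b2 + b3*exper."""
    return expected_wage(exper, 1) - expected_wage(exper, 0)

for exper in (0, 10, 20):
    print("experience:", exper, "college gap:", college_gap(exper))
```

With b3 > 0 the gap widens as experience accumulates, which corresponds to the two lines in the figure drifting apart rather than staying parallel.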

Figure 2: Model with interactions: X1 continuous and X2 qualitative

5 Interactions with Two Continuous Variables

Again studying Y = wage, let us consider the impact of X1 = experience and X2 = education. Similar to the theoretical motivation in the previous example, it is certainly conceivable that an additional year of experience could have more of an impact on wages for workers with more education.

Again, start with a model with no interactions:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u\]

In this case, the expected impact on wages from one additional year of experience is β1 and is the same for all workers, regardless of education levels. Now consider the model with an interaction:

\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \cdot X_2) + u\]

To calculate the expected impact on wage of one more year of experience, take the partial derivative:

\[\frac{\partial Y}{\partial X_1} = \beta_1 + \beta_3 X_2\]

If β3 > 0, then the effect on wage of one additional year of experience becomes stronger as education X2 rises.
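
The same calculation can be done with respect to education, which shows that the interaction is symmetric. The following worked step is a sketch that uses only the model above:

```latex
\frac{\partial Y}{\partial X_2} = \beta_2 + \beta_3 X_1,
\qquad
\frac{\partial^2 Y}{\partial X_1 \, \partial X_2} = \beta_3
```

So β3 > 0 also means that the effect on wage of one additional year of education becomes stronger as experience X1 rises.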

6 Testing

Testing whether an interaction is significant is just a matter of conducting the usual t-test to determine whether the coefficient on the interaction term is significant.

7 Example

Using the CPS85 file from before, suppose we are interested in knowing whether the effect of race on wages is different for men and for women. To test this, we need an interaction term:

\[Y = \beta_0 + \beta_1 \cdot \text{race} + \beta_2 \cdot \text{gender} + \beta_3 (\text{race} \cdot \text{gender}) + u\]

Figure 3: Do race and gender interact in determining wage?

To implement this in EViews, just enter race*gender as an independent variable in the Equation Estimation dialog. Estimating this model in EViews gives us the results in figure 3.

The coefficient on the interaction term is not significantly different from 0, since p = 0.2068. Thus, there is no evidence of an interaction effect: while race and gender each have a statistically significant association with wages, there is no evidence that the racial effect is different for men than for women.

Now, consider whether the effect of additional experience differs for men and for women. Again, we use a model with an interaction term:

\[Y = \beta_0 + \beta_1 \cdot \text{exper} + \beta_2 \cdot \text{gender} + \beta_3 (\text{exper} \cdot \text{gender}) + u\]

The results are given in figure 4.

The interaction here is significant. Writing out the estimated model, recall that the gender variable in this data set is coded as gender = 1 for females and gender = 0 for males:

\[\hat{Y} = 8.4980 + 0.0941 \cdot \text{exper} - 0.6695 \cdot \text{gender} - 0.0919 (\text{exper} \cdot \text{gender})\]

So, for a male (gender = 0), one more year of experience is associated with a wage increase of 0.0941. However, for a female (gender = 1), one more year of experience is associated with a wage increase of 0.0941 − 0.0919 = 0.0022. Furthermore, the interaction term is significant, which provides statistical evidence that the experience premium (i.e. the increase in wage associated with more experience) is higher for men than for women. An important point is that even though the coefficient on gender is not significantly different from 0, we cannot drop it from the model, since the interaction term is included.
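
The arithmetic above can be reproduced with a few lines of Python. The coefficients are the estimates reported in the text; the function names are illustrative and not part of EViews or the data set.

```python
# Estimated coefficients reported in the text (gender = 1 for females)
b0, b_exper, b_gender, b_inter = 8.4980, 0.0941, -0.6695, -0.0919

def experience_premium(gender):
    """Estimated wage change from one more year of experience."""
    return b_exper + b_inter * gender

def gender_gap(exper):
    """Predicted female-minus-male wage difference at a given experience."""
    return b_gender + b_inter * exper

print("premium, male:  ", round(experience_premium(0), 4))
print("premium, female:", round(experience_premium(1), 4))
print("gap at 20 years of experience:", round(gender_gap(20), 4))
```

Because the interaction coefficient is negative, the predicted gap between women and men grows with experience, which is the other way of reading the lower experience premium for women.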

As a final example, let us consider whether the experience premium is different for workers with different levels of education. The model is:

\[Y = \beta_0 + \beta_1 \cdot \text{exper} + \beta_2 \cdot \text{educ} + \beta_3 (\text{exper} \cdot \text{educ}) + u\]

Figure 4: Does gender interact with experience in determining wage?

The results are given in figure 5.

From EViews, the estimate of the model is:

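\[\hat{Y} = -2.8323 - 0.0272 \cdot \text{exper} + 0.7655 \cdot \text{educ} + 0.0106 (\text{exper} \cdot \text{educ})\]

Now, the estimated change in wage resulting from one more year of experience is:

\[\frac{\partial \hat{Y}}{\partial \text{exper}} = -0.0272 + 0.0106 \cdot \text{educ}\]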

For a worker with 12 years of education, the estimated effect is −0.0272 + 0.0106 × 12 = 0.10, so the worker expects a $0.10 increase in wage associated with one more year of experience. However, for a worker with educ = 16, the expected wage increase associated with one more year of experience is −0.0272 + 0.0106 × 16 = 0.1424. Notice, however, that the coefficient on the interaction term is not significant at conventional levels.
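
As a sketch, the marginal effect of experience can be evaluated at any education level with a small Python function. The two coefficients are the estimates reported in the text; the function name is illustrative.

```python
# Estimated coefficients reported in the text
b_exper, b_inter = -0.0272, 0.0106

def experience_effect(educ):
    """Estimated wage change from one more year of experience,
    evaluated at a given number of years of education."""
    return b_exper + b_inter * educ

for educ in (12, 16):
    print("educ =", educ, "effect =", round(experience_effect(educ), 4))
```

Evaluating the effect at several education levels in this way is a convenient way to report an interaction between two continuous variables, since no single coefficient summarizes it.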

Should the interaction then be discarded from the regression? Ultimately, these things need to be determined by theory. If your theoretical model of wage determination suggests that the interaction is important, then it should be left in the model regardless of the significance of the coefficient. However, statistical results can sometimes be suggestive – if theory is uncertain, then maybe this suggests something about whether or not there is really an interaction effect.
