Notes: same notation as in Table 1.
61 This is slightly different from the figure in Table 20, because it is based on household rather than individual data. This is necessary, as we can only estimate compliance rates at the household
3.8 Appendix II: a check on mean reversion
A recent paper (Chay et al., 2005) has considered the problem of mean reversion in difference-in-differences regressions. The paper found, in the context of the evaluation of school performance, that ignoring mean-reverting shocks can lead to serious over-estimation of the effect of policy measures that are aimed at the weaker schools. These findings may be relevant for the analysis in this chapter, too. If there are mean-reverting income shocks, then it is likely that some individuals will be allocated to the treatment group because of good (bad) luck. In the post-reform period, those individuals would see their incomes fall (rise) back to their permanent level. This would bias the difference-in-differences effect downwards.
The central results of this chapter, i.e. those on household data, are likely to be less affected, because shocks across family members will often cancel out, so that the variance of shocks to household incomes will be smaller.
One may also argue that the use of consumption data helps, as it is likely to represent permanent income, i.e. the part of income that is not due to an unexpected shock. In Russia, however, as argued above, savings are negligible, even in the better-off control group. While there may be some underreporting of savings, it is still likely that our income measure suffers from such mean reversion.
A related, but slightly different, worry may be that growth rates of incomes differ across the treatment and control groups for exogenous reasons. It is, for example, possible that growth rates are structurally higher for those on low incomes, e.g. if starting salaries are low but rise fast in the early years of employment.
Chay et al. (2005) suggest dealing with mean reversion problems by adding the original income to a difference-in-differences equation as follows:⁶³
E 32    $\Delta y_{it} = \beta_0 + \beta_1 y_{i,t-1} + \beta_2 T_i + u_{it}$
This will detect a difference-in-differences effect, even if there is mean reversion. Any mean reversion in this set-up should be picked up by the coefficient on original income ($\beta_1$), while the coefficient on the treatment dummy ($\beta_2$) picks up any additional change in income. An illustration of this is provided in the two top panels of Figure 8, which show hypothetical data for the relationship between increases in income ($\Delta y_{it}$) and original income ($y_{i,t-1}$). The top left panel shows how a standard difference-in-differences methodology picks up a negative treatment effect if the general relationship between original income and income changes is negative. The top right panel shows how, in this hypothetical case, none of the fall in income of the high-income group is due to the reform, as all of it can be explained by the general negative relationship. Note that the relationship need not be linear, as the regression can easily be adapted by the inclusion of further polynomials of original income.
63 This is the “regression discontinuity” approach, described as early as in Campbell and Stanley (1963). Chay et al. (2005) go far beyond that relatively simple approach and introduce numerous robustness checks, most of which cannot be adapted to our sample, which for example does not contain any obvious variables that could be used for instrumentation.
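The role of the original-income control in E 32 can be illustrated with a small simulation. This is a minimal sketch and not the procedure used in this chapter: it assumes normally distributed permanent income and transitory shocks (so that mean reversion is linear in original income), assigns treatment by a hypothetical income threshold, and builds in no reform effect at all. The function names `ols` and `simulate` and all parameter values are illustrative assumptions.

```python
import random


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=20000, seed=0, threshold=10.0):
    """Simulate incomes with transitory shocks and NO reform effect.

    Returns (naive_gap, coef_original_income, coef_treatment).
    """
    rng = random.Random(seed)
    rows, dy = [], []
    for _ in range(n):
        permanent = rng.gauss(10.0, 1.0)           # assumed permanent income
        y_prev = permanent + rng.gauss(0.0, 1.0)   # original income with shock
        y_now = permanent + rng.gauss(0.0, 1.0)    # later income, fresh shock
        treat = 1.0 if y_prev > threshold else 0.0  # hypothetical assignment rule
        rows.append([1.0, y_prev, treat])
        dy.append(y_now - y_prev)

    def mean(values):
        return sum(values) / len(values)

    treated = [d for d, r in zip(dy, rows) if r[2] == 1.0]
    control = [d for d, r in zip(dy, rows) if r[2] == 0.0]
    naive_gap = mean(treated) - mean(control)
    _, b_income, b_treat = ols(rows, dy)
    return naive_gap, b_income, b_treat


if __name__ == "__main__":
    naive_gap, b_income, b_treat = simulate()
    print("naive treatment-control gap:", round(naive_gap, 3))
    print("coefficient on original income:", round(b_income, 3))
    print("coefficient on treatment dummy:", round(b_treat, 3))
```

Under these assumptions the naive gap in income changes should come out clearly negative although no reform effect exists, while the coefficient on the treatment dummy should be close to zero once original income is controlled for. With non-normal shocks the linear control may be insufficient, which is where the polynomial terms mentioned above would come in.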
Figure 8: Illustration of regression discontinuity approach and extensions (panels divided into Control and Treatment regions)
One underlying assumption of this approach is that the treatment effect of the reform is a constant quantity, and that the relationship between income increases and original income does not change. While this seems appropriate in the case of a pure mean-reverting shock, it may be less useful if there are other reasons for a general negative relationship between income changes and original income. Here we therefore suggest a method for relaxing this assumption.
A case in which, following the reform, both the slope and the constant of a regression of income changes on original income change is illustrated in the bottom left panel of Figure 8. This shows hypothetical data for the current year in dots and hypothetical data for the previous year in dashes. Clearly, running a regression as in E 32 on the current data only would lead to the finding that the reform had no negative effect on income and that the negative relationship was entirely due to mean reversion. However, when data from the previous year are taken into account, one can note that the relationship between changes in income and original income has become steeper. There must therefore be some explanation for the additional negative effect that is not purely due to the pre-reform mean-reverting process. The upshot of this is that the relationship between changes in income and original income should be estimated on pre-reform data first, if the reform is allowed to change this relationship.
The analysis can be generalised even further, by allowing the reform to have different effects on the slopes for the treatment and control group. Such differential effects could be expected if a complicated reform affected individuals in the treatment group to different degrees. In the case of the Russian income tax reform, this might be expected because marginal tax rates were reduced more for those individuals qualifying for lower social taxes. Moreover, the absolute value of the tax cut was higher, the larger the proportion of pre-reform income that previously faced higher tax rates. A hypothetical example is given in the bottom right panel of Figure 8, where following a reform, the relationship becomes flatter for the control group and steeper for the treatment group, with the absolute jump between both groups positive, but of negligible size. Overall such data would suggest that the reform had a negative effect on income changes for the treatment relative to the control group.
More formally, the analysis as presented in the bottom right panel can be obtained by running a regression of the following form:
E 33    $\Delta y_{it} = \beta_0 + \beta_1 y_{i,t-1} + \beta_2 P_t + \beta_3 P_t T_i + \beta_4 y_{i,t-1} P_t + \beta_5 y_{i,t-1} P_t T_i + u_{it}$
Here $P_t$ is the post-reform dummy and $T_i$ the treatment dummy. This regression yields the pre-reform relationship between income changes and original income, as characterised by $\beta_0$ and $\beta_1$. $\beta_2$ and $\beta_4$ describe how the relationship changed after the reform. The differential changes for the treatment group are given by $\beta_3$ and $\beta_5$, which show the effect on the intercept and slope respectively. If both are of the same sign, the interpretation is obvious. Otherwise, the total effect needs to be calculated or, more easily, a graph of the regression lines can be inspected.
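The structure of E 33 can be made concrete by writing out the regressors for a single observation. The sketch below is illustrative only: the function names are hypothetical, the regressor order (constant, original income, post-reform dummy, then the three interactions) is an assumption chosen to match the description above, and the coefficient values in the usage note are made up.

```python
def e33_regressors(y_prev, post, treat):
    """Right-hand-side row of E 33 for one observation.

    Assumed order: constant, original income, post-reform dummy,
    post * treatment, original income * post,
    original income * post * treatment.
    """
    return [1.0,
            y_prev,
            post,
            post * treat,
            y_prev * post,
            y_prev * post * treat]


def predicted_change(beta, y_prev, post, treat):
    """Predicted income change for given coefficients (same order as above)."""
    row = e33_regressors(y_prev, post, treat)
    return sum(b * x for b, x in zip(beta, row))


def treatment_differential(beta, y_prev):
    """Post-reform gap between treatment and control at a given original income.

    Only the two treatment interactions differ between the groups, so the
    gap equals (intercept shift) + (slope shift) * y_prev.
    """
    return (predicted_change(beta, y_prev, 1, 1)
            - predicted_change(beta, y_prev, 1, 0))
```

For example, with made-up coefficients `beta = [1, 2, 3, 4, 5, 6]` and an original income of 10, `treatment_differential(beta, 10.0)` returns 64.0, i.e. the intercept shift 4 plus the slope shift 6 times 10. This is the "total effect" that has to be calculated when the two differential coefficients have opposite signs.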
When applying this to the Russian data sample, a further difficulty is encountered, as no data are available for 1999, during which no survey was undertaken. We therefore need to use data for 1998 to estimate the pre-reform relationship, dividing any income change between 1998 and 2000 by two, to average out the growth over the two years. This gives us an expanded data set, covering the years 1998 to 2001. After differencing and lagging, we then have two observations per individual, and are able to run regression E 33. The results are given in Table 36. Moreover, we repeat the regression after allowing the slopes to differ even in the pre-reform period between control and treatment group. While there is less reason to expect such a difference in slope before the reform, it could be feared that the comparison between pre- and post-reform periods may be biased, if a more restrictive functional form is assumed for one of them.
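The construction of the two observations per individual can be sketched as follows. This is a hypothetical illustration rather than the code used for Table 36: the function names and the dictionary layout are assumptions, and it is assumed that the post-reform observation is the change from 2000 to 2001, with 2000 income as original income.

```python
def annual_change(income_start, income_end, years):
    """Average change in income per year over a gap of `years` years."""
    if years <= 0:
        raise ValueError("years must be positive")
    return (income_end - income_start) / years


def build_observations(income_by_year):
    """Build the pre- and post-reform observations for one individual.

    `income_by_year` maps survey year to real income and must contain
    1998, 2000 and 2001 (there was no survey in 1999).
    """
    pre = {
        "post": 0,
        "y_prev": income_by_year[1998],
        # Two-year gap: halve the change to obtain a per-year figure.
        "dy": annual_change(income_by_year[1998], income_by_year[2000], 2),
    }
    post = {
        "post": 1,
        "y_prev": income_by_year[2000],
        "dy": annual_change(income_by_year[2000], income_by_year[2001], 1),
    }
    return [pre, post]
```

An individual with real incomes of 1000, 1300 and 1350 in 1998, 2000 and 2001 would thus contribute a pre-reform change of 150 (half of 300) and a post-reform change of 50.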
Table 36: Regression on pre- and post-reform data
Dependent variable: Increase in real income
                                                (1)             (2)
Real income, t-1                                -0.110          -0.279
                                                (0.020)***      (0.012)***
Real income, t-1 * (Treatment)                                  0.253
                                                                (0.027)***
Post-reform                                     75.493          2.148
                                                (13.548)***     (11.792)
Post-reform * (Treatment)                       262.349         262.349
                                                (170.485)       (170.506)
Real income, t-1 * (Post-reform)                0.105           0.274
                                                (0.031)***      (0.026)***
Real income, t-1 * (Post-reform) * (Treatment)  -0.185          -0.438
                                                (0.092)**       (0.096)***
Constant                                        32.157          105.502
                                                (10.203)***     (7.720)***
Observations                                    4082            4082
R-squared                                       0.08            0.12
Note: Heteroskedasticity-robust standard errors in parentheses.
These results suggest that incomes in the treatment group indeed grew more slowly than in the control group, even allowing for generally lower growth among better-off individuals. While incomes in the treatment group increase by a positive (though in both regressions statistically insignificant) fixed amount, this is outweighed by growth falling more as incomes increase. This is shown more clearly in Figure 9, which plots predicted income growth for regression (1) (results from regression (2) look very similar).
Figure 9: Regression lines from regression (1) in Table 36 (horizontal axis: original income, 0 to 5000).
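The comparison plotted in Figure 9 can also be worked out from the point estimates in Table 36. The sketch below copies the two treatment interaction coefficients from the table and computes the post-reform gap between treatment and control at a given original income, as well as the income level at which the positive fixed amount is exactly offset by the steeper slope. The function names are hypothetical, and the calculation uses point estimates only, ignoring the standard errors (the fixed amount itself is not statistically significant).

```python
# Point estimates copied from Table 36 (regressions (1) and (2)).
TABLE_36 = {
    1: {"post_treat": 262.349, "income_post_treat": -0.185},
    2: {"post_treat": 262.349, "income_post_treat": -0.438},
}


def differential(y_prev, regression=1):
    """Post-reform treatment-control gap in income growth at income y_prev."""
    coef = TABLE_36[regression]
    return coef["post_treat"] + coef["income_post_treat"] * y_prev


def break_even_income(regression=1):
    """Original income at which the gap is zero; above it the gap is negative."""
    coef = TABLE_36[regression]
    return -coef["post_treat"] / coef["income_post_treat"]


if __name__ == "__main__":
    for reg in (1, 2):
        print(reg, round(break_even_income(reg), 1), round(differential(3000, reg), 1))
```

On these estimates the gap turns negative at an original income of roughly 1418 in regression (1) and roughly 599 in regression (2). At an original income of 3000, regression (1) implies a gap of about -293.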
Overall we thus conclude that the results were not purely driven by mean reversion or more generally a pre-existing negative relationship between income changes and original income. This is reassuring, but there remain some doubts about the methodology used in this appendix, which is the reason for making it a robustness check rather than the central case.
First, the pre-reform period may not be a useful comparison, because it was also marked by a tax reform, which reduced tax rates at the upper part of the income distribution.
Second, the experiment described here cannot be easily extended to household-level regressions, as fewer complete households than individuals “survive” for three consecutive periods.
Third, we have made the assumption that growth rates are structurally different across income levels, and that this structure would not have changed without the reform. It is not clear whether these assumptions are necessarily better than the assumption of equal growth among both groups in the absence of reform.