• No results found

Summary of the Methods Used to Evaluate the Impact of PROGRESA

A

ll of the reports evaluating the impact of PROGRESA did so using either the differ-

ence-in-differences (2DIF) estimator or the cross-sectional differences (CSDIF) esti- mator of the program impact (discussed in Chapter 3). However, rather than simply comparing unconditional means between treated and control groups, all of the reports used re- gression methods. The estimation of the 2DIF and CSDIF estimates of program impact through regression methods provides the advantage of controlling for the role of observed characteristics of the individual, the household, and the locality on the variation in the ob- served value of the outcome indicator of interest. Without any adjustment for the role of such confounding factors, all the differences in the mean value of the outcome indicator are attrib- uted to the program. As a matter of principle, it is preferable to have an estimate of the impact of the program net of the influence of these observed characteristics on the difference in the mean value between treatment and control households.

The discussion that follows provides a summary of the regression methods that one can use to estimate the impact of eligibility in the PROGRESA program on a generic outcome in- dicator. It should be kept in mind that the nature of the outcome indicator (i.e., whether it is a continuous variable, such as household consumption, or a binary variable, such as child school enrollment) necessitates an appropriate econometric method for the estimation of the program

impact (such as ordinary least-squares [OLS], probit, or logit).38

To begin, consider the case where data are available for treatment and control households

before and after the start of the program.39Restricting the sample to eligible households only

(E = 1), the various estimators for program evaluation discussed earlier that control for indi-

38However, it should be kept in mind that when nonlinear methods (such as probit or logit) are employed, the es-

timated parameters of the model must be transformed into marginal effects that depend on the values of the re- gressors used in the model. The common practice in these circumstances is to estimate marginal effects at the means of the regressors.

39It is assumed that there is only one observation after the start of the program for expositional simplicity. The avail-

ability of more than one round of observations after the start of the program can be easily accommodated by includ- ing an additional binary variable (say R3) together with its interaction with the treatment dummy (R3*T). Then the coefficient of the (R3*T) term is an estimate of the 2DIF program impact estimate in the third round of the survey and can yield information on whether the impact of the program is strengthened or weakened over time (e.g., Schultz 2000a; Parker and Skoufias 2000).

vidual, household, and locality observed characteristics can be obtained by estimat-

ing a regressions equation of the form40

Y(i, t) = α + βTT(i) + βRR2 +

βTR(T(i)*R2) +

Σ

jθjXj+ η(i,v,t), (17)

where Y(i,t) denotes the value of the out- come indicator in household (or individual)

i in period t;α, β, γ, and θ are fixed param-

eters to be estimated; T(i) is an binary vari- able taking the value of 1 if the household belongs in a treatment community and 0 otherwise (i.e., for control communities);

R2 is a binary variable equal to 1 for the sec-

ond round of the panel (or the round after the initiation of the program) and equal to 0 for the first round (the round before the ini- tiation of the program); X is a vector of household (and possibly village) character-

istics; and η is an error term summarizing

the influence of random disturbances. To better understand the preceding spec- ification it is best to divide the parameters into two groups: one group summarizing differences in the conditional mean of the outcome indicator before the start of the pro-

gram (i.e., α, βT) and another group sum-

marizing differences after the start of the

program (i.e., βR,βTR). Specifically, the co-

efficient βTallows the conditional mean of

the outcome indicator to differ between eli- gible households in treatment and control localities before the initiation of the pro- gram whereas the rest of the parameters allow the passage of time to have a different effect on households in treatment and con- trol localities. For example, the combination

of parameters βRand βTRallows the differ-

ences between eligible households in treat- ment and control localities to be different after the start of the program.

One advantage of this specification is that the t-values associated with some of these parameters provide direct tests of a number of interesting hypotheses. For ex- ample, the t-value associated with the esti-

mated value βTprovides a direct test of the

equality in the conditional mean of Y be- tween treatment and control before the initi- ation of the program and serves the role of a test of the randomness in selection of local- ities. For if there were a truly random selec- tion of localities into control and treatment, then the conditional mean of the outcome indicator should be identical across treat- ment and control households/individuals.

Specifically, given the preceding specifi- cation, the conditional mean values of the outcome indicator for treatment and control groups before and after the start of the pro- gram are as follows:

[E(Y | T = 1, R2 = 1,X)] = α + βT+ βR+ βTR +

Σ

jθjXj (18a) [E(Y | T = 1, R2 = 0,X)] = α + βT+

Σ

jθjXj (18b) [E(Y | T = 0, R2 = 1,X)] = α + βR+

Σ

jθjXj (18c) [E(Y | T = 0, R2 = 0,X)] = α +

Σ

jθjXj (18d)

According to the preceding specification, the cross-sectional difference estimator is given by the expression

CSDIF = (18a – 18c) = [E(Y | T = 1, R2 = 1,X) – E(Y | T = 0, R2 = 1, X)]

= βT+ βTR, (19)

SUMMARY OF METHODS USED TO EVALUATE IMPACT OF PROGRESA 41

40It should be noted that a slightly more restrictive specification is to pool all eligible and non-eligible households

and include an additional variable denoting eligibility (E ) along with a full set of its interactions with the binary variables T and R2. This alternative specification, in addition to estimates of CSDIF and 2DIF, yields the triple difference (3DIF) or difference-in-difference-in-differences estimator which compares changes in the inequality of the outcome indicator between eligible and non-eligible households.

while the before-and-after difference esti- mator is given by

BADIF = (18a – 18b) = [E(Y | T = 1, R2 = 1,X) – E(Y | T = 1, R2 =

0, X)] = βR+ βTR. (20)

Expression (19) describing the CSDIF esti- mator highlights the fact that the estimated impact of the program is inclusive of any pre-program differences between treatment and control groups (summarized by the

presence of the βT term). Along similar

lines, expression (20) indicates that the BADIF estimator is inclusive of any trend or aggregate effects in the changes of the out- come indicator Y (summarized by the pres-

ence of the βRterm).

The advantage offered by the difference- in-differences (2DIF) estimator is that it provides an estimate of the impact of the program that is net of any pre-program dif- ferences between treatment and control households and/or any time trends or aggre- gate effects in changes of the values of the

outcome indicator.41 By comparing before

and after differences between treatment and control households (or differences between treatment and control households after and before the program), one is able to get an estimate of the impact of the program (sum-

marized by the single parameter βTR):

2 DIF = (18a – 18b) – (18c – 18d) = (18a – 18c) – (18b – 18d)

= βTR. (21)

Using the terminology of Heckman et al.

(1999), the parameter βTRprovides an esti-

mate of the “mean direct effect of treatment

on those who take the treatment.” It should also be noted the program effect summa-

rized by the parameter βTR is inclusive of

the role of the operational efficiency or in- efficiency with which the program operates. It is likely that persistent delays in the pro- cessing of forms in some states or munici- palities and other administrative bottlenecks may lead to weaker impacts of the program on households residing in these states rela- tive to those in states where the program is

operating more efficiently.42This question

was dealt with as part of the operations eval- uation of PROGRESA rather than as part of the impact evaluation of the program.

In the majority of the reports, the vector X typically consists of variables characteriz- ing the age and gender composition of the household, household size, and the age and education level of the household head and his or her spouse. Given that the vector X does not contain any supply-related vari-

ables, βTR is an estimate of the impact of

the conditional cash transfers (demand-side effects) and the improvements in the quan- tity and quality (or supply-side effects) of educational and health services and facili- ties associated with the PROGRESA pro- gram. Efforts to distinguish between the de- mand and supply effects of the program require the inclusion of supply-related vari- ables as additional regressors in equations

(18).43However, the extent to which one is

able to isolate sufficiently the demand effect from the supply effect depends on whether the observed supply-related variables cap- ture sufficiently all the supply effects of the program.

The availability of repeated observations before and after the start of the program on

41It is important to note, however, that these advantages of the 2DIF estimator are based on some implicit as-

sumptions. For example, the 2DIF estimator assumes that the time trend present among control households is an adequate representation of the trend that would have prevailed among treated households in the absence of the program.

42For example, such regional inefficiencies in the operation of the program could be captured empirically by al-

lowing the coefficient of treatment dummy variable T in expression (17) to vary by region or state.

non-eligible households in treatment areas also offers the opportunity to examine the potential effects of the program on the non- eligible households residing in the treatment communities. For example, improvements in the quantity and quality of supply of health and educational services may also benefit the non-eligible households in the treatment communities. Non-eligible house- holds may also benefit by attending volun- tarily the monthly pláticas offered in the vil- lages covered by PROGRESA. In addition, non-eligible households in treatment locali- ties may alter their behavior (such as work less or withdraw their children from school) in anticipation that such actions may qualify them for the program. An evaluation of the extent to which the program has had some indirect effects on the outcome indicator among non-eligible households in treatment areas can also be conducted by estimating a regression similar to equations (18) but re- stricted to the sample of non-eligible house- holds (E = 0).

When data for the outcome variable of interest are available for one round (or more rounds) after the start of the program, then the evaluation of the impact of the program reduces to whether the coefficient of the T(i) binary variable identifying the households residing in treatment localities, is positive and significantly different from zero. Using

the sample of eligible households only ( E = 1), then a specification that could be esti-

mated is of the form

Y(i) = α + γTT(i) +

Σ

jθjXj

+ η(i,v). (22)

In this case, an estimate of the cross- sectional difference estimator (CSDIF) is provided by

CSDIF = E(Y | T = 1,X) –

E(Y | T = 0,X) = γT. (23)

The availability of observations for more than one round after the start of the program allows one to examine whether the impact of the program is strengthened or weakened by the passage of time. For example, denot- ing observations from the third round of the evaluation survey by R3 (that is the second round after the start of the program) is a re- gression of the form

Y(i) = α + γTT(i) + δRR3 +

δRT(R3*T(i)) +

Σ

jθjXj+ η(i,v). (24)

With this specification, a statistical compar- ison of the relative sizes of the coefficients

δRT and γT (using one-tailed tests) can re-

veal whether the impact of the program in the third round is significantly greater or smaller than the impact of the program in the second round (e.g., see Hoddinott, Skoufias, and Washburn 2000; Skoufias and Parker 2001).

The discussion so far did not address the role of unobserved heterogeneity summa-

rized by the error term η(i, v, t) in the pre-

ceding regressions. One primary implica- tion of the clustering of the households with villages is that the household-specific error

terms η(i, v, t) are likely to be correlated

within each village (as well as across time). Failure to account for such a correlation may lead to a considerable bias in the esti- mated standard error of the program impact (e.g., see Murray 1998). For this reason, all of the impact evaluation reports account for the clustered nature of the sample and report robust standard error estimates for the im-

pact of the program.44

Some Important Details