3. QUALITY OF HIGHER EDUCATION AND EARNINGS: REGRESSION

3.5.1 Tests of the validity of the design

A standard concern with any RD design is the ability of individuals to precisely control the assignment variable. In our context, this can occur if students and/or graders manipulate scores in such a way that the distribution of unobservable determinants of education and earnings is discontinuous at the cutoff. The first concern is whether students themselves are able to precisely sort to either side of the cutoff, especially given that the cutoff score is known beforehand. However, the Baccalaureate exam comprises all subject matter taken during the year, most of which is in essay format, making it highly unlikely that any student can precisely control their grade. A potentially more worrying concern is whether graders are sorting students to either side of the passing threshold in a nonrandom way. Indeed, if borderline students with better future prospects are marginally passed at a higher rate than those with worse prospects, then our education and earnings estimates would most likely be upward biased.

11 The optimal local linear bandwidth for most of our specifications ranges from 1.2 to 1.5 score points.

In addressing these concerns, we consider a few tests that have become standard in the RD literature. The first informative test is to check for any discontinuity in the density of grades at the cutoff point (McCrary, 2008). The rationale behind this test is that if individuals are manipulating grades around the cutoff, then the grade distribution will be discontinuously uneven for grades just below and above the cutoff. However, a running variable with a continuous density is neither necessary nor sufficient for identification. Specifically, this test may not be as helpful if discontinuities in the grade distribution can be attributed to other exogenous factors such as grade rounding.12 As mentioned in Section 3.2.2, after the initial grading of the exams, jury members decide whether they should award extra points to individuals just short of an important cutoff. The empirical distribution in Panel A of Figure B1 is consistent with this idea. At each representative grade cutoff, we observe a dip in the number of students who are just short of said cutoff combined with a spike in the number of students who are just above it.13 This heaping is consistent with a priori expectations that jury members are bunching grades at important cutoffs.

12 See Zimmerman (2014) for such a case.

13 Recall that the cutoff grades of 8, 10, 12, 14 and 16 all serve a specific purpose in terms of awarded degree.

These distributional discontinuities could be the result of strategic cutoff crossing or of an alternative random sorting process. While the first case is obviously problematic, the latter poses no threat to identification. As highlighted in McCrary (2008): “If teachers select at random which students receive bonus points, then an ATE would still be identified.” In what follows, we provide evidence against strategic cutoff crossing.
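The heaping pattern described above (a dip just below each cutoff and a spike just above it) can be illustrated with a simple bin-count comparison. The sketch below is not the McCrary (2008) density estimator; it only counts observations in two adjacent bins. The function name, the 0.25 bin width, the simulated scores and the bonus-point rule are all assumptions made for illustration.

```python
import numpy as np

def counts_around_cutoff(scores, cutoff, width=0.25):
    """Count observations in [cutoff - width, cutoff) and [cutoff, cutoff + width)."""
    scores = np.asarray(scores, dtype=float)
    below = np.sum((scores >= cutoff - width) & (scores < cutoff))
    above = np.sum((scores >= cutoff) & (scores < cutoff + width))
    return int(below), int(above)

# Simulated raw scores (hypothetical; not the paper's data).
rng = np.random.default_rng(0)
raw = rng.uniform(6.0, 18.0, size=20_000)

# Hypothetical jury rule: a random half of the students within 0.2 points
# below a cutoff are lifted to the cutoff. Because the lift is random, this
# mimics non-strategic sorting.
final = raw.copy()
for c in (8, 10, 12, 14, 16):
    near = (raw >= c - 0.2) & (raw < c)
    lifted = near & (rng.random(raw.size) < 0.5)
    final[lifted] = c

for c in (8, 10, 12, 14, 16):
    n_below, n_above = counts_around_cutoff(final, c)
    print(f"cutoff {c}: just below = {n_below}, just above = {n_above}")
```

In this simulation the density is discontinuous at every cutoff even though the bonus points are assigned at random, which is the case the McCrary quotation describes as harmless for identification.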

In the presence of a running variable that is discontinuously distributed for exogenous reasons, an informative visual test for grade manipulation is to verify the smoothness of baseline characteristics. This test has become standard in the RD literature as an alternative and often preferred approach for testing the validity of the RD design (Lee and Lemieux, 2010). The intuition here is that if we observe discontinuities in exogenous variables, then the treatment is not randomly assigned and an average treatment effect is not identified. Further, as part of this exercise, we also check for the presence of a discontinuity in the probability of being observed in the follow-up labor force segment of the survey. Specifically, if the probability of survey response is correlated with treatment, then the standard interpretation of our treatment effect would be problematic.

All panels in Figure B2 present estimates of the effects of threshold crossing on baseline characteristics. These figures take the same form as those after them in that open circles represent local averages over a 0.25 score range. All figures represent local linear regressions within 1.5 score points of the cutoff. Further, estimates are computed using population weights with robust standard errors reported in parentheses.
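A minimal sketch of this type of estimate follows: a weighted local linear regression within a fixed bandwidth, with separate slopes on each side of the cutoff and a heteroskedasticity-robust standard error. The rectangular kernel, the HC0 variance formula, the function name and the simulated covariate are assumptions; the text does not state which kernel or robust-variance variant is used.

```python
import numpy as np

def local_linear_rd(y, score, cutoff, bandwidth, weights=None):
    """Return (jump estimate at the cutoff, HC0 robust standard error)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(score, dtype=float) - cutoff       # center the running variable
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)

    keep = np.abs(x) <= bandwidth                     # rectangular kernel (assumed)
    y, x, w = y[keep], x[keep], w[keep]

    d = (x >= 0).astype(float)                        # threshold-crossing indicator
    X = np.column_stack([np.ones_like(x), d, x, d * x])

    # Weighted least squares: beta = (X'WX)^{-1} X'Wy
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))

    # HC0 sandwich variance: (X'WX)^{-1} [sum_i w_i^2 e_i^2 x_i x_i'] (X'WX)^{-1}
    resid = y - X @ beta
    meat = X.T @ ((w * resid)[:, None] ** 2 * X)
    bread = np.linalg.inv(XtWX)
    cov = bread @ meat @ bread
    return beta[1], np.sqrt(cov[1, 1])

# Simulated baseline covariate that is smooth in the score (hypothetical data).
rng = np.random.default_rng(1)
score = rng.uniform(6.0, 14.0, size=5_000)
covariate = 0.3 * score + rng.normal(0.0, 1.0, size=score.size)

est, se = local_linear_rd(covariate, score, cutoff=10.0, bandwidth=1.5)
print(f"estimated jump = {est:.3f} (robust s.e. {se:.3f})")
```

Passing survey weights through the `weights` argument corresponds to the population-weighted estimates described above; with `weights=None` every observation counts equally.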

We first check for the presence of a discontinuity in the averaged score of the oral and written French literature portion of the Baccalaureate exam. There are two advantages to looking at this variable. First, these exams are administered in grade 11, one year before all other Baccalaureate tests. In that sense, it is a very recent indicator of student ability. Second, jury members cannot award extra points on this particular component of the Baccalaureate exam. Panel A of Figure B2 reveals an insignificant treatment effect (0.0196) on the average score of the French literature exam. We further test for a discontinuity in the Brevet national exam test scores. This high-stakes exam is taken in grade 9 and is required for entry into high school, with the grading scale also running from 0 to 20. We have the averaged score for the three major components of the Brevet exam (Mathematics, French and foreign language). We also look at another national exam taken at the beginning of grade 6. The goal of this exam is to evaluate the level of students in mathematics and its grading scale is from 0 to 78. In Panel B of Figure B2, we find an insignificant treatment estimate (0.158) on Brevet scores. Panel C of Figure B2 also reveals an insignificant treatment effect (-0.847) on the mathematics exam scores in grade 6. This eases concerns that jury members might be sorting students around the cutoff based on their academic ability.

In Panel D, we check for the presence of a discontinuity in the likelihood of being from a high socioeconomic status (SES) background. We also find no significant effect (0.022). Further, in Panels E through G, we check for the smoothness of covariates that are known to affect education and wages, but that should be independent of treatment. Estimates on gender (0.0029), order of birth (-0.098) and number of siblings (0.138) are all statistically insignificant. To alleviate any concerns over the bandwidth and/or functional form chosen, we present the baseline characteristics over varying functional forms and bandwidths in Table B3. All estimates remain statistically insignificant. Finally, we show that the predicted Baccalaureate score, as a function of the above covariates, is continuous at the cutoff. The two panels in Figure B3 highlight these results using a local linear and a global polynomial fit, respectively.

These results reject the hypothesis of strategic threshold crossing in favor of a non-strategic sorting hypothesis. They are also consistent with the fact that students’ identities are never disclosed to either graders or jury members.

As highlighted in Barreca et al. (forthcoming, 2015), heaping in the running variable can have serious consequences if it is associated with determinants of the outcome variables. However, this will only bias the estimates to the extent that it creates imbalances in outcome determinants around the cutoff. Therefore, as a complement to our balanced characteristics test, we implement additional checks to further investigate the existence of strategic sorting. Specifically, we run ‘Donut type’ RDs that deal with the heaped data at each cutoff. Panel B of Figure B1 highlights the new distribution of grades resulting from Donut type RD regressions, which essentially involve cutting out all potentially manipulable data points. We implement these regressions in Section 3.5.6, with the main results remaining unchanged.
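The idea behind a Donut type RD can be sketched as follows: observations within a small radius of the cutoff are dropped, and the jump is then estimated from the remaining points within the bandwidth. The donut radius of 0.25, the unweighted separate linear fits on each side, the function name and the toy data are assumptions for illustration; the exact exclusion rule applied in Section 3.5.6 is not specified here.

```python
import numpy as np

def donut_rd(y, score, cutoff, bandwidth, donut):
    """Jump at the cutoff after dropping observations with |score - cutoff| < donut."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(score, dtype=float) - cutoff
    keep = (np.abs(x) <= bandwidth) & (np.abs(x) >= donut)
    left = keep & (x < 0)
    right = keep & (x >= 0)
    # np.polyfit with deg=1 returns [slope, intercept]; the intercepts are the
    # fitted values at the cutoff from each side.
    _, a_left = np.polyfit(x[left], y[left], 1)
    _, a_right = np.polyfit(x[right], y[right], 1)
    return a_right - a_left

# Toy data (hypothetical): the true jump is 1.5, but outcomes in the heap just
# above the cutoff are shifted, as they would be under strategic sorting.
score = np.linspace(8.5, 11.5, 601)
x = score - 10.0
outcome = 2.0 + 0.7 * x + 1.5 * (x >= 0)
outcome = outcome + 3.0 * ((x >= 0) & (x < 0.1))

naive = donut_rd(outcome, score, cutoff=10.0, bandwidth=1.5, donut=0.0)
donut = donut_rd(outcome, score, cutoff=10.0, bandwidth=1.5, donut=0.25)
print(f"no donut: {naive:.3f}, with donut: {donut:.3f}")
```

Dropping the heaped points costs precision and relies more heavily on the linear extrapolation to the cutoff, which is why such estimates are best read as a robustness check alongside the main specification.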

Finally, if marginally failing students were more likely to leave the country in order to have access to higher quality universities, or if they endogenously chose not to respond to the follow-up survey as a result of failing, then the interpretation of our results would be problematic. As an important RD validity check, we show that there is no significant threshold crossing effect on the likelihood of being observed in the follow-up wage survey. These results are reported in Panel H of Figure B2 and Table B3. The absence of any differential selection into the earnings sample alleviates concerns about students leaving the sample as a result of barely failing the French Baccalaureate exam.