
6.4 2SLS Estimates

6.5 Robustness and Validity

The fact that the treatment is determined discontinuously at the cutoff is not a sufficient condition for the RDD to be valid. If applicants are able to precisely manipulate their GPA so that they exceed the threshold of a particular high school, the RDD can be invalid (Lee and Lemieux 2010). To precisely manipulate their GPA to exceed the threshold, students would have to be able to define their own GPA exactly. I find this implausible, even though students undoubtedly have some control over their GPA. Moreover, grades can be seen as noisy measures of ability, as Abdulkadiroğlu et al. (2014) view tests. Thus, there is some randomness in one's comprehensive school grades and therefore also in GPA.

Besides having perfect control over their own GPA, applicants would also need to know the exact threshold of a particular school in order to precisely manipulate the running variable to exceed it. Of course, they can use the thresholds from previous years to predict where the threshold will lie in the year of their application, but in most cases the threshold changes from year to year.²¹ So even though applicants have some control over their GPA and may have a good guess of what the admission cutoff might be, there are still two sources of randomness here, and therefore one can assume that applicants cannot precisely manipulate their GPA so that they are assigned to treatment.

One can examine empirically whether this assumption is plausible. The standard way to do this is the test proposed by McCrary (2008). However, as Zimmerman (2014) remarks, this test is not suitable if the discontinuities at the threshold can be traced to other factors. This is the case in my setting. Here, some GPAs are more common than others, which causes discontinuities in the distribution of GPA, as can be seen from Figure 7.²² Again, the GPA variable here is GPA multiplied by 100, so, for example, 800 means a GPA of 8.00. There are peaks at every integer GPA from 6 to 10 and at 6.50, 7.50 and 8.50. Also, one can see that GPAs like 8.10, 8.20 and 8.30 are more common than the GPAs between them. The admission cutoffs often land at this kind of discontinuity point, which can be seen in Figure 8, where I show the densities in the sharp samples of individual schools and the densities in the full sample. I have restricted the examination to those who are within one grade point of the threshold and set the bin width to 0.1 grade points to make the figure more informative.²³ If one looks at the applicants (i.e. the sharp sample) of Ressu, Norssi, Viikki or Etelä-Tapiola, one could suspect that there is some kind of manipulation at the thresholds of these high schools. Yet, if one looks at the full sample (which also includes those who have not applied to that particular school), one can see that there are discontinuities at these thresholds as well.

The test proposed by McCrary (2008) rejects the null hypothesis of continuity of the running variable at the cutoffs of Ressu, Norssi, Viikki and Etelä-Tapiola when I consider their sharp samples. However, the null hypothesis is also rejected at these cutoffs when I consider the full sample. Märsky is the only school where the McCrary test does not reject the null hypothesis for the full sample, and the same is true for its sharp sample. This suggests that the discontinuities at the thresholds of Ressu, Norssi, Viikki and Etelä-Tapiola are at least partly caused by factors other than (precise) manipulation.

As Lee and Lemieux (2010) recommend, I also examine the continuity of covariates that were determined before the treatment assignment. These covariates are gender, parental education and comprehensive school grades in non-academic subjects.

²¹ The threshold equals the threshold of the previous year in 5 of the 45 cases.

²² This results from the fact that the number of theoretical subjects in comprehensive school is 10 plus additional foreign languages. This means that the number of possible GPAs is limited. For example, if a student does not take additional foreign languages in comprehensive school, her GPA is constructed from 10 grades, which leads to 60 potential GPAs.

²³ Also, those who are not observed to receive an offer at the cutoff point are set to the bin left of the cutoff.

[Figure 7 plot: histogram with GPA multiplied by 100 on the x-axis (400 to 1000) and density on the y-axis (0 to .03)]

Figure 7: Distribution of GPA in full sample
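The density check discussed in this section can be illustrated with a simplified sketch in the spirit of McCrary (2008): bin the running variable, fit a triangular-kernel local linear regression to the bin heights on each side of the cutoff, and compare the log heights at the cutoff. This is not the implementation used for the thesis results. The function names, the bin width of 10, the bandwidth of 50 and the simulated samples are all assumptions made for illustration, and the standard error uses the asymptotic approximation from McCrary (2008) as I understand it.

```python
import numpy as np


def fit_at_zero(x, y, h):
    """Triangular-kernel local linear fit of y on x, evaluated at x = 0."""
    w = np.clip(1.0 - np.abs(x) / h, 0.0, None)
    keep = w > 0
    sw = np.sqrt(w[keep])
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
    return beta[0]


def density_log_jump(r, bin_width=10.0, h=50.0):
    """Log difference in estimated density heights at the cutoff r = 0."""
    r = np.asarray(r, dtype=float)
    n = r.size
    # Bin edges are multiples of bin_width, so no bin straddles the cutoff.
    lo = np.floor(r.min() / bin_width) * bin_width
    hi = np.ceil(r.max() / bin_width) * bin_width
    edges = np.arange(lo, hi + bin_width / 2, bin_width)
    counts, _ = np.histogram(r, bins=edges)
    mids = (edges[:-1] + edges[1:]) / 2
    dens = counts / (n * bin_width)
    f_left = fit_at_zero(mids[mids < 0], dens[mids < 0], h)
    f_right = fit_at_zero(mids[mids > 0], dens[mids > 0], h)
    theta = np.log(f_right) - np.log(f_left)
    # Assumed asymptotic standard error of the log difference.
    se = np.sqrt(4.8 / (n * h) * (1.0 / f_right + 1.0 / f_left))
    return theta, se


# Hypothetical data: one smooth sample and one with a density jump at zero.
rng = np.random.default_rng(0)
smooth = rng.uniform(-100, 100, size=20000)
to_right = rng.random(20000) < 0.6
jumpy = np.where(to_right,
                 rng.uniform(0, 100, size=20000),
                 rng.uniform(-100, 0, size=20000))

for name, sample in [("smooth", smooth), ("jump", jumpy)]:
    theta, se = density_log_jump(sample)
    print(f"{name}: log jump = {theta:.3f}, s.e. = {se:.3f}")
```

In the setting of this section, one would apply such a function separately to the sharp sample and to the full sample of a school. A jump of similar size in both samples points to heaping of GPA values rather than to manipulation by applicants.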

Table 6 presents the estimated discontinuities in the predetermined covariates. There is one discontinuity that is significant at the 5 percent level, which is not surprising: since there are 48 estimates, one would expect to observe on average 2.4 significant estimates by chance alone. Otherwise there are no discontinuities that are significant at the 5 percent level. This means that the applicants just to the right of the threshold do not differ significantly from the applicants just to the left of the threshold in terms of the observable characteristics studied here. This supports the validity of the RDD in this setting.
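The arithmetic behind the expected number of chance rejections can be checked with a few lines. The probability of at least one rejection additionally assumes that the 48 tests are independent. That is an assumption made only for this illustration, and it does not hold exactly here, because the pooled column reuses the school-level samples.

```python
# 8 covariates times 6 columns give the 48 estimates discussed in the text.
n_estimates = 48
alpha = 0.05  # 5 percent significance level

# Expected number of rejections if every null hypothesis is true.
expected_rejections = n_estimates * alpha

# Probability of at least one rejection, assuming independent tests.
prob_at_least_one = 1 - (1 - alpha) ** n_estimates

print(f"expected rejections: {expected_rejections:.1f}")
print(f"P(at least one rejection): {prob_at_least_one:.3f}")
```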

Imbens and Lemieux (2008) recommend testing whether there are jumps at points other than the cutoff. The points they suggest are the medians of the subsamples below and above the threshold. Because this test considers these subsamples separately, the sample sizes become quite small. Therefore I estimate the results for the pooled sample, which includes all of the sharp samples. The medians below and above the cutoff are at -43 and at 37, respectively. The results using these placebo cutoffs are presented in Table 7. There are no significant jumps in outcomes at these placebo cutoffs except at the upper one in English. This could suggest that the elite high school treatment has nonlinear effects on Matriculation Examination grades in English, so that very high-achieving students who cross the threshold by a clear margin actually benefit from attending an elite high school. This could also explain why the OLS estimates in Table A.2 show that attending an elite school is positively associated with Matriculation Examination results in English, but not with results in mathematics or in mother tongue, when baseline GPA and parental education are controlled for.

[Figure 8 panels: histograms with the standardized running variable on the x-axis (-100 to 100) and density on the y-axis, shown for the sharp sample and the full sample of each school: Ressu, Norssi, Viikki, Märsky and Etelä-Tapiola]

Figure 8: Histograms of GPA distributions in sharp samples and in full sample

Note: Applicants in the sharp samples with value zero of the running variable are in the bin just to the right of the cutoff if they were observed to receive an offer; otherwise they are in the bin just to the left of the cutoff.

Table 6: Continuity of covariates

Outcome             Ressu      Norssi     Viikki     Märsky     E-T        All Schools
                    (1)        (2)        (3)        (4)        (5)        (6)
Gender              0.055      -0.240*    -0.018     -0.022     -0.002     -0.025
                    (0.073)    (0.134)    (0.105)    (0.088)    (0.080)    (0.040)
                    2,319      1,024      1,152      992        1,887      7,037
HEF                 0.011      -0.143     -0.009     -0.024     0.124      0.031
                    (0.077)    (0.138)    (0.111)    (0.110)    (0.086)    (0.036)
                    2,317      1,030      1,154      992        1,888      7,034
HEM                 -0.044     0.005      -0.026     0.135      0.113*     0.044
                    (0.068)    (0.111)    (0.087)    (0.097)    (0.068)    (0.039)
                    2,317      1,030      1,154      992        1,888      7,034
Physical Education  -0.077     0.245      0.091      -0.226**   0.238*     -0.027
                    (0.124)    (0.249)    (0.148)    (0.113)    (0.123)    (0.070)
                    2,304      1,021      1,146      987        1,880      7,004
Home Economics      0.002      0.063      0.159      -0.095     -0.061     -0.012
                    (0.094)    (0.169)    (0.161)    (0.152)    (0.096)    (0.052)
                    1,981      874        997        844        1,658      6,073
Visual Arts         0.102      0.0175     0.243      -0.227     -0.0164    0.0412
                    (0.113)    (0.250)    (0.189)    (0.147)    (0.119)    (0.077)
                    2,058      900        1,013      861        1,670      6,209
Music               0.018      -0.243     -0.192     -0.122     -0.144     -0.119*
                    (0.127)    (0.200)    (0.135)    (0.159)    (0.120)    (0.067)
                    2,279      1,013      1,140      979        1,873      6,949
Handicraft          -0.050     0.047      0.059      0.065      0.083      -0.013
                    (0.122)    (0.179)    (0.128)    (0.141)    (0.127)    (0.065)
                    2,261      1,016      1,133      979        1,872      6,929

Standard errors in parentheses and N under them. Significance levels: *** 1%, ** 5%, and * 10%.

Table 7: Placebo cutoffs

Cutoff    Mother tongue  English    Math A     Math B     ME GPA
          (1)            (2)        (3)        (4)        (5)
-43       -0.264         0.071      -0.826     -0.671     -0.221
          (0.203)        (0.285)    (0.538)    (0.460)    (0.165)
          2,121          2,155      937        869        2,216
37        0.197          0.260**    -0.064     -0.320     0.033
          (0.133)        (0.131)    (0.203)    (0.307)    (0.089)
          4,278          4,295      2,829      1,051      4,361

Standard errors in parentheses and N under them. Significance levels: *** 1%, ** 5%, and * 10%.
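The logic of the placebo-cutoff check can be sketched as follows: split the sample at the true cutoff, take the median of the running variable within each half as a placebo cutoff, and estimate the jump at that point using only observations from the same half. The sketch returns point estimates only and omits inference. The function names, the bandwidth of 30 and the simulated data (with a true jump of 0.5 at zero and none elsewhere) are assumptions made for illustration and do not reproduce the estimates in Table 7.

```python
import numpy as np


def side_fit(x, y, c, h):
    """Triangular-kernel local linear fit of y on x, evaluated at x = c."""
    k = np.clip(1.0 - np.abs(x - c) / h, 0.0, None)
    keep = k > 0
    # np.polyfit weights multiply the residuals, hence the square root.
    slope, intercept = np.polyfit(x[keep] - c, y[keep], 1, w=np.sqrt(k[keep]))
    return intercept


def jump_at(x, y, c, h):
    """Difference between the right and left local linear fits at c."""
    right = x >= c
    return side_fit(x[right], y[right], c, h) - side_fit(x[~right], y[~right], c, h)


def placebo_jumps(r, y, h):
    """Jumps at the medians of the subsamples below and above the cutoff 0."""
    out = {}
    for label, mask in (("below", r < 0), ("above", r >= 0)):
        c = np.median(r[mask])
        # Only the subsample on one side of the true cutoff is used.
        out[label] = (c, jump_at(r[mask], y[mask], c, h))
    return out


# Hypothetical data: the outcome jumps by 0.5 at the true cutoff only.
rng = np.random.default_rng(1)
r_sim = rng.uniform(-100, 100, size=8000)
y_sim = 4.0 + 0.01 * r_sim + 0.5 * (r_sim >= 0) + rng.normal(0, 1, size=8000)

for label, (c, jump) in placebo_jumps(r_sim, y_sim, h=30.0).items():
    print(f"placebo cutoff {label} ({c:.1f}): jump = {jump:.3f}")
print(f"true cutoff: jump = {jump_at(r_sim, y_sim, 0.0, 30.0):.3f}")
```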

Because Gelman and Imbens (2014) recommend using local low-order polynomials, such as local linear or quadratic polynomials, instead of global high-order polynomials, I have used local linear regression throughout this thesis. As a robustness check, I also estimate the results (not presented here) using quadratic polynomial regression. This estimation leads to qualitatively similar ITT and LATE estimates, except that the LATE estimates for Norssi are implausibly large in magnitude, albeit insignificant.

In the following sensitivity checks, I focus on the mother tongue grade, because during the period I study it has remained the only compulsory exam one has to take to pass the Matriculation Examination.

To verify the robustness of the results, I estimate the ITT and LATE for the mother tongue grade using 20 different bandwidths ranging from 5 to 100. It can be seen from Figures 9 and 10 that the results are robust to the bandwidth choice. The dots represent the estimates and the lines the 95 percent confidence intervals. The dots with red confidence intervals represent the estimates and confidence intervals resulting from the MSE-optimal bandwidth I have used throughout this thesis.

In Figure 9, the estimates and the confidence intervals for the bandwidth of 5 are omitted for Norssi and Viikki. In the case of Norssi, the reason is that there are not enough observations in the sharp sample within the bandwidth of 5. For Viikki, the reason is that the estimate for the bandwidth of 5 is quite imprecise, and including it would make the graph difficult to interpret. Also, in this case the estimate does not differ significantly from zero.

In Figure 10, the estimates and the confidence intervals are omitted for the bandwidth of 5 except for Märsky and Etelä-Tapiola. In the case of Norssi, the reason is again the sample size. For the other schools, the reason is that the estimate for the bandwidth of 5 is quite imprecise, and including it would make the graph difficult to interpret. In addition, the bandwidth of 15 has been omitted for Norssi for the same reason. As one can see, zero is included in most of the confidence intervals, and therefore I conclude that the results do not depend on the bandwidth choice.
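The bandwidth sensitivity exercise can be sketched as a loop over bandwidths, each time re-estimating the jump at the cutoff with a triangular-kernel local linear regression and a heteroskedasticity-robust standard error. This is a minimal illustration rather than the estimation code behind Figures 9 and 10. The function name, the evenly spaced grid of bandwidths (5, 10, ..., 100) and the simulated data with no true jump are assumptions.

```python
import numpy as np


def rd_local_linear(r, y, h):
    """Jump at r = 0 and its robust standard error for bandwidth h."""
    k = np.clip(1.0 - np.abs(r) / h, 0.0, None)  # triangular kernel weights
    keep = k > 0
    r, y, k = r[keep], y[keep], k[keep]
    d = (r >= 0).astype(float)
    # Separate intercepts and slopes on each side of the cutoff.
    X = np.column_stack([np.ones_like(r), d, r, d * r])
    bread = np.linalg.inv(X.T @ (X * k[:, None]))
    beta = bread @ (X.T @ (k * y))
    resid = y - X @ beta
    # Sandwich (heteroskedasticity-robust) covariance for weighted least squares.
    meat = X.T @ (X * (k * resid)[:, None] ** 2)
    cov = bread @ meat @ bread
    return beta[1], np.sqrt(cov[1, 1])


# Hypothetical data with no discontinuity at the cutoff.
rng = np.random.default_rng(2)
r_sim = rng.uniform(-100, 100, size=5000)
y_sim = 5.0 + 0.01 * r_sim + rng.normal(0, 1, size=5000)

results = []
for h in np.arange(5, 101, 5):  # 20 bandwidths from 5 to 100
    est, se = rd_local_linear(r_sim, y_sim, h)
    results.append((h, est, se))
    low, high = est - 1.96 * se, est + 1.96 * se
    print(f"h = {h:3d}: estimate = {est:6.3f}, 95% CI = [{low:6.3f}, {high:6.3f}]")
```

Small bandwidths use few observations, so their confidence intervals are wide. This is the same pattern that motivates omitting the bandwidth of 5 from some panels.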

[Figure 9 panels: ITT estimates with 95 percent confidence intervals (y-axis: estimate) plotted against bandwidth (x-axis, 0 to 100) for Ressu, Norssi, Viikki, Märsky, Etelä-Tapiola and All Schools]

Figure 9: Sensitivity of the ITT estimates to bandwidth choice

Note: For Norssi and Viikki the estimates and the confidence intervals are omitted for the bandwidth of 5.

[Figure 10 panels: LATE estimates with 95 percent confidence intervals (y-axis: estimate) plotted against bandwidth (x-axis, 0 to 100) for Ressu, Norssi, Viikki, Märsky, Etelä-Tapiola and All Schools]

Figure 10: Sensitivity of the LATE estimates to bandwidth choice

Note: The estimates and the confidence intervals are omitted for the bandwidth of 5 except for Märsky and Etelä-Tapiola. In the case of Norssi, the reason is that there are not enough observations in the sharp sample within the bandwidth of 5. For the other schools, the reason is that the estimate for the bandwidth of 5 is highly imprecise (and does not differ significantly from 0), and including it would make the graph difficult to interpret. This is also the reason why the bandwidth of 15 has been omitted for Norssi.

It is possible that not all applicants above the threshold receive an offer (Virtanen 2016). Therefore I run a fuzzy RDD using the observed offer as an instrument. These estimates give the effect of an offer for those who actually receive the offer. The results are presented in Table A.8 in the Appendix. The estimates are quite similar to the reduced-form estimates, even though the magnitude of the negative mother tongue estimate of Norssi is smaller and the estimate of Matriculation Examination GPA is insignificant.²⁴

Since these estimates tell mostly the same story as the reduced-form estimates, I can reject the possibility that measurement error has a significant effect on the results, except for Norssi, where the negative estimates seem to be at least partly driven by measurement error.
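For reference, the generic fuzzy RDD estimand can be written as a ratio of two discontinuities at the cutoff. The notation is an assumption introduced here for illustration: $Y_i$ is the outcome, $O_i$ indicates an observed offer, and $R_i$ is the standardized running variable with the cutoff at zero.

```latex
\tau_{\text{fuzzy}}
  = \frac{\lim_{r \downarrow 0} E[Y_i \mid R_i = r] - \lim_{r \uparrow 0} E[Y_i \mid R_i = r]}
         {\lim_{r \downarrow 0} E[O_i \mid R_i = r] - \lim_{r \uparrow 0} E[O_i \mid R_i = r]}
```

The numerator is the reduced-form jump in the outcome and the denominator is the jump in the probability of an observed offer. When nearly all applicants above the threshold are observed to receive an offer and nearly none below it are, the denominator is close to one, and the fuzzy estimates stay close to the reduced-form estimates.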

One could argue that the reason I find no systematic positive effects is that students in elite high schools take more exams in the Matriculation Examination than students in other schools do. This could mean that students in elite high schools have less time to focus on individual exams, which would lower their individual grades and therefore also their Matriculation Examination GPA. For example, consider a student from a non-elite high school who takes six exams and gets the best grade, laudatur, in all of them. Now consider a student from an elite high school who takes seven exams and gets laudatur in six of them but eximia in one. The latter student has a lower Matriculation Examination GPA than the first, even though one could think that the latter student has done better in the Matriculation Examination. Moreover, if the eximia came from an individual exam I have analyzed (e.g. mother tongue), it could look as if attending an elite high school made the student worse off.
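The example can be made concrete with a small calculation. The numeric coding of grades used below (laudatur = 7, eximia = 6) is an assumption made for illustration, since this section does not state how grades are converted to points.

```python
# Assumed numeric coding of the two grades in the example.
POINTS = {"laudatur": 7, "eximia": 6}


def me_gpa(grades):
    """Average grade points over all exams taken."""
    return sum(POINTS[g] for g in grades) / len(grades)


non_elite_student = ["laudatur"] * 6            # six exams, all laudatur
elite_student = ["laudatur"] * 6 + ["eximia"]   # seven exams, one eximia

print(f"non-elite student GPA: {me_gpa(non_elite_student):.3f}")
print(f"elite student GPA:     {me_gpa(elite_student):.3f}")
```

Under this coding the second student has a lower average (48/7, about 6.86, against 7.00) despite passing one more exam with a near-top grade.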

To address this concern, I check whether the number of exams taken changes discontinuously at the threshold. Table A.7 shows that there are no discontinuities in the number of exams taken in the Matriculation Examination except for Etelä-Tapiola.

It could also be argued that students in elite high schools take harder exams more often than students in other schools. I do not find it meaningful to rank subjects in terms of their difficulty except when a single subject has different levels of difficulty. This is true for mathematics, which has both basic and advanced course exams. Also, languages have two or three of the levels basic, intermediate (keskipitkä) and advanced course. However, very few take intermediate exams in any language except Swedish. Therefore, besides mathematics, I consider here only the number of advanced level language exams taken. As Table A.7 shows, the probability of taking the advanced mathematics exam does not jump at the thresholds of interest, which means that attending an elite high school does not have an effect on taking the harder mathematics exam. Also, one can see that the number of advanced language exams is not discontinuous at any of the thresholds. The same is true for the number of exams taken in the Matriculation Examination, except for Etelä-Tapiola. At the threshold of Etelä-Tapiola, the number of exams taken jumps significantly at the 5 percent level, so that applicants crossing the Etelä-Tapiola threshold take on average 0.47 more exams than applicants who do not cross the threshold. So studying in Etelä-Tapiola might encourage students to take more exams. This means that for Etelä-Tapiola, a possible explanation for the non-existent effects on Matriculation Examination grades could be that students focus on a greater number of exams, which lowers their Matriculation Examination GPA. Taking more exams can of course be a positive effect in itself.

7 Conclusions

The admission process to high schools in Finland produces entrance thresholds for each school every year. Since those whose GPA is above the entrance threshold are offered a seat and those whose GPA is below the threshold are not, these thresholds create a quasi-experimental setting, which I am able to exploit using a regression discontinuity design. In some of the schools these thresholds are very high from year to year. Therefore, these schools can be called 'elite', because the students in those schools have a high baseline GPA from comprehensive school. When a student attends one of these schools, she gets on average higher-achieving peers in terms of GPA than she would have got otherwise.

In the title of this thesis I pose a question: Does elite school attendance affect learning outcomes? The evidence suggests that an elite high school offer or elite high school enrollment does not have systematic positive or negative effects on Matriculation Examination grades. This means that I do not find evidence that the elite high school environment has an effect on learning outcomes, even though one gets better peers in terms of GPA when she crosses the entrance threshold of an elite high school. Even though Kanninen (2013) finds evidence that peer homogeneity improves learning outcomes, this is not the case in my study. Neither the proportion of girls nor parental education has an effect. If I assume that elite high schools are similar to regular schools in dimensions other than peer quality, I can conclude that peer quality is not an important input in the education production function. If these schools are not similar in other dimensions, it is of course possible that some or all of these dimensions have a positive effect, but that there are also some negative effects, so that these effects cancel each other out. For example, one could think that the reason elite schools here do not have a positive effect on learning outcomes is teacher quality. Hoekstra et al. (2016) conclude that top tier high schools in China are better because of more qualified teachers. In Finland, measurable differences in teacher quality are quite small, at least in terms of qualification, since most high school teachers have a master's degree. This could be the reason I do not find systematic positive elite high school effects.

One could also think that the reason I do not find any systematic effects is that there are differences in high school spending across the schools. I do not observe the resources of schools, but I find it implausible that elite high schools would have significantly fewer resources than others. Also, as Häkkinen et al. (2003) show, changes in school spending in Finland do not have significant effects on Matriculation Examination results.

ment itself but the rank of a student in her peer group. Since I study those who are just to the left and just to the right of the threshold, the students in the treatment group are among the worst in their school in terms of GPA, while the students in the control group are probably not. If a student's rank in her peer group affects her test scores, as Cicala et al. (2016) suggest, the potential positive effects of elite high schools could be cancelled out by this effect.

However, the results presented here suggest that there might be some disadvantages and benefits in attending an elite high school. My analysis suggests that crossing the threshold of Norssi might have a negative effect on the mother tongue grade and on the grade point average in the Matriculation Examination, whereas being eligible for Etelä-Tapiola has a positive effect on the number of exams one takes in the Matriculation Examination. Also, the results suggest that elite high school eligibility may have positive effects on the English grade for those who cross the entrance threshold by a large margin. However, these interpretations should be made with caution, since these findings may be just a coincidence, and the negative effects of Norssi are probably at least partially driven by

In document Does Attending an Elite High School Have an Effect on Learning Outcomes? : Evidence from the Helsinki Capital Region (Page 52-66)