Instrumental variable, self-selection and endogeneity

As introduced previously, analyzing the effects of spatial mobility on labour market outcomes means dealing with endogeneity, which occurs when one or more covariates are correlated with the error term of the model. One common cause of endogeneity is the omitted-variables problem and, specifically, “self-selection”.

Even though the methodological procedures used to overcome endogeneity are very similar, its causes correspond to different concepts[30]. In this exercise I use different concepts of endogeneity according to the research question. Endogeneity occurs when the relationship between two variables is incorrectly identified because of unobservable factors correlated with the endogenous variable. When I estimate the returns (in terms of earnings) to geographical mobility, I assume that the bias comes from this unobserved relationship.

Self-selection arises when the dependent variable is observed only for a restricted sample of the population; this is the case when I compare earnings among sub-population groups with different migration paths.

All these sources of endogeneity are usually addressed through a two-step procedure. For this research question, the first stage estimates the probability of moving using one or more variables that are correlated with the endogenous dummy (the treatment) but uncorrelated with the error term of the outcome equation.
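The two-step logic can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data-generating process, the coefficient values and the function names are hypothetical, and the first stage is a linear probability model rather than the probability model estimated in this work.

```python
import random

def ols(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate(n=20000, true_effect=0.5, seed=0):
    """Hypothetical data: unobserved ability drives both moving and earnings."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)      # not observed by the researcher
        instrument = rng.gauss(0, 1)   # shifts moving, no direct effect on earnings
        latent = -0.8 * instrument + 0.8 * ability + rng.gauss(0, 1)
        moved = 1.0 if latent > 0 else 0.0
        earnings = 1.0 + true_effect * moved + ability + rng.gauss(0, 0.5)
        z.append(instrument)
        d.append(moved)
        y.append(earnings)
    return z, d, y

def two_stage(z, d, y):
    """Stage 1: predict the mobility dummy from the instrument.
    Stage 2: regress the outcome on the predicted values."""
    a0, a1 = ols(z, d)
    d_hat = [a0 + a1 * zi for zi in z]
    return ols(d_hat, y)[1]

z, d, y = simulate()
naive_effect = ols(d, y)[1]        # biased upwards by omitted ability
iv_effect = two_stage(z, d, y)     # should lie close to the assumed true effect
print(f"naive OLS: {naive_effect:.2f}, two-stage: {iv_effect:.2f}, assumed true effect: 0.50")
```

Running the second stage by hand reproduces the point estimate only; its standard errors would not be valid, so in applied work a dedicated instrumental variable routine is preferable.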

The impossibility of exactly identifying a causal relationship between two variables is a recurring problem in estimating the returns to schooling, labour force participation, unionization and also migration: some factors influencing geographical mobility and labour market outcomes are not directly observable by researchers but must be considered to explain differences in the returns to migration (Dahl, 2002; Malamud and Wozniak, 2008; Nakosteen and Zimmer, 1980). In this framework there can be an upward bias (or positive self-selection) if higher educational levels are positively correlated with migration: higher earnings may be explained not by migration itself but by spillover effects arising from the presence of human capital (Bartel, 1989; Nakosteen and Zimmer, 1980), or by the fact that more able individuals tend to move to places where the returns to human capital are higher (Borjas et al., 1992; Venhorst and Cörvers, 2015).
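The direction of this bias can be made explicit in a deliberately simplified setting. The notation is an assumption introduced here for illustration: $Y_i$ denotes earnings, $M_i$ a migration dummy, and $A_i$ an unobserved factor such as ability, with effect $\gamma$ on earnings. If the true model is $Y_i = \alpha + \beta M_i + \gamma A_i + \varepsilon_i$ and $A_i$ is omitted, the OLS coefficient on $M_i$ converges to

$$
\operatorname{plim} \hat{\beta}_{OLS} = \beta + \gamma \, \frac{\operatorname{Cov}(M_i, A_i)}{\operatorname{Var}(M_i)}
$$

so the estimated return to migration is biased upwards whenever $\gamma > 0$ and migrants are positively selected, that is $\operatorname{Cov}(M_i, A_i) > 0$.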

On the other hand, even if internal migrants possess greater ability or motivation than non-migrants (Gabriel and Schmitz, 1995), as explained previously, at least in the short run they may face an “adjustment process” due to limited knowledge of the local labour market, with penalties in terms of wages, employability or job position (Rodríguez-Pose and Tselios, 2009). The direction of the bias and the type of selection in migration remain open questions, since the results in the literature on the wage effects of geographical mobility differ according to the methodology used and the length of the observation period. Furthermore, with the instrumental variable approach, the results can also differ according to the exogenous variable used as instrument.

As explained, in this last econometric framework I start from the hypothesis that liquidity constraints and available resources influence university choice: the amount of scholarships assigned at regional level represents an opportunity to offset a lack of family resources. Analyzing the push and pull factors affecting student mobility, Vergolini and Zanini (2015) explain that the scholarships provided by regions can influence mobility.

Specifically, Vergolini and Zanini (2015), analyzing the role of financial aid and scholarships in shaping the enrollment decisions of high school leavers, find that resource allocation does not increase the number of enrollments but increases student mobility and improves the match between university supply and student preferences[31].

Following this result, I use the number of scholarships awarded relative to the number of eligible candidates (i.e. “idonei”) at regional level[32]. The variable is specified as follows:

$$
\text{Scholarship}_{it} = \frac{\text{Scholarships awarded}}{\text{Number of eligible candidates}}, \qquad i = 1, \dots, 20, \quad t = \bar{t} = 2000, \dots, 2006
$$

Since one limitation of this analysis is the lack of information on the year of enrollment at university, and since the dataset is composed of two waves, I averaged the number of scholarships assigned (in the origin region) over the academic years from 2000[33] to 2006[34].

This variable is expressed as a percentage and should represent an index of the resources available per student provided by the regions. The main intuition is that the greater the resources transferred, the lower the likelihood of moving to another region; conversely, the smaller the amount of resources assigned in the origin region, the higher the propensity to relocate to study.
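A minimal sketch of how such an index could be built is given below. The records and region names are invented, and the function name is hypothetical. The sketch also assumes that the yearly coverage ratios are computed first and then averaged; averaging the counts before taking the ratio would be an equally plausible reading of the procedure described above.

```python
from collections import defaultdict

# Hypothetical records: (region, first year of the academic year, awarded, eligible)
records = [
    ("Region A", 2000, 800, 1000),
    ("Region A", 2001, 900, 1000),
    ("Region B", 2000, 300, 600),
    ("Region B", 2001, 450, 600),
]

def scholarship_index(rows, first_year=2000, last_year=2006):
    """Average, by region, of the yearly share (in %) of eligible
    candidates who were actually awarded a scholarship."""
    yearly_shares = defaultdict(list)
    for region, year, awarded, eligible in rows:
        if first_year <= year <= last_year and eligible > 0:
            yearly_shares[region].append(100.0 * awarded / eligible)
    return {region: sum(s) / len(s) for region, s in yearly_shares.items()}

index = scholarship_index(records)
for region, value in sorted(index.items()):
    print(f"{region}: {value:.1f}%")
```

Each respondent would then be assigned the index of the origin region, which is why the instrument varies only across the 20 regions and not across individuals within a region.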

To be a valid instrument, the variable “Scholarship” should have the following characteristics (Nichols, 2011):

1. it must be correlated with migration at the start of tertiary education;
2. it must be exogenous.

Exogeneity implies that the scholarships assigned influence the decision to relocate for work only through pre-graduation mobility.
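The two requirements can be written compactly. The notation is an assumption introduced here for illustration and control variables are omitted: $S_i$ is the dummy for mobility at the start of tertiary education, $W_i$ the decision to relocate for work, and $Z_{r(i)}$ the scholarship index of the origin region $r(i)$ of individual $i$.

$$
S_i = \pi_0 + \pi_1 Z_{r(i)} + v_i, \qquad W_i = \beta_0 + \beta_1 S_i + \varepsilon_i
$$

$$
\text{(1) relevance: } \pi_1 \neq 0, \qquad \text{(2) exogeneity: } \operatorname{Cov}\left(Z_{r(i)}, \varepsilon_i\right) = 0
$$

Condition (1) can be checked in the data through the first stage, whereas condition (2) involves the unobserved error term and can only be argued for.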

While the first condition is supported by empirical evidence (see table 2.1), the second may be defended by arguing that, even if the regions with fewer resources (typically southern ones) may also be those that suffer higher human capital losses, a direct correlation between resources for higher education and workplace mobility is hard to explain.
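As an illustration of how the first condition is typically checked, the sketch below computes a first-stage F statistic on simulated data. The numbers are invented and do not reproduce table 2.1; the function name and the assumed negative effect of scholarship coverage on the probability of moving are hypothetical.

```python
import random

def first_stage_f(z, d):
    """Slope and F statistic from regressing the mobility dummy d on a
    single instrument z (with one instrument, F is the squared t-ratio)."""
    n = len(z)
    mean_z, mean_d = sum(z) / n, sum(d) / n
    szz = sum((a - mean_z) ** 2 for a in z)
    slope = sum((a - mean_z) * (b - mean_d) for a, b in zip(z, d)) / szz
    intercept = mean_d - slope * mean_z
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(z, d))
    std_error = (rss / (n - 2) / szz) ** 0.5
    return slope, (slope / std_error) ** 2

rng = random.Random(1)
# 20 hypothetical regions with scholarship coverage between 40% and 100%
coverage = [rng.uniform(40, 100) for _ in range(20)]
z, d = [], []
for _ in range(20000):
    c = rng.choice(coverage)
    p_move = 0.45 - 0.003 * c   # assumption: more coverage, lower probability of moving
    z.append(c)
    d.append(1.0 if rng.random() < p_move else 0.0)

slope, f_stat = first_stage_f(z, d)
print(f"first-stage slope: {slope:.4f}, F statistic: {f_stat:.1f}")
```

A common rule of thumb treats an F statistic above 10 as a sign that the instrument is not weak. Because the instrument varies only across regions, this simple F ignores within-region correlation and is likely to overstate its strength; standard errors clustered at regional level would be more appropriate.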

Before adopting this variable, I considered other variables as possible instruments:

• accommodation places assigned relative to the number of applications submitted;

• “net migration rate” (%) at regional level (student mobility, years 2004 to 2007).

[31] This study was implemented on a sample of students from one Italian province (Trento).

[32] Regional institutions (“Ente regionale per il diritto allo studio”) collect data (available on the University and Research Ministry site (“MIUR”)) on all services provided by regions to support educational costs.

[33] First year in which MIUR data are available.

[34] I would like to thank Federica Laudisa from IRES Piemonte (“Osservatorio regionale per l’Università e per

The first variable is very similar to the instrument used and, even though its effect is significant, I prefer the variable presented above since it is a more comprehensive measure of the resources available at regional level.

The “net migration rate” shows whether each region gains or loses human capital through student migration flows. Since it is plausible to assume that the decision to move to a different region to study could be influenced by information obtained through networks (friends or family), I expect a correlation between the past migration rate (from the origin region) and the current relocation decision. Therefore, the first condition for a strong instrument is satisfied[35].

Unfortunately, the first year available in the MIUR data is the academic year 2003/2004, and I cannot use this variable as an instrument for two reasons:

• past migration helps to resolve endogeneity only if it goes sufficiently far back in time. For example, if I take the net migration rate in 2006 as past migration and the net migration rate in 2007 as present migration, and the current net migration rate is endogenous, there is no justification for assuming that the migration rate in 2006 is exogenous, since it could be correlated with labour market trends present in 2007 (the unobservable factors affecting the relocation decision in 2006 are the same as those present in 2007);

• the exact year of enrollment is not available, so some individuals may have enrolled at university before 2004, and the instrument loses its value for this group.

The two points just mentioned raise doubts about the exogeneity of this instrument.

Before continuing with the discussion, it should be pointed out that in this step, as in the estimation of the migration premium according to migration paths, I am not considering individuals who are not enrolled at university or not active in the labour market. These groups are not randomly selected from the population, and this self-selection may lead to additional bias in the estimates. Since no solution has been found to overcome these limits, the interpretation of the results requires additional caution.

[35] Furthermore, using past migration to instrument present migration is the most widely used solution (Al-

2.7 Econometric specification: are there migration paths

In document Student geographical mobility and labor market outcomes: evidences from Italy (Page 39-42)