• No results found

Preliminaries: Dependent variables and standard errors

4. Data and Methods

4.2. Preliminaries: Dependent variables and standard errors

4.2.1. Dependent variables

A basic question to answer at the outset concerns the nature of our dependent variables. Since we are interested in the distributional effects of employment protection legislation. Using relative labor market outcomes, such as youth-adult outcome ratios, as outcomes (or the difference of outcomes) is an intuitive solution and has been adopted in a number of studies (see, for example, Bertola et al., 2002; Jimeno and Rodriguez-Palenzuela, 2002; Esping-Andersen, 2000). Thus, we can model relative youth outcomes which provides a direct answer to the question whether employment protection makes youth relatively worse off, as in equation (1), where π‘ˆπ‘—π‘‘π‘Œ denotes the logged youth unemployment rate and π‘ˆπ‘—π‘‘π΄ denotes the logged adult unemployment rate in country j at time t. 𝛿 is the coefficient of interest and 𝐷𝑗𝑑 a variable measuring employment protection. For ease of exposition, we omit other variables, except for a constant 𝛾0 and an idiosyncratic error term πœ€π‘—π‘‘.

Data and Methods

(1) π‘ˆπ‘—π‘‘π‘Œ βˆ’ π‘ˆπ‘—π‘‘π΄ = 𝛾0+ 𝛿𝐷𝑗𝑑+ πœ€π‘—π‘‘

This formulation has been motivated by the argument that the sharpest prediction we can theoretically derive about the effect of employment protection is on the distribution of (un)employment or relative outcomes (Bertola et al., 2002; Esping-Andersen, 2000). In Chapter 3, we raised concerns whether this theoretical argument is valid in case of youth unemployment.

However, we nevertheless follow these authors here, in an attempt to relate our findings to published research. We can think of equation (1) simply as a model for relative youth unemployment. Looking more closely at the left hand-side, we can think of the adult unemployment rate in this model as a proxy variable for time-constant and time-varying factors that have an identical impact on both youth and adult unemployment (Bertola et al., 2007: 846).

However, if employment protection affects both youth and adult unemployment, the adult unemployment rate may actually control away some of the effect we try to estimate. Moreover, to the extent that adult unemployment is correlated with both treatment and outcome variable, it may constitute a "bad control" (see discussion below), introducing rather than solving endogeneity problems. Equation (2) represents a more flexible specification of the same model:

(2) π‘ˆπ‘—π‘‘π‘Œ = 𝛾0+ 𝛿𝐷𝑗𝑑+ 𝛾1π‘ˆπ‘—π‘‘π΄+ πœ€π‘—π‘‘

Now, we empirically estimate the relationship between the youth and the adult outcomes, instead of constraining this effect to be equal to unity, which we implicitly do in equation (1). In either case, we can interpret 𝛿 as the effect of employment protection on youth unemployment, net of whatever cyclical variation is captured by the adult unemployment rate. In this formulation, we can also think about replacing π‘ˆπ‘—π‘‘π΄ with other measures of cyclical variation, which we discuss below. While empirically more flexible, this formulation also has the inherent substantive advantage that it is a model of youth unemployment. Equation (1) leaves open,

Data and Methods

people or adults or both. While relative formulations (as in equation 1) have been frequently used in the literature (Esping-Andersen, 2000; Bertola et al., 2002; Kahn, 2007), they avoid a fundamental question that only a direct analysis of youth or adult outcomes can answer. In all analyses, we take the natural log of the dependent variable.

4.2.2. Standard errors

Another consequential decision we need to make concerns the estimation of uncertainty of our coefficient estimates. The key problem that results from our data structure, which has observations nested within countries and years, is that observations within the same country tend to be very similar on both dependent and independent variables. This within-country correlation of observations creates problems for unbiased and consistent estimation of standard errors. If unaccounted for, it leads to a strong downward bias in standard errors (e.g. Bertrand et al. 2004, Kezdi, 2003; Angrist and Pischke, 2009).

Different solutions have been used in applied research. The error structure is sometimes modeled with additional parameters that are supposed to capture clustering or serial correlation.

For example, mixed or multilevel models are used to account for clustering of observations, or (unit-specific) auto-correlation parameters are estimated from the data to allow for serial correlations following a specific correlation structure. Alternatively, variance-covariance estimators that allow for arbitrary correlation of the error term within panel units (countries, in our case) have been proposed as a solution (see Bertrand et al., 2004 and Angrist and Pischke, 2009, for more discussion and suggestions).

Bertrand et al. (2004) show for empirical situations similar to the one encountered here that parametric models for autocorrelation or serial correlation in the error term perform poorly and yield too small standard errors. They argue that parametric corrections should be viewed with

Data and Methods

caution unless the parametric model imposed on the data is close to the truth (see also Angrist and Pischke: Chapter 8). Without knowing the data generating process, the chosen parametric correction and the extra parameters we have to estimate are likely to be biased. Bertrand et al.

(2004) show that a popular correction, especially the first-order autocorrelation model (AR1), generally does not succeed in remedying the serial correlation problem. This is true for the common as well as the cluster- or unit-specific AR1 correction (Bertrand et al., 2004: 264).

Given that the AR1 correction is used frequently in analysis of aggregate data (see Bertola et al., 2007, for example), this casts doubts over the (statistically) significant effects that the authors obtain, in this case about the alleged positive effect of employment protection on youth unemployment.

Bertrand et al. (2004; see also Kezdi, 2003; Cameron and Trivedi, 2005) instead advocate the use of β€œcluster robust” standard errors as one alternative, i.e. variance-covariance estimators that allow for arbitrary correlation of the error term within panel units. Presence of correlation within panel units, which is very strong in our application, drastically increases estimated standard errors compared to conventional OLS standard errors. Unfortunately, the advantages of the cluster robust estimator also decline when the number of clusters (or countries in our case) diminishes, but the β€œcluster robust” estimator still performs better than the AR1 correction even with as little as ten clusters. Given that the number of clusters in our study varies between 15 and 21, Bertrand et al.’s (2004) results suggest that this correction still works well (see their Table IV, lines 4 and 6). Moreover, two factors may mitigate the serial correlation problem in our application, in particular in the differences-in-differences analysis using EULFS data. First, we use relatively short time series. Second, the control variables, in particular the linear time trends, may capture some of the within-unit correlation across time.

Data and Methods