Appendix 1.A The three minimum wages zones (1986-2012)
2.3 The data
2.3.1 Variables description
The dependent variable used to quantify the minimum wage effect on real earnings is the
logarithm of real hourly wages. ENOE reports nominal earnings in current pesos, so we
use the National Consumer Price Index (the base period corresponds to December 2010)
to calculate real earnings. Appendix 2.A describes in detail the definition and generation
process of all the variables included in the regression models.
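As an illustration of the deflation step, the sketch below converts a nominal hourly wage into base-period pesos and takes its logarithm. The function names and the numeric values are hypothetical and are not ENOE or price index figures. The sketch also assumes that the price index is normalised to 100 in the base period (December 2010).

```python
import math

# Assumption: the price index equals 100 in the base period (December 2010).
BASE_INDEX = 100.0


def real_wage(nominal_wage, price_index, base_index=BASE_INDEX):
    """Express a nominal wage in base-period pesos."""
    if price_index <= 0:
        raise ValueError("price index must be positive")
    return nominal_wage * base_index / price_index


def log_real_hourly_wage(nominal_hourly_wage, price_index):
    """Logarithm of the real hourly wage (the dependent variable)."""
    return math.log(real_wage(nominal_hourly_wage, price_index))


# Hypothetical values, for illustration only.
nominal = 36.0  # pesos per hour at current prices
index = 108.0   # hypothetical index level in the survey quarter
print(round(real_wage(nominal, index), 2))
print(round(log_real_hourly_wage(nominal, index), 4))
```

Because the deflator enters multiplicatively, it shifts the log wage by an additive term that is common to all workers surveyed in the same period.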
Although the minimum wage in Mexico is set on a daily basis, our preferred specification
uses hourly wages as the dependent variable for the following reasons. First,
if the purpose is to reveal the minimum wage effects on real wages, the focus must be on
the level of earnings, not on the time allocated to work or on the accumulated income
affected by this decision. Second, minimum wage changes may affect employment beyond
reducing the number of positions. Employers can decide, for instance, to cut working hours
or to replace full-time with part-time positions. This would affect total
earnings and, more importantly, may bias the real wage findings. Moreover, for
consistency in the estimates, in Chapter 3 we evaluate the minimum wage impact
at different points of the earnings distribution using hourly wages. Earnings
percentiles differ depending on whether we estimate the hourly or the monthly earnings
distribution, which can also alter the conclusions reached. We therefore prefer to treat
the decision on the number of hours worked as a separate issue.
Table 2.1 presents descriptive statistics for the variables included in the econometric
specification. In line with the DiD model, the sample is divided by wage zone and
into two periods, before and after the minimum wage intervention.
The first aspect to highlight is the number of observations for the dependent variable.
As we can observe, of 2.45 million working-age individuals in the eight quarters of
analysis, only 0.96 million are identified as waged workers. The rest are unemployed
or inactive in the labour market. Estimating the effect only on this segment of the
workforce therefore leaves more than 60% of the sample out of the analysis.
In contrast, by implementing sample selection bias correction procedures, we can carry
out our analysis on the full working-age sample.
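The share of the sample that a wage-earners-only analysis would discard can be checked from the observation counts in the National column of Table 2.1. The short sketch below does this arithmetic. The variable and function names are illustrative, and the count for the Female and Rural controls is used here as the size of the working-age sample.

```python
# Observation counts from the "National" column of Table 2.1.
WAGED_WORKERS = 961_168         # observations with a reported hourly wage
WORKING_AGE_SAMPLE = 2_458_053  # observations for the Female and Rural controls


def excluded_share(n_selected, n_total):
    """Share of the sample left out when only n_selected units are analysed."""
    return 1.0 - n_selected / n_total


share = excluded_share(WAGED_WORKERS, WORKING_AGE_SAMPLE)
print(f"{share:.1%}")  # prints 60.9%
```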
Regarding the number of observations by wage zone, treated Zone B has systematically
fewer observations than the other two zones. As discussed in Section 1.2.2, Zone B
is the smallest in terms of the number of municipalities and population covered (around
10% of the Mexican population). For our DiD estimates, it implies that we have available a
Table 2.1
Sample descriptive statistics (2012Q1-2013Q4)
Column order: Pretreatment period (Zone A, Zone B, Zone C); Post-treatment period (Zone A, Zone B, Zone C); National
Dependent Variables
Hourly real wage (Mexican pesos)
Mean 34.45 34.72 29.62 33.72 35.64 29.23 30.52
Std. Deviation 45.99 44.19 65.04 95.94 45.61 43.47 56.95
Observations 53,599 42,188 348,085 62,598 47,951 406,747 961,168
Monthly real wage (Mexican pesos)
Mean 5,545.35 5,572.03 4,755.33 5,446.71 5,668.47 4,734.89 4,917.16
Std. Deviation 5,369.97 5,463.97 5,448.21 7,514.70 5,170.70 5,431.49 5,592.36
Observations 53,599 42,188 348,085 62,598 47,951 406,747 961,168
Control Variables
Age
Mean 37.52 37.84 37.23 37.64 38.12 37.27 37.36
Std. Deviation 17.87 18.14 18.32 17.86 18.18 18.26 18.23
Observations 135,567 107,501 879,645 163,602 126,481 1,044,201 2,456,997
Female
Mean 0.5155 0.5154 0.5248 0.5201 0.5163 0.5249 0.5232
Std. Deviation 0.4998 0.4998 0.4994 0.4996 0.4997 0.4994 0.4995
Observations 135,589 107,607 879,955 163,630 126,651 1,044,621 2,458,053
Rural
Mean 0.0820 0.0625 0.2039 0.0624 0.0458 0.1937 0.1691
Std. Deviation 0.2744 0.2421 0.4029 0.2420 0.2090 0.3952 0.3748
Observations 135,589 107,607 879,955 163,630 126,651 1,044,621 2,458,053
School level2 (7th - 9th year)
Mean 0.2398 0.2188 0.2435 0.2365 0.2199 0.2375 0.2380
Std. Deviation 0.4270 0.4134 0.4292 0.4249 0.4142 0.4255 0.4258
Observations 135,500 107,554 879,523 163,536 126,578 1,044,070 2,456,761
School level3 (10th - 12th year)
Mean 0.3167 0.3624 0.3096 0.3197 0.3615 0.3156 0.3182
Std. Deviation 0.4652 0.4807 0.4623 0.4664 0.4804 0.4648 0.4658
Observations 135,500 107,554 879,523 163,536 126,578 1,044,070 2,456,761
School level4 (University)
Mean 0.3047 0.2976 0.2655 0.3174 0.3054 0.2728 0.2777
Std. Deviation 0.4603 0.4572 0.4416 0.4655 0.4606 0.4454 0.4478
Observations 135,500 107,554 879,523 163,536 126,578 1,044,070 2,456,761
Head of the Household
Mean 0.3511 0.3402 0.3375 0.3504 0.3417 0.3383 0.3398
Std. Deviation 0.4773 0.4738 0.4729 0.4771 0.4743 0.4731 0.4736
Observations 135,589 107,607 879,955 163,630 126,651 1,044,621 2,458,053
Note: sample restricted to individuals aged between 12 and 97. For wage variables, observations with non-reported values for hourly wage are excluded. For socio-demographic controls, observations with non-specified responses are also omitted. Appendix 2.A describes in detail the variable generation procedure.
With respect to real wages, some important points emerge. First, zones A and B
exhibit higher mean wages than Zone C. Although Zone C is the largest in terms
of population and surface area, it contains the municipalities with the lowest economic
development. Second, Zone B is the only wage zone that experienced an increase in real
wages after the intervention. The other two zones exhibit a decrease. This suggests that,
without controlling for other covariates, the minimum wage intervention could have had a
positive effect on real earnings in the treated Zone B.
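The raw comparison described above can be reproduced from the mean hourly real wages and observation counts in Table 2.1. The following sketch computes an unconditional difference-in-differences in wage levels for Zone B against two possible control groups: zones A and C pooled, and Zone A alone. It is only a back-of-the-envelope check under stated assumptions. It uses means in pesos rather than log wages, includes no covariates, and pools control zones with observation weights, so it is not the regression estimate reported in this chapter. All function names are illustrative.

```python
# Mean hourly real wages (pesos) and observation counts from Table 2.1.
MEANS = {
    "A": {"pre": 34.45, "post": 33.72},
    "B": {"pre": 34.72, "post": 35.64},
    "C": {"pre": 29.62, "post": 29.23},
}
OBS = {
    "A": {"pre": 53_599, "post": 62_598},
    "B": {"pre": 42_188, "post": 47_951},
    "C": {"pre": 348_085, "post": 406_747},
}


def pooled_mean(zones, period):
    """Observation-weighted mean wage across the given zones."""
    total = sum(OBS[z][period] for z in zones)
    return sum(MEANS[z][period] * OBS[z][period] for z in zones) / total


def raw_did(treated, controls):
    """Change in the treated zone minus the change in the pooled control zones."""
    treated_change = MEANS[treated]["post"] - MEANS[treated]["pre"]
    control_change = pooled_mean(controls, "post") - pooled_mean(controls, "pre")
    return treated_change - control_change


for controls in (["A", "C"], ["A"]):
    print(controls, round(raw_did("B", controls), 2))
```

Both raw differences are positive, in line with the descriptive pattern noted in the text. Whether the effect survives once covariates and sample selection are accounted for is a question for the regression models.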
The control variables included in the model are the following: age of the individual (in
years at the time of the survey), indicator variables for female workers and
rural municipalities, and a set of dummy variables for completed schooling level (basic
school=1,¹ secondary school=2, high school=3, university and postgraduate studies=4). As
we can observe in Table 2.1, there are no substantial differences in the gender and age
composition across wage zones. Indeed, the means of these two variables are
not statistically different across zones.
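The schooling controls can be pictured as a simple recoding of the four-level category into indicator variables. The sketch below is a hypothetical illustration: the function and key names are invented here, and it assumes that basic school (level 1) is the omitted reference category, which is inferred from Table 2.1 reporting only levels 2 to 4.

```python
# Schooling codes as described in the text.
SCHOOL_LEVELS = {1: "basic", 2: "secondary", 3: "high school", 4: "university"}


def schooling_dummies(level):
    """Return indicators for levels 2-4.

    Assumption: level 1 (basic school) is the omitted reference category.
    """
    if level not in SCHOOL_LEVELS:
        raise ValueError(f"unknown schooling level: {level!r}")
    return {f"school_level{k}": int(level == k) for k in (2, 3, 4)}


print(schooling_dummies(3))
# {'school_level2': 0, 'school_level3': 1, 'school_level4': 0}
```

Under this coding, the mean of each indicator is the share of individuals at that schooling level, which is how the corresponding rows of Table 2.1 can be read.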
Nevertheless, the schooling level of individuals in zones A and B is greater on average
than in Zone C. This reflects the classification of the wage zones itself, which is
based on their economic development. Thus, given that the wage zones differ from one
another by definition, specifically with respect to Zone C, the main identification
assumption is that outcome variables follow the same trend before the intervention.
Pretreatment trends are analysed in the following subsection.
Taking into account these sample dissimilarities with respect to Zone C, all the DiD
models in this chapter are run using two specifications. In the first, all the
untreated units (zones A and C) are part of the control group. In the second,
only observations from Zone A are included in the control group. The
magnitude of the estimated treatment effects does not vary significantly between the two.
Finally, the control variable Head of the household, a dummy variable indicating
the individual who is mainly responsible for the household's earnings,
is included only in the first stage of the sample selection bias correction
procedures, as the exclusion restriction variable.² Subsection 2.4.2 describes in detail the
procedures implemented to correct for sample selection bias.