CHAPTER 2 : Gender Differences in Skill Profiles, Wage Returns, and College En-
2.4 Model and Estimation Strategy
In this section, I investigate how gender differences in skill profiles contribute to differences in wages and college-going. I use principal component analysis to construct skill profiles, and find substantially different skill profiles between men and women. I then examine the impact of these skill profiles within a Roy model framework modified from Rosen and Willis (1979). The model allows for a direct comparison of whether different skill profiles garner different wage returns for male and female workers, since it enables the estimation of counterfactual wage rates for each individual in the data set. Finally, I examine whether gender differences in earnings over the life cycle contribute to gender differences in college-going.
2.4.1 Model of Wage Determination and the College-Going Decision
I extend the Rosen and Willis (1979) framework to incorporate a two-type ($\ell \in \{\text{male}, \text{female}\}$), multi-skill model of the college-going decision. As in Rosen and Willis (1979), wages evolve according to
\[
w^{c,\ell}_{it} = \bar{w}^{c,\ell}_{i} \exp\left(g^{c,\ell}_{i} t\right) \tag{2.4.1}
\]
where $\bar{w}_i$ is the initial wage for individual $i$ and $g_i$ is the growth rate of wages. Both initial wages and wage growth rates are indexed by $c \in \{0,1\}$, the decision to attend college, and $\ell$, the gender of the individual.
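Initial wages and wage growth rates are determined according to a vector of skills for each individual. Specifically,
\begin{align}
\ln \bar{w}^{1,\ell}_{i} &= x_i \beta^{1,\ell} + u_{1i} && \text{if } c = 1 \tag{2.4.2} \\
\ln \bar{w}^{0,\ell}_{i} &= x_i \beta^{0,\ell} + u_{0i} && \text{if } c = 0 \notag \\
g^{1,\ell}_{i} &= x_i \gamma^{1,\ell} + u_{3i} && \text{if } c = 1 \tag{2.4.3} \\
g^{0,\ell}_{i} &= x_i \gamma^{0,\ell} + u_{2i} && \text{if } c = 0 \notag
\end{align}
where $u_0, u_1, u_2, u_3$ are jointly normally distributed. Here, $x_i$ includes skill profiles and a vector of demographic controls (age, race, family background, and location).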
Individuals can choose to forgo college and start work right away at $\text{year}^{0}_{0}$ for wage $w^{0,\ell}_{it}$ ($c=0$), or attend college and start their career at a later year, $\text{year}^{1}_{0}$, for wage $w^{1,\ell}_{it}$ ($c=1$). They make their decision based on whether college-going maximizes their lifetime earnings:
\[
c = \mathbf{1}\left\{ \ln V^{1,\ell} > \ln V^{0,\ell} \right\} \tag{2.4.4}
\]
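where lifetime earnings follow the equation$^{5}$
\[
V^{c,\ell} = \int_{\text{year}^{c}_{0}}^{\infty} \bar{w}^{c,\ell}_{i} \exp\left(g^{c,\ell}_{i} t\right) \exp\left(-r_i t\right) dt \approx \frac{\bar{w}^{c,\ell}_{i}}{r_i - g^{c,\ell}_{i}} \exp\left(-r_i \, \text{year}^{c}_{0}\right) \tag{2.4.5}
\]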
Plugging equation (2.4.5) into equation (2.4.4) yields
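\[
c = \mathbf{1}\Big\{ \ln \bar{w}^{1,\ell}_{i} - \ln \bar{w}^{0,\ell}_{i} - \ln\left(r_i - g^{1,\ell}_{i}\right) + \ln\left(r_i - g^{0,\ell}_{i}\right) - r_i \underbrace{\left(\text{year}^{1}_{0} - \text{year}^{0}_{0}\right)}_{S} > 0 \Big\} \tag{2.4.6}
\]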
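The decision rule in equations (2.4.4) through (2.4.6) can be illustrated with a minimal Python sketch. The function names and all numerical values below are hypothetical and chosen only for illustration; the code takes the approximation in equation (2.4.5) as given and assumes $r_i > g^{c,\ell}_i$ so that the integral converges.

```python
import math


def log_lifetime_value(w_bar, g, r, start_year):
    """Log of the approximate present value in (2.4.5):
    ln(w_bar) - ln(r - g) - r * start_year."""
    if r <= g:
        raise ValueError("discount rate r must exceed wage growth g")
    return math.log(w_bar) - math.log(r - g) - r * start_year


def college_index(w_bar1, w_bar0, g1, g0, r, year1, year0):
    """Left-hand side of the inequality in (2.4.6)."""
    s = year1 - year0  # S: delay in labor-market entry caused by college
    return (math.log(w_bar1) - math.log(w_bar0)
            - math.log(r - g1) + math.log(r - g0)
            - r * s)


def attends_college(**kwargs):
    """Indicator c = 1{index > 0}."""
    return college_index(**kwargs) > 0


# Hypothetical individual (illustrative numbers, not estimates from the data).
person = dict(w_bar1=30000.0, w_bar0=20000.0, g1=0.03, g0=0.02,
              r=0.08, year1=4, year0=0)

index = college_index(**person)

# The same quantity computed directly as ln V^1 - ln V^0, as in (2.4.4).
direct = (log_lifetime_value(30000.0, 0.03, 0.08, 4)
          - log_lifetime_value(20000.0, 0.02, 0.08, 0))

chooses_college = attends_college(**person)
```

In this sketch `index` and `direct` agree up to floating-point error, which is the content of the step from (2.4.4) to (2.4.6).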
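Taylor approximating the non-linear terms around the population values $\bar{g}^{1}$, $\bar{g}^{0}$, and $\bar{r}$ yields an equation that allows for separate identification of the effects of initial wages $\ln \bar{w}^{c,\ell}_{i}$ and wage growth rates $g^{c,\ell}_{i}$.
\[
c = \mathbf{1}\Big\{ \alpha_0 + \alpha_1\left(\ln \bar{w}^{1,\ell} - \ln \bar{w}^{0,\ell}\right) + \underbrace{\alpha_2}_{\frac{1}{\bar{r} - \bar{g}^{1}}} g^{1,\ell} + \underbrace{\alpha_3}_{-\frac{1}{\bar{r} - \bar{g}^{0}}} g^{0,\ell} + \underbrace{\alpha_4}_{-\left[S - \frac{1}{r_i - \bar{g}^{1}_{i}} + \frac{1}{r_i - \bar{g}^{0}_{i}}\right]} r > 0 \Big\} \tag{2.4.7}
\]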
$^{5}$ Equation (2.4.5) integrates to infinity, which is unrealistic since individuals end their careers before then. This is merely an approximation, since the difference between ending a career at infinity and at 65 leads to negligible differences
The Taylor approximation also leads to sign predictions for the $\alpha$ values, which serve as an additional estimation check when the model is taken to the data. $\alpha_2$ should be positive, $\alpha_3$ should be negative, and $\alpha_4$ should be negative as long as $S = \text{year}^{1}_{0} - \text{year}^{0}_{0}$ is sufficiently large. The model estimates reflect these predictions.
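The signs of the coefficients on the growth rates follow from a standard first-order expansion, sketched here using the symbols of this section (constant terms are absorbed into $\alpha_0$, and $\bar{r} > \bar{g}^{c}$ is assumed):
\[
\ln\left(r_i - g^{c,\ell}_{i}\right) \approx \ln\left(\bar{r} - \bar{g}^{c}\right) + \frac{\left(r_i - \bar{r}\right) - \left(g^{c,\ell}_{i} - \bar{g}^{c}\right)}{\bar{r} - \bar{g}^{c}}
\]
This term enters equation (2.4.6) with a negative sign for $c = 1$ and a positive sign for $c = 0$. The coefficient on $g^{1,\ell}_{i}$ is therefore $\frac{1}{\bar{r} - \bar{g}^{1}} > 0$ and the coefficient on $g^{0,\ell}_{i}$ is $-\frac{1}{\bar{r} - \bar{g}^{0}} < 0$.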
Substituting equations (2.4.2) and (2.4.3) into equation (2.4.7) leads to the estimation equation
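\[
c = \mathbf{1}\Big\{ \alpha_0 + \alpha_1\left(X\beta^{1} - X\beta^{0}\right) + \alpha_2 X\gamma^{1} + \alpha_3 X\gamma^{0} + \alpha_4 r > -\underbrace{\left(\alpha_1\left(u_1 - u_0\right) + \alpha_2 u_3 + \alpha_3 u_2\right)}_{\varepsilon} \Big\} \tag{2.4.8}
\]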
Because the error terms are jointly normal, $\varepsilon$ is normally distributed. With equation (2.4.8) alone, the parameters of interest $\alpha$ cannot be separately identified. Instead, the reduced-form equation (2.4.9) must first be estimated in order to construct a control function that approximates the probability of individual $i$ attending college given $w_i = (x_i, r_i)$:
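\[
c = \mathbf{1}\left\{ \alpha_0 + X\left(\alpha_1\left(\beta^{1} - \beta^{0}\right) + \alpha_2 \gamma^{1} + \alpha_3 \gamma^{0}\right) + \alpha_4 r > -\varepsilon \right\} = \mathbf{1}\left\{ W\pi > -\varepsilon \right\} \tag{2.4.9}
\]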
The function $\hat{k}(X)$, constructed using estimates from equation (2.4.9), can then be used to control for selection into college. If certain characteristics directly affect both an individual's probability of attending college and their wages, the estimated $\beta$ coefficients would be biased. The control function approach addresses this concern by including in the regression model a term that approximates an individual's probability of college attendance as a function of their underlying characteristics.
The wage regression equations are then
\[
g^{c,\ell} = X\gamma^{c,\ell} + \sigma_g \hat{k}(X) + u_{c+2} \tag{2.4.11}
\]
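From equations (2.4.10) and (2.4.11), I obtain $\hat{\beta}^{c,\ell}$ and $\hat{\gamma}^{c,\ell}$. I then construct counterfactual wage terms for each individual $i$, both for the case in which they attended college and for the case in which they did not: $\widehat{\ln \bar{w}}^{0,\ell}_{i}$, $\widehat{\ln \bar{w}}^{1,\ell}_{i}$, $\hat{g}^{0,\ell}_{i}$, and $\hat{g}^{1,\ell}_{i}$.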
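A minimal sketch of this counterfactual step follows. It assumes that the counterfactual terms are the fitted values $x_i \hat{\beta}^{c,\ell}$ and $x_i \hat{\gamma}^{c,\ell}$, evaluated for both $c = 0$ and $c = 1$ regardless of the observed choice and excluding the control function term. The function names and coefficient values are hypothetical.

```python
def linear_prediction(x, coef):
    """Inner product x . coef for one individual."""
    if len(x) != len(coef):
        raise ValueError("x and coef must have the same length")
    return sum(xj * bj for xj, bj in zip(x, coef))


def counterfactual_terms(x, beta_hat, gamma_hat):
    """Predicted log initial wage and wage growth under c = 0 and c = 1.

    beta_hat and gamma_hat map c in {0, 1} to coefficient lists estimated
    for one gender (assumed to come from the second-stage regressions).
    """
    return {
        "ln_w0": linear_prediction(x, beta_hat[0]),
        "ln_w1": linear_prediction(x, beta_hat[1]),
        "g0": linear_prediction(x, gamma_hat[0]),
        "g1": linear_prediction(x, gamma_hat[1]),
    }


# Hypothetical coefficients: an intercept and one skill component.
beta_hat = {0: [9.8, 0.10], 1: [10.0, 0.30]}
gamma_hat = {0: [0.02, 0.00], 1: [0.02, 0.02]}

x_i = [1.0, 0.5]  # constant term and a skill score of 0.5
terms = counterfactual_terms(x_i, beta_hat, gamma_hat)

# Difference in log initial wages, the regressor multiplying alpha_1.
log_wage_gap = terms["ln_w1"] - terms["ln_w0"]
```

Every individual receives all four terms, so the difference in log initial wages and both growth rates are available as regressors even though only one schooling choice is observed per person.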
Finally, using these counterfactual wage terms, I estimate the coefficients of interest, $\alpha$, which represent the effect of college and non-college wages on the college-going decision:
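\[
c = \mathbf{1}\left\{ \alpha_0 + \alpha_1\left(\widehat{\ln \bar{w}}^{1,\ell} - \widehat{\ln \bar{w}}^{0,\ell}\right) + \alpha_2 \hat{g}^{1,\ell} + \alpha_3 \hat{g}^{0,\ell} + \alpha_4 r > -\varepsilon \right\} \tag{2.4.12}
\]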
The $\alpha_1$ estimated from equation (2.4.12) represents the effect of the difference in log initial wages on the college-going decision; $\alpha_2$ represents the effect of the college wage growth rate on the college-going decision; $\alpha_3$ represents the effect of the non-college wage growth rate on the college-going decision. The next section reports the results from the model.