NBER WORKING PAPER SERIES
IMMIGRATION AND SELF-SELECTION
George J. Borjas
Working Paper No. 2566
NATIONAL BUREAU OF ECONOMIC RESEARCH 1050 Massachusetts Avenue
Cambridge, MA 02138 April 1988
The research reported here is part of the NBER's research program in Labor Studies. Any opinions expressed are those of the author and not those of the National Bureau of Economic Research.
ABSTRACT
Self-selection plays a dominant role in determining the size and composition of immigrant flows. The United States competes with other potential host countries in the "immigration market". Host countries vary in their "offers" of economic opportunities and also differ in the way they ration entry through their immigration policies. Potential immigrants compare the various opportunities and are non-randomly sorted by the immigration market among the various host countries. This paper presents a theoretical and empirical analysis of this marketplace. The theory of immigration presented in this paper describes the way in which immigrants are sorted among host countries in terms of both their observed and unobserved characteristics. The empirical analysis uses Census data from Australia, Canada, and the United States and shows that U.S. "competitiveness" in the immigration market has declined significantly in the postwar period.
George J. Borjas
University of California
Department of Economics
Santa Barbara,
CA 93106
1. Introduction
Immigration has been an important component of demographic change since Biblical times. International differences in economic and political conditions remain sufficiently large to encourage the flow of millions of persons across national boundaries. United Nations statistics, for example, report that in the 1975-1980 period, nearly 5 million persons migrated to a different country, with nearly two thirds of these individuals migrating to one of three countries: the United States, Canada, and Australia.1 There exists, therefore, a large reservoir of persons who believe that better opportunities exist elsewhere and who are willing to incur costs to experience those opportunities.
These two conditions imply that the pool of migrants will not be randomly chosen from the population of the countries of origin, and that it will not be randomly distributed across the potential countries of destination. Instead, the pool of immigrants in any host country will be composed of the subsample of persons who face better opportunities in that particular host country than either in the country of origin or in other potential host countries, and whose migration costs are sufficiently low to make the move profitable.

The insight that migrants may be systematically different from persons who do not choose to migrate has long played an important role in sociological and historical studies of the immigration phenomenon (see, for example, the studies contained in Jackson, 1969). The selectivity hypothesis has also played a major role in the modern economic literature that analyzes how immigrants do in the U.S. labor market. For example, the early studies of Chiswick (1978) and Carliner (1980) invoke the assumption that immigrants are positively selected from the population of the countries of origin to explain the remarkable cross-section empirical finding that immigrant earnings (after a short time period) "overtake" the earnings of natives with the same observed socioeconomic characteristics, such as age and education.2

My recent work in this area (Borjas, 1985, 1987) has addressed two related questions raised by the early studies. Since most of the literature that analyzes immigrant earnings focuses on the study of single cross-section data sets, my 1985 paper raised the possibility that the "overtaking" findings could be due to the fact that cross-section regressions confound aging and cohort effects.3
The positive correlation between immigrant earnings and years of residence in the U.S. observed in the cross-section could arise because immigrants "adapt" rapidly to the U.S. labor market, or because earlier waves of immigrants differ in substantial ways (labor market productivities, unobserved abilities or skills) from the more recent waves. Borjas (1985) adapted well-known techniques (see, for example, Heckman, 1983) to separately identify aging and cohort effects using the 1970 and 1980 U.S. Censuses. This methodology, which "tracks" synthetic cohorts of immigrants over time, showed that: (a) immigrant assimilation was not as fast as the cross-section studies indicate; (b) the more recent immigrant waves performed substantially worse in the labor market than the early postwar waves; and (c) there was little likelihood that the most recent immigrant waves would ever earn substantially more than natives of comparable age and education.

An important insight provided by the study of synthetic cohorts is that invoking the assumption of positive selection, though it may be correct for some cohorts of immigrants, may be completely wrong for other cohorts of immigrants.
This raises the important question of exactly which factors determine whether immigrants are positively or negatively selected from the population in the countries of origin. Borjas (1987) presents an initial attempt to address this problem and derives a simple economic model of selection on the basis of unobserved characteristics (which after all form the focus of much of the literature on immigrant earnings). This model, which will be discussed in detail below, shows that there is no general law stating that immigrants must be positively selected. In fact, under a reasonable set of conditions it is likely that immigrants are negatively selected (i.e., persons who have below average earnings and productivities are the most likely persons to migrate to the United States).
My empirical analysis revealed that positive selection was more likely to characterize immigrants from the advanced industrial countries, and negative selection was more likely to characterize immigrants from the Third World countries that form the bulk of migration to the U.S. in the post-1965 period.

This paper expands my earlier work in a number of significant ways. The theoretical analysis below will argue that although most of the literature has focused on the role that selection in unobserved characteristics plays in determining immigrant earnings, there is also selection in observed characteristics such as education. Surprisingly, it is easy to show that there is no relationship between the types of selections that are generated in unobserved characteristics and the types of selections that are generated in, for example, education. It is completely possible for the most educated persons to migrate to the U.S. (i.e., positive selection in education), but for these persons to be the least productive persons in the population of highly educated persons (i.e., negative selection in unobserved characteristics). The analysis below will present a number of propositions that yield insights into the process that determines the selection of immigrants in these two separate dimensions of "quality."
The empirical analysis in this paper expands my previous work in two ways. First, it presents a detailed analysis of the U.S. earnings of immigrants by focusing on the roles played by both selection in observed characteristics and in unobserved characteristics. It will be seen that a number of the theoretical predictions are confirmed by the data. Second, as noted earlier, potential migrants can choose among a number of countries of destination. The empirical analysis below will present a systematic study of the selection biases generated by the sorting of migrants among three potential countries of destination: Australia, Canada, and the United States. It will be seen that both country-of-origin and country-of-destination characteristics play an important role in determining the performance of immigrants in any labor market.

The paper is organized as follows. Section 2 presents a theory of immigration based on the hypothesis that migration is determined through the process of wealth-maximization. Section 3 presents the basic empirical framework that will be used throughout the analysis to test the various propositions predicted by the economic model of immigration. Section 4 presents the empirical analysis on the earnings of immigrants in the United States, while Section 5 compares the performance of immigrants in the U.S. labor market with the performance of immigrants in Australia and Canada. Finally, Section 6 summarizes the main results of the study.

2. Theory of Immigration

2.1. The Roy Model
Migration is assumed to flow from country 0, the country of origin or the "home" country, to country 1, the country of destination or, for concreteness, the United States. This simple framework ignores three potential complications. First, it is likely that persons born in the United States also consider the possibility of migrating to other countries, and perhaps many of them do so. Second, even persons choosing the United States as a country of destination may find that things did not work out (or perhaps worked out much better than expected) and some return migration is generated. Third, individuals contemplating migration in a particular country of origin enter the "immigration market" in which a number of other host countries (such as Australia and Canada) compete for the immigrant's human and physical capital. Little is known about the size and composition of the migrant flows from the United States to other countries, and hence these possibilities are ignored in what follows. Much more, however, is known about the size and composition of the flows from any given home country to each of three potential host countries (Australia, Canada, and the United States), and the implications of the simpler two-country model will be applied below to the more general framework where potential migrants decide not only whether or not to migrate, but also choose a country of destination.
Residents of the home country face an earnings (w) distribution given by:

$$\ln w_0 = X\delta_0 + \varepsilon_0, \qquad (1)$$

where X is a vector of socioeconomic characteristics with value $\delta_0$ in country 0, and the disturbance $\varepsilon_0$ is independent of X and is normally distributed with mean zero and variance $\sigma_0^2$.
The earnings distribution facing individuals in the United States is given by:
$$\ln w_1 = (1 - M)\,X\delta_n + M\,X\delta_1 + \varepsilon_1, \qquad (2)$$

where M is a dummy variable indicating whether the individual is foreign-born (M = 1) or native (M = 0). The vector $\delta_n$ gives the value that the U.S. labor market attaches to the socioeconomic characteristics X for natives. This valuation may differ, due to discrimination or other unobserved factors, from the value $\delta_1$ that the labor market attaches to the characteristics brought in by potential migrants. The disturbance $\varepsilon_1$ is again independent of X (and M) and is normally distributed with mean zero and variance $\sigma_1^2$. Finally, the random variables $\varepsilon_0$ and $\varepsilon_1$ have correlation coefficient $\rho$.

Equations (1) and (2) completely describe the earnings opportunities facing a potential migrant (as well as U.S. natives). Three questions are raised by this simple framework. First, what factors determine the size of the migration flow generated by the income-maximization hypothesis? Second, what types of selection in the unobserved characteristics $\varepsilon$ are created by the endogenous migration decision? Third, what types of selection in the observed characteristics X are created by the endogenous migration decision?

The migration decision is determined by the sign of the index function:

$$I = \ln\left[\frac{w_1}{w_0 + C}\right] \approx \left[X(\delta_1 - \delta_0) - \pi\right] + (\varepsilon_1 - \varepsilon_0), \qquad (3)$$

where C gives the level of mobility costs, and $\pi$ gives a "time-equivalent" measure ($\pi = C/w_0$) of the costs of migrating to the United States.

The level of migration costs C is likely to vary across individuals for two reasons. First, there are time costs associated with migration, and these time costs are likely to be higher for persons with higher opportunity costs. Second, there are transportation costs associated with migration, and these direct costs include not only the airfare (which is likely to be constant across individuals), but also moving expenses of family and household goods, and it is reasonable to suppose that these expenses may also be a positive function of w.
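The approximation behind the index function can be checked with a small numerical sketch. The function names and the wage and cost figures below are hypothetical, chosen only for illustration; the sketch compares the exact index ln[w1/(w0 + C)] with its first-order version ln w1 - ln w0 - pi, where pi = C/w0.

```python
import math


def migration_index_exact(w0, w1, cost):
    """Exact index: I = ln(w1 / (w0 + C)). Migration is profitable when I > 0."""
    return math.log(w1 / (w0 + cost))


def migration_index_approx(w0, w1, cost):
    """First-order version: ln w1 - ln w0 - pi, with pi = C / w0.

    Relies on ln(1 + pi) being close to pi when costs are small relative to w0.
    """
    pi = cost / w0  # "time-equivalent" mobility cost
    return math.log(w1) - math.log(w0) - pi


# Hypothetical numbers, for illustration only.
w0, w1, cost = 20_000.0, 26_000.0, 2_000.0
exact = migration_index_exact(w0, w1, cost)
approx = migration_index_approx(w0, w1, cost)
print(f"exact index  = {exact:.4f}")
print(f"approx index = {approx:.4f}")
print("migrate" if exact > 0 else "stay")
```

The two versions differ only by ln(1 + pi) - pi, so the approximation deteriorates as mobility costs become large relative to home-country earnings.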
These assumptions give little hint as to how the time-equivalent measure of mobility costs, $\pi$, varies across individuals. It is instructive to first assume that $\pi$ is constant across individuals since the main implications of the Roy model are clearest in this special case. The analysis below will show that the treatment of $\pi$ as a random variable will, in some instances, reinforce the conclusions of the simpler model.

Since migration to the United States occurs when I > 0, the emigration rate from the country of origin for persons of given characteristics X is given by:

$$P(X) = \Pr\left\{ v > -\left[X(\delta_1 - \delta_0) - \pi\right] \right\} = 1 - \Phi(z), \qquad (4)$$

where $v = \varepsilon_1 - \varepsilon_0$; $z = -\left[X(\delta_1 - \delta_0) - \pi\right]/\sigma_v$; and $\Phi$ is the standard normal distribution function. If the characteristics X have a joint density function given by f(x), then the emigration rate from country 0 is given by:

$$P = \int_{x \in \Omega} P(x)\, f(x)\, dx. \qquad (5)$$
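As an illustration of equations (4) and (5), the following sketch computes the emigration rate for a single scalar characteristic X and then averages it over a discrete distribution of X. The scalar X, the function names, and all parameter values are assumptions made for the example; the integral in (5) is replaced by a weighted sum.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sigma_v(sigma0, sigma1, rho):
    """Standard deviation of v = eps1 - eps0."""
    return math.sqrt(sigma0**2 + sigma1**2 - 2.0 * rho * sigma0 * sigma1)


def emigration_rate(x, delta0, delta1, pi, sigma0, sigma1, rho):
    """Equation (4): P(X) = 1 - Phi(z), z = -[X(delta1 - delta0) - pi] / sigma_v."""
    gain = x * (delta1 - delta0) - pi
    z = -gain / sigma_v(sigma0, sigma1, rho)
    return 1.0 - STD_NORMAL.cdf(z)


def aggregate_rate(xs, weights, **params):
    """Discrete analogue of equation (5): weighted average of P(x) over x."""
    total = sum(weights)
    return sum(w * emigration_rate(x, **params) for x, w in zip(xs, weights)) / total


# Hypothetical parameter values, for illustration only.
params = dict(delta0=0.05, delta1=0.08, pi=0.3, sigma0=0.5, sigma1=0.6, rho=0.5)
for x in (8, 12, 16):
    print(x, round(emigration_rate(x, **params), 3))
print("aggregate:", round(aggregate_rate([8, 12, 16], [0.5, 0.3, 0.2], **params), 3))
```

In this parameterization the rate rises with X because the assumed return to X is higher in the destination, and it falls as pi rises, in line with the predictions summarized in the text.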
Equations (4) and (5) summarize the (rather obvious) economic content of the theory of migration proposed by Hicks (1939) and further developed in Sjaastad (1962). In particular, the emigration rate is: (a) a negative function of mean income in the home country ($\mu_0 = X\delta_0$); (b) a positive function of mean income in the United States ($\mu_1 = X\delta_1$); and (c) a negative function of migration costs. Much of the literature on the internal migration of persons in the United States is devoted to testing these theoretical predictions (see the survey by Greenwood, 1975).

The immigration literature, on the other hand, has not historically focused on explaining the size of migration flows, but on explaining their composition or labor market quality. As far back as 1919, for example, Douglas was asking whether or not the skill composition of immigrant cohorts was constant across successive immigrant waves. The theory of migration implicit in equations (1)-(5) has important implications about the selection biases that characterize the pool of migrants both in terms of unobserved and observed characteristics.

Consider initially the selection mechanism in the unobserved characteristics $\varepsilon$. In particular, consider the conditional expectations $E(\ln w_0 \mid X, I > 0)$ and $E(\ln w_1 \mid X, I > 0)$. Note that these means condition on two dimensions: the observed characteristics X and the decision to migrate. Under the normality assumptions these conditional means are given by:

$$E(\ln w_0 \mid X, I > 0) = X\delta_0 + \frac{\sigma_0 \sigma_1}{\sigma_v}\left(\rho - \frac{\sigma_0}{\sigma_1}\right)\lambda, \qquad (6)$$

$$E(\ln w_1 \mid X, I > 0) = X\delta_1 + \frac{\sigma_0 \sigma_1}{\sigma_v}\left(\frac{\sigma_1}{\sigma_0} - \rho\right)\lambda, \qquad (7)$$

where $\lambda = \phi(z)/P(X)$; and $\phi$ is the density of the standard normal. The variable $\lambda$ is inversely related to the emigration rate and will be positive as long as some persons find it profitable to remain in the country of origin (i.e., P(X) < 1).
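The selection terms in equations (6) and (7) can be checked against a simple Monte Carlo experiment. The sketch below is illustrative only: the function names, the parameter values, and the use of a single number `gain` to stand for X(delta1 - delta0) - pi are assumptions of the example. It draws correlated normal disturbances, keeps the draws for which I > 0, and compares the average disturbances among migrants with the closed-form terms.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def predicted_selection(gain, sigma0, sigma1, rho):
    """Selection terms of equations (6) and (7): E(eps0 | I > 0) and E(eps1 | I > 0)."""
    sv = math.sqrt(sigma0**2 + sigma1**2 - 2.0 * rho * sigma0 * sigma1)
    z = -gain / sv
    lam = STD_NORMAL.pdf(z) / (1.0 - STD_NORMAL.cdf(z))  # lambda = phi(z) / P(X)
    scale = sigma0 * sigma1 / sv
    q0 = scale * (rho - sigma0 / sigma1) * lam
    q1 = scale * (sigma1 / sigma0 - rho) * lam
    return q0, q1


def simulated_selection(gain, sigma0, sigma1, rho, n=200_000, seed=0):
    """Average disturbances among simulated migrants (those with I > 0)."""
    rng = random.Random(seed)
    sum0 = sum1 = 0.0
    movers = 0
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        e0 = sigma0 * a
        e1 = sigma1 * (rho * a + math.sqrt(1.0 - rho**2) * b)  # corr(e0, e1) = rho
        if gain + (e1 - e0) > 0:  # index function I > 0
            sum0 += e0
            sum1 += e1
            movers += 1
    return sum0 / movers, sum1 / movers


# Hypothetical parameter values, for illustration only.
args = dict(gain=-0.2, sigma0=0.5, sigma1=0.8, rho=0.7)
print("predicted:", predicted_selection(**args))
print("simulated:", simulated_selection(**args))
```

With these illustrative values income is more dispersed in the destination and the correlation is high, so both terms come out positive, which is the positive-selection configuration.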
Let $Q_0 = E(\varepsilon_0 \mid X, I > 0)$, $Q_1 = E(\varepsilon_1 \mid X, I > 0)$, and $k = \sigma_1/\sigma_0$. The variables $Q_0$ and $Q_1$ measure the "quality" (in terms of unobserved characteristics) of the migrant pool. The Roy model identifies three cases of substantive interest.

A. Positive Selection: $Q_0 > 0$ and $Q_1 > 0$. This type of selection exists when migrants have above average earnings in the country of origin (for given characteristics X), and also have U.S. earnings which exceed the earnings of comparable U.S. natives (ignoring the possibility that immigrant earnings may be reduced because of their ethnic or racial background). Inspection of equations (6) and (7) shows that the necessary and sufficient conditions for this type of selection to occur are:

$$\rho > \min(1/k,\ k) \quad \text{and} \quad k > 1. \qquad (8)$$

If $\rho$ is sufficiently high and if income is more dispersed in the U.S. than in the country of origin, immigrants arriving in the U.S. will be selected from the upper tail of the home country's income distribution, and will outperform comparable natives upon arrival in the U.S. Intuitively, this occurs because the home country, in a sense, is "taxing" high-ability workers and "insuring" low-ability workers against poor labor market outcomes. Since high income workers benefit relatively more than low income workers from migration to the United States (regardless of how much higher mean incomes in the United States may be relative to the country of origin), a brain drain is generated and the United States, with its greater opportunities, becomes a magnet for persons who are likely to do well in the labor market.
B. Negative Selection: $Q_0 < 0$ and $Q_1 < 0$. This type of selection is defined to exist when the United States draws persons who have below average incomes in the country of origin, and who, holding characteristics constant, do poorly in the U.S. labor market. The necessary and sufficient conditions for negative selection to occur are:

$$\rho > \min(1/k,\ k) \quad \text{and} \quad k < 1. \qquad (9)$$

Negative selection also requires that $\rho$ be "sufficiently" positive, but that the income distribution in the country of origin be more unequal than that in the U.S. Intuitively, negative selection is generated when the United States "taxes" high-income workers relatively more than the country of origin, and provides better insurance for low-income workers against poor labor market outcomes. This opportunity set leads to large incentives for low-ability persons to migrate since they can improve their situation in the United States, and to decreased incentives for high-ability persons to migrate since income opportunities in the home country are more profitable.

C. Refugee Sorting: $Q_0 < 0$ and $Q_1 > 0$. This kind of selection occurs when the U.S. draws below-average immigrants (in terms of the country of origin), but migrants have above-average earnings in the United States labor market. The necessary and sufficient condition is:

$$\rho < \min(1/k,\ k). \qquad (10)$$
In other words, if $\rho$ is negative or "small", the composition of the migrant pool is likely to resemble a refugee population. For instance, it is likely that $\rho$ is negative for countries that have recently experienced a Communist takeover. After all, the change from a market economy to a Communist system is often accompanied by structural changes in the income distribution, and by confiscation of entrepreneurial assets and redistribution to other persons. The Roy model suggests that immigrants from such systems will be in the lower tail of the "revolutionary" income distribution, but will outperform the average U.S. native worker.

The basic Roy model thus provides a useful categorization of the factors that determine the quality or composition (in terms of unobserved characteristics) of the migrant pool. Even at this level, several important implications are generated which give some insight into a number of empirical findings in the literature. For example, many studies have documented the fact that refugee populations perform quite well in the U.S. labor market when compared to native workers of similar socioeconomic characteristics. These empirical results are explained by the income-maximization hypothesis and by the fact that these refugee populations, prior to the political changes which led to a worsening of their economic status, were relatively well off in the country of origin. It is, therefore, unnecessary to resort to the arbitrary distinctions between "economic" and "non-economic" migrants to explain the refugee experience.

The Roy model also provides an interesting explanation for the empirical finding that the quality of migrants to the United States has declined in the postwar period (where quality is defined by the wage differential between migrants and natives of the same measured skills). Prior to the 1965 Amendments to the Immigration and Nationality Act, immigration to the United States was regulated by numerical quotas. The distribution of the fixed number of quotas across countries was based on the ethnic population of the United States in 1919 and thus encouraged migration from (some) Western European countries, and strongly discouraged immigration from other continents, particularly Asia. The favored countries have one important characteristic: their income distributions are probably much less dispersed than those of countries in Latin America or Asia. The 1965 Amendments abolished the discriminatory restrictions against immigration from non-European countries, established a 20,000 numerical limit for legal migration from any single country (subject to both Hemispheric and worldwide numerical limitations), and led to a substantial increase in the number of migrants from Asia and Latin America. The new flow of migrants thus originates in countries that are much more likely to have greater income inequality than the United States.5 It would not be surprising, therefore, if the quality of immigrants declined as a result of the 1965 Amendments.

In addition, the 1965 Amendments led to a fundamental shift in the mechanism by which visas were allocated among potential migrants. In particular, the role played by observable skills and occupational characteristics was deemphasized, and the role played by family relationships with relatives currently in the United States became the primary focus of the policy.
In 1985, for example, nearly 70 percent of all (legal) migrants entered the United States through one of the kinship provisions in the immigration law. The Roy model suggests that this change in the statutes will lead to a substantial decline in immigrant quality. In particular, the family of the migrant that resides in the United States provides a "safety net" that insures the immigrants against poor labor market outcomes and unemployment periods in the months after immigration. Low-ability persons who could not migrate without family connections in the United States, and hence without that insurance, will now find it worthwhile to do so. In effect, the kinship regulations in the immigration law create a lower bound in the income distribution that low-skilled immigrants face in the United States, and hence make it more likely that migrants are negatively selected from the population.
The theoretical analysis yields two equations that can guide empirical analysis. These equations are given by:

$$Q_1 = g(\mu_0, \mu_1, \sigma_0, \sigma_1, \pi, \rho), \qquad (11)$$

$$Q_1 = h(\sigma_0, \sigma_1, \rho)\,\lambda. \qquad (12)$$

Equation (11) gives a "reduced form" equation, where immigrant quality in the United States (i.e., the wage differential between migrants and natives of equal measured skills) is a function of all the primitive parameters of the model (i.e., the parameters of the two income distributions and migration costs). My earlier paper (Borjas, 1987) provides a detailed analysis of the theoretical restrictions implied by the income maximization hypothesis on the direction of the effects of the various variables in the model. These effects are usually ambiguous and can be categorized in terms of "composition effects" and "scale effects". In particular, a change in any variable will create incentives for a different type of person to migrate (the composition effect), and for a different number of persons to migrate (the scale effect).

Equation (12) is a "structural" equation and states that if knowledge of $\lambda$ is available, a subset of the parameters of the model enter multiplicatively through the h function (see equation (7)). By holding $\lambda$ constant, the structural equation essentially nets out the scale effect, and leads to more unambiguous predictions of the impact of the exogenous variables on the quality of immigrants. It is important to note that the h function in (12) does not depend on mean income levels in the countries of origin and the country of destination, or on the level of migration costs, since these factors only play a role through the selectivity variable $\lambda$.

Three comparative statics results are implied by analysis of the $\lambda$-constant structural quality equation:
1. An increase in the variance of the income distribution in the home country leads to a decrease in the quality of migrants in the United States.

2. An increase in the variance of the income distribution in the United States leads to an increase in the quality of migrants in the United States.6

3. An increase in the correlation coefficient between earnings in the home country and earnings in the United States increases immigrant quality if there is positive selection, and decreases immigrant quality if there is negative selection. The ambiguity arises because the larger the correlation coefficient, the better the "match" between the two countries. The improvement in the match increases the quality of the immigrant flow if there is positive selection, and decreases it if there is negative selection.

2.2. Random Mobility Costs

These insights have been derived from the simplest version of the Roy model that treats mobility costs (defined as a fraction of potential income in the country of origin) as a constant in the population. This assumption may be restrictive, and it is important to ascertain how its relaxation affects the results of the model. Suppose that mobility costs are normally distributed in the population and can be written as:

$$\pi = \mu_\pi + \varepsilon_\pi, \qquad (13)$$

where $\mu_\pi$ is the mean level of mobility costs in the population, and $\varepsilon_\pi$ is a normally distributed random variable with mean zero and variance $\sigma_\pi^2$. The random variable $\varepsilon_\pi$ may be correlated with $\varepsilon_0$ and $\varepsilon_1$, and the correlation coefficients are given by $\rho_{\pi 0}$ and $\rho_{\pi 1}$ respectively.
The conditional expectations of migrant incomes in the home and destination countries are now given by:

$$E(\ln w_0 \mid X, I > 0) = X\delta_0 + \frac{\sigma_0 \sigma_1}{\sigma_{v'}}\left[\left(\rho - \frac{\sigma_0}{\sigma_1}\right) - \rho_{\pi 0}\frac{\sigma_\pi}{\sigma_1}\right]\lambda, \qquad (14)$$

$$E(\ln w_1 \mid X, I > 0) = X\delta_1 + \frac{\sigma_0 \sigma_1}{\sigma_{v'}}\left[\left(\frac{\sigma_1}{\sigma_0} - \rho\right) - \rho_{\pi 1}\frac{\sigma_\pi}{\sigma_0}\right]\lambda, \qquad (15)$$

where $v' = \varepsilon_1 - \varepsilon_0 - \varepsilon_\pi$. Equations (14) and (15) show that the addition of variable migration costs does not affect any of the substantive results of the simplest version of the Roy model if migration costs are uncorrelated with earnings opportunities.

However, if migration costs are correlated with earnings opportunities the type of selection that is generated may change in either direction. Suppose, for example, that migration costs are positively correlated with earnings opportunities. For instance, high ability persons may take longer to find appropriate jobs. This positive correlation makes both $Q_0$ and $Q_1$ more negative, and hence increases the likelihood of negative selection. Conversely, if migration costs (measured in time units) and earnings opportunities are negatively correlated, the likelihood of positive selection is increased.

Two additional points about this more general model are worth stressing. First, the importance of variable migration costs in the analysis will diminish greatly if the variance in migration costs is relatively small compared to the variance in the income distributions. Second, regardless of how important migration costs are, the key result that negative selection is more likely from countries with high levels of income inequality and positive selection is more likely from countries with more equal income distributions is unaffected.

2.3. Selection in Observed Characteristics

Equation (4), the probit equation determining the migration rate, contains an additional insight: the migration rate is a function of X through the parameter $(\delta_1 - \delta_0)$. Hence, the migration of persons with larger levels of X is more likely if X has a higher return in the United States than in the country of origin, and the migration of persons with lower levels of X is more likely if the country of origin values the characteristic X more than the United States.
A complementary analysis to the Roy model can be derived if it is assumed that the vector X consists of only one variable, say education (s), that this variable is uncorrelated with the disturbances in the earnings functions, and that this variable, too, is normally distributed in the population. The assumption of only one variable in the vector X is irrelevant, since the results can be easily generalized to any number of variables. The assumption of normality, though unrealistic for some socioeconomic characteristics, does simplify the mathematics substantially and allows a useful extension of the Roy approach to the determination of the observed income distribution.
Suppose the earnings functions in the two countries are given by:
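$$\ln w_0 = \mu_0 + \delta_0 s + \varepsilon_0, \qquad (16)$$

$$\ln w_1 = \mu_1 + \delta_1 s + \varepsilon_1, \qquad (17)$$

and that the education distribution in the population of the country of origin can be written as:

$$s = \mu_s + \varepsilon_s, \qquad (18)$$

where $\varepsilon_s$ is normally distributed with mean zero and variance $\sigma_s^2$. Assuming that mobility costs are constant, the emigration rate for the population in the country of origin is given by:

$$P = \Pr\left\{ (\varepsilon_1 - \varepsilon_0) + (\delta_1 - \delta_0)\varepsilon_s > -\left[(\mu_1 - \mu_0) + (\delta_1 - \delta_0)\mu_s - \pi\right] \right\} = 1 - \Phi(z^*), \qquad (19)$$

where $t = (\varepsilon_1 - \varepsilon_0) + (\delta_1 - \delta_0)\varepsilon_s$, and $z^* = -\left[(\mu_1 - \mu_0) + (\delta_1 - \delta_0)\mu_s - \pi\right]/\sigma_t$.

Two interesting questions can be addressed within this framework. First, consider the conditional expectation of schooling of persons who do migrate. It is easy to show that:

$$E(s \mid I > 0) = \mu_s + \frac{\sigma_s^2}{\sigma_t}(\delta_1 - \delta_0)\lambda, \qquad (20)$$

where $\lambda$ is now evaluated at $z^*$. Hence the mean schooling level of migrants will be less than or greater than the mean schooling level of the population depending on which of the two countries values schooling more. Positive selection in schooling will be observed when $(\delta_1 - \delta_0) > 0$, so that the U.S. labor market attaches a higher value to schooling, while negative selection in schooling will be observed when the higher value is attached by the country of origin.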
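Equation (20) can be illustrated with a short simulation of the one-variable model in (16)-(19). The sketch below is an assumption-laden example rather than part of the analysis: the function names, the parameter values, and the sample size are hypothetical. It compares the closed-form mean schooling of migrants with the average schooling of simulated persons for whom I > 0.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def predicted_migrant_schooling(mu0, mu1, delta0, delta1, mu_s, sigma_s,
                                pi, sigma0, sigma1, rho):
    """Equation (20): E(s | I > 0) = mu_s + (sigma_s^2 / sigma_t)(delta1 - delta0) lambda."""
    var_v = sigma0**2 + sigma1**2 - 2.0 * rho * sigma0 * sigma1
    sigma_t = math.sqrt(var_v + (delta1 - delta0) ** 2 * sigma_s**2)
    z_star = -((mu1 - mu0) + (delta1 - delta0) * mu_s - pi) / sigma_t
    lam = STD_NORMAL.pdf(z_star) / (1.0 - STD_NORMAL.cdf(z_star))
    return mu_s + (sigma_s**2 / sigma_t) * (delta1 - delta0) * lam


def simulated_migrant_schooling(mu0, mu1, delta0, delta1, mu_s, sigma_s,
                                pi, sigma0, sigma1, rho, n=200_000, seed=1):
    """Mean schooling among simulated persons whose migration index is positive."""
    rng = random.Random(seed)
    total, movers = 0.0, 0
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        e0 = sigma0 * a
        e1 = sigma1 * (rho * a + math.sqrt(1.0 - rho**2) * b)
        s = rng.gauss(mu_s, sigma_s)  # schooling drawn independently of e0, e1
        index = (mu1 + delta1 * s + e1) - (mu0 + delta0 * s + e0) - pi
        if index > 0:
            total += s
            movers += 1
    return total / movers


# Hypothetical parameter values, for illustration only.
base = dict(mu0=9.5, mu1=9.7, delta0=0.05, delta1=0.08, mu_s=10.0, sigma_s=3.0,
            pi=0.6, sigma0=0.5, sigma1=0.6, rho=0.5)
print("predicted:", round(predicted_migrant_schooling(**base), 3))
print("simulated:", round(simulated_migrant_schooling(**base), 3))
```

Swapping the values of `delta0` and `delta1` in `base` reverses the sign of the selection term, so that migrants have below-average schooling, as the discussion of equation (20) implies.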
It is important to stress that these selection conditions have nothing to do with the conditions determining selection in unobserved characteristics. Any permutation of selection mechanisms in unobserved and observed characteristics is theoretically possible. Hence negative selection in unobserved characteristics (or ability) may be jointly occurring with positive selection in education, or vice versa. Simply because the United States attracts highly educated persons from some countries does not imply that these highly educated persons are the most productive highly educated persons in that particular country of origin.7
This important insight implies that little can be learned from comparisons of the unstandardized earnings of migrants and natives. The actual mean earnings of the migrant pool in each of the two countries are given by:

$$E(\ln w_0 \mid I > 0) = \mu_0 + \delta_0\mu_s + \frac{\sigma_s^2}{\sigma_t}(\delta_1 - \delta_0)\delta_0\lambda + \frac{\sigma_0 \sigma_1}{\sigma_t}\left(\rho - \frac{\sigma_0}{\sigma_1}\right)\lambda, \qquad (21)$$

$$E(\ln w_1 \mid I > 0) = \mu_1 + \delta_1\mu_s + \frac{\sigma_s^2}{\sigma_t}(\delta_1 - \delta_0)\delta_1\lambda + \frac{\sigma_0 \sigma_1}{\sigma_t}\left(\frac{\sigma_1}{\sigma_0} - \rho\right)\lambda. \qquad (22)$$

Mean earnings of migrants depend on the mean education of migrants, as given by (20), and on the mean level of their unobserved characteristics. Since the two kinds of selections are independent, nothing can be said about how the average migrant performs in the host country unless the kinds of selections that occurred in each of the two dimensions of quality are known. This result suggests that generalizations about the quality of immigrants based solely on observed education levels (or other measures of X) are extremely misleading.

In addition, it is well known that observed characteristics such as education, age, marital status, health, etc., explain a relatively small fraction of earnings variation across individuals. It is not uncommon, for example, to find that the observed characteristics explain much less than a third of the variance in wage rates or weekly earnings. The selection in unobserved characteristics, therefore, is empirically much more important than the selection in observed characteristics.

A number of comparative statics results can be generated by analysis of equation (20). Perhaps the most interesting of these results is:

$$0 < \frac{\partial E(s \mid I > 0)}{\partial \mu_s} < 1. \qquad (23)$$
That
is, a oneyear
increasein
themean
education levelof
the countcyof origin will
increase themean
education levelof
persons who actuallymigrate to the United
States,but this
increasewill be by
lessthan
one year. The intuition for this result followsfrom
the fact thatan
increasein
(Iwill
change thesize
of the immigrant flow. Suppose, for concreteness,that (31—30) > C so
that
thereis
positive selectionin
education. The inccease in p makes it worthwhile formore
persons to migrate, and thus dilutes themean
education levelof the
populationof
migrants.Hence
the increasein the
conditional expectation is lessthan
the increasein
thepopulation
mean.An
important implicationof
this theoretical prediction is thatthe
variance in education levels across irmnigrants (from different countries)in
theUnited
States willbe
smaller than the variancein
education levelsof
the actual populations across countries in the world. In other words, thepopulation
of
migrants in theUnited
States ismore
homogeneous (in termsof
education)than
the populationsof
the different sending countries.In general, equation (20) implies the existence
of
observablequality
equations analogousto
(11) and (12)(24)
=
h
(C
,C ,p,3.t,C
,
where Q gives the mean level of the observed characteristics of immigrants in the U.S. The estimation of (24) and (25), of course, is likely to be extremely difficult in practice since they introduce a number of primitive parameters that are unobservable and likely to remain so.

3. Empirical Framework

Recent empirical research on the earnings of immigrants stresses the importance of disentangling the cohort and aging effects that are confounded by a single cross-section of data. In the analysis presented in this paper, two Censuses in the country of destination will be pooled (e.g., the 1970 and 1980 U.S. Censuses), and the
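following regression model will be estimated:

    $\ln w_{ij} = X_j \delta_i + \alpha_1 y_j + \alpha_2 y_j^2 + C_j \beta + \gamma_i \pi_j + \epsilon_{ij}$,        (26)

    $\ln w_{n\ell} = X_\ell \delta_n + \gamma_n \pi_\ell + \epsilon_{n\ell}$,        (27)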
where $w_{ij}$ is the wage rate of immigrant j; $w_{n\ell}$ is the wage rate of native person $\ell$; X is a vector of socioeconomic characteristics (e.g., education, age, etc.); y is a variable measuring the number of years that the immigrant has resided in the country of destination; C is a vector of dummy variables indicating the year in which migration occurred; and $\pi$ is a dummy variable set to unity if the observation is drawn from the 1980 Census, and zero otherwise. The vector of parameters $(\alpha_1, \alpha_2)$, along with the age coefficients in the vector X, provides a measure of the assimilation effect (i.e., the rate at which the age/earnings profile of migrants is converging to the age/earnings profile of natives), while the vector of parameters $\beta$ estimates the cohort effects. The period effects are given by $\gamma_i$ for immigrants and by $\gamma_n$ for natives.

The
model in equations (26) and (27) is underidentified. In particular, some of the right-hand side variables in the immigrant earnings function are perfectly collinear. Suppose, for example, that the immigrant arrived in calendar year $\theta$ so that $C_\theta = 1$. Then:

    $y = (T - k - \theta) + \pi k$,        (28)

where
T is the calendar year in which the latest cross-section is observed and k is the number of years separating the two cross-sections. The variable capturing the period effect, therefore, is a linear combination of the cohort variable and of the years-since-migration variable. Obviously, two cross-sections cannot be used to identify three separate effects: period, cohort, and aging effects.

In order to estimate the structural parameters describing the extent of immigrant assimilation and cohort quality change, a restriction must be imposed on the size of the period effect in the migrant population. A reasonable, though unverifiable, assumption is that the period effect experienced by immigrants ($\gamma_i$) is identical to the period effect experienced by natives ($\gamma_n$). In other words, changes in the wage rate due to shifts in aggregate economic conditions affect the immigrant and native wage levels by the same relative magnitude.
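The identification problem behind (28) can be checked on a toy design matrix. The sketch below uses made-up arrival years together with the 1970 and 1980 Census years (so T = 1980 and k = 10); the helper matrix_rank and the observation list are hypothetical and serve only to illustrate the collinearity.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


T, K = 1980, 10                     # latest Census year and gap between Censuses
ARRIVAL_YEARS = [1955, 1965, 1975]  # hypothetical arrival cohorts

# Each observation: (arrival year theta, period dummy pi); pi = 1 for the 1980 Census.
observations = [(1955, 0), (1955, 1), (1965, 0), (1965, 1), (1975, 1)]


def design_row(theta, pi, include_years=True):
    """Cohort dummies, period dummy, and optionally years since migration."""
    row = [1 if theta == year else 0 for year in ARRIVAL_YEARS] + [pi]
    if include_years:
        row.append((T - K - theta) + pi * K)  # years since migration, as in (28)
    return row


without_y = [design_row(theta, pi, include_years=False) for theta, pi in observations]
with_y = [design_row(theta, pi) for theta, pi in observations]

rank_without_y = matrix_rank(without_y)
rank_with_y = matrix_rank(with_y)
print(rank_without_y, rank_with_y)
```

In this toy design the rank is the same with and without the years-since-migration column, even though that column adds a fifth regressor: it carries no information beyond the cohort dummies and the period dummy, so a restriction on the period effect is needed.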
It is easy to show that this restriction is sufficient to exactly identify all the structural parameters in equations (26) and (27). This theoretical restriction leaves some latitude for its empirical implementation since the choice of the native base is essentially arbitrary. The choice of a native base for the various immigrant groups under study will be discussed in detail below.

There are two dimensions of migrant quality that can be calculated from the estimated regressions in (26) and (27): (a) the entry wage of immigrants when they arrive in the United States; and (b) the rate at which this wage changes over time. To simplify the empirical analysis the two measures will be combined into a single measure of immigrant quality. In
particular, let $w_i(\theta)$ be the entry wage of an immigrant cohort that arrives in the United States at age 20 in calendar year $\theta$, and let $w_n$ be the entry wage of a comparable (in terms of all observable economic variables) native person that enters the labor market at age 20. Similarly, let $g_i$ be the rate at which the earnings of immigrants grow over their lifetime, and $g_n$ be the growth rate for natives. Finally, let r be the rate of discount (assumed to be the same for migrants and natives). If persons are infinitely lived, the present values associated with the earnings streams of migrants and natives are given by:

    $V_i(\theta) = \int_0^\infty w_i(\theta) e^{-(r - g_i)t} \, dt = w_i(\theta) / (r - g_i)$,        (29)

    $V_n = \int_0^\infty w_n e^{-(r - g_n)t} \, dt = w_n / (r - g_n)$.        (30)
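The closed forms in (29) and (30) can be checked numerically. The sketch below compares w/(r - g) with a trapezoid-rule approximation of the integral over a long finite horizon; the function names, entry wages, and growth rates are hypothetical, and the discount rate is chosen purely for illustration.

```python
import math


def present_value(entry_wage, discount_rate, growth_rate):
    """Closed form w / (r - g) for an infinitely lived earner; requires g < r."""
    if growth_rate >= discount_rate:
        raise ValueError("the integral diverges unless g < r")
    return entry_wage / (discount_rate - growth_rate)


def present_value_numeric(entry_wage, discount_rate, growth_rate,
                          horizon=2000.0, steps=200_000):
    """Trapezoid-rule approximation of the integral of w * exp(-(r - g) t) on [0, horizon]."""
    dt = horizon / steps
    decay = discount_rate - growth_rate
    total = 0.5 * (1.0 + math.exp(-decay * horizon))  # endpoint terms
    for i in range(1, steps):
        total += math.exp(-decay * i * dt)
    return entry_wage * total * dt


# Hypothetical inputs: immigrants start at a lower wage but grow faster than natives.
R = 0.05
v_immigrant = present_value(8.0, R, 0.015)
v_native = present_value(10.0, R, 0.010)
log_gap = math.log(v_immigrant / v_native)
print(v_immigrant, v_native, log_gap)
```

In this made-up case the faster earnings growth of immigrants narrows, but does not close, the gap created by the lower entry wage, so the log ratio of present values is negative.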
The percentage difference in present values between immigrants of cohort $\theta$ and natives is defined by:

    $\ln(V_i(\theta)/V_n) = (\ln w_i(\theta) - \ln w_n) - \ln(r - g_i) + \ln(r - g_n)$,        (31)

and a first-order approximation (using the assumption that earnings growth rates are small relative to the discount rate) yields:

    $\ln(V_i(\theta)/V_n) \approx (\ln w_i(\theta) - \ln w_n) + (g_i - g_n)/r$.        (32)

Hence the percentage difference in the present value of the earnings streams faced by immigrants and natives is an additive function of the wage differential at the time of entry, and of the difference in earnings growth rates over the life cycle.8

The
present value differential in (32) can be easily evaluated from the estimates of equations (26) and (27) if two assumptions are made. First, the rate of discount is assumed to be 5 percent. Clearly, the assumption of any higher discount rate would lead to a worsening of relative immigrant earnings since the latter part of the working life cycle (where immigrants tend to do better) would be more heavily discounted. Second, the growth rates $g_i$ and $g_n$ must be evaluated from the age and years-since-migration coefficients in the earnings functions in (26) and (27). The quadratic specification for age and years-since-migration in the earnings functions implies that the growth rate is not constant over time. The empirical analysis below will define the growth rates $g_i$ and $g_n$ by:

    $g_i = [ Y_i(X, 50, 30, \theta) - Y_i(X, 20, 0, \theta) ] / 30$,        (33)

    $g_n = [ Y_n(X, 50) - Y_n(X, 20) ] / 30$,        (34)

where
$Y_i(X, A, y, \theta)$ is the predicted (ln) earnings for an immigrant with characteristics X, at age A, with y years of residence in the United States, and who migrated in cohort $\theta$. Similarly, $Y_n(X, A)$ gives the predicted earnings for a native with characteristics X at age A. In other words, the average growth rate experienced by immigrants and natives between ages 20 and 50 (evaluated at the mean characteristics of the migrant population, X) is used for estimation of the growth rate in the present value expressions.

This approach has the useful property that the growth rates (for both immigrants and natives) are basically a linear function of regression coefficients, and since the entry wages are given by $Y_i(X, 20, 0, \theta)$ for immigrants and $Y_n(X, 20)$ for natives, the present value expressions in (33) and (34) are also linear functions of regression coefficients and hence a standard error can be easily evaluated.

This
approach marks a rather important departure from the empirical tradition in the literature that analyzes immigrant earnings. The entire literature essentially focuses on the estimation of entry wage levels, and on the calculation of "overtaking" points (if they exist). This type of analysis is basically irrelevant if overtaking points occur rather late in the life cycle (or if they do not occur at all), as some recent evidence suggests. The empirical use of the present value of earnings is much more consistent with the theoretical content of the theory of migration and deemphasizes the somewhat misleading concept of overtaking points. The analysis of the success of migrant groups in the United States, to borrow from the human capital theory that guided much early research on immigrant earnings, should not be based on the calculation of wage differentials at given ages, but on the life cycle wealth accumulated by migrants and natives. Hence the present value approach used in the empirical sections of this paper is much more in the tradition of the human capital literature and of the Roy model of immigration developed in the previous section.

4. Earnings of Immigrants in the United States

4.1. Data and Descriptive Statistics

This section analyzes the relative earnings
of immigrants in the U.S. labor market. The data are drawn from the 1970 2/100 U.S. Census (obtained by pooling the 5% SMSA and County Group Sample and the 5% State Sample) and the 1980 5/100 A Sample. The complete samples are used in the creation of the immigrant extracts, but random samples are drawn for the native "baseline" populations.9 The analysis is restricted to men aged 25-64 who satisfied five sample selection rules: (1) the individual was employed in the calendar year prior to the Census; (2) the individual was not self-employed or working without pay; (3) the individual was not in the Armed Forces (as of the survey week); (4) the individual did not reside in group quarters; and (5) the individual reported annual earnings exceeding $1000. Throughout this section, the dependent variable is the logarithm of the individual's wage rate in the calendar year
prior to the Census.

Forty-one countries were chosen for analysis. These countries were selected on the basis that both the 1970 and 1980 Censuses contained a substantial number of migrants from that country. In particular, it is necessary to have at least 80 observations of persons born in a particular foreign country in the pooled 2/100 1970 Census to enter the sample of 41 countries. The countries thus chosen account for over 90 percent of all immigration to the United States between 1951 and 1980. It must be noted, however, that this restriction omits some countries which during the late 1970s became important source countries (e.g., Vietnam). Since two Censuses are required for the complete identification of the parameters of the model presented in Section 3, however, a systematic analysis of the relative earnings of these migrants will have to await the 1990 Census.

Table 1 begins the empirical analysis by presenting the unstandardized differential between the log wage rate of the various migrant groups and "natives".
In these statistics, the native population is defined as the group of U.S.-born white, non-Hispanic, non-Asian men aged 25-64. Perhaps the most striking finding in the table is the fact that migrants from European countries tend to have wage rates that often exceed the wages of white natives, while migrants from Asian or Latin American countries tend to have wage rates that are substantially below those of white natives.

Table 1 also presents the relative earnings of the 1965-1969 cohort of migrants as of 1970, the relative earnings of the same cohort in 1980, and the relative earnings of the 1975-1979 cohort as of 1980. These statistics yield important insights into the process of assimilation (the rate at which the earnings of migrants and natives are converging) and into the extent of productivity differences across successive cohorts. The "tracking" of the 1965-1969 cohort across Censuses shows that the relative earnings of this cohort of migrants improved over time for most national groups. At the same time, the comparison of successive immigrant cohorts (i.e., the comparison of the 1965-1969 cohort as of 1969 and the 1975-1979 cohort as of 1979) shows that for some countries the relative earnings of migrants increased, while for other countries the relative earnings of migrants decreased substantially.
For example, the most recent migrant from France in 1970 was earning about 8 percent less than natives at the time of entry, while the most recent migrant from France in 1980 was earning about 22 percent more than natives at the time of entry. Conversely, the most recent migrant from India in 1970 earned about 4 percent more than white natives at the time of entry, but the most recent migrant from India in 1980 was earning 21 percent less than white natives at the time of entry.

Table 2 continues the descriptive analysis by presenting the mean (completed) education level of four different cohorts of immigrants that arrived in the 1960-1980 period. Since the education data available in the Census does not differentiate between education obtained prior to immigration and education obtained in the United States after immigration, the mean education levels for the 1970-74 and 1975-79 cohorts are obtained from the 1980 Census, and the mean education levels for the 1960-64 and 1965-69 cohorts are obtained from the 1970 Census. This use of the available data is designed to minimize the contamination of the education variable by post-migration schooling.

The statistics
in Table 2 are consistent with the well-known secular increase in education levels over time for practically all migrant cohorts. It is worth noting, however, that for some countries the increase in education has been quite small (e.g., Portugal), while for other countries (e.g., Norway) it has been amazingly large. As the theoretical analysis in Section 2 shows, these truncated education means can only be understood in terms of the population means of the education distribution in the countries of origin.

To provide some insights into the extent of self-selection on the basis of education, Table 2 also presents mean education levels calculated for the population in the countries of origin. The mean education level for the 1960s is calculated using enrollment data in the various countries of origin during the 1950s, while the mean education level for the 1970s is calculated using enrollment data in the various countries of origin during the 1960s. The "lagged" construction of the variable giving mean education levels in the country of origin is designed to account for the fact that, in the samples used here, the average person migrated at age 20. The relevant education distribution, therefore, is given by that of persons enrolled in school a few years earlier.10
The means in Table 2 present a remarkable picture. Even after allowing for the substantial errors involved in calculating the population means for each country of origin, the truncated means are almost always much greater than the population means. For example, the mean of education in Haiti is about 3 years, but the most recent Haitian immigrants report 10 years of education in the 1980 Census. Surprisingly, the two statistics are most similar for Mexico, where both immigrants and the Mexican population have 6-7 years of education. Overall, Table 2 suggests that immigrants are positively selected on the basis of education. The model presented earlier implies that this result is consistent with the hypothesis that the "rate of return" to education is greater in the United States than in most countries of origin.11

4.2. Basic Regression Results

The regression model in equations (26) and (27) was estimated on each of the 41 countries under analysis using the pooled 1970 and 1980 Census data. As noted earlier, the choice of the native baseline is an important step in the estimation procedure. In this section, the reference group is chosen according to the race/ethnic background of the population of each country of origin.
The estimation uses the white, non-Hispanic, non-Asian sample of native men as the reference group for migrants from Europe, Canada, and the Middle East. The group of Asian natives is the reference group for migrants from all other Asian countries. The group of Mexican natives is the reference group for Mexican migrants, and the group of "other Hispanic" men is the reference group for persons from all other Spanish-speaking countries in the American continent. Finally, the group of black natives is the reference group for migrants from countries with predominantly black populations (i.e., Haiti, Jamaica, and Trinidad and Tobago).

The definition of the reference group in terms of the racial/ethnic background of the immigrant population is a simple way of specifying different period effects for the various immigrant groups. Presumably, the impact of changes in aggregate economic conditions on immigrant earnings is likely to be better approximated by the period effects experienced by populations