Endogenous Human Capital Accumulation, Comparative Advantage and Direct vs. Indirect Redistribution


Discussion Paper No. 590

Hisahiro Naito

July 2003

The Institute of Social and Economic Research Osaka University


Hisahiro Naito ∗†

Institute of Social and Economic Research, Osaka University

and

Department of Economics, University of California, Irvine

Previous version: August 30, 2002

Current version: June 7, 2003

Forthcoming from Journal of Public Economics

Abstract

Recently, several papers have re-examined the so-called production efficiency theorem and the Atkinson and Stiglitz theorem on commodity taxes in the optimal taxation literature. Naito (1999) showed that indirect redistribution through production distortion or consumption distortion can Pareto-improve welfare and that the two theorems do not necessarily hold when different factors are imperfect substitutes and factor prices are endogenous. On the other hand, Saez (2001) argued that in the long run where human capital accumulation is endogenous, the two theorems are still valid. This paper develops reasonable alternative models where individuals accumulate human capital based on their comparative advantage. The present paper shows that the production efficiency theorem is not necessarily valid and that indirect redistribution from the able to the less able such as tariffs and production subsidies can increase efficiency even when skill accumulation is endogenous.

Keywords: Human capital accumulation, non-linear income taxation, and comparative advantage JEL Number: H21, H23

∗ I appreciate the participants of the Public Economics Research Group around Osaka and Daiji Kawaguchi for his helpful information on the relationship between earnings and ability. I am very grateful for the comments from two anonymous referees and Professor Robin Boadway, the editor of the Journal. Their comments have been very helpful for improving this paper. Of course, the author is responsible for all remaining errors.

† Address: Institute of Social and Economic Research, Osaka University, Mihogaoka 6-1, Ibaraki City, Osaka, Japan, postal code 567-0047; phone: 81-6-6879-8581; fax: 81-6-6878-2766.


1 Introduction

Whether efficient income redistribution should be done through income taxation alone or should

be complemented with other measures such as production distortion or consumption distortion

is one of the key issues whenever optimal public policies are discussed. In this regard, the production efficiency theorem (Diamond and Mirrlees, 1971), which states that production distortion is not optimal, and the Atkinson and Stiglitz theorem on optimal commodity taxation (Atkinson and Stiglitz, 1976, 1980), which shows that commodity taxation is not necessary in the presence of an optimal income tax system, are the most important results in the public finance literature.

However, researchers in public finance have recently started re-examining those results. For example,

Cremer, Pestieau and Rochet (2001) showed that the Atkinson and Stiglitz theorem does not hold

when individuals are different in ability and endowment. Saez (2003) showed that the Atkinson

and Stiglitz theorem does not hold when tastes are heterogeneous. Naito (1999) showed that in

a model similar to the model of Stiglitz (1982), if multiple goods are produced and factor prices

are endogenous, the Atkinson and Stiglitz theorem does not necessarily hold and the production

efficiency result does not either.

On the other hand, many of the previous studies on optimal income taxation have received

criticism that they did not focus on long term decisions such as human capital accumulation but

focused on the short term choices such as labor supply. As a result, it is sometimes argued that

the result obtained in the short run model might not hold in the long run.

In particular, Saez (2003) made skill accumulation endogenous in the model of optimal taxation and analyzed several issues of public policy. He showed that Naito’s results are not valid and that the production efficiency theorem and the Atkinson and Stiglitz theorem on commodity taxation are valid when human capital accumulation is endogenous. Since accumulation of

human capital has a strong effect on the economy in the long run and since the implications of


contribution of Saez’s paper is substantial.

Despite such contributions, however, we believe that a further investigation would be needed.

In many previous analyses involving asymmetric information not only in public finance literature

but also in other literature, conclusions were often not robust in the sense that they critically

depended on the structure of information and the timing of information revelation. Thus, it

is worthwhile investigating the robustness of the result of Saez (2003) with another reasonable

set of assumptions. In particular, in this paper we will show that if higher ability persons have

comparative advantage in the sense that the relative return from accumulating skilled human

capital to unskilled human capital is higher than that of lower ability persons, the production

efficiency theorem does not hold. 1

To explain the intuition of the present paper, it would be useful to look at the differences

between the assumptions in Saez (2003) and those in Naito (1999). In Naito (1999), there are

two factors of production that are imperfect substitutes. In addition, from the beginning, each

individual is attached to a particular labor market (the skilled labor market or the unskilled

labor market) but the government cannot observe whether each individual is attached to the

skilled or unskilled labor markets. The main idea in Naito (1999) is that when the government

cannot observe an individual’s type, the government can affect different individuals differently by

using the response of the factor markets (Stolper and Samuelson theorem, Stolper and Samuelson

(1941)). Since the income tax policy cannot discriminate between the different types of agents attached to different labor markets but a commodity tax and a tariff can, using a commodity tax (in the case of a closed economy) or a tariff (in the case of an open economy) together with the response of factor markets can increase efficiency.

1 In this paper, we analyze only the case of a small open economy because of space limitations. As a result, we prove only that the production efficiency theorem does not hold. In a small open economy, a commodity tax cannot affect producer prices and, hence, factor prices. Therefore, the Atkinson and Stiglitz theorem holds in a small open economy. The result changes in a closed economy: in the previous version of the paper, we proved that the Atkinson and Stiglitz theorem does not hold in a closed economy. See Section 5 of the present paper for more discussion and the previous version (Naito, 2002) for the proof.

In Saez (2003), each job requires pre-determined skill levels. As a result, the income level represents the amount of skill the individuals acquired. People have a heterogeneous ability to

acquire skill. However, since such heterogeneity of ability is incorporated in the utility function

as a difference of disutility to acquire skill, there is not any room for such heterogeneous ability to

interact with the market reaction. Thus, the heterogeneity of abilities is intrinsically independent

of the external environment of the economy. When the heterogeneity of abilities is independent

of the response of the factor market, it is essentially equivalent to assuming that the dimension

of factors used for production is one. In such a case, changes of factor prices due to government

policy cannot increase the economic efficiency.

The key idea of the present paper is that in the presence of comparative advantage in accumulating different types of human capital, individuals with different abilities will be affected differently by the responses of factor markets even when skill accumulation is endogenous. In such a case, a policy that introduces inefficient production but affects the factor prices differently for different factors can indirectly redistribute from the able to the less able. Although such a policy causes a distortion, the distortion has only a second-order effect on welfare, whereas the indirect redistribution has a first-order effect. Thus, it can increase social welfare.

For illustration, consider a situation where there are two types of human capital: skilled human

capital and unskilled human capital and where those who have higher ability have comparative

advantage in accumulating skilled human capital. Comparative advantage in accumulating skilled

human capital for the able means that the relative benefit from accumulating skilled human

capital to unskilled human capital for individuals with high ability is higher than for the less

able. We could think that training, knowledge and experience in white collar jobs are skilled

human capital and those in blue collar jobs are unskilled human capital. In such a situation, a

decrease of the return from skilled human capital and an increase of the return from unskilled

human capital will hurt the able relatively more and give relatively more benefit to the less able.

The intuition of this paper is that when individual ability is not observable to the social planner


individuals, then a policy that will change the returns from skilled and unskilled human capital

differently might be useful for an efficiency reason.

The crucial assumption in the present paper is the presence of comparative advantage in

human capital accumulation. Whether such an assumption is reasonable or not is an interesting

empirical question. The earlier human capital literature assumed that earnings could be explained completely once they are conditioned on the human capital level. Earlier empirical evidence showed that there is a strong correlation between earnings and the level of human capital and

indicated that ability does not matter for explaining earnings once they are conditioned by the

human capital levels. On the other hand, recent literature of labor economics and self-selection

emphasizes that ability can also increase earning and play a systematic role for explaining earnings

even after it is conditioned by the human capital level. This literature points out that even in

an extreme case when human capital does not increase the productivity at all, if ability can

increase the productivity and if higher ability agents tend to acquire more skills, there will be a

correlation between human capital level and earnings. In the standard signaling literature, it is

commonly assumed that a higher ability person would get more benefit from acquiring skill. In

addition, recently, Dinardo and Tobias (2001) and Tobias (2003) examined whether the returns

from schooling are higher for high ability individuals than for low ability individuals by using a

non-parametric method. They found that the returns are higher for high ability individuals than

for low ability individuals. This suggests that assuming the presence of comparative advantage

is not unrealistic as an approximation of the reality.

At this point, one might wonder about the difference between Naito (1999) and the present

paper. In the case of Naito (1999), each type of worker is attached to a different labor market.

As a result, skilled workers can supply only skilled labor and unskilled workers can supply only

unskilled labor. However, in the present paper, both high ability persons and low ability persons

have options to accumulate both types of human capital or either type of human capital. Thus,


return from skilled human capital always increases efficiency is not obvious.

The organization of this paper is as follows. In section 2, we present the model in a small

open economy and analyze the production efficiency theorem by Diamond and Mirrlees (1971)

when two factors are imperfect substitutes. In section 3, we analyze the same issue when two

factors are perfect substitutes. In section 4, we shall give the implications and in section 5 we

will give a brief conclusion.

2 The model

The economy is small and open and there are two output goods: good 1 and good 2. Good 1 is the skilled-human-capital-intensive good and good 2 is the unskilled-human-capital-intensive good. We assume that there are two types of human capital in this economy: skilled human capital and unskilled human capital. In this economy, there is a continuum of agents and all agents have identical, additively separable utility functions with respect to consumption, skilled human capital investment and unskilled human capital investment. We index individuals’ ability by $i$, where $i$ takes any value from one to two. We assume that the utility function of the type $i$ agent has the following form:

$$u(c_{1i}, c_{2i}) - f_s(h_i^s) - f_u(h_i^u)$$

where $u(c_{1i}, c_{2i})$ is strictly increasing in each argument and strictly concave, and $f_s(h_i^s)$ and $f_u(h_i^u)$ are strictly increasing and strictly convex. $c_{1i}$ and $c_{2i}$ are the consumption of good 1 and good 2 by agent $i$. We assume that the labor supply is fixed and normalized to one. $h_i^s$ and $h_i^u$ are the levels of skilled and unskilled human capital of individual $i$; they can be interpreted as the knowledge levels, years of education, experience and training for each type of skill. In addition, to illustrate our point, we assume that $f_s(h_i^s)$ and $f_u(h_i^u)$ have the following functional forms:2

$$f_s(h_i^s) = (h_i^s)^{\gamma_s} \quad \text{and} \quad f_u(h_i^u) = (h_i^u)^{\gamma_u}$$

where $\gamma_s$ and $\gamma_u$ measure the curvature of the disutility functions of skilled and unskilled human capital accumulation, respectively, and are strictly greater than one. Given the amounts of skilled and unskilled human capital of individual $i$, we assume that the earning of individual $i$ is determined as follows:

$$\text{earning}_i = g_s(i)\, w_s\, h_i^s + g_u(i)\, w_u\, h_i^u \tag{1}$$

where $w_s$ and $w_u$ are the returns from one efficient unit of skilled and unskilled human capital, respectively. Equation (1) means that when individual $i$ accumulates $h_i^s$ units of skilled human capital and $h_i^u$ units of unskilled human capital, the efficient units of skilled and unskilled human capital are $g_s(i)\, h_i^s$ and $g_u(i)\, h_i^u$, and the total returns from skilled and unskilled human capital are $g_s(i)\, w_s\, h_i^s$ and $g_u(i)\, w_u\, h_i^u$, respectively. Let $w_i^s \equiv g_s(i)\, w_s$ and $w_i^u \equiv g_u(i)\, w_u$. $g_s'(i)/g_s(i)$ and $g_u'(i)/g_u(i)$ measure the absolute advantage of an agent with ability marginally above $i$ over agent $i$ in accumulating skilled human capital and unskilled human capital, respectively. We assume that agents who have higher ability have absolute advantage in accumulating both skilled human capital and unskilled human capital: $g_s'(i)/g_s(i) > 0$ and $g_u'(i)/g_u(i) > 0$.3 Also, as we discussed in the introduction, we assume that agents who have higher ability have comparative advantage in accumulating skilled human capital rather than unskilled human capital. Thus, we assume that

$$\frac{g_s'(i)}{g_s(i)} > \frac{g_u'(i)}{g_u(i)}\, \frac{\gamma_u}{\gamma_s} \tag{2}$$

3 The assumption of absolute advantage is not necessary. It is a sufficient condition that guarantees that agents who have higher $i$ will receive higher utility. As long as agents with higher ability receive higher utility, the assumption of absolute advantage can be dropped.

The assumption (2) has a clear economic meaning. Consider a situation where the disutility functions of accumulating skilled human capital and unskilled human capital have the same degree of curvature ($\gamma_s = \gamma_u$). In this case, (2) means that as ability rises, $w_i^s$, the return from accumulating skilled human capital, increases at a higher rate than $w_i^u$, the return from accumulating unskilled human capital. When the curvatures of the disutility functions are different, (2) says that the condition of the comparative advantage must be adjusted by the ratio of the curvatures of marginal disutility of skilled human capital accumulation and unskilled human capital accumulation.4
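As a concrete illustration of assumption (2), consider the following worked example. The ability profiles $g_s(i) = i^2$ and $g_u(i) = i$ are hypothetical and chosen only for this sketch; they are not part of the model.

```latex
\begin{align*}
g_s(i) = i^2,\quad g_u(i) = i
  &\;\Longrightarrow\;
  \frac{g_s'(i)}{g_s(i)} = \frac{2}{i} > 0,
  \qquad
  \frac{g_u'(i)}{g_u(i)} = \frac{1}{i} > 0
  \qquad \text{for } i \in [1, 2], \\
\text{condition (2):}\quad
  \frac{2}{i} > \frac{1}{i}\,\frac{\gamma_u}{\gamma_s}
  &\;\Longleftrightarrow\;
  \gamma_u < 2\gamma_s, \\
\frac{w_i^s}{w_i^u}
  = \frac{g_s(i)\, w_s}{g_u(i)\, w_u}
  &= i\,\frac{w_s}{w_u}
  \quad \text{(increasing in } i\text{)}.
\end{align*}
```

With equal curvature ($\gamma_s = \gamma_u$) the condition holds for every $i$ in this example, and the relative return to skilled human capital rises with ability, which is the sense of comparative advantage used in the text.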

At this point, note that (1) is different from the assumptions in Saez (2003) in several ways. Saez assumed that heterogeneity of individuals is incorporated in the utility function, not

the earning equation. Thus, once earning is conditioned by the human capital level, individual

level heterogeneity of ability does not play any systematic role for explaining earnings. On the

other hand, in (1), even after conditioning on the level of human capital, heterogeneity of ability

plays a systematic role for explaining earnings and it induces higher earning for agents with

higher ability. In addition, the relative return from one efficient unit of skilled human capital to

unskilled human capital is higher for the agent who has higher ability than for the agent who has

lower ability. As we discussed in the introduction, this interaction term between heterogeneity of

ability and the return of human capital plays a crucial role in the present paper.

As for the objective of the government, we assume that the social planner will maximize the

following utilitarian social welfare function:

$$\int_1^2 \left\{ u(c_{1i}, c_{2i}) - f_s(h_i^s) - f_u(h_i^u) \right\} n_i\, di. \tag{3}$$

As for prices, we normalize the producer price and the consumer price of good 1 to one. Let $p_2$, $q_2$ and $p_2^*$ be the consumer price, the producer price and the international price of good 2, respectively. As the purpose of this section is to examine whether introducing a production distortion can increase social welfare, we consider imposing a tariff on good 2. Although a tariff introduces not only a production distortion but also a consumption distortion, the first-order effect of the consumption distortion on welfare can be ignored, as we will demonstrate. Let $\sigma$ be the size of the tariff on good 2. Then we have

$$p_2 = q_2 = p_2^* + \sigma. \tag{4}$$

4 The reason that the terms $g_s'/g_s$ and $g_u'/g_u$ need to be adjusted by the curvature of the marginal disutility is as follows. For illustration, consider the condition that $g_s'/g_s$ and $g_u'/g_u$ must satisfy when agents with ability marginally above $i$ and agents with ability $i$ have the same degree of comparative advantage under the assumption $\gamma_s > \gamma_u$. The assumption $\gamma_s > \gamma_u$ implies that the marginal disutility of skilled human capital changes faster than the marginal disutility of unskilled human capital when the amounts of skilled and unskilled human capital change at the same rate. Note that the marginal disutility per unit of return of skilled and unskilled human capital must be equal at the margin. This implies that, for agents with ability marginally above $i$ and agents with ability $i$ to have the same degree of comparative advantage, $g_s'/g_s$ must be smaller than $g_u'/g_u$.

As for the equations determining the returns from skilled and unskilled human capital, we

assume the standard two sector Heckscher-Ohlin model. In this economy, there are two sectors.

Sector 1 is the skilled human capital intensive sector and it produces good 1. Sector 2 is the

unskilled human capital intensive sector and it produces good 2. Each sector uses both skilled and

unskilled human capital. Consumers (workers) are perfectly mobile between two sectors. When

an agent who has $h_i^s$ units of skilled human capital and $h_i^u$ units of unskilled human capital works in sector $k$, it means that sector $k$ uses $g_s(i)\, h_i^s$ units of skilled human capital and $g_u(i)\, h_i^u$ units of unskilled human capital. Each sector behaves as a price taker and maximizes its profit. Let $F_k(H_k^s, H_k^u)$ be the production function in sector $k = 1, 2$, where $H_k^s$ and $H_k^u$ are the total amounts of skilled human capital and unskilled human capital used in sector $k$. We assume that $F_k(H_k^s, H_k^u)$ exhibits constant returns to scale and is concave in both arguments. Let $c_k(w_s, w_u)$ be the cost of producing one unit of output in sector $k$ when the returns to one efficient unit of skilled and unskilled human capital are $w_s$ and $w_u$, respectively. When both good 1 and good 2 are produced in equilibrium, $w_s$ and $w_u$ are determined by

$$1 = c_1(w_s, w_u) \quad \text{and} \quad q_2 = c_2(w_s, w_u). \tag{5}$$

From the Stolper-Samuelson theorem, $\partial w_s/\partial q_2 < 0$ and $\partial w_u/\partial q_2 > 0$.
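The Stolper-Samuelson signs can be illustrated numerically. The sketch below assumes Cobb-Douglas unit cost functions $c_k = w_s^{a_k} w_u^{1-a_k}$ with the scale constant normalized to one; this functional form, the function name `factor_returns` and the share parameters `a1` and `a2` are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def factor_returns(q2, a1=0.7, a2=0.3):
    """Solve the zero-profit conditions 1 = c1(ws, wu) and q2 = c2(ws, wu).

    Assumes (hypothetically) Cobb-Douglas unit costs
    c_k = ws**a_k * wu**(1 - a_k), where a_k is the skilled cost share in
    sector k. Sector 1 is skilled intensive, so a1 > a2 is required.
    Returns the pair (ws, wu).
    """
    if not (0.0 < a2 < a1 < 1.0):
        raise ValueError("need 0 < a2 < a1 < 1 (sector 1 skilled intensive)")
    if q2 <= 0.0:
        raise ValueError("q2 must be positive")
    log_q2 = math.log(q2)
    # Taking logs makes both conditions linear in (log ws, log wu):
    #   a1*log ws + (1 - a1)*log wu = 0
    #   a2*log ws + (1 - a2)*log wu = log q2
    log_ws = -(1.0 - a1) / (a1 - a2) * log_q2
    log_wu = a1 / (a1 - a2) * log_q2
    return math.exp(log_ws), math.exp(log_wu)


if __name__ == "__main__":
    for q2 in (1.0, 1.05, 1.10):
        ws, wu = factor_returns(q2)
        print(f"q2={q2:.2f}  ws={ws:.4f}  wu={wu:.4f}")
```

Under these assumptions a higher producer price of good 2 lowers the return to skilled human capital and raises the return to unskilled human capital, which is the channel a tariff on good 2 works through.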

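The outputs of both goods, $y_1$ and $y_2$, are determined from the following factor market equilibrium conditions:

$$\frac{\partial c_1}{\partial w_s}\, y_1 + \frac{\partial c_2}{\partial w_s}\, y_2 = \int_1^2 g_s(i)\, h_i^s\, n_i\, di \quad \text{and} \quad \frac{\partial c_1}{\partial w_u}\, y_1 + \frac{\partial c_2}{\partial w_u}\, y_2 = \int_1^2 g_u(i)\, h_i^u\, n_i\, di \tag{6}$$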

Although the output of both goods can be calculated from equation (6), it is more useful to

work on the production possibility frontier for analytical reasons. Let Hs and Hu be the total


production possibility frontier as $\Gamma(H^s, H^u)$. Since the production functions are concave and the factor intensities of the two sectors are different, the production possibility set is convex. Let the producer prices of good 1 and good 2 be 1 and $q_2$. Then the outputs of good 1 and good 2 are determined as the solution of the following constrained maximization problem:

$$\max_{y_1,\, y_2}\; y_1 + q_2 y_2 \quad \text{s.t.} \quad (y_1, y_2) \in \Gamma(H^s, H^u)$$

Thus, the outputs of good 1 and good 2 can be thought of as functions of $q_2$, $H^s$ and $H^u$. Let $Y(q_2, H^s, H^u)$ be the output function of good 2. At the optimum, the slope of the production possibility frontier is equal to the relative producer price of good 2. Thus, we obtain $Y_q \equiv \partial Y/\partial q_2 > 0$. The Rybczynski theorem shows that $Y_{H^u} \equiv \partial Y/\partial H^u > 0$ and $Y_{H^s} \equiv \partial Y/\partial H^s < 0$.

The purpose of the social planner is to maximize the utilitarian social welfare function. Given the additively separable utility function and the utilitarian social welfare function, the social planner wants to redistribute income from those who have higher ability to those who have lower ability. However, since the social planner cannot observe individual ability but only individual earnings, the social planner needs to design a non-linear income tax system $T(R)$ to redistribute income, where $T(R)$ is the tax liability function and $R$ is pre-tax income.

Before designing an income tax system, it is useful to consider the problem of designing the

non-linear income tax system in two steps. The first step is to know how an individual i will

choose skilled human capital and unskilled human capital to generate pre-tax income, R. The

second step is to know, given an after-tax income schedule $X = R - T(R)$, how each individual

chooses pre-tax income.

The first stage of the problem can be solved considering the following programming problem:

$$\min_{h_i^s,\, h_i^u}\; f_s(h_i^s) + f_u(h_i^u) \tag{7}$$

$$\text{s.t.} \quad R = w_i^s h_i^s + w_i^u h_i^u, \quad \text{where } w_i^s = g_s(i)\, w_s \text{ and } w_i^u = g_u(i)\, w_u.$$

Let $Z(w_i^s, w_i^u, R)$ denote the minimized value of problem (7), that is, the minimum disutility needed to generate the pre-tax income $R$ for an agent whose net returns from skilled human capital and unskilled human capital are $w_i^s$ and $w_i^u$, respectively. We denote the solution of the above problem as $h_i^s(w_i^s, w_i^u, R)$ and $h_i^u(w_i^s, w_i^u, R)$. For the analysis later, it is useful to calculate the compensated human capital supply. Consider the following dual problem of (7):

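$$E(w_i^s, w_i^u, V) \equiv \max_{h_i^s,\, h_i^u}\; w_i^s h_i^s + w_i^u h_i^u \quad \text{s.t.} \quad f_s(h_i^s) + f_u(h_i^u) \le V$$

Let the solution of the above problem be $\tilde{h}_i^j(w_i^s, w_i^u, V)$, where $j = s, u$. Then, from the duality relationship, we have

$$h_i^j\big(w_i^s, w_i^u, E(w_i^s, w_i^u, V)\big) \equiv \tilde{h}_i^j(w_i^s, w_i^u, V), \quad j = s, u.$$

Taking derivatives on both sides, we obtain the Slutsky equations for $h_i^s$ and $h_i^u$:

$$\frac{\partial h_i^j}{\partial w_i^s} + \frac{\partial h_i^j}{\partial R}\, h_i^s = \frac{\partial \tilde{h}_i^j}{\partial w_i^s} \quad \text{and} \quad \frac{\partial h_i^j}{\partial w_i^u} + \frac{\partial h_i^j}{\partial R}\, h_i^u = \frac{\partial \tilde{h}_i^j}{\partial w_i^u}, \quad j = s, u.$$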

Note that the indifference curve of $f_s(h_i^s) + f_u(h_i^u)$ is strictly concave. Therefore, $\partial \tilde{h}_i^s/\partial w_i^s > 0$, $\partial \tilde{h}_i^s/\partial w_i^u < 0$, $\partial \tilde{h}_i^u/\partial w_i^u > 0$ and $\partial \tilde{h}_i^u/\partial w_i^s < 0$. This relationship means that if an individual maximizes his earnings holding the total disutility constant, an increase in the net return from skilled human capital will increase the supply of skilled human capital and an increase in the return from unskilled human capital will decrease the supply of skilled human capital. As for the properties of $Z$, let the Lagrangian multiplier of the disutility minimization problem be $\alpha_i$. Then $f_s'(h_i^s) = \alpha_i w_i^s$, $f_u'(h_i^u) = \alpha_i w_i^u$, $Z_{w_i^s} \equiv \partial Z/\partial w_i^s = -\alpha_i h_i^s$, $Z_{w_i^u} \equiv \partial Z/\partial w_i^u = -\alpha_i h_i^u$ and $Z_R \equiv \partial Z/\partial R = \alpha_i$.
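The envelope properties of the minimized disutility can be checked numerically. The following is a minimal sketch, assuming the power-form disutility with a common curvature ($\gamma_s = \gamma_u = \gamma$), for which the first-stage problem has a closed-form solution. The common-curvature restriction and the helper names `first_stage` and `central_diff` are assumptions made only for this illustration.

```python
def first_stage(ws_i, wu_i, R, gamma=2.0):
    """Minimize h_s**gamma + h_u**gamma subject to R = ws_i*h_s + wu_i*h_u.

    Assumes (for this sketch only) a common curvature gamma > 1.
    Returns (h_s, h_u, alpha, Z): the optimal human capital levels, the
    multiplier on the income constraint, and the minimized disutility.
    """
    if gamma <= 1.0:
        raise ValueError("gamma must exceed one (strictly convex disutility)")
    k = 1.0 / (gamma - 1.0)
    # First-order conditions gamma*h_j**(gamma-1) = alpha*w_j imply
    # h_j = scale * w_j**k; the income constraint pins down the scale.
    scale = R / (ws_i ** (k + 1.0) + wu_i ** (k + 1.0))
    h_s = scale * ws_i ** k
    h_u = scale * wu_i ** k
    alpha = gamma * scale ** (gamma - 1.0)
    Z = h_s ** gamma + h_u ** gamma
    return h_s, h_u, alpha, Z


def central_diff(fn, x, eps=1e-6):
    """Two-sided finite-difference approximation of fn'(x)."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    ws_i, wu_i, R = 2.0, 1.0, 5.0
    h_s, h_u, alpha, Z = first_stage(ws_i, wu_i, R)
    dZ_dR = central_diff(lambda r: first_stage(ws_i, wu_i, r)[3], R)
    dZ_dws = central_diff(lambda w: first_stage(w, wu_i, R)[3], ws_i)
    print("Z_R   vs  alpha      :", dZ_dR, alpha)
    print("Z_ws  vs -alpha*h_s  :", dZ_dws, -alpha * h_s)
```

The finite-difference derivatives of the minimized disutility should agree with $\alpha_i$ and $-\alpha_i h_i^s$ up to numerical error, matching the properties stated in the text.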

Let $X(R)$ be the after-tax income schedule that the government designs. Then, at the second stage of the problem, given $Z(w_i^s, w_i^u, R)$ and $X(R)$, each individual $i$ maximizes his utility:

$$\max_{R}\; U(p_2, X(R)) - Z(w_i^s, w_i^u, R)$$

where U(p2, x) is the indirect utility function from the consumption of two goods when the


is to design a schedule of $X(R)$ to maximize the social welfare. On the other hand, the Revelation

Principle shows that without loss of generality we can focus on the incentive compatible revelation

mechanism. Thus, let $(R_j, X_j)$ be the pre-tax income and after-tax income when an agent announces that his type is $j$. Then define $v(i)$ and $\hat{v}(j; i)$ as follows:

$$v(i) = \max_{j}\; U(p_2, X_j) - Z(w_i^s, w_i^u, R_j), \qquad \hat{v}(j; i) = U(p_2, X_j) - Z(w_i^s, w_i^u, R_j).$$

$v(i)$ is the maximized utility given the schedule $(R_j, X_j)$, and $\hat{v}(j; i)$ is the indirect utility when agent $i$ announces that he is type $j$. The incentive compatibility condition requires that the type $i$ agent has an incentive to announce that he is type $i$:

$$i = \arg\max_{j}\; \hat{v}(j; i)$$

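Assuming the differentiability of $(X_j, R_j)$, the first-order condition of the incentive compatibility condition is

$$\left.\frac{\partial \hat{v}(j; i)}{\partial j}\right|_{j=i} = \frac{\partial U}{\partial x}\frac{\partial x}{\partial j} - \frac{\partial Z}{\partial R}\frac{\partial R}{\partial j} = 0.$$

By using the above first-order condition, we have $dv/di = -Z_{w_i^s}\,(dw_i^s/di) - Z_{w_i^u}\,(dw_i^u/di)$. Since $\alpha_i$ is the Lagrangian multiplier of the required income constraint in the disutility minimization problem (7), from the first-order conditions of the minimization problem we obtain

$$\frac{dv}{di} = \alpha_i R_i \left\{ \frac{g_s'(i)}{g_s(i)}\, \theta_i^s + \frac{g_u'(i)}{g_u(i)}\, \theta_i^u \right\}, \quad \text{where } \theta_i^j = \frac{w_i^j h_i^j}{R_i}. \tag{8}$$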

Because of the assumption of absolute advantage, $dv/di > 0$. Equation (8) has a clear economic meaning: the slope of the value function $v(i)$ is proportional to the weighted average of the absolute advantage in skilled human capital accumulation and in unskilled human capital accumulation. For analytical reasons, it is useful to eliminate $\alpha_i$ in the above equation. Using the first-order conditions for $h_i^s$ and $h_i^u$, we can rewrite (8) as follows:

$$\frac{dv}{di} = \frac{g_s'(i)}{g_s(i)}\, f_s'(h_i^s)\, h_i^s + \frac{g_u'(i)}{g_u(i)}\, f_u'(h_i^u)\, h_i^u. \tag{9}$$
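The step from (8) to (9) can be spelled out as follows. This is a sketch that uses only the definition of $\theta_i^j$ and the first-order conditions $f_j'(h_i^j) = \alpha_i w_i^j$ of the disutility minimization problem.

```latex
\begin{align*}
\alpha_i R_i\, \theta_i^j
  &= \alpha_i R_i\, \frac{w_i^j h_i^j}{R_i}
   = (\alpha_i w_i^j)\, h_i^j
   = f_j'(h_i^j)\, h_i^j, \qquad j = s, u, \\
\frac{dv}{di}
  &= \frac{g_s'(i)}{g_s(i)}\, \alpha_i R_i\, \theta_i^s
   + \frac{g_u'(i)}{g_u(i)}\, \alpha_i R_i\, \theta_i^u
   = \frac{g_s'(i)}{g_s(i)}\, f_s'(h_i^s)\, h_i^s
   + \frac{g_u'(i)}{g_u(i)}\, f_u'(h_i^u)\, h_i^u .
\end{align*}
```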


Given (9), as Mirrlees (1971) pointed out, it is more useful to assume that the social planner controls $v_i$ and $R_i$.5 Then $x_i$ is defined by the following relationship:

$$v(i) = U(p_2, X_i) - Z(w_i^s, w_i^u, R_i). \tag{10}$$

Let $x(R, v, p_2, w_i^s, w_i^u)$ be the value of $X$ that solves (10). Obviously, $\partial x/\partial v = (U_x)^{-1}$, $\partial x/\partial R_i = Z_R/U_x$, $\partial x/\partial p_2 = -U_p/U_x$, $\partial x/\partial w_i^s = Z_{w_i^s}/U_x$ and $\partial x/\partial w_i^u = Z_{w_i^u}/U_x$.
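These partial derivatives follow from totally differentiating (10). The sketch below assumes $U_x \neq 0$ and writes $U_x$ and $U_p$ for the partial derivatives of $U(p_2, x)$ with respect to $x$ and $p_2$.

```latex
\begin{align*}
dv &= U_p\, dp_2 + U_x\, dx
      - Z_{w_i^s}\, dw_i^s - Z_{w_i^u}\, dw_i^u - Z_R\, dR, \\
dx &= \frac{1}{U_x}\, dv - \frac{U_p}{U_x}\, dp_2
      + \frac{Z_{w_i^s}}{U_x}\, dw_i^s
      + \frac{Z_{w_i^u}}{U_x}\, dw_i^u
      + \frac{Z_R}{U_x}\, dR .
\end{align*}
```

Reading off the coefficient on each differential gives the five derivatives listed in the text.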

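Finally, the government budget constraint implies that

$$\int_1^2 n_i \{R_i - x_i\}\, di + \sigma \left\{ \int_1^2 c_{2i}\, n_i\, di - Y(p_2^* + \sigma, H^s, H^u) \right\} = 0.$$

The problem of the social planner is to solve the following constrained optimization program:

$$W(\sigma) = \max_{\{R_i, v_i\}} \int_1^2 v(i)\, n_i\, di$$

$$\text{s.t.} \quad \frac{dv}{di} = \frac{g_s'(i)}{g_s(i)}\, f_s'(h_i^s)\, h_i^s + \frac{g_u'(i)}{g_u(i)}\, f_u'(h_i^u)\, h_i^u,$$

$$\int_1^2 n_i \{R_i - x_i\}\, di + \sigma \left\{ \int_1^2 c_{2i}\, n_i\, di - Y(p_2^* + \sigma, H^s, H^u) \right\} = 0, \quad \sigma \text{ given}.$$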

5 One might think that the local incentive compatibility constraints are not sufficient for the global incentive compatibility constraints. However, the mechanism design literature shows that a single crossing property (SCP) and the monotonicity constraints are sufficient conditions for the local incentive compatibility constraints to imply the global incentive compatibility constraints (Fudenberg and Tirole, 1992). In this paper, we assume that the monotonicity constraints are always satisfied. This assumption is equivalent to assuming that there is no bunching. Many previous papers assumed that there is no bunching at the optimum (Konishi 1995, Naito 1998). As for SCP, we can check it by examining $\partial^2 Z/\partial R\, \partial i > 0$. This is true as long as $\partial h_i^s$

In the above programming problem, $W(\sigma)$ is the maximized social welfare for given $\sigma$. Also note that $h_i^s$ and $h_i^u$ are functions of $(R_i, w_i^s, w_i^u)$ and that $w_i^s$ and $w_i^u$ are functions of $\sigma$. Our interest is to know whether a change of $\sigma$ from 0 will increase social welfare. Analytically, by calculating $dW/d\sigma$ and evaluating it at $\sigma = 0$, we can check whether introducing a distortion on the production side (and on the consumption side too) can increase social welfare. Let $\mu_i$ and $\lambda$ be the Lagrangian multipliers of the incentive compatibility constraint and the government budget constraint. By using the envelope theorem, we obtain

$$\begin{aligned} \left.\frac{dW}{d\sigma}\right|_{\sigma=0} = {} & -\int_1^2 \mu_i\, \frac{d\left[f_s'(h_i^s)\, h_i^s\, (g_s'/g_s)\right]}{dh_i^s} \left\{ \frac{dh_i^s}{dw_i^s}\frac{dw_i^s}{d\sigma} + \frac{dh_i^s}{dw_i^u}\frac{dw_i^u}{d\sigma} \right\} di \\ & -\int_1^2 \mu_i\, \frac{d\left[f_u'(h_i^u)\, h_i^u\, (g_u'/g_u)\right]}{dh_i^u} \left\{ \frac{dh_i^u}{dw_i^s}\frac{dw_i^s}{d\sigma} + \frac{dh_i^u}{dw_i^u}\frac{dw_i^u}{d\sigma} \right\} di \\ & + \lambda \left\{ \int_1^2 c_{2i}\, n_i\, di - Y(p_2^* + \sigma, H^s, H^u) \right\} + \lambda \int_1^2 \left( -\frac{\partial x_i}{\partial p_2} - \frac{\partial x_i}{\partial w_i^s}\frac{\partial w_i^s}{\partial \sigma} - \frac{\partial x_i}{\partial w_i^u}\frac{\partial w_i^u}{\partial \sigma} \right) n_i\, di. \end{aligned}$$

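After several calculations, we obtain the following equation (see Appendix):

$$\left.\frac{dW}{d\sigma}\right|_{\sigma=0} = -\int_1^2 \mu_i \left\{ \gamma_s \frac{g_s'}{g_s} - \gamma_u \frac{g_u'}{g_u} \right\} f_s'(h_i^s) \left[ \frac{\partial \tilde{h}_i^s}{\partial w_i^s}\frac{\partial w_i^s}{\partial \sigma} + \frac{\partial \tilde{h}_i^s}{\partial w_i^u}\frac{\partial w_i^u}{\partial \sigma} \right] di > 0. \tag{11}$$

Because of the property of the compensated supply function of $h_i^s$, $\partial \tilde{h}_i^s/\partial w_i^s > 0$ and $\partial \tilde{h}_i^s/\partial w_i^u < 0$. From the Stolper-Samuelson theorem, $\partial w_i^s/\partial \sigma < 0$ and $\partial w_i^u/\partial \sigma > 0$, so the term in square brackets is negative. From the assumption on comparative advantage, $\gamma_s (g_s'/g_s) - \gamma_u (g_u'/g_u) > 0$. As for the sign of the Lagrangian multiplier of the incentive compatibility constraint, the standard argument shows that $\mu_i \ge 0$ for all $i$ (see Appendix). Thus, we obtain $dW/d\sigma > 0$.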

Proposition 1 Suppose that, starting from zero distortion in production and consumption in a small open economy, the social planner sets the income tax structure to maximize the social welfare function in an endogenous skill accumulation model. Then the introduction of a tariff (export subsidy) on an unskilled-labor-intensive good will increase social welfare.

The above equation (11) has several implications. First, for illustration, consider a situation where the disutility functions of skilled and unskilled human capital accumulation have the same degree of curvature, i.e., $\gamma_s = \gamma_u \equiv \gamma$. Then (11) shows that if $g_s'/g_s = g_u'/g_u$, then $dW/d\sigma = 0$. In other words, if there is no comparative advantage, that is, if the advantage of higher ability individuals over lower ability individuals is the same in accumulating skilled and unskilled human capital, then there is no welfare gain from changing the returns from skilled and unskilled human capital. Second, note that $(\partial \tilde{h}_i^s/\partial w_i^s)(\partial w_i^s/\partial \sigma)$ and $(\partial \tilde{h}_i^s/\partial w_i^u)(\partial w_i^u/\partial \sigma)$ measure how changes of returns from each


$\gamma \times f_s'(h_i^s) = f_s''(h_i^s)\, h_i^s + f_s'(h_i^s)$ and that $f_s''(h_i^s)\, h_i^s + f_s'(h_i^s)$ is related to a change of $\dot{v}$. In addition, note that $\mu_i$ measures how social welfare increases when the incentive compatibility constraint is relaxed. This implies that the term inside the integral measures how a compensated change of the returns from skilled and unskilled human capital changes the slope $\dot{v}$ and increases social welfare. Also, the calculation needed to obtain the equation shows that the effect of the consumption distortion on welfare is zero because, as long as $\sigma$ is small, such a distortion is of the second order.6
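The identity relating $\gamma f_s'$ to $f_s'' h + f_s'$ can be verified directly for the power-form disutility assumed in Section 2. The following is a short check under that functional form, written for skilled human capital; the unskilled case is identical with $\gamma_u$ in place of $\gamma_s$.

```latex
\begin{align*}
f_s(h) &= h^{\gamma_s}, \qquad
f_s'(h) = \gamma_s h^{\gamma_s - 1}, \qquad
f_s''(h) = \gamma_s(\gamma_s - 1)\, h^{\gamma_s - 2}, \\
\frac{d\,[f_s'(h)\, h]}{dh}
  &= f_s''(h)\, h + f_s'(h)
   = \gamma_s(\gamma_s - 1)\, h^{\gamma_s - 1} + \gamma_s h^{\gamma_s - 1}
   = \gamma_s^2\, h^{\gamma_s - 1}
   = \gamma_s\, f_s'(h).
\end{align*}
```

This is why the curvature parameters $\gamma_s$ and $\gamma_u$ multiply the absolute advantage terms once the derivative of $f'(h)\, h$ is taken in the envelope calculation.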

The intuition of the above proposition is as follows. When higher ability individuals have a comparative advantage in accumulating skilled human capital and lower ability individuals have a comparative advantage in accumulating unskilled human capital, a decrease of the return from skilled human capital and an increase of the return from unskilled human capital hurt higher ability individuals and benefit lower ability individuals. If the social planner wants to redistribute income from high ability individuals to low ability individuals, such changes of the returns from skilled and unskilled human capital redistribute income indirectly. Moreover, starting from zero distortion, the deadweight loss of the production distortion is of the second order, whereas the welfare gain from relaxing the incentive problem is of the first order. As a result, introducing a small production distortion increases social welfare.
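The first-order versus second-order argument can be illustrated with a small numerical sketch. It is not the model of this paper: it assumes a hypothetical quadratic approximation of the welfare change, where `gain` stands for the first-order benefit of relaxing the incentive constraint and `dwl_curvature` for the curvature of the deadweight loss. Both numbers are made up for illustration.

```python
def welfare_change(sigma, gain=0.5, dwl_curvature=4.0):
    """Hypothetical second-order approximation of W(sigma) - W(0).

    gain: assumed first-order benefit from relaxing the incentive constraint.
    dwl_curvature: assumed curvature of the deadweight loss, which has no
    first-order term at sigma = 0.
    """
    return gain * sigma - 0.5 * dwl_curvature * sigma ** 2


# A sufficiently small positive distortion raises welfare ...
small = welfare_change(0.01)
# ... while a large one lowers it once the quadratic loss dominates.
large = welfare_change(1.0)
# Under these assumed numbers the approximation peaks at gain / dwl_curvature.
best_sigma = 0.5 / 4.0
```

Under these assumptions, `small` is positive and `large` is negative, which mirrors the claim that the result concerns a marginal distortion starting from zero.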

3 Extension: The Case of Perfect Substitutes

In the previous section, we assumed that the two types of human capital are imperfect substitutes in order to ensure differentiability of the human capital accumulation functions. As a result, people always accumulate both types of human capital. In reality, however, people sometimes accumulate only one type of human capital and, as a result, the choice of human capital becomes discrete. The purpose of this section is to analyze the welfare effect of direct versus indirect redistribution when human capital accumulation is endogenous and different types of human capital are perfect substitutes.⁷

⁶ This can be easily checked from $\lambda\int_1^2 c_{2i}n_i\,di = \lambda\int_1^2(-\partial x_i$

In this section, because the two types of human capital are assumed to be perfect substitutes, we assume the following utility function for agent $i$:

$$u(c_{1i}, c_{2i}) - a^s h^s_i - a^u h^u_i,$$

where $u(c_{1i}, c_{2i})$ is strictly increasing in each argument and strictly concave.

As in the previous section, we assume that the following comparative advantage condition holds:

$$\frac{g_s'(i)}{g_s(i)} > \frac{g_u'(i)}{g_u(i)} \qquad (12)$$

The economic meaning of the above condition is the same as before. When the two types of skill accumulation are perfect substitutes in the disutility function, the agent always solves the following constrained disutility minimization problem:

$$Z(w^s_i, w^u_i, R) \equiv \min_{h^s_i,\, h^u_i} \; a^s h^s_i + a^u h^u_i \quad \text{s.t.} \quad R = w^s_i h^s_i + w^u_i h^u_i,$$

where $w^s_i = g_s(i) \times w^s$ and $w^u_i = g_u(i) \times w^u$.

In the above problem, an agent with ability $i$ accumulates only skilled human capital if $a^s/a^u < w^s_i/w^u_i$ and only unskilled human capital if $a^s/a^u > w^s_i/w^u_i$. Note that, because of the comparative advantage assumption (12), $w^s_i/w^u_i$ is an increasing function of $i$. Let $i^*$ be the value of $i$ that satisfies $(w^s \times g_s(i))/(w^u \times g_u(i)) = a^s/a^u$. Then, agents whose ability is greater than $i^*$ accumulate only skilled human capital and agents whose ability is less than $i^*$ accumulate only unskilled human capital. We assume that $i^*$ lies between 1 and 2.⁸
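The discrete skill choice can be sketched in a few lines of code. The functional forms $g_s(i) = i^2$ and $g_u(i) = i$ and all parameter values below are assumptions chosen only because they satisfy condition (12); they do not come from the paper.

```python
def g_s(i):
    """Assumed productivity of ability i in skilled accumulation (hypothetical form)."""
    return i ** 2


def g_u(i):
    """Assumed productivity of ability i in unskilled accumulation (hypothetical form)."""
    return i


def disutility_and_choice(i, R, w_s, w_u, a_s, a_u):
    """Return (Z, skill type) for an agent who must earn pre-tax income R."""
    cost_skilled = a_s * R / (w_s * g_s(i))    # a^s h^s with h^s = R / w^s_i
    cost_unskilled = a_u * R / (w_u * g_u(i))  # a^u h^u with h^u = R / w^u_i
    if cost_skilled < cost_unskilled:
        return cost_skilled, "skilled"
    return cost_unskilled, "unskilled"


def switching_ability(w_s, w_u, a_s, a_u):
    """Solve w_s g_s(i) / (w_u g_u(i)) = a_s / a_u.

    With the assumed forms the left-hand side equals i * w_s / w_u.
    """
    return (a_s / a_u) * (w_u / w_s)


i_star = switching_ability(w_s=1.0, w_u=1.0, a_s=1.5, a_u=1.0)
low_type = disutility_and_choice(1.2, 1.0, 1.0, 1.0, 1.5, 1.0)
high_type = disutility_and_choice(1.8, 1.0, 1.0, 1.0, 1.5, 1.0)
```

With these assumed numbers the switching point lies inside the interval from 1 to 2, an agent below it chooses unskilled human capital and an agent above it chooses skilled human capital.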

⁷ Besides the reason mentioned in the previous section, conducting a welfare analysis when individual behavior includes a discrete choice is useful from a theoretical standpoint as well. In many important economic situations, such as the choice of where to live, the choice of technology by firms and labor market participation, decisions made by consumers or firms include discrete choices. Until very recently, welfare analysis that includes discrete choices was rare. As far as the author knows, only Boadway and Cuff (2001) have begun to investigate this issue. They analyzed an optimal taxation problem when some individuals are bunched at the bottom. Another purpose of this section is to contribute to this literature.

⁸ This assumption is not very restrictive, for the following reason. For example, if $i^*$


Given such $i^*$, $Z(w^s_i, w^u_i, R)$ is

$$Z(w^s_i, w^u_i, R) = \begin{cases} a^s\,(R/w^s_i) & \text{for } i^* \le i \le 2, \\ a^u\,(R/w^u_i) & \text{for } 1 \le i < i^*. \end{cases}$$

Let $X(R)$ be the after-tax income schedule that the government designs. Each agent chooses his best $R$ to maximize $U(p_2, X(R)) - Z(w^s_i, w^u_i, R)$. Once $R$ is chosen, the agent chooses his optimal skill type and accumulates human capital to generate pre-tax income $R$. Let $\tilde v(i)$ be the maximized value given the schedule $X(R)$:

$$\tilde v(i) \equiv \max_R \; U(p_2, X(R)) - Z(w^s_i, w^u_i, R).$$

For the analysis of the optimal schedule $X(R)$, we assume that $X(R)$ is a continuous function. Although it is possible that the optimal schedule $X(R)$ is not continuous, the tax schedules of almost all developed countries are continuous. When $X(R)$ is a continuous function, it is straightforward to show from the theorem of the maximum (Berge, 1963) that $\tilde v(i)$ is continuous with respect to $i$. In addition, $\tilde v(i)$ has an interesting property in the neighborhood of $i^*$ that turns out to be crucial for our result. The following lemma states that property.

Lemma 1 When $i$ increases, the graph of $\tilde v(i)$ has a counter-clockwise kink at $i^*$.

Proof. Let $\tilde v_s(i)$ be the maximized utility of an agent with ability $i$, given the tax schedule, when he can accumulate only skilled human capital. Also, let $\tilde v_u(i)$ be the maximized utility of an agent with ability $i$ when he can accumulate only unskilled human capital. By definition, the graph of $\tilde v(i)$ is the upper envelope of $\tilde v_s(i)$ and $\tilde v_u(i)$, and $i^*$ is at the intersection of $\tilde v_s(i)$ and $\tilde v_u(i)$. Since $\tilde v(i)$ coincides with $\tilde v_u(i)$ to the left of $i^*$ and with $\tilde v_s(i)$ to the right of $i^*$, its right-hand slope at $i^*$ is at least as large as its left-hand slope. This implies that there is a counter-clockwise kink at $i^*$ (see also Figure 1).
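The kink in Lemma 1 can be checked numerically in a stylized case. The sketch below assumes log utility over after-tax income, no taxation ($X(R) = R$), and the hypothetical forms $g_s(i) = i^2$ and $g_u(i) = i$; none of these assumptions comes from the paper, and they are used only because they give closed-form value functions.

```python
import math

W_S, W_U, A_S, A_U = 1.0, 1.0, 1.5, 1.0  # assumed parameter values


def v_skilled(i):
    # max_R log(R) - A_S * R / (W_S * i**2) is attained at R = W_S * i**2 / A_S
    return math.log(W_S * i ** 2 / A_S) - 1.0


def v_unskilled(i):
    # max_R log(R) - A_U * R / (W_U * i) is attained at R = W_U * i / A_U
    return math.log(W_U * i / A_U) - 1.0


def v_envelope(i):
    """Upper envelope of the two skill-specific value functions."""
    return max(v_skilled(i), v_unskilled(i))


i_star = (A_S / A_U) * (W_U / W_S)  # intersection of the two curves
eps = 1e-6
left_slope = (v_envelope(i_star) - v_envelope(i_star - eps)) / eps
right_slope = (v_envelope(i_star + eps) - v_envelope(i_star)) / eps
```

In this example the left-hand slope equals $g_u'/g_u$ and the right-hand slope equals $g_s'/g_s$ evaluated at $i^*$, so the envelope is steeper to the right of the switching point.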

will accumulate only unskilled human capital. However, production needs both skilled and unskilled human capital. As a result, the return from skilled human capital will start to increase and the return from unskilled human capital will start to decrease. This implies that $i^*$ will start to decrease. This process will continue until some agents start to accumulate skilled human capital.


Now consider the problem of designing a nonlinear income tax system. As in the previous section, we define $v(i)$ as follows:

$$v(i) = \max_j \; U(p_2, X_j) - Z(w^s_i, w^u_i, R_j).$$

By using the same technique as in the previous section, we can calculate $dv(i)/di$ for $i$ in $(1, i^*)$ and $(i^*, 2)$:

$$\frac{dv}{di} = a^s \frac{g_s'}{g_s} \frac{R_i}{g_s(i)\, w^s} \quad \text{for } i \in (i^*, 2) \qquad (13)$$

$$\frac{dv}{di} = a^u \frac{g_u'}{g_u} \frac{R_i}{g_u(i)\, w^u} \quad \text{for } i \in (1, i^*) \qquad (14)$$

Next we check the single crossing property of the utility function $U(p_2, X) - Z(w^s_i, w^u_i, R)$. The marginal rate of substitution between $X$ and $R$ is

$$MRS(R, X) = \begin{cases} \dfrac{1}{U_x} \dfrac{a^s}{g_s w^s} & \text{for } i \in (i^*, 2), \\[2ex] \dfrac{1}{U_x} \dfrac{a^u}{g_u w^u} & \text{for } i \in (1, i^*). \end{cases}$$

Thus $MRS(R, X)$ is a decreasing function of $i$ and the single crossing property is satisfied. This means that local incentive compatibility and the monotonicity condition on $R$ are sufficient for global incentive compatibility (Fudenberg and Tirole, 1991). We assume that the monotonicity constraint is not binding.

As in the previous section, it is useful to think that the government controls $v(i)$ and $R_i$ and that $X_i$ is defined from the following relationship:

$$v(i) = U(p_2, X_i) - Z(w^s_i, w^u_i, R_i).$$

Finally, for analytical convenience, rewrite (13) and (14) using $h^s_i = R_i/(g_s(i) w^s)$ and $h^u_i = R_i/(g_u(i) w^u)$:

$$\dot v_s = \frac{g_s'}{g_s} a^s h^s_i \quad \text{and} \quad \dot v_u = \frac{g_u'}{g_u} a^u h^u_i.$$


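$$\begin{aligned}
W(\sigma) = \max \; & \int_1^{i^*} v_u(i) n_i\,di + \int_{i^*}^2 v_s(i) n_i\,di \\
\text{s.t.} \quad & \dot v_s = \frac{g_s'}{g_s} a^s h^s_i \quad \text{for } i^* < i \le 2 && \text{(IC1)} \\
& \dot v_u = \frac{g_u'}{g_u} a^u h^u_i \quad \text{for } 1 < i < i^* && \text{(IC2)} \\
& v_s(i^*) = v_u(i^*) && \text{(BD1)} \\
& R^s_{i^*} = R^u_{i^*} && \text{(BD2)} \\
& \int_{i^*}^2 \{R^s_i - x(R^s_i, v^s_i, w^s_i, w^u_i)\} n_i\,di + \int_1^{i^*} \{R^u_i - x(R^u_i, v^u_i, w^s_i, w^u_i)\} n_i\,di \\
& \quad + \sigma\left\{\int_1^2 n_i c_{2i}\,di - Y(p_2^* + \sigma, H_s, H_u)\right\} \ge 0 && \text{(RC)}
\end{aligned}$$

where $H_s = \int_{i^*}^2 h^s_i g_s(i) n_i\,di$ and $H_u = \int_1^{i^*} h^u_i g_u(i) n_i\,di$.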

The above programming problem deserves several comments. First, (IC1) and (IC2) are the local incentive compatibility constraints. Second, (BD1) comes from the assumption that the tax schedule that the government designs is continuous and, as a result, the utility level of the agents must be continuous. (BD2) comes from the assumption that individual $i^*$ chooses only one $R$. Now let $\mu^s_i$, $\mu^u_i$ and $\lambda$ be the Lagrangian multipliers of (IC1), (IC2) and (RC), and let $\beta_1$ and $\beta_2$ be the Lagrangian multipliers of (BD1) and (BD2). The first order conditions are given in the Appendix to save space. What we need to know is the effect on social welfare of increasing $\sigma$ from zero, that is, $dW/d\sigma$ evaluated at $\sigma = 0$. By using the envelope theorem, we have (see Appendix)

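$$\left.\frac{dW}{d\sigma}\right|_{\sigma=0} = \frac{\partial i^*}{\partial\sigma}\left[\mu^s_{i^*} a^s h^s_{i^*} \frac{g_s'}{g_s} - \mu^u_{i^*} a^u h^u_{i^*} \frac{g_u'}{g_u}\right] \qquad (15)$$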

From the FOC of $v^s_{i^*}$ and $v^u_{i^*}$, we have $\mu^s_{i^*} = \mu^u_{i^*}$. In addition, as we show in the Appendix $\mu^s_i$


slope of $v^s_i$ and the left hand slope of $v^u_i$ at $i^*$. From Lemma 1, the slope of $v^s_i$ is steeper than the slope of $v^u_i$ at $i^*$. Since $\partial i^*/\partial\sigma > 0$, we have $dW/d\sigma > 0$.

Proposition 2 Consider a small open economy where individuals accumulate human capital endogenously and different types of human capital are perfect substitutes. Suppose that the social planner designs a nonlinear income tax system to maximize the utilitarian social welfare function without any production distortion and that there is no bunching at the switching point $i^*$. Then, introducing a tariff on an unskilled human capital intensive good will increase social welfare.

At this point, it is useful to consider the economic meaning of (15). Figure 1 shows the graphs of $\tilde v(i)$, $\tilde v_s(i)$ and $\tilde v_u(i)$. When the government increases the tariff $\sigma$ from zero, the graph of $\tilde v_s(i)$ shifts downward and the graph of $\tilde v_u(i)$ shifts upward. As a result, $i^*$ increases. Also, notice that from (IC1) and (IC2), the slope of $\tilde v_s(i)$ increases and the slope of $\tilde v_u(i)$ decreases.

In the mechanism design problem, $\dot v$, the slope of the value function, reflects how sensitive the compensation schedule must be to unobserved ability. A higher $\dot v$ means that the social planner needs to give more utility to those with higher ability. With a redistributive social welfare function, however, the social planner wants to give more utility to agents with lower ability. Thus, when $\dot v$ is high, the level of utility that the social planner can give to agents with lower ability is limited, since the amount of resources is limited. In such a situation, if the government can make $\dot v$ smaller exogenously, it is possible to increase social welfare, and changing $\sigma$ can be a good policy tool for changing $\dot v$.

When $\sigma$ increases, however, the change of $\dot v$ is not the same for all individuals. As Figure 1 shows, all individuals whose ability is lower than $i^*$ will experience a decrease of $\dot v$ and all individuals whose ability is greater than $i^*$ will experience an increase of $\dot v$, except in the neighborhood of $i^*$. But, as the analysis in the Appendix shows, the effect of a change of $\dot v$ for those agents is


the other hand, there are some individuals who experience a first order change of $\dot v$. Individuals whose ability is in $(i^*, i^* + \partial i^*/\partial\sigma)$ will switch from accumulating skilled human capital to accumulating unskilled human capital. Since the graph of $v(i)$ has a counter-clockwise kink at $i^*$, individuals in $(i^*, i^* + \partial i^*/\partial\sigma)$ will experience a first order decrease of $\dot v$. This implies that the government needs a less ability-sensitive compensation schedule for those agents. Because this change of $\dot v$ has a first order effect, it will increase social welfare.
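The switching mechanism can be illustrated with a stylized calculation. The sketch below assumes the hypothetical forms $g_s(i) = i^2$ and $g_u(i) = i$, holds pre-tax income $R$ fixed, and represents the tariff by an assumed small fall in $w^s$ and rise in $w^u$. It only shows the direction of the effect on $\dot v$ for an agent who switches skill type and is not a solution of the planner's problem.

```python
def switching_ability(w_s, w_u, a_s=1.5, a_u=1.0):
    """Ability at which w_s g_s(i) / (w_u g_u(i)) = a_s / a_u for g_s = i**2, g_u = i."""
    return (a_s / a_u) * (w_u / w_s)


def v_dot(i, R, w_s, w_u, a_s=1.5, a_u=1.0):
    """Slope of the value function from (IC1) or (IC2) for the chosen skill type."""
    if i > switching_ability(w_s, w_u, a_s, a_u):
        h_s = R / (w_s * i ** 2)
        return (2.0 / i) * a_s * h_s   # (g_s'/g_s) a^s h^s
    h_u = R / (w_u * i)
    return (1.0 / i) * a_u * h_u       # (g_u'/g_u) a^u h^u


before = dict(w_s=1.0, w_u=1.0)
after = dict(w_s=0.95, w_u=1.05)  # assumed effect of a small tariff on factor returns

i_star_before = switching_ability(**before)
i_star_after = switching_ability(**after)

switcher = 1.6  # ability between the old and the new switching point
slope_before = v_dot(switcher, R=1.0, **before)
slope_after = v_dot(switcher, R=1.0, **after)
```

In this example the switching point moves to the right and the agent with ability 1.6 moves from the steep skilled branch to the flatter unskilled branch, so the slope of the value function falls discretely for that agent.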

Equation (15) can be interpreted in terms of the marginal tax schedule as well. Note that $(\partial Z/\partial R^m)/U_x$ is equal to $1 - T^m_i$, where $T^m_i$ is the marginal tax rate on the income of an individual with ability $i$ who accumulates skill of type $m = s, u$. From the FOC of $R^s_i$ and $R^u_i$,

$$\lambda n_i T^s_i = \mu^s_i \times a^s \frac{\partial h^s_i}{\partial R^s_i} \frac{g_s'}{g_s} \quad \text{and} \quad \lambda n_i T^u_i = \mu^u_i \times a^u \frac{\partial h^u_i}{\partial R^u_i} \frac{g_u'}{g_u}.$$

Thus, since $(\partial h/\partial R) \times R = h$, we have

$$\left.\frac{dW}{d\sigma}\right|_{\sigma=0} = \frac{\partial i^*}{\partial\sigma} \lambda n_{i^*} \left(R^s_{i^*} T^s_{i^*} - R^u_{i^*} T^u_{i^*}\right).$$

$T^s_{i^*}$ and $T^u_{i^*}$ are the marginal tax rates of individuals just above $i^*$ and just below $i^*$, respectively. When $\sigma$ increases, the individual just above $i^*$ who initially accumulated skilled human capital will switch to accumulating unskilled human capital. Since, around $i^*$, the marginal tax rate of those who accumulate skilled human capital is higher than the marginal tax rate of those who accumulate unskilled human capital, the marginal tax rate of this individual will decrease.⁹ Thus, $R^s_{i^*} T^s_{i^*} - R^u_{i^*} T^u_{i^*}$ is the earning that is affected by the change of the marginal tax rates. Since this change of the marginal tax rate is of the first order, it can increase social welfare.

4 Discussion

In the above two sections, we have shown that indirect redistribution through an increase of the return from unskilled human capital and a decrease of the return from skilled human capital would increase social welfare. One natural question at this point is why such changes of returns do not have an adverse effect on human capital accumulation and, if they do, why we can ignore it. The answer is that they do have an adverse effect on human capital accumulation, but a redistributive income tax system also causes such an incentive problem. When each individual's comparative advantage is not observable and human capital accumulation is endogenous, redistributive income taxation necessarily introduces adverse incentive effects on human capital accumulation. In such a situation, the question is not whether a redistributive change of returns from different types of skill causes an adverse incentive effect but whether it can mitigate the existing incentive problem caused by income taxation. Given that income taxation is subject to asymmetric information due to the unobservability of comparative advantage at the individual level, redistributive changes of returns from different types of skill mitigate the asymmetric information problem, since the government can affect agents with different types of comparative advantage differently. As a result, redistributive changes of returns from different types of skill increase economic efficiency.

⁹ Readers still might wonder why, around $i^*$, the marginal tax rate for those who accumulated skilled human capital is higher than for those who accumulated unskilled human capital. The reason is that the marginal return from ability is higher on the right hand side of $i^*$ than on the left hand side of $i^*$ because

At this point we should emphasize that the assumption of comparative advantage plays a crucial role in our analysis. This implies that empirical studies that examine the returns from human capital accumulation for individuals with different abilities, such as DiNardo and Tobias (2001) and Tobias (2003), are important. In addition, the results of those empirical studies and the result of this paper can have important implications for public policy. For example, encouraging skilled human capital accumulation through government funding does not necessarily increase social welfare if comparative advantage in human capital accumulation exists and the individuals who benefit most from the government funding are individuals with higher ability.

In this paper, we examined the production efficiency theorem in a small open economy setting.

It is worth mentioning that in a small open economy setting, the Atkinson and Stiglitz theorem


(Samuelson, 1949). On the other hand, it is possible to extend the intuition of the present paper to a closed economy setting by using the two-sector, two-factor general equilibrium model (Harberger, 1962). In this case, we can prove that the Atkinson and Stiglitz theorem does not hold, since a commodity tax can affect producer prices and factor prices in a closed economy. More specifically, we can prove that imposing a commodity tax on a skilled human capital intensive good will increase social welfare.¹⁰

5 Conclusion

In this paper, we have examined whether indirect redistribution such as tariffs and production subsidies can complement income taxation in the long run, where human capital accumulation is endogenous. For that purpose, we developed two models in which individuals choose the amount of skilled and unskilled human capital based on their comparative advantage. In the first model, we assumed that skilled human capital and unskilled human capital are imperfect substitutes and that individuals accumulate both skilled and unskilled human capital. In the second model, we assumed that skilled human capital and unskilled human capital are perfect substitutes and that individuals accumulate only one type of human capital. Assuming that individuals with higher ability have a comparative advantage in accumulating skilled human capital, we have shown that indirect redistribution, such as imposing a tariff on an unskilled human capital intensive good, can increase efficiency and complement an income tax system. This suggests that the validity of the production efficiency theorem depends on how the process of human capital accumulation is modelled. The result of this paper also suggests that empirical studies such as DiNardo and Tobias (2001) and Tobias (2003), which showed that the returns from human capital differ among individuals with different abilities, have important implications for public policy.

10


Appendix

Derivation of equation (11)

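Let $\mu_i$ and $\lambda$ be the Lagrangian multipliers of the incentive constraint and the resource constraint. Then, the Lagrangian function is

$$\begin{aligned}
W(\sigma) = {} & \int_1^2 v(i) n_i\,di + \int_1^2 \mu_i\left[\frac{dv}{di} - f_s'(h^s_i)h^s_i\frac{g_s'}{g_s} - f_u'(h^u_i)h^u_i\frac{g_u'}{g_u}\right]di \\
& + \lambda\left[\int_1^2 n_i\{R_i - x_i(v_i)\}\,di + \sigma\left\{\int_1^2 n_i c_{2i}\,di - Y(p_2^* + \sigma, H_s, H_u)\right\}\right].
\end{aligned}$$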

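By using integration by parts, we obtain

$$\begin{aligned}
W(\sigma) = {} & \int_1^2 v_i n_i\,di + \int_1^2 \mu_i \frac{dv}{di}\,di - \int_1^2 \mu_i f_s'(h^s_i)h^s_i\frac{g_s'}{g_s}\,di - \int_1^2 \mu_i f_u'(h^u_i)h^u_i\frac{g_u'}{g_u}\,di \\
& + \lambda\int_1^2 n_i\{R_i - x_i\}\,di + \lambda\sigma\left\{\int_1^2 n_i c_{2i}\,di - Y(p_2^* + \sigma, H_s, H_u)\right\} \\
= {} & \int_1^2 v_i n_i\,di + \mu_2 v_2 - \mu_1 v_1 - \int_1^2 \dot\mu_i v_i\,di - \int_1^2 \mu_i f_s'(h^s_i)h^s_i\frac{g_s'}{g_s}\,di - \int_1^2 \mu_i f_u'(h^u_i)h^u_i\frac{g_u'}{g_u}\,di \\
& + \lambda\int_1^2 n_i\{R_i - x_i\}\,di + \lambda\sigma\left\{\int_1^2 n_i c_{2i}\,di - Y(p_2^* + \sigma, H_s, H_u)\right\}.
\end{aligned}$$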

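Therefore, the first order conditions are

$$n_i - \dot\mu_i - \lambda n_i \frac{\partial x_i}{\partial v_i} + \lambda\sigma n_i \frac{\partial c_{2i}}{\partial x_i}\frac{\partial x_i}{\partial v_i} = 0,$$

$$-\mu_i \frac{d[f_s'(h^s_i)h^s_i(g_s'/g_s)]}{dh^s_i}\frac{\partial h^s_i}{\partial R_i} - \mu_i \frac{d[f_u'(h^u_i)h^u_i(g_u'/g_u)]}{dh^u_i}\frac{\partial h^u_i}{\partial R_i} + \lambda n_i - \lambda n_i \frac{\partial x_i}{\partial R_i} + \lambda\sigma n_i \frac{\partial c_{2i}}{\partial x_i}\frac{\partial x_i}{\partial R_i} = 0,$$

$$\mu_1 = 0 \quad \text{and} \quad \mu_2 = 0.$$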

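By using the envelope theorem, we obtain

$$\begin{aligned}
\left.\frac{dW}{d\sigma}\right|_{\sigma=0} = {} & -\int_1^2 \mu_i \frac{d[f_s'(h^s_i)h^s_i(g_s'/g_s)]}{dh^s_i}\left\{\frac{dh^s_i}{dw^s_i}\frac{dw^s_i}{d\sigma} + \frac{dh^s_i}{dw^u_i}\frac{dw^u_i}{d\sigma}\right\}di \\
& - \int_1^2 \mu_i \frac{d[f_u'(h^u_i)h^u_i(g_u'/g_u)]}{dh^u_i}\left\{\frac{dh^u_i}{dw^s_i}\frac{dw^s_i}{d\sigma} + \frac{dh^u_i}{dw^u_i}\frac{dw^u_i}{d\sigma}\right\}di \\
& + \lambda\left\{\int_1^2 c_{2i} n_i\,di - Y(p_2^* + \sigma, H_s, H_u)\right\} \\
& + \lambda\int_1^2 \left(-\frac{\partial x_i}{\partial p_2} - \frac{\partial x_i}{\partial w^s_i}\frac{\partial w^s_i}{\partial\sigma} - \frac{\partial x_i}{\partial w^u_i}\frac{\partial w^u_i}{\partial\sigma}\right) n_i\,di.
\end{aligned}$$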

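Note that $-\partial x_i/\partial p_2 = U_{p_2}/U_x$. From Roy's identity, $U_{p_2}/U_x = -c_{2i}$. Therefore, $\lambda\int_1^2 c_{2i} n_i\,di + \lambda\int_1^2 (-\partial x_i/\partial p_2)\, n_i\,di = 0$, so the consumption terms cancel. In addition, $\partial x_i/\partial w^s_i = Z_{w^s_i}/U_x$, $\partial x_i/\partial w^u_i = Z_{w^u_i}/U_x$ and $\partial x_i/\partial R_i = Z_{R_i}/U_x$.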


Using the definition of $Z_{w^s_i}$ and $Z_{w^u_i}$, $\partial x_i/\partial w^s_i = -\alpha_i h^s_i/U_x$, $\partial x_i/\partial w^u_i = -\alpha_i h^u_i/U_x$ and $Z_R/U_x = \alpha_i/U_x$.

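On the other hand, the FOC of $R_i$ at $\sigma = 0$ is

$$-\mu_i \frac{d[f_s'(h^s_i)h^s_i(g_s'/g_s)]}{dh^s_i}\frac{\partial h^s_i}{\partial R_i} - \mu_i \frac{d[f_u'(h^u_i)h^u_i(g_u'/g_u)]}{dh^u_i}\frac{\partial h^u_i}{\partial R_i} + \lambda n_i = \lambda n_i \alpha_i/U_x.$$

Thus, $dW/d\sigma$ becomes

$$\begin{aligned}
\left.\frac{dW}{d\sigma}\right|_{\sigma=0} = {} & -\int_1^2 \mu_i \frac{d[f_s'(h^s_i)h^s_i(g_s'/g_s)]}{dh^s_i}\left\{\frac{dh^s_i}{dw^s_i}\frac{dw^s_i}{d\sigma} + \frac{dh^s_i}{dw^u_i}\frac{dw^u_i}{d\sigma}\right\}di \\
& - \int_1^2 \mu_i \frac{d[f_u'(h^u_i)h^u_i(g_u'/g_u)]}{dh^u_i}\left\{\frac{dh^u_i}{dw^s_i}\frac{dw^s_i}{d\sigma} + \frac{dh^u_i}{dw^u_i}\frac{dw^u_i}{d\sigma}\right\}di - \lambda y_2 \\
& + \int_1^2 \left[-\mu_i \frac{d[f_s'(h^s_i)h^s_i(g_s'/g_s)]}{dh^s_i}\frac{\partial h^s_i}{\partial R_i} - \mu_i \frac{d[f_u'(h^u_i)h^u_i(g_u'/g_u)]}{dh^u_i}\frac{\partial h^u_i}{\partial R_i} + \lambda n_i\right] h^s_i \frac{\partial w^s_i}{\partial\sigma}\,di \\
& + \int_1^2 \left[-\mu_i \frac{d[f_s'(h^s_i)h^s_i(g_s'/g_s)]}{dh^s_i}\frac{\partial h^s_i}{\partial R_i} - \mu_i \frac{d[f_u'(h^u_i)h^u_i(g_u'/g_u)]}{dh^u_i}\frac{\partial h^u_i}{\partial R_i} + \lambda n_i\right] h^u_i \frac{\partial w^u_i}{\partial\sigma}\,di.
\end{aligned}$$

The terms of this expression that do not involve $\mu_i$ are

$$-\lambda y_2 + \int_1^2 \lambda n_i h^s_i \frac{\partial w^s_i}{\partial\sigma}\,di + \int_1^2 \lambda n_i h^u_i \frac{\partial w^u_i}{\partial\sigma}\,di.$$

Note that $\int_1^2 \lambda n_i h^s_i \frac{\partial w^s_i}{\partial\sigma}\,di + \int_1^2 \lambda n_i h^u_i \frac{\partial w^u_i}{\partial\sigma}\,di = \int_1^2 \lambda n_i h^s_i\, i\, \frac{\partial w^s}{\partial\sigma}\,di + \int_1^2 \lambda n_i h^u_i \frac{\partial w^u}{\partial\sigma}\,di$, which is $\lambda$ times the change of total earnings due to a tariff when the levels of human capital of all individuals are fixed. On the other hand, under perfect competition, for given levels of human capital of all individuals, the total revenue of firms is equal to the total payment to factor owners. Thus, $y_1 + (p_2^* + \sigma) y_2 = w^s \int_1^2 n_i h^s_i\, i\,di + w^u \int_1^2 n_i h^u_i\,di$ always holds. Let $Q(\sigma)$ be the total revenue of firms when the human capital levels of all individuals are fixed. Then, $dQ/d\sigma = \int_1^2 n_i h^s_i\, i\, \frac{\partial w^s}{\partial\sigma}\,di + \int_1^2 n_i h^u_i \frac{\partial w^u}{\partial\sigma}\,di$. By the definition of $Q(\sigma)$,

$$Q(\sigma) = \max_{y_1,\, y_2} \; y_1 + (p_2^* + \sigma) y_2 \quad \text{s.t.} \quad (y_1, y_2) \in \Gamma(H_s, H_u),$$

where $H_s$ and $H_u$ are fixed.

From the envelope theorem, $dQ/d\sigma = y_2$. Therefore, $-\lambda y_2 + \int_1^2 \lambda n_i h^s_i \frac{\partial w^s_i}{\partial\sigma}\,di + \int_1^2 \lambda n_i h^u_i \frac{\partial w^u_i}{\partial\sigma}\,di = 0$.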
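The envelope step $dQ/d\sigma = y_2$ can be verified numerically in a simple case. The sketch below assumes a hypothetical production possibility frontier $y_1^2 + y_2^2 = 1$, which is not the technology of the paper, and compares a finite-difference derivative of the revenue function with the revenue-maximizing output of good 2.

```python
import math


def revenue(price):
    """Return (Q, y2) where Q = max y1 + price * y2 on the assumed frontier y1**2 + y2**2 = 1."""
    norm = math.sqrt(1.0 + price ** 2)
    y1, y2 = 1.0 / norm, price / norm  # revenue-maximizing output mix
    return y1 + price * y2, y2


price, step = 0.8, 1e-6
q_hi, _ = revenue(price + step)
q_lo, _ = revenue(price - step)
numerical_derivative = (q_hi - q_lo) / (2 * step)  # central difference for dQ/dprice
_, y2_star = revenue(price)
```

Here `price` plays the role of $p_2^* + \sigma$, so a marginal increase in $\sigma$ raises maximized revenue by exactly the output of good 2, with no first-order contribution from the reallocation of output.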

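Note that from the definition of $\tilde h^s_i$ and $\tilde h^u_i$, we have

$$f_s'(\tilde h^s_i)\frac{\partial \tilde h^s_i}{\partial w^s_i} + f_u'(\tilde h^u_i)\frac{\partial \tilde h^u_i}{\partial w^s_i} = 0 \quad \text{and} \quad f_s'(\tilde h^s_i)\frac{\partial \tilde h^s_i}{\partial w^u_i} + f_u'(\tilde h^u_i)\frac{\partial \tilde h^u_i}{\partial w^u_i} = 0.$$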


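By using the Slutsky equation for $h^s_i$ and $h^u_i$ and the above equation, we have

$$\begin{aligned}
\left.\frac{dW}{d\sigma}\right|_{\sigma=0} = {} & -\int_1^2 \mu_i \left\{\left[\frac{f_s''(h^s_i)h^s_i}{f_s'(h^s_i)} + 1\right]\frac{g_s'}{g_s} - \left[\frac{f_u''(h^u_i)h^u_i}{f_u'(h^u_i)} + 1\right]\frac{g_u'}{g_u}\right\} f_s'(h^s_i)\frac{\partial \tilde h^s_i}{\partial w^s_i}\frac{\partial w^s_i}{\partial\sigma}\,di \\
& - \int_1^2 \mu_i \left\{\left[\frac{f_s''(h^s_i)h^s_i}{f_s'(h^s_i)} + 1\right]\frac{g_s'}{g_s} - \left[\frac{f_u''(h^u_i)h^u_i}{f_u'(h^u_i)} + 1\right]\frac{g_u'}{g_u}\right\} f_s'(h^s_i)\frac{\partial \tilde h^s_i}{\partial w^u_i}\frac{\partial w^u_i}{\partial\sigma}\,di \\
= {} & -\int_1^2 \mu_i \left\{\left[\frac{f_s''(h^s_i)h^s_i}{f_s'(h^s_i)} + 1\right]\frac{g_s'}{g_s} - \left[\frac{f_u''(h^u_i)h^u_i}{f_u'(h^u_i)} + 1\right]\frac{g_u'}{g_u}\right\} f_s'(h^s_i)\left[\frac{\partial \tilde h^s_i}{\partial w^s_i}\frac{\partial w^s_i}{\partial\sigma} + \frac{\partial \tilde h^s_i}{\partial w^u_i}\frac{\partial w^u_i}{\partial\sigma}\right]di.
\end{aligned}$$

From the comparative advantage condition, the expression inside the large braces is positive. Also, both $\frac{\partial \tilde h^s_i}{\partial w^s_i}\frac{\partial w^s_i}{\partial\sigma}$ and $\frac{\partial \tilde h^s_i}{\partial w^u_i}\frac{\partial w^u_i}{\partial\sigma}$ are negative, because the distortion lowers the return from skilled human capital and raises the return from unskilled human capital. Thus, since $\mu_i \ge 0$, we have $\left.\frac{dW}{d\sigma}\right|_{\sigma=0} > 0$.

Proof of $\mu_i \ge 0$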

From the FOC of $v_i$, we have $n_i - \dot\mu_i - \lambda n_i \frac{\partial x_i}{\partial v_i} + \lambda\sigma n_i \frac{\partial c_{2i}}{\partial x_i}\frac{\partial x_i}{\partial v_i} = 0$. Thus, we have

$$n_i - \lambda n_i \frac{\partial x_i}{\partial v_i} = \dot\mu_i$$

at $\sigma = 0$. By integrating both sides and using $\partial x_i/\partial v_i = 1/U_x$ and $\mu_1 = 0$, we have

$$\int_1^i n_j\left\{1 - \frac{\lambda}{U_x}\right\}dj = \mu_i.$$

From the first order condition of the revelation problem, $U_x(p_2, X)X'(i) = Z_R R'(i)$. This means that $X'(i)$ and $R'(i)$ have the same sign. Since $v(i)$ is strictly increasing, $X(i)$ and $R(i)$ must be increasing. When $X(i)$ is increasing, $\lambda/U_x$ is increasing. This implies that if $1 - \lambda/U_x = 0$ at some $i^{**}$, then $1 - \lambda/U_x < 0$ for any $i > i^{**}$. However, $\mu_2 = 0$ from the FOC of $v_2$. This implies that $\mu_i$ increases from zero until $i^{**}$, then begins to decrease and reaches zero at $i = 2$. Therefore, $\mu_i \ge 0$ for all $1 \le i \le 2$.
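The shape of the multiplier path can be illustrated numerically. The sketch below assumes a uniform density $n_i = 1$ and a hypothetical integrand $1 - \lambda/U_x = 1.5 - i$, which is decreasing in $i$ and integrates to zero over $[1, 2]$ so that both boundary conditions hold. These choices are illustrative and are not derived from the model.

```python
def multiplier_path(num_points=1001):
    """Integrate mu'(i) = n_i * (1 - lambda / U_x) on [1, 2] starting from mu(1) = 0."""
    step = 1.0 / (num_points - 1)
    grid = [1.0 + k * step for k in range(num_points)]
    # Assumed integrand: n_i = 1 and 1 - lambda/U_x = 1.5 - i (decreasing in i).
    integrand = [1.5 - i for i in grid]
    mu = [0.0]
    for k in range(1, num_points):
        # Trapezoid rule, exact here because the integrand is linear.
        mu.append(mu[-1] + 0.5 * (integrand[k - 1] + integrand[k]) * step)
    return grid, mu


grid, mu = multiplier_path()
peak_ability = grid[mu.index(max(mu))]
```

In this example the multiplier rises from zero, peaks where the integrand changes sign (the analogue of $i^{**}$), and returns to zero at $i = 2$, so it is non-negative throughout.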


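The Lagrangian is:

$$\begin{aligned}
L = {} & \int_1^{i^*} v_u(i) n_i\,di + \int_{i^*}^2 v_s(i) n_i\,di + \int_1^{i^*} \mu^u_i\left\{\dot v_u - a^u h^u_i \frac{g_u'}{g_u}\right\}di + \int_{i^*}^2 \mu^s_i\left\{\dot v_s - a^s h^s_i \frac{g_s'}{g_s}\right\}di \\
& + \beta_1\{v^s_{i^*} - v^u_{i^*}\} + \beta_2\{R^s_{i^*} - R^u_{i^*}\} \\
& + \lambda\int_1^{i^*}\{R^u_i - x(R^u_i, v^u_i, w^s_i, w^u_i)\} n_i\,di + \lambda\int_{i^*}^2 \{R^s_i - x(R^s_i, v^s_i, w^s_i, w^u_i)\} n_i\,di \\
& + \lambda\sigma\left\{\int_1^2 n_i c_{2i}\,di - Y(p_2^* + \sigma, H_s, H_u)\right\}.
\end{aligned}$$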

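By using integration by parts, we obtain

$$\begin{aligned}
L = {} & \int_1^{i^*} v_u(i) n_i\,di + \int_{i^*}^2 v_s(i) n_i\,di + \mu^u_{i^*} v^u_{i^*} - \mu^u_1 v^u_1 - \int_1^{i^*} \dot\mu^u_i v^u_i\,di - \int_1^{i^*} \mu^u_i a^u h^u_i \frac{g_u'}{g_u}\,di \\
& + \mu^s_2 v^s_2 - \mu^s_{i^*} v^s_{i^*} - \int_{i^*}^2 \dot\mu^s_i v^s_i\,di - \int_{i^*}^2 \mu^s_i a^s h^s_i \frac{g_s'}{g_s}\,di + \beta_1\{v^s_{i^*} - v^u_{i^*}\} + \beta_2\{R^s_{i^*} - R^u_{i^*}\} \\
& + \lambda\int_1^{i^*}\{R^u_i - x(R^u_i, v^u_i, q, w^s_i, w^u_i)\} n_i\,di + \lambda\int_{i^*}^2 \{R^s_i - x(R^s_i, v^s_i, q, w^s_i, w^u_i)\} n_i\,di \\
& + \lambda\sigma\left\{\int_1^2 n_i c_{2i}\,di - Y(p_2^* + \sigma, H_s, H_u)\right\}.
\end{aligned}$$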

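The first order conditions for $v^s_i$, $v^s_2$, $v^s_{i^*}$, $R^s_i$, $R^s_{i^*}$, $v^u_i$, $v^u_{i^*}$, $v^u_1$, $R^u_i$ and $R^u_{i^*}$ are

$$\begin{aligned}
v^s_i &: \; n_i - \dot\mu^s_i - \lambda n_i \frac{\partial x}{\partial v^s_i} + \lambda\sigma n_i \frac{\partial c_{2i}}{\partial x_i}\frac{\partial x_i}{\partial v^s_i} = 0 \\
v^s_2 &: \; \mu^s_2 = 0 \\
v^s_{i^*} &: \; -\mu^s_{i^*} + \beta_1 = 0 \\
R^s_i &: \; -\mu^s_i \times a^s \frac{\partial h^s_i}{\partial R^s_i}\frac{g_s'}{g_s} + \lambda n_i - \lambda n_i \frac{\partial x}{\partial R^s_i} + \lambda\sigma n_i \frac{\partial c_{2i}}{\partial x_i}\frac{\partial x_i}{\partial R^s_i} = 0 \\
R^s_{i^*} &: \; \beta_2 = 0 \\
v^u_i &: \; n_i - \dot\mu^u_i - \lambda n_i \frac{\partial x}{\partial v^u_i} + \lambda\sigma n_i \frac{\partial c_{2i}}{\partial x_i}\frac{\partial x_i}{\partial v^u_i} = 0 \\
v^u_{i^*} &: \; \mu^u_{i^*} - \beta_1 = 0 \\
v^u_1 &: \; \mu^u_1 = 0 \\
R^u_i &: \; -\mu^u_i \times a^u \frac{\partial h^u_i}{\partial R^u_i}\frac{g_u'}{g_u} + \lambda n_i - \lambda n_i \frac{\partial x}{\partial R^u_i} + \lambda\sigma n_i \frac{\partial c_{2i}}{\partial x_i}\frac{\partial x_i}{\partial R^u_i} = 0 \\
R^u_{i^*} &: \; \beta_2 = 0
\end{aligned}$$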


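Now we characterize those first order conditions. First, note that at $\sigma = 0$,

$$\mu^s_i = \mu^s_{i^*} + \int_{i^*}^i n_j\left(1 - \lambda\frac{\partial x}{\partial v_j}\right)dj \;\; \text{for } i \in (i^*, 2) \quad \text{and} \quad \mu^u_{i^*} = \mu^u_1 + \int_1^{i^*} n_j\left(1 - \lambda\frac{\partial x}{\partial v_j}\right)dj. \qquad (16)$$

Thus, since $\mu^u_1 = 0$ and $\mu^s_{i^*} = \mu^u_{i^*}$ from the FOC of $v^s_{i^*}$ and $v^u_{i^*}$, $\mu^s_i = \int_1^i n_j\left(1 - \lambda\frac{\partial x}{\partial v_j}\right)dj$ for $i \in (i^*, 2)$. Note that $\partial x/\partial v_j = 1/U_x$. The single crossing property and the monotonicity of $R^s_i$ and $R^u_i$ guarantee that $x_i$ is increasing. This implies that $\partial x/\partial v_j$ is increasing and that the integrand is a decreasing function of $i$. Since $\mu^s_2 = 0$, $\mu^u_i$ and $\mu^s_i$ are non-negative.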

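Now we examine $dW/d\sigma$ and evaluate it at $\sigma = 0$. From the envelope theorem,

$$\begin{aligned}
\frac{dW}{d\sigma} = {} & \frac{\partial i^*}{\partial\sigma}\Big[ v_u(i^*) n_{i^*} - v_s(i^*) n_{i^*} + \dot\mu^u_{i^*} v_u(i^*) + \mu^u_{i^*} \dot v^u_{i^*} - \dot\mu^u_{i^*} v_u(i^*) - \mu^u_{i^*} a^u h^u_{i^*} \frac{g_u'}{g_u} \\
& \quad - \dot\mu^s_{i^*} v(i^*) - \mu^s_{i^*} \dot v^s_{i^*} + \dot\mu^s_{i^*} v(i^*) + \mu^s_{i^*} a^s h^s_{i^*} \frac{g_s'}{g_s} + \beta_1\{\dot v^s_{i^*} - \dot v^u_{i^*}\} + \beta_2 \dot R^u_{i^*} - \beta_2 \dot R^s_{i^*} \\
& \quad - \lambda\{R^s_{i^*} - x(R^s_{i^*}, v^s_{i^*}, q, w^s_{i^*}, w^u_{i^*})\} n_{i^*} + \lambda\{R^u_{i^*} - x(R^u_{i^*}, v^u_{i^*}, q, w^s_{i^*}, w^u_{i^*})\} n_{i^*} \Big] \\
& + \frac{\partial w^s}{\partial\sigma}\left( -\int_{i^*}^2 \mu^s_i a^s \frac{\partial h^s_i}{\partial w^s_i} g_s'\,di - \lambda\int_{i^*}^2 \frac{\partial x}{\partial w^s_i} n_i g_s(i)\,di \right) \\
& + \frac{\partial w^u}{\partial\sigma}\left( -\int_1^{i^*} \mu^u_i a^u \frac{\partial h^u_i}{\partial w^u_i} g_u'\,di - \lambda\int_1^{i^*} \frac{\partial x}{\partial w^u_i} g_u(i) n_i\,di \right) \\
& + \lambda\int_1^2 n_i c_{2i}\,di - \lambda Y(p_2^* + \sigma, H_s, H_u) + \lambda\int_{i^*}^2 \left(-\frac{\partial x}{\partial q}\frac{\partial q}{\partial\sigma}\right) n_i\,di + \lambda\int_1^{i^*} \left(-\frac{\partial x}{\partial q}\frac{\partial q}{\partial\sigma}\right) n_i\,di.
\end{aligned}$$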

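From Roy's identity, $c_{2i} = \frac{\partial x}{\partial q}\frac{\partial q}{\partial\sigma}$, so the consumption terms cancel. Using (BD1), (BD2), $\beta_1 = \mu^s_{i^*} = \mu^u_{i^*}$ and $\beta_2 = 0$, we have

$$\begin{aligned}
\frac{dW}{d\sigma} = {} & \frac{\partial i^*}{\partial\sigma}\left[\mu^s_{i^*} a^s h^s_{i^*} \frac{g_s'}{g_s} - \mu^u_{i^*} a^u h^u_{i^*} \frac{g_u'}{g_u}\right] \\
& + \frac{\partial w^s}{\partial\sigma}\left( -\int_{i^*}^2 \mu^s_i a^s \frac{\partial h^s_i}{\partial w^s_i} g_s'\,di - \lambda\int_{i^*}^2 \frac{\partial x}{\partial w^s_i} n_i g_s(i)\,di \right) \\
& + \frac{\partial w^u}{\partial\sigma}\left( -\int_1^{i^*} \mu^u_i a^u \frac{\partial h^u_i}{\partial w^u_i} g_u'\,di - \lambda\int_1^{i^*} \frac{\partial x}{\partial w^u_i} n_i g_u(i)\,di \right) - \lambda Y(p_2^* + \sigma, H_s, H_u).
\end{aligned}$$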

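Now we need to calculate the integrands. Note that from the definition of $h^s_i$ and $h^u_i$, we have

$$\frac{\partial h^s_i}{\partial w^s_i} = -h^s_i \frac{\partial h^s_i}{\partial R^s_i} \quad \text{and} \quad \frac{\partial h^u_i}{\partial w^u_i} = -h^u_i \frac{\partial h^u_i}{\partial R^u_i}.$$

This implies that

$$-\mu^s_i a^s \frac{\partial h^s_i}{\partial w^s_i} g_s' = \mu^s_i a^s h^s_i \frac{\partial h^s_i}{\partial R^s_i} g_s' \quad \text{and} \quad -\mu^u_i a^u \frac{\partial h^u_i}{\partial w^u_i} g_u' = \mu^u_i a^u h^u_i \frac{\partial h^u_i}{\partial R^u_i} g_u'.$$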


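By using the FOC of $R^s_i$ and $R^u_i$,

$$\mu^s_i a^s h^s_i \frac{\partial h^s_i}{\partial R^s_i} g_s' = h^s_i g_s\left\{\lambda n_i - \lambda n_i \frac{\partial x}{\partial R^s_i}\right\}, \qquad \mu^u_i a^u h^u_i \frac{\partial h^u_i}{\partial R^u_i} g_u' = h^u_i g_u\left\{\lambda n_i - \lambda n_i \frac{\partial x}{\partial R^u_i}\right\}.$$

Thus, $dW/d\sigma$ is

$$\begin{aligned}
\frac{dW}{d\sigma} = {} & \frac{\partial i^*}{\partial\sigma}\left[\mu^s_{i^*} a^s h^s_{i^*} \frac{g_s'}{g_s} - \mu^u_{i^*} a^u h^u_{i^*} \frac{g_u'}{g_u}\right] \\
& + \frac{\partial w^s}{\partial\sigma}\left( \int_{i^*}^2 h^s_i g_s\left\{\lambda n_i - \lambda n_i \frac{\partial x}{\partial R^s_i}\right\}di - \lambda\int_{i^*}^2 \frac{\partial x}{\partial w^s_i} g_s n_i\,di \right) \\
& + \frac{\partial w^u}{\partial\sigma}\left( \int_1^{i^*} h^u_i g_u\left\{\lambda n_i - \lambda n_i \frac{\partial x}{\partial R^u_i}\right\}di - \lambda\int_1^{i^*} \frac{\partial x}{\partial w^u_i} g_u n_i\,di \right) - \lambda Y(p_2^* + \sigma, H_s, H_u).
\end{aligned}$$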

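Next, we need to calculate $\lambda\frac{\partial w^s}{\partial\sigma}\int_{i^*}^2 h^s_i g_s(i) n_i\,di + \lambda\frac{\partial w^u}{\partial\sigma}\int_1^{i^*} h^u_i g_u(i) n_i\,di$. From the argument of the previous section, we have

$$\lambda y_2 = \lambda\frac{\partial w^s}{\partial\sigma}\int_{i^*}^2 h^s_i g_s(i) n_i\,di + \lambda\frac{\partial w^u}{\partial\sigma}\int_1^{i^*} h^u_i g_u(i) n_i\,di.$$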

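Third, we will show that $h^s_i \frac{\partial x}{\partial R^s_i} = -\frac{\partial x}{\partial w^s_i}$ and $h^u_i \frac{\partial x}{\partial R^u_i} = -\frac{\partial x}{\partial w^u_i}$. From the definition of $Z$, we have

$$\frac{\partial Z}{\partial R^s_i} = \frac{a^s}{w^s_i} \quad \text{and} \quad \frac{\partial Z}{\partial w^s_i} = -a^s h^s_i \frac{1}{w^s_i} \quad \text{for } i \in (i^*, 2),$$

$$\frac{\partial Z}{\partial R^u_i} = \frac{a^u}{w^u_i} \quad \text{and} \quad \frac{\partial Z}{\partial w^u_i} = -a^u h^u_i \frac{1}{w^u_i} \quad \text{for } i \in (1, i^*).$$

Thus, by using the definition of $\frac{\partial x}{\partial R^s_i}$, $\frac{\partial x}{\partial w^s_i}$, $\frac{\partial x}{\partial R^u_i}$ and $\frac{\partial x}{\partial w^u_i}$, we can check that $h^s_i \frac{\partial x}{\partial R^s_i} = -\frac{\partial x}{\partial w^s_i}$ and $h^u_i \frac{\partial x}{\partial R^u_i} = -\frac{\partial x}{\partial w^u_i}$. Therefore, $dW/d\sigma$ is

$$\left.\frac{dW}{d\sigma}\right|_{\sigma=0} = \frac{\partial i^*}{\partial\sigma}\left[\mu^s_{i^*} a^s h^s_{i^*} \frac{g_s'}{g_s} - \mu^u_{i^*} a^u h^u_{i^*} \frac{g_u'}{g_u}\right].$$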

From the FOC of $v^s_{i^*}$ and $v^u_{i^*}$, we have $\mu^s_{i^*} = \mu^u_{i^*}$. In addition, $a^s h^s_i \frac{g_s'}{g_s}$ and $a^u h^u_i \frac{g_u'}{g_u}$ are the right side slope of $v^s_i$ and the left side slope of $v^u_i$ at $i^*$. From Lemma 1, the slope of $v^s_i$ is steeper than the


References

[1] Atkinson, A. and Joseph E. Stiglitz, 1980, Lectures on Public Economics, McGraw Hill.

[2] Atkinson, A. and Joseph E. Stiglitz, 1976, “The Design of Tax Structure: Direct versus Indirect Taxation,” Journal of Public Economics 6, July-Aug., pp 55-75.

[3] Berge, Claude, 1963, Topological Spaces, New York, Macmillan.

[4] Boadway, Robin and Katherine Cuff, 2001, “A Minimum Wage Can Be Welfare-Improving and Employment-Enhancing,” European Economic Review 45(3), March, pp 553-76.

[5] Cremer, Helmuth, Pierre Pestieau and Jean-Charles Rochet, 2001, “Direct versus Indirect Taxation: The Design of the Tax Structure Revisited,” International Economic Review 42(3), pp 781-99.

[6] Diamond, Peter and James Mirrlees, 1971, “Optimal Taxation and Public Production,” American Economic Review 61, pp 8-27 and pp 261-278.

[7] DiNardo, John and Justin Tobias, 2001, “Nonparametric Density and Regression Estimation,” Journal of Economic Perspectives 15(4), pp 11-28.

[8] Fudenberg, Drew and Jean Tirole, 1991, Game Theory, MIT Press.

[9] Harberger, Arnold C., 1962, “The Incidence of the Corporation Income Tax,” Journal of Political Economy, pp 215-240.

[10] Mirrlees, James A., 1971, “An Exploration in the Theory of Optimum Income Taxation,” Review of Economic Studies 38, pp 175-208.

[11] Naito, Hisahiro, 1996, “Tariff As A Device to Relax the Incentive Problem of a Progressive Income Tax System,” Research Seminar of International Economics Working Papers, No 391, The Department of Economics and School of Public Policy, http://www.spp.umich.edu/rsi,

[12] Naito, Hisahiro, 1999, “Re-examination of Uniform Commodity Taxes under A Non-linear Income Tax System and Its Implication for Production Efficiency,” Journal of Public Economics 71(2), February, pp 165-188.

[13] Naito, Hisahiro, 2002, “Endogenous Human Capital Accumulation and Direct vs. Indirect Redistribution,” working paper, Institute of Social and Economic Research, Osaka University.

[14] Stiglitz, Joseph, 1982, “Self-Selection and Pareto Efficient Taxation,” Journal of Public Economics 17, pp 213-240.

[15] Samuelson, Paul A., 1949, “International Factor-Price Equalisation Once Again,” The Economic Journal, pp 181-197.

[16] Saez, Emmanuel, 2003, “Direct or Indirect Instruments for Redistribution: Short-run versus Long-Run,” forthcoming in Journal of Public Economics.

[17] Saez, Emmanuel, 2002, “The Desirability of Commodity Taxation under Non-linear Income Taxation and Heterogeneous Tastes,” Journal of Public Economics 83(2), pp 217-30.

[18] Stolper, Wolfgang and Paul Samuelson, 1941, “Protection and Real Wages,” Review of Economic Studies 9, pp 58-73.

[19] Tobias, Justin L., 2003, “Are Returns to Schooling Concentrated Among the Most Able? A Semiparametric Analysis of the Ability Earnings Relationships,” Oxford Bulletin of

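[Figure 1: the curves Vu and Vs plotted against ability i, with V on the vertical axis and the switching point i* marked on the horizontal axis.]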