The Envelope Theorem 1

(1)

John Nachbar

Washington University April 2, 2015

The Envelope Theorem

¹

1 Introduction.

The Envelope theorem is a corollary of the Karush-Kuhn-Tucker theorem (KKT) that characterizes changes in the value of the objective function in response to changes in the parameters in the problem. For example, in a standard cost minimization problem for a firm, the Envelope theorem characterizes changes in cost in response to changes in input prices or output quantities.

2 The Envelope Theorem with no binding constraints.

Consider the following parameterized version of a differentiable MAX problem with no (binding) constraints,

max

x

f (x, q)

where q ∈ R

^L

is a vector of parameters. The parametrized MIN problem is analogous. Let φ(q) give the (or at least, a) optimal x for a given q. The value function f

^v

is defined by

f

^v

(q) = f (φ(q), q).

The following is the Envelope theorem for this unconstrained problem. In theorem statement, D

x

f refers to the partial derivatives of f with respect to the x variables and D

_q

f refers to the partial derivatives of f with respect to the q variables.

Theorem 1. Fix a parametrized differentiable MAX or MIN problem. Let U ⊆ R

^L

be open, let φ : U → R

^N

be differentiable, and suppose that, for every q ∈ U , φ(q) is a solution to the MAX or MIN problem. Let f

^v

: U → R be defined by f

^v

(q) = f (φ(q), q). Finally, suppose that the standard first order condition, D

x

f (φ(q), q) = 0, holds for every q ∈ U . Then for any q

^∗

∈ U , letting x

^∗

= φ(q

^∗

),

Df

^v

(q

^∗

) = D

q

f (x

^∗

, q

^∗

).

1cbna. This work is licensed under the Creative Commons Attribution-NonCommercial- ShareAlike 4.0 License.

(2)

Proof. By the Chain Rule

²

Df

^v

(q

^∗

) = D

_x

f (x

^∗

, q

^∗

)Dφ(q

^∗

) + D

_q

f (x

^∗

, q

^∗

).

Since D

x

f (x

^∗

, q

^∗

) = 0, the result follows.

3 An example with no binding constraints.

Consider the cost minimization problem,

min

_a,b∈R

a + b s.t. √

ab ≥ y a, b ≥ 0

To interpret, a is capital, b is labor, and y is output. I assume (for simplicity) that the rental price of capital and the wage for labor both equal 1, so that total cost is a + b. If y > 0, any feasible a and b must be strictly positive, hence the non-negativity constraints do not bind at the solution. On the other hand, the production constraint √

ab ≥ y binds (since cost is strictly increasing in both a and b, and since the production function is continuous), hence I can rewrite this constraint as

b = y

²

a > 0.

Substituting, let the (modified) objective function be f (a, y) = a + y

²

a .

I can thus convert the above constrained minimization problem into an unconstrained minimization problem

min

_a∈R

a +

^y_a²

.

(With the qualification that I have to check that indeed a > 0 at any proposed solution.) In terms of notation, here a is playing the role of “x” and y is playing the role of “q”.

2Formally, define r : U → R^{N +L}by r(q) = (φ(q), q). Then f^v(q) = f (r(q)). By the Chain Rule, Df^v(q^∗) = Df (x^∗, q^∗)Dr(q^∗)

=

Dxf (x^∗, q^∗) Dqf (x^∗, q^∗)

Dφ(q^∗) I

= Dxf (x^∗, q^∗)Dφ(q^∗) + Dqf (x^∗, q^∗).

as claimed. Many of the other Chain Rule applications are similar.

(3)

At y = y

^∗

, the first order condition D

_a

f (a

^∗

, y

^∗

) = 0 gives, 1 − y

^∗2

a

^∗2

= 0,

hence y

^∗

= y

^∗

(which is strictly positive), hence φ(y) = a. For this particular application, call the value function C

^`

(since the value function here is what is often called long-run cost). Then C

^`

(y) = 2y.

Verifying the Envelope theorem,

DC

^`

(y

^∗

) = D

_y

f (a

^∗

, y

^∗

), since

D

_y

f (a

^∗

, y

^∗

) = 2y

^∗

a

^∗

= 2.

Next, note that for a fixed (I won’t try to justify here in why a might be fixed), the value,

a + y

²

a ,

is the minimum cost of producing y (this is how I constructed the unconstrained problem in the first place). This can be interpreted as short-run cost, and accord- ingly I write,

C

^s

(a, y) = a + y

²

a .

Short-run and long-run cost are equal when a = φ(y): C

^`

(y) = C

^s

(φ(y), y). By the Envelope theorem, then,

DC

^`

(y

^∗

) = D

_y

C

^s

(a

^∗

, y

^∗

).

For any given value of a

^∗

, the graph of C

^s

is strictly above that of C

^`

(i.e., short- run cost is strictly higher than long-run cost) except at output level y

^∗

, since at that output level, the level of the fixed factor a is optimal. Moreover, when y = y

^∗

, so that φ(y

^∗

) = a

^∗

, the two graphs are tangent; their slopes are equal. One says that the graph of C

^`

is the (lower) envelope of the graph of C

^s

, hence the name Envelope theorem. Graphing C

^`

and a few of the C

^s

will help you visualize what is going on.

All of this generalizes. If C

^s

is the short-run cost of producing a vector of outputs y given input prices and a fixed subvector of inputs a

^∗

, and if a

^∗

is optimal in the long-run when y = y

^∗

, then

DC

^`

(y

^∗

) = D

_y

C

^s

(a

^∗

, y

^∗

).

(4)

4 The General Envelope Theorem.

Consider the following parameterized version of a differentiable MAX problem, max

x

f (x, q)

s.t. g

₁

(x, q) ≤ 0, .. . g

K

(x, q) ≤ 0.

where q ∈ R

^L

is a vector of parameters. For example, in a consumer maximization problem, the parameters are usually prices and income. The parameterized MIN problem is analogous. Again let φ(q) give the (or at least, a) optimal x for a given q and let the value function be defined by

f

^v

(q) = f (φ(q), q).

Theorem 2 (Envelope Theorem). Fix a differentiable parameterized MAX or MIN problem. Let U ⊆ R

^L

be open, let φ : U → R

^N

be differentiable, and suppose that, for every q ∈ U , φ(q) is a solution to the MAX or MIN problem. Let f

^v

: U → R be defined by f

^v

(q) = f (φ(q), q). Let J (q) be the set of indices of constraints that are binding at φ(q). Finally, suppose that the conclusion of KKT holds for every q ∈ U . Fix any q

^∗

∈ U , let φ(q

^∗

) = x

^∗

, let J

^∗

= J (q

^∗

), and let λ

^∗_k

be the associated KKT multipliers. Then

Df

^v

(q

^∗

) = D

_q

f (x

^∗

, q

^∗

) − X

k∈J^∗

λ

^∗_k

D

_q

g

_k

(x

^∗

, q

^∗

).

Proof. By the Chain Rule (see the proof of Theorem 1), Df

^v

(q

^∗

) = D

x

f (x

^∗

, q

^∗

)Dφ(q

^∗

) + D

q

f (x

^∗

, q

^∗

).

By KKT,

D

_x

f (x

^∗

, q

^∗

) = X

k∈J

λ

^∗_k

D

_x

g

_k

(x

^∗

, q

^∗

).

Substituting,

Df

^v

(q

^∗

) = X

k∈J

λ

^∗_k

D

_x

g

_k

(x

^∗

, q

^∗

)Dφ(q

^∗

) + D

_q

f (x

^∗

, q

^∗

).

For any k ∈ J

^∗

, let γ

_k

(q) = g

_k

(φ(q), q). For a MAC problem, since k is binding at q

^∗

, γ

_k

(q

^∗

) = 0 while for all other q ∈ U , γ

_k

(q) ≤ 0 (since φ(q) must be feasible to be a solution). Therefore, q

^∗

maximizes γ

k

on U . (For a MIN problem, the argument is almost the same but q

^∗

minimizes γ

_k

on U .) Therefore Dγ

_k

(q

^∗

) = 0, hence, by the Chain Rule,

0 = Dγ

_k

(q

^∗

) = D

x

g

_k

(x

^∗

, q

^∗

)Dφ(q

^∗

) + D

q

g

_k

(x

^∗

, q

^∗

).

(5)

Hence, D

_x

g

_k

(x

^∗

, q

^∗

)Dφ(q

^∗

) = −D

_q

g

_k

(x

^∗

, q

^∗

), hence, since λ

^∗_k

≥ 0, λ

^∗_k

D

_x

g

_k

(x

^∗

, q

^∗

)Dφ(q

^∗

) = −λ

^∗_k

D

_q

g

_k

(x

^∗

, q

^∗

).

Substituting this expression for λ

^∗_k

D

_x

g

_k

(x

^∗

, q

^∗

)Dφ(q

^∗

) into the above expression for Df

^v

(q

^∗

) yields the result.

5 An example with constraints.

Consider the problem

max

x 1

3

ln(x

1

) +

²₃

ln(x

2

) s.t. p · x ≤ m,

x ≥ 0

Where x ∈ R

^N

, p ∈ R

^N++

and m ∈ R

++

. Think of this as a utility maximization problem with parameters being prices p 0 and income m > 0.

Grinding through the Kuhn-Tucker calculation, at prices p

^∗

and m

^∗

the solution is,

x

^∗

=

m

^∗

/(3p

^∗₁

) (2m

^∗

)/(3p

^∗₂

)

. Thus, at general (p, m), the solution function φ is given by,

φ(p, m) =

m/(3p

₁

) (2m)/(3p

2

)

. Also

λ

^∗₁

= 1 m

^∗

and the other KKT multipliers (on the x ≥ 0 constraint) equal zero.

The value function is then determined by f

^v

(p, m) = f (φ(p, m)), hence f

^v

(p, m) = 1

3 ln m 3p

₁

+ 2

3 ln 2m 3p

₂

= ln(m) − 1

3 ln(p

₁

) − 2

3 ln(p

₂

) + 1 3 ln 1

3 + 2

3 ln 2 3

.

Note that this makes some intuitive sense; in particular f

^v

is increasing in m and decreasing in p

1

and p

2

.

By direct calculation, at the point (p

^∗

, m

^∗

),

Df

^v

(p

^∗

, m

^∗

) =







∂f^v

∂p1

(p

^∗

, m

^∗

)

∂f^v

∂p2

(p

^∗

, m

^∗

)

∂f^v

∂m

(p

^∗

, m

^∗

)





 =





−1/(3p

^∗₁

)

−2/(3p

^∗₂

) 1/m

^∗



 .

(6)

On the other hand, by the Envelope theorem, since the parameters p and m don’t appear in the objective function, and with the first constraint written in standard form as g

₁

(x, p, m) = p · x − m ≤ 0:

Df

^v

(p

^∗

, m

^∗

) =





−λ

^∗₁

x

^∗₁

−λ

^∗₁

x

^∗₂

λ

^∗₁



 .

Substitute in the value of x

^∗

and λ

^∗

found above and you will confirm that these two expressions for Df

^v

(p

^∗

, m

^∗

) are indeed equal.

6 The constraint term and the interpretation of the λ

_k

.

Consider a MAX problem and focus on K parameters with the following special form. Parameter q

_k

affects only constraint k and it does it additively:

g

_k

(x) − q

_k

≤ 0.

Increasing q

k

weakens the constraint. Then, from the Envelope theorem, at q

^∗_k

= 0, D

_q_k

f

^v

(0) = −λ

^∗_k

(−1) = λ

^∗_k

,

for any k ∈ J

^∗

. That is, λ

^∗_k

is the marginal value of relaxing binding constraint k.

The argument for MIN problems is almost identical.

For k / ∈ J , set λ

^∗_k

= 0. For some versions of KKT, this is actually a requirement

of KKT. For KKT as I have stated it, it is more in the nature of a convention. In

any event, the interpretation is that if constraint k is not binding then the marginal

value of relaxing constraint k is zero.