John Nachbar
Washington University April 2, 2015
The Envelope Theorem
11 Introduction.
The Envelope theorem is a corollary of the Karush-Kuhn-Tucker theorem (KKT) that characterizes changes in the value of the objective function in response to changes in the parameters in the problem. For example, in a standard cost mini- mization problem for a firm, the Envelope theorem characterizes changes in cost in response to changes in input prices or output quantities.
2 The Envelope Theorem with no binding constraints.
Consider the following parameterized version of a differentiable MAX problem with no (binding) constraints,
max
xf (x, q)
where q ∈ R
Lis a vector of parameters. The parametrized MIN problem is analo- gous. Let φ(q) give the (or at least, a) optimal x for a given q. The value function f
vis defined by
f
v(q) = f (φ(q), q).
The following is the Envelope theorem for this unconstrained problem. In theorem statement, D
xf refers to the partial derivatives of f with respect to the x variables and D
qf refers to the partial derivatives of f with respect to the q variables.
Theorem 1. Fix a parametrized differentiable MAX or MIN problem. Let U ⊆ R
Lbe open, let φ : U → R
Nbe differentiable, and suppose that, for every q ∈ U , φ(q) is a solution to the MAX or MIN problem. Let f
v: U → R be defined by f
v(q) = f (φ(q), q). Finally, suppose that the standard first order condition, D
xf (φ(q), q) = 0, holds for every q ∈ U . Then for any q
∗∈ U , letting x
∗= φ(q
∗),
Df
v(q
∗) = D
qf (x
∗, q
∗).
1cbna. This work is licensed under the Creative Commons Attribution-NonCommercial- ShareAlike 4.0 License.
Proof. By the Chain Rule
2Df
v(q
∗) = D
xf (x
∗, q
∗)Dφ(q
∗) + D
qf (x
∗, q
∗).
Since D
xf (x
∗, q
∗) = 0, the result follows.
3 An example with no binding constraints.
Consider the cost minimization problem,
min
a,b∈Ra + b s.t. √
ab ≥ y a, b ≥ 0
To interpret, a is capital, b is labor, and y is output. I assume (for simplicity) that the rental price of capital and the wage for labor both equal 1, so that total cost is a + b. If y > 0, any feasible a and b must be strictly positive, hence the non-negativity constraints do not bind at the solution. On the other hand, the production constraint √
ab ≥ y binds (since cost is strictly increasing in both a and b, and since the production function is continuous), hence I can rewrite this constraint as
b = y
2a > 0.
Substituting, let the (modified) objective function be f (a, y) = a + y
2a .
I can thus convert the above constrained minimization problem into an uncon- strained minimization problem
min
a∈Ra +
ya2.
(With the qualification that I have to check that indeed a > 0 at any proposed solution.) In terms of notation, here a is playing the role of “x” and y is playing the role of “q”.
2Formally, define r : U → RN +Lby r(q) = (φ(q), q). Then fv(q) = f (r(q)). By the Chain Rule, Dfv(q∗) = Df (x∗, q∗)Dr(q∗)
=
Dxf (x∗, q∗) Dqf (x∗, q∗)
Dφ(q∗) I
= Dxf (x∗, q∗)Dφ(q∗) + Dqf (x∗, q∗).
as claimed. Many of the other Chain Rule applications are similar.
At y = y
∗, the first order condition D
af (a
∗, y
∗) = 0 gives, 1 − y
∗2a
∗2= 0,
hence y
∗= y
∗(which is strictly positive), hence φ(y) = a. For this particular application, call the value function C
`(since the value function here is what is often called long-run cost). Then C
`(y) = 2y.
Verifying the Envelope theorem,
DC
`(y
∗) = D
yf (a
∗, y
∗), since
D
yf (a
∗, y
∗) = 2y
∗a
∗= 2.
Next, note that for a fixed (I won’t try to justify here in why a might be fixed), the value,
a + y
2a ,
is the minimum cost of producing y (this is how I constructed the unconstrained problem in the first place). This can be interpreted as short-run cost, and accord- ingly I write,
C
s(a, y) = a + y
2a .
Short-run and long-run cost are equal when a = φ(y): C
`(y) = C
s(φ(y), y). By the Envelope theorem, then,
DC
`(y
∗) = D
yC
s(a
∗, y
∗).
For any given value of a
∗, the graph of C
sis strictly above that of C
`(i.e., short- run cost is strictly higher than long-run cost) except at output level y
∗, since at that output level, the level of the fixed factor a is optimal. Moreover, when y = y
∗, so that φ(y
∗) = a
∗, the two graphs are tangent; their slopes are equal. One says that the graph of C
`is the (lower) envelope of the graph of C
s, hence the name Envelope theorem. Graphing C
`and a few of the C
swill help you visualize what is going on.
All of this generalizes. If C
sis the short-run cost of producing a vector of outputs y given input prices and a fixed subvector of inputs a
∗, and if a
∗is optimal in the long-run when y = y
∗, then
DC
`(y
∗) = D
yC
s(a
∗, y
∗).
4 The General Envelope Theorem.
Consider the following parameterized version of a differentiable MAX problem, max
xf (x, q)
s.t. g
1(x, q) ≤ 0, .. . g
K(x, q) ≤ 0.
where q ∈ R
Lis a vector of parameters. For example, in a consumer maximization problem, the parameters are usually prices and income. The parameterized MIN problem is analogous. Again let φ(q) give the (or at least, a) optimal x for a given q and let the value function be defined by
f
v(q) = f (φ(q), q).
Theorem 2 (Envelope Theorem). Fix a differentiable parameterized MAX or MIN problem. Let U ⊆ R
Lbe open, let φ : U → R
Nbe differentiable, and suppose that, for every q ∈ U , φ(q) is a solution to the MAX or MIN problem. Let f
v: U → R be defined by f
v(q) = f (φ(q), q). Let J (q) be the set of indices of constraints that are binding at φ(q). Finally, suppose that the conclusion of KKT holds for every q ∈ U . Fix any q
∗∈ U , let φ(q
∗) = x
∗, let J
∗= J (q
∗), and let λ
∗kbe the associated KKT multipliers. Then
Df
v(q
∗) = D
qf (x
∗, q
∗) − X
k∈J∗
λ
∗kD
qg
k(x
∗, q
∗).
Proof. By the Chain Rule (see the proof of Theorem 1), Df
v(q
∗) = D
xf (x
∗, q
∗)Dφ(q
∗) + D
qf (x
∗, q
∗).
By KKT,
D
xf (x
∗, q
∗) = X
k∈J
λ
∗kD
xg
k(x
∗, q
∗).
Substituting,
Df
v(q
∗) = X
k∈J
λ
∗kD
xg
k(x
∗, q
∗)Dφ(q
∗) + D
qf (x
∗, q
∗).
For any k ∈ J
∗, let γ
k(q) = g
k(φ(q), q). For a MAC problem, since k is binding at q
∗, γ
k(q
∗) = 0 while for all other q ∈ U , γ
k(q) ≤ 0 (since φ(q) must be feasible to be a solution). Therefore, q
∗maximizes γ
kon U . (For a MIN problem, the argument is almost the same but q
∗minimizes γ
kon U .) Therefore Dγ
k(q
∗) = 0, hence, by the Chain Rule,
0 = Dγ
k(q
∗) = D
xg
k(x
∗, q
∗)Dφ(q
∗) + D
qg
k(x
∗, q
∗).
Hence, D
xg
k(x
∗, q
∗)Dφ(q
∗) = −D
qg
k(x
∗, q
∗), hence, since λ
∗k≥ 0, λ
∗kD
xg
k(x
∗, q
∗)Dφ(q
∗) = −λ
∗kD
qg
k(x
∗, q
∗).
Substituting this expression for λ
∗kD
xg
k(x
∗, q
∗)Dφ(q
∗) into the above expression for Df
v(q
∗) yields the result.
5 An example with constraints.
Consider the problem
max
x 13
ln(x
1) +
23ln(x
2) s.t. p · x ≤ m,
x ≥ 0
Where x ∈ R
N, p ∈ R
N++and m ∈ R
++. Think of this as a utility maximization problem with parameters being prices p 0 and income m > 0.
Grinding through the Kuhn-Tucker calculation, at prices p
∗and m
∗the solution is,
x
∗=
m
∗/(3p
∗1) (2m
∗)/(3p
∗2)
. Thus, at general (p, m), the solution function φ is given by,
φ(p, m) =
m/(3p
1) (2m)/(3p
2)
. Also
λ
∗1= 1 m
∗and the other KKT multipliers (on the x ≥ 0 constraint) equal zero.
The value function is then determined by f
v(p, m) = f (φ(p, m)), hence f
v(p, m) = 1
3 ln m 3p
1+ 2
3 ln 2m 3p
2= ln(m) − 1
3 ln(p
1) − 2
3 ln(p
2) + 1 3 ln 1
3
+ 2
3 ln 2 3
.
Note that this makes some intuitive sense; in particular f
vis increasing in m and decreasing in p
1and p
2.
By direct calculation, at the point (p
∗, m
∗),
Df
v(p
∗, m
∗) =
∂fv
∂p1
(p
∗, m
∗)
∂fv
∂p2
(p
∗, m
∗)
∂fv
∂m