Chapter 11
Optimization with equality constraints
11.1
First order necessary conditions for constrained local maximum
11.1.1
Single equality constraint
Consider a set $A \subset \mathbb{R}^n$ and two functions $f : A \to \mathbb{R}$ and $g : A \to \mathbb{R}$. The set $C = \{x \in A : g(x) = 0\}$ is referred to as the constraint set. The following is a typical optimization problem:
$$\max f(x) \quad \text{subject to } x \in C. \tag{11.1}$$
Definition 11.1 (Local maximum). A point $x^* \in C$ is a point of local maximum of $f$ subject to the constraint $g(x) = 0$ if there exists an open ball $B_\epsilon(x^*)$ around $x^*$ such that $f(x^*) \ge f(x)$ for all $x \in B_\epsilon(x^*) \cap C$.
Definition 11.2 (Global maximum). A point $x^* \in C$ is a point of global maximum of $f$ subject to the constraint $g(x) = 0$ if it solves problem (11.1).
Local and global minimum can be defined in an analogous manner.
11.1.2
Lagrange Method
Theorem 11.1 (Lagrange: single equality constraint). Let $A \subset \mathbb{R}^n$ be open, and let $f : A \to \mathbb{R}$, $g : A \to \mathbb{R}$ be $C^1$ functions on $A$. Suppose $x^*$ is a point of local maximum or local minimum of $f$ subject to $g(x) = 0$. Further suppose $\nabla g(x^*) \neq 0$. Then there is $\lambda^* \in \mathbb{R}$ such that
$$\nabla f(x^*) = \lambda^* \nabla g(x^*). \tag{11.2}$$
The $n$ conditions in (11.2) and the constraint condition $g(x) = 0$ together are referred to as first order conditions for a constrained local maximum or local minimum.
There is a convenient method to express the conclusion of Theorem 11.1. Consider the function $L : A \times \mathbb{R} \to \mathbb{R}$ given by
$$L(x, \lambda) = f(x) - \lambda g(x). \tag{11.3}$$
The function $L$ is referred to as the Lagrangian and $\lambda$ is referred to as the Lagrangian multiplier. Now consider the problem of finding the unconstrained local maximum or minimum of $L$. It has the first order conditions
$$D_i L(x, \lambda) = 0, \quad i = 1, \dots, n + 1,$$
which give $D_i f(x) = \lambda D_i g(x)$ for $i = 1, \dots, n$ and, from the derivative with respect to $\lambda$, $g(x) = 0$. Note that the first $n$ conditions are exactly the same as in the statement of Theorem 11.1. This method of expressing the conditions of a constrained optimization problem is known as the Lagrangian multiplier method.
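The first order conditions of the Lagrangian can be checked numerically. The sketch below uses an illustrative problem that is not from the text (maximize $x_1 x_2$ subject to $x_1 + x_2 - 2 = 0$, with candidate $x^* = (1, 1)$ and $\lambda^* = 1$); the helper names `gradient` and `lagrangian_gradient` are assumptions made for this example.

```python
# Illustrative problem (an assumption, not from the text):
# maximize f(x1, x2) = x1 * x2 subject to g(x1, x2) = x1 + x2 - 2 = 0.

def f(x):
    return x[0] * x[1]

def g(x):
    return x[0] + x[1] - 2.0

def gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad

def lagrangian_gradient(x, lam):
    """Partial derivatives of L(x, lam) = f(x) - lam * g(x)
    with respect to x_1, ..., x_n and then lam."""
    df, dg = gradient(f, x), gradient(g, x)
    return [dfi - lam * dgi for dfi, dgi in zip(df, dg)] + [-g(x)]

x_star, lam_star = [1.0, 1.0], 1.0
foc = lagrangian_gradient(x_star, lam_star)
print(foc)  # all n + 1 entries should be numerically zero
```

The first $n$ entries of `foc` correspond to $D_i f = \lambda D_i g$ and the last one to the constraint $g(x) = 0$.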
Remark. The condition $\nabla g(x^*) \neq 0$ is known as the constraint qualification. Theorem 11.1 may not be applicable without the constraint qualification. For instance, consider $f(x_1, x_2) = x_1 + x_2$ and $g(x_1, x_2) = x_1^2 + x_2^2$ for all $(x_1, x_2) \in \mathbb{R}^2$. Check that the conclusion of Theorem 11.1 does not hold with respect to the problem $\max_{x_1, x_2} f(x_1, x_2)$ subject to $g(x_1, x_2) = 0$ in this case. This is because the constraint set is a singleton, containing only the point $(0, 0)$, at which the constraint qualification is not satisfied: $\nabla g(0, 0) = (0, 0)$ while $\nabla f(0, 0) = (1, 1)$, so no $\lambda$ satisfies (11.2).
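The failure in the remark can be made concrete with a few lines of code. This is a minimal sketch for the example $f = x_1 + x_2$, $g = x_1^2 + x_2^2$; the helper `multiplier_exists` is a hypothetical name introduced here and only handles two-dimensional gradients.

```python
def grad_f(x1, x2):
    # f(x1, x2) = x1 + x2
    return (1.0, 1.0)

def grad_g(x1, x2):
    # g(x1, x2) = x1**2 + x2**2
    return (2.0 * x1, 2.0 * x2)

def multiplier_exists(df, dg, tol=1e-12):
    """True if df = lam * dg for some real lam (two-dimensional vectors)."""
    if all(abs(c) <= tol for c in dg):
        # dg = 0: the equation df = lam * dg forces df = 0
        return all(abs(c) <= tol for c in df)
    # dg != 0: df is a multiple of dg exactly when the 2x2 determinant vanishes
    return abs(df[0] * dg[1] - df[1] * dg[0]) <= tol

x_star = (0.0, 0.0)  # the only point of the constraint set
df, dg = grad_f(*x_star), grad_g(*x_star)

qualification_holds = any(c != 0.0 for c in dg)
print(qualification_holds)        # constraint qualification at x_star
print(multiplier_exists(df, dg))  # whether some lam satisfies (11.2)
```

Both printed values are `False`: the point is trivially the constrained maximum, yet no multiplier exists because the gradient of $g$ vanishes there.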
11.1.3
Multiple equality constraints
Suppose there are $m$ equality constraints given by $g_j(x) = 0$, $j = 1, \dots, m$. Then the constraint set is
$$C = \{x \in A : g_j(x) = 0,\ j = 1, \dots, m\}.$$
The constraint qualification involves the Jacobian derivative of the constraint functions:
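$$Dg(x^*) = \begin{bmatrix}
\frac{\partial g_1}{\partial x_1}(x^*) & \cdots & \frac{\partial g_1}{\partial x_n}(x^*) \\
\vdots & \ddots & \vdots \\
\frac{\partial g_m}{\partial x_1}(x^*) & \cdots & \frac{\partial g_m}{\partial x_n}(x^*)
\end{bmatrix}.$$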
The natural generalization of the constraint qualification to the case of multiple constraints is that the rank of $Dg(x^*)$ must be equal to $m$. This condition is referred to as the non-degenerate constraint qualification. It implies that the constraint set has a well-defined $(n - m)$-dimensional tangent plane everywhere.
We will use the following definition.
Definition 11.3 (Critical point). A point $x$ is called a critical point of $g = (g_1, \dots, g_m)$ if the rank of $Dg(x)$ is less than $m$.
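Whether a point is a critical point of $g$ can be checked by computing the rank of the Jacobian. The sketch below assumes two illustrative constraints that are not from the text, $g_1 = x_1^2 + x_2^2 + x_3^2 - 1$ and $g_2 = x_3 - 1$, whose constraint set is the single point $(0, 0, 1)$; `matrix_rank` is a plain Gaussian elimination written for this example.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a matrix (list of rows) by Gaussian elimination with pivoting."""
    a = [list(map(float, r)) for r in rows]
    n_rows, n_cols = len(a), len(a[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # pick the largest entry in this column among the unused rows
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= tol:
            continue  # no pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank

def jacobian(x):
    """Dg(x) for g1 = x1^2 + x2^2 + x3^2 - 1 and g2 = x3 - 1 (assumed example)."""
    return [[2.0 * x[0], 2.0 * x[1], 2.0 * x[2]],
            [0.0, 0.0, 1.0]]

m = 2
rank_regular = matrix_rank(jacobian([1.0, 0.0, 0.0]))
rank_degenerate = matrix_rank(jacobian([0.0, 0.0, 1.0]))
print(rank_regular, rank_degenerate)
```

At $(0, 0, 1)$ the two rows of $Dg$ are proportional, so the rank is below $m$ and the point is a critical point of $g$ in the sense of Definition 11.3.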
Theorem 11.2 (Lagrange: multiple equality constraints). Let $A \subset \mathbb{R}^n$ be open, and let $f : A \to \mathbb{R}$, $g_j : A \to \mathbb{R}$, $j = 1, \dots, m$, be $C^1$ functions on $A$. Suppose $x^*$ is a point of local maximum or local minimum of $f$ subject to $g_j(x) = 0$, $j = 1, \dots, m$. Further suppose the rank of $Dg(x^*)$ is $m$. Then there exist $(\lambda_1^*, \dots, \lambda_m^*) \in \mathbb{R}^m$ such that $(x^*, \lambda^*)$ is a critical point of the Lagrangian
$$L(x, \lambda) = f(x) - \lambda_1 g_1(x) - \cdots - \lambda_m g_m(x), \tag{11.4}$$
i.e.,
$$\frac{\partial L}{\partial x_i}(x^*, \lambda^*) = 0, \quad i = 1, \dots, n; \qquad \frac{\partial L}{\partial \lambda_j}(x^*, \lambda^*) = 0, \quad j = 1, \dots, m.$$
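Proof. We first claim that the $(m + 1) \times n$ Jacobian matrix
$$\begin{bmatrix}
\frac{\partial f}{\partial x_1}(x^*) & \cdots & \frac{\partial f}{\partial x_n}(x^*) \\
\frac{\partial g_1}{\partial x_1}(x^*) & \cdots & \frac{\partial g_1}{\partial x_n}(x^*) \\
\vdots & \ddots & \vdots \\
\frac{\partial g_m}{\partial x_1}(x^*) & \cdots & \frac{\partial g_m}{\partial x_n}(x^*)
\end{bmatrix}$$
does not have maximal rank $m + 1$. Let $f(x^*) = c$. We know that $x^*$ is a solution of
$$\begin{aligned}
f(x) &= c \\
g_1(x) &= 0 \\
&\ \,\vdots \\
g_m(x) &= 0.
\end{aligned}$$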
Suppose the Jacobian matrix above has full rank. Then by the Implicit Function Theorem 8.9, we can find a solution $x^{**}$, arbitrarily close to $x^*$, to the system
$$\begin{aligned}
f(x) &= c + \epsilon \\
g_1(x) &= 0 \\
&\ \,\vdots \\
g_m(x) &= 0
\end{aligned}$$
where $\epsilon$ is a small positive number. Then $f(x^{**}) > f(x^*)$ and $g_j(x^{**}) = 0$ for $j = 1, \dots, m$, contradicting our assumption that $x^*$ locally maximizes $f$ subject to the constraints $g_j(x) = 0$, $j = 1, \dots, m$. (If $x^*$ is a local minimum, the same argument applies with $c - \epsilon$ in place of $c + \epsilon$.) Consequently, the $(m + 1) \times n$ matrix does not have maximal rank. This implies that the $m + 1$ rows of this matrix are linearly dependent, i.e., there exist scalars $\alpha_0, \alpha_1, \dots, \alpha_m$, not all zero, such that
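$$\alpha_0 \begin{bmatrix} \frac{\partial f}{\partial x_1}(x^*) \\ \vdots \\ \frac{\partial f}{\partial x_n}(x^*) \end{bmatrix}
+ \alpha_1 \begin{bmatrix} \frac{\partial g_1}{\partial x_1}(x^*) \\ \vdots \\ \frac{\partial g_1}{\partial x_n}(x^*) \end{bmatrix}
+ \cdots
+ \alpha_m \begin{bmatrix} \frac{\partial g_m}{\partial x_1}(x^*) \\ \vdots \\ \frac{\partial g_m}{\partial x_n}(x^*) \end{bmatrix}
= \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}. \tag{11.5}$$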
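Now we claim that $\alpha_0 \neq 0$: otherwise, there exist scalars $\alpha_1, \dots, \alpha_m$, not all zero, such that
$$\alpha_1 \begin{bmatrix} \frac{\partial g_1}{\partial x_1}(x^*) \\ \vdots \\ \frac{\partial g_1}{\partial x_n}(x^*) \end{bmatrix}
+ \cdots
+ \alpha_m \begin{bmatrix} \frac{\partial g_m}{\partial x_1}(x^*) \\ \vdots \\ \frac{\partial g_m}{\partial x_n}(x^*) \end{bmatrix}
= \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix},$$
i.e., the $m$ rows of $Dg(x^*)$ are not independent, so $Dg(x^*)$ does not have full rank. This contradicts the rank assumption.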
Finally, divide (11.5) through by $\alpha_0$ and write $\lambda_i^* = -\alpha_i / \alpha_0$, $i = 1, \dots, m$, to obtain
$$\nabla f(x^*) - \lambda_1^* \nabla g_1(x^*) - \cdots - \lambda_m^* \nabla g_m(x^*) = 0.$$
Hence the claim.
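When the rows of $Dg(x^*)$ are independent, the multipliers in the conclusion can be recovered from $\nabla f(x^*)$ by solving a small linear system. The sketch below assumes an illustrative problem that is not from the text: minimize $x_1^2 + x_2^2 + x_3^2$ subject to $x_1 + x_2 + x_3 - 1 = 0$ and $x_1 - x_2 = 0$, with solution $x^* = (1/3, 1/3, 1/3)$. It solves $Dg\,Dg'\,\lambda = Dg\,\nabla f$ and then checks the stationarity residual; `solve_2x2` is a helper written for this example.

```python
def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det]

x_star = [1.0 / 3.0] * 3
grad_f = [2.0 * xi for xi in x_star]   # gradient of x1^2 + x2^2 + x3^2
Dg = [[1.0, 1.0, 1.0],                 # gradient of g1 = x1 + x2 + x3 - 1
      [1.0, -1.0, 0.0]]                # gradient of g2 = x1 - x2

# Normal equations: (Dg Dg') lam = Dg grad_f
gram = [[sum(a * b for a, b in zip(r, s)) for s in Dg] for r in Dg]
rhs = [sum(a * b for a, b in zip(r, grad_f)) for r in Dg]
lam = solve_2x2(gram, rhs)

# Stationarity residual: grad_f - lam_1 * grad g1 - lam_2 * grad g2
residual = [grad_f[i] - sum(lam[j] * Dg[j][i] for j in range(2))
            for i in range(3)]
print(lam, residual)
```

A residual of zero confirms that $\nabla f(x^*)$ lies in the span of the constraint gradients, which is exactly what the linear dependence argument in the proof delivers.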
11.2
Second order necessary conditions
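Theorem 11.3. Let $A \subset \mathbb{R}^n$ be open, and let $f : A \to \mathbb{R}$, $g : A \to \mathbb{R}$ be $C^2$ functions on $A$. Suppose $x^*$ is a point of local maximum of $f$ subject to $g(x) = 0$. Further suppose $\nabla g(x^*) \neq 0$. Then there is $\lambda^* \in \mathbb{R}$ such that
$$\nabla f(x^*) = \lambda^* \nabla g(x^*) \tag{11.6}$$
and
$$y' H_L(x^*, \lambda^*)\, y \le 0 \quad \text{for all } y \text{ such that } y \cdot \nabla g(x^*) = 0, \tag{11.7}$$
where $L(x, \lambda^*) = f(x) - \lambda^* g(x)$ and $H_L(x^*, \lambda^*)$ is the Hessian matrix of $L(x, \lambda^*)$ with respect to $x$ evaluated at $(x^*, \lambda^*)$.
The second order necessary condition for a local minimum would require
$$y' H_L(x^*, \lambda^*)\, y \ge 0 \quad \text{for all } y \text{ such that } y \cdot \nabla g(x^*) = 0.$$
When there are $m$ equality constraints, the condition $\nabla g(x^*) \neq 0$ is replaced by the condition that $Dg(x^*)$ has full rank $m$.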
11.3
Sufficient conditions for constrained local maximum
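Theorem 11.4. Let $A \subset \mathbb{R}^n$ be open, and let $f : A \to \mathbb{R}$, $g : A \to \mathbb{R}$ be $C^2$ functions on $A$. Suppose $(x^*, \lambda^*) \in C \times \mathbb{R}$ and
$$\nabla f(x^*) = \lambda^* \nabla g(x^*) \tag{11.8}$$
and
$$y' H_L(x^*, \lambda^*)\, y < 0 \quad \text{for all } y \neq 0 \text{ such that } y \cdot \nabla g(x^*) = 0, \tag{11.9}$$
where $L(x, \lambda^*) = f(x) - \lambda^* g(x)$ and $H_L(x^*, \lambda^*)$ is the Hessian matrix of $L(x, \lambda^*)$ with respect to $x$ evaluated at $(x^*, \lambda^*)$. Then $x^*$ is a point of local maximum of $f$ subject to $g(x) = 0$.
The second order sufficient condition for a constrained local minimum is
$$y' H_L(x^*, \lambda^*)\, y > 0 \quad \text{for all } y \neq 0 \text{ such that } y \cdot \nabla g(x^*) = 0.$$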
There is a convenient way of checking the second order condition (11.9), as given in the following result.
Theorem 11.5. Let $A$ be an $n \times n$ symmetric matrix and $b$ be an $n$-vector with $b_1 \neq 0$. Consider the $(n + 1) \times (n + 1)$ matrix
$$S = \begin{bmatrix} 0 & b' \\ b & A \end{bmatrix}.$$
If $|S|$ has the same sign as $(-1)^n$ and the last $n - 1$ leading principal minors of $S$ alternate in sign, then $y' A y < 0$ for all $y \neq 0$ such that $y \cdot b = 0$. If $|S|$ and the last $n - 1$ leading principal minors of $S$ are all negative, then $y' A y > 0$ for all $y \neq 0$ such that $y \cdot b = 0$.
Consequently, we have to check the signs of the leading principal minors of
$$\begin{bmatrix} 0 & \nabla g(x^*)' \\ \nabla g(x^*) & H_L(x^*, \lambda^*) \end{bmatrix}.$$
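For $n = 2$ the test in Theorem 11.5 reduces to the sign of a single $3 \times 3$ determinant, which can be compared with a direct evaluation of the quadratic form on the line $y \cdot b = 0$. The matrix `A` and vector `b` below are illustrative choices, not taken from the text; `A` is indefinite on all of $\mathbb{R}^2$ but negative definite on the constraint.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[1.0, 2.0],
     [2.0, -3.0]]   # symmetric; det A < 0, so A is indefinite on R^2
b = [1.0, 1.0]

# Bordered matrix S = [[0, b'], [b, A]]
S = [[0.0, b[0], b[1]],
     [b[0], A[0][0], A[0][1]],
     [b[1], A[1][0], A[1][1]]]
det_S = det3(S)

# Every y with y . b = 0 is a multiple of (b2, -b1), so one evaluation suffices
y = [b[1], -b[0]]
quad = sum(y[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

print(det_S, quad)
```

Here $|S| > 0$ has the sign of $(-1)^2$, and the quadratic form is negative on the constraint, as the theorem predicts. For $n = 2$ the two numbers are exact negatives of each other.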
Here we provide a restricted proof of this result when $n = 2$: given two $C^2$ functions $f$ and $g$ on $\mathbb{R}^2$, consider the problem of maximizing $f$ on the constraint set $C_g = \{(x, y) \in \mathbb{R}^2 : g(x, y) = 0\}$.
We form the Lagrangian
$$L(x, y, \lambda) = f(x, y) - \lambda g(x, y).$$
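Suppose $(x^*, y^*, \lambda^*)$ satisfies
$$\frac{\partial L}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = 0, \quad \frac{\partial L}{\partial \lambda} = 0,$$
and
$$\begin{vmatrix}
0 & \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \\
\frac{\partial g}{\partial x} & \frac{\partial^2 L}{\partial x^2} & \frac{\partial^2 L}{\partial x \partial y} \\
\frac{\partial g}{\partial y} & \frac{\partial^2 L}{\partial x \partial y} & \frac{\partial^2 L}{\partial y^2}
\end{vmatrix} > 0 \quad \text{at } (x^*, y^*, \lambda^*).$$
We will show that $(x^*, y^*)$ is a local maximum of $f$ on $C_g$.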
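By the second condition above, either $\partial g / \partial x \neq 0$ or $\partial g / \partial y \neq 0$ at $(x^*, y^*)$, since otherwise the determinant would be zero. Without loss of generality, let $\partial g / \partial y \neq 0$. Then by the Implicit Function Theorem 8.7, $C_g$ can be written as the graph of a $C^1$ function $y = \phi(x)$ around $(x^*, y^*)$ (indeed $C^2$, because $g$ is $C^2$):
$$g(x, \phi(x)) = 0 \quad \text{for all } x \text{ near } x^*. \tag{11.10}$$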
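Differentiating this expression with respect to $x$, we get
$$\frac{\partial g}{\partial x}(x, \phi(x)) + \frac{\partial g}{\partial y}(x, \phi(x))\, \phi'(x) = 0, \tag{11.11}$$
or,
$$\phi'(x) = -\frac{\frac{\partial g}{\partial x}(x, \phi(x))}{\frac{\partial g}{\partial y}(x, \phi(x))}. \tag{11.12}$$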
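Let $F(x) = f(x, \phi(x))$ be $f$ evaluated on $C_g$. Note that it is a function of one unconstrained variable. Consequently, if $F'(x^*) = 0$ and $F''(x^*) < 0$, then $x^*$ will be a local maximum of $F$ and $(x^*, y^*) = (x^*, \phi(x^*))$ will be a local constrained maximum of $f$. Now, adding $-\lambda^*$ times (11.11) to
$$F'(x) = \frac{\partial f}{\partial x}(x, \phi(x)) + \frac{\partial f}{\partial y}(x, \phi(x))\, \phi'(x), \tag{11.13}$$
and evaluating at $x = x^*$, we get
$$\begin{aligned}
F'(x^*) &= \left( \frac{\partial f}{\partial x}(x^*, y^*) - \lambda^* \frac{\partial g}{\partial x}(x^*, y^*) \right) + \left( \frac{\partial f}{\partial y}(x^*, y^*) - \lambda^* \frac{\partial g}{\partial y}(x^*, y^*) \right) \phi'(x^*) \\
&= \frac{\partial L}{\partial x}(x^*, y^*) + \frac{\partial L}{\partial y}(x^*, y^*)\, \phi'(x^*).
\end{aligned} \tag{11.14}$$
By the hypothesis of this result, $\partial L / \partial x = \partial L / \partial y = 0$ at $(x^*, y^*, \lambda^*)$, so $F'(x^*) = 0$.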
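Differentiating (11.14) again at $x^*$, setting $y^* = \phi(x^*)$ and using $\partial L / \partial y = 0$ at $(x^*, y^*, \lambda^*)$ to drop the term in $\phi''(x^*)$,
$$\begin{aligned}
F''(x^*) &= \frac{\partial^2 L}{\partial x^2} + 2 \frac{\partial^2 L}{\partial x \partial y}\, \phi'(x^*) + \frac{\partial^2 L}{\partial y^2}\, \phi'(x^*)^2 \\
&= \frac{\partial^2 L}{\partial x^2} + 2 \frac{\partial^2 L}{\partial x \partial y} \left( -\frac{\partial g / \partial x}{\partial g / \partial y} \right) + \frac{\partial^2 L}{\partial y^2} \left( -\frac{\partial g / \partial x}{\partial g / \partial y} \right)^2 \\
&= \frac{1}{\left( \frac{\partial g}{\partial y} \right)^2} \left[ \frac{\partial^2 L}{\partial x^2} \left( \frac{\partial g}{\partial y} \right)^2 - 2 \frac{\partial^2 L}{\partial x \partial y} \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} + \frac{\partial^2 L}{\partial y^2} \left( \frac{\partial g}{\partial x} \right)^2 \right],
\end{aligned}$$
where all derivatives are evaluated at $(x^*, y^*, \lambda^*)$. The bracketed expression is the negative of the bordered determinant in the hypothesis, so $F''(x^*)$ is negative by the hypothesis of this result. Hence $F(x) = f(x, \phi(x))$ has a local maximum at $x^*$, and therefore $f$ restricted to $C_g$ has a local maximum at $(x^*, y^*)$.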
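The closed form for $F''(x^*)$ can be sanity-checked against a finite-difference second derivative. The sketch assumes an illustrative problem that is not from the text: $f(x, y) = x + y$ and $g(x, y) = x^2 + y^2 - 2$, with $(x^*, y^*, \lambda^*) = (1, 1, 1/2)$ and $\phi(x) = \sqrt{2 - x^2}$ near $x^* = 1$.

```python
import math

def F(x):
    """f(x, phi(x)) with f = x + y and phi(x) = sqrt(2 - x**2)."""
    return x + math.sqrt(2.0 - x * x)

x_star, y_star, lam_star = 1.0, 1.0, 0.5

# First derivatives of g and second derivatives of L = f - lam * g at the point
g_x, g_y = 2.0 * x_star, 2.0 * y_star
L_xx, L_xy, L_yy = -2.0 * lam_star, 0.0, -2.0 * lam_star

# Closed form from the proof: F'' = bracket / g_y^2
bracket = L_xx * g_y ** 2 - 2.0 * L_xy * g_x * g_y + L_yy * g_x ** 2
formula = bracket / g_y ** 2

# Bordered determinant from the hypothesis equals minus the bracket
bordered_det = -bracket

# Central second difference of F at x_star
h = 1e-4
numeric = (F(x_star + h) - 2.0 * F(x_star) + F(x_star - h)) / h ** 2

print(formula, numeric, bordered_det)
```

The closed form and the finite difference agree up to discretization error, and the bordered determinant is positive, consistent with a constrained local maximum at $(1, 1)$.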
This result is generalized to the case of m equality constraints below.
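Theorem 11.6. Let $A \subset \mathbb{R}^n$ be open, and let $f : A \to \mathbb{R}$, $g_j : A \to \mathbb{R}$, $j = 1, \dots, m$, be $C^2$ functions on $A$. Suppose $(x^*, \lambda^*) \in C \times \mathbb{R}^m$ and
$$\frac{\partial L}{\partial x_i}(x^*, \lambda^*) = 0, \quad i = 1, \dots, n; \tag{11.15}$$
$$\frac{\partial L}{\partial \lambda_j}(x^*, \lambda^*) = 0, \quad j = 1, \dots, m; \tag{11.16}$$
and
$$y' H_L(x^*, \lambda^*)\, y < 0 \quad \text{for all } y \neq 0 \text{ such that } Dg(x^*) \cdot y = 0, \tag{11.17}$$
where $L(x, \lambda^*) = f(x) - \lambda_1^* g_1(x) - \cdots - \lambda_m^* g_m(x)$ and $H_L(x^*, \lambda^*)$ is the Hessian matrix of $L(x, \lambda^*)$ with respect to $x$ evaluated at $(x^*, \lambda^*)$. Then $x^*$ is a point of local maximum of $f$ subject to $g_j(x) = 0$, $j = 1, \dots, m$.
The second order sufficient condition for a local minimum is $y' H_L(x^*, \lambda^*)\, y > 0$ for all $y \neq 0$ such that $Dg(x^*) \cdot y = 0$.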
Theorem 11.7. Consider the quadratic form $Q(x) = x' A x$, with $A$ an $n \times n$ symmetric matrix, restricted to a constraint set given by $m$ linear equations $Bx = 0$, with $B$ an $m \times n$ matrix. Construct the $(n + m) \times (n + m)$ matrix
$$S = \begin{bmatrix} 0 & B \\ B' & A \end{bmatrix}.$$
If $|S|$ has the same sign as $(-1)^n$ and the last $n - m$ leading principal minors of $S$ alternate in sign, then $Q$ is negative definite on the constraint set $Bx = 0$. If $|S|$ and the last $n - m$ leading principal minors of $S$ have the same sign as $(-1)^m$, then $Q$ is positive definite on the constraint set $Bx = 0$.
Consequently, we have to check the signs of the leading principal minors of
$$\begin{bmatrix} 0 & Dg(x^*) \\ Dg(x^*)' & H_L(x^*, \lambda^*) \end{bmatrix}.$$
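The sign test of Theorem 11.7 is mechanical and can be sketched in code. The functions `det`, `bordered_matrix`, and `classify` below are hypothetical helpers written for this illustration, and the test matrices are assumptions rather than examples from the text. The minor of order $m + r$ (for $r = m + 1, \dots, n$) is compared against the sign $(-1)^r$ for negative definiteness and $(-1)^m$ for positive definiteness.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n, sign, result = len(a), 1.0, 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result

def bordered_matrix(A, B):
    """S = [[0, B], [B', A]] for an n x n matrix A and an m x n matrix B."""
    m, n = len(B), len(A)
    top = [[0.0] * m + list(B[i]) for i in range(m)]
    bottom = [[B[i][j] for i in range(m)] + list(A[j]) for j in range(n)]
    return top + bottom

def classify(A, B):
    """Sign test on the last n - m leading principal minors of S."""
    m, n = len(B), len(A)
    S = bordered_matrix(A, B)
    orders = range(2 * m + 1, n + m + 1)
    minors = [det([row[:k] for row in S[:k]]) for k in orders]
    # the minor of order m + r should have the sign of (-1)^r
    if all(d * (-1) ** r > 0 for d, r in zip(minors, range(m + 1, n + 1))):
        return "negative definite"
    if all(d * (-1) ** m > 0 for d in minors):
        return "positive definite"
    return "inconclusive"

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
neg_identity = [[-v for v in row] for row in identity]

one_constraint = [[1.0, 1.0, 1.0]]                     # m = 1
two_constraints = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # m = 2

print(classify(neg_identity, one_constraint))
print(classify(identity, one_constraint))
print(classify(neg_identity, two_constraints))
```

Since $-I$ is negative definite and $I$ is positive definite on all of $\mathbb{R}^3$, they must remain so on any linear constraint set, which gives an easy way to confirm the sign conventions.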
11.4
Sufficient conditions for constrained global maximum
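Theorem 11.8. Let $A \subset \mathbb{R}^n$ be an open convex set, and let $f : A \to \mathbb{R}$, $g_j : A \to \mathbb{R}$, $j = 1, \dots, m$, be $C^1$ functions on $A$. Suppose $(x^*, \lambda^*) \in C \times \mathbb{R}^m$ and
$$\nabla f(x^*) - \lambda_1^* \nabla g_1(x^*) - \cdots - \lambda_m^* \nabla g_m(x^*) = 0.$$
If $L(x, \lambda^*) = f(x) - \lambda_1^* g_1(x) - \cdots - \lambda_m^* g_m(x)$ is concave (resp. convex) in $x$ on $A$, then $x^*$ is a point of global maximum (resp. minimum) of $f$ subject to $g_j(x) = 0$, $j = 1, \dots, m$.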
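A small numerical illustration of Theorem 11.8, under assumptions that are not from the text: $f(x_1, x_2) = -(x_1 - 1)^2 - (x_2 - 2)^2$ and $g(x_1, x_2) = x_1 + x_2 - 1$. Because $g$ is linear and $f$ is concave, $L(x, \lambda^*)$ is concave in $x$ (its Hessian is $-2I$), so the stationary point $x^* = (0, 1)$ with $\lambda^* = 2$ should be a global constrained maximum. The grid comparison is only a spot check, not a proof.

```python
def f(x1, x2):
    return -(x1 - 1.0) ** 2 - (x2 - 2.0) ** 2

def g(x1, x2):
    return x1 + x2 - 1.0

x_star, lam_star = (0.0, 1.0), 2.0

# Stationarity: grad f - lam * grad g = 0, where grad g = (1, 1)
grad_f = (-2.0 * (x_star[0] - 1.0), -2.0 * (x_star[1] - 2.0))
stationarity = [grad_f[0] - lam_star, grad_f[1] - lam_star]

# Feasible points (t, 1 - t) on a grid along the constraint line
feasible = [(t / 10.0, 1.0 - t / 10.0) for t in range(-50, 51)]
best_on_grid = max(f(*p) for p in feasible)

print(stationarity, g(*x_star))
print(best_on_grid, f(*x_star))
```

No feasible grid point gives a larger value of $f$ than $x^*$, in line with the theorem.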
11.5
Solving optimization problems
The results above suggest two ways of solving an optimization problem:
• use the conditions laid down in Theorem 11.8;