1. Fixed parameters, system aspects that cannot be changed and that therefore, from the perspective of the model, are constants
4.6 Nonlinear Constrained Optimization
So far, we have been examining the use of optimization techniques where the objective function and the set of constraints are both linear functions. We now consider situations in which these functions are not linear.
How does nonlinearity change the optimization problem? In a noninteger constrained linear system, the objective function attains its maximum or minimum value at one of the vertices of a polytope defined by the constraint planes. Intuitively, because the objective function is linear, we can always “walk along” one of the hyperedges of the polytope to increase the value of the objective function, so that the extremal value of the objective function is guaranteed to be at a polytope vertex.
In contrast, with nonlinear optimization, the objective function may both increase and decrease as we walk along what would correspond to a hyperedge (a contour line, as we will see shortly). Therefore, we cannot exploit polytope vertices to carry out optimization. Instead, we must resort to one of a large number of nonlinear optimization techniques, some of which we study next.
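As a rough illustration of this contrast, the following Python sketch evaluates one linear and one nonlinear objective on a grid over the unit square. The square, both objective functions, and the grid resolution are assumptions chosen only for illustration; none of them come from the text.

```python
# Illustrative sketch: where does the maximum fall on the unit square?
# The square, both objectives, and the grid step are assumptions.

def argmax_on_grid(objective, steps=10):
    """Return the grid point in [0,1] x [0,1] that maximizes objective."""
    points = [(i / steps, j / steps)
              for i in range(steps + 1)
              for j in range(steps + 1)]
    return max(points, key=lambda p: objective(*p))

def linear(x, y):
    # Linear objective: increases along both axes.
    return 2 * x + 3 * y

def nonlinear(x, y):
    # Nonlinear objective: a concave bump that peaks at the center.
    return -(x - 0.5) ** 2 - (y - 0.5) ** 2

# The four vertices of the unit square.
VERTICES = {(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)}

linear_best = argmax_on_grid(linear)
nonlinear_best = argmax_on_grid(nonlinear)

print("linear maximum at", linear_best, "vertex?", linear_best in VERTICES)
print("nonlinear maximum at", nonlinear_best, "vertex?", nonlinear_best in VERTICES)
```

Under these assumptions, the linear objective peaks at a corner of the square, whereas the nonlinear one peaks in the interior, so a search restricted to vertices would miss it.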
Nonlinear optimization techniques fall roughly into two categories.
1. When the objective function and the constraints are continuous and at least twice differentiable, there are two well-known techniques: Lagrangian optimization and Lagrangian optimization with the Karush-Kuhn-Tucker (KKT) conditions.
2. When the objective functions are not continuous or differentiable, we are forced to use heuristic techniques, such as hill climbing, simulated annealing, and ant algorithms.
We will first look at Lagrangian techniques (Section 4.6.1), then at a variant that uses the KKT conditions to allow inequality constraints (Section 4.6.2), and then briefly consider several heuristic optimization techniques (Section 4.7).
4.6.1 Lagrangian Techniques
Lagrangian optimization computes the maximum or minimum of a function f of several variables subject to one or more constraint functions denoted $g_i$. We will assume that f and all the $g_i$ are continuous, at least twice differentiable, and are defined over the entire domain, that is, do not have boundaries.
Formally, f is defined over a vector $\mathbf{x}$ drawn from $\mathbb{R}^n$, and we wish to find the value(s) of $\mathbf{x}$ for which f attains its maximum or minimum, subject to the constraint function(s) $g_i(\mathbf{x}) = c_i$, where the $c_i$ are real constants.
To begin with, consider a function f of two variables x and y with a single constraint function. We want to find the set of tuples of the form (x, y) that maximize f(x, y) subject to the constraint g(x, y) = c. The constraint g(x, y) = c corresponds to a contour, or level, set: that is, a set of points where g’s value does not change.
Imagine tracing a path along such a contour. Along this path, f will increase and decrease in some manner. Imagine the contours of f corresponding to f(x, y) = d for some value of d. The path on g’s contour touches successive contours of f. An extremal value of f on g’s contour is reached exactly when g’s contour grazes an extremal contour of f. At this point, the two contours are tangential, so that the gradient of f, a vector that points in a direction perpendicular to f’s contour, is parallel to the gradient of g, though it may have a different magnitude. More precisely, if the gradient is denoted by $\nabla_{x,y}$, at the constrained extremal point,
$$\nabla_{x,y} f = -\lambda \nabla_{x,y} g$$
Define an auxiliary function:
$$F(x, y, \lambda) = f(x, y) + \lambda \, (g(x, y) - c) \qquad \text{(EQ 4.7)}$$
The stationary points of F, that is, the points where $\nabla_{x,y,\lambda} F(x, y, \lambda) = 0$, are points that (1) satisfy the constraint g, because the partial derivative with respect to $\lambda$, or $g(x, y) - c$, must be zero, and (2) are also constrained extremal points of f, because $\nabla_{x,y} f = -\lambda \nabla_{x,y} g$. Thus, the stationary points of F are also the points of constrained extrema of f (i.e., minima or maxima). From Fermat’s theorem, the maximum or minimum value of any function is attained at one of three types of points: (1) a boundary point, (2) a point where the function is not differentiable, and (3) a stationary point, where its first derivative is zero. Because we assume away the first two situations, the maximum or minimum is attained at one of the stationary points of F.
Thus, we can simply solve $\nabla_{x,y,\lambda} F = 0$ and use the second derivative to determine the type of extremum.
This analysis continues to hold for more than two dimensions and more than one constraint function. That is, to obtain a constrained extremal point of f, take the objective function and add to it a constant multiple of each constraint function to get the auxiliary function. Each such constant is called a Lagrange multiplier. Setting the gradient of the auxiliary function to 0 yields a system of equations whose solutions are its stationary points.
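The procedure can be checked numerically on a small problem. The sketch below uses a hypothetical problem that is not from the text: maximize f(x, y) = x + y subject to x² + y² = 1. It forms the auxiliary function F = f + λ(g − c), confirms by finite differences that the gradient of F vanishes at the hand-derived stationary points, and compares the result with a brute-force walk along the constraint contour. The function names and step sizes are illustrative assumptions.

```python
import math

# Hypothetical example (not from the text): maximize f(x, y) = x + y
# subject to g(x, y) = x^2 + y^2 = c, with c = 1.
C = 1.0

def f(x, y):
    return x + y

def g(x, y):
    return x * x + y * y

def F(x, y, lam):
    """Auxiliary function F = f + lam * (g - c)."""
    return f(x, y) + lam * (g(x, y) - C)

def grad_F(x, y, lam, h=1e-6):
    """Central-difference approximation of the gradient of F in (x, y, lam)."""
    return (
        (F(x + h, y, lam) - F(x - h, y, lam)) / (2 * h),
        (F(x, y + h, lam) - F(x, y - h, lam)) / (2 * h),
        (F(x, y, lam + h) - F(x, y, lam - h)) / (2 * h),
    )

# Stationary points derived by hand from
# 1 + 2*lam*x = 0, 1 + 2*lam*y = 0, x^2 + y^2 = 1.
r = 1 / math.sqrt(2)
candidates = [(r, r, -r), (-r, -r, r)]  # each entry is (x, y, lam)

for x, y, lam in candidates:
    print((x, y), "f =", f(x, y), "grad F =", grad_F(x, y, lam))

# Brute-force check: walk along the constraint contour and record the best f.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_f = max(f(math.cos(a), math.sin(a)) for a in angles)
print("largest f found on the contour:", best_f)
```

Both candidates make the gradient of F vanish, but only the first is the constrained maximum; the second is the constrained minimum. Telling them apart requires a second-derivative test or a direct comparison of the values of f.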
EXAMPLE 4.10: LAGRANGIAN OPTIMIZATION
Consider a company that purchases capacity on a link to the Internet and has to pay for this capacity. Suppose that the cost of a link of capacity b is Kb. Also suppose that the mean delay experienced by data sent on the link, denoted by d, is inversely proportional to b, so that bd = 1. Finally, let the benefit U from using a network connection with capacity b and delay d be described by U = –Kb – d; that is, it decreases both with cost and with the delay. We want to maximize U subject to the constraint bd = 1. Both U and the constraint function are continuous and twice differentiable. Therefore, we can define the auxiliary function
$$F = -Kb - d + \lambda (bd - 1)$$
Set the partial derivatives with respect to b, d, and $\lambda$ to zero to obtain, respectively:
$$-K + \lambda d = 0, \qquad -1 + \lambda b = 0, \qquad bd = 1$$
From the second equation, $b = 1/\lambda$, and from the first equation, $d = K/\lambda$. Substituting these values for b and d into the third equation gives $K/\lambda^2 = 1$, so that $\lambda = \sqrt{K}$, taking the positive root because b and d are positive. Substituting this value into the equations for b and d, $b = 1/\sqrt{K}$ and $d = \sqrt{K}$. This gives a value of U at (b, d) of $-2\sqrt{K}$. Since U has no smallest value (it decreases without bound as b approaches 0, because d = 1/b then grows without bound), this stationary point is its maximum.
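The closed-form result of this example can be sanity-checked numerically. The sketch below assumes K = 4 purely for illustration, evaluates the stationary point λ = √K, b = 1/√K, d = √K, and compares it with a brute-force scan over capacities along the constraint d = 1/b. The function names, the scan range, and the grid step are assumptions, not part of the example.

```python
import math

def utility(b, K):
    """U = -K*b - d, with d eliminated through the constraint b*d = 1."""
    return -K * b - 1.0 / b

def lagrangian_solution(K):
    """Stationary point from the example: lam = sqrt(K), b = 1/lam, d = K/lam."""
    lam = math.sqrt(K)
    b = 1.0 / lam
    d = K / lam
    return b, d, lam

K = 4.0  # assumed cost per unit of capacity, chosen only for illustration
b, d, lam = lagrangian_solution(K)
U = -K * b - d
print("Lagrangian solution: b =", b, "d =", d, "lambda =", lam, "U =", U)

# Brute-force scan over capacities b in (0, 10], staying on the constraint d = 1/b.
grid = [k / 1000 for k in range(1, 10001)]
b_scan = max(grid, key=lambda capacity: utility(capacity, K))
print("scan: best b =", b_scan, "U =", utility(b_scan, K))
```

With this choice of K, the scan and the Lagrangian solution agree on the capacity, and the utility there equals $-2\sqrt{K}$.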
4.6.2 Karush-Kuhn-Tucker Conditions for Nonlinear Optimization
The Lagrangian method is applicable when the constraint function is of the form $g(\mathbf{x}) = 0$. What if the constraints are of the form $g(\mathbf{x}) \le 0$? In this case, we can use the Karush-Kuhn-Tucker conditions (often called the KKT, or Kuhn-Tucker, conditions) to determine whether the stationary point of the auxiliary function is also a global minimum.
As a preliminary, we define what is meant by a convex function. A function f is convex if, for any two points x and y in its domain, and for t in the closed interval [0,1], $f(tx + (1-t)y) \le t f(x) + (1-t) f(y)$. That is, between x and y the function always lies on or below the line segment joining the points (x, f(x)) and (y, f(y)).
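The defining inequality can be probed by random sampling. The sketch below searches for a triple (x, y, t) that violates it. The helper name, the sampling ranges, and the tolerance are assumptions made for illustration. Finding no violation does not prove convexity; it only fails to refute it.

```python
import math
import random

def violates_convexity(func, lo, hi, trials=10000, seed=0, tol=1e-12):
    """Search for x, y, t with func(t*x + (1-t)*y) > t*func(x) + (1-t)*func(y).

    Returns a violating (x, y, t) triple, or None if no violation was sampled.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        y = rng.uniform(lo, hi)
        t = rng.random()
        lhs = func(t * x + (1 - t) * y)          # function value at the blend
        rhs = t * func(x) + (1 - t) * func(y)    # height of the chord there
        if lhs > rhs + tol:
            return x, y, t
    return None

# x^2 is convex, so no violation should be found on [-5, 5].
square_result = violates_convexity(lambda x: x * x, -5.0, 5.0)

# sin is concave on [0, pi], so the inequality fails there.
sine_result = violates_convexity(math.sin, 0.0, math.pi)

print("x^2 violation:", square_result)
print("sin violation found:", sine_result is not None)
```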
Consider a convex objective function $f: \mathbb{R}^n \to \mathbb{R}$ with m inequality and l equality constraints. Denote the inequality constraints by $g_i(\mathbf{x}) \le 0$, $1 \le i \le m$, and the equal-