Optimization Basics: Single-Variable Optimality Conditions, Gradient, Hessian


2.5 Optimization Basics: Single-Variable Optimality Conditions, Gradient, Hessian

A general optimization problem refers to finding the maximum or the minimum value of a function. Some examples of optimization problems include maximizing the mileage of a car or minimizing the failure rate of a product.

Consider a continuous single variable function f(x). Optimization theory gives us the tools to find a “good” value of x that corresponds to a “good” value of f(x). In many problems, the choice of the values of x is constrained in the sense that a candidate value of x must satisfy some conditions. Such optimization problems are called Constrained Optimization problems, which will be studied later. Unconstrained Optimization problems, on the other hand, involve no constraints on the values of x. In this section, we present some basic concepts of unconstrained optimization. We discuss the basic conditions of optimality for single variable functions. We also present the two basic entities that play a central role in multi-variable optimization: gradients and Hessians.

Definitions

Consider the function f(x) defined on a set S. We define global and local minima as follows:

1. The function f(x) is said to be at a point of global minimum, x* ∈ S, if f(x*) ≤ f(x) for all x ∈ S.

2. The function f(x) is said to be at a point of local minimum, x* ∈ S, if f(x*) ≤ f(x) for all x ∈ S within a sufficiently small distance from x*. That is, there exists ε > 0 such that f(x*) ≤ f(x) for all x ∈ S satisfying |x − x*| < ε.

The concept of global and local minima is illustrated in Fig. 2.6. The word “optimum” refers to a maximum or a minimum.

Figure 2.6. Local and Global Minima
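The two definitions can be explored numerically. The following is a minimal Python sketch, not taken from the text: the test function `f`, the interval [0, 6], and the helper `find_minima_on_grid` are all assumptions chosen for illustration. It samples the function on a uniform grid, marks interior grid points that are lower than their neighbors as approximate local minima, and takes the lowest sampled value as the approximate global minimum.

```python
import math


def f(x):
    # Hypothetical test function with several local minima (not from the text).
    return math.sin(3.0 * x) + 0.5 * x


def find_minima_on_grid(func, a, b, num_points=2001):
    """Approximate the local minima and the global minimum of func on [a, b]."""
    xs = [a + (b - a) * i / (num_points - 1) for i in range(num_points)]
    ys = [func(x) for x in xs]
    # An interior grid point is treated as a local minimum when it is
    # lower than its left neighbor and no higher than its right neighbor.
    local_minima = [
        (xs[i], ys[i])
        for i in range(1, num_points - 1)
        if ys[i] < ys[i - 1] and ys[i] <= ys[i + 1]
    ]
    # The global minimum is the lowest value over all grid points,
    # including the end points of the interval.
    i_best = min(range(num_points), key=lambda i: ys[i])
    return local_minima, (xs[i_best], ys[i_best])


local_minima, global_minimum = find_minima_on_grid(f, 0.0, 6.0)
print("local minima:", local_minima)
print("global minimum:", global_minimum)
```

A grid search of this kind only approximates the minima to within the grid spacing, and it can miss minima narrower than that spacing. It is meant to make the definitions concrete, not to serve as a practical optimization method.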

2.5.1 Necessary Conditions for Local Optimum

Assuming that the first and second derivatives of the function f(x) exist, the necessary conditions for x* to be a local minimum of the function f(x) on an interval (a,b) are:

1. f′(x*) = 0

2. f″(x*) ≥ 0

The necessary conditions for x* to be a local maximum of the function f(x) on an interval (a,b) are:

1. f′(x*) = 0

2. f″(x*) ≤ 0

It is important to understand that the above stated conditions are necessary, but not sufficient. This means that if the above conditions are not satisfied, x* cannot be a local minimum or maximum. On the other hand, satisfying the above conditions does not guarantee that x* is a local minimum or maximum.
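A standard counterexample helps make this point. The sketch below, which is an illustration and not part of the text, uses the assumed function f(x) = x³ at x* = 0. There the first derivative is zero and the second derivative is zero, so the usual necessary conditions for a local minimum (zero first derivative, non-negative second derivative) hold, yet the function takes smaller values immediately to the left of x*.

```python
def f(x):
    # Illustrative function chosen for this sketch (not from the text).
    return x ** 3


def df(x):
    # First derivative of x^3.
    return 3 * x ** 2


def d2f(x):
    # Second derivative of x^3.
    return 6 * x


x_star = 0.0
h = 1e-3  # small step used to probe the neighborhood of x_star

# Necessary conditions for a local minimum: f'(x*) = 0 and f''(x*) >= 0.
satisfies_necessary = (df(x_star) == 0) and (d2f(x_star) >= 0)

# x* is a local minimum only if no nearby point has a smaller function value.
# For x^3, f(x* - h) < f(x*) for every h > 0, so one probe is enough here.
is_local_min = f(x_star - h) >= f(x_star) and f(x_star + h) >= f(x_star)

print("necessary conditions satisfied:", satisfies_necessary)
print("is a local minimum:", is_local_min)
```

The same point also satisfies the necessary conditions for a local maximum without being one, since f(x* + h) > f(x*) for every h > 0.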

2.5.2 Stationary Points and Inflection Points

A stationary point is a point x* that satisfies the following equation:

f′(x*) = 0

An inflection point may or may not be a stationary point. An inflection point is one where the curvature of the curve (second derivative) changes sign from positive to negative, or vice versa. That point is not necessarily a minimum or a maximum.

2.5.3 Sufficient Conditions for Local Optima

Consider a point x* at which the first derivative of f(x) is equal to zero, and the order of the first nonzero higher derivative is n. The following are the sufficient conditions for x* to be a local optimum.

1. If n is odd, then x* is an inflection point.

2. If n is even, then x* is a local optimum. In addition,

(a) If the value of that derivative of f(x) at x* is positive, then the point x* is a local minimum.

(b) If the value of that derivative of f(x) at x* is negative, then the point x* is a local maximum.
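The higher-derivative test can be expressed as a short routine. The following Python sketch is one possible way to encode it. The function name `classify_stationary_point`, its tolerance `tol`, and the three test functions in the comments are assumptions made for illustration. The caller supplies the values of the second, third, and higher derivatives at a point x* where the first derivative is already known to be zero.

```python
def classify_stationary_point(higher_derivatives, tol=1e-12):
    """Classify a stationary point x* from [f''(x*), f'''(x*), ...].

    Assumes f'(x*) = 0. The list starts at derivative order 2.
    """
    for order, value in enumerate(higher_derivatives, start=2):
        if abs(value) > tol:
            # 'order' is n, the order of the first nonzero higher derivative.
            if order % 2 == 1:
                return "inflection point"
            return "local minimum" if value > 0 else "local maximum"
    # All supplied derivatives vanish, so the test cannot decide.
    return "undetermined"


# f(x) = x^4 at x* = 0: f'' = 0, f''' = 0, f'''' = 24
print(classify_stationary_point([0.0, 0.0, 24.0]))
# f(x) = x^3 at x* = 0: f'' = 0, f''' = 6
print(classify_stationary_point([0.0, 6.0]))
# f(x) = -x^2 at x* = 0: f'' = -2
print(classify_stationary_point([-2.0]))
```

The tolerance is needed in practice because derivative values computed in floating point are rarely exactly zero. Its value here is arbitrary and should be scaled to the problem.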

The procedure to find the maximum of a function is illustrated using the following example.

Example 10: In this example, we find the maximum value of a function given by f(x) = −x³ + 3x² + 9x + 10 in the interval −2 ≤ x ≤ 4.

First, the stationary points are determined by solving df/dx = −3x² + 6x + 9 = 0.

Using the formula to compute roots of a quadratic equation, the roots of the first derivative are found by solving

$$x = \frac{-6 \pm \sqrt{6^2 - 4(-3)(9)}}{2(-3)} = \frac{-6 \pm 12}{-6}$$

This process yields two solutions: x = 3 and x = −1 as the two stationary points, which are in the interval −2 ≤ x ≤ 4. The function values at these stationary points and at the end points of the interval are evaluated to determine which of these points may correspond to a global maximum.

Evaluating f(x) at x = 3, −1, −2, and 4 yields the function values as 37, 5, 12, and 30, respectively. Therefore, x = 3 corresponds to the maximum of the function in the interval −2 ≤ x ≤ 4.
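The steps of Example 10 can be checked with a few lines of code. The sketch below is written in Python as an illustration, and the helper name `stationary_points` and the variable names are assumptions. It solves the quadratic for the stationary points, keeps those inside the interval, adds the two end points as candidates, and selects the candidate with the largest function value.

```python
import math


def f(x):
    # Function from Example 10.
    return -x**3 + 3 * x**2 + 9 * x + 10


def stationary_points():
    # f'(x) = -3x^2 + 6x + 9, written as a*x^2 + b*x + c.
    a, b, c = -3.0, 6.0, 9.0
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b + root) / (2 * a), (-b - root) / (2 * a)])


lower, upper = -2.0, 4.0

# Candidates for the maximum: stationary points inside the interval
# plus the two end points.
candidates = [x for x in stationary_points() if lower <= x <= upper]
candidates += [lower, upper]

values = {x: f(x) for x in candidates}
x_max = max(values, key=values.get)

print("candidate values:", values)
print("maximum at x =", x_max, "with f(x) =", values[x_max])
```

Including the end points matters on a closed interval: a function can attain its maximum at a boundary point where the derivative is not zero.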

2.5.4 Gradient and Hessian of a Function

In the previous subsections, we have only considered single variable functions; that is, f(x), where x is a single variable. In practice, however, x could represent the two dimensions of a rectangular backyard. In this case, we could let x be a two-dimensional vector: x = {x₁, x₂}, where x₁ = Length and x₂ = Width. Therefore, in general, we can consider multi-variable functions f(x), where x is an n-dimensional vector. In these cases, the first and the second derivatives of the function are more complicated. They are respectively referred to as the gradient and the Hessian of the function. The gradient is an n × 1 vector and the Hessian is an n × n matrix.

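Given a function f(x), its gradient is given by

$$\nabla f(x) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} \\ \dfrac{\partial f}{\partial x_2} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{bmatrix} \tag{2.40}$$

For example, given , the gradient is given by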

(2.41)

Recall that when the partial derivative of f is taken with respect to x₁, ∂f/∂x₁, we treat x₁ as the variable and x₂ as a constant.
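A gradient derived by hand can be checked against a numerical approximation. The sketch below is a Python illustration under stated assumptions: the two-variable function `f` is hypothetical and is not the example used in the text, and `numerical_gradient` is an illustrative helper that applies central differences, perturbing one variable at a time while holding the others constant.

```python
def f(x):
    # Hypothetical two-variable function (not the example from the text).
    x1, x2 = x
    return x1**2 + 3.0 * x1 * x2 + 2.0 * x2**2


def analytic_gradient(x):
    # Partial derivatives of the hypothetical f, derived by hand:
    # df/dx1 = 2*x1 + 3*x2 and df/dx2 = 3*x1 + 4*x2.
    x1, x2 = x
    return [2.0 * x1 + 3.0 * x2, 3.0 * x1 + 4.0 * x2]


def numerical_gradient(func, x, h=1e-6):
    """Approximate the gradient of func at x with central differences."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        # Perturb only the i-th variable; all others are held constant.
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2.0 * h))
    return grad


point = [1.0, 2.0]
print("analytic: ", analytic_gradient(point))
print("numerical:", numerical_gradient(f, point))
```

The step size `h` is a compromise between truncation error, which favors a small step, and round-off error, which favors a larger one. The value used here is a common starting choice, not a recommendation from the text.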

Hessian of a Function:

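The Hessian of a function f(x) is given by the following matrix

$$\nabla^2 f(x) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \tag{2.42}$$

When the above derivatives exist and are continuous, the Hessian is symmetric. That is, the terms below the diagonal are the same as those above the diagonal. Therefore, we can either forgo evaluating the lower or upper triangular terms, or we can evaluate all the terms and verify that the resulting matrix is indeed symmetric. For the above function, the Hessian is given by

(2.43)

leading to the symmetric matrix

(2.44)

2.6 Summary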

Quantitative optimization is founded on the understanding of important mathematical concepts, including calculus, geometry, and matrix algebra. This chapter hence provided a summary of the important mathematical concepts that are needed for learning and practicing optimization. Specifically, it provided a brief introduction to vectors, Euclidean geometry (e.g., equation of a plane), matrix properties and operations, and differential and integral calculus (e.g., function continuity, partial derivatives, and Taylor Series). The chapter ended with an introduction to single-variable optimality conditions and to the gradient and the Hessian of a function, which are used to determine (or search for) optimum points. These topics will be developed in greater depth in later chapters.

2.7 Problems

In document Optimization in Practice with MATLAB® For Engineering Students and Professionals (Page 91-95)