FUNCTIONS OF A SINGLE VARIABLE
Step 6. Test for termination:
2.5 METHODS REQUIRING DERIVATIVES
The search methods discussed in the preceding sections required the assumptions of unimodality and, in some cases, continuity of the performance index being optimized. It seems reasonable that if, in addition to continuity, the function is assumed to be differentiable, then further efficiencies in the search could be achieved. Recall from Section 2.2 that the necessary condition for a point $z$ to be a local minimum is that the derivative at $z$ vanish, that is, $f'(z) = df/dx\,|_{x=z} = 0$.
When $f(x)$ is a function of third-degree or higher terms involving $x$, direct analytic solution of $f'(x) = 0$ would be difficult. Hence, a search method that successively approximates the stationary point of $f$ is required. We first describe a classical search scheme for finding the root of a nonlinear equation that was originally developed by Newton and later refined by Raphson [5].
2.5.1 Newton–Raphson Method
The Newton–Raphson scheme requires that the function $f$ be twice differentiable. It begins with a point $x_1$ that is the initial estimate or approximation to the stationary point or root of the equation $f'(x) = 0$. A linear approximation of the function $f'(x)$ at the point $x_1$ is constructed, and the point at which the linear approximation vanishes is taken as the next approximation.
Formally, given the point $x_k$ to be the current approximation of the stationary point, the linear approximation of the function $f'(x)$ at $x_k$ is given by
\[ \tilde{f}'(x; x_k) = f'(x_k) + f''(x_k)(x - x_k) \tag{2.7} \]
Setting Eq. (2.7) to be zero, we get the next approximation point as
\[ x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)} \]
Figure 2.14. Newton–Raphson method (convergence).
Figure 2.15. Newton–Raphson method (divergence).
Figure 2.14 illustrates the general steps of Newton's method. Unfortunately, depending on the starting point and the nature of the function, it is quite possible for Newton's method to diverge rather than converge to the true stationary point. Figure 2.15 illustrates this difficulty. If we start at a point to the right of $x_0$, the successive approximations will be moving away from the stationary point $z$.
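The dependence on the starting point can be seen in a small numerical sketch. The test function below, $f(x) = -(x + 1)e^{-x}$, is a hypothetical choice made only for illustration and is not the function plotted in Figure 2.15. Its only stationary point is the minimum at $x = 0$, and its second derivative changes sign at $x = 1$. The helper name `newton_iterates` is likewise an assumption.

```python
import math


def newton_iterates(df, d2f, x, steps):
    """Return the starting point followed by `steps` Newton-Raphson iterates for f'(x) = 0."""
    points = [x]
    for _ in range(steps):
        x = x - df(x) / d2f(x)  # x_{k+1} = x_k - f'(x_k) / f''(x_k)
        points.append(x)
    return points


# Hypothetical test function f(x) = -(x + 1) * exp(-x); its minimum is at x = 0.
def df(x):
    return x * math.exp(-x)          # f'(x)


def d2f(x):
    return (1.0 - x) * math.exp(-x)  # f''(x), zero at x = 1


near = newton_iterates(df, d2f, 0.5, 6)  # start to the left of x = 1
far = newton_iterates(df, d2f, 2.0, 6)   # start to the right of x = 1

print("start 0.5:", [round(p, 6) for p in near])
print("start 2.0:", [round(p, 6) for p in far])
```

For this function the update simplifies to $x_{k+1} = x_k^2 / (x_k - 1)$, so any start with $x_k > 1$ produces a larger next iterate and the sequence moves away from the minimum, while the start at 0.5 settles toward $x = 0$.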
Example 2.10 Newton–Raphson Method
Consider the problem
\[ \text{Minimize } f(x) = 2x^2 + \frac{16}{x} \]
Suppose we use the Newton–Raphson method to determine a stationary point of $f(x)$ starting at the point $x_1 = 1$:
\[ f'(x) = 4x - \frac{16}{x^2} \qquad f''(x) = 4 + \frac{32}{x^3} \]
We continue until $|f'(x_k)| < \varepsilon$, where $\varepsilon$ is a prespecified tolerance.
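The iteration of Example 2.10 can be sketched in a few lines of Python. This is a minimal illustration, not a production routine: the function name `newton_raphson`, the default tolerance, and the iteration cap are assumptions introduced here. Starting from $x_1 = 1$, the first update is $x_2 = 1 - (-12)/36 \approx 1.33$.

```python
def newton_raphson(df, d2f, x, eps=1e-6, max_iter=50):
    """Iterate x <- x - f'(x)/f''(x) until |f'(x)| < eps.

    Returns the final point and the number of updates performed.
    """
    for k in range(max_iter):
        g = df(x)
        if abs(g) < eps:
            return x, k
        x = x - g / d2f(x)
    raise RuntimeError("no convergence within max_iter updates")


# Derivatives of f(x) = 2x^2 + 16/x from Example 2.10
def df(x):
    return 4.0 * x - 16.0 / x**2


def d2f(x):
    return 4.0 + 32.0 / x**3


x_star, updates = newton_raphson(df, d2f, 1.0)
print(x_star, updates)
```

Setting $f'(x) = 0$ analytically gives $x^3 = 4$, so the returned point can be checked against $4^{1/3} \approx 1.5874$.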
2.5.2 Bisection Method
If the function $f(x)$ is unimodal over a given search interval, then the optimal point will be the one where $f'(x) = 0$. If both the function value and the derivative of the function are available, then an efficient region elimination search can be conducted using just a single point rather than a pair of points to identify a point at which $f'(x) = 0$. For example, if, at a point $z$, $f'(z) < 0$, then assuming that the function is unimodal, the minimum cannot lie to the left of $z$. In other words, the interval $x \le z$ can be eliminated. On the other hand, if $f'(z) > 0$, then the minimum cannot lie to the right of $z$, and the interval $x \ge z$ can be eliminated. Based on these observations, the bisection method (sometimes called the Bolzano search) can be constructed.
Determine two points $L$ and $R$ such that $f'(L) < 0$ and $f'(R) > 0$. The stationary point is between the points $L$ and $R$. We determine the derivative of the function at the midpoint,
\[ z = \frac{L + R}{2} \]
If $f'(z) > 0$, then the interval $(z, R)$ can be eliminated from the search. On the other hand, if $f'(z) < 0$, then the interval $(L, z)$ can be eliminated. We shall now state the formal steps of the algorithm.
Given a bounded interval $a \le x \le b$ and a termination criterion $\varepsilon$:
Step 1. Set $R = b$, $L = a$; assume $f'(a) < 0$ and $f'(b) > 0$.
Step 2. Calculate $z = (R + L)/2$, and evaluate $f'(z)$.
Step 3. If $|f'(z)| \le \varepsilon$, terminate. Otherwise, if $f'(z) < 0$, set $L = z$ and go to step 2. If $f'(z) > 0$, set $R = z$ and go to step 2.
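The three steps above can be sketched as follows. The function name `bisection`, the default tolerance, and the iteration cap are assumptions made for this illustration, and the test problem reuses the derivative of $f(x) = 2x^2 + 16/x$ on $1 \le x \le 5$ purely as a convenient example.

```python
def bisection(df, a, b, eps=1e-6, max_iter=200):
    """Bisection (Bolzano) search for a point where f'(x) = 0 on [a, b]."""
    # Step 1: the bracket must satisfy f'(a) < 0 < f'(b).
    if not (df(a) < 0.0 < df(b)):
        raise ValueError("need f'(a) < 0 and f'(b) > 0")
    L, R = a, b
    for _ in range(max_iter):
        z = (R + L) / 2.0   # Step 2: midpoint of the current interval
        g = df(z)
        if abs(g) <= eps:   # Step 3: termination test
            return z
        if g < 0.0:
            L = z           # minimum cannot lie to the left of z
        else:
            R = z           # minimum cannot lie to the right of z
    raise RuntimeError("no convergence within max_iter iterations")


def df(x):
    return 4.0 * x - 16.0 / x**2  # derivative of 2x^2 + 16/x


z_star = bisection(df, 1.0, 5.0)
print(z_star)
```

Only the sign of `g` decides which half is discarded, so each pass shrinks the interval by exactly one half regardless of how large or small the derivative is.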
Note that the search logic of this region elimination method is based purely on the sign of the derivative and does not use its magnitude. A method that uses both is the secant method, to be discussed next.
Figure 2.16. Secant method.
2.5.3 Secant Method
The secant method combines Newton's method with a region elimination scheme for finding a root of the equation $f'(x) = 0$ in the interval $(a, b)$ if it exists. Suppose we are interested in finding the stationary point of $f(x)$ and we have two points $L$ and $R$ in $(a, b)$ such that their derivatives are opposite in sign. The secant algorithm then approximates the function $f'(x)$ as a "secant line" (a straight line between the two points) and determines the next point where the secant line of $f'(x)$ is zero (see Figure 2.16). Thus, the next approximation to the stationary point $x^*$ is given by
\[ z = R - \frac{f'(R)}{[f'(R) - f'(L)]/(R - L)} \]
If $|f'(z)| \le \varepsilon$, we terminate the algorithm. Otherwise, we select $z$ and one of the points $L$ or $R$ such that their derivatives are opposite in sign and repeat the secant step. For example, in Figure 2.16, we would have selected $z$ and $R$ as the next two points. It is easy to see that, unlike the bisection search, the secant method uses both the magnitude and sign of the derivatives and hence can eliminate more than half the interval in some instances (see Figure 2.16).
Example 2.11 Secant Method
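Consider again the problem of Example 2.10:
\[ \text{Minimize } f(x) = 2x^2 + \frac{16}{x} \quad \text{over the interval } 1 \le x \le 5 \]
\[ f'(x) = \frac{df(x)}{dx} = 4x - \frac{16}{x^2} \]
Iteration 1
Step 1. $R = 5$, $L = 1$, $f'(R) = 19.36$, $f'(L) = -12$.
Step 2. $z = 5 - \dfrac{19.36}{(19.36 + 12)/4} = 2.53$
Step 3. $f'(z) = 7.62 > 0$; set $R = 2.53$.
Iteration 2
Step 2. $z = 2.53 - \dfrac{7.62}{(7.62 + 12)/1.53} = 1.94$
Step 3. $f'(z) = 3.51 > 0$; set $R = 1.94$.
Continue until $|f'(z)| \le \varepsilon$.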
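The iterations of Example 2.11 can be reproduced with a short sketch of the bracketing secant step. The name `secant_search`, the default tolerance, and the iteration cap are assumptions introduced here. The list of trial points is returned so that the values 2.53 and 1.94 can be checked.

```python
def secant_search(df, L, R, eps=1e-6, max_iter=200):
    """Bracketing secant search for f'(x) = 0, assuming f'(L) < 0 < f'(R)."""
    gL, gR = df(L), df(R)
    if not (gL < 0.0 < gR):
        raise ValueError("need f'(L) < 0 and f'(R) > 0")
    trial_points = []
    for _ in range(max_iter):
        # Zero of the straight line through (L, f'(L)) and (R, f'(R))
        z = R - gR / ((gR - gL) / (R - L))
        gz = df(z)
        trial_points.append(z)
        if abs(gz) <= eps:
            return z, trial_points
        # Keep the pair of points whose derivatives are opposite in sign.
        if gz < 0.0:
            L, gL = z, gz
        else:
            R, gR = z, gz
    raise RuntimeError("no convergence within max_iter iterations")


def df(x):
    return 4.0 * x - 16.0 / x**2  # derivative of 2x^2 + 16/x


z_star, trial_points = secant_search(df, 1.0, 5.0)
print([round(z, 2) for z in trial_points[:2]], z_star)
```

On this problem the left end $L = 1$ never moves, because every trial point has a positive derivative. The right end closes in on the stationary point from above.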
2.5.4 Cubic Search Method
This is a polynomial approximation method in which a given function $f$ to be minimized is approximated by a third-order polynomial. The basic logic is similar to the quadratic approximation scheme. However, in this instance, because both the function value and the derivative value are available at each point, the approximating polynomial can be constructed using fewer points.
The cubic search starts with an arbitrary point $x_1$ and finds another point $x_2$ by a bounding search such that the derivatives $f'(x_1)$ and $f'(x_2)$ are of opposite sign. In other words, the stationary point $\bar{x}$ where $f'(x) = 0$ is bracketed between $x_1$ and $x_2$. A cubic approximation function of the form
\[ \bar{f}(x) = a_0 + a_1(x - x_1) + a_2(x - x_1)(x - x_2) + a_3(x - x_1)^2(x - x_2) \tag{2.8} \]
is fitted such that Eq. (2.8) agrees with $f(x)$ at the two points $x_1$ and $x_2$. The first derivative of $\bar{f}(x)$ is given by
\[ \frac{d\bar{f}(x)}{dx} = a_1 + a_2(x - x_1) + a_2(x - x_2) + a_3(x - x_1)^2 + 2a_3(x - x_1)(x - x_2) \tag{2.9} \]
The coefficients $a_0$, $a_1$, $a_2$, and $a_3$ of Eq. (2.8) can now be determined using the values of $f(x_1)$, $f(x_2)$, $f'(x_1)$, and $f'(x_2)$ by solving the following linear equations:
\[ f_1 = f(x_1) = a_0 \]
\[ f_2 = f(x_2) = a_0 + a_1(x_2 - x_1) \]
\[ f'_1 = f'(x_1) = a_1 + a_2(x_1 - x_2) \]
\[ f'_2 = f'(x_2) = a_1 + a_2(x_2 - x_1) + a_3(x_2 - x_1)^2 \]
Note that the above system can be solved very easily in a recursive manner.
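A minimal sketch of that recursive solution follows. Each equation introduces exactly one new coefficient, so the four unknowns can be obtained one after another by substitution. The function name `cubic_coefficients` and the argument names (`g1`, `g2` for the two derivative values) are assumptions made for this illustration.

```python
def cubic_coefficients(x1, x2, f1, f2, g1, g2):
    """Solve the four fitting equations in order for a0, a1, a2, a3.

    f1, f2 are f(x1), f(x2); g1, g2 are f'(x1), f'(x2).
    """
    h = x2 - x1
    a0 = f1                          # from f1 = a0
    a1 = (f2 - a0) / h               # from f2 = a0 + a1 (x2 - x1)
    a2 = (a1 - g1) / h               # from f'1 = a1 + a2 (x1 - x2)
    a3 = (g2 - a1 - a2 * h) / h**2   # from f'2 = a1 + a2 (x2 - x1) + a3 (x2 - x1)^2
    return a0, a1, a2, a3


# Check on a function that is itself a cubic: f(x) = x^3 - 2x + 1 on [0, 2]
coeffs = cubic_coefficients(0.0, 2.0, 1.0, 5.0, -2.0, 10.0)
print(coeffs)  # (1.0, 2.0, 2.0, 1.0)
```

Substituting these values into the cubic form gives $1 + 2x + 2x(x - 2) + x^2(x - 2) = x^3 - 2x + 1$, so the fit reproduces the test function exactly, as expected when the function is already a cubic.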
Then, as in the quadratic case discussed earlier, given these coefficients, an estimate of the stationary point of $f$ can be obtained from the approximating cubic of Eq. (2.8). In this case, when we set the derivative of $\bar{f}(x)$ given by Eq. (2.9) to zero, we get a quadratic equation. By applying the formula for the root of the quadratic equation, a closed-form solution to the stationary point $\bar{x}$ of the approximating cubic is obtained as follows:
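\[ \bar{x} = \begin{cases} x_2 & \text{if } \mu < 0 \\ x_2 - \mu(x_2 - x_1) & \text{if } 0 \le \mu \le 1 \\ x_1 & \text{if } \mu > 1 \end{cases} \tag{2.10} \]
where
\[ z = \frac{3(f_1 - f_2)}{x_2 - x_1} + f'_1 + f'_2 \]
\[ w = \begin{cases} (z^2 - f'_1 f'_2)^{1/2} & \text{if } x_1 < x_2 \\ -(z^2 - f'_1 f'_2)^{1/2} & \text{if } x_1 > x_2 \end{cases} \]
\[ \mu = \frac{f'_2 + w - z}{f'_2 - f'_1 + 2w} \]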
The definition of $w$ in Eq. (2.10) ensures that the proper root of the quadratic equation is selected, while forcing $\mu$ to lie between 0 and 1 ensures that the predicted point $\bar{x}$ lies between the bounds $x_1$ and $x_2$. Once again we select the next two points for the cubic approximation as $\bar{x}$ and one of the points $x_1$ or $x_2$ such that the derivatives of the two points are of opposite sign and repeat the cubic approximation.
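A single cubic estimation step can be sketched as below. The formulas for $z$, $w$, and $\mu$ are written here in the commonly quoted closed form for cubic interpolation from two function values and two derivative values, which is an assumption of this sketch and should be checked against Eq. (2.10). The function name `cubic_step` and the argument names are also assumptions.

```python
import math


def cubic_step(x1, x2, f1, f2, g1, g2):
    """Estimate the stationary point of the cubic fitted at x1 and x2.

    f1, f2 are f(x1), f(x2); g1, g2 are f'(x1), f'(x2), assumed opposite in sign.
    """
    z = 3.0 * (f1 - f2) / (x2 - x1) + g1 + g2
    root = math.sqrt(z * z - g1 * g2)  # g1 * g2 <= 0, so the argument is nonnegative
    w = root if x1 < x2 else -root     # sign choice selects the proper root
    mu = (g2 + w - z) / (g2 - g1 + 2.0 * w)
    if mu < 0.0:
        return x2
    if mu > 1.0:
        return x1
    return x2 - mu * (x2 - x1)         # lies between x1 and x2


# f(x) = x^3 - 2x + 1 has f'(x) = 3x^2 - 2, so its minimum is at sqrt(2/3).
x_bar = cubic_step(0.0, 2.0, 1.0, 5.0, -2.0, 10.0)
print(x_bar, math.sqrt(2.0 / 3.0))
```

Because the test function is itself a cubic, one step lands on its minimum. For a general function the step would be repeated with $\bar{x}$ and whichever of $x_1$ or $x_2$ has a derivative of opposite sign.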
Given an initial point $x_0$, positive step size $\Delta$, and termination parameters $\varepsilon_1$ and $\varepsilon_2$, the formal steps of the cubic search algorithm are as follows:
Step 1. Compute $f'(x_0)$.
If $f'(x_0) < 0$, compute $x_{K+1} = x_K + 2^K \Delta$ for $K = 0, 1, \ldots$.
If $f'(x_0) > 0$, compute $x_{K+1} = x_K - 2^K \Delta$ for $K = 0, 1, 2, \ldots$.
Step 2. Evaluate $f'(x)$ for points $x_{K+1}$ for $K = 0, 1, 2, \ldots$ until a point $x_M$ is reached at which $f'(x_{M-1}) f'(x_M) \le 0$.