

ity constraints by $h_j(x) = 0$, $1 \le j \le l$. The KKT conditions require all the $g_i$ to be convex and all the $h_j$ to be linear. Then, if a is a point in $\mathbb{R}^n$, and there exist m and l constants, respectively, denoted $\mu_i$ and $\nu_j$, such that the following conditions hold, we can guarantee that a is a globally constrained minimum of f:

$$
\begin{aligned}
\nabla f(a) + \sum_{i=1}^{m} \mu_i \nabla g_i(a) + \sum_{j=1}^{l} \nu_j \nabla h_j(a) &= 0 \\
g_i(a) \le 0 \;\; \forall i, \qquad h_j(a) &= 0 \;\; \forall j \\
\mu_i \ge 0 \;\; \forall i, \qquad \mu_i g_i(a) &= 0 \;\; \forall i
\end{aligned}
\qquad \text{(EQ 4.8)}
$$

If these conditions are met, the stationary points of the auxiliary function (the first equation in the preceding set of equations) yield the minima of f.

4.7 Heuristic Nonlinear Optimization

We now turn to the situation in which the objective function and the constraints are not mathematically “nice,” that is, linear or convex. In such cases, we need to rely on heuristic approaches to optimization. Many such approaches have been proposed in the literature. We will outline only two common ones: hill climbing and genetic algorithms.

4.7.1 Hill Climbing

Hill climbing is perhaps the simplest technique for heuristic optimization. Its simplest variant does not even support constraints, seeking only to find the value of x that maximizes f(x). Hill climbing requires two primitives: a way to evaluate f(x) given x and a way to generate, for each point x, another point y that is near x (assume that x and y are embedded in a suitable metric space).

Start by randomly choosing a point x in the domain of f and labeling it the candidate maximum (we might just get lucky!). Evaluate f on x, then generate a point y that is close to x. If the value of f is higher at y, then y is the new candidate maximum; otherwise, x remains the candidate. We continue to generate and evaluate f on neighboring points of the candidate maximum until we find a point x, all of whose neighbors have a lower value of f than at x. We declare this to be the maximum.

The analogy to climbing a hill is clear. We start somewhere on the hill and take a step in a random direction. If it is higher, we step up. If not, we stay where we are. This way, assuming that the hill has a single peak, we will eventually get to the top, where every neighboring step must lead downhill.
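The basic procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names hill_climb, neighbor, and max_failures are introduced here for illustration, and the test function is an arbitrary single-peaked parabola. Because a continuous domain has infinitely many neighbors, the sketch approximates "all neighbors are lower" by stopping after max_failures consecutive neighbors fail to improve on the candidate.

```python
import random

def hill_climb(f, x0, neighbor, max_failures=200, rng=None):
    """Basic hill climbing: accept a random neighbor only if f is higher there."""
    rng = rng or random.Random()
    x, fx = x0, f(x0)          # current candidate maximum
    failures = 0               # consecutive non-improving neighbors
    while failures < max_failures:
        y = neighbor(x, rng)   # a point close to x
        fy = f(y)
        if fy > fx:            # uphill: y becomes the new candidate
            x, fx = y, fy
            failures = 0
        else:                  # not uphill: stay where we are
            failures += 1
    return x, fx

def f(x):
    # Assumed example objective with a single peak at x = 2.
    return -(x - 2.0) ** 2 + 3.0

def neighbor(x, rng):
    # Assumed neighborhood: a uniform step of at most 0.1 in either direction.
    return x + rng.uniform(-0.1, 0.1)

best_x, best_f = hill_climb(f, x0=-5.0, neighbor=neighbor, rng=random.Random(0))
print(best_x, best_f)
```

With a single peak, the returned point should lie close to x = 2; how close depends on the step size and on max_failures.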

Although simple, this approach to hill climbing leaves much to be desired. These concerns are addressed by the following variants of the basic approach.

• Generate more than one neighbor of x and choose the one where the value of f is greatest. This variant is also called the steepest-gradient method.

• Memorize some or all of the values of y that were discarded in a tabu list. Subsequently, if any value in the tabu list is generated, it can be immediately discarded. This variant is called tabu search.

• To find the maximum value of f subject to constraint g, choose the initial candidate maximum x to be a value that also satisfies g. Then, when generating neighbors of x, ensure that the neighboring values also satisfy g. This allows the use of hill climbing for constrained optimization.
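The three variants in the list above can be combined in one short Python sketch. Everything named here is an assumption made for illustration: the function steepest_ascent_tabu, the integer-grid domain with eight neighbors per point, the objective f, and the constraint x + y ≤ 0 standing in for g. The tabu list is kept as a set of discarded neighbors, which are skipped without being re-evaluated.

```python
def steepest_ascent_tabu(f, x0, neighbors, feasible=lambda p: True):
    """Steepest-ascent hill climbing with a tabu set and a feasibility check."""
    if not feasible(x0):
        raise ValueError("initial candidate must satisfy the constraint")
    x, fx = x0, f(x0)
    tabu = set()                       # discarded points, never evaluated again
    while True:
        # Only feasible, non-tabu neighbors are considered.
        candidates = [y for y in neighbors(x) if feasible(y) and y not in tabu]
        if not candidates:
            return x, fx
        scored = [(f(y), y) for y in candidates]
        fy, y = max(scored)            # the neighbor where f is greatest
        if fy <= fx:
            return x, fx               # no neighbor is higher: stop here
        tabu.update(c for c in candidates if c != y)
        x, fx = y, fy

def neighbors(p):
    # Assumed neighborhood: the eight surrounding points on the integer grid.
    x, y = p
    return [(x + dx, y + dy)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def f(p):
    # Assumed objective; its unconstrained maximum is at (3, -1).
    x, y = p
    return -(x - 3) ** 2 - (y + 1) ** 2

def feasible(p):
    # Assumed constraint playing the role of g: x + y <= 0.
    return p[0] + p[1] <= 0

best, value = steepest_ascent_tabu(f, (0, 0), neighbors, feasible)
print(best, value)  # (2, -2) -2
```

Because the search only ever moves uphill, a neighbor discarded in favor of a better one can never be accepted later, so skipping tabu points saves evaluations without changing the result in this sketch.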

The single biggest problem with hill climbing is that it fails when f has more than one maximum. In this case, an unfortunate initial choice of x will cause the algorithm to be stuck in a local maximum instead of finding the global maximum.

This is illustrated in Figure 4.6, which shows a function with multiple peaks. Starting at the base of any of the lesser peaks will result in hill climbing stopping at a local maximum.

There are several ways to get around this problem. One approach is called shotgun hill climbing. Here, the hill-climbing algorithm is started from several randomly chosen candidate maxima, and the best result among these runs is taken as the estimate of the global maximum. With enough starting points, at least one run is likely, though not guaranteed, to begin on the slope of the highest peak. This approach is widely used.
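Shotgun hill climbing can be sketched in Python as follows. The landscape is an assumed list of heights with four peaks, and the helper names climb and shotgun_hill_climb are hypothetical. The list must contain at least two entries so that every index has a neighbor.

```python
import random

def climb(heights, i):
    """Steepest-ascent hill climbing over list indices; returns a local maximum."""
    while True:
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(nbrs, key=lambda j: heights[j])
        if heights[best] <= heights[i]:
            return i               # neither neighbor is higher
        i = best

def shotgun_hill_climb(heights, starts):
    """Run hill climbing from every start and keep the highest peak found."""
    peaks = [climb(heights, s) for s in starts]
    best = max(peaks, key=lambda i: heights[i])
    return best, heights[best]

# Assumed landscape: local maxima at indices 1, 4, 7, and 11; the global one is index 4.
heights = [1, 3, 2, 5, 9, 4, 3, 6, 2, 1, 7, 8, 3]

rng = random.Random(1)
starts = [rng.randrange(len(heights)) for _ in range(5)]
peak, value = shotgun_hill_climb(heights, starts)
print(peak, value)
```

A single run from index 0 stops at the minor peak at index 1. The global peak is found only if at least one start falls in its basin (indices 2 through 5 here), which is why more starting points improve the odds without guaranteeing success.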

A second approach, called simulated annealing, varies the closeness of a selected neighbor dynamically. Moreover, it allows for some steps of the climb to be downhill. The idea is that if the algorithm is trapped at a local maximum, the only way out is to go down before going up, and therefore, downhill steps should be allowed. The degree to which downhill steps are permitted varies over the climb. At the start, even large downhill steps are permitted. As the climb progresses, however, only small downhill steps are permitted.

More precisely, the algorithm evaluates the function value at the current candidate point x and at some neighbor y. There is also a control variable, called the temperature, T, that describes how large a downhill step is permitted. The acceptance function A(f(x), f(y), T) determines the probability with which the algorithm moves from x to y as a function of their values and the current temperature, with a nonzero probability even when f(y) < f(x). Moreover, the acceptance function tends to zero when T tends to zero and f(y) < f(x). The choice of the acceptance function is problem-specific and therefore usually handcrafted.
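A minimal Python sketch of simulated annealing follows. Since the acceptance function is problem-specific, the one used here is only one common choice, assumed for illustration: uphill moves are always accepted, and a downhill move is accepted with probability exp((f(y) - f(x)) / T), which is nonzero for T > 0 and tends to zero as T tends to zero. The geometric cooling schedule, the step size that shrinks with T, the objective f, and the bookkeeping of the best point seen so far are likewise assumptions, not part of the description above.

```python
import math
import random

def acceptance(fx, fy, T):
    """Probability of moving from x to y when maximizing (assumed form)."""
    if fy >= fx:
        return 1.0                      # uphill or level: always move
    return math.exp((fy - fx) / T)      # downhill: less likely as T falls

def simulated_annealing(f, x0, neighbor, T0=1.0, cooling=0.995,
                        steps=5000, rng=None):
    rng = rng or random.Random()
    x, fx = x0, f(x0)
    best_x, best_f = x, fx              # best point seen so far
    T = T0
    for _ in range(steps):
        y = neighbor(x, T, rng)
        fy = f(y)
        if rng.random() < acceptance(fx, fy, T):
            x, fx = y, fy
            if fx > best_f:
                best_x, best_f = x, fx
        T *= cooling                    # assumed geometric cooling schedule
    return best_x, best_f

def f(x):
    # Assumed multi-peaked objective; its global maximum is f(0) = 1.
    return math.cos(3 * x) - 0.1 * x * x

def neighbor(x, T, rng):
    # Assumed rule: neighbors are chosen closer to x as the temperature falls.
    return x + rng.uniform(-1.0, 1.0) * (0.05 + T)

best_x, best_f = simulated_annealing(f, 4.0, neighbor, rng=random.Random(0))
print(best_x, best_f)
```

Early on, when T is near T0, the walk can cross the valleys between peaks. As T shrinks, the sketch behaves more and more like plain hill climbing around whichever peak it currently occupies.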


4.7.2 Genetic Algorithms

The term genetic algorithm applies to a broad class of approaches that share some common attributes. The key idea is to encode a candidate maximum value x as a bit string. At the start, hundreds or thousands of such candidate values are randomly generated. The function f is then evaluated at each such value, and the best ones are selected for propagation in one of two ways. With mutation, some bits of a selected candidate value are randomly perturbed to form the next generation of candidates. With crossover, bits from two selected candidate values are randomly exchanged. In this way, the best features of the population are inherited by the next generation. The algorithm proceeds by forming generation after generation of candidates, until adequate solutions are found.

There is an extensive literature on algorithms for encoding candidates, introducing mutations, and making effective crossovers. Genetic algorithms have been found to produce surprisingly good results in areas ranging from antenna design to job scheduling. However, the approach has also been criticized for many shortcomings, such as its sensitivity to numerous tuning parameters and a tendency to converge to local optima.
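The generational loop described in this section can be sketched in Python as follows. All specifics are assumptions chosen for illustration: candidates are 8-bit strings decoded as integers, crossover is single-point, each child is produced by crossover followed by mutation, and the selected parents are carried over unchanged into the next generation. The parameter values (pop_size, keep, rate) are arbitrary, and a real application would need to tune them.

```python
import random

BITS = 8  # assumed encoding: x is an integer in 0..255

def decode(bits):
    """Convert a list of 0/1 bits into the integer it encodes."""
    return int("".join(map(str, bits)), 2)

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    """Single-point crossover: a prefix of a followed by a suffix of b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def genetic_maximize(f, pop_size=40, generations=30, keep=10,
                     rate=0.05, rng=None):
    rng = rng or random.Random()
    pop = [[rng.randint(0, 1) for _ in range(BITS)] for _ in range(pop_size)]
    history = []                        # best fitness at each generation
    for _ in range(generations):
        pop.sort(key=lambda bits: f(decode(bits)), reverse=True)
        history.append(f(decode(pop[0])))
        parents = pop[:keep]            # the best candidates are selected
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(pop_size - keep)
        ]
        pop = parents + children        # parents survive unchanged
    best = max(pop, key=lambda bits: f(decode(bits)))
    return decode(best), history

def f(x):
    # Assumed objective with its maximum at x = 170.
    return -(x - 170) ** 2

best_x, history = genetic_maximize(f, rng=random.Random(0))
print(best_x, f(best_x))
```

Because the selected parents are retained, the best fitness recorded in history never decreases from one generation to the next. Without that choice, a good candidate could be lost to an unlucky mutation.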

Figure 4.6 Example of a function in which hill climbing can get stuck in a local maximum


4.8 Exercises

1. Modeling

You have been hired as the head of a hot air balloon company’s flight operations. Too much money is being spent for each flight! Your job is to make flights profitable again. (The number of flights is not negotiable.)

For each flight, you can control where you take off from (there is a finite set of take-off locations) and the duration of the flight, as long as the flight lasts at least 15 minutes. The cost of a flight depends on its duration (to pay for natural gas, the pilot’s wages, and for the chase vehicle), where the balloon takes off from, and how far the landing site is from a road (the farther away it is from a road, the more it has to be dragged over a farmer’s field). Moreover, you can have at least one pilot and up to nine passengers and can charge them what you wish. Of course, the number of passengers decreases (say, linearly) with the cost of the ticket.

What are the fixed parameters? What are the input and output parameters? What are the control variables? Come up with plausible transfer and objective functions. How would you empirically estimate the transfer function?

2. Optimizing a function of two variables