Summary of Chapter 6: Algorithm SO-Ic - Contribution and Organization

1.3 Contribution and Organization

1.3.5 Summary of Chapter 6: Algorithm SO-Ic

An application problem arising in the agricultural land use management of the Cannonsville reservoir watershed in upstate New York is considered in Chapter 6. The problem is to determine the optimal locations of land conversion (or land retirement) in order to reduce the amount of total phosphorus runoff in the watershed. Land conversion is a best management practice (BMP) employed in watersheds that drain to bodies of water that are particularly sensitive to agricultural runoff. This particular BMP involves contracts between government agencies and local land owners to remove certain land from crop production. In exchange for retiring the land from crop production, the government agencies agree to provide some percentage of setup costs, and yearly rental and maintenance payments. Since the cost of these projects can be quite high, the goal is to minimize the total conversion costs while keeping the total phosphorus runoff below a given threshold. The total conversion costs and phosphorus runoff must be computed with a highly nonlinear and computationally expensive simulation model that has been supplied by Joshua Woodbury from Cornell University. The algorithm SO-I described in Chapter 5 has been further developed with respect to the constraint handling to solve this particular application problem. The algorithm, SO-Ic (Surrogate Optimization - Integer constraints), uses a cubic radial basis function interpolant as a surrogate model for approximating the objective and the constraint function. In the first optimization phase the response surface of the constraint is used for minimizing the computationally expensive phosphorus constraint in order to find a first feasible solution. In the second optimization phase the response surface for the constraint is used to discard candidate points that are predicted to be infeasible, and thus it is more likely that points chosen for the computationally expensive simulations are feasible.
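The screening step of the second phase can be illustrated with a minimal sketch. It assumes the common cubic RBF form with a linear polynomial tail, s(x) = sum_i lambda_i ||x - x_i||^3 + b^T x + a; the exact formulation used by SO-Ic is given in Chapters 5 and 6, not here. The function names, the sample points, and the linear stand-in for the phosphorus constraint are hypothetical and serve only to make the example runnable.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; sample points may be degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_cubic_rbf(points: Sequence[Vector], values: Sequence[float]) -> Callable[[Vector], float]:
    """Fit an interpolating cubic RBF with a linear tail (assumed form)."""
    n, k = len(points), len(points[0])
    size = n + k + 1
    a = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            a[i][j] = math.dist(points[i], points[j]) ** 3
        tail = list(points[i]) + [1.0]
        for t, v in enumerate(tail):
            a[i][n + t] = v  # polynomial block P
            a[n + t][i] = v  # its transpose P^T
    coef = solve_linear(a, list(values) + [0.0] * (k + 1))
    lam, c = coef[:n], coef[n:]

    def surrogate(x: Vector) -> float:
        rbf = sum(l * math.dist(x, p) ** 3 for l, p in zip(lam, points))
        return rbf + sum(ci * xi for ci, xi in zip(c, x)) + c[k]

    return surrogate


def predicted_feasible(candidates: Sequence[Vector],
                       constraint_surrogate: Callable[[Vector], float],
                       limit: float) -> List[Vector]:
    """Keep only candidates whose predicted constraint value is within the limit."""
    return [x for x in candidates if constraint_surrogate(x) <= limit]


# Hypothetical data: integer points already simulated, with a cheap linear
# stand-in g(x) = 2*x0 + x1 + 1 replacing the expensive phosphorus model.
sampled = [(0, 0), (1, 0), (0, 1), (2, 2), (3, 1)]
constraint_values = [2 * p[0] + p[1] + 1 for p in sampled]
g_surrogate = fit_cubic_rbf(sampled, constraint_values)
kept = predicted_feasible([(1, 1), (3, 3), (0, 2)], g_surrogate, limit=5.0)
```

Only the candidates in `kept` would be passed on to the expensive simulation; a practical implementation would also need a fallback for the case that no candidate is predicted to be feasible.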

The performance of SO-Ic on this application problem has been compared to NOMAD, a genetic algorithm, and the discrete dynamically dimensioned search (discrete-DDS) algorithm [140]. Branch and bound could not be used for this application problem because the code for evaluating the objective and constraint function fails if the input variable vector contains continuous variables, which is the case whenever branch and bound tries to optimize a relaxed subproblem to compute lower bounds for the objective function value. The numerical experiments on three problem instances with different upper bounds for the allowable phosphorus content in the water showed that SO-Ic achieves significantly better results than the other algorithms. SO-Ic also has the lowest mean standard errors, which indicates that SO-Ic is a rather robust algorithm. NOMAD performed comparatively poorly on all problem instances, and was not able to find a feasible solution in several trials. The numerical experiments also showed that the cost does not increase linearly with the phosphorus reduction goal. Increasing the phosphorus reduction goal from 20% to 40% of the base case incurs significantly lower additional costs than increasing the reduction goal from 40% to 60%.

The material of Chapter 6 is joint work with Joshua Woodbury from Cornell University, who kindly provided the description of the problem and the cost-benefit analysis.

SO-M: A Mixture Surrogate Model Algorithm for Global Optimization Problems Using Dempster-Shafer Theory

Abstract

Research in algorithms for solving computationally expensive global optimization problems using surrogate models has shown that it is in general not possible to use the same type of surrogate model for solving different kinds of problems. While a radial basis function model may be the most suitable for some problems, for others the best results may be obtained with a polynomial regression model. In this chapter the approach of applying Dempster-Shafer theory to the selection and combination of surrogate models is introduced. Cross-validation is used to compute various model characteristics, and, for dealing with conflicting characteristics, different conflict redistribution rules have been examined with respect to their influence on the results in numerical experiments. Furthermore, the effect of the surrogate model type, i.e. using mixture models, single models or a hybrid of both, has been studied. The various versions of the algorithm, SO-M, are compared on six well-known global optimization test problems from the Dixon and Szegö [39] test bench. The results indicate that SO-M is able to thoroughly explore the variable domain for all problems, and that the vicinities of the global optima could be detected. For all but one test problem, the global minima could be approximated with high accuracy within 150 or fewer function evaluations.

CHAPTER 2. SO-M 23

Abbreviations and Nomenclature

ARS             Accelerated random search
BPA             Basic probability assignment
BetP            Pignistic probability
CC              Correlation coefficient
D               Dempster's rule
DST             Dempster-Shafer theory
I               Inagaki's rule
K               Kriging model
M, MARS         Multivariate adaptive regression spline
MAD             Median absolute deviation
MAE             Maximum absolute error
P               Polynomial regression model
PCR             Proportional conflict redistribution rule
R, RBF          Radial basis function surrogate model
RMSE            Root mean squared error
SO-M            Surrogate Optimization - Mixture
Y               Yager's rule
$f(\cdot)$      Objective function, see equation (2.1a)
$x$             Continuous variable vector
$x^T$           Transpose of $x$
$\mathbb{R}$    Real numbers
$x_i$           $i$th continuous variable, $i = 1, \ldots, k$, see equation (2.1c)
$x_i^l, x_i^u$  Lower and upper bound for the $i$th continuous variable, see equation (2.1b)
$k$             Problem dimension, see equation (2.1c)
$\Omega$        Variable domain
$s_{\mathrm{mix}}$  Mixture surrogate model, see equation (2.2)
$w_r$           Weight for the $r$th surrogate model in the mixture, $r \in \mathcal{M}$, see equation (2.2)
$\mathcal{M}$   Set of models contributing to the mixture, see equation (2.2)
$s_r$           $r$th surrogate model in the mixture, $r \in \mathcal{M}$, see equation (2.2)
$s_{\min}$      Minimum of the response surface

2.1 Introduction

In this chapter box-constrained global optimization problems of the following form are considered:

\begin{align}
  \text{minimize} \quad & f(x) \tag{2.1a} \\
  \text{s.t.} \quad & -\infty < x_i^l \le x_i \le x_i^u < \infty, \tag{2.1b} \\
  & x_i \in \mathbb{R}, \quad i = 1, 2, \ldots, k. \tag{2.1c}
\end{align}

It is assumed that evaluating the objective function $f(x)$ requires a time-consuming simulation model, and thus a good approximation of the global minimum should be found within as few function evaluations as possible. Moreover, the algebraic form of $f(x)$ is unknown (black-box). In many applications $f(x)$ is multimodal, and therefore an algorithm that is able to search locally as well as globally is needed. The box-constrained variable domain defined in equation (2.1b) will in the following be denoted by $\Omega \subset \mathbb{R}^k$.
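The problem class (2.1a)-(2.1c) can be made concrete with a small sketch of a box-constrained black-box problem under a limited evaluation budget. The class name `BoxProblem`, its fields, and the cheap multimodal stand-in objective are hypothetical and are not taken from the thesis; a real application would call the expensive simulation instead.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class BoxProblem:
    """Black-box objective on a box domain with a limited evaluation budget."""
    objective: Callable[[Sequence[float]], float]
    lower: List[float]  # finite lower bounds x_i^l
    upper: List[float]  # finite upper bounds x_i^u
    budget: int         # maximum number of expensive evaluations
    evaluations: int = field(default=0, init=False)

    def __post_init__(self) -> None:
        if len(self.lower) != len(self.upper):
            raise ValueError("lower and upper bounds must have equal length")
        if any(l > u for l, u in zip(self.lower, self.upper)):
            raise ValueError("each lower bound must not exceed its upper bound")

    @property
    def dimension(self) -> int:
        return len(self.lower)

    def contains(self, x: Sequence[float]) -> bool:
        """Check the box constraints (2.1b)."""
        return len(x) == self.dimension and all(
            l <= xi <= u for l, xi, u in zip(self.lower, x, self.upper))

    def evaluate(self, x: Sequence[float]) -> float:
        """Evaluate f(x) and count the evaluation against the budget."""
        if not self.contains(x):
            raise ValueError("point lies outside the variable domain")
        if self.evaluations >= self.budget:
            raise RuntimeError("evaluation budget exhausted")
        self.evaluations += 1
        return self.objective(x)

    def sample(self, rng: random.Random) -> List[float]:
        """Draw a uniformly distributed point from the box."""
        return [rng.uniform(l, u) for l, u in zip(self.lower, self.upper)]


def stand_in_objective(x: Sequence[float]) -> float:
    """Cheap multimodal stand-in for an expensive simulation (hypothetical)."""
    return sum(xi ** 2 - math.cos(3.0 * xi) for xi in x)


problem = BoxProblem(stand_in_objective, lower=[-2.0, -2.0], upper=[2.0, 2.0], budget=150)
start = problem.sample(random.Random(0))
value_at_origin = problem.evaluate([0.0, 0.0])
```

Counting evaluations inside the problem object keeps the budget independent of the search strategy, which matters when several algorithm variants are compared under the same number of function evaluations.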

As stated in Chapter 1, surrogate models have been developed in order to reduce the necessary number of costly simulations while searching for the global minimum [52, 75, 84, 86, 114, 155]. Surrogate models have, for example, been used by Glaz et al. [52] during the optimization of helicopter rotor blades, and also by Queipo et al. [114], who optimized a liquid-rocket injector with respect to multiple objectives. Efficient global optimization (EGO) has been applied by Morgans et al. [96] in order to optimize the shape of horn-loaded loudspeakers. Surrogate models have also been used in automotive design (see, for example, [86, 155, 163]).

However, as shown for example by Goel et al. [53] and Viana and Haftka [144], one surrogate model does not suit all kinds of problems, i.e. a certain surrogate model might perform very well for some problems, but poorly for others. If it is not known beforehand which surrogate model is the most suitable for the problem at hand, different models would have to be tried in order to find the most effective one. This approach is, however, not feasible due to restrictions on the computation time. Thus, the challenge is to either somehow determine the best surrogate model, or to adjust the influence of single surrogate models in mixtures such that good models have higher influence than bad models in order to obtain the best results. The prediction $s_{\mathrm{mix}}(x)$ of such a mixture surrogate model can in general be represented as

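\begin{equation}
  s_{\mathrm{mix}}(x) = \sum_{r \in \mathcal{M}} w_r s_r(x), \quad \text{where} \quad \sum_{r \in \mathcal{M}} w_r = 1, \tag{2.2}
\end{equation}

and where $s_r(x)$ denotes the prediction of the $r$th contributing model, $w_r \ge 0$ is the corresponding weight, and $\mathcal{M}$ is the set of surrogate models in the mixture.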
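As a minimal illustration of equation (2.2), the following sketch evaluates a mixture prediction as a convex combination of single-model predictions. The function name, the two stand-in models, and the weights are hypothetical; in SO-M the weights are derived with Dempster-Shafer theory rather than fixed by hand.

```python
from typing import Callable, Dict, Sequence

Surrogate = Callable[[Sequence[float]], float]


def mixture_prediction(x: Sequence[float],
                       models: Dict[str, Surrogate],
                       weights: Dict[str, float]) -> float:
    """Evaluate s_mix(x) = sum over r of w_r * s_r(x), see equation (2.2)."""
    if set(models) != set(weights):
        raise ValueError("every model needs exactly one weight")
    if any(w < 0.0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    return sum(weights[r] * models[r](x) for r in models)


# Hypothetical stand-ins for two fitted surrogate models, labelled as in the
# nomenclature (R for an RBF model, P for a polynomial regression model).
example_models: Dict[str, Surrogate] = {
    "R": lambda x: sum(xi ** 2 for xi in x),
    "P": lambda x: sum(x),
}
example_weights = {"R": 0.75, "P": 0.25}
s_mix_value = mixture_prediction([1.0, 2.0], example_models, example_weights)
```

Here the R model predicts 5 and the P model predicts 3 at x = (1, 2), so the mixture prediction is 0.75 * 5 + 0.25 * 3 = 4.5. Rejecting negative weights and weights that do not sum to one enforces the two conditions stated with equation (2.2).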

Goel et al. [53] suggested different approaches for determining the weights of models in combinations. However, only one of the approaches allows emphasizing and restricting the influence of good and bad model characteristics, respectively. Moreover, in this approach parameters must be adjusted, which is in general a difficult task. Viana and Haftka [144] also considered mixture surrogate models and suggested the optimization of an auxiliary function in order to obtain an approximation matrix. This matrix is in turn used for determining the model weights, but it can lead to negative weights and weights larger than one, and thus results may become inaccurate.

In order to overcome the mentioned problems the choice and combination of surrogate models using Dempster-Shafer theory (DST) [37, 129] is introduced in this chapter. DST is a mathematical theory of evidence that provides means of combining information from different sources in order to construct a degree of belief. The theory allows the combination of imprecise and uncertain pieces of information that may even be conflicting. So-called basic probability assignments (BPA) contain information about certain hypotheses (focal elements¹), and are combined to calculate the credibility of a given hypothesis. Three functions are usually associated with BPAs, namely the belief, plausibility, and pignistic probability (BetP) function. In terms of surrogate models the BPAs can be derived, for example, from model characteristics such as correlation coefficients and various error measures. It is possible that one surrogate model has conflicting characteristics, i.e. good (e.g., high correlation coefficients, or low errors) and bad (e.g., high errors, or low correlation coefficients) characteristics simultaneously. This conflict must be taken into account when calculating the belief value for a given model.

Several rules have been developed in the literature for dealing with such conflicting information. Dempster's rule of combination redistributes the conflict among all focal elements, regardless of which elements caused the conflict. However, as shown by Zadeh [161], the results of this approach may be counter-intuitive. Fairer conflict redistribution rules have been developed. The conflict can be assigned to the set reflecting complete ignorance [154] (Yager's rule), or one may apply the proportional conflict redistribution (PCR) rules [132] to redistribute the conflict among those focal elements that actually cause the conflict. The disadvantage of the latter approach is, however, the computational complexity that increases with the number of information sources. Inagaki [70] proposed a general parametric formulation for the redistribution of the conflict.

¹ Here a hypothesis would be, for example, "A mixture of RBF and MARS should be chosen".
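The difference between Dempster's and Yager's treatment of conflict can be seen in a small sketch with two highly conflicting sources, in the spirit of the counter-intuitive behaviour attributed to Zadeh [161]. The masses, the use of the model labels R, K and M as hypotheses, and all function names are illustrative assumptions; the PCR and Inagaki rules are not shown.

```python
from itertools import product
from typing import Dict, FrozenSet, Tuple

BPA = Dict[FrozenSet[str], float]


def conjunctive(m1: BPA, m2: BPA) -> Tuple[BPA, float]:
    """Multiply masses of intersecting focal elements; return the conflict mass too."""
    combined: BPA = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that would fall on the empty set
    return combined, conflict


def dempster(m1: BPA, m2: BPA) -> BPA:
    """Dempster's rule: renormalize, spreading the conflict over all focal elements."""
    combined, conflict = conjunctive(m1, m2)
    if conflict >= 1.0:
        raise ValueError("total conflict; Dempster's rule is undefined")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


def yager(m1: BPA, m2: BPA, frame: FrozenSet[str]) -> BPA:
    """Yager's rule: assign the conflict to the whole frame (complete ignorance)."""
    combined, conflict = conjunctive(m1, m2)
    combined[frame] = combined.get(frame, 0.0) + conflict
    return combined


def pignistic(m: BPA) -> Dict[str, float]:
    """BetP: split each focal element's mass equally among its hypotheses."""
    bet: Dict[str, float] = {}
    for focal, mass in m.items():
        for hypothesis in focal:
            bet[hypothesis] = bet.get(hypothesis, 0.0) + mass / len(focal)
    return bet


# Two hypothetical sources that disagree strongly about the best model.
FRAME = frozenset({"R", "K", "M"})
source_1: BPA = {frozenset({"R"}): 0.99, frozenset({"K"}): 0.01}
source_2: BPA = {frozenset({"M"}): 0.99, frozenset({"K"}): 0.01}

by_dempster = dempster(source_1, source_2)     # essentially all mass on {"K"}
by_yager = yager(source_1, source_2, FRAME)    # almost all mass on the full frame
bet_yager = pignistic(by_yager)
```

With these inputs Dempster's rule places essentially all belief on K, the hypothesis both sources considered very unlikely, whereas Yager's rule moves the conflicting mass of about 0.9999 to the full frame and thus expresses near-complete ignorance.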

The goal of this chapter is to develop a surrogate model algorithm that uses Dempster-Shafer theory for choosing among different surrogate models the most suitable one for a given optimization problem and for finding the weights of single models contributing to mixture models as defined in equation (2.2). The considered surrogate models (polynomial regression models, MARS, RBF, kriging) have already been described in Section 1.2. Since it is in general unknown whether mixture models will be more successful than single models, the following alternative strategies are examined:

1. using the best mixture model for the first optimization steps, then switching to always using the best single surrogate model,

2. using only the best mixture model,

3. using only the best single surrogate model.
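The three strategies listed above can be sketched as a simple selection function. The strategy labels, the function name, and in particular the parameter `switch_after` are assumptions made for illustration; this passage does not state after how many steps the hybrid strategy switches from the mixture to the single model.

```python
from typing import Callable, Sequence

Surrogate = Callable[[Sequence[float]], float]


def select_surrogate(strategy: str,
                     iteration: int,
                     best_mixture: Surrogate,
                     best_single: Surrogate,
                     switch_after: int = 0) -> Surrogate:
    """Return the surrogate to use in the given iteration.

    switch_after is a hypothetical parameter: the number of initial
    iterations in which the hybrid strategy uses the mixture model.
    """
    if strategy == "mixture_then_single":   # strategy 1
        return best_mixture if iteration < switch_after else best_single
    if strategy == "mixture_only":          # strategy 2
        return best_mixture
    if strategy == "single_only":           # strategy 3
        return best_single
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical stand-ins for the currently best mixture and single models.
def mixture_model(x: Sequence[float]) -> float:
    return sum(xi ** 2 for xi in x)


def single_model(x: Sequence[float]) -> float:
    return sum(abs(xi) for xi in x)


early = select_surrogate("mixture_then_single", 3, mixture_model, single_model, switch_after=10)
late = select_surrogate("mixture_then_single", 12, mixture_model, single_model, switch_after=10)
```

In this sketch `early` is the mixture model and `late` is the single model, because iteration 3 lies before and iteration 12 after the assumed switch point.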

The remainder of this chapter is structured as follows. The algorithm SO-M (Surrogate Optimization - Mixture) using DST is described in Section 2.2. The results of the numerical experiments on a subset of problems from the Dixon and Szegö [39] test bench are discussed in Section 2.3. The algebraic description of the examined test problems is given in Appendix A. Conclusions are drawn in Section 2.4.

2.2 SO-M: Mixture Surrogate Model Algo-

In document Surrogate Model Algorithms for Computationally Expensive Black-Box Global Optimization Problems (Page 36-42)