Abstract—In simulation-based optimization, the objective function is often computationally expensive. Surrogate-assisted optimization is therefore a major approach to solving such problems efficiently. One of its central issues is how to integrate the approximate models (surrogates or metamodels) into the optimization process. The challenge is to find the best trade-off between the quality (in terms of precision) of the provided solutions and the efficiency (in terms of execution time) of the resolution. In this paper, we investigate evolution control, which alternates between the simulator and the surrogate within the optimization process. We propose an adaptive evolution control mechanism based on the distance-based concept of confident regions. The approach has been integrated into an ANN-assisted NSGA-II and evaluated on the ZDT4 multi-modal benchmark function. The reported results show that the proposed approach outperforms two other existing ones.
software development principle of ‘separation of concerns’, all of this can be done based on the best assumptions about how real-world data would look, using similarly formed simulated data. At some stage, however, and at the latest when the machine learning pipeline from data preparation through training to evaluation can be shown to work, this project will need actual, real mine location data. Sources for relevant data are currently being investigated, but given the high demand for data in machine learning processes, any additional source would be beneficial to the project and greatly appreciated.
In the evolutionary computation research community, how to deal with expensive unconstrained optimization problems (EUOPs) has attracted much attention. Owing to their computationally expensive nature, applying EAs to solve EUOPs in a straightforward way is highly time-consuming and even impractical. Over the past 15 years, surrogate-assisted EAs (SAEAs) have been widely accepted as one of the most popular methods for solving EUOPs. The basic idea behind SAEAs is to build a surrogate model to approximate the original expensive objective function. Since the surrogate model is much more computationally efficient than the original objective function, the computational cost can be reduced significantly. At present, a variety of surrogate models have been developed, such as polynomial regression, Gaussian process (GP), radial basis function (RBF), artificial neural network, and support vector machine (SVM). In order to find the global optimum within a limited budget of FEs, SAEAs should combine the surrogate model with the original objective function in an effective way. There are three kinds of methods to address this issue, namely evolution control, surrogate-assisted prescreening, and surrogate-assisted local search. Evolution control, proposed by Jin et al. [5], is a simple yet popular framework for managing surrogates, which includes individual-based control and generation-based control. In individual-based control, some of the individuals are evaluated with the original objective function and the remaining individuals are evaluated with surrogates at each generation. In contrast, with generation-based control, all individuals in the population are evaluated with surrogates in some generations, while in the remaining generations the original objective function is used to evaluate all individuals.
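The two evolution control variants described above can be sketched in a few lines. The following is a minimal illustration, not any specific published algorithm: `true_f` stands in for the expensive simulator, and `NearestNeighborSurrogate` is a deliberately trivial surrogate (a real SAEA would use GP, RBF, or an ANN).

```python
import numpy as np

def true_f(x):
    # Stand-in for the expensive simulator (sphere function).
    return float(np.sum(x ** 2))

class NearestNeighborSurrogate:
    """Toy surrogate: predicts the value of the nearest archived sample."""
    def __init__(self):
        self.X, self.y = [], []
    def add(self, x, y):
        self.X.append(x)
        self.y.append(y)
    def predict(self, x):
        d = [np.linalg.norm(x - xi) for xi in self.X]
        return self.y[int(np.argmin(d))]

def generation_based_control(pop, gen, surrogate, period=3):
    """Generation-based control: every `period`-th generation is evaluated
    with the true function (and archived); the others use the surrogate."""
    if gen % period == 0:
        fits = []
        for x in pop:
            y = true_f(x)
            surrogate.add(x, y)
            fits.append(y)
        return fits
    return [surrogate.predict(x) for x in pop]

def individual_based_control(pop, surrogate, n_true=2):
    """Individual-based control: only the `n_true` surrogate-best individuals
    are re-evaluated with the true function; the rest keep surrogate values."""
    fits = [surrogate.predict(x) for x in pop]
    for i in np.argsort(fits)[:n_true]:
        fits[i] = true_f(pop[i])
        surrogate.add(pop[i], fits[i])
    return fits
```

In a full SAEA these routines would sit inside the EA's evaluation step, with `period` or `n_true` possibly adapted online according to surrogate accuracy.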
For surrogate-assisted prescreening methods [6], [7], surrogates are used to preselect a subset of promising individuals among a number of trial offspring. Afterward, the chosen individuals are re-evaluated with the original objective function. Regarding surrogate-assisted local search [8], [9], the surrogate of the objective function and its gradient information are utilized to generate an offspring in each iteration. Finally, the resulting offspring is re-evaluated with the original objective function. In 2011, Jin [10] carried out a comprehensive survey on SAEAs.
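A prescreening step can be sketched as follows. This is a generic illustration under assumed stand-ins (`expensive_f` and a slightly biased `surrogate_f`), not the method of any cited reference: many trial offspring are generated cheaply, ranked by the surrogate, and only the few most promising ones consume expensive evaluations.

```python
import numpy as np

def expensive_f(x):
    # Stand-in for the expensive objective (sphere function).
    return float(np.sum(x ** 2))

def surrogate_f(x):
    # Cheap, slightly biased approximation of expensive_f.
    return float(np.sum(x ** 2)) + 0.01 * np.sin(10.0 * x[0])

def prescreen(parent, n_trials=20, n_keep=3, sigma=0.3, rng=None):
    """Generate trial offspring, rank them with the surrogate, and re-evaluate
    only the n_keep most promising with the expensive function."""
    if rng is None:
        rng = np.random.default_rng(0)
    trials = [parent + sigma * rng.normal(size=parent.shape)
              for _ in range(n_trials)]
    shortlist = sorted(trials, key=surrogate_f)[:n_keep]
    evaluated = [(x, expensive_f(x)) for x in shortlist]  # only n_keep expensive calls
    return min(evaluated, key=lambda t: t[1])
```

Only `n_keep` out of `n_trials` candidates ever reach the expensive function, which is the entire point of prescreening.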
Currently, for the optimal design of electric machines, population-based evolutionary algorithms are widely used, with the differential evolution (DE) method being a typical choice. In DE, after the initialization of a random population, offspring are created over multiple successive generations by differential mutation, an operation achieved by adding a scaled difference of two previous designs to a third parent design. The resulting children survive to the next generation if they achieve an improvement for all multi-objective performance indices considered as part of the optimal design problem. Only a minimal number of control parameters, namely the scaling factor and the crossover probability, are required by the DE algorithm, and the global optimum can be achieved regardless of the initial designs. Nevertheless, a major disadvantage of conventional DE is that it requires the evaluation of a very large number of generations and candidate designs, which for electrical machines are typically based on computationally expensive FE models. For example, a previous optimal design problem with five independent variables employed more than four thousand candidate designs, while using the novel algorithm proposed in the current paper this number can be
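The differential mutation and crossover described above (with the scaling factor F and crossover probability CR as the only control parameters) can be sketched as a textbook DE/rand/1/bin operator. This is a generic sketch, not the specific variant used in the paper:

```python
import numpy as np

def de_offspring(pop, i, F=0.5, CR=0.9, rng=None):
    """DE/rand/1/bin: mutant = x_r1 + F * (x_r2 - x_r3), followed by binomial
    crossover with the parent pop[i]."""
    if rng is None:
        rng = np.random.default_rng()
    n, d = pop.shape
    # Three distinct designs, all different from the parent index i.
    r1, r2, r3 = rng.choice([j for j in range(n) if j != i],
                            size=3, replace=False)
    mutant = pop[r1] + F * (pop[r2] - pop[r3])     # scaled difference vector
    cross = rng.random(d) < CR                     # binomial crossover mask
    cross[rng.integers(d)] = True                  # guarantee >= 1 mutant gene
    return np.where(cross, mutant, pop[i])
```

In the machine-design setting, each returned child would then be evaluated with the expensive FE model, which is exactly the cost the paper seeks to reduce.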
their key design choices and components in a more systematic manner. A comparative analysis was conducted on a comprehensive benchmark of bi-objective black-box numerical optimization problems from the bbob-biobj test suite. Our main findings are as follows. Firstly, it appears clearly that carefully selecting solutions for training the model is of high importance, not only to reduce the amount of data and thus accelerate the training phase, but also to improve solution quality. Secondly, training multiple models implies having an ensemble of surrogates. Combined with the design where the training set is clustered around weight vectors, surrogates are then likely to specialize to specific regions of the objective space, matching the rationale of decomposition. As such, we argue that a more systematic investigation of this design choice would allow for explicit ways of designing and training specialized local surrogates. This is perfectly in line with the overall good performance of MOEAD-RBF, which also has the particularity of producing a whole batch of solutions to be evaluated. This is certainly of high importance in terms of computational complexity: not only does this reduce the number of computationally demanding model-training tasks, it also enables the evaluation of multiple solutions at once in a parallel computing environment. The selection of the batch of solutions to be evaluated by the expensive objective functions should then carefully cope with the multiobjective nature of the problem at hand by targeting diverse regions of the Pareto front. Lastly, the behavior of approaches based on Gaussian processes makes them very attractive for an extremely expensive optimization scenario, where the number of evaluations is reduced to the minimum. Note that basic filtering approaches can be preferred when the application context can accommodate a restricted but relatively moderate budget.
Besides, the use of EI to select solutions seems to quickly inhibit the search progress, so alternative infill criteria combined with adaptive and hybrid design choices for candidate solution generation seem to be a promising future research line.
It is worth understanding the landscape characteristics of the targeted problem. Compared to normal MEMS shape optimization, high-performance MEMS optimization introduces stringent constraints and higher expectations of the optimal objective function value. To cope with that, both (very) high exploration and exploitation abilities are essential for the optimization algorithm. The reasons are: (1) Stringent constraints and a highly optimal objective function value make the optimal region (very) narrow, and the optimization algorithm must be able to search elaborately in that narrow region (i.e., a high exploitation ability). (2) To reach the optimal region, a high exploration ability is needed to jump out of local optima in the design landscape. More particularly, when several constraints are imposed, they vary in how difficult they are to satisfy; the population may be dominated in an early stage by individuals satisfying the relatively easy constraints, losing the diversity needed to satisfy the more stringent ones (e.g., Example 1 in Section IV). Hence, a high exploration ability is essential to maintain population diversity.
assumes that there are analytic functions for evaluating the objectives and constraints. In the real world, however, the objective or constraint values of many optimization problems can be evaluated solely based on data, and solving such optimization problems is often known as data-driven optimization. In this paper, we divide data-driven optimization problems into two categories, i.e., off-line and on-line data-driven optimization, and discuss the main challenges involved therein. An evolutionary algorithm is then presented to optimize the design of a trauma system, which is a typical off-line data-driven multi-objective optimization problem, where the objectives and constraints can be evaluated using incident records only. As each single function evaluation involves a large amount of patient data, we develop a multi-fidelity surrogate management strategy to reduce the computation time of the evolutionary optimization. The main idea is to adaptively tune the approximation fidelity by clustering the original data into different numbers of clusters, and a regression model is constructed to estimate the required minimum fidelity. Experimental results show that the proposed algorithm is able to save up to 90% of the computation time without much sacrifice of solution quality.
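The clustering-based fidelity idea can be illustrated generically: instead of summing a per-record cost over every data point, sum it over k cluster centroids weighted by cluster sizes, so that k controls the approximation fidelity. This is a simplified sketch of the general idea, not the paper's exact strategy; `kmeans` here is a minimal Lloyd's algorithm written for self-containedness.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Minimal Lloyd's algorithm; returns (centroids, cluster sizes)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        C = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                      for j in range(k)])
    counts = np.bincount(labels, minlength=k)
    return C, counts

def low_fidelity_objective(data, k, per_point_cost):
    """Approximate sum_i cost(data_i) by a size-weighted sum over k cluster
    centroids; a larger k gives higher fidelity at higher evaluation cost."""
    C, counts = kmeans(data, k)
    return float(sum(n * per_point_cost(c) for c, n in zip(C, counts)))
```

An adaptive scheme would start with small k and increase it when the surrogate management detects that the approximation error is too large.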
choice of control parameters that overemphasizes global search. To handle these problems, it is suggested to replace, at each iteration, large function values by the median of all computed function values (see further discussion in [24, 29]).
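The median-replacement transformation is a one-liner; the sketch below clips every stored function value above the median down to the median, which damps the influence of very large values on the surrogate:

```python
import numpy as np

def clip_to_median(fvals):
    """Replace function values above the median by the median, limiting the
    influence of very large values on the fitted model."""
    f = np.asarray(fvals, dtype=float)
    return np.minimum(f, np.median(f))
```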
Since black-box functions are often expensive to evaluate, we can use parallel computation to speed up the algorithms. The idea of parallel computing is to break the main computation task into independent tasks so that each can be executed on a separate worker computer (or processing unit) simultaneously with the others. In black-box optimization, it is relatively easy to design a parallel algorithm. At each iteration, we find different candidate points (by solving multiple auxiliary subproblems) and assign each point to one worker computer for evaluation. Parallel versions of the RBF-Gutmann and CORS-RBF methods have been implemented. While in RBF-Gutmann the set of candidate points is obtained by solving subproblems for different values of the targets T_n, in CORS-RBF they are obtained by choosing
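The dispatch pattern described above maps directly onto a worker pool. The sketch below is a generic illustration (the `expensive_black_box` function is a stand-in); threads suit simulators that run as external processes, while a `ProcessPoolExecutor` would suit CPU-bound pure-Python objectives.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def expensive_black_box(x):
    # Stand-in for an expensive simulation run.
    return sum(xi * xi for xi in x) + math.sin(x[0])

def evaluate_batch(points, max_workers=4):
    """Dispatch each candidate point to a worker; results come back in the
    same order as `points`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(expensive_black_box, points))
```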
Variable-fidelity optimization (VFO) has emerged as an attractive method of performing both high-speed and high-fidelity optimization. VFO uses computationally inexpensive low-fidelity models, complemented by a surrogate that accounts for the difference between the high- and low-fidelity models, to obtain the optimum of the function efficiently and accurately. To be effective, however, it is of prime importance that the low-fidelity model be selected prudently. This paper outlines the requirements for selecting the low-fidelity model and shows the pitfalls in case the wrong model is chosen. It then presents an efficient VFO framework and demonstrates it by performing transonic airfoil drag optimization at constant lift, subject to thickness constraints, using several low-fidelity solvers. The method is found to be efficient and capable of finding an optimum that closely agrees with the results of high-fidelity optimization alone.
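The difference-surrogate idea can be shown on a toy pair of models. This is an assumed illustration, not the paper's framework: the discrepancy d(x) = f_high(x) - f_low(x) is sampled at a few points, fitted cheaply, and added back to the low-fidelity model.

```python
import numpy as np

def f_high(x):
    # Stand-in for the expensive high-fidelity model.
    return np.sin(x) + 0.1 * x ** 2

def f_low(x):
    # Cheap low-fidelity model: captures the trend but not the details.
    return np.sin(x)

def corrected_surrogate(x_samples):
    """Fit a surrogate to the discrepancy d = f_high - f_low at a few samples;
    the VFO surrogate is then f_low(x) + d_hat(x)."""
    d = f_high(x_samples) - f_low(x_samples)      # here exactly 0.1 * x^2
    coeffs = np.polyfit(x_samples, d, deg=2)       # cheap quadratic fit
    return lambda x: f_low(x) + np.polyval(coeffs, x)
```

Because the discrepancy is usually smoother than either model, a handful of high-fidelity samples suffices to fit it, which is where the efficiency of VFO comes from.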
Figure 6.2-(a,b) shows the convergence history of the three FPGA optimizations compared to the plain DGA (black circles) on the left (full GA history) and the two MPGA optimization histories on the right. In the latter figure, only the convergence history of the best candidates for each generation is reported instead of the whole optimization history. Another important point is that, while the DGA predictions (black circles and black line) are obtained with the CFD solver, the POD-based predictions (green, red and blue circles, blue and red lines) are the surrogate ones. For example, the red circles do not indicate that FPGA1 reached objective levels significantly better than DGA, but simply that the predicted values of the airfoil performances have been strongly overestimated. The plot clearly highlights that, whatever the energy content, the full-POD approximation is not able to match the “true” data during the search process. Moreover, the general FPGA trend is to strongly underestimate the CFD prediction in terms of objective function evaluation. On the other hand, the MPGA model agreement with the CFD progress is very satisfying, both in terms of trends and accuracy. Figures 6.2-(c,d), 6.2-(e,f) and 6.2-(g,h) confirm these results, as they show the convergence history of some design variables (leading edge radius, upper A_4 and lower A_1) as
Another shortcoming of Aggregate Surrogate Models is how to resist the loss of diversity. It is emphasized that RASM might incorporate additional specific constraints in each generation. Some possible constraints are described in Figure 1-Right: such non-dominance constraints involve points on the current Pareto front, and include inequality constraints from the extremal points over their neighbors (continuous arrows), equality constraints for all neighbor pairs on the Pareto front (continuous double arrow), as well as between extremal points (dotted double arrows). Such equality constraints can be rewritten as two symmetrical inequality constraints in order to preserve the particular form of the formulation (Eq. 2). Along the same lines, constraints could be weighted, e.g. the weight of constraints related to points with the largest hypervolume contributions can be increased online. This is, however, a topic for further work.
This paper addresses multi-scenario airfoil design optimization, which is challenging due to its extremely high computational cost. To solve the problem, we first give a problem formulation for multi-scenario airfoil optimization that minimizes the area between the drag divergence boundary and the extreme point on the drag landscape. To reduce the needed number of expensive CFD simulations as much as possible, a hierarchical surrogate consisting of a KNN and a Kriging model is proposed to assist the covariance matrix adaptation evolution strategy. Our experimental results on the RAE2822 airfoil design demonstrate that the proposed algorithm is able to find an optimal design with much improved aerodynamic performance compared to the baseline design within a limited computational budget.
The landscapes of electromagnetic machines (e.g., EM actuators) are often not complex, and (multi-start) local optimization works well in many cases. Antenna and microwave device and circuit landscapes are shown to be multimodal. However, very rugged landscapes (e.g., the Rastrigin function), which tend to be discontinuous, rarely appear. This is because EM simulations are, in fact, solving Maxwell’s equations, and such a landscape is not typically generated by partial differential equations. Based on this, we use the Ellipsoid function (14) to represent simple landscapes and the Ackley function (15) to represent multimodal landscapes. The reason for selecting the Ackley function is that the more complex the landscape, the larger the speed improvement of SMAS (or most SAEAs) compared to standard EAs. To the best of our knowledge, the speed improvement of SMAS is often lower on real-world problems than on the Ackley function.
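For reference, common forms of the two benchmarks can be written down directly; since equations (14) and (15) are not reproduced here, the definitions below are the standard textbook ones and may differ from the exact variants used in the paper.

```python
import numpy as np

def ellipsoid(x):
    """Unimodal benchmark, a common form: f(x) = sum_i i * x_i^2."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.arange(1, x.size + 1) * x ** 2))

def ackley(x):
    """Multimodal benchmark with many shallow local optima; f(0) = 0."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return float(-20.0 * np.exp(-0.2 * np.sqrt(np.sum(x ** 2) / n))
                 - np.exp(np.sum(np.cos(2.0 * np.pi * x)) / n)
                 + 20.0 + np.e)
```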
In many real-world multiobjective optimization problems (MOPs), the computation time for evaluating objective functions can be substantial. For instance, evaluations involving computational fluid dynamics simulations or real physical, biological or chemical experiments can be very time-consuming (i.e. computationally or otherwise expensive). Therefore, it is desirable to obtain solutions within a few evaluations. In the last few years, surrogate-assisted approaches embedding evolutionary algorithms have been proposed for such problems, referred to as surrogate-assisted evolutionary algorithms (SAEAs) in this article. These algorithms train surrogates to replace expensive functions, efficiently select the training data and optimize the surrogate with an optimizer, e.g., an evolutionary algorithm. Examples of SAEAs for MOPs include ParEGO and K-RVEA. For more details, e.g., characteristics, advantages and limitations of SAEAs and general overviews, see, e.g., [1, 8].
b Leibniz Institute of Marine Science (IFM-GEOMAR), Marine Biogeochemistry, Biological Oceanography, Düsternbrooker Weg 20, 24105 Kiel, Germany
c Engineering Optimization & Modeling Center, School of Science and Engineering, Reykjavik University, Menntavegur 1, 101 Reykjavik, Iceland
Model calibration plays a key role in simulations and predictions of the earth’s climate system. Calibration is normally formulated as an inverse problem, where a set of control parameters is to be found so that the model fits given measurement data. Straightforward calibration attempts by direct adjustment of the high-fidelity (or fine) model parameters using conventional optimization algorithms are often tedious or even infeasible, as they normally require a large number of simulations. The development of faster methods becomes critical, particularly for models that are computationally expensive. The optimization of coupled marine ecosystem models simulating biogeochemical processes in the ocean is a representative example. In this paper, we introduce a surrogate-based optimization (SBO) methodology where the calibration of the expensive fine model is carried out by means of a surrogate: a fast and yet reasonably accurate representation of it. As a case study, we consider a representative of the class of one-dimensional marine ecosystem models. The surrogate is obtained from a temporally coarser-discretized physics-based low-fidelity (or coarse) model and a multiplicative response correction technique. In our previous work, a basic formulation of this surrogate was sufficient to create a reliable approximation, yielding a remarkably accurate solution at low computational cost. This was verified with model-generated attainable data. The application to real (measurement) data is covered in this paper. Enhancements of the basic formulation, additionally utilizing fine and coarse model sensitivity information as well as trust-region convergence safeguards, allow us to further improve the robustness of the algorithm and the accuracy of the solution. The trade-offs between the solution accuracy and the extra computational overhead related to the sensitivity calculation are also addressed.
We demonstrate that SBO is able to yield a very accurate solution at low computational cost, with time savings of up to 85 percent compared to direct fine model optimization.
Slawomir Koziel received the M.Sc. and Ph.D. degrees in electronic engineering from Gdansk University of Technology, Poland, in 1995 and 2000, respectively, and the M.Sc. degrees in theoretical physics and in mathematics, in 2000 and 2002, and the Ph.D. in mathematics in 2003, from the University of Gdansk, Poland. He is currently a Professor with the School of Science and Engineering, Reykjavik University, Iceland. His research interests include CAD and modeling of microwave circuits, simulation-driven design, surrogate-based optimization, space mapping, circuit theory, analog signal processing, evolutionary computation and numerical analysis.
T-MTT Manuscript, LAMC-2016 Mini-Special Issue < Mo-4A-2> 4
The multilayer perceptron (MLP) is a feedforward network and one of the most widely used ANN topologies. MLPs can use different kinds of activation functions with different approximation properties, including sigmoidal, hyperbolic tangent, Gaussian, piece-wise linear, etc. In this work, we use hyperbolic tangent activation functions. The number of neurons (h) in the hidden layer depends on the required complexity of the ANN, and its final value is defined based on the ANN’s generalization performance. In this work, the 3-layer perceptron (3LP) ANN is trained using the Bayesian regularization training available in the MATLAB Neural Network Toolbox. We start training the 3LP ANN with h = 1, calculating the corresponding learning and testing errors. We keep increasing the complexity of the ANN (the number of hidden neurons h) until the current testing error is larger than the previous one and the current learning error is smaller than the current testing error. A detailed formulation of the ANN can be found in the cited literature.
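The neuron-growth stopping rule described above can be sketched independently of the training backend. The helper below is hypothetical (the paper uses MATLAB's Bayesian regularization; here `train_model` is an injected callback returning the two errors for a given h), but the stopping logic matches the rule stated in the text.

```python
def select_hidden_neurons(train_model, h_max=20):
    """Grow the hidden layer one neuron at a time; stop when the testing error
    rises above the previous one while the learning error stays below the
    current testing error, and keep the previous size.
    `train_model(h)` must return (learning_error, testing_error)."""
    prev_test = float("inf")
    best_h = 1
    for h in range(1, h_max + 1):
        learn_err, test_err = train_model(h)
        if test_err > prev_test and learn_err < test_err:
            break                      # overfitting detected: stop growing
        best_h, prev_test = h, test_err
    return best_h
```

Any trainer can be plugged in as `train_model`, e.g. one fitting a tanh MLP with h hidden neurons and reporting train/test errors.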
cally, a forward dynamic simulation of movement contains thousands of time steps, and an iterative movement optimization may require thousands of such movement simulations until the optimal movement is found. This adds up to millions of FE simulations, which would thus require enormous computational resources in order to solve just one movement optimization problem. In order to obtain solutions, modelers typically focus on one of the modeling domains, while simplifying the other. For instance, surface-surface penetration has been used within multibody dynamics to compute reaction loads in the knee or between the foot and the ground. This is a good approximation when tissue deformation is limited to a surface layer, but it is not generally applicable. Under conditions where the analysis requires iterative simulations of a computationally expensive model, surrogate modeling is often employed. In general, surrogate modeling approaches can be classified as global or local methods. Global methods fit a statistical regression model to a defined set of input/output sets. Accuracy of a global method depends on the number of available data sets and the goodness of fit of the approximation over the whole domain. Examples include response surface techniques and neural networks. Lin et al. developed a response surface approximation of knee joint contact mechanics and demonstrated its feasibility for potential use in optimization routines. This promising work showed a significant reduction in computational cost associated with the use of the surrogate model but requires an a priori estimate of input data ranges for response surface fitting. In addition, for higher dimensional input spaces, response surface approximations of complicated or highly nonlinear behavior are difficult to capture with a low-order polynomial or other function approximators.
User input would also be required to produce a new approximating function whenever the underlying model is changed or updated, such as for patient-specific models of joint contact or soft tissue restraint. Local methods use a set of neighboring points only and include locally weighted regression, spline fitting, or radial basis functions. Lazy learning is one form of locally weighted regression based on linear or polynomial fits to neighboring points. It is particularly attractive because it retains all the original data and can provide error estimates to drive an adaptive sampling scheme for generating additional data. This allows unimportant areas of the domain space to be avoided, while highly nonlinear areas can be densely sampled to accurately describe the response.
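A lazy-learning prediction of the kind described above can be sketched as a distance-weighted linear fit to the k nearest stored samples, with the spread of neighbor values serving as a crude error estimate for adaptive sampling. This is a generic illustration, not the cited lazy-learning implementation:

```python
import numpy as np

def lazy_predict(Xq, X, y, k=5):
    """Locally weighted linear (affine) fit to the k nearest samples of the
    query point Xq; returns (prediction, crude error estimate)."""
    d = np.linalg.norm(X - Xq, axis=1)
    idx = np.argsort(d)[:k]                       # k nearest stored samples
    w = 1.0 / (d[idx] + 1e-9)                     # inverse-distance weights
    A = np.hstack([X[idx], np.ones((k, 1))])      # affine design matrix
    W = np.diag(w)
    beta, *_ = np.linalg.lstsq(W @ A, W @ y[idx], rcond=None)
    pred = float(np.append(Xq, 1.0) @ beta)
    err = float(np.std(y[idx]))                   # spread drives adaptive sampling
    return pred, err
```

Since all original samples are retained, new data points simply extend `X` and `y`; regions where `err` is large are candidates for denser sampling.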
In multi-objective optimization, there have been many methods that can transform multiple objectives into a single objective. Among them, the weighting function-based method is the most intuitive and widely used one. In this paper, we assign higher weights to the outputs with larger errors. In the research of Liu et al. (2005), the RMSE of each output was normalized by the RMSE of the default parameter set, and each normalized RMSE was assigned an equal weight. Van Griensven and Meixner (2007) developed a weighting system based on Bayesian statistics to define “high-probability regions” that can give “good” results for multiple outputs. However, both Liu et al. (2005) and van Griensven and Meixner (2007) tended to assign higher weights to the outputs with lower RMSE, and lower weights to the outputs with higher RMSE. This tendency, although reasonable in a probabilistic sense, conflicts with our intuitive motivation to emphasize the poorly simulated outputs with large RMSE. Jackson et al. (2003) assumed Gaussian error in the data and model, so that the outputs were in a joint Gaussian distribution, and the multi-objective “cost function” was defined on the joint Gaussian distribution of multiple outputs. In Gupta et al. (1998), a multiple weighting function method is proposed to fully describe the Pareto frontier, if the frontier is convex and model simulation is cheap enough. If one output is more important than others, a higher weight should be assigned to it. Marler and Arora (2010) reviewed the applications, conceptual significance and pitfalls
surrogates have been used to solve problems with up to eight decision variables. This can be attributed to the fact that the computational time for training the Kriging model becomes too long when the number of training samples increases. This paper focuses on developing an efficient SAEA for solving computationally expensive many-objective optimization problems. One of the major reasons limiting the applicability of existing algorithms to many-objective optimization is the lack of an efficient surrogate management method suited to the evolutionary algorithm used. In SAEAs, when managing the surrogates, individuals should be selected by taking into account both convergence and diversity. To select such individuals, surrogates need to be seamlessly embedded into the evolutionary algorithm. Most existing SAEAs are dominance based and thus are not well suited for handling many objectives. Therefore, the major contribution of this paper is an efficient algorithm to manage the surrogates when handling a large number of objectives. To this end, we adopt the reference vector guided evolutionary algorithm (RVEA) for many-objective optimization as the underlying evolutionary algorithm. Two sets of reference vectors, adaptive and fixed, together with uncertainty information from the Kriging models as well as the location of the individuals, are exploited for surrogate management. To limit the computation time for training the Kriging models, a strategy for choosing training samples is proposed so that the maximum number of training samples is fixed.