Towards Single- and Multiobjective Bayesian Global Optimization for Mixed Integer Problems

Academic year: 2020

AIP Conference Proceedings 2070, 020044 (2019); https://doi.org/10.1063/1.5090011. © 2019 Author(s).

Cite as: AIP Conference Proceedings 2070, 020044 (2019); https://doi.org/10.1063/1.5090011

Published Online: 12 February 2019

Kaifeng Yang, Koen van der Blom, Thomas Bäck, and Michael Emmerich

Kaifeng Yang a), Koen van der Blom b), Thomas Bäck c) and Michael Emmerich d)

LIACS, Leiden University, Niels Bohrweg 1, 2333CA Leiden, Netherlands

a)Corresponding author: [email protected] b)[email protected]

c)[email protected] d)[email protected]

Abstract. Bayesian Global Optimization (BGO) is an efficient technique for optimizing problems with expensive evaluations. However, the application domain of BGO algorithms is limited to continuous search spaces. To solve mixed integer problems with a BGO algorithm, this paper adapts the heterogeneous distance function to construct the Kriging models and applies these new Kriging models in Multi-objective Bayesian Global Optimization (MOBGO). The proposed mixed integer MOBGO algorithm and the traditional MOBGO algorithm are compared on three mixed integer multi-objective optimization problems (MOPs), w.r.t. the mean value of the hypervolume (HV) and the related standard deviation.

ALGORITHM

Bayesian Global Optimization was proposed by the Lithuanian research group of Jonas Mockus and Antanas Žilinskas [1, 2]. The basic idea of BGO, also known as Efficient Global Optimization, is to build a statistical model that reflects the relationship between decision vectors $X = (x^{(1)}, x^{(2)}, \cdots, x^{(n)})^\top$ in an $m$-dimensional search space and their corresponding objective values $Y(X) = (y(x^{(1)}), y(x^{(2)}), \cdots, y(x^{(n)}))^\top$. An optimizer in a BGO algorithm then searches for an optimal solution by using the predictions of the surrogate model(s), instead of evaluating the real objective function.

To construct a surrogate model, it is assumed that the objective function is the realization of a Gaussian random field, which is also called a Gaussian process (GP) or Kriging in BGO. Specifically, Kriging assumes $y$ to be a realization of a random process $Y$ of the form [3]:

$$Y(x) = \mu(x) + \epsilon(x) \tag{1}$$

where $\mu(x)$ is the estimated mean value over all given sampled points, and $\epsilon(x)$ is a realization of a Gaussian random process with zero mean and variance $\sigma^2$. The correlation between the deviations at two points ($x$ and $x'$) using an isotropic Gaussian kernel is defined as:

$$\mathrm{Corr}[\epsilon(x), \epsilon(x')] = R(x, x') = \exp\left(-\theta\, d(x, x')^2\right) \quad (\theta \ge 0) \tag{2}$$

where $R(\cdot,\cdot)$ is the correlation function, and $d(\cdot,\cdot)$ is the Euclidean distance function.

In Kriging, the distance value between two points is usually calculated with the Euclidean metric, that is $d(x, x') = \sqrt{\sum_{i=1}^{m} (x_i - x'_i)^2}$. This is useful and straightforward in a continuous search space, as the Euclidean metric assumes the continuity of an objective function. However, this assumption is not suitable when the search space is a nominal discrete or an integer space.

To solve this problem, we applied the heterogeneous metric [4, 5] as the distance function, which is defined as:

$$d_h(x, x') = \sqrt{\sum_{i=1}^{n_r} (r_i - r'_i)^2 + \sum_{i=1}^{n_z} |z_i - z'_i| + \sum_{i=1}^{n_d} I(d_i \neq d'_i)} \tag{3}$$

where $I(\mathrm{true}) = 1$ and $I(\mathrm{false}) = 0$. The idea of the heterogeneous metric is to combine different metrics by taking the square root of the sum of each distance. As shown in Eq. (3), the distance between any two points in a nominal discrete space and an integer space is measured by the overlap metric and the Manhattan metric, respectively. A detailed motivation of the heterogeneous metric in the context of statistical modeling is provided in [5].
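The heterogeneous metric of Eq. (3) and its use inside the Gaussian correlation of Eq. (2) can be sketched as follows. This is a minimal illustration and not the authors' MATLAB implementation: the function names, the representation of a point as a tuple `(r, z, d)`, and the scalar default `theta = 0.01` are assumptions made here.

```python
import math

def heterogeneous_distance(x, x_prime):
    """Eq. (3). Each point is a tuple (r, z, d) of real, integer,
    and nominal discrete components (hypothetical representation)."""
    (r, z, d), (r2, z2, d2) = x, x_prime
    real_part = sum((a - b) ** 2 for a, b in zip(r, r2))     # squared Euclidean term
    integer_part = sum(abs(a - b) for a, b in zip(z, z2))    # Manhattan term
    nominal_part = sum(1 for a, b in zip(d, d2) if a != b)   # overlap term
    return math.sqrt(real_part + integer_part + nominal_part)

def gaussian_correlation(x, x_prime, theta=0.01):
    """Eq. (2) with the Euclidean distance replaced by the heterogeneous metric."""
    return math.exp(-theta * heterogeneous_distance(x, x_prime) ** 2)
```

For instance, for `a = ([0.0, 1.0], [1, 2], ["red", "blue"])` and `b = ([3.0, 5.0], [4, 9], ["red", "green"])` the three terms are 25, 10, and 1, so the distance is 6.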

TEST PROBLEMS

Three mixed integer MOPs are used in this paper, specifically: the double sphere function in Eq. (4) [5], the double barrier function in Eq. (5) [6], and a multi-layer optical filter optimization problem [7].

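$$\begin{aligned}
f_{sphere1}(r, z, d) &= \sum_{i=1}^{n_r} r_i^2 + \sum_{i=1}^{n_z} z_i^2 + \sum_{i=1}^{n_d} d_i^2 \\
f_{sphere2}(r, z, d) &= \sum_{i=1}^{n_r} (r_i - 2)^2 + \sum_{i=1}^{n_z} (z_i - 2)^2 + \sum_{i=1}^{n_d} (d_i - 2)^2
\end{aligned} \tag{4}$$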

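$$\begin{aligned}
f_{barrier1}(r, z, d) &= \sum_{i=1}^{n_r} \left( r_i^2 + \alpha \sin(r_i)^2 \right) + \sum_{i=1}^{n_z} A[z_i]^2 + \sum_{i=1}^{n_d} B_i[d_i]^2 \\
f_{barrier2}(r, z, d) &= \sum_{i=1}^{n_r} \left( (r_i - 2)^2 + \alpha \sin(r_i - 2)^2 \right) + \sum_{i=1}^{n_z} (A[z_i] - 2)^2 + \sum_{i=1}^{n_d} (B_i[d_i] - 2)^2
\end{aligned} \tag{5}$$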

In Eq. (4) and Eq. (5), $r$, $z$, and $d$ represent a real variable vector, an integer variable vector, and a nominal discrete variable vector, respectively.
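As an illustration of Eq. (4), the double sphere function can be written in a few lines. This is a sketch under the assumption that the nominal discrete levels are encoded as the integers 0 to 4, matching the ranges in Table 1; the function name `f_sphere` is chosen here for illustration.

```python
def f_sphere(r, z, d):
    """Double sphere function of Eq. (4); returns (f_sphere1, f_sphere2).

    r: real values, z: integer values, d: nominal discrete levels
    encoded as integers (an assumption of this sketch).
    """
    values = (*r, *z, *d)
    f1 = sum(v ** 2 for v in values)          # optimum at all variables = 0
    f2 = sum((v - 2) ** 2 for v in values)    # optimum at all variables = 2
    return f1, f2
```

With 5 variables of each type, setting every variable to 0 gives the pair (0, 60), and setting every variable to 2 gives (60, 0), which shows the conflict between the two objectives.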

For the barrier function, $\alpha = 1$, $A$ is generated by Algorithm 6 from Li et al. (2013) [6] with the parameter $C = 75$, and $B_i$, $i \in \{1, \dots, n_d\}$, is a set of $n_d$ random permutations of the sequence $0, \dots, 4$. Both $A$ and $B$ remain fixed throughout the experiments. Unlike in [6], here smooth wave-like barriers are used in the real part, rather than staircase-like barriers. For the multi-objective case both of these functions can be adjusted with an offset for each term, such that real, integer, and nominal discrete optima are different in the second objective.

For the optical filter problem [7], pairs of continuous and (binary) nominal discrete variables are considered. When the binary variable is active, the corresponding continuous variable is used in the objective functions; otherwise it is ignored. If all bits are inactive, a penalty of (250, 1250) is returned. In addition to the original objective, we consider $f_{optfilt2}(r, d) = \sum_{i=1}^{n_r} r_i d_i$ as a second objective. The parameters of each test problem are given in Table 1.

RESULTS

All the experiments are performed on the same computer: Intel(R) i7-4800mq CPU @ 2.70GHz, RAM 32GB. The operating system is Ubuntu 16.04 LTS (64 bit), and the platform is MATLAB 8.4.0.150421 (R2014b), 64 bit.

Figure 1 shows a heatmap comparison of the predictions obtained with different distance metrics for the barrier function in Eq. (5). For the visualization, only one variable is considered for each variable type. The parameter $\theta$ in Eq. (2) is chosen as $\theta = [0.01, 0.01, 0.01]$. Each variable is normalized to [0, 1] in Figure 1. The number of sampling points is 15, and 200 random points serve as test points. Note that the pictures in the third column do not contain sampling points; they are plotted with these 200 test points, which are evaluated on the real objective functions and are used to visualize the landscape of the barrier function. In these figures, the models using the heterogeneous metric are more accurate than those using the Euclidean metric, especially for the $f_{barrier1}$ function.

Figure 2 compares the predictions of the sphere functions using the Euclidean metric and the heterogeneous metric. There are 15 variables in total, 5 variables of each type. The number of sampling points is 90 and the number of test points is 200. The optimal $\theta$ values in the Kriging models are calculated with the simplex search method of Lagarias et al. (fminsearch) [8], with a maximum of 1000 function evaluations. In Figure 2, $\hat{y}$ and $\sigma$ represent the predicted mean and the predicted standard deviation, respectively. After applying the optimal $\theta$ strategy for both metrics, the prediction using the Euclidean metric is similar to that using the heterogeneous metric. However, the Kriging models using the heterogeneous metric still outperform the Kriging models using the Euclidean metric w.r.t. the standard deviation of the predictions.
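To make the predicted mean $\hat{y}$ and standard deviation $\sigma$ concrete, the following pure-Python sketch implements the ordinary Kriging predictor in the form given by Jones et al. [3], with the correlation of Eq. (2) and a pluggable distance function. It is not the MATLAB code used in the experiments: the helper names `solve` and `kriging_predict`, the scalar `theta`, and the use of plain Gaussian elimination are assumptions made for illustration.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for j in range(col, n + 1):
                m[k][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def kriging_predict(samples, y, x_new, dist, theta):
    """Return (y_hat, sigma) at x_new for a constant-mean Kriging model."""
    def corr(a, b):
        return math.exp(-theta * dist(a, b) ** 2)      # Eq. (2)

    n = len(samples)
    R = [[corr(a, b) for b in samples] for a in samples]
    Ri_1 = solve(R, [1.0] * n)
    mu = sum(solve(R, y)) / sum(Ri_1)                  # estimated mean
    resid = [yi - mu for yi in y]
    Ri_res = solve(R, resid)
    sigma2 = sum(a * b for a, b in zip(resid, Ri_res)) / n   # process variance
    r = [corr(x_new, s) for s in samples]
    Ri_r = solve(R, r)
    y_hat = mu + sum(ri * w for ri, w in zip(r, Ri_res))
    s2 = sigma2 * (1.0 - sum(a * b for a, b in zip(r, Ri_r))
                   + (1.0 - sum(Ri_r)) ** 2 / sum(Ri_1))
    return y_hat, math.sqrt(max(s2, 0.0))
```

Passing the Euclidean distance or the heterogeneous distance of Eq. (3) as `dist` gives the two model variants compared in this section. At a sampled point the predictor reproduces the observed value with a standard deviation of approximately zero, while far from all samples the prediction falls back to the estimated mean.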

FIGURE 1. Landscape of the $f_{mbarrier}$ function

For both algorithms, the optimal $\theta$ calculated by fminsearch [8] is used, the infill criterion is the Expected Hypervolume Improvement [9], and the optimizer is a built-in genetic algorithm [10] in MATLAB. The results in Table 1 show that the mixed integer MOBGO using the heterogeneous metric outperforms the traditional MOBGO on both the barrier and sphere test problems, w.r.t. mean hypervolume (HV) values and the standard deviations of the HV over 10 repetitions. For the optical filter problem, the traditional MOBGO can generate a better Pareto front set, but the mixed integer MOBGO is more robust than the traditional MOBGO w.r.t. standard deviation.

TABLE 1. Parameter settings and empirical experimental results

| Problem | HV Eucl. mean | HV Eucl. std. | HV Hete. mean | HV Hete. std. | ref. | $n_r$ | range | $n_z$ | range | $n_d$ | range |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $f_{sphere}$ | 8.4644e+03 | 98.5064 | 8.5446e+03 | 18.3754 | [100,100] | 5 | [0,4] | 5 | [0,4] | 5 | [0,4] |
| $f_{barrier}$ | 6.4769e+03 | 175.5024 | 6.6927e+03 | 55.9169 | [100,100] | 5 | [0,4] | 5 | [0,4] | 5 | [0,4] |
| $f_{optfilt}$ | 1.5469e+04 | 27.4487 | 1.5456e+04 | 6.3562 | [50,800] | 7 | [0,1] | N/A | N/A | 7 | [0,1] |
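The HV values in Table 1 are measured relative to the reference points listed in the "ref." column. For two objectives, the hypervolume of a point set can be computed with a simple sweep, sketched below under the assumption that both objectives are minimized. The function name `hypervolume_2d` is illustrative; this is not the routine used to produce the reported numbers.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref`, both objectives minimized."""
    # Only points that are strictly better than the reference point contribute.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:                 # sweep in ascending order of the first objective
        if f2 < best_f2:               # point is non-dominated among those seen so far
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv
```

For example, the front (1, 3), (2, 2), (3, 1) with reference point (4, 4) has a hypervolume of 6, and adding the dominated point (3, 3) leaves the value unchanged.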

CONCLUSIONS AND FUTURE WORK

This paper proposes a mixed integer MOBGO that uses the heterogeneous metric instead of the Euclidean metric. The two algorithms are compared in terms of the prediction errors and the performance of the final Pareto front sets on three mixed integer multi-objective test problems. The results show that the proposed method surpasses the traditional MOBGO on both the sphere and barrier test functions, with regard to the prediction errors, the mean HV, and the standard deviation of the HV. On the optical filter problem, the traditional MOBGO performs slightly better than the proposed algorithm. The reason for this could be that $N_{max}$ is too small, as this problem contains 14 variables.

FIGURE 2. Comparison of predictions on $f_{msphere}$ functions

REFERENCES

[1] A. Žilinskas and J. Mockus, Avtomatica i Vychislitel'naya Teknika 4, 42–44 (1972).

[2] J. Močkus, "On bayesian methods for seeking the extremum," in Optimization Techniques IFIP Technical Conference Novosibirsk, July 1–7, 1974, edited by G. I. Marchuk (Springer Berlin Heidelberg, Berlin, Heidelberg, 1975), pp. 400–404.

[3] D. R. Jones, M. Schonlau, and W. J. Welch, Journal of Global Optimization 13, 455–492 (1998).

[4] D. R. Wilson and T. R. Martinez, Journal of Artificial Intelligence Research 6, 1–34 (1997).

[5] R. Li, M. T. M. Emmerich, J. Eggermont, E. G. P. Bovenkamp, T. Back, J. Dijkstra, and J. H. C. Reiber, "Metamodel-assisted mixed integer evolution strategies and their application to intravascular ultrasound image analysis," in 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence) (2008), pp. 2764–2771.

[6] R. Li, M. T. M. Emmerich, J. Eggermont, T. Bäck, M. Schütz, J. Dijkstra, and J. H. C. Reiber, Evolutionary Computation 21, 29–64 (2013).

[7] J.-M. Yang and C.-Y. Kao, "An evolutionary algorithm for synthesizing optical thin-film designs," in Parallel Problem Solving from Nature – PPSN V, edited by A. E. Eiben, T. Bäck, M. Schoenauer, and H.-P. Schwefel (Springer Berlin Heidelberg, Berlin, Heidelberg, 1998), pp. 947–956.

[8] J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright, SIAM Journal on Optimization 9, 112–147 (1998), https://doi.org/10.1137/S1052623496303470.

[9] M. Emmerich, K. Yang, A. Deutz, H. Wang, and C. M. Fonseca, in Advances in Stochastic and Deterministic Global Optimization, edited by P. M. Pardalos, A. Zhigljavsky, and J. Žilinskas (Springer, Berlin, Heidelberg, 2016), pp. 229–243.

[10] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, 1st ed. (Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1989).
