Genetic algorithm application - Performance assessment of Surrogate model integrated with sensi

in the third-order matrix, which is a three-dimensional array, entries that characterize second-order self-interactions (e.g. β112) are divided by 3, since each appears three times, while entries for mixed interactions (e.g. β123) are divided by 6, since each appears six times; entries of pure self-interaction (e.g. β111), instead, appear only once, on the main diagonal.
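The division rule above can be illustrated with a minimal sketch. It assumes the unique third-order coefficients are stored in a dictionary keyed by sorted, 0-based index triples; the function names and the numerical coefficients are hypothetical and are not taken from the thesis code.

```python
from itertools import permutations


def symmetric_cubic_tensor(n, beta3):
    """Spread each unique third-order coefficient over its symmetric positions.

    beta3 maps a sorted 0-based index triple (i, j, k) to its coefficient.
    """
    T = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for idx, coeff in beta3.items():
        # 1 position for (i, i, i), 3 for (i, i, j), 6 for (i, j, k)
        positions = set(permutations(idx))
        for i, j, k in positions:
            T[i][j][k] = coeff / len(positions)
    return T


def cubic_term(T, x):
    """Contract the tensor with x three times: sum_ijk T[i][j][k] x_i x_j x_k."""
    n = len(x)
    return sum(T[i][j][k] * x[i] * x[j] * x[k]
               for i in range(n) for j in range(n) for k in range(n))


# Hypothetical coefficients, chosen only for illustration:
# beta_111 = 2, beta_112 = 3, beta_123 = 6.
beta3 = {(0, 0, 0): 2.0, (0, 0, 1): 3.0, (0, 1, 2): 6.0}
T = symmetric_cubic_tensor(3, beta3)
x = [1.0, 2.0, 3.0]

# The same cubic contribution written as an explicit polynomial.
direct = 2.0 * x[0] ** 3 + 3.0 * x[0] ** 2 * x[1] + 6.0 * x[0] * x[1] * x[2]
tensor_value = cubic_term(T, x)
```

Because every coefficient is split evenly over its permutations, the full tensor contraction reproduces the polynomial written with the unique coefficients only.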

Unlike the sensitivity analysis, the code for the Response Surface Methodology was developed from scratch in all its routines and functions, with the sole exception of the quasi-random Sobol’ sequence used for the sampling. The main feature that distinguishes this code from those in the literature is its treatment of non-modelled variables: it does not improve the surface fit, but it lets the surrogate model reach much better optimization results in certain cases, as described in chapter 7.

6.3 Genetic algorithm application

As stated in the introduction of this chapter, the evolutionary algorithm chosen to search for the optimum on the response surface is the second version of the Non-dominated Sorting Genetic Algorithm (NSGA-II, 3.2), which involves elitism. This choice was motivated by the considerable simplification achieved in the previous phase of the model: response surfaces are often described by few decision variables and do not require specific features, so the optimum search becomes much easier than on the original functions and does not pose particular problems. NSGA-II behaves well on various kinds of functions and, moreover, is recognized as a standard approach in the scientific community.
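As a reminder of the two ideas named above, the following minimal sketch shows non-dominated sorting and an elitist survivor selection for a minimisation problem. It is a simplified illustration, not the MATLAB implementation used in this work: the sorting is the naive version rather than the fast procedure of NSGA-II, the crowding-distance tie-break on the last front is omitted, and all function names and points are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimisation)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points):
    """Return a list of fronts (lists of indices), best front first."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def elitist_selection(parents, offspring, size):
    """Keep the best `size` individuals from parents and offspring together.

    Simplification: the last front is truncated in index order, whereas
    NSGA-II uses the crowding distance to choose among its members.
    """
    pool = parents + offspring
    survivors = []
    for front in non_dominated_sort(pool):
        for i in front:
            if len(survivors) < size:
                survivors.append(pool[i])
    return survivors


# Hypothetical two-objective values, for illustration only.
points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = non_dominated_sort(points)
survivors = elitist_selection([(1, 5), (3, 4)], [(2, 3), (4, 1), (5, 5)], 3)
```

Merging parents and offspring before the selection is what makes the scheme elitist: a non-dominated parent can only be displaced by solutions that are at least as good.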

The parameter settings of the algorithm are quite similar to those used in [7, 62], except for the recombination and mutation coefficients and similar settings, which were left at the values set in the chosen algorithm. The population size is set to 100 individuals; the maximum number of generations depends on the problem and is reported in table 6.5 below. The function tolerance, which compares the objective function values across consecutive generations and stops the algorithm when no further improvement is achieved, is set to 10⁻⁴.
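The interplay between the maximum number of generations and the function tolerance can be sketched as follows. This is only an illustration under stated assumptions: the progress measure, the averaging window of 5 generations and the relative-change formula are hypothetical choices, and the exact criterion of the MATLAB routine may differ.

```python
def should_stop(history, tol=1e-4, window=5):
    """Stop when the mean relative change of the progress measure over the
    last `window` generations falls below `tol`.

    history: one scalar progress measure per generation (assumed here).
    """
    if len(history) <= window:
        return False
    recent = history[-(window + 1):]
    changes = [abs(b - a) / max(1.0, abs(a))
               for a, b in zip(recent, recent[1:])]
    return sum(changes) / window < tol


def run(max_generations, measure):
    """Loop over generations; return the index of the last one executed."""
    history = []
    for generation in range(max_generations):
        history.append(measure(generation))  # stand-in for one GA generation
        if should_stop(history):
            return generation
    return max_generations - 1


# Hypothetical progress measure that converges geometrically to 1.
stopped_at = run(40, lambda g: 1.0 + 0.5 ** g)
```

With a measure that flattens quickly, the loop ends well before the generation limit, which is the situation described for the response surfaces.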

The method often reaches the Pareto front before the maximum number of generations and therefore stops because of the function tolerance; this further highlights the capabilities of the model, which builds an extremely simple surrogate that is nonetheless effective and suited to the optimum search.

This last step of the model was implemented in MATLAB, using both newly developed and existing routines. The main code, the NSGA-II genetic algorithm, was already implemented in the software and did not need any further development. Nevertheless, its code was carefully studied to verify the described properties and its correspondence with the description of NSGA and elitism in 3.2, as well as to understand how to define the population size and the other parameters. Moreover, MATLAB had to read the Python results, which required translating the stored outcomes, namely the β coefficients. Once these were defined, the surface function was built independently of the number of factors involved, the number of objective functions and the surface order; the β values were organized in matrix form, because this yields the simplest possible behaviour with respect to the input vector. For the subsequent metric analysis of the results and to retrieve the Pareto fronts, functions developed by the authors of [62] were used. To compare the obtained results, the datasets from the same article were available; once the useful data had been retrieved, routines were developed to carry out several analyses, such as producing the information needed to build the box plots.

| Problem | Max. number of generations |
|---------|----------------------------|
| ZDT1    | 20                         |
| ZDT2    | 20                         |
| ZDT3    | 40                         |
| ZDT4    | 40                         |
| ZDT6    | 20                         |
| DTLZ1   | 100                        |
| DTLZ2   | 70                         |
| DTLZ3   | 80                         |
| DTLZ5   | 80                         |
| DTLZ6   | 50                         |
| DTLZ7   | 100                        |

Table 6.5: Maximum number of generations for each test problem.
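The matrix organization of the β values can be illustrated with a minimal second-order sketch. It is written in Python for consistency with the other examples, although the corresponding step of this work was carried out in MATLAB; the function names, the dictionary layout of the coefficients and the numerical values are hypothetical.

```python
def symmetric_matrix(n, beta2):
    """Build the symmetric second-order matrix B.

    beta2 maps (i, j) with i <= j (0-based) to a coefficient; each
    off-diagonal coefficient is split in half between B[i][j] and B[j][i].
    """
    B = [[0.0] * n for _ in range(n)]
    for (i, j), coeff in beta2.items():
        if i == j:
            B[i][i] = coeff
        else:
            B[i][j] = B[j][i] = coeff / 2.0
    return B


def quadratic_surface(beta0, b, B):
    """Return f(x) = beta0 + b.x + x^T B x, valid for any number of factors."""
    def f(x):
        n = len(x)
        linear = sum(b[i] * x[i] for i in range(n))
        quadratic = sum(x[i] * B[i][j] * x[j]
                        for i in range(n) for j in range(n))
        return beta0 + linear + quadratic
    return f


# Hypothetical coefficients for a two-factor surface:
# y = 1 + 2*x1 - x2 + 0.5*x1^2 + 4*x1*x2 - 3*x2^2
B = symmetric_matrix(2, {(0, 0): 0.5, (0, 1): 4.0, (1, 1): -3.0})
surface = quadratic_surface(1.0, [2.0, -1.0], B)
value = surface([2.0, 1.0])
```

The evaluation code never refers to individual coefficients, so the same function serves any number of factors once the vector and the matrix have been filled.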

Chapter 7 Result and comparison on Test Function

The analysis of the results on the test functions reported in appendix A is described in the following. First, as already mentioned, both the two- and three-objective problems behave in a peculiar way with respect to the variables from the second and from the third one onwards, respectively. In fact, these problems reach the Pareto front when all the so-called secondary variables take the same value: 0 in most of the two-objective test functions and 0.5, the mid-point of the domain, in the three-objective tests. With the response surface built, the optimal conditions can be reached quite effectively, obtaining good fits of the functions at lower cost. Therefore, these test problems do not fully evaluate the performance of the surrogate model. The literature was searched again for further test functions, but no better problem was found. Several tests were found grouped in the PlatEMO package for MATLAB [60]¹. However, none of them showed different characteristics or specific behaviours with the Response Surface Methodology developed. So, only the aforementioned problems were used to test the surrogate, despite the highlighted issues.
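The role of the secondary variables can be checked on ZDT1. The sketch below assumes the standard definition of ZDT1 from the literature, whose notation may differ from the one in appendix A, and uses 30 variables, a common choice for this problem; the sample points are arbitrary.

```python
import math


def zdt1(x):
    """Standard ZDT1 for x in [0, 1]^n with n >= 2; returns (f1, f2)."""
    n = len(x)
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (n - 1)   # depends on secondary variables only
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2


n = 30
# Secondary variables at 0: g = 1 and the point lies on f2 = 1 - sqrt(f1).
on_front = zdt1([0.25] + [0.0] * (n - 1))
# Same first variable, secondary variables at 0.1: f2 gets worse.
off_front = zdt1([0.25] + [0.1] * (n - 1))
```

With all secondary variables at 0 the second objective depends on the first variable alone, which is why a surface built on that single factor can already reach the front.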

These functions make it possible to study the behaviour of the response surface under the evolutionary search of the genetic algorithm. To further investigate the actual fitting quality of the method, chapter 8 presents an analysis on a real dataset.

The results on the test functions are described step by step below, covering all the choices made, from the sensitivity analysis to the genetic algorithm.

7.1 Sensitivity analysis results

Sensitivity analysis, in particular the variance-based method, shows that almost the entire variability of the test functions is produced by the first decision variable alone in two-objective problems and by the combination of the first and second decision variables in three-objective problems. Moreover, the first objective function of most two-objective problems is defined in terms of just one variable, as in A.1, so it can often be omitted from the sensitivity analysis because its results are obvious. Three-objective tests, instead, present functions which often involve several factors simultaneously, except in DTLZ7 (A.2).

| Variable | ZDT3 S1 | ZDT3 STot | DTLZ2 S1 | DTLZ2 STot |
|----------|---------|-----------|----------|------------|
| x1       | 0.9109  | 0.9050    | 0.4549   | 0.5775     |
| x2       | 0.0007  | 0.0011    | 0.4245   | 0.5492     |
| x3       | 0.0011  | 0.0011    | -0.0015  | 0.0022     |
| x4       | 0.0011  | 0.0011    | -0.0046  | 0.0022     |
| x5       | 0.0011  | 0.0011    | -0.0014  | 0.0022     |
| x6       | 0.0010  | 0.0011    | 0.0011   | 0.0024     |
| x7       | 0.0012  | 0.0011    | -0.0059  | 0.0022     |
| x8       | 0.0011  | 0.0011    | -0.0004  | 0.0023     |
| x9       | 0.0009  | 0.0011    | -0.0013  | 0.0026     |
| x10      | 0.0004  | 0.0011    | 0.0041   | 0.0022     |
| ...      | ...     | ...       | ...      | ...        |

Table 7.1: Results of Sobol’ sensitivity analysis for the second objective function of ZDT3 and the first one of DTLZ2.

¹ PlatEMO is an open-source platform developed in MATLAB that includes several evolutionary multi-objective algorithms and test problems [2].

Table 7.1 shows this behaviour for the variables from the second one onwards in ZDT3 and from the third one onwards in DTLZ2. These factors introduce no appreciable variability in the respective functions compared with the preceding variables.
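For readers unfamiliar with the two indices, the sketch below estimates first-order and total-order Sobol' indices on a toy function with one dominant variable. It is not the routine used in this work: it swaps the quasi-random Sobol' sequence for plain pseudo-random sampling, it uses the widely known Saltelli-type estimator for S1 and the Jansen estimator for STot, and the test function, sample size and names are hypothetical.

```python
import random


def sobol_indices(f, n_vars, n_samples=20000, seed=0):
    """Estimate first-order (S1) and total-order (ST) indices on [0, 1]^n."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_vars)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_vars)] for _ in range(n_samples)]
    fA = [f(row) for row in A]
    fB = [f(row) for row in B]
    outputs = fA + fB
    mean = sum(outputs) / len(outputs)
    var = sum((y - mean) ** 2 for y in outputs) / (len(outputs) - 1)

    S1, ST = [], []
    for i in range(n_vars):
        # Matrix A with column i replaced by column i of B.
        fAB = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        first = sum((yb - mean) * (yab - ya)
                    for ya, yb, yab in zip(fA, fB, fAB)) / n_samples
        total = sum((ya - yab) ** 2
                    for ya, yab in zip(fA, fAB)) / (2 * n_samples)
        S1.append(first / var)
        ST.append(total / var)
    return S1, ST


def toy(x):
    """Hypothetical function dominated by its first variable."""
    return x[0] + 0.1 * x[1] + 0.1 * x[2]


S1, ST = sobol_indices(toy, 3)
```

Since the indices are estimated from a finite sample, estimates of indices that are close to zero can come out slightly negative, which is consistent with the small negative S1 entries of DTLZ2 in table 7.1.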

Taking advantage of these results when building the response surfaces leads to surrogate models described by just one factor for two-objective test functions and by two factors for three-objective ones. In fact, attempts to build the surfaces with a larger or smaller number of factors produced models quite far from the original ones. Introducing further variables in the surface model causes the most important factors to share part of their variability with the other variables during the fitting procedure. Such models still produce fairly good predictions, but they do not perform as well as those using only the main variable(s) identified by the sensitivity analysis. On the other hand, as can easily be deduced, a surface with fewer factors cannot properly describe the true behaviour of the test function at all.
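The selection of the factors to model can be sketched from the total-order indices of table 7.1. The threshold of 0.05 and the function name are hypothetical choices made only for this illustration; the text does not state a numerical cut-off.

```python
# Total-order indices of the first ten variables, copied from table 7.1.
ST_ZDT3 = [0.9050, 0.0011, 0.0011, 0.0011, 0.0011,
           0.0011, 0.0011, 0.0011, 0.0011, 0.0011]
ST_DTLZ2 = [0.5775, 0.5492, 0.0022, 0.0022, 0.0022,
            0.0024, 0.0022, 0.0023, 0.0026, 0.0022]


def influential_factors(total_indices, threshold=0.05):
    """Return the 1-based positions of the variables kept in the surface.

    The threshold is an assumed value, not one reported in the text.
    """
    return [i + 1 for i, st in enumerate(total_indices) if st >= threshold]


zdt3_factors = influential_factors(ST_ZDT3)
dtlz2_factors = influential_factors(ST_DTLZ2)
```

With these values the selection keeps x1 for ZDT3 and x1, x2 for DTLZ2, matching the one-factor and two-factor surfaces described above; any threshold between the two groups of values gives the same result.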
