• No results found

AutoParTunehas three components, namely: (1) instance-specific tuning, (2) param- eter search space reduction; and (3) global tuning. The details of these components are discussed in the following subsections.

5.2.1

Instance-Specific Tuning

For instance-specific tuning, we implement CluPaTra and CluPaTra-II. CluPaTra

and CluPaTra-II are premised on the assumption that an algorithm configuration is correlated with its fitness landscape, i.e. a configuration that performs well on a prob- lem instance of a certain fitness landscape will also perform well on another instance with similar topology [92]. Furthermore, since the fitness landscape is difficult to compute, it can be approximated by a search trajectory [48, 49] which is deemed a probe through the landscape under a given algorithm configuration.

CluPaTraworks by transforming the search trajectory as a directed sequence and uses sequence alignment to calculate similarity for each pair of instances. On the other handCluPaTra-IIis an extension ofCluPaTrathat overcomes three major limitations ofCluPaTra: scalability, flexibility and descriptiveness. CluPaTraandCluPaTra-II

are described in detail in Chapter 3 and 4 respectively.

Up to this stage, CluPaTraandCluPaTra-IIcan only be applied on local-search based target algorithms due to its reliance on search trajectory. A search trajectory generator is required to performCluPaTraandCluPaTra-II.

5.2.2

Parameter Search Space Reduction

An often neglected preprocessing step in automated parameter tuning is to reduce the parameter space into a specific favorable parameter range. A good initial parameter range is able to guide the tuning process to provide a better parameter configuration with shorter computation time.

yz{|} ~ €  z{z €‚ ƒ  €„ z{ …† „|€ ‡ {ˆ‚ ƒ € ‡‰‡‰ „ Љ ‚ {  ‰ ‹ z ‚ yz{ | } Š Œ | € {  ‰{ ~  € z{z €‚ ‹€ „ ~ˆ ‚ Š Œ|€{ ‰{ ~ €  z{z €‚ Љ‡{‡  †  ‰ „ z ‚ ƒ  €„ z{ …† „|€ ‡ {ˆ‚ ƒ € ‡‰‡‰ „ Љ ‚ {  ‰ ‹ z ‚ Š Œ|€{ ‰{ ~  € z{z €‚ އ‰  †  ‰ „ z ‚  Œ z € ‡  z‰{{‡|‰~ˆ‚z

Figure 5.1: Phases of Fact-RSM, Parameter Search Space Reduction Method using DoE methodology

technique. Fact-RSM is a sequential experimental method for screening and reducing a parameter space for numerical parameters. Fact-RSM is based on the DoE (Design of Experiment) methodology as follows.

A full factorial experiment design is applied to first screen and rank the parame- ters. Parameters which are determined to be unimportant (i.e. the solution quality is insensitive to the values of these parameters) are set with constant values that reduce the parameter space to be explored. A first-order polynomial model based on RSM (Response Surface Methodology) is then built to define the promising initial range for the important parameter values. For statistical calculation, we use a well-known statistical software, R (http://www.r-project.org/).

Fact-RSM, as illustrated in Fig. 5.1, works in two phases: screening, and exper- imentation. The screening phase identifies the important parameters using 2k full factorial design, while the experimentation uses RSM to locate ”promising” regions for important parameters. The details of the screening and experimentation phases are as follows.

Screening Phase

A screening process is conducted to determine which parameters are significantly more important to reduce the number of parameters under consideration. It applies a 2k full factorial design which consists of k parameters, where each parameter xi

‘’“ ” • • ” ˜™ ‘ š “ › œ ‘ ž “ –— ˜™ ‘ š “ › œ ‘ ž “ Ÿ   ¡¢ £¤ ¥¦ £§ ¥ ¨©ª « ¬ « ¬ ­ ® ¯ ® ¯ ° ® ® ¯ ¯

Figure 5.2: 2kFull Factorial Design for Fact-RSM

only has two levels (aiandbi) withai as lower bound andbi as upper bound.

As an example, consider if there are two parameters, A and B. Fig. 5.2 shows the22

design with treatment combinations represented as the corners of the squares. The signs - and+ denote the values of ai and bi of each parameter xi, respectively.

Treatment combinations are the Cartesian product of the two parameters values (ai

andbi). A treatment combination is represented by a series of lowercase letters. For

example, the treatment combination a indicates that parameters A and B are set to bA and aB, respectively. To estimate the treatment combination, we run the target

algorithm forn replicates for each treatment combination. A complete design requires

(2 x 2 x ... x 2) xn = n x 2k. For simplicity, we setn to 10.

Since the main focus in this phase is to determine the important parameters, the in- teractions between parameters are ignored. The importance of a particular parameter is defined by conducting a significance test on the parameter’s main effect. A signif- icant level is set to 5% (α = 5%). Parameters with p − value ≤ α are significantly

important. The important parameters are explored further in the next phase. On the other hand, unimportant parameters are set to a constant value by looking at the main effect value of the parameter. If the value is negative, it is set to its upper bound, if otherwise, to its lower bound.

Experimentation Phase

This phase aims to find and locate ”promising” regions for important parameters by using the Response Surface Methodology (RSM). RSM is a model-based approach within DoE that can be used to quantify the importance of each parameter, support

interpolation of performance between parameter settings as well as extrapolation to previously-unseen regions of the parameter space [59]. RSM has been used in the parameter tuning scenario to finetune algorithm parameters [26, 27] and to identify ”promising” regions of a parameter search space [59].

The underlying assumption in this phase is that the region can be approximated by a planar model (the first-order model). Since we might be at a region on the response surface that is far from the optimum, we assume that there exists only a little curvature. Hence a planar model would be appropriate. The planar model of the parameters is approximated using the following function:

Y = β0+ β1x1 + ... + βmxm+ ε (5.1)

whereβ0, β1, ..., βmare parameter coefficients,x1, ..., xmare parameters, andε is error

coefficient.

In order to move rapidly to the ”promising” regions, we apply the method of steep- est descent (for a minimization problem). This method is a procedure for moving se- quentially along the path of steepest descent, that is, in the direction of the maximum decrease in response Y. For example, if β1 (coefficient of parameter 1) is the largest

absolute coefficient value compared against other coefficient values, the step size of another parameteri is calculated by β1/βi.

This phase is terminated when the local optimum region is found. From a sta- tistical point of view, the local optimum can be indicated by the existence of either interaction or curvature. Interaction is tested using analysis of variance (ANOVA) while curvature is tested using thet − test.

5.2.3

Global Tuning

As a global tuning component, we implement two efficient and well-established global tunings: ParamILS [60] and F-Race [19]. ParamILS [60] utilizes Iterated Local Search (ILS) to explore the parameter space in order to find a good parameter configuration

based on the given training instances. ParamILS has been very successfully applied to tune a broad range of high-performing algorithms for several hard combinatorial prob- lems with a large number of parameters. ParamILS is itself an iterated local search algorithm used for tuning discrete parameters. Since ParamILS works only with dis- crete parameters, inAutoParTune, we first discretize the values of the parameters if the target algorithm has parameters that assume continuous values.

Iterated F-Race [19] is a racing algorithm for the task of automated parameter tuning for categorical and numerical parameters. Iterated F-Race is an extension of F-Race [17] which is based on a statistical approach for selecting the best parameter configurations using stochastic evaluations.