

3.2.3 Objective function approximation with a SVMr algorithm

Support Vector Regression algorithms (SVMr) [smo98, smo99] are among the most important statistical models in the field of prediction. SVMrs are appealing for a wide variety of regression problems, and in many of them they are combined with evolutionary computation algorithms [che11, sal11, jia12]. Although several versions of SVMr exist, only the classic model presented in [smo98] is described here.

The $\epsilon$-SVMr method for regression consists of training a model of the form $y(x) = f(x) + b = w^T\phi(x) + b$, given a set of training vectors $C = \{(x_i, y_i),\ i = 1, \ldots, l\}$, so as to minimize a general risk function of the form

\[ R[f] = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} L(y_i, f(x_i)) \tag{3.19} \]

where $w$ controls the smoothness of the model, $\phi(x)$ is a function that projects the input space onto the feature space, $b$ is a bias parameter, $x_i$ is a feature vector of the input space with dimension $N$, $y_i$ is the output value to be estimated, and $L(y_i, f(x_i))$ is the selected loss function. In this case, the L1-SVMr (L1 support vector regression) is used, characterized by an $\epsilon$-insensitive loss function [smo99]:

\[ L(y_i, f(x_i)) = |y_i - f(x_i)|_\epsilon \tag{3.20} \]
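As an illustration, the $\epsilon$-insensitive loss can be written as max(0, |y − f(x)| − ε), which is the usual definition of this loss: residuals inside the ε-tube cost nothing, and larger residuals are penalized linearly. The following minimal Python sketch evaluates this loss and the risk of Eq. (3.19) for given predictions; the function names and the numerical values in the usage note are illustrative assumptions, not taken from the original work.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Return |y_true - y_pred|_eps = max(0, |y_true - y_pred| - eps)."""
    return max(0.0, abs(y_true - y_pred) - eps)


def regularized_risk(w_norm_sq, C, targets, predictions, eps):
    """Evaluate Eq. (3.19): 0.5 * ||w||^2 + C * sum of eps-insensitive losses.

    w_norm_sq is the squared norm ||w||^2, supplied by the caller
    (hypothetical interface chosen for this sketch).
    """
    total_loss = sum(
        eps_insensitive_loss(y, f, eps) for y, f in zip(targets, predictions)
    )
    return 0.5 * w_norm_sq + C * total_loss
```

For example, with ε = 0.1 a residual of 0.05 contributes nothing to the risk, whereas a residual of 0.5 contributes 0.4 before scaling by C.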

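In order to train this model, it is necessary to solve the following optimization problem [smo99]:

\[ \min \left( \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \right) \tag{3.21} \]

subject to

\[ y_i - w^T\phi(x_i) - b \le \epsilon + \xi_i, \quad i = 1, \ldots, l \tag{3.22} \]

\[ -y_i + w^T\phi(x_i) + b \le \epsilon + \xi_i^*, \quad i = 1, \ldots, l \tag{3.23} \]

\[ \xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, l \tag{3.24} \]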

The dual form of this optimization problem is usually obtained through the minimization of the Lagrange function, constructed from the objective function and the problem constraints. In this case, the dual form of the optimization problem is the following:

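\[ \max \left( -\frac{1}{2} \sum_{i,j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \epsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*) \right) \tag{3.25} \]

subject to

\[ \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0 \tag{3.26} \]

\[ \alpha_i, \alpha_i^* \in [0, C] \tag{3.27} \]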
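To make the structure of the dual problem concrete, the sketch below evaluates the objective of Eq. (3.25) and checks the constraints (3.26) and (3.27) for a candidate pair of multiplier vectors. It does not solve the quadratic problem; it only shows how the terms combine. The function names, the tolerance `tol` and the plain list-of-lists kernel matrix are assumptions made for this example.

```python
def dual_objective(alpha, alpha_star, K, y, eps):
    """Evaluate the dual objective of Eq. (3.25) for given multipliers.

    K is the kernel matrix as a list of rows, y the training targets.
    """
    beta = [a - s for a, s in zip(alpha, alpha_star)]  # alpha_i - alpha_i^*
    n = len(y)
    quad = sum(beta[i] * beta[j] * K[i][j] for i in range(n) for j in range(n))
    tube = eps * sum(a + s for a, s in zip(alpha, alpha_star))
    fit = sum(yi * bi for yi, bi in zip(y, beta))
    return -0.5 * quad - tube + fit


def is_dual_feasible(alpha, alpha_star, C, tol=1e-9):
    """Check the box constraint (3.27) and the equality constraint (3.26)."""
    in_box = all(-tol <= v <= C + tol for v in list(alpha) + list(alpha_star))
    balanced = abs(sum(a - s for a, s in zip(alpha, alpha_star))) <= tol
    return in_box and balanced
```

A quadratic programming solver would search over the feasible multipliers for the pair that maximizes `dual_objective`.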

In addition to these constraints, the Karush-Kuhn-Tucker conditions must be fulfilled, and the bias variable $b$ must also be obtained [smo99]. In the dual formulation of the problem, $K(x_i, x_j)$ denotes the entries of the kernel matrix, which is formed by evaluating a kernel function equivalent to the dot product $\langle \phi(x_i), \phi(x_j) \rangle$. A usual choice for this kernel function is the Gaussian function:

\[ K(x_i, x_j) = \exp\left(-\gamma \cdot \|x_i - x_j\|^2\right) \tag{3.28} \]
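The Gaussian kernel and the resulting kernel matrix can be sketched in a few lines of plain Python. This is only an illustration of Eq. (3.28); the function names are hypothetical, and a practical implementation would normally rely on a numerical library rather than nested lists.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian kernel of Eq. (3.28): exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(X, gamma):
    """Kernel matrix with entries K[i][j] = K(x_i, x_j) over the training vectors X."""
    return [[gaussian_kernel(xi, xj, gamma) for xj in X] for xi in X]
```

The matrix is symmetric with ones on the diagonal. Larger values of γ make the off-diagonal entries decay faster with distance, which is why γ is one of the hyper-parameters that has to be tuned.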

The final form of the function $f(x)$ depends on the Lagrange multipliers $\alpha_i, \alpha_i^*$, as follows:

\[ f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) \tag{3.29} \]
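Once the multipliers and the bias are known, a prediction only requires kernel evaluations against the training vectors whose coefficient α_i − α_i* is nonzero. The sketch below evaluates y(x) = f(x) + b with f(x) as in Eq. (3.29) and a Gaussian kernel. The function names and the coefficient values in the usage lines are made up for illustration and do not come from a trained model.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svr_predict(x, train_vectors, dual_coefs, b, gamma):
    """Return y(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i^* for train_vectors[i].
    """
    f = sum(
        coef * gaussian_kernel(xi, x, gamma)
        for xi, coef in zip(train_vectors, dual_coefs)
    )
    return f + b


# Illustrative values only; the coefficients sum to zero, as Eq. (3.26) requires.
train_vectors = [[0.0], [1.0]]
dual_coefs = [0.5, -0.5]
prediction = svr_predict([0.0], train_vectors, dual_coefs, b=0.1, gamma=1.0)
```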

In this way, an SVMr model can be obtained by solving a quadratic programming problem for given hyper-parameters $C$, $\epsilon$ and $\gamma$. However, obtaining these parameters is not a simple procedure: search algorithms must be implemented to find the optimal values, or the values must be estimated [ort09].

The selection of hyper-parameters is a key point in the training of SVMs applied to regression problems. Unfortunately, no exact method is known for obtaining the optimal set of SVM hyper-parameters, so search algorithms are usually applied to find the best possible set. In general, these search algorithms are implemented as grid searches, which are time-consuming, so the computational cost of the SVM training process increases considerably. The search algorithms used to obtain SVM hyper-parameters can be divided into three groups. The first group is based on grid search [aka98], where the search space of parameters is divided into groups of possible parameters to be tested, usually in a uniform fashion. The second group is formed by local search approaches, such as the pattern search proposed in [mom02]. Finally, the third group is based on metaheuristics, or global optimization algorithms, such as evolutionary computation [wan05]. All these search algorithms share the same problem: the selection of the initial parameter ranges, which limit the search space. In most cases, these initial ranges are either selected from the experience of the researcher or set to large intervals. The former option is highly dependent on the regression problem studied, and it is a very difficult and specific task that is of little use in most cases. In the latter option, large parameter ranges enlarge the search space and thus increase the training time of the SVM.
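The cost of a grid search can be seen in a minimal sketch: every combination of candidate values is evaluated once, so the number of model trainings is the product of the three grid sizes. In the code below, `toy_validation_error` is a hypothetical stand-in for training an SVMr and measuring its validation error, and the power-of-two grids are an assumption made for illustration, not the ranges used in the original work.

```python
import itertools
import math


def log_grid(low_exp, high_exp, base=2.0):
    """Candidate values base**low_exp, ..., base**high_exp (inclusive)."""
    return [base ** k for k in range(low_exp, high_exp + 1)]


def grid_search(evaluate, C_values, eps_values, gamma_values):
    """Exhaustively evaluate every (C, eps, gamma) triple and keep the best one.

    evaluate(C, eps, gamma) must return an error to be minimized.
    Returns the best triple, its error, and the number of evaluations.
    """
    best_params, best_error, n_evals = None, float("inf"), 0
    for C, eps, gamma in itertools.product(C_values, eps_values, gamma_values):
        error = evaluate(C, eps, gamma)
        n_evals += 1
        if error < best_error:
            best_params, best_error = (C, eps, gamma), error
    return best_params, best_error, n_evals


def toy_validation_error(C, eps, gamma):
    """Hypothetical error surface standing in for a real train/validate cycle."""
    return (math.log2(C) - 3) ** 2 + (math.log2(gamma) + 2) ** 2 + (eps - 0.1) ** 2


best_params, best_error, n_evals = grid_search(
    toy_validation_error,
    C_values=log_grid(-2, 6),        # 9 candidates
    eps_values=[0.01, 0.1, 0.5],     # 3 candidates
    gamma_values=log_grid(-5, 1),    # 7 candidates
)
```

With these grids the search needs 9 × 3 × 7 = 189 evaluations, each of which would be a full SVMr training in practice. Narrowing any of the ranges reduces this product directly, which is the motivation for bounding the hyper-parameters before searching.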

In [ort09], a novel effort to improve the SVM training time through the reduction of the search space is presented. The objective is to reduce the training time needed to find the final parameters of the SVM model while maintaining the performance of the models. These search space reductions are generated by bounding the SVM hyper-parameters, mainly parameter $C$. This parameter is bounded by taking into account its relation with parameters $\gamma$ and $\epsilon$, through an approximation of the SVM model. Parameter $\gamma$ is bounded using characteristics of the Gaussian kernel function used, and finally the bound for parameter $\epsilon$ is constructed based on previous results in the literature [kwo03]. All these reductions are applied to a grid search algorithm in order to find the parameters that yield the best SVM performance, reducing the computation time with respect to the full search space case.

Given the amount of aerodynamic data available to train the SVMr-based model, it was necessary to split the data into several groups, each corresponding to a different value of the angle of attack. Subsequently, each of these datasets is employed to train a different model, which is associated with its corresponding angle. For values different from the 24 angles existing in the original data, a linear interpolation of the two nearest models is carried out. Figure 3.4 shows the architecture of the proposed SVMr model. Note that the problem is tackled with a network of SVMr banks, defining a single SVMr for each angle of attack, both in the training and test periods.

[Figure 3.4 diagram labels: Airfoil geometry; SVM model #1, SVM model #2, ..., SVM model #24; Linear interpolator; Angle of attack; Model outcome]

Figure 3.4: SVMr banks architecture applied in this work.
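The bank-plus-interpolator idea can be sketched as follows. Each model is represented by a plain callable that maps an airfoil geometry to a prediction, which is a hypothetical stand-in for a trained SVMr. The class name `SVRBank`, the treatment of angles outside the trained range (an error is raised) and the example angles are all assumptions of this sketch, since the text does not specify them.

```python
import bisect


class SVRBank:
    """One model per angle of attack, with linear interpolation in between."""

    def __init__(self, angles, models):
        pairs = sorted(zip(angles, models), key=lambda p: p[0])
        self.angles = [a for a, _ in pairs]
        self.models = [m for _, m in pairs]

    def predict(self, geometry, angle):
        if not self.angles[0] <= angle <= self.angles[-1]:
            # Assumption: no extrapolation beyond the trained angles.
            raise ValueError("angle of attack outside the trained range")
        hi = bisect.bisect_left(self.angles, angle)
        if self.angles[hi] == angle:
            # The angle matches a trained model exactly.
            return self.models[hi](geometry)
        lo = hi - 1
        # Interpolation weight between the two nearest trained angles.
        t = (angle - self.angles[lo]) / (self.angles[hi] - self.angles[lo])
        return (1.0 - t) * self.models[lo](geometry) + t * self.models[hi](geometry)


# Usage with three constant stand-in models instead of 24 trained SVMrs.
bank = SVRBank([0.0, 2.0, 4.0], [lambda g: 1.0, lambda g: 3.0, lambda g: 7.0])
outcome = bank.predict(geometry=None, angle=3.5)
```

Here an angle of 3.5 lies between the models trained at 2.0 and 4.0, so the outcome is the weighted combination 0.25 × 3.0 + 0.75 × 7.0.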