
Introduction

1.3 Evolutionary algorithms

EAs are biology-inspired search algorithms that are commonly used in problems that are difficult to tackle with analytical methods [84]. EAs are very flexible and have been applied to many machine learning tasks, such as clustering [85], pattern mining [84], and feature selection [86]. Note that, although they are suitable and very useful for a wide range of problems, EAs are stochastic procedures that cannot guarantee that the global optimum is reached [87].

⁵ http://scikit.ml
⁶ https://github.com/fcharte/mldr
⁷ https://github.com/i02momuj/MLDA

In EAs, there is usually a population of individuals, where each individual represents a full or partial candidate solution to the problem. Furthermore, each individual has an associated fitness value, which indicates how well this solution solves the problem. The fitness guides the evolution of the population towards fitter or better-adapted individuals, with the aim of obtaining an optimal solution (or set of solutions).

Although a wide range of EA variants have been defined through the years, they usually rely on the same structure [88]:

Population The population is a set of individuals of (usually) fixed size, where each represents a full or partial candidate solution to the problem. The individuals in the population evolve, and those that provide a better solution to the problem usually have a higher chance of remaining in the population. However, individuals in the population not only need to fit the problem well, but also need to be somewhat different from one another, thus representing different locations in the search space and avoiding getting stuck in local optima.

Fitness The fitness function represents the requirements that individuals should meet. In other words, it is a procedure that assigns a quality value to each individual, depending on how well it solves the problem. It is used to guide the evolution towards optimal solutions.

Parents selection The role of the parents selection operator is to select those individuals that will later produce offspring (i.e., new individuals). Usually, parents are selected probabilistically, with fitter individuals being more likely to be selected, thus aiming to improve the quality of new solutions. However, low-quality solutions usually also have a small chance of being selected, in order to maintain diversity in the population and avoid getting stuck early in a local optimum.

Genetic operators Individuals interact with each other and are modified to generate offspring by means of genetic operators. Widely used genetic operators are crossover and mutation. The crossover operator combines genetic material of several individuals (usually two) in order to create new individuals that are similar to each parent. The mutation operator, on the other hand, modifies a single individual and is usually more disruptive than crossover, since it is able to discover new genetic material that was not previously present in the population. Both types of operators are stochastic, i.e., they rely on random choices to create offspring.

Population update At the end of each generation of the EA, the size of the population should remain the same. Therefore, the population is updated considering individuals from both the parents and offspring sets. Common techniques include replacing the whole parents set with the offspring, keeping the best parent in the following population, or combining both sets and selecting the most suitable or diverse individuals. Parents selection and population update are the two procedures that lead the population to improve in quality.

Stop criterion In order to stop the execution of the EA, one or several stop criteria must be used. Commonly used stop criteria are the number of generations, the maximum number of individual evaluations, or the maximum number of generations in which the population has not improved.
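To make the components above more concrete, the following is a minimal Python sketch of one possible choice for each operator, assuming individuals are encoded as lists of bits. The function names, the tournament size `k`, and the mutation `rate` are illustrative assumptions and are not taken from any particular library; many other operators would fit the same structure.

```python
import random

def tournament_select(population, fitness, k=2, rng=random):
    """Parents selection: draw k individuals at random and return the fittest.
    Weak individuals can still win when they only meet weaker ones."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

def one_point_crossover(a, b, rng=random):
    """Crossover: swap the tails of two parents at a random cut point."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def bit_flip_mutation(individual, rate=0.05, rng=random):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in individual]

def elitist_update(parents, offspring, fitness):
    """Population update: keep the best parent and fill the remaining
    slots with the best offspring, so the population size is preserved."""
    best_parent = max(parents, key=fitness)
    survivors = sorted(offspring, key=fitness, reverse=True)[:len(parents) - 1]
    return [best_parent] + survivors
```

Here the fitness is passed in as a plain function (for instance, `sum` for a toy problem that counts the ones in the bit string), which keeps the operators independent of the problem being solved.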

The basic steps of EAs are those presented in Figure 1.10. First, a population p of popSize individuals is generated, usually randomly or following some heuristic. Then, individuals in p are evaluated using the fitness function. Next, until the stop criterion is reached, parents are selected, usually based on their fitness value, where fitter individuals have a higher chance of being selected; genetic operators are applied to the selected individuals; offspring individuals in s are evaluated; and the population is updated considering individuals in both the p and s sets. Finally, the population is usually returned.
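The steps just described can be sketched as a short Python function. This is only an illustrative sketch under several assumptions that are not part of the original description: individuals are bit strings, parents are chosen by binary tournament, offspring are produced by one-point crossover followed by bit-flip mutation, the update keeps the best parent, and the only stop criterion is a fixed number of generations. The function name `evolve` and its parameters are hypothetical.

```python
import random

def evolve(pop_size, n_bits, fitness, max_generations=50, seed=0):
    """Generic EA loop; assumes pop_size >= 2 and n_bits >= 2."""
    rng = random.Random(seed)
    # p <- initPop(popSize): random bit strings
    p = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    # evaluate(p)
    scores = [fitness(ind) for ind in p]
    for _ in range(max_generations):  # stop criterion: generation budget
        # s <- select(p): binary tournaments, fitter individuals win
        s = []
        for _ in range(pop_size):
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            s.append(p[i] if scores[i] >= scores[j] else p[j])
        # s <- applyOperators(s): one-point crossover, then bit-flip mutation
        offspring = []
        for a, b in zip(s[0::2], s[1::2]):
            cut = rng.randint(1, n_bits - 1)
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                offspring.append(
                    [1 - g if rng.random() < 1.0 / n_bits else g for g in child]
                )
        # evaluate(s)
        off_scores = [fitness(ind) for ind in offspring]
        # p <- updatePop(p, s): best parent plus the best offspring
        best = max(range(pop_size), key=scores.__getitem__)
        order = sorted(range(len(offspring)), key=off_scores.__getitem__, reverse=True)
        keep = order[:pop_size - 1]
        p = [p[best]] + [offspring[k] for k in keep]
        scores = [scores[best]] + [off_scores[k] for k in keep]
    # return p (together with the fitness values, for convenience)
    return p, scores
```

For example, `evolve(pop_size=20, n_bits=16, fitness=sum)` runs the loop on a toy problem whose fitness is the number of ones in the individual. Because the best parent is always kept, the best fitness in the population cannot decrease from one generation to the next.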

Figure 1.10: Main steps of EAs. [Flowchart: p ← initPop(popSize); evaluate(p); while the stop criteria are not reached: s ← select(p), s ← applyOperators(s), evaluate(s), p ← updatePop(p, s); finally, return p.]

In Figure 1.11 the operation of EAs is shown with a simple example of maximization of a one-dimensional function. In the figures, the x-axis represents the different positions in the search space, while the y-axis represents the fitness of each individual (i.e., the value of the function to maximize). As observed, at the beginning of the evolution, individuals are usually randomly created, so they are distributed throughout the search space (Figure 1.11a). After some generations, given the selection and use of genetic operators, individuals in regions with lower fitness tend to disappear, while the remaining ones start to climb the hills in pursuit of more promising zones (Figure 1.11b). Finally, towards the end of the EA, individuals tend to gather around optimal zones (Figure 1.11c). These individuals may be spread over several hills, but they could also be concentrated in a suboptimal zone, thus not reaching the global optimum. For this reason, it is essential to have both exploration (creating individuals in new zones of the search space) and exploitation (concentration and improvement of individuals in promising zones) mechanisms in the EA. Nevertheless, the trade-off between exploration and exploitation should be fine-tuned in order to avoid premature convergence, i.e., getting trapped in a local optimum.

Figure 1.11: Operation of EAs along the iterations. Panels: (a) Beginning, (b) Halfway, (c) End. [Plots labeled Individuals and Fitness.]
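The behaviour described for the one-dimensional example can be reproduced with a small Python sketch. It is not the experiment behind the figure: the objective `f`, the search interval, and all parameter values are assumptions chosen for illustration, and the scheme is a simple mutation-only EA with truncation selection (the better half survives) rather than a full GA. Selection plays the exploitation role, while the Gaussian perturbation of width `sigma` plays the exploration role.

```python
import math
import random
import statistics

def f(x):
    """Hypothetical multimodal objective with several hills on [0, 10]."""
    return math.sin(x) + 0.5 * math.sin(3 * x)

def run(pop_size=30, generations=40, sigma=0.3, seed=0, lo=0.0, hi=10.0):
    """Return the final population and, per generation, the pair
    (best fitness, spread of the individuals along the x-axis)."""
    rng = random.Random(seed)
    # Beginning: individuals scattered uniformly over the search space
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    history = [(max(map(f, pop)), statistics.pstdev(pop))]
    for _ in range(generations):
        # Exploitation: only the better half survives as parents
        parents = sorted(pop, key=f, reverse=True)[:pop_size // 2]
        # Exploration: children are perturbed copies of random parents,
        # clipped so that they stay inside the search interval
        children = []
        while len(parents) + len(children) < pop_size:
            x = rng.choice(parents) + rng.gauss(0.0, sigma)
            children.append(min(hi, max(lo, x)))
        pop = parents + children
        history.append((max(map(f, pop)), statistics.pstdev(pop)))
    return pop, history
```

Calling `pop, history = run()` and comparing `history[0]` with `history[-1]` shows how the best fitness and the spread of the population change over the run. Increasing `sigma` favours exploration, whereas a very small `sigma` makes the population settle quickly on whichever hill its best initial individuals happen to occupy, which may be a local optimum.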

Based on this structure, a wide range of different evolutionary techniques have been proposed so far, such as Genetic Algorithms (GAs) [89], Genetic Programming (GP) [90], Gene Expression Programming (GEP) [91], Cooperative CoEvolutionary Algorithms (CCEAs) [92], or Particle Swarm Optimization (PSO) [93], among a large list of different frameworks.

In order to implement the EAs for our research, we have used the JCLEC library [94]. JCLEC is a software system for evolutionary computation, developed in Java, and publicly available⁸ under the GPL license. It provides a high-level software framework for implementing any kind of EA, and supports several predefined algorithms such as GAs and GP.