A class of probabilistic optimization
algorithms
Inspired by the biological evolution process
Uses concepts of “Natural Selection” and
“Genetic Inheritance” (Darwin 1859)
Originally developed by John Holland
Particularly well suited for hard problems
where little is known about the underlying search space
Widely-used in business, science and
Search Techniqes
Calculus Base
Techniqes Guided random search techniqes
Enumerative Techniqes
BFS
DFS Dynamic
Programming
Selection replicates the most successful
solutions found in a population at a rate
proportional to their relative quality
Recombination decomposes two distinct
solutions and then randomly mixes their parts to form novel solutions
Mutation randomly perturbs a candidate
Genetic Algorithm Nature
Optimization problem Environment
Feasible solutions Individuals living in that
environment
Solutions quality (fitness function)
Individual’s degree of adaptation to its
Genetic Algorithm Nature
A set of feasible solutions A population of organisms
(species)
Stochastic operators Selection, recombination
and mutation in nature’s evolutionary process
Iteratively applying a set of stochastic operators on a set of feasible solutions Evolution of populations to
The computer model introduces simplifications (relative to the real biological mechanisms),
BUT
produce an initial population of individuals
evaluate the fitness of all individuals
while termination condition not met do
select fitter individuals for reproduction
recombine between individuals
mutate individuals
evaluate the fitness of the modified individuals
selection
population evaluation
modification
discard
deleted members parents
modified offspring
evaluated offspring initiate
&
Suppose we want to maximize the number of ones in a string of l binary digits
Is it a trivial problem?
It may seem so because we know the answer in advance
An individual is encoded (naturally) as a
string of l binary digits
The fitness f of a candidate solution to the
MAXONE problem is the number of ones in its genetic code
We start with a population of n random
We toss a fair coin 60 times and get the following initial population:
s1 = 1111010101 f (s1) = 7
s2 = 0111000101 f (s2) = 5
s3 = 1110110101 f (s3) = 7
s4 = 0100010011 f (s4) = 4
s5 = 1110111101 f (s5) = 8
Next we apply fitness proportionate selection with the roulette wheel method:
2 1 n 3 Area is Proportional to fitness value
Individual i will have a
probability to be chosen
i i f i f ) ( ) ( 4
We repeat the extraction as many times as the
Suppose that, after performing selection, we get the following population:
s1` = 1111010101 (s1)
s2` = 1110110101 (s3)
s3` = 1110111101 (s5)
s4` = 0111000101 (s2)
s5` = 0100010011 (s4)
Next we mate strings for crossover. For each couple we decide according to crossover
probability (for instance 0.6) whether to actually perform crossover or not
Suppose that we decide to actually perform crossover only for couples (s1`, s2`) and (s5`, s6`).
For each couple, we randomly extract a
Type of Crossover
:
Single site
Two Points
Multi points
Uniform
s1` = 1111010101
s2` = 1110110101
s5` = 0100010011
s6` = 1110111101
Before crossover:
After crossover:
s1`` = 1110110101
s2`` = 1111010101
s5`` = 0100011101
s1` = 1111010101 s2` = 1110110101
s5` = 0100010011 s6` = 1110111101
Before crossover:
After crossover:
s1`` = 1110110101 s2`` = 1111010101
s1` = 1111010101
s2` = 1110110101
s5` = 0100010011 s6` = 1110111101
Before crossover:
After crossover:
s1`` = 1110110101 s2`` = 1111010101
s1` = 1111010101
s2` = 1110110101
s5` = 0100010011 s6` = 1110111101
Before crossover:
After crossover:
s1`` = 1110110101 s2`` = 1111010101
The final step is to apply random mutation: for each bit that we are to copy to the new population we allow a small
probability of error (for instance 0.1)
Before applying mutation:
s1`` = 1110110101
s2`` = 1111010101
s3`` = 1110111101
s4`` = 0111000101
After applying mutation:
s1``` = 1110100101 f (s1``` ) = 6
s2``` = 1111110100 f (s2``` ) = 7
s3``` = 1110101111 f (s3``` ) = 8
s4``` = 0111000101 f (s4``` ) = 5
s5``` = 0100011101 f (s5``` ) = 5
Before: (1 0 1 1 0 1 1 0)
After: (0 1 1 0 0 1 1 0)
Before: (1.38 -69.4 326.44 0.1)
After: (1.38 -67.5 326.44 0.1)
Causes movement in the search space
(local or global)
The traveling salesman must visit every city in his territory exactly once and then return to the starting point; given the cost of travel between all cities, how should he plan his itinerary for minimum total cost of the entire tour?
TSP NP-Complete
The Traveling Salesman Problem:
Find a tour of a given set of cities so
that
• each city is visited only once
Representation is an ordered list of city
numbers known as an
order-based
GA.
1) London 3) Dunedin 5) Beijing 7) Tokyo 2) Venice 4) Singapore 6) Phoenix 8) Victoria
Crossover combines inversion and recombination:
* *
Parent1 (3 5 7 2 1 6 4 8) Parent2 (2 5 7 6 8 1 3 4)
Child (5 8 7 2 1 6 3 4)
Mutation involves reordering of the list:
* *
Before: (5 8 7 2 1 6 3 4)
The sub-string between two randomly selected points in the path is reversed
Example:
(1 2 3 4 5 6 7 8 9) is changed into (1 2 7 6 5 4 3 8 9)
0 20 40 60 80 100 120
0 10 20 30 40 50 60 70 80 90 100
y
0 20 40 60 80 100 120
0 10 20 30 40 50 60 70 80 90 100
y
x
44 62 69 67 78 64 62 54 42 50 40 40 38 21 35 67 60 60 40 42 50 99 0 20 40 60 80 100 120
0 10 20 30 40 50 60 70 80 90 100
y
x
0 20 40 60 80 100 120
0 10 20 30 40 50 60 70 80 90 100
y
x
42 38 35 26 21 35 32 7 38 46 44 58 60 69 76 78 71 69 67 62 84 94 0 20 40 60 80 100 120
0 10 20 30 40 50 60 70 80 90 100
y
x
0 200 400 600 800 1000 1200 1400 1600 1800
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31
D is ta n ce Generations (1000)
TSP30 - Overview of Performance
Best
Worst
“ Almost eight years ago ...
people at Microsoft wrote a program [that] uses some
genetic things for finding short code sequences. Windows 2.0 and 3.2, NT,
and almost all Microsoft applications products have
shipped with pieces of code created by that
system ”.
Decade long debate: which one is better /
necessary / main-background
Answer (at least, rather wide agreement):
• it depends on the problem, but • in general, it is good to have both • both have another role
• mutation-only-EA is possible, Xover-only-EA would not
Exploration: Discovering promising areas in the search space, i.e. gaining information on the problem
Exploitation: Optimising within a promising area, i.e. using information
There is co-operation AND competition between them
Crossover is explorative, it makes a big jump to an
area somewhere “in between” two (parent) areas
Mutation is exploitative, it creates random small
Only crossover can combine information from two
parents
Only mutation can introduce new information
(alleles)
Crossover does not change the allele frequencies of
the population (thought experiment: 50% 0’s on first bit in the population, ?% after performing n
crossovers)
A problem definition as input, and
Each cell of a living thing contains chromosomes - strings of DNA
Each chromosome contains a set of genes - blocks of DNA
Each gene determines some aspect of the organism (like eye colour)
A collection of genes is sometimes called a
genotype
A collection of aspects (like eye colour) is sometimes called a phenotype
Reproduction involves recombination of genes from parents and then small amounts of mutation
(errors) in copying
Possible individual’s encoding
Use a data structure as close as possible to
the natural representation
Write appropriate genetic operators as
needed
If possible, ensure that all genotypes
correspond to feasible solutions
If possible, ensure that genetic operators
preserve feasibility
Start with a population of randomly generated individuals, or use
- A previously saved population - A set of solutions provided by
a human expert
Purpose: to focus the search in promising
regions of the space
Inspiration: Darwin’s “survival of the fittest” Trade-off between exploration and exploitation
of the search space
Derived by Holland as the optimal
trade-off between exploration and exploitation
Drawbacks
Different selection for f1(x) and f2(x) = f1(x) + c
Superindividuals cause convergence (that
Extracts k individuals from the population
with uniform probability (without re-insertion) and makes them play a “tournament”,
where the probability for an individual to win is generally proportional to its fitness
Purpose: to simulate the effect of errors that happen with low probability during
duplication
Results:
- Movement in the search space
Solution is only as good as the
evaluation function; choosing a good one is often the hardest part
Similar-encoded solutions should have a
Examples:
A pre-determined number of generations or
time has elapsed
A satisfactory solution has been achieved
No improvement in solution quality has
Two or more species evolve and interact
through the fitness function
Closer to real biological systems
Competitive (prey-predator) or Cooperative
Examples: Hillis (competitive), my work
Choosing basic implementation issues:
♦representation
♦population size, mutation rate, ...
♦selection, deletion policies
♦crossover, mutation operators
Termination Criteria
Performance, scalability
Solution is only as good as the
Concept is easy to understand
Modular, separate from application
Supports multi-objective optimization
Good for “noisy "environments
Always gives answer; answer gets
better with time
Multiple ways to speed up and
improve a GA-based application as
knowledge about problem domain is
gained
Easy to exploit previous or alternate
solutions
Flexible building blocks for hybrid
applications
Alternate solutions are too slow or much
complicated
Need an exploratory tool to examine new
approaches
Problem is similar to one that has already
been successfully solved.
Want to hybridize with an existing
solution
Benefits of the GA technology meet key
Domain
Application
Types
Control gas pipeline, pole balancing, missile evasion, pursuit
Design semiconductor layout, aircraft
design, keyboard configuration, communication networks
Scheduling manufacturing, facility scheduling,
resource allocation
Robotics trajectory planning
Signal Processing filter design
Game Playing poker, checkers, prisoner’s
dilemma
Combinatorial Optimization
set covering, travelling salesman, routing, bin packing, graph
Essentials of Darwinian
evolution:
Organisms reproduce in
proportion to their fitness in the environment
Offspring inherit all traits from
parents
Traits are inherited with some
variation, via mutation and sexual recombination
Essentials of evolutionary algorithms:
Computer “organisms”(e.g.,
programs) reproduce in proportion to
their fitness of problem environment
(e.g., how well they perform a desired task)
Offspring (outcome) inherit only
strong traits (fitness)from their parent.
♦Traits are inherited, with some
C. Darwin. On the Origin of Species by Means of Natural Selection; or, the Preservation of flavored Races in the Struggle for Life. John Murray, London, 1859.
W. D. Hillis. Co-Evolving Parasites Improve Simulated
Evolution as an Optimization Procedure. Artificial Life 2, vol 10, Addison-Wesley, 1991.
J. H. Holland. Adaptation in Natural and Artificial Systems. The
University of Michigan Press, Ann Arbor, Michigan, 1975.
Z. Michalewicz. Genetic Algorithms + Data Structures =
Evolution Programs. Springer-Verlag, Berlin, third edition, 1996.
M. Sipper. Machine Nature: The Coming Age of Bio-Inspired Computing. McGraw-Hill, New-York, first edition, 2002.
M. Tomassini. Evolutionary algorithms. In E. Sanchez and
M. Tomassini, editors, Towards Evolvable Hardware, volume