

3.2.2 The Ant Colony Optimization metaheuristic

The distributed shortest path finding process of foraging ants described above has been an important source of inspiration for artificial intelligence (AI) researchers. In particular, it was the basis for the development of the ACO metaheuristic [83, 87]. This is a general framework for the development of algorithms to solve optimization problems. The main idea behind ACO is the use of a colony of artificial ants and a matrix of artificial pheromone. ACO algorithms work in an iterative way. In each iteration, all artificial ants build a solution to the problem at hand in parallel, using the artificial pheromone matrix. Then, the pheromone matrix is updated based on the solutions that were found. This way, the pheromone matrix reflects information about good solutions that have been found so far, and allows ants in subsequent generations to use this information when building new solutions.

The first applications of ACO were for the traveling salesman problem (TSP).

An instance of the TSP is defined by a fully connected weighted graph G = (V, E), where the set of vertices V corresponds to a number of cities, and the set of edges E represents the connections between the cities. With each of the edges (i, j), a distance d(i, j) is associated. The distances can be symmetric (in which case d(i, j) = d(j, i) for all pairs of cities i and j), or asymmetric. The aim is to find a closed tour that visits all cities exactly once while minimizing the total traveled distance. This combinatorial optimization problem is NP-hard.
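The objective described above can be made concrete with a small sketch. The function name `tour_length` and the 4-city distance matrix `DIST` are hypothetical and chosen only for illustration; the matrix is symmetric, so d(i, j) = d(j, i).

```python
from typing import Sequence


def tour_length(dist: Sequence[Sequence[float]], tour: Sequence[int]) -> float:
    """Return the total length of a closed tour.

    dist[i][j] holds d(i, j); tour lists the cities in visiting order.
    The edge from the last city back to the first one is included.
    """
    n = len(dist)
    if sorted(tour) != list(range(n)):
        raise ValueError("a tour must visit every city exactly once")
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


# Hypothetical symmetric instance with 4 cities (illustrative values only).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

if __name__ == "__main__":
    print(tour_length(DIST, [0, 1, 2, 3]))  # 2 + 6 + 3 + 10
    print(tour_length(DIST, [0, 1, 3, 2]))  # 2 + 4 + 3 + 9
```

Solving the TSP amounts to finding the permutation of cities for which this value is minimal.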

The TSP can very easily be seen as a shortest path finding problem, which makes it an obvious first choice for the implementation of ACO.

The first ACO algorithm that was developed for the TSP is Ant System (AS), which was originally proposed by Dorigo in his PhD thesis in 1992 [82] and first published in English in 1996 [85]. In AS, an artificial pheromone value τ(i, j) is associated with each edge (i, j). The algorithm maintains a colony of artificial ants, which build solutions using this artificial pheromone, and afterwards update the pheromone based on the quality of the solution they obtain. The algorithm works in an iterative way. At the start of each iteration, each ant is placed in a randomly chosen initial city. Starting from there, it moves from city to city, building a solution to the TSP. When choosing the next city to move to, an ant considers all cities that it has not visited yet. It picks one of these using the random-proportional rule given in equation 3.1. This rule calculates the probability $p_k(i, j)$ that ant k in city i chooses city j to move to next.

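$$
p_k(i, j) =
\begin{cases}
\dfrac{[\tau(i, j)]^{\alpha}\,[\eta(i, j)]^{\beta}}{\sum_{l \in N_i^k} [\tau(i, l)]^{\alpha}\,[\eta(i, l)]^{\beta}} & \text{if } j \in N_i^k \\[2ex]
0 & \text{otherwise}
\end{cases}
\tag{3.1}
$$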

In this equation, $N_i^k$ represents the set of cities that ant k has not yet visited when it reaches city i. η(i, j) is equal to 1/d(i, j), the inverse of the distance between i and j. It serves as a heuristic value that helps guide the construction of solutions. Under the rule of equation 3.1, the probability of choosing city j after city i increases when the pheromone between i and j is higher and when the distance between i and j is lower. The parameters α and β define the relative weight given to the pheromone and the distance heuristic, respectively, in the decision process: with β = 0, the decision is based purely on the pheromone, meaning the experience gathered in previous iterations, while with α = 0, the decision is based purely on the heuristic, so that AS comes down to a randomized greedy search.
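As a minimal sketch, the random-proportional rule of equation 3.1 could be coded as follows. The function names, the nested-list representation of τ and d, and the default values of `alpha` and `beta` are assumptions made for this illustration; they are not taken from the original AS implementation.

```python
import random
from typing import Dict, Iterable, Sequence


def transition_probabilities(
    i: int,
    unvisited: Iterable[int],
    tau: Sequence[Sequence[float]],
    dist: Sequence[Sequence[float]],
    alpha: float = 1.0,  # weight of the pheromone (assumed default)
    beta: float = 2.0,   # weight of the distance heuristic (assumed default)
) -> Dict[int, float]:
    """Return p_k(i, j) for every city j in the set of unvisited cities."""
    # Numerator of equation 3.1, with eta(i, j) = 1 / d(i, j).
    weights = {
        j: (tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
        for j in unvisited
    }
    if not weights:
        raise ValueError("no unvisited city left to choose from")
    total = sum(weights.values())  # denominator of equation 3.1
    return {j: w / total for j, w in weights.items()}


def choose_next_city(i, unvisited, tau, dist, alpha=1.0, beta=2.0, rng=random):
    """Sample the next city according to the probabilities of equation 3.1."""
    probs = transition_probabilities(i, unvisited, tau, dist, alpha, beta)
    cities = list(probs)
    return rng.choices(cities, weights=[probs[j] for j in cities], k=1)[0]
```

For instance, take an ant in city 0 with two candidate cities, where τ(0, 1) = 2, τ(0, 2) = 1, d(0, 1) = 4 and d(0, 2) = 2 (made-up values). With β = 0 the probabilities are 2/3 and 1/3, following the pheromone alone. With α = 0 and β = 1 they become 1/3 and 2/3, following the distances alone. With α = β = 1 both cities are equally likely.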

At the end of each iteration, the solutions constructed by the ants are evaluated, and the pheromone values are updated. Pheromone updating includes pheromone evaporation and pheromone deposition. Pheromone evaporation refers to the decrease of all pheromone values. It is done using equation 3.2.

$$
\tau(i, j) = (1 - \rho)\,\tau(i, j), \quad \forall (i, j) \in E
\tag{3.2}
$$

In this equation, 0 ≤ ρ ≤ 1 is the pheromone evaporation rate. Pheromone evaporation allows the algorithm to forget old solutions. Pheromone deposition refers to the increase of pheromone on edges that have been used in the solutions constructed by the ants. It is done using equation 3.3.

$$
\tau(i, j) = \tau(i, j) + \sum_{k=1}^{m} \Delta\tau^{k}(i, j), \quad \forall (i, j) \in E
\tag{3.3}
$$

In this equation, m is the total number of ants, and $\Delta\tau^{k}(i, j)$ is the inverse of the cost of the solution constructed by ant k if edge (i, j) is part of this solution, and is 0 otherwise. Pheromone deposition serves to reinforce good solutions.

The algorithm is run until a given number of iterations is reached, or until no solution improvement has been obtained for a number of iterations.
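The two update steps of equations 3.2 and 3.3 might be sketched as below. The function name `update_pheromone`, the in-place nested-list representation of τ, the default value of `rho`, and the `symmetric` flag are all assumptions for this illustration.

```python
from typing import List, Sequence


def update_pheromone(
    tau: List[List[float]],
    tours: Sequence[Sequence[int]],
    costs: Sequence[float],
    rho: float = 0.5,        # evaporation rate, 0 <= rho <= 1 (assumed default)
    symmetric: bool = True,  # mirror the deposit for symmetric instances
) -> None:
    """Apply evaporation (eq. 3.2), then deposition (eq. 3.3), in place."""
    n = len(tau)

    # Evaporation: every pheromone value decays by the factor (1 - rho).
    # Diagonal entries are not edges of the graph; scaling them is harmless.
    for i in range(n):
        for j in range(n):
            tau[i][j] *= 1.0 - rho

    # Deposition: ant k adds 1 / cost_k to every edge of its closed tour.
    for tour, cost in zip(tours, costs):
        delta = 1.0 / cost
        cities = list(tour)
        for a, b in zip(cities, cities[1:] + cities[:1]):
            tau[a][b] += delta
            if symmetric:
                tau[b][a] += delta
```

Because the deposit is the inverse of the tour cost, edges that belong to short tours, or to many tours, accumulate more pheromone than edges that are rarely used.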

AS was found to work well for small instances of the TSP, but failed to compete with state-of-the-art methods on large instances. However, its publication raised general interest in the approach, and inspired the development of a number of similar algorithms for the TSP that were more powerful and did manage to provide state-of-the-art performance. These algorithms include Elitist AS [85], Ant-Q [105], Ant Colony System (ACS) [84] and MAX-MIN AS (MMAS) [245]. All of these algorithms are based on the same basic principles as AS: a number of artificial ants each build their own solution to the TSP. They do this in a constructive way, starting from a random initial city and adding new cities until a full solution is reached. Each decision to add a new city is made stochastically, using probabilities that depend partly on a pheromone value and partly on a heuristic value. Pheromone is updated according to the quality of the solutions provided by the ants.

The main difference between these new algorithms and the original AS lies in the balance between the exploitation of information that has been learnt so far and the exploration of new possibilities. In general, AS’s successors exploit more aggressively. For example, in Elitist AS, pheromone updating is only done for the ants that found the best solutions in the current iteration, and for the best solution found so far over all past iterations. In ACS, exploitation is also increased by applying the selection rule for cities (equation 3.1) deterministically in a certain percentage of cases. AS’s successors also provide mechanisms to balance exploration and exploitation, so that the algorithms can be better tuned to the problem at hand. Apart from these differences, some of AS’s successors, namely ACS and MMAS, have been combined with local search: to each of the solutions found by the ants, a local search procedure is applied to bring it to a local optimum. Then, pheromone is updated based on the improved solution. This hybrid approach was found to be very powerful.
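The deterministic use of the selection rule mentioned for ACS can be illustrated with a short sketch. It is a simplification written for this text, not the ACS reference implementation: the function name, the parameter name `q0` for the fraction of deterministic choices, and all default values are assumptions.

```python
import random
from typing import Iterable, Sequence


def choose_next_city_acs(
    i: int,
    unvisited: Iterable[int],
    tau: Sequence[Sequence[float]],
    dist: Sequence[Sequence[float]],
    alpha: float = 1.0,
    beta: float = 2.0,
    q0: float = 0.9,  # assumed name: probability of a deterministic choice
    rng=random,
) -> int:
    """With probability q0 take the best-scoring city, otherwise sample."""
    # Same weights as the numerator of equation 3.1.
    weights = {
        j: (tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
        for j in unvisited
    }
    if not weights:
        raise ValueError("no unvisited city left to choose from")

    if rng.random() < q0:
        # Exploitation: deterministically pick the city with the largest weight.
        return max(weights, key=weights.get)

    # Exploration: fall back to the random-proportional rule.
    cities = list(weights)
    return rng.choices(cities, weights=[weights[j] for j in cities], k=1)[0]
```

Setting `q0` to 0 recovers the purely stochastic rule of AS, while values close to 1 make the ants follow the currently best-rated edges almost all the time.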

After the applications to the TSP, ACO has in recent years also been adapted to a wide range of other problems. These include the quadratic assignment problem [107, 184], the vehicle routing problem [106], the graph coloring problem [63], the shortest common supersequence problem [193], the multiple knapsack problem [168], the bin packing problem [99, 169], the 2D HP protein folding problem [239], etc. Most of these problems are very different from the TSP and have a structure that is much harder to reduce to a shortest path finding problem. ACO algorithms for these problems therefore often use quite different approaches, which is mainly reflected in the way pheromone is stored, updated and used. The core ideas always remain the same, however: to produce multiple parallel solutions in each iteration, using a high degree of stochasticity, and to store information about previously found good solutions in an artificial pheromone matrix, which is used to produce new solutions in subsequent iterations. Overviews of applications of ACO can be found in [34, 86, 87].

ACO has also been applied to networking problems. Some of these are off-line combinatorial problems, such as the problem of routing and wavelength allocation in an optical fibre network [202] or the problem of finding disjoint paths in a telecommunication network [267]. ACO algorithms for these problems follow a pattern similar to that of the other ACO algorithms for combinatorial problems described earlier. A very different problem is that of adaptive routing in telecommunication networks. This is an online dynamic problem: the properties of the problem change continuously and the optimization algorithm has to adapt its solution online. Applications of ACO for routing, such as Ant-Based Control (ABC) [235] and AntNet [71], are therefore very different from those for static off-line problems. The rest of this section is dedicated to the description of these algorithms.

3.2.3 AntNet: an ACO algorithm for routing in