3. Statistical Methods for the Analysis of Stochastic Optimisers
3.9. Landscape analysis
The methods of analysis described in the previous sections are useful for assessing the performance of algorithms but do not explain the search mechanisms underlying these algorithms. SLS methods are intriguing because they perform well where other mathematical models have difficulties, although they are based solely on intuitive rules whose validity is sometimes not clear. Particularly surprising is the effectiveness of local search, which is primarily responsible for the success of such methods. One main concern is how to generalise its behaviour from a few problems or from a few specific instances.
Studies to improve the understanding of SLS algorithms based on local search have received increasing interest in recent years. The search space characteristics of a number of standard problems have been investigated in depth, including the Satisfiability Problem (Frank et al., 1997), the Travelling Salesman Problem (Merz and Freisleben, 2000), the Job Shop Scheduling Problem (Watson et al., 2003; Watson, 2003), the Linear Ordering Problem (Schiavinotto and Stützle, 2004), and others (Merz, 2000). These studies focus on various properties of the search space which might have an impact on the performance of SLS algorithms.
The search process of a local-search-based algorithm applied to a problem instance I can be seen as a walk on a neighbourhood graph, G_I(S, N), induced by the neighbourhood structure N and the search space S. In this neighbourhood graph, vertices correspond to solutions s ∈ S and edges connect neighbouring solutions. Typically, the neighbourhood is symmetric.
Definition 3.1 A search landscape L_I(S, N, f) for a problem instance I corresponds to the neighbourhood graph G_I(S, N) determined by the search space S and the neighbourhood structure N, with a value f(s), f : S → ℝ, associated to each vertex s ∈ S. The value f(s) is called the vertex level.
The definition of fitness landscape dates back to Wright (1932) who studied the roles
is a space of possible genotypes, each genotype with a certain “fitness”, and the distribution of fitness values over the space of genotypes determines the fitness landscape. In optimisation, the role of genotypes is taken by the possible solutions to the problem and the evaluation function determines their fitness (for an in-depth study of the similarities between biology and optimisation we refer the reader to Goldberg, 1989).
The search landscape can be visualised through similarities with the natural features of a land surface. It may be a region more or less mountainous, with many peaks of high level flanked by steep ridges and precipitous cliffs falling to profound valleys. For the rest of this work, it may be useful to define formally the following classes of points that may be identified in a search landscape.
Definition 3.2 A plateau of level z, z ∈ ℝ, is a maximal connected subgraph P of L_I(S, N, f) such that f(p) = z for all p ∈ P. A plateau P is then:
• a local minimum if there does not exist any p ∈ P and s ∈ N(p) such that f(s) < f(p);
• a bench if there exist a p ∈ P and an s ∈ N(p) such that f(s) < f(p). In this case, the solution s is called a plateau exit.
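Definition 3.2 can be made operational with a breadth-first exploration of the plateau. The following is only a minimal sketch: the function names, the callables `neighbours` and `f`, and the toy path-graph landscape are assumptions introduced for illustration, and the neighbourhood is assumed to be symmetric.

```python
from collections import deque

def explore_plateau(start, neighbours, f):
    """Breadth-first exploration of the plateau that contains `start`.

    neighbours(s) returns the neighbours of s; f(s) returns the level of s.
    Returns the set of plateau vertices and the set of plateau exits.
    """
    level = f(start)
    plateau = {start}
    exits = set()
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for s in neighbours(p):
            if f(s) < level:
                exits.add(s)        # strictly better neighbour: a plateau exit
            elif f(s) == level and s not in plateau:
                plateau.add(s)      # same level: still on the plateau
                queue.append(s)
    return plateau, exits

def classify_plateau(start, neighbours, f):
    """Return 'bench' if the plateau has at least one exit, else 'local minimum'."""
    plateau, exits = explore_plateau(start, neighbours, f)
    kind = "bench" if exits else "local minimum"
    return kind, plateau, exits

# Hypothetical toy landscape: vertices 0..5 on a path, with the levels below.
levels = [3, 2, 2, 2, 1, 1]

def path_neighbours(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(levels)]

def level_of(i):
    return levels[i]

kind_a, plateau_a, exits_a = classify_plateau(1, path_neighbours, level_of)
kind_b, plateau_b, exits_b = classify_plateau(4, path_neighbours, level_of)
```

In this toy landscape the plateau of level 2 is a bench whose only exit is vertex 4, while the plateau of level 1 is a local minimum. The exploration visits every vertex of the plateau, so its cost grows with the plateau size.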
A study of the features of the search landscape, such as connectivity, solution density, distribution of local minima, ruggedness and plateau size, may be useful to (i) improve the current algorithm, (ii) explain the behaviour of SLS algorithms, (iii) understand the reasons why a problem is difficult or not, (iv) generalise the hardness of problem instances to larger instance classes, and (v) link a priori knowledge on the instance with characteristics of the search landscape and the consequent appropriate tuning of the algorithm. More generally, the ultimate goal of landscape analysis is to create a general theory of local search based on the relationship between search landscape features and SLS performance.
Determining all features of a search landscape requires the exhaustive enumeration of all possible solutions. Alternatively, in some cases where the search graph has particular properties, it might be possible to characterise some of the features of the landscape analytically, as pointed out by Grover (1992) and Dimitriou and Impagliazzo (1996). Typically, however, problems from real-world applications are complicated and such analyses are infeasible. In these cases, the techniques are based on approximations derived by sampling or on estimating landscape features through surrogate measures.
A variety of measures have been used in the literature for the approximate characterisation of the search space. We review some of the most important. However, what can be inferred from each single measure independently of the others is not clear. Validating or refuting intuitive relationships between landscape features and local search behaviour is still an open issue of landscape analysis.
Autocorrelation function. One important feature of the landscape is its “correlation structure”, that is, how similar neighbouring solutions in the search space are with respect to the evaluation function. A smooth landscape is one in which neighbouring points in the space have similar levels. Knowing the level of one point carries a lot of information about the level of the neighbouring points and the situation is favourable to local search procedures which base their search on the local information about the evaluation function. At the opposite extreme, a random landscape is one in which the level of neighbouring points is entirely uncorrelated. Knowing the fitness at one point would then
carry no information about the fitness of neighbouring points, and a local search based on the information of the evaluation function is expected to perform no differently from an uninformed random walk.
One possible way to determine the ruggedness of the landscape is by performing an explorative random walk in it. The walk starts from a randomly selected initial candidate solution, and at each step it moves to a randomly chosen neighbour; a walk of m steps results in a series of evaluation function values $(f_1, \ldots, f_m)$. In time series analysis, a measure to detect non-random trends in the data is the (empirical) autocorrelation function, which is defined as

$$r(i) = \frac{\sum_{j=1}^{m-i} (f_j - \bar{f})(f_{j+i} - \bar{f})}{\sum_{j=1}^{m} (f_j - \bar{f})^2} \qquad (3.21)$$

where $\bar{f} = \frac{1}{m} \sum_{j=1}^{m} f_j$. In optimisation, the value r(i) indicates the correlation between two points that are i steps apart in the random walk (Weinberger, 1990). Of main importance is r(1) because it captures the statistical dependency between the level of a point in the landscape and that of its direct neighbours: r(1) close to 1 corresponds to a smooth landscape, while r(1) close to 0 corresponds to a random landscape.
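The random walk and the empirical autocorrelation of Equation 3.21 can be sketched in a few lines. This is an illustrative sketch only: the function names, the callables `neighbours` and `f`, and the toy landscape (bit strings under one-bit flips, with the number of ones as level) are assumptions made for the example and are not taken from the studies cited above.

```python
import random

def random_walk_levels(start, neighbours, f, m, rng=random):
    """Record the levels f_1, ..., f_m visited by an uninformed random walk."""
    s = start
    levels = []
    for _ in range(m):
        levels.append(f(s))
        s = rng.choice(list(neighbours(s)))  # move to a uniformly chosen neighbour
    return levels

def autocorrelation(levels, i):
    """Empirical autocorrelation r(i) as in Equation 3.21.

    Undefined (division by zero) if all recorded levels are equal.
    """
    m = len(levels)
    mean = sum(levels) / m
    num = sum((levels[j] - mean) * (levels[j + i] - mean) for j in range(m - i))
    den = sum((x - mean) ** 2 for x in levels)
    return num / den

# Hypothetical toy landscape: bit strings of length 30, one-bit-flip neighbourhood.
def flip_neighbours(s):
    return [s[:j] + (1 - s[j],) + s[j + 1:] for j in range(len(s))]

rng = random.Random(0)                      # fixed seed for reproducibility
start = tuple(rng.randint(0, 1) for _ in range(30))
walk = random_walk_levels(start, flip_neighbours, sum, 2000, rng)
r1 = autocorrelation(walk, 1)               # nearest-neighbour correlation
```

Evaluating `autocorrelation(walk, i)` for a range of lags i gives the values needed for an autocorrelation plot.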
Clearly, the starting point has a strong influence on the character of the random walk. In order to generalise the results, the random walk must therefore be repeated several times so that a sample of the search space of reasonable size is collected. If the values of r(1) remain similar, the information provided can be used to describe the whole search landscape, provided that we assume the landscape to be regular at every point and in every direction (i.e., homogeneous and isotropic).
Descriptions beyond the first neighbourhood are, instead, rare in landscape analysis for optimisation (see Hordijk and Manderick, 1995, for the only example to our knowledge) but may be worth investigating. Insights may be obtained from autocorrelation plots, which are scatter plots of r(i) for different distance values i.
For some problems, the correlation structure of the landscape can be determined analytically. Stadler (1996) derives analytical results for the correlation length, defined as l = 1/|ln(r(1))|, for particular search landscapes that satisfy a certain difference equation with respect to neighbouring solutions, similar to wave equations in mathematical physics (Grover, 1992). Barnes et al. (2003) extend these results to a broader class of search landscapes. Some problems, such as travelling salesman, min-cut graph partitioning, graph colouring, and a version of the satisfiability problem, have this property. In particular, graph colouring, with an evaluation function that measures the number of edges that connect vertices with the same colour, has r(1) = 1 − 2k/((k − 1)n), where k is the number of colours and n the number of vertices (Stadler, 1996). This result entails that the correlation between nearest neighbours depends only on the number of colours and the size of the graph. Hence, all instances of graph colouring of the same size solved at the same k have a similar search landscape and the behaviour of local search is expected to be similar. Nevertheless, this contrasts with the observation, reported in Chapter 4, that local search methods may encounter different difficulties when solving graphs with different structure. Apparently, the autocorrelation function is not enough to explain the behaviour of SLS algorithms.
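To give a feeling for the magnitudes involved, the two expressions above can be evaluated numerically. The sketch below assumes that the graph colouring formula is read as r(1) = 1 − 2k/((k − 1)n); the function names and the values k = 3 and n = 100 are chosen only for illustration.

```python
import math

def gcp_r1(k, n):
    """Nearest-neighbour autocorrelation for graph colouring,
    assuming the reading r(1) = 1 - 2k / ((k - 1) n)."""
    return 1.0 - 2.0 * k / ((k - 1) * n)

def correlation_length(r1):
    """Correlation length l = 1 / |ln(r(1))|."""
    return 1.0 / abs(math.log(r1))

r1_example = gcp_r1(3, 100)                 # 3 colours, 100 vertices
l_example = correlation_length(r1_example)
```

Under this reading, r(1) approaches 1 as n grows for fixed k, so larger graphs appear smoother to a one-step random walk regardless of their edge structure.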
Fitness distance analysis. Fitness distance analysis focuses on the relation between solution quality and solution distances (Jones and Forrest, 1995). The relation is summarised by the correlation coefficient and is also often represented graphically by fitness-distance plots. The solutions considered may be randomly sampled or, more frequently,
are local optima. In the analysis of SLS algorithms, the focus is usually on samples of locally optimal solutions. Particularly interesting is then the relation between local optima and global optima. If optimal solutions are not available, best known solutions may be used in their place (in which case the results must be interpreted with caution). For minimisation problems, a large positive correlation coefficient indicates that the lower the evaluation function value, the closer the respective solutions are, on average, to a globally optimal solution. A value close to zero indicates that the evaluation function does not provide much guidance towards globally optimal solutions, while for negative correlations, the evaluation function is actually misleading.
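The correlation coefficient used in this analysis is the ordinary sample (Pearson) correlation between the evaluation function values of the sampled solutions and their distances to the closest globally optimal (or best known) solution. A minimal sketch follows; the function name and the small data set are hypothetical and serve only to illustrate the computation.

```python
import math

def fitness_distance_correlation(fitness, distance):
    """Pearson correlation between evaluation function values and distances
    to the closest globally optimal (or best known) solution.

    Undefined (division by zero) if either list is constant.
    """
    n = len(fitness)
    mean_f = sum(fitness) / n
    mean_d = sum(distance) / n
    cov = sum((a - mean_f) * (b - mean_d) for a, b in zip(fitness, distance))
    s_f = math.sqrt(sum((a - mean_f) ** 2 for a in fitness))
    s_d = math.sqrt(sum((b - mean_d) ** 2 for b in distance))
    return cov / (s_f * s_d)

# Hypothetical sample of four local optima of a minimisation problem.
sample_fitness = [10.0, 12.0, 15.0, 19.0]
sample_distance = [2, 5, 6, 9]
fdc = fitness_distance_correlation(sample_fitness, sample_distance)
```

For a minimisation problem, a value of `fdc` close to 1 on a real sample would indicate that better solutions tend to lie closer to the optimum.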
The results of fitness distance analysis have an impact on the design of SLS algorithms. Indeed, highly correlated search landscapes suggest that the use of intensification strategies in SLS algorithms leads to good performance, while strong diversification may be useful in cases of weak correlation. Cases of negative correlation may instead suggest that the use of local search is not appealing and that different solution methods should be considered. Moreover, fitness-distance correlation might also be used to evaluate different neighbourhoods, giving preference to those that allow higher correlation.
Fitness distance correlation alone, however, is never enough to account for differences in the difficulty of individual instances. Moreover, discordant comments on its interpretation are reported in the literature (see Naudts and Kallel, 2000, for a discussion on the limitations of this approach).
Local optima localisation. The notion of local optima is crucial to SLS algorithms. SLS algorithms search for local optima, explore them and try to exit from them. Characterising the location of local optima in the search space is, therefore, useful to unveil the difficulties inherent in a search landscape. A sample of local optima may be obtained by a relatively simple SLS algorithm. Two distributions of distances may then be considered: the pairwise distances within the set of local optima, or the distances between local optima and the closest optimal solutions. In both cases, the range of observed distances and the modes of the distribution reflect important properties of the relative placement of local optima across the landscape. For example, a multi-modal distribution of pairwise local optima distances reveals the concentration of local minima in a number of clusters, where the lower modes correspond to the intra-cluster distances and the higher modes represent the inter-cluster distances (see Paquete et al., 2004). In some other cases it was shown that the average distance of local optima from the nearest optimal solution gives rise to the most significant model to predict the hardness of an instance (Watson et al., 2003).
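The two distance distributions mentioned above can be collected as follows. The sketch assumes, purely for illustration, that solutions are equal-length sequences and that the Hamming distance is an appropriate measure; for other representations a different `dist` function would have to be supplied. All names and data are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions in which two equal-length solutions differ."""
    return sum(x != y for x, y in zip(a, b))

def pairwise_distances(local_optima, dist=hamming):
    """Distances between all unordered pairs of sampled local optima."""
    return [dist(a, b) for a, b in combinations(local_optima, 2)]

def distances_to_closest_optimum(local_optima, global_optima, dist=hamming):
    """Distance from each local optimum to its nearest globally optimal solution."""
    return [min(dist(s, g) for g in global_optima) for s in local_optima]

# Hypothetical sample of local optima and a single known optimal solution.
sample = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1)]
optima = [(0, 0, 0, 0)]
pairwise = pairwise_distances(sample)
to_optimum = distances_to_closest_optimum(sample, optima)
```

A histogram of `pairwise` is what would reveal the multi-modal structure described in the text, and the mean of `to_optimum` is the average distance to the nearest optimal solution.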
Clearly, the crucial step in this analysis is computing the correct distances between solutions. For many neighbourhoods and solution representations, finding the minimal distance between two solutions given a move operator is an NP-hard problem and approximate measures may become necessary. The indications provided by the approximate measures are then reliable only if they are highly correlated with the exact value. However, Schiavinotto and Stützle (2005) observe that for some problems represented by permutations the exact distance value can be computed efficiently, and that the use of an approximate value is unjustified. In addition, it must be observed that distances between pairs of solutions peak at a value which is typical of the move operator, and the analysis of results should distinguish observations determined by the algorithm from those which are a typical effect of randomly sampling the search landscape.
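As one illustration of an exact distance on permutations that is cheap to compute, consider a move operator that swaps the elements at two arbitrary positions. This operator is chosen here only as an example and is not necessarily one of the cases treated in the work cited above. The minimum number of such swaps needed to transform one permutation into another equals n minus the number of cycles of the permutation that maps each element to its target position, which can be counted in linear time. The function name is hypothetical.

```python
def swap_distance(p, q):
    """Minimum number of swaps of two positions needed to turn p into q.

    p and q are sequences containing the same distinct elements.
    The distance equals n minus the number of cycles of the mapping
    position i -> position that element p[i] occupies in q.
    """
    n = len(p)
    position_in_q = {value: index for index, value in enumerate(q)}
    target = [position_in_q[value] for value in p]
    seen = [False] * n
    cycles = 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:      # follow the cycle that contains i
                seen[j] = True
                j = target[j]
    return n - cycles

d_identity = swap_distance([0, 1, 2, 3], [0, 1, 2, 3])
d_one_swap = swap_distance([1, 0, 2, 3], [0, 1, 2, 3])
d_three_cycle = swap_distance([1, 2, 0], [0, 1, 2])
```

Other move operators on permutations induce different distances, so the measure must always match the neighbourhood actually used by the local search.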
Number of local and global optima. Another characteristic of local optima that could be taken into account is their number. It is reasonable to assume a positive correlation between the number of local optima and the hardness of solving a problem instance by local search. Indeed, in the extreme case that all local optima are global optima, local search would always find a solution to the problem while, in all other cases, the chance that local search ends in a local optimum that is not a global one increases with the number of local optima.
In general, neighbourhoods that entail fewer local optima or local optima of better quality should be preferred. The analysis of these two features, possibly even from a theoretical point of view, may be relevant for the selection of the neighbourhood structure. However, when more complex SLS algorithms are introduced, the impact of the number of local optima on this issue becomes less clear.
Finally, the number of global optima in the search space is also indicative of the effort for local search to locate them. Intuitively, the lower this number is, the harder it should be to reach optimal solutions. In contrast, with many optimal solutions spread over the search landscape, random restart algorithms are likely to perform well.
Backbone size. The backbone of a problem instance is the set of solution components that maintain identical values in all optimal solutions of the instance (Monasson et al., 1999). The fraction of solution components appearing in the backbone determines the backbone size. Studies, above all on the SAT problem, showed that the backbone size is correlated with the search cost of locating solutions. The SAT problem is known to exhibit a sharp transition from satisfiable to unsatisfiable instances, and this phase transition can be characterised well by a few features of the instances. It has been noted that instance difficulty follows a common easy-hard-easy pattern around this phase transition, and this has stimulated researchers to investigate the reasons why instances become harder. It has also been noted that backbones rapidly pass from small to large size close to that region, but at the same time the number of solutions drastically decreases (Achlioptas et al., 2000). The backbone size appears, therefore, to carry information that is redundant with the number of optimal solutions. This conjecture was confirmed on the Job Shop Scheduling Problem by Watson et al. (2003).
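When the set of optimal solutions of a small instance is known, the backbone and its size follow directly from the definition. The sketch below assumes that solutions are equal-length sequences of component values, as in a SAT assignment; the function names and the three example assignments are hypothetical.

```python
def backbone(optimal_solutions):
    """Components that take the same value in every optimal solution.

    Returns a dict mapping component index to its fixed value.
    """
    first, *rest = optimal_solutions
    return {i: v for i, v in enumerate(first)
            if all(s[i] == v for s in rest)}

def backbone_size(optimal_solutions):
    """Fraction of solution components that belong to the backbone."""
    n = len(optimal_solutions[0])
    return len(backbone(optimal_solutions)) / n

# Hypothetical instance with four Boolean variables and three optimal assignments.
solutions = [(1, 0, 1, 1), (1, 1, 1, 0), (1, 0, 1, 0)]
fixed = backbone(solutions)
size = backbone_size(solutions)
```

Adding further optimal solutions can only shrink the backbone, which illustrates why backbone size and the number of optimal solutions tend to carry overlapping information.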
Plateaux. In order to escape from a plateau, a local search procedure has either to find an exit or to accept a move to a worse neighbour. The characterisation of the plateaux, that is, the determination of the fraction of plateaux that are benches or local minima, might therefore help to decide which strategy to adopt. Intuitively, however, the larger a plateau is, the more costly it is to find an exit. The size of plateaux, therefore, has an impact on the design of SLS algorithms. With small plateaux it might be worth spending some iterations moving within the plateau in search of an exit; with large plateaux, instead, it might be better to diversify the search early by jumping to other regions of the search space. To determine its size, the plateau has to be exhaustively explored. In this case sampling does not help, and the exploration of plateaux may be achieved with the standard search techniques introduced in Section 2.3.3.
Connectivity. A property of the search landscape which is of chief importance for local search, above all in highly constrained problems, is its connectivity.