Heuristics for general set covering problems

2 Review of Techniques for Driver Scheduling and Set covering

2.3 Heuristics for general set covering problems

Set covering problems are difficult zero-one optimization problems, which have been proven to be NP-complete (Garey and Johnson, 1979). They are encountered in a wide range of applications such as resource allocation (Revelle et al., 1970), crew scheduling (Rubin 1973; Smith and Wren, 1988), location of emergency services (Toregas et al., 1971), assembly line balancing (Salveson 1955), and simplification of Boolean expressions (Breuer 1970).
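For reference, the problem is commonly stated as the zero-one program below. The notation is assumed here rather than taken from the text: m rows, n columns, a cost c_j for column j, a_ij = 1 if column j covers row i (0 otherwise), and x_j = 1 if column j is selected.

```latex
\begin{align*}
\min \quad & \sum_{j=1}^{n} c_j x_j \\
\text{s.t.} \quad & \sum_{j=1}^{n} a_{ij} x_j \ge 1, \qquad i = 1, \dots, m, \\
& x_j \in \{0, 1\}, \qquad j = 1, \dots, n.
\end{align*}
```

The unicost case corresponds to c_j = 1 for all j; the non-unicost case allows arbitrary positive costs.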

Besides exact algorithms (Beasley 1987; Fisher and Kedia, 1990; Beasley and Jörnsten, 1992), there is an abundant literature dealing with heuristics, some of which are discussed in the following sections.

2.3.1 A GA by Beasley and Chu

Beasley and Chu (1996) used a GA for non-unicost set covering problems. In its chromosome representation, each gene position denotes one of the columns in the zero-one matrix, and has a value of 1 or 0 depending on whether the corresponding variable is or is not present in the solution. A crossover operator called ‘fusion’ is designed to combine two parent strings: the choice of which parent's gene value is passed to the child is based on the relative fitness of the two parents. For the mutation operator, they applied a variable mutation rate. At the early stage of the GA, the mutation rate is set low to allow minimal disruption. As the GA progresses, the mutation rate increases since the crossover operator becomes less effective. When the GA finally converges, the mutation rate settles at a constant value.
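A fitness-weighted crossover of this kind can be sketched as follows. This is a minimal illustration, not Beasley and Chu's code: the function name, the use of solution cost as fitness (lower is better), and the specific probability formula are assumptions made for the example.

```python
import random

def fusion_crossover(parent1, parent2, cost1, cost2, rng=random):
    """Combine two 0/1 strings, favouring the cheaper parent where they disagree.

    Assumption: fitness is the solution cost, so a lower value is better.
    """
    # The higher parent2's cost is relative to parent1's, the more likely
    # a disputed gene is taken from parent1.
    p_from_first = cost2 / (cost1 + cost2)
    child = []
    for g1, g2 in zip(parent1, parent2):
        if g1 == g2:
            # Genes on which both parents agree are copied unchanged.
            child.append(g1)
        else:
            child.append(g1 if rng.random() < p_from_first else g2)
    return child
```

For example, `fusion_crossover([1, 0, 1, 0], [1, 1, 0, 0], 3.0, 5.0)` always keeps the first and last genes and takes each disputed gene from the first parent with probability 5/8.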

The solutions generated by the crossover and mutation operators may be infeasible, i.e. some rows are not covered. To repair these infeasible solutions, Beasley and Chu presented a heuristic that could not only maintain the feasibility of the solution, but also provide an additional local optimisation step to make the GA more efficient.
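A repair step of this general shape is sketched below. It is a generic greedy scheme under stated assumptions, not the exact heuristic of Beasley and Chu: the cost-per-newly-covered-row ratio, the order in which redundant columns are dropped, and all names (`repair`, `columns`, `costs`) are illustrative choices.

```python
def repair(solution, columns, costs, n_rows):
    """Return a feasible copy of a 0/1 solution with redundant columns removed.

    columns[j] is the set of row indices covered by column j (assumed layout).
    """
    sol = list(solution)
    # cover_count[i] = number of selected columns that cover row i
    cover_count = [0] * n_rows
    for j, bit in enumerate(sol):
        if bit:
            for i in columns[j]:
                cover_count[i] += 1

    # Step 1: restore feasibility by greedily adding columns.
    uncovered = {i for i in range(n_rows) if cover_count[i] == 0}
    while uncovered:
        candidates = [j for j in range(len(sol))
                      if not sol[j] and columns[j] & uncovered]
        if not candidates:
            raise ValueError("some row is not covered by any column")
        # Pick the column with the lowest cost per newly covered row.
        best = min(candidates,
                   key=lambda j: costs[j] / len(columns[j] & uncovered))
        sol[best] = 1
        for i in columns[best]:
            cover_count[i] += 1
        uncovered -= columns[best]

    # Step 2: local optimisation, dropping redundant columns (most expensive first).
    for j in sorted(range(len(sol)), key=lambda j: -costs[j]):
        if sol[j] and all(cover_count[i] >= 2 for i in columns[j]):
            sol[j] = 0
            for i in columns[j]:
                cover_count[i] -= 1
    return sol
```

The second step is what provides the extra local optimisation: a column is removed only when every row it covers is still covered by another selected column, so feasibility is preserved.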

They tested the performance of the approach on a large set of randomly generated problems. Computational results showed that this heuristic can produce optimal solutions for small-size problems and high-quality solutions for large-size problems.

2.3.2 An Artificial Neural Network algorithm by Ohlsson et al.

Artificial Neural Networks (ANNs) have attracted much research during the past decades. Most of this work involves feed-forward architectures for pattern recognition or function approximation. However, ANNs can also be used for difficult combinatorial optimisation problems. This is usually done by first mapping the problem onto an energy function, and then finding configurations with low energy values by iterating a set of mean field equations.

Ohlsson et al. (2001) developed a mean field feedback ANN algorithm for the set covering problem. They used a multilinear penalty function to obtain a convenient encoding of the inequality constraints. An approximate energy minimum is reached by iterating a set of mean field equations, in combination with annealing. In contrast to most existing search and heuristic techniques, this ANN model is not based on exploratory search to find the optimal configuration. Rather, the neural units of the ANN find their way in a fuzzy manner through an interpolating and continuous space towards good solutions.
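One energy function consistent with this description is given below as an illustration. The exact form, the penalty weight α, the temperature T, and the mean field variables v_j (continuous relaxations of the binary s_j) are assumptions made for this sketch, not details quoted from Ohlsson et al.

```latex
\begin{align*}
E(s) &= \sum_{j=1}^{n} c_j s_j
        + \alpha \sum_{i=1}^{m} \prod_{j=1}^{n} \left(1 - a_{ij} s_j\right),
        \qquad s_j \in \{0, 1\}, \\
v_j &= \frac{1}{1 + \exp\!\left(\dfrac{1}{T}\,\dfrac{\partial E}{\partial v_j}\right)},
        \qquad
\frac{\partial E}{\partial v_j}
      = c_j - \alpha \sum_{i=1}^{m} a_{ij} \prod_{k \ne j} \left(1 - a_{ik} v_k\right).
\end{align*}
```

The product for row i equals 1 only when no selected column covers that row, so the penalty term counts uncovered rows while remaining linear in each individual s_j. Annealing then amounts to iterating the second equation while T is lowered gradually, which pushes each v_j towards 0 or 1.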

This algorithm has been tested on some very large-scale problems (up to 5000 rows and 10⁶ columns). Computational results show that this approach can produce results typically within a few percent of the optimal solutions, and that its execution speed is extremely fast.

2.3.3 A sophisticated GA by Solar et al.

In recent years, parallel GAs have been used to study how the interchange of genetic information between separate populations affects the final solution. Normally a parallel GA is implemented based on an “Island Model”, where separate and isolated sub-populations evolve independently and in parallel. Fit members occasionally migrate between sub-populations, which distributes good genetic material and helps to maintain genetic diversity. Exploring different regions of the solution space in this way can improve the search in terms of both computational time and solution quality.

Solar et al. (2002) presented a parallel GA model to solve the set covering problem. The chromosome representation used is the traditional one: a bit string with n bits, where n is the number of columns in the problem. Since new chromosomes generated by the genetic operators may violate some problem constraints, this representation does not guarantee feasibility. A feasibility operator is therefore designed to repair all infeasible solutions.

They proposed the following population scheme: independent populations are associated with nodes. Each node executes a single GA and creates a new local population. When all nodes are ready with new generations, each node sends its best local individual to the master node. The master node then selects the best individual received, and broadcasts it to all slave nodes. Each independent slave node replaces its worst local individual with the global best received. In other words, the interchange of information between parallel searches consists of selecting the global best individual, which replaces the worst individual of each node.
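The synchronisation step described above can be sketched in a few lines. This is a sequential illustration under stated assumptions: real nodes would run in parallel and exchange messages, and the names `migrate`, `populations`, and `cost` are hypothetical.

```python
def migrate(populations, cost):
    """Perform one master/slave exchange between island populations.

    populations is a list of lists of individuals; cost(individual) is
    assumed to return a value where lower is better.
    """
    # Each node sends its best local individual to the master.
    local_bests = [min(pop, key=cost) for pop in populations]
    # The master selects the best individual received.
    global_best = min(local_bests, key=cost)
    # Every node replaces its worst local individual with a copy of it.
    for pop in populations:
        worst_index = max(range(len(pop)), key=lambda k: cost(pop[k]))
        pop[worst_index] = list(global_best)
    return global_best
```

Calling `migrate` once per generation, after every island has produced its new local population, reproduces the scheme in which only the global best individual is exchanged.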

The parallel GA was tested on ten problems with up to 500 rows and 5000 columns. The final solutions obtained are not very satisfactory: the percentage deviations range from 3.3% to 10%, and an optimal value was reached only once in more than 1000 experiments.

2.3.4 A Lagrangian heuristic by Caprara et al.

A number of Lagrangian-based heuristics have been proposed for the set covering problem (Beasley 1990; Haddadi 1997; Caprara et al., 1999). The more recent work of Caprara et al. is introduced here; it consists of three phases: subgradient optimisation, heuristic, and column fixing.

The subgradient phase finds a near-optimal Lagrangian multiplier vector quickly, by means of an aggressive policy. The heuristic phase generates a sequence of near-optimal multiplier vectors, and computes a heuristic solution for each vector. The column fixing phase selects a subset of “good” columns and fixes the corresponding variables to 1. In this way an instance with a reduced number of columns and rows is obtained, on which the three-phase procedure is executed iteratively until the solution cannot be improved.
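The subgradient phase can be illustrated with the textbook scheme below. It is a minimal sketch, not the aggressive policy of Caprara et al.: the step-size rule, the `step_scale` parameter, and the data layout (`columns[j]` as the set of rows covered by column j) are assumptions made for the example.

```python
def lagrangian_bound(u, columns, costs):
    """Return (L(u), x): the Lagrangian lower bound for multipliers u >= 0
    and the 0/1 vector x that solves the relaxed problem."""
    # Reduced cost of column j: its cost minus the multipliers of its rows.
    reduced = [c - sum(u[i] for i in rows) for rows, c in zip(columns, costs)]
    x = [1 if r < 0 else 0 for r in reduced]
    value = sum(u) + sum(r for r in reduced if r < 0)
    return value, x

def subgradient_step(u, columns, costs, upper_bound, step_scale=0.1):
    """Return updated multipliers and the bound obtained at the old ones."""
    value, x = lagrangian_bound(u, columns, costs)
    covered = [0] * len(u)
    for j, rows in enumerate(columns):
        if x[j]:
            for i in rows:
                covered[i] += 1
    # Subgradient component for row i: positive when the row is uncovered.
    s = [1 - k for k in covered]
    norm = sum(g * g for g in s)
    if norm == 0:
        return list(u), value
    step = step_scale * (upper_bound - value) / norm
    # Multipliers must stay non-negative for a covering constraint.
    return [max(0.0, ui + step * g) for ui, g in zip(u, s)], value
```

Here `upper_bound` is the cost of any known feasible cover. Each multiplier vector visited also yields reduced costs that a heuristic phase could use to rank columns when building a cover.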

The algorithm was extensively tested on very large problems, involving up to 5,000 rows and 1,000,000 columns. In 92 of the 94 test instances, the optimal or best known solution was found quickly. Furthermore, among the 18 instances for which the optima are unknown, the solutions in 6 cases are better than the previous best known solutions.

2.3.5 A simulated annealing approach by Sen

Simulated annealing is a stochastic optimization technique based on an analogy from statistical mechanics, where a substance is reduced to its lowest energy configuration by a sequence of steps that involve alternate heating and cooling. Sen (1993) used simulated annealing for the set covering problem, in an approach that consists of the following four steps:

1) Encode the points in the solution space by using an n-bit string that represents the n columns. A value of 1 in the j-th string position means that the column j is chosen to be in the cover.

2) Formulate an evaluation function that evaluates the goodness of a point in the solution space.

3) Design a set of moves that can be used to alter the points in the solution space. Three types of moves are defined in the annealing scheme: the first and second involve adding a column to or removing a column from the chosen cover by flipping a randomly picked bit; the third involves replacing one column with another in the chosen cover by interchanging the values of two bit positions with different values.

4) Decide an annealing schedule, such as the setting of starting temperature, rule for temperature decrements, and the stopping criteria to halt the algorithm.
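The four steps above can be sketched as follows. This is an illustration under stated assumptions rather than Sen's implementation: the penalty-based evaluation function, the geometric cooling rule, the move probabilities, the all-columns starting point, and every parameter name are choices made for the example. It also assumes that each row is covered by at least one column.

```python
import math
import random

def simulated_annealing(columns, costs, n_rows, penalty, t_start=5.0,
                        cooling=0.95, moves_per_temp=50, t_end=0.01, seed=0):
    """Return (best feasible bit string found, its cost)."""
    rng = random.Random(seed)
    n = len(columns)

    def evaluate(bits):
        # Step 2: cost of the chosen columns plus a penalty per uncovered row.
        covered = set()
        for j in range(n):
            if bits[j]:
                covered |= columns[j]
        cost = sum(costs[j] for j in range(n) if bits[j])
        return cost + penalty * (n_rows - len(covered)), len(covered) == n_rows

    # Step 1: n-bit encoding; selecting every column gives a feasible start.
    current = [1] * n
    current_value, _ = evaluate(current)
    best, best_value = list(current), current_value

    # Step 4: geometric cooling from t_start down to t_end.
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = list(current)
            ones = [j for j in range(n) if candidate[j]]
            zeros = [j for j in range(n) if not candidate[j]]
            # Step 3: either swap a chosen column for an unchosen one,
            # or flip one random bit (which adds or removes a column).
            if ones and zeros and rng.random() < 1 / 3:
                candidate[rng.choice(ones)] = 0
                candidate[rng.choice(zeros)] = 1
            else:
                j = rng.randrange(n)
                candidate[j] = 1 - candidate[j]
            value, feasible = evaluate(candidate)
            delta = value - current_value
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_value = candidate, value
                if feasible and value < best_value:
                    best, best_value = list(candidate), value
        t *= cooling
    return best, best_value
```

Tracking the best feasible string separately means the returned solution is always a valid cover, even though the search itself is allowed to pass through infeasible points.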

This approach has only been implemented on some small size problems with good results. For large size problems, its performance would be difficult to estimate.

In document Fuzzy Evolutionary Approaches for Bus and Rail Driver Scheduling (Page 36-40)