Decomposition Techniques - Constraint Based Job Dispatching

2.2 Constraint Based Job Dispatching

2.2.6 Decomposition Techniques

Decades of research have produced CP algorithms that are very good at solving small and medium-sized instances. The main strengths of CP are strong inference methods and powerful search heuristics: the former quickly detect infeasibility, while the latter guide the search towards areas of the search space that are likely to contain solutions. Constraint Programming deals with optimization problems using branch and bound techniques, with the cost represented by an objective variable: each new solution is required to have a lower cost than the previous one, which is enforced through a constraint placed on the objective variable. However, these methods risk being ineffective on larger problem instances. For instance, allocation and scheduling under constrained resources in HPC systems may involve problems with thousands of variables (or even more), depending on the number of resources hosted in the machine (as mentioned before, supercomputers can have tens of thousands of processing units) and on the number of tasks submitted by users (up to thousands per day in larger systems). The sheer complexity of many problems renders even the most advanced CP methods ineffective unless an alternative strategy is applied.

To face this issue, in the last few years many works have studied approaches that decompose large-scale problems: the original problem is divided into subproblems whose individual solutions can be found with relatively low computational effort. Decomposition techniques have also proved useful for smaller-scale (but hard) problems, often because, while there is no efficient way to cope with the original problem, its components can be solved with specialized and effective algorithms. For example, Benini et al. [BBGM05] describe a method for allocation and scheduling in Multi-Processor Systems on Chip. The key idea is to decompose the problem into a scheduling component, solved through Constraint Programming, and an allocation component, solved with Integer Linear Programming.
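The branch and bound scheme mentioned above, in which every new solution must cost strictly less than the incumbent, can be illustrated with a minimal Python sketch. It is not taken from the works cited here: the toy problem (assigning tasks to identical machines to minimise the makespan), the function name and the data are assumptions chosen only for illustration, and a plain depth-first search stands in for a real CP solver.

```python
def branch_and_bound(durations, n_machines):
    """Assign each task to a machine, minimising the makespan (toy problem)."""
    best = {"cost": float("inf"), "assignment": None}

    def search(task, loads, assignment):
        # Bounding constraint on the objective: the (partial) makespan must be
        # strictly lower than the cost of the best solution found so far.
        if max(loads) >= best["cost"]:
            return
        if task == len(durations):
            # Complete assignment that passed the bound: new incumbent.
            best["cost"] = max(loads)
            best["assignment"] = assignment[:]
            return
        for machine in range(n_machines):
            loads[machine] += durations[task]
            assignment.append(machine)
            search(task + 1, loads, assignment)
            assignment.pop()
            loads[machine] -= durations[task]

    search(0, [0] * n_machines, [])
    return best["cost"], best["assignment"]
```

Calling `branch_and_bound([3, 3, 2, 2, 2], 2)` explores the assignment tree and tightens the bound each time a better schedule is found; the exhaustive tree is exactly what becomes impractical on large instances.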

An important component of many decomposed approaches is Local Search. This term covers a large class of meta-heuristics commonly used for computationally hard optimization problems [GH06, BPW+12, MA04, AEOP02, LMS03]. The typical approach starts from an initial feasible solution and applies small changes to it in order to produce a new one; if the new solution is better than the old one, it becomes the new current solution. New local moves are then applied iteratively to further explore the solution space until a fixed point or a time limit is reached. The main drawback of this approach stems from the local nature of the moves: if the current solution is a local minimum (assuming a minimization problem), moves to neighbouring solutions may never escape it, and the algorithm may fail to find the global minima. To overcome this limitation many extensions have been proposed, such as introducing randomization [NW07, Len97] or preventing the search from going back to the previous solution even if it was better than the new one (Tabu Search [GL97]). Delving into the depths of local search is outside the scope of this work, but the interested reader can find a good starting point in the book by Aarts et al. [AL97].
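The basic local search loop and its local-minimum problem can be sketched in a few lines of Python. This is only an illustrative sketch: the one-dimensional `landscape`, the helper names and the random-restart strategy are assumptions made for the example and are not taken from the cited works.

```python
import random

def hill_climb(cost, start, neighbours):
    """Move to the best neighbour until no neighbour improves (fixed point)."""
    current = start
    while True:
        best = min(neighbours(current), key=cost)
        if cost(best) >= cost(current):
            return current  # local minimum: no improving local move exists
        current = best

def random_restart(cost, neighbours, domain, restarts, seed=0):
    """Simple randomised extension: run hill climbing from random start points."""
    rng = random.Random(seed)
    starts = [rng.choice(domain) for _ in range(restarts)]
    return min((hill_climb(cost, s, neighbours) for s in starts), key=cost)

# Hypothetical cost landscape: index 1 is a local minimum, index 6 the global one.
landscape = [4, 2, 3, 5, 6, 1, 0, 3]

def cost(i):
    return landscape[i]

def neighbours(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
```

Starting from index 0, `hill_climb` stops at index 1 (cost 2) and never reaches the global minimum at index 6; starting from index 4 it does reach it. Randomised restarts make it more likely that some run begins in the basin of the global minimum.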

2.2.6.1 Benders Decomposition

A very common decomposition technique, well suited for scheduling and allocation problems, is Benders Decomposition. The classical Benders Decomposition method [Geo72, BM91] decomposes a problem into two loosely connected subproblems. It enumerates values for the connecting variables and, for each set of values enumerated, solves the subproblem that results from fixing the connecting variables to these values. The solution of the subproblem generates a Benders cut that the connecting variables must satisfy in all subsequent solutions. The process continues until the master problem and the subproblem converge to the same value. The idea is to “learn from mistakes”: Benders cuts eliminate superfluous solutions. The classical Benders cut is a linear inequality based on Lagrange multipliers obtained from a solution of the subproblem dual. The classical Benders approach, however, requires the subproblem to be a continuous linear or nonlinear programming problem.
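For the linear case, the classical cut described above can be written out explicitly. The notation below is an assumption introduced for illustration (it does not appear in the original text): y denotes the connecting variables, x the continuous subproblem variables, u the dual multipliers and z the master estimate of the total cost.

```latex
\begin{align*}
\text{Original problem:}\quad & \min_{x,\,y}\; c^{\top}x + f(y)
  \quad \text{s.t.}\quad Ax + By \ge b,\; x \ge 0,\; y \in Y \\
\text{Subproblem for fixed } \bar{y}\text{:}\quad & \min_{x \ge 0}\; c^{\top}x
  \quad \text{s.t.}\quad Ax \ge b - B\bar{y} \\
\text{Subproblem dual:}\quad & \max_{u \ge 0}\; u^{\top}(b - B\bar{y})
  \quad \text{s.t.}\quad A^{\top}u \le c \\
\text{Optimality cut:}\quad & z \ge f(y) + \bar{u}^{\top}(b - By) \\
\text{Feasibility cut:}\quad & \bar{r}^{\top}(b - By) \le 0
\end{align*}
```

Here \bar{u} is an optimal solution of the subproblem dual, and \bar{r} is an unbounded ray of the dual, obtained when the subproblem is infeasible for \bar{y}. The master problem minimises z over y \in Y subject to all cuts collected so far; its optimal value is a lower bound, while f(\bar{y}) plus the subproblem value is an upper bound, and the procedure stops when the two coincide.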

Logic-Based Benders Decomposition (LBBD) [HO03] is an extension of the traditional scheme that enables generic solvers to be used as subproblem solvers. LBBD can be applied to any class of optimization problem, but a method to generate Benders cuts must be identified for each class, and this is usually not a trivial task. LBBD has been applied in numerous domains; in particular, it has had great success with planning and scheduling problems [Hoo07], and it has also been used in conjunction with Constraint Programming [EW01]. In [FZB09] LBBD is used to solve a location-allocation problem, i.e. deciding where to locate a set of facilities and how to allocate clients to them, and the results show that it outperforms the traditional ILP approach. LBBD has also been applied to scheduling problems [Hoo05a, Hoo05b, TB12]; for example, Canto [Can08] uses LBBD to schedule preventive maintenance activities in a power plant.

A great advantage of LBBD is the possibility of using heterogeneous techniques to solve the master problem and the subproblems, which can lead to very efficient solution approaches. The main limitation of LBBD-based approaches lies in the difficulty of generating effective cuts; weak cuts lead to an inefficient exploration of the search space and slow convergence towards the optimal solution, especially on large-scale problems. Another disadvantage of LBBD is the risk of losing valuable information when master and subproblems are decoupled (e.g. if master and subproblem variables are connected by tight constraints).
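The master/subproblem loop of LBBD can be sketched as follows. This is a toy illustration under stated assumptions, not the method of any cited paper: the master problem (a cost-minimising assignment of tasks to machines) is solved by brute-force enumeration as a stand-in for an ILP solver, the subproblem (does each machine finish its tasks within a common deadline?) stands in for a CP scheduler, and the cuts are simple no-goods. The function name and data layout are hypothetical.

```python
from itertools import product

def lbbd(costs, durations, deadline):
    """costs[t][m]: cost of running task t on machine m (hypothetical data)."""
    n_tasks, n_machines = len(costs), len(costs[0])
    cuts = []  # each cut (m, tasks): the tasks in `tasks` must not all run on m

    def violates(assignment):
        return any(all(assignment[t] == m for t in tasks) for m, tasks in cuts)

    while True:
        # Master problem: cheapest assignment satisfying all cuts found so far.
        candidates = [a for a in product(range(n_machines), repeat=n_tasks)
                      if not violates(a)]
        if not candidates:
            return None  # the cuts exclude every assignment: infeasible
        assignment = min(
            candidates,
            key=lambda a: sum(costs[t][m] for t, m in enumerate(a)),
        )
        # Subproblem: check the schedule of each machine separately.
        new_cuts = []
        for m in range(n_machines):
            tasks = frozenset(t for t in range(n_tasks) if assignment[t] == m)
            if sum(durations[t] for t in tasks) > deadline:
                new_cuts.append((m, tasks))  # no-good cut learned from the failure
        if not new_cuts:
            total = sum(costs[t][m] for t, m in enumerate(assignment))
            return assignment, total
        cuts.extend(new_cuts)
```

Because the master problem is a relaxation of the full problem and the subproblem only checks feasibility, the first master solution accepted by the subproblem is optimal. The no-good cuts used here are weak, since each one removes few assignments, which illustrates why cut quality drives the convergence speed.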

2.2.6.2 Large Neighborhood Search

A method extensively studied in the literature and much appreciated for its practical effectiveness is Large Neighborhood Search (LNS) [Sha98], a framework that combines the search power of CP with the scalability of local search. As in local search, we start from an initial solution of the problem (which can be found through standard CP search or faster heuristics) and then modify it. However, instead of making small changes to the solution, as is typical with local search move operators, a subset of the problem variables is selected. These variables are unassigned (relaxed) while the remaining ones are fixed to their values in the current solution; the search for an improving solution then proceeds by reassigning only the relaxed variables. The search strategy used to find a new assignment for the relaxed variables can be chosen to best fit the problem under consideration. Three aspects are crucial to the quality of LNS: 1) the fragment selection procedure (which variables are relaxed), 2) the fragment size and 3) the search limit, i.e. the stopping criterion applied to the search for a new solution over the relaxed variables. There is no consensus yet on the best way to address these points, and they are the subject of considerable research effort.
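The relax-and-reoptimise loop described above can be sketched in Python. The sketch rests on several assumptions made only for illustration: the toy problem is again the assignment of tasks to identical machines to minimise the makespan, fragments are selected uniformly at random, and the relaxed variables are re-optimised by exhaustive enumeration instead of a CP search with a limit. All names are hypothetical.

```python
import random
from itertools import product

def makespan(assignment, durations, n_machines):
    loads = [0] * n_machines
    for task, machine in enumerate(assignment):
        loads[machine] += durations[task]
    return max(loads)

def lns(durations, n_machines, fragment_size=3, iterations=200, seed=0):
    rng = random.Random(seed)
    current = [0] * len(durations)  # initial solution: every task on machine 0
    best_cost = makespan(current, durations, n_machines)
    for _ in range(iterations):
        # 1) fragment selection and 2) fragment size: relax a random task subset.
        fragment = rng.sample(range(len(durations)), fragment_size)
        # 3) search over the relaxed variables only; the others stay fixed.
        for values in product(range(n_machines), repeat=fragment_size):
            candidate = current[:]
            for task, machine in zip(fragment, values):
                candidate[task] = machine
            cost = makespan(candidate, durations, n_machines)
            if cost < best_cost:  # accept only improving solutions
                current, best_cost = candidate, cost
    return current, best_cost
```

For example, `lns([3, 3, 2, 2, 2], 2)` repeatedly relaxes three of the five tasks. With `fragment_size` equal to the number of tasks the neighbourhood becomes the whole search space, which shows the trade-off controlled by the fragment size: larger fragments give better moves but a more expensive search at each step.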

LNS has proved to be a very effective tool for solving complex optimization problems. However, applying LNS to real-world problems still requires a great deal of domain knowledge, since heuristics that select effective neighborhoods must be discovered for each problem class. Carchrae et al. [CB09] show how to reduce the required expertise by using adaptive techniques to create algorithms that adjust their behaviour to suit the problem instance being solved. With a similar purpose, Laborie et al. [LG] present an approach called Self-Adapting Large Neighborhood Search, which combines Large Neighborhood Search with a portfolio of neighborhood selection and completion strategies, together with Machine Learning techniques, to converge on the most efficient way to solve the target problem. The reinforcement learning scheme, although quite simple, ensures quick convergence on the most effective neighborhoods, search strategies and associated parameter values, and is a key factor in the robustness of the approach.
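One simple way to adapt the choice of neighborhood during the search is to keep a weight per neighborhood and update it from observed rewards. The sketch below is a generic illustration of that idea and is not the scheme of [CB09] or [LG]: the class name, the reward definition (improvement per unit of run time), the exponential update and the learning rate are all assumptions.

```python
import random

class NeighbourhoodPortfolio:
    """Weighted portfolio of neighbourhood heuristics (hypothetical design)."""

    def __init__(self, names, learning_rate=0.2, seed=0):
        self.weights = {name: 1.0 for name in names}
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def select(self):
        # Roulette-wheel selection: probability proportional to current weight.
        names = list(self.weights)
        weights = [self.weights[n] for n in names]
        return self.rng.choices(names, weights=weights)[0]

    def reward(self, name, improvement, runtime):
        # Assumed reward: cost improvement obtained per unit of run time.
        r = improvement / runtime if runtime > 0 else 0.0
        old = self.weights[name]
        new = (1 - self.learning_rate) * old + self.learning_rate * r
        # Keep a small positive floor so every neighbourhood stays selectable.
        self.weights[name] = max(new, 1e-6)
```

In an LNS loop, `select()` would pick the neighborhood for the next iteration, and `reward()` would be called with the improvement it achieved and the time it took, so that neighborhoods that pay off are chosen more often as the search progresses.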

Godard et al. [GLNI] use an LNS technique to solve cumulative scheduling problems, where a resource may execute several activities in parallel provided its capacity is not exceeded. The technique relies on a general approach based on computing a partial-order schedule from a fixed start time schedule. A partial-order schedule (POS) for a problem P is a graph whose nodes are the activities of P and whose edges represent temporal constraints between pairs of activities, such that any possible temporal solution is also a consistent assignment [PSCO04]. In the context of LNS, POSs provide a very powerful way to inject flexibility into the schedule while preserving interesting features from one solution to the next. Danna et al. [DP03] consider the job-shop scheduling problem with earliness and tardiness costs, i.e. jobs should be scheduled exactly at certain times in order not to incur “penalties”. The paper compares two approaches to this problem: an LNS-based method tailored to the problem, and a form of LNS called Relaxation Induced Neighborhood Search, a generic and unstructured algorithm that relies only on a continuous relaxation of the Mixed-Integer Programming model of the problem to define its neighborhood.

Palpant et al. [PAM04] propose a technique to overcome scale issues in resource scheduling problems. They introduce a method that combines local search with subproblem exact resolution (LSSPER). The method can be seen as a hybrid scheme: each step fixes a part of the current solution, while the remaining part defines a subproblem that is solved by a heuristic or exact approach. The key factor of the method is the choice of the subproblem to be optimized

In document Power-Aware Job Dispatching in High Performance Computing Systems (Page 50-53)