Experiments - Multi-objective tools for the vehicle routing problem with time windows

In this section, we describe the settings used in our experiments. Our efforts focus on comparing different ranking approaches. For this purpose, we implemented a canonical Discrete Particle Swarm Optimisation (DPSO) inspired by [54]. This algorithm was intentionally simple. At each generation, a particle could perform only one of four types of move: one inertial move and three follower-attractor moves. The inertial move involved only the moving particle, which underwent a mutation operation. Each follower-attractor move involved two particles and applied a crossover operation between the moving particle (follower) and one of three positions (attractor): 1) the best personal position $b_i$, 2) the best position achieved by the swarm so far $g$, and 3) the best position in the neighbourhood of the moving particle at the current generation $g_i$.

All particles within the swarm were initialised with random solutions (route-plans). In addition, only two operators were implemented: 1) an inter-route operator that exchanges pairs of customers within a route-plan (for inertial moves), and 2) an operator that copies an entire sub-route from one route-plan to another and then removes duplicates (for all other moves).
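The four moves and two operators described above can be sketched as follows. This is a minimal illustration and not the implementation used in the experiments. It assumes that a route-plan is a list of routes, each a list of customer identifiers, that the copied sub-route is a whole route, and that the inertial move is chosen with the same probability (0.25) as each attractor. All function names are hypothetical.

```python
import random

def inertial_move(plan, rng):
    """Mutation: exchange one pair of customers within the route-plan."""
    new_plan = [route[:] for route in plan]
    slots = [(r, i) for r, route in enumerate(new_plan) for i in range(len(route))]
    if len(slots) >= 2:
        (r1, i1), (r2, i2) = rng.sample(slots, 2)
        new_plan[r1][i1], new_plan[r2][i2] = new_plan[r2][i2], new_plan[r1][i1]
    return new_plan

def follower_attractor_move(follower, attractor, rng):
    """Crossover: copy one route from the attractor, then remove duplicates."""
    candidates = [route for route in attractor if route]
    if not candidates:
        return [route[:] for route in follower]
    copied = list(rng.choice(candidates))
    taken = set(copied)
    # Drop the copied customers from the follower's own routes.
    remainder = [[c for c in route if c not in taken] for route in follower]
    return [copied] + [route for route in remainder if route]

def move(particle, personal_best, swarm_best, neighbourhood_best, rng):
    """Pick one of the four moves uniformly (assumption: 0.25 each)."""
    kind = rng.randrange(4)
    if kind == 0:
        return inertial_move(particle, rng)
    attractor = (personal_best, swarm_best, neighbourhood_best)[kind - 1]
    return follower_attractor_move(particle, attractor, rng)
```

Provided the follower and the attractor serve the same set of customers, both operators keep every customer in the plan exactly once, so a move never produces an incomplete route-plan.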

The probability of each attractor was set to 0.25. In our simulations, the swarm was formed by 50 particles evolving for 2000 iterations. This algorithm was applied to the Vehicle Routing Problem with Time Windows (VRPTW) using Solomon's dataset [75]. These instances are divided into three classes: C1XX (customers positioned in clusters), R1XX (customers randomly spread) and RC1XX (some customers forming clusters and others randomly positioned).

Regarding the DPSO implementation, two operators are used to move the particles within the swarm. A crossover operator is used to move particles towards other particles' locations. This operator copies a random route from an attractor to the moving particle. A mutation operator is used to encourage local exploration within route-plans. In the experiments, this operator exchanges customers from one route to another within the route-plan in the solution of the moving particle.

In order to assess the performance of each ranking approach, eight minimisation objectives were considered: Number of Vehicles ($Z_{nv}$); Travel Time ($Z_{tt}$), or elapsed time of the route-plan; Waiting Time ($Z_{wt}$), or the total time drivers must wait after early arrivals; Travel Distance ($Z_{td}$), or length of the whole route-plan; Time Window Violation ($Z_{twv}$), or sum of the lateness of all arrivals; Number of Time Window Violations ($Z_{ntwv}$), or number of customers not served within the appropriate time; Capacity Violation ($Z_{cv}$), or amount by which vehicle capacity is exceeded; and Number of Capacity Violations ($Z_{ncv}$), or number of vehicles whose capacity is exceeded. Reductions in violations are treated as objectives in this study. In this way, we leave it to the decision maker to decide whether it is convenient to serve customers outside their time windows or to exceed the capacity of some vehicles.
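To make the eight objectives concrete, the sketch below evaluates them for a small route-plan. It is an illustration under stated assumptions, not the evaluation code used in the experiments: travel time is taken to equal Euclidean distance, service at a late customer starts on arrival (soft time windows), the elapsed time of the route-plan is taken as the sum of the times at which vehicles return to the depot, and node 0 is the depot. The `Node` class, the `evaluate` function and the sample data are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    x: float
    y: float
    demand: float = 0.0
    ready: float = 0.0    # start of the time window
    due: float = 0.0      # end of the time window
    service: float = 0.0  # service duration

def evaluate(plan, nodes, capacity):
    """Return the eight objective values of a route-plan as a dict."""
    z = dict(nv=0, tt=0.0, wt=0.0, td=0.0, twv=0.0, ntwv=0, cv=0.0, ncv=0)
    depot = nodes[0]
    for route in plan:
        if not route:
            continue
        z["nv"] += 1
        clock, load, prev = 0.0, 0.0, depot
        for cid in route:
            c = nodes[cid]
            leg = math.hypot(c.x - prev.x, c.y - prev.y)
            z["td"] += leg
            clock += leg
            if clock < c.ready:        # early arrival: the driver waits
                z["wt"] += c.ready - clock
                clock = c.ready
            elif clock > c.due:        # late arrival: time window violated
                z["twv"] += clock - c.due
                z["ntwv"] += 1
            clock += c.service
            load += c.demand
            prev = c
        back = math.hypot(depot.x - prev.x, depot.y - prev.y)
        z["td"] += back
        z["tt"] += clock + back        # assumed meaning of elapsed time
        if load > capacity:            # overloaded vehicle
            z["cv"] += load - capacity
            z["ncv"] += 1
    return z

# Hypothetical instance: a depot and two customers served by one vehicle.
nodes = {
    0: Node(0, 0),
    1: Node(3, 4, demand=5, ready=10, due=20, service=1),
    2: Node(3, 0, demand=8, ready=0, due=12, service=1),
}
z = evaluate([[1, 2]], nodes, capacity=10)
```

In this instance the vehicle waits at the first customer, arrives late at the second and is overloaded, so the waiting, time-window and capacity objectives are all non-zero.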

Regarding the coefficients for the ranking schemes, DLA is presented in two versions for selecting leaders and updating the particles' best positions (e.g. the best personal position $b_i$ and the best position achieved by the swarm so far $g$).

The first version of DLA uses a greedier approach to establish the probability of each preference using Equation 4.3.1. The first coefficient (0.9) increases its curvature and the second moves it up by 0.05. We set these coefficients according to some preliminary tests (Figure 4.3 - on the left).

\[
p(x) = 0.9\, e^{-x} + 0.05 \tag{4.3.1}
\]

Figure 4.3: Distribution of probabilities used by DLA and DLA2 for $N = 8$ objectives. The resulting distributions according to the probability mass functions: 1) $p(x) = 0.9\, e^{-x} + 0.05$, depicted on the left, and 2) $p(x) = 0.6\cos(0.7x + 0.1) + 0.3$ with $x \in \{0, 1\}$, depicted on the right.

The second version of the DLA (DLA2) splits the ranking process into two phases (Equation 4.3.2). If the current number of iterations is less than or equal to a given k, a probability mass function (pmf) is used to encourage intensification. Otherwise, a different pmf is employed to encourage diversification. To this end, the first phase takes into account only the two highest preferences, using the pmf $p(x) = 0.6\cos(0.7x + 0.1) + 0.3$ with $x \in \{0, 1\}$. At $x = 0$ and $x = 1$, this function (equivalent to a quadratic expression in the given range) assigns high and similar probabilities to the two objectives with the highest preference (Figure 4.3 - on the right). The coefficients were set, as in the previous pmf, using preliminary computational experiments. This intensification phase lasts k iterations. In our experiments, k is set to 500, which corresponds to 25% of the total number of iterations that the algorithm runs. The diversification phase runs for the remaining iterations using the pmf $p(x) = 0.9\, e^{-x} + 0.05$, working with the whole set of preferences (Figure 4.3 - on the left).

\[
p(x) =
\begin{cases}
0.6\cos(0.7x + 0.1) + 0.3, \quad x \in \{0, 1\} & k \le 500 \\
0.9\, e^{-x} + 0.05 & k > 500
\end{cases}
\tag{4.3.2}
\]

Summarising, DLA2 aims to provide a good compromise between intensification and diversification. For this purpose, Equation 4.3.2 splits the ranking process into two phases. In the first phase, as shown in Figure 4.3 - on the right, the two objectives with the highest preference have probabilities of 56% and 44%, respectively. Thus, the search focuses on the optimisation of these two objectives only. This speeds up convergence, but it might lead to local optima. The second phase changes the distribution of probabilities to avoid this side effect. This diversification phase, as shown in Figure 4.3 - on the left, assigns the descending probabilities 52%, 21%, 9%, 5%, 4%, 3%, 3% and 3% to the preferences, from the highest to the lowest.
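The percentages quoted above can be reproduced from the two functions in Equation 4.3.2. The sketch below assumes that the raw function values are normalised so that they sum to one; the text does not state this step, but it recovers the quoted figures. It also assumes that preferences are indexed 0 (highest) to 7 (lowest). The function names and the sampling helper are hypothetical.

```python
import math
import random

def normalise(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def diversification_pmf(n=8):
    """p(x) = 0.9*exp(-x) + 0.05 over all n preferences, normalised."""
    return normalise([0.9 * math.exp(-x) + 0.05 for x in range(n)])

def intensification_pmf(n=8):
    """p(x) = 0.6*cos(0.7x + 0.1) + 0.3 for x in {0, 1}, zero elsewhere."""
    head = normalise([0.6 * math.cos(0.7 * x + 0.1) + 0.3 for x in (0, 1)])
    return head + [0.0] * (n - 2)

def dla2_pmf(iteration, n=8, k=500):
    """Two-phase distribution: intensify up to iteration k, then diversify."""
    return intensification_pmf(n) if iteration <= k else diversification_pmf(n)

def pick_preference(iteration, rng, n=8, k=500):
    """Sample the index of the preference used to rank at this iteration."""
    return rng.choices(range(n), weights=dla2_pmf(iteration, n, k))[0]

def as_percent(pmf):
    return [round(100 * p) for p in pmf]

print(as_percent(dla2_pmf(1)))    # [56, 44, 0, 0, 0, 0, 0, 0]
print(as_percent(dla2_pmf(501)))  # [52, 21, 9, 5, 4, 3, 3, 3]
```

The first version of DLA corresponds to using `diversification_pmf` for the whole run.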

In a preliminary study, we tested a number of different combinations of objectives using Pareto dominance and lexicographic ordering. This study compared the average hypervolume (Section 2.3.4) values obtained with each combination. We ran experiments with Pareto dominance involving 2 to 8 objectives and found the best results when working with pairs of objectives. With three or more objectives, Pareto dominance caused premature convergence of the swarm. Furthermore, these experiments showed that the best setting for Pareto dominance was to discriminate solutions using ($Z_{td}$, $Z_{twv}$). For the lexicographic approach, the study involved finding the best sequence (ordering) of objectives. We found that the best sequence of objectives was ($Z_{ntwv}$, $Z_{td}$, $Z_{wt}$, $Z_{tt}$, $Z_{twv}$, $Z_{cv}$, $Z_{ncv}$, $Z_{nv}$).
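The two baseline ranking schemes can be sketched as comparisons between objective vectors. This is a minimal illustration, assuming each solution is summarised by a dict of the eight objective values keyed by short names (`td` for travel distance, `twv` for time window violation, and so on). The names and the use of exact equality when breaking lexicographic ties are assumptions.

```python
# Best settings reported in the preliminary study.
PARETO_PAIR = ("td", "twv")
LEX_ORDER = ("ntwv", "td", "wt", "tt", "twv", "cv", "ncv", "nv")

def dominates(a, b):
    """True if vector a Pareto-dominates vector b (minimisation)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_better(z1, z2, names=PARETO_PAIR):
    """Compare two solutions on the selected pair of objectives only."""
    return dominates([z1[n] for n in names], [z2[n] for n in names])

def lex_better(z1, z2, order=LEX_ORDER):
    """Compare objectives in priority order; the first difference decides."""
    return tuple(z1[n] for n in order) < tuple(z2[n] for n in order)

# Two hypothetical solutions.
z1 = dict(nv=3, tt=100, wt=5, td=80, twv=0, ntwv=0, cv=0, ncv=0)
z2 = dict(nv=2, tt=90, wt=0, td=70, twv=4, ntwv=1, cv=0, ncv=0)
```

Here `lex_better(z1, z2)` holds because the first solution has fewer time window violations, whereas neither solution dominates the other on the pair (travel distance, time window violation): one is shorter and the other is less late.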

DLA and DLA2 both used the same sequence for preferences as the lexicographic approach used for priorities. For DLA2, however, the whole sequence was used only after 500 generations, as explained above.
