5.4 Concept Evaluation
5.4.4 Selection Hyper-Heuristic with Static LLH Parameters (HH-SP)
The second mode of the developed approach is the RL-based selection hyper-heuristic, which is described in Section 3.5. Here we combine the three available LLHs (the aforementioned py.ES, py.SA and j.ES) with static parameters (default and tuned) into a selection hyper-heuristic (HH-SP).
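The general control loop of a selection hyper-heuristic can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated here: the two hill-climbing LLHs (`swap`, `two_opt`), the epsilon-greedy selection rule, the relative-improvement reward and all function names are hypothetical stand-ins for py.ES, py.SA, j.ES and the FRAMAB/BRR HLHs.

```python
import math
import random


def tour_length(tour, points):
    """Length of the closed tour visiting `points` in the order given by `tour`."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def swap_move(tour, i, j):
    """Exchange the cities at positions i and j."""
    cand = tour[:]
    cand[i], cand[j] = cand[j], cand[i]
    return cand


def two_opt_move(tour, i, j):
    """Reverse the tour segment between positions i and j (inclusive)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def hill_climb(tour, points, rng, move, steps=200):
    """Hypothetical LLH: accept a random move only if it shortens the tour."""
    best, best_len = tour[:], tour_length(tour, points)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(len(best)), 2))
        cand = move(best, i, j)
        cand_len = tour_length(cand, points)
        if cand_len < best_len:
            best, best_len = cand, cand_len
    return best


def run_selection_hh(points, llhs, iterations=30, epsilon=0.2, seed=0):
    """Select one LLH per external iteration and learn from its improvement."""
    rng = random.Random(seed)
    tour = list(range(len(points)))
    rng.shuffle(tour)
    reward_sum = {name: 0.0 for name in llhs}
    uses = {name: 0 for name in llhs}
    history = []  # (selected LLH, tour length after the iteration)
    for _ in range(iterations):
        untried = [name for name in llhs if uses[name] == 0]
        if untried:                      # try every LLH at least once
            name = untried[0]
        elif rng.random() < epsilon:     # explore (assumed rule, not FRAMAB/BRR)
            name = rng.choice(list(llhs))
        else:                            # exploit the best average reward so far
            name = max(llhs, key=lambda n: reward_sum[n] / uses[n])
        before = tour_length(tour, points)
        tour = llhs[name](tour, points, rng)
        after = tour_length(tour, points)
        reward_sum[name] += (before - after) / before  # relative improvement
        uses[name] += 1
        history.append((name, after))
    return tour, history


LLHS = {
    "swap": lambda t, p, r: hill_climb(t, p, r, swap_move),
    "two_opt": lambda t, p, r: hill_climb(t, p, r, two_opt_move),
}
```

The `history` list corresponds to the first form of presentation used in this section: which LLH was selected at each external iteration and the solution quality reached afterwards.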
We present the problem-solving process in two forms. First, we show which LLH was selected at each external iteration; for this, we visualize only the first repetition (out of the 9 available). Second, we present the final results of all runs as box-plots and compare them with the performance of the underlying LLHs executed separately (baseline). The left group of box-plots presents the final solution quality obtained with the default parameter values, while the right side shows the results for the tuned parameters.
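For reference, the statistics summarized by such a box-plot, and the choice of the best baseline LLH to compare against, can be computed as in the following sketch. The function names are hypothetical, the whiskers are assumed to span the minimum and maximum, and lower values are assumed to be better (tour length); the plotting conventions actually used for the figures may differ.

```python
import statistics


def box_plot_stats(values):
    """Five-number summary of the final results of repeated runs."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


def best_baseline(baseline_runs):
    """Name of the baseline LLH with the lowest median result (minimization)."""
    return min(baseline_runs,
               key=lambda name: statistics.median(baseline_runs[name]))
```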
kroA100, pr439 and rat783 TSP instances. Once again, we group the relatively small problem instances on which the implemented HH-SP performs similarly. To analyze this group, we selected its largest instance, rat783; the figures depicting the kroA100 and pr439 TSP instances may be found in Appendix A.1.3.
We would like to draw the reader's attention to the HH-SP cases in which the LLHs were used with the default parameter values (upper row in Figure 5.17). According to the baseline evaluation, only one algorithm clearly dominates in performance: py.SA. Therefore, in Figure 5.17 we observe a high frequency of py.SA sampling by both learning-based selection strategies. One may distinguish a repetitive pattern in the LLH allocation with FRAMAB (middle column). It is caused by the deterministic nature of the algorithm. On reaching a critical point at which other heuristics should be verified, FRAMAB's exploration mechanism fully guides the selection. Because LLH termination is time-based, the workers (mostly) start the next round in batches. Thus, when a new round starts, FRAMAB operates on static information and allocates all of the next configurations to the same LLH, which turns out to be the second-best performing algorithm: j.ES. Therefore, we conclude that FRAMAB behaves somewhat inertly in our setup. One may reasonably argue that this decreases performance; however, confirming this requires further investigation, which we postpone to future work. In the case of BRR (right column in Figure 5.17), the bias is strongly shifted towards py.SA, which may cause performance issues due to the lack of exploration. According to the final results statistics presented in Figure 5.18, given at least one dominating LLH, HH-SP uses it often enough to obtain a good final solution quality.

Figure 5.19 Intermediate performance of HH-SP on pla7397 (single experiment).
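The batch effect attributed to FRAMAB above can be reproduced with any deterministic bandit rule. The sketch below uses a generic UCB1-style score as a stand-in; it is not the FRAMAB formula, and the reward averages, pull counts and the `count_pending` variation are hypothetical values and options chosen only to illustrate the mechanism.

```python
import math


def ucb_select(avg_reward, pulls, c=1.0):
    """Deterministically pick the arm with the highest UCB1-style score."""
    total = sum(pulls.values())

    def score(name):
        exploration = math.sqrt(2.0 * math.log(total) / pulls[name])
        return avg_reward[name] + c * exploration

    # Sorting the names makes tie-breaking deterministic as well.
    return max(sorted(pulls), key=score)


def allocate_batch(avg_reward, pulls, workers, count_pending=False):
    """Choose an LLH for every worker that becomes free in the same round.

    With count_pending=False the statistics stay static within the round,
    so a deterministic rule returns the same LLH for all workers.
    With count_pending=True (a hypothetical variation, not evaluated here)
    each allocation already counts as a pull for the next decision.
    """
    pulls = dict(pulls)  # do not modify the caller's statistics
    chosen = []
    for _ in range(workers):
        name = ucb_select(avg_reward, pulls)
        chosen.append(name)
        if count_pending:
            pulls[name] += 1
    return chosen


# Hypothetical state: py.SA dominates and has been used most often.
AVG_REWARD = {"py.SA": 0.6, "j.ES": 0.5, "py.ES": 0.45}
PULLS = {"py.SA": 40, "j.ES": 5, "py.ES": 5}

static_round = allocate_batch(AVG_REWARD, PULLS, workers=4)
pending_round = allocate_batch(AVG_REWARD, PULLS, workers=4, count_pending=True)
```

In this hypothetical state the exploration term outweighs the advantage of py.SA, so the static round assigns every worker to the same rarely used LLH, whereas counting pending allocations spreads the round over several LLHs.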
The next setup is the LLHs with tuned parameters (lower row in Figure 5.17 and right side of Figure 5.18). According to the baseline evaluation, all available LLHs are able to tackle the problem instance with comparable solution quality; however, a performance difference still exists (Figure 5.18). As a consequence, the FRAMAB HLH learns this and frequently uses the best performing LLH, j.ES (lower row, middle plot in Figure 5.17). In contrast, the BRR and random-based approaches sample all LLH types evenly. We conclude that BRR is not as sensitive to the performance evidence and was ‘confused’, since the process quickly converged to a local optimum. The quality of all HH-SP final results presented in Figure 5.18 is at least as good as the solution quality provided by the best underlying LLH.
pla7397 TSP instance. Our observations of HH-SP performance on the largest problem instance are as follows. During the baseline evaluation, py.ES with default parameters had the worst performance, while its tuned version was able to outperform only the default j.ES. In contrast, j.ES with tuned parameters produced the best results, outperforming py.SA. The best meta-heuristic with default parameters was py.SA. Therefore, we observe the expected behavior of HH-SP with default LLHs (upper row in Figure 5.19): the LLH most frequently sampled by the learning-based HLHs was py.SA. However, the number of py.ES usages is suspiciously high in the BRR case. Referring to the final results presented in Figure 5.20, we observe a high diversity in quality when py.ES is allocated frequently (codes 1.1 and 3.1), which is expected given the performance of py.ES with default parameters.
With the tuned LLHs, we observe almost equal performance of all LLH sampling approaches, comparable to the baseline results. The solution quality of the random-based HLH is slightly worse than that of the FRAMAB- and BRR-based HH-SP, owing to their preference for j.ES (see the right chart in Figure 5.20).
Discussion. According to our observations of the developed HH-SP, we conclude that the proposed concept implementation operates as expected: HH-SP provides results similar to those of the best underlying LLH. The two implemented selection HLHs behave slightly differently when reaching a local optimum. We claim that FRAMAB is the more promising HLH, since it starts to balance between the previously well-performing LLHs, showing good exploration abilities. In cases where the advantage shifts from one LLH to another, BRR may need more time to learn this.
The observed issues call not only for a thorough investigation (pla7397, codes 1.1 and 3.1), but also for a generic approach to handling potential flaws in the LLH implementations that may cause issues in the overall execution process.