Ensemble Selection - MOGP Ensemble Classification Results

6.5 MOGP Ensemble Classification Results

6.5.2 Ensemble Selection

In addition to the PF-Wvote strategy, two ensemble selection strategies are also compared in MOGP to limit the influence of biased ensemble members in the voting process, to improve ensemble performances. Table 6.4 shows the ensemble accuracies (on the test set) and ensemble sizes using the accuracy-based selection strategy and the off-EEL algorithm for ensemble selection over 50 runs for the MOGP approaches. The accuracy-based selection strategy simplyremoves Pareto front solutions with less than 50% accuracy on either class (on the training set) from the ensemble. This strategy is called the RPF-vote in Table 6.4 as the ensembles arereducedcompared to the PF-vote and PF-Wvote strategies. The off- EEL algorithm uses a more exhaustive and arguably better ensemble selection approach than RPF-vote to choose individuals from the Pareto front for the final

6.5. MOGP ENSEMBLE CLASSIFICATION RESULTS 161 Table 6.4: Average accuracies (±standard deviation) on the test set and ensembles sizes using RPF-vote and off-EEL [76] ensemble selection strategies (50 runs).

Task MOGP RPF-vote Off-EEL

Size Minority % Majority % Size Minority % Majority % Baseline 7.8 79.9_±7.2 87.2_±9.1 5.6 83.7_±5.8 89.2_±8.8 Ion NCL 10.8 82.6±6.9 91.0±3.2 9.9 82.2±5.5 89.5±8.1 PFC 22.3 81.7±5.8 95.8±3.8 21.2 83.9±5.2 96.6±2.8 Baseline 2.8 69.9±11.7 70.1±17.4 3.9 56.0±10.1 83.6±4.8 Spt NCL 4.1 71.1_±9.0 78.4_±8.7 5.2 53.9_±9.7 83.8_±4.8 PFC 12.1 62.1±8.0 80.5±4.8 10.7 66.3±8.5 79.9±6.6 Baseline 87.8 81.4±14.1 79.3±28.9 65.3 89.5±1.5 84.6±2.3 Ped NCL 43.2 87.4_±4.5 84.2_±6.1 40.4 88.8_±3.0 84.4_±2.8 PFC 40.1 91.6±1.9 85.2±3.0 55.2 90.6±1.5 87.9±1.5 Baseline 24.1 68.7±3.2 77.5±3.8 33.6 68.5±5.5 80.4±5.2 Yst1 NCL 17.0 69.0±2.6 80.7±1.8 13.5 64.1±5.0 83.4±4.0 PFC 16.5 71.0±4.4 75.5±5.4 29.2 70.6±5.4 78.8±5.5 Baseline 15.2 80.6±7.8 94.8±2.2 6.5 92.3±2.9 90.7±2.9 Yst2 NCL 13.4 89.6±5.2 91.9±2.9 8.6 80.4±7.3 94.5±2.1 PFC 20.6 89.2_±3.2 92.3_±1.8 17.2 93.1_±2.6 90.8_±2.4 Baseline 4.7 82.6_±13.7 59.0_±21.4 4.7 71.4_±15.9 85.6_±9.0 Bal NCL 4.9 61.8±4.2 94.1±5.9 5.5 62.1±17.5 85.6±7.3 PFC 10.1 83.6_±9.4 79.5_±10.3 10.9 81.4_±12.1 86.2_±9.1

Table 6.5: Average accuracies (±standard deviation) on the test set and ensembles size using the majority vote (PF-vote) and fitness-weighted vote (PF-Wvote) strategies (50 runs). Repeated from Table 6.3.

Task MOGP Size Majority Vote (PF-vote) Weighted Vote (PF-Wvote) Minority % Majority % Minority % Majority % Baseline 18.8 84.5±6.2 83.5±9.9 82.5±5.8 89.1±8.3 Ion NCL 12.7 85.5±5.2 86.8±7.3 80.9±5.9 91.4±5.9 PFC 28.1 84.9±5.1 92.4±6.4 79.6±6.2 96.3±3.7 Baseline 7.8 44.5±5.5 88.8±2.7 86.4±13.7 59.9±36.4 Spt NCL 9.5 48.6±5.6 86.5±2.9 84.1±12.0 63.3±32.7 PFC 27.3 44.6±5.4 90.8±2.3 75.9±10.4 72.6±22.2 Baseline 124.9 12.5±27.2 87.1±28.2 89.1±3.6 83.3±3.1 Ped NCL 52.6 71.8±8.9 91.7±2.7 88.0±3.2 83.5±3.7 PFC 71.6 82.4±5.6 92.1±2.4 92.5±1.6 84.1±3.1 Baseline 46.7 58.0±4.0 87.1±2.4 70.0±4.2 77.3±4.3 Yst1 NCL 25.8 63.6±3.8 83.0±3.3 70.6±3.7 76.8±4.4 PFC 39.7 64.6±4.8 82.5±4.3 71.8±5.3 75.4±6.5 Baseline 18.5 77.1±4.6 96.2±1.1 82.8±3.6 95.1±1.3 Yst2 NCL 16.1 77.6±6.0 95.3±1.7 83.6±4.7 93.7±1.7 PFC 27.9 81.2±4.9 95.5±1.5 89.6±3.2 92.1±1.9 Baseline 9.8 53.3±21.4 94.1±4.4 84.2±12.5 71.4±23.6 Bal NCL 8.4 59.2±16.1 87.8±6.6 86.9±11.8 66.0±29.5 PFC 20.8 51.7±18.2 95.4±3.5 87.3±9.3 74.1±17.1

ensemble. Off-EEL iteratively adds individuals to the ensemble and evaluates the ensemble performance (on the training set) at each step. Once all Pareto front solutions are processed, the ensemble with the best performance is taken as the final ensemble.

For convenience, the PF-vote and PF-Wvote ensemble performances (from Table 6.3 in the previous section) are repeated in Table 6.5 (below Table 6.4) to make comparisons between these ensemble approaches easier.

Advantage of off-EEL for Ensemble Selection

Table 6.4 shows that both the RPF-vote and off-EEL strategies show relatively balanced performances on both the minority and majority classes compared to the PF-vote (from Table 6.5). Both strategies achieve this by only allowing relatively accurate Pareto front solutions from voting in the ensembles. This can be seen by the smaller ensemble sizes for RPF-vote and off-EEL in Table 6.4 (compared to Table 6.5), and suggests that both strategies are effective in limiting the influence of biased Pareto front solutions in the ensemble voting process.

According to Table 6.4, the off-EEL ensembles either dominate, or are non- dominated relative to, the RPF-vote results in the tasks. This is important as it shows that the off-EEL ensembles are at least as good as, or in some cases better than, the RPF-vote ensemble in the tasks. In some cases where the off- EEL ensembles dominate the RPF-vote ensembles such as Ion (Baseline and PFC) and Ped (Baseline and NCL), the off-EEL ensemble sizes are even smaller than the RPF-vote ensembles. This suggests that the RPF-vote ensembles still contain some individuals that do not positively contribute to the ensemble performance. This is not unexpected as the off-EEL algorithm uses an additional search to choose good individuals for the ensembles; whereas in the naive RPF-vote strategy, the performance threshold to exclude individuals from the ensembles is determineda priori.

A similar observation can be seen when the off-EEL strategy is compared to the PF-Wvote strategy from Table 6.5. The off-EEL ensembles are at least as good as, or better than, the PF-Wvote ensemble in all tasks.

Tables 6.4 and 6.5 also show that the PFC ensembles using both off-EEL and PF-Wvote dominate both the Baseline and NCL ensembles in two tasks (Ped and Bal). In three other tasks (Ion, Yst1and Yst2), PFC also dominates the Baseline but

only using the off-EEL strategy (PFC and the Baseline are non-dominated to each other using the PF-Wvote strategy in these three tasks). This may be because the Baseline MOGP uses no explicit ensemble-diversity objective in fitness, and the

6.5. MOGP ENSEMBLE CLASSIFICATION RESULTS 163 PFC approach can find individuals that are more diverse in their outputs. When individuals with good diversity are combined together in the ensembles created using off-EEL, the PFC ensembles (with off-EEL) improve to a greater extent than the Baseline ensembles (with off-EEL). The NCL ensembles are not able to achieve this as NCL does not dominate the Baseline or PFC in any task using either the off-EEL and PF-Wvote strategies. This suggests that the PFC ensembles may have the best diversity from the three MOGP approaches (this is also explored further in the next section which compares “wins” for the different MOGP approaches on the tasks).

Baseline Better with Off-EEL

In the Baseline MOGP approach, the PF-Wvote results (from Table 6.5) dominate the RPF-vote results in Table 6.4 in nearly all tasks. The only exception is Spt where RPF-vote shows equally high class accuracies, while PF-Wvote has a slight bias toward high minority class accuracy only. As mentioned above, the poorer performance by the RPF-vote strategy may be due to the main limitation of this strategy, i.e., the criterion for ensemble selection is determined a priori

(and remains fixed for all tasks). This means that these ensembles still contain individuals that do not positively contribute in the voting process. In contrast, by reducing the contributions of individuals with poorer fitness in the voting process (relative to other fitter solutions) using the fitness-weighed vote in PF-Wvote, the PF-Wvote strategy can generally outperform RPF-vote strategy on the tasks.

For the NCL and PFC approaches, the PF-Wvote and RPF-vote strategies produce similar (non-dominating) ensemble performances in nearly all tasks (ex- cept Bal). This suggests that both methods are similarly effective in keeping the ensemble performances relatively well-balanced on the two classes (compared to PF-vote). In Bal, the NCL results for PF-Wvote and RPF-vote vary in their minority to majority class bias, e.g., the PF-Wvote is stronger on the minority class, while the RPF-vote is stronger on the majority class. This may be due to noise or the comparatively high level of class imbalance in Bal.

Due to these limitations for the RPF-vote strategy, the remainder of this chapter will focus on the PF-Wvote and off-EEL strategies.

In document Genetic Programming for Classification with Unbalanced Data (Page 182-185)