Sensitivity to inclusion of methods - Hierarchical forecasting of engineering demand at KLM Eng

Comparing the best forecast against the benchmark combination confirmed that a larger selection of models benefits forecast accuracy. Including ten different models and all of their combinations raises the question of which method adds the most predictive power to the approach. To assess the individual influence of each method, we compute the minimal and mean MASE for each group when a single method is removed from the pool.
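To give a sense of the size of the pool, the sketch below enumerates the combinations under the assumption that every non-empty subset of the ten methods counts as one forecast combination. The method labels are the column codes used in the result tables; the subset construction and the helper names are illustrative assumptions, not the procedure of this study.

```python
from itertools import combinations

# Column codes of the ten methods as they appear in the result tables.
METHODS = ["N", "S", "M", "E", "A", "H", "B", "C", "I", "Tf"]


def all_combinations(methods):
    """Every non-empty subset of the method pool (assumed combination set)."""
    return [
        frozenset(subset)
        for size in range(1, len(methods) + 1)
        for subset in combinations(methods, size)
    ]


def without(combos, method):
    """Combinations that remain once a single method is removed from the pool."""
    return [combo for combo in combos if method not in combo]


pool = all_combinations(METHODS)
remaining = without(pool, "N")
print(len(pool), len(remaining))  # 1023 511
```

Under this assumption, removing one method discards roughly half of the combinations, so the remaining minimum and mean are taken over a considerably smaller set.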

5.2.1 Effect on minimal MASE

Table 5.6 shows the effect of excluding a method on the average minimal MASE per group. For each node we disregard all forecast combinations that include the relevant model, determine the minimal MASE over the remaining combinations, and average these minima per group. If a model is well suited to the nodes in a certain group, we expect the best performance to drop when it is removed, i.e. the minimal MASE to increase. Table 5.6 highlights the largest increases in MASE to show the effect per method. What becomes clear is that the naive method consistently adds the most predictive power to combinations for the lower-level groups, which makes sense: these groups are characterised by demand that is increasingly difficult to model due to higher variation and zero-valued observations. In those situations the best forecast is often simple; the naive forecast is optimal for a random walk and is therefore effective under large variation. The higher-level groups, where demand is more consistent, benefit from the more complex models. From Table 5.6 we conclude that ETS, temporal hierarchical forecasting, Croston’s methods and the theta approach do not add significantly to the overall best forecast performance and might be excluded from the model pool without impacting the results. However, comparing minimal values only indicates the effect on best performance and does not tell us whether a model, on average, adds predictive power to the forecasts; this is discussed in Section 5.2.2.
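The exclusion procedure described above can be sketched as follows. The data layout (a dictionary of MASE values per node, keyed by the set of methods in each combination), the function name and the toy numbers are hypothetical and serve only to illustrate the computation.

```python
from statistics import mean


def min_mase_without(mase, groups, method):
    """Average, per group, of each node's minimal MASE once `method` is excluded.

    mase:   {node: {frozenset of method codes: MASE}}  (hypothetical layout)
    groups: {group name: [node, ...]}
    method: method code to exclude, or None to keep the full pool
    """
    result = {}
    for group, nodes in groups.items():
        minima = []
        for node in nodes:
            # Keep only combinations that do not contain the excluded method.
            kept = [v for combo, v in mase[node].items() if method not in combo]
            minima.append(min(kept))
        result[group] = mean(minima)
    return result


# Toy values, invented for illustration only.
mase = {
    "node1": {frozenset({"N"}): 0.8, frozenset({"E"}): 1.0, frozenset({"N", "E"}): 0.7},
    "node2": {frozenset({"N"}): 1.2, frozenset({"E"}): 0.9, frozenset({"N", "E"}): 1.0},
}
groups = {"G1": ["node1", "node2"]}

baseline = min_mase_without(mase, groups, method=None)
no_naive = min_mase_without(mase, groups, method="N")
print(f"{baseline['G1']:.2f} {no_naive['G1']:.2f}")  # 0.80 0.95
```

In this toy example the minimal MASE rises from 0.80 to 0.95 when the naive method is excluded, which is the kind of increase that marks a method as contributing to the best forecasts.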

Table 5.6 Method exclusion effect on minimal MASE, the highest increase in bold, second-highest increase underlined

5.2.2 Effect on mean MASE

Table 5.7 shows the effect of excluding a model on the average MASE per group. To determine the effect, we disregard any forecasts made with the relevant model, determine the average MASE over the remaining forecasts for each node, and average over the nodes in each group. If a model mostly adds predictive power to the forecasts, we expect the mean MASE to increase when it is removed; conversely, removing a poor model is expected to decrease the mean MASE. If the mean MASE is unchanged, the model contributes evenly to good and bad forecasts. Table 5.7 highlights the increased and decreased values to show the general effect per model. As in Section 5.2.1, the naive method consistently adds predictive power, and every model adds predictive power to at least one group. The mean approach seems to reduce performance consistently, which is a further indication that the current approach to forecasting is unsuitable, since a mean forecast is currently used (see Section 2.4). Apart from the naive method, we also observe that the more complex methods (ETS, ARIMA and TBATS) have a more positive effect on the results than the simple approaches.
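The sign interpretation above can be illustrated with a small sketch. The data layout, function names and numbers are hypothetical; the sketch only mirrors the described steps (average per node over the remaining forecasts, then average over the nodes) and compares the result with the full pool.

```python
from statistics import mean


def mean_mase(mase, nodes, exclude=None):
    """Mean MASE per node over the kept combinations, averaged over the nodes."""
    per_node = [
        mean(v for combo, v in mase[node].items() if exclude not in combo)
        for node in nodes
    ]
    return mean(per_node)


def exclusion_effect(mase, nodes, method):
    """Change in mean MASE caused by removing `method`.

    Positive: removal hurts, so the method mostly adds predictive power.
    Negative: removal helps, so the method mostly degrades the forecasts.
    """
    return mean_mase(mase, nodes, exclude=method) - mean_mase(mase, nodes)


# Toy values, invented for illustration only.
mase = {
    "node1": {frozenset({"M"}): 1.5, frozenset({"N"}): 0.9, frozenset({"N", "M"}): 1.2},
}
nodes = ["node1"]

effect_mean_method = exclusion_effect(mase, nodes, "M")
effect_naive = exclusion_effect(mase, nodes, "N")
print(f"{effect_mean_method:+.2f} {effect_naive:+.2f}")  # -0.30 +0.30
```

Here removing the mean method lowers the mean MASE, while removing the naive method raises it, matching the two cases distinguished in the text.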

Group  N     S     M     E     A     H     B     C     I     Tf    Best
Total  1,45  1,46  1,45  1,45  1,45  1,45  1,45  1,45  1,45  1,45  1,45
G1     0,96  0,96  0,98  0,96  0,97  0,97  0,99  0,96  0,96  0,96  0,96
G2     0,79  0,79  0,78  0,78  0,80  0,78  0,78  0,78  0,83  0,78  0,78
G3     0,53  0,53  0,55  0,53  0,53  0,53  0,53  0,53  0,53  0,53  0,53
G4     1,54  1,05  1,05  1,05  1,05  1,05  1,09  1,05  1,05  1,05  1,05
G5     0,81  0,79  0,78  0,77  0,78  0,77  0,77  0,77  0,79  0,77  0,77
G6     0,46  0,47  0,48  0,46  0,46  0,46  0,47  0,46  0,46  0,46  0,46
G7     0,78  0,78  0,77  0,76  0,76  0,76  0,77  0,76  0,78  0,76  0,76
G8     1,12  1,12  1,12  1,13  1,23  1,12  1,13  1,12  1,33  1,12  1,12
G9     1,08  1,03  1,02  1,02  1,03  1,02  1,04  1,02  1,04  1,02  1,02
G10    0,95  0,92  0,92  0,91  0,91  0,91  0,92  0,91  0,92  0,91  0,91
G11    1,94  1,90  1,90  1,89  1,90  1,89  1,89  1,89  1,89  1,89  1,88
G12    0,53  0,50  0,51  0,50  0,52  0,50  0,51  0,50  0,51  0,50  0,50
G13    0,71  0,69  0,68  0,67  0,67  0,68  0,67  0,67  0,68  0,67  0,67
G14    0,74  0,72  0,71  0,70  0,71  0,70  0,70  0,71  0,71  0,70  0,70
BTS    1,09  1,06  1,06  1,05  1,05  1,05  1,05  1,05  1,05  1,05  1,05
Avg    0,97  0,92  0,92  0,92  0,93  0,91  0,92  0,91  0,94  0,91  0,91

Table 5.7 Method exclusion effect on mean MASE, the highest value in bold, lowest underlined

5.2.3 Conclusion

Based on the results in this section we conclude that the inclusion of different models is beneficial. Best and average results vary depending on the inclusion of a suitable method, which confirms the principle that makes forecast combination a more accurate approach: selecting one model to fit the data introduces a level of uncertainty, and applying multiple models and combining the results mitigates this. We can further state that the choice to include all models was valid, since each has added at least some predictive power. The naive method appears to influence nearly all of the results positively, implying that some part of the variation is best explained by a simple process and that the combinations therefore benefit from including a naive forecast. Its function might also be to act as a normalizer in the forecast combination: it forces more complex forecasts to average with the last observed value, tethering the combination to the most recent outcome. A more detailed exploration of the individual model effects is necessary to infer the causes of model performance.
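The tethering effect of the naive forecast can be illustrated with a minimal sketch. It assumes an equal-weight average as the combination rule, which is an assumption made here for illustration; the demand history and the forecast of the complex model are invented.

```python
def naive_forecast(history, horizon):
    """Naive forecast: repeat the last observed value for every step ahead."""
    return [history[-1]] * horizon


def combine(forecasts):
    """Equal-weight average of several forecasts, step by step (assumed rule)."""
    return [sum(step) / len(step) for step in zip(*forecasts)]


history = [4, 0, 6, 5]                # invented demand history, last value 5
complex_forecast = [9.0, 10.0, 11.0]  # invented output of a more complex model

combined = combine([complex_forecast, naive_forecast(history, horizon=3)])
print(combined)  # [7.0, 7.5, 8.0]
```

Each combined value lies halfway between the complex forecast and the last observation, so an overshooting complex model is pulled back towards the most recent outcome.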

Based on the results some models could probably be excluded; for instance, Croston’s method does not improve the forecasts, or does so only marginally. However, truly claiming that a model adds no predictive power requires a deeper analysis. The order in which models are removed also affects the accuracy of the remaining combinations, so to see the full effect of a model every possible order of removal should be analysed. As this step only serves to optimize the runtime of the model (no predictive power can be gained by removing a model), we defer the deeper analysis to future research or implementation.

Group  N     S     M     E     A     H     B     C     I     Tf    Best
Total  1,51  1,52  1,51  1,51  1,51  1,51  1,52  1,51  1,51  1,52  1,51
G1     1,17  1,19  1,19  1,19  1,22  1,20  1,23  1,17  1,12  1,20  1,18
G2     1,08  1,04  1,05  1,08  1,09  1,07  1,09  1,06  1,07  1,08  1,07
G3     0,72  0,72  0,71  0,73  0,73  0,73  0,73  0,72  0,69  0,73  0,72
G4     1,79  1,68  1,67  1,72  1,72  1,70  1,73  1,69  1,72  1,69  1,71
G5     1,11  1,10  1,10  1,09  1,10  1,10  1,09  1,10  1,11  1,10  1,10
G6     0,69  0,68  0,67  0,69  0,68  0,69  0,69  0,66  0,65  0,69  0,68
G7     1,10  1,09  1,09  1,08  1,10  1,09  1,09  1,09  1,09  1,10  1,09
G8     1,83  1,79  1,78  1,79  1,84  1,78  1,80  1,79  1,80  1,81  1,80
G9     1,53  1,51  1,53  1,51  1,47  1,48  1,53  1,55  1,55  1,50  1,51
G10    1,29  1,27  1,27  1,28  1,27  1,28  1,28  1,28  1,28  1,27  1,27
G11    2,39  2,36  2,36  2,37  2,37  2,37  2,36  2,37  2,36  2,38  2,36
G12    0,76  0,74  0,73  0,76  0,76  0,75  0,76  0,74  0,74  0,76  0,75
G13    1,05  1,04  1,03  1,04  1,04  1,04  1,03  1,04  1,04  1,04  1,04
G14    1,07  1,06  1,05  1,05  1,06  1,06  1,05  1,06  1,06  1,06  1,06
BTS    1,56  1,53  1,53  1,54  1,54  1,54  1,53  1,54  1,54  1,54  1,53
Avg    1,29  1,27  1,27  1,28  1,28  1,27  1,28  1,27  1,27  1,28  1,27

In document Hierarchical forecasting of engineering demand at KLM Engineering & Maintenance (Page 95-98)