3.3 Performance indicators for machine learning and forecasting
3.3.5 Conclusion on performance metrics
Several performance indicators were discussed in this section. First, we elaborated on classification indicators. The classification matrix provides an overview of all predictions and actual classes. Two performance indicators that can be derived from this matrix are the accuracy and the kappa. The accuracy is the percentage of correct predictions. The kappa is an accuracy measure adjusted for imbalanced classes, where 0 is equivalent to random guessing and 1 is a perfect prediction. To provide a proper overview of classification performance, we will use both the accuracy and the kappa as KPIs for classification.
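As an illustration of why both indicators are reported, the following minimal sketch computes them from a classification matrix. It assumes that the kappa referred to here is Cohen's kappa; the function name and the example matrix are hypothetical and not taken from the data sets of this research.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, kappa) for a square classification matrix.

    matrix[i][j] counts the items with actual class i and predicted class j.
    Assumes Cohen's kappa; undefined when the chance agreement equals 1.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: the share of items on the diagonal (the accuracy).
    observed = sum(matrix[i][i] for i in range(n_classes)) / total
    # Agreement expected by chance, from the row and column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(n_classes)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical imbalanced case: 90 items of class 0 and 10 of class 1,
# with a classifier that always predicts the majority class.
always_majority = [[90, 0], [10, 0]]
acc, kappa = accuracy_and_kappa(always_majority)
print(acc, kappa)
```

In this hypothetical case the accuracy is high although the classifier has learned nothing, while the kappa is zero, which is the reason for using the two indicators together.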
Besides classification indicators, we discussed regression and forecasting indicators. As the KPI for regression problems, we will use the RMSE. The RMSE puts a heavier penalty on large errors than the MAE and is easier to interpret than the MSE, since it has the same unit as the actual values. The scale-independent indicators MAPE and sMAPE are not suitable for our research, since they should not be used when the data is close to zero.
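The two arguments in this paragraph can be made concrete with a small sketch. The functions follow the usual textbook definitions of the MAE, RMSE and MAPE, and all numbers are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the actual values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined when an actual value is 0."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * ratios / len(actual)


actual = [10, 10, 10, 10]
even_errors = [8, 12, 8, 12]        # four errors of 2 units
one_large_error = [10, 10, 10, 18]  # a single error of 8 units

# Both forecasts have the same MAE, but the RMSE penalises the large error.
print(mae(actual, even_errors), mae(actual, one_large_error))
print(rmse(actual, even_errors), rmse(actual, one_large_error))

# An error of 1 unit on an actual demand of 0.1 inflates the MAPE.
print(mape([0.1], [1.1]))
```

The last line shows the problem with percentage errors for demand close to zero: a small absolute error leads to a very large percentage error.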
For the performance of prediction intervals, we described two indicators. These cover the two most important aspects of a prediction interval: the coverage probability and the interval width. Therefore, we will use both the PICP and the PINAW as KPIs.
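A minimal sketch of both indicators is given below. It assumes the common definitions: the PICP as the share of actual values inside their interval, and the PINAW as the mean interval width divided by the range of the actual values. The normalisation by the range is an assumption of this sketch, and the data is hypothetical.

```python
def picp(actual, lower, upper):
    """Share of actual values that fall inside their prediction interval."""
    hits = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return hits / len(actual)


def pinaw(actual, lower, upper):
    """Mean interval width, normalised by the range of the actual values."""
    target_range = max(actual) - min(actual)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(actual)
    return mean_width / target_range


# Hypothetical actual demand and prediction intervals for four items.
actual = [10, 20, 30, 40]
lower = [5, 15, 32, 30]
upper = [15, 25, 45, 50]

coverage = picp(actual, lower, upper)   # the third item falls outside
width = pinaw(actual, lower, upper)
print(coverage, width)
```

The two values have to be read together: very wide intervals reach a high PICP trivially, at the cost of a high PINAW.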
Lastly, we described two metrics for inventory management: the Cycle Service Level (CSL) and the Fill Rate (FR). The CSL is the probability of not having a stock-out, whereas the FR is the fraction of the demand that is met. Both indicators are ideally as high as possible, but raising them can also increase inventory costs. Hence, besides the KPIs, we also need to evaluate the costs. These costs will be defined in the next chapter, where we describe the methodology.
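The difference between the two inventory metrics can be illustrated with empirical estimates over a number of replenishment cycles. This is a sketch under simplifying assumptions: the demand and the available stock per cycle are hypothetical, and unmet demand is assumed to be lost.

```python
def cycle_service_level(demand, stock):
    """Share of cycles in which the demand was fully covered (no stock-out)."""
    no_stock_out = sum(1 for d, s in zip(demand, stock) if d <= s)
    return no_stock_out / len(demand)


def fill_rate(demand, stock):
    """Share of the total demand that was met from the available stock."""
    served = sum(min(d, s) for d, s in zip(demand, stock))
    return served / sum(demand)


# Hypothetical demand in four cycles, with 10 units available in each cycle.
demand = [8, 12, 10, 20]
stock = [10, 10, 10, 10]

csl = cycle_service_level(demand, stock)
fr = fill_rate(demand, stock)
print(csl, fr)
```

In this example a stock-out occurs in half of the cycles, yet most of the demand is still served, so the FR is higher than the CSL.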
4 Proposed method and experimental design
In this chapter, we propose our method, demandForest, which provides a solution to the research objective, and we describe the experimental design used to validate its performance. In Section 4.1, we propose the demandForest method, which generates a forecast for the complete introduction period. Thereafter, we introduce an extension to demandForest in Section 4.2. To compare the proposed methods with the current situation, we describe two benchmark methods in Section 4.3. Finally, we elaborate on the experimental design in Section 4.4. In that section, we discuss how we train the applied machine learning algorithms and analyse the performance of the forecasts for the different data sets. We also define experiments to evaluate the inventory performance when the forecasts are employed. Additionally, we propose a synthetic data set for a theoretical evaluation of the methods.
4.1 demandForest
In the previous chapter, we found that the Random Forest and Quantile Regression Forest (QRF) algorithms seem to be the most suitable for the current research problem. Both algorithms are used in demandForest to generate a forecast for the demand of a new product. Because it relies on these machine learning algorithms, demandForest can learn to generate predictions for the products of a specific company. The forecast for the complete introduction period provides insight into the amount and development of the demand of a new product. To provide these insights, demandForest divides the demand during 18 weeks into a demand profile and the total amount of demand. The cumulative demand patterns of historical products are clustered into distinctive profiles, and demandForest then predicts both the profile and the total demand of a new product. With the QRF algorithm, not only the total demand but also the corresponding quantiles can be predicted. The combination of the profile and the total demand is the forecast for 18 weeks. This method is inspired by Thomassey and Fiordaliso (2006) and Loureiro et al. (2018), described in Section 3.1.2. We combine the profile predictions of Thomassey and Fiordaliso (2006) with the satisfactory results and suitability of Random Forests reported by Loureiro et al. (2018). Furthermore, we enhance the methodology by predicting quantiles with Quantile Regression Forests. The advantage of demandForest is that we only require two predictions (profile and demand) to obtain a forecast with 18 demand points. These two predictions can also be easily interpreted by planners, who may not understand the complete method and algorithms in depth. The method is illustrated schematically in Figure 4.1, and we discuss the use of demandForest in more detail below.
As visualised in Figure 4.1, demandForest consists of a preparation phase and an operational phase. In the preparation phase, historical demand is clustered into profiles with a k-means algorithm and the existing products are used to train the Random Forest and Quantile Regression Forest algorithms. In the operational phase, the profiles and algorithms are used to predict the demand of new products.
Figure 4.1: Schematic overview of demandForest
In the preparation phase, we first cluster the demand patterns to obtain demand profiles. We cluster the normalised cumulative demand of historical items with the k-means algorithm. The algorithm is repeated 25 times to obtain a stable result, as discussed in Section 4.4.2. As also described in Section 4.4.2, the number of clusters is determined by the CH-index, which resulted in two clusters (i.e., the demand profiles) for all the data sets available for this research. Subsequently, a profile is assigned to each item. After clustering, a Random Forest algorithm is trained to classify the profile based on the product characteristics.
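The clustering step of the preparation phase can be sketched as follows. This is a simplified, pure-Python k-means under stated assumptions, not the implementation used in this research: the function names are hypothetical, the demand histories cover four weeks instead of 18, the number of clusters is fixed at two rather than selected with the CH-index, and the training of the Random Forest classifier is left out. The 25 repetitions are interpreted here as random restarts, of which the run with the lowest within-cluster sum of squares is kept.

```python
import random
from itertools import accumulate


def normalised_cumulative(weekly_demand):
    """Cumulative demand divided by the total (assumed > 0), ending at 1."""
    total = sum(weekly_demand)
    return [c / total for c in accumulate(weekly_demand)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k, rng, iterations=100):
    """One k-means run from random starting centroids."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return inertia, labels, centroids


def kmeans(points, k, restarts=25, seed=0):
    """Repeat k-means and keep the run with the lowest inertia."""
    rng = random.Random(seed)
    runs = (kmeans_once(points, k, rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[0])


# Hypothetical weekly demand of six historical items.
history = [
    [8, 4, 2, 1], [9, 5, 3, 1], [7, 4, 2, 2],  # demand peaks at introduction
    [1, 2, 4, 8], [1, 3, 5, 9], [2, 2, 4, 7],  # demand builds up over time
]
curves = [normalised_cumulative(item) for item in history]
inertia, labels, profiles = kmeans(curves, k=2)
print(labels)
```

Each centroid in `profiles` is itself a normalised cumulative demand curve, which is what allows a predicted profile to be combined with a predicted total demand later on.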
Besides the prediction of the profile, the total demand during the introduction period is predicted. For this prediction, we use the Quantile Regression Forest algorithm. This algorithm predicts not only the expected demand (the mean), but also a full conditional distribution. This full conditional distribution provides insight into the potential uncertainty of the demand of a new product. The QRF algorithm is also trained on the product characteristics.
After the preparation phase, the trained algorithms can be used to predict the demand and the profile of new products. The final forecast for the complete introduction period is obtained by combining the predicted profile and the predicted total demand. Since it is not possible to order or sell half a product, the forecast in each week is rounded to the nearest integer. When new data becomes available from recently introduced products, the algorithms can be trained again. They then take the new data into account, which is expected to improve the accuracy of future predictions. Besides predicting the demand, safety stock levels can be determined using the conditional distribution. For example, using the 0.9 quantile corresponds to a CSL of 90%, which implies a 90% probability of not having a stock-out. This only holds for the prediction of the total demand over 18 weeks. What the CSL becomes when the quantile is used in combination with a predicted profile should be investigated.
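The combination of a predicted profile and a predicted total demand in the operational phase can be sketched as follows. The function name, the four-week profile and the demand values are hypothetical. The sketch assumes that a profile is stored as a normalised cumulative demand curve and that the mean and the 0.9 quantile of the total demand have already been predicted.

```python
def weekly_forecast(profile, total_demand):
    """Spread a total demand over the weeks of a cumulative profile.

    profile is a normalised cumulative demand curve that ends at 1.
    """
    cumulative = [share * total_demand for share in profile]
    # Weekly demand is the increase of the cumulative demand per week.
    weekly = [cumulative[0]] + [
        later - earlier for earlier, later in zip(cumulative, cumulative[1:])
    ]
    # Round to whole products; note that Python rounds exact .5 ties to even.
    return [round(w) for w in weekly]


# Hypothetical predictions for one new product.
predicted_profile = [0.5, 0.8, 0.95, 1.0]
predicted_mean = 40          # expected total demand
predicted_quantile_90 = 60   # 0.9 quantile of the total demand

expected_forecast = weekly_forecast(predicted_profile, predicted_mean)
upper_forecast = weekly_forecast(predicted_profile, predicted_quantile_90)
print(expected_forecast)
print(upper_forecast)
```

Because every week is rounded separately, the weekly values do not necessarily add up to the predicted total. As stated above, the 90% service level applies to the total over the period and is not guaranteed for the individual weeks of `upper_forecast`.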