Defining benchmark methods

To compare the proposed methods with the current situation, we would ideally use the planners' actual historical manual forecasts and orders. That would show whether the proposed methods improve on the current way of working. Unfortunately, these data are not available, so we need to define other benchmarks for our proposed methods. The first benchmark is the Zero Rule, abbreviated to ZeroR, which simply predicts the average output of the training data for the test data (Amasyali & Ersoy, 2009). In our case, it uses the average demand of each week of the training data as the weekly forecast for each new

CHAPTER 4. PROPOSED METHOD AND EXPERIMENTAL DESIGN

product, or the average over the complete introduction period as the forecast for the total demand of each product. For prediction intervals or safety stock calculations, we extend ZeroR by using the quantiles of each week from the training data, or the quantiles of the complete introduction period.
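The ZeroR benchmark described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the linear-interpolation quantile, and the toy training data (three products, four weeks) are hypothetical and do not come from the thesis.

```python
from statistics import mean


def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def zero_r_weekly(train_demand, q=None):
    """Per-week mean (or q-quantile) across all training products."""
    weeks = zip(*train_demand)  # transpose: one tuple of demands per week
    if q is None:
        return [mean(week) for week in weeks]
    return [quantile(week, q) for week in weeks]


def zero_r_total(train_demand, q=None):
    """Mean (or q-quantile) of the total demand over the introduction period."""
    totals = [sum(product) for product in train_demand]
    return mean(totals) if q is None else quantile(totals, q)


# Hypothetical training data: one row per product, one column per week.
train = [
    [10, 12, 8, 6],
    [20, 18, 16, 10],
    [30, 24, 18, 14],
]

weekly_forecast = zero_r_weekly(train)    # identical for every new product
total_forecast = zero_r_total(train)      # forecast of the total demand
weekly_p90 = zero_r_weekly(train, q=0.9)  # upper quantile per week
```

Because ZeroR ignores product characteristics, every new product receives the same forecast and the same quantiles.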

ZeroR is a simple benchmark: it makes no distinction between products and bears no relation to how forecasts are currently made. Therefore, we also want a method that imitates the decisions of a planner. As described in Section 1.2, planners usually discuss the forecasts with managers during S&OP meetings, and they often base their estimates on one similar existing product for which they expect similar demand. A discussion during an S&OP meeting is difficult to imitate, but finding similar existing products should be possible.

A simple approach to finding a similar product is to determine the ‘nearest neighbour’ based on the Euclidean distance between product characteristics. A constraint of the Euclidean distance is that all characteristics must be numerical. Moreover, all characteristics are weighted equally, whereas a planner may know that only a few characteristics are significant predictors. Therefore, the nearest neighbour is in our case not suitable for mimicking the decisions of a planner.
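For illustration, a nearest-neighbour lookup on the Euclidean distance could look as follows. The product names and the three numerical characteristics are hypothetical; the sketch only shows the mechanics of the approach that the paragraph above sets aside.

```python
import math


def nearest_neighbour(new_product, existing_products):
    """Return the name of the existing product closest to new_product."""
    return min(
        existing_products,
        key=lambda name: math.dist(existing_products[name], new_product),
    )


# Hypothetical characteristics: (price, pack size, shelf life in days).
existing = {
    "A": (2.50, 6, 30),
    "B": (9.99, 1, 365),
    "C": (3.00, 12, 45),
}
new = (2.80, 6, 35)

closest = nearest_neighbour(new, existing)
```

Every characteristic enters the distance with the same weight, so a characteristic on a large scale (here the shelf life) dominates the result regardless of how well it predicts demand.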

Another way to measure similarity between products is the proximity measure of the Random Forest algorithm. As mentioned in Section 3.2.4, the proximity between two products is the percentage of trees in which they end up in the same leaf node. With the proximity measure, we can identify the existing product that is most similar to the new product. A Random Forest weighs the different features to find the best value for the prediction. In other words, the product characteristics are weighted to find products with similar demand, which may resemble the domain knowledge or experience of a planner. Hence, we should be able to use the demand of the closest existing product as a benchmark for the actual prediction of the Random Forest.
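The proximity benchmark can be sketched as below, assuming that the leaf index of every product in every tree is already available (for example, scikit-learn's forest estimators expose an `apply` method that returns such indices). The leaf indices, product names and demand figures are hypothetical.

```python
def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two products end up in the same leaf node."""
    same = sum(a == b for a, b in zip(leaves_a, leaves_b))
    return same / len(leaves_a)


def most_similar(new_leaves, existing_leaves):
    """Name of the existing product with the highest proximity to the new one."""
    return max(
        existing_leaves,
        key=lambda name: proximity(new_leaves, existing_leaves[name]),
    )


# Hypothetical leaf indices in a forest of five trees (one entry per tree).
existing_leaves = {
    "A": [3, 7, 2, 9, 4],
    "B": [3, 1, 2, 8, 4],
    "C": [5, 7, 6, 9, 1],
}
new_leaves = [3, 7, 2, 9, 1]

# Hypothetical historical demand over the introduction period.
historical_demand = {"A": 120, "B": 45, "C": 300}

analogue = most_similar(new_leaves, existing_leaves)
benchmark_forecast = historical_demand[analogue]
```

The benchmark forecast is simply the historical demand of the single closest product, which mirrors a planner picking one analogous product.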

The proximity may overestimate the accuracy of a planner's actual behaviour. The proximity of a Random Forest considers all products introduced in the past, while a planner may remember only a limited number of them. Additionally, a planner does not always use the historical demand of a comparable product as the initial forecast. Nevertheless, this method comes quite close to the actual behaviour of a planner and is more suitable than the nearest neighbour. Moreover, when the actual RF algorithm improves on the prediction from the most similar product based on proximity, it will probably also improve on the actual forecasts of a planner.

As mentioned in Section 1.2, the software of Slimstock currently uses, by default, a coefficient of variation of 0.45 and a Normal distribution for the monthly demand of new products when a planner manually sets up a forecast. After four months, all forecasting parameters are updated regularly based on the demand data. The forecast and coefficient of variation are used for safety stock calculations and order levels. Therefore, we also use the factor of 0.45 for our benchmark method. In our case, we do not forecast the demand of one month, but the demand of four months. Hence, the coefficient of variation should be scaled to four months. We scale the value by multiplying it by the square root of the number of periods. Since we scale the factor from one to four months, the new coefficient of variation becomes: 0.45·√4 = 0.9. With this value and the Normal distribution, we can determine not only imitations of the forecasts created by supply chain planners, but also prediction intervals, quantiles and safety stocks.
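Given a point forecast, the Normal distribution and the coefficient of variation of 0.9 stated above, quantiles follow directly. The sketch below takes that value as given. The function names and the forecast of 100 units are hypothetical, and treating safety stock as the service-level quantile minus the forecast is an assumption, since the text does not spell out the formula.

```python
from statistics import NormalDist

CV_FOUR_MONTHS = 0.9  # four-month coefficient of variation stated in the text


def benchmark_distribution(forecast, cv=CV_FOUR_MONTHS):
    """Normal demand distribution around a (positive) point forecast."""
    return NormalDist(mu=forecast, sigma=cv * forecast)


def prediction_interval(forecast, coverage=0.90, cv=CV_FOUR_MONTHS):
    """Central prediction interval with the given coverage."""
    dist = benchmark_distribution(forecast, cv)
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def safety_stock(forecast, service_level=0.95, cv=CV_FOUR_MONTHS):
    """Stock above the forecast needed to reach the service-level quantile.

    Assumption: safety stock = quantile minus mean.
    """
    dist = benchmark_distribution(forecast, cv)
    return dist.inv_cdf(service_level) - forecast


forecast = 100.0  # hypothetical four-month forecast from the closest product
low, high = prediction_interval(forecast)
extra = safety_stock(forecast)
```

With a coefficient of variation this large, the lower bound of the 90% interval falls below zero for any positive forecast, so in practice it would be read as zero demand.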

While the proximity is very useful for predicting the demand, it may not be very useful for the profile. Currently, planners do not have defined profiles available and may plan a stable demand for all weeks. Nevertheless, we calculate the average profile of all products in the training set and use it as a benchmark for the predicted profile. When the classification of the profiles does not improve on the average profile, it would be easier for a company to apply only the average profile to its new products instead of predicting profiles.

To see which performs best among the demandForest method, its extensions with the Log-Normal and Gamma distributions, and the benchmark methods, we test the quality of the methods in the next chapter. First, in the next section, we explain the experimental design of this analysis, which outlines the steps taken to compare the methods in a structured and comprehensive way.
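The average-profile benchmark mentioned above can be sketched as follows. The sketch assumes that a profile is the share of a product's total demand that falls in each week; this definition, the function names and the toy data are assumptions for illustration only.

```python
def profile(weekly_demand):
    """Share of the total demand per week (assumes a positive total)."""
    total = sum(weekly_demand)
    return [demand / total for demand in weekly_demand]


def average_profile(train_demand):
    """Week-by-week mean of the profiles of all training products."""
    profiles = [profile(product) for product in train_demand]
    return [sum(week) / len(profiles) for week in zip(*profiles)]


def spread(total_forecast, weekly_shares):
    """Distribute a total demand forecast over the weeks of a profile."""
    return [total_forecast * share for share in weekly_shares]


# Hypothetical training data: one row per product, one column per week.
train = [
    [10, 20, 10, 10],
    [40, 30, 20, 10],
]

benchmark_profile = average_profile(train)
weekly_plan = spread(200, benchmark_profile)
```

Averaging the shares rather than the raw demand keeps high-volume products from dominating the benchmark profile.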

In document New Product Forecasting with Analogous Products : Applying Random Forest and Quantile Regression Forest to forecasting and inventory management (Page 46-48)