3.3 Data Characteristics
3.3.1 Hierarchies & Article Clusters
The organizational structure of retailers can often be represented as a hierarchy (see Figure 3.1). In the retail domain, stores are typically grouped into regions that have their own distribution centers. Moreover, the articles offered by the retailer form a hierarchy themselves, i.e., several articles can be grouped into an article category. Hence, various hierarchies can be built and exploited for decision support. Demand observations at the store-article level can be aggregated to various higher levels of the hierarchy:
• Region (RX): The total quantity sold within the region.
• Region Category (RC): The total quantity sold of articles of a specific category (cluster) within the region.
• Region Article (RA): The total quantity sold of a specific article within the region.
• Store (SX): The total quantity sold at a specific store.
• Store Category (SC): The total quantity sold of articles of a specific category (cluster) at a specific store.
• Store Article (SA): The total quantity sold of a specific article at a specific store.
[Figure 3.1, panels: (a) RX-RC-SC-SA, (b) RX-RC-RA-SA, (c) RX-SX-SC-SA]
Figure 3.1: The figures illustrate hierarchies that can be used for hierarchical forecasting. Each node refers to a time series of a certain level that is identified by two letters. The first letter specifies a region (R) or a store (S). The second letter specifies an article (A), an article group (C), or the group of all articles (X).
While our goal is decision optimization at the store-article level (see Section 1.2), other levels of the hierarchy are also of interest. For instance, the category levels (RC + SC) can be used to monitor the demand within a group of items and to validate the predictions at article level (RA + SA). Studies indicate that perishable articles (e.g. baked goods) have a high substitution rate in case of stock-outs, i.e., customers buy another article of the same category in the same store (van Woensel et al., 2007). Hence, it is important to maintain a high service level for at least one product of a cluster of substitutable items, if not for every article. By accepting that some articles may be out of stock, the total amount of waste can be limited while the expected revenue loss might be acceptable due to the substitution effects (van Donselaar et al., 2006).
Predicting the aggregated demand of the complete cluster might lead to more accurate forecasts, because the time series of single articles can be more volatile and distorted by various effects (e.g. stock-outs). More accurate forecasts in turn help to reduce the risk of excessive stock levels and stock-outs for the whole article group. Moreover, forecasts for a group of substitutable items are valuable if the assortment is changing or some articles are temporarily not available due to delivery problems or item damage. For instance, the demand forecast of an article group can be used to estimate the demand for a new article if a seasonal article gets replaced.
For completeness, we also introduce the total sales aggregates (SX + RX), which are not directly linked to the optimization of order quantities but are relevant for revenue forecasts. For instance, revenue forecasts at store level are relevant for staffing decisions.
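The aggregation of store-article observations to the six levels listed above can be sketched in a few lines. This is a minimal illustration, not the implementation used in this work: the record layout, the identifiers (R1, S11, C1, A11, ...) and the quantities are hypothetical, and the time dimension is omitted for brevity.

```python
from collections import defaultdict

# Hypothetical demand observations at store-article level:
# (region, store, category, article, quantity sold)
observations = [
    ("R1", "S11", "C1", "A11", 40),
    ("R1", "S11", "C1", "A12", 25),
    ("R1", "S11", "C2", "A21", 10),
    ("R1", "S12", "C1", "A11", 30),
    ("R1", "S12", "C2", "A21", 15),
]

FIELDS = ("region", "store", "category", "article")

# Each level is defined by the fields that identify one of its time series.
LEVELS = {
    "RX": ("region",),
    "RC": ("region", "category"),
    "RA": ("region", "article"),
    "SX": ("store",),
    "SC": ("store", "category"),
    "SA": ("store", "article"),
}


def aggregate(rows, level):
    """Sum the sold quantities over all rows that share the level's key."""
    positions = [FIELDS.index(field) for field in LEVELS[level]]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


totals_by_level = {level: aggregate(observations, level) for level in LEVELS}
```

In practice, each key would additionally carry a date so that every node of the hierarchy yields a time series rather than a single total.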
3.3.1.1 Article Clustering Approach
In order to gain the most benefit from a hierarchy, it is important to rely on meaningful article clusters. Therefore, we aim to identify groups of comparable products that are potentially substitutes and also beneficial with respect to demand estimation.
For example, Kalchschmidt et al. (2006) cluster customers of warehouses (e.g. stores) according to various criteria (e.g. weekly sales pattern, penetration rate) in order to obtain homogeneous groups. Zotteri et al. (2005) cluster time series based on their characteristics (e.g. demand pattern) rather than more intuitive but misleading features like the allocation to a distribution center (e.g. geographical proximity). The demand of each group becomes less uncertain and variable, which leads to more accurate predictions at company level.
We propose to detect article groups automatically by clustering articles according to their intraday sales patterns. In order to perform a cluster analysis, we transform the point-of-sales data into feature vectors $P_{a,q,w}$ representing the intraday sales pattern for each article $a$ in a specific quarter $q$ on each weekday $w$. Hence, with four quarters per year, each article is represented by 24 (Monday - Saturday) or 28 (Monday - Sunday) vectors.
$$P_{a,q,w} = (p_{a,q,w,1}, p_{a,q,w,2}, \ldots, p_{a,q,w,T}) \tag{3.1}$$
We introduce a vector for each weekday and quarter in order to reveal possible differences in the demand patterns and to cover seasonal aspects. For instance, the demand patterns of working days and weekends could be distinguishable. Moreover, different environmental factors (e.g. weather conditions) might cause different demand patterns in the summer compared to the winter. The length of $P_{a,q,w}$ depends on the maximal number of hours $T$ during which the stores are open. Each element $p_{a,q,w,t}$ represents the average relative proportion of the total daily sales that is sold in the respective hour $t$.
$$p_{a,q,w,t} = \frac{s_{a,q,w,t}}{\sum_{t'=1}^{T} s_{a,q,w,t'}} \tag{3.2}$$
The variable $s_{a,q,w,t}$ represents the total sales of an article $a$ in quarter $q$ on weekday $w$ and hour $t$.
We cluster the generated features with the k-means algorithm. The algorithm ensures that each vector is assigned to exactly one cluster (strict partitioning). Moreover, the center of a cluster can be interpreted as a general demand pattern of the allocated articles. In order to apply the algorithm, one has to set the number of clusters k, which should match the characteristics of the demand patterns. Therefore, we suggest applying agglomerative hierarchical clustering in a preceding step, as this helps to reveal a hierarchical structure and to determine a suitable number of clusters. A suitable linkage criterion for our use case is Ward's method (Ward Jr, 1963), which at each step merges the pair of clusters whose merger yields the smallest increase in within-cluster variance.
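The construction of the feature vectors in Equations 3.1 and 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(article, quarter, weekday, hour, quantity)`, the function name `build_patterns`, the zero-based hour index and the sample quantities are all hypothetical. The sketch follows Equation 3.2 literally, i.e., it divides the summed sales per hour by the summed sales over all hours.

```python
from collections import defaultdict


def build_patterns(records, n_hours):
    """Return {(article, quarter, weekday): [p_1, ..., p_T]} as in Eq. 3.2.

    records: iterable of (article, quarter, weekday, hour, quantity), where
    hour is a zero-based index into the T = n_hours opening hours.
    """
    # s[a, q, w][t]: total sales of article a in quarter q on weekday w, hour t
    totals = defaultdict(lambda: [0.0] * n_hours)
    for article, quarter, weekday, hour, quantity in records:
        totals[(article, quarter, weekday)][hour] += quantity

    patterns = {}
    for key, hourly in totals.items():
        day_total = sum(hourly)
        # Without any sales the shares are undefined; keep the zero vector.
        patterns[key] = (
            [s / day_total for s in hourly] if day_total > 0 else hourly
        )
    return patterns


# Hypothetical point-of-sales records with T = 4 opening hours.
records = [
    (101, 1, "Mon", 0, 6),
    (101, 1, "Mon", 1, 3),
    (101, 1, "Mon", 2, 1),
    (101, 1, "Mon", 0, 2),  # a second Monday within the same quarter
    (201, 1, "Mon", 1, 2),
    (201, 1, "Mon", 3, 2),
]
patterns = build_patterns(records, n_hours=4)
```

Each resulting vector sums to one, so articles with very different sales volumes remain comparable in the subsequent cluster analysis.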
[Figure 3.2, dendrogram panels: (a) Working Days (Mon - Fri), (b) Weekend (Sat); the leaves are the article ids 101-106 and 201-205]
Figure 3.2: Hierarchical cluster analysis based on intraday sales patterns of articles on working days and the weekend. The dataset yields two main clusters which match the two article categories.
After the feature vectors are allocated to clusters using the k-means algorithm, we determine the final article groups by majority vote. This is necessary as each article is represented by several feature vectors, and it is not guaranteed that all feature vectors are part of the same cluster. Thus, we assign an article to the cluster to which most of its feature vectors belong. The obtained article clusters can be complemented with the organizational structure to build the hierarchy.
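The majority vote described above can be sketched as follows. The function name, the input format and the example labels are hypothetical. The tie-breaking rule (smallest cluster label wins) is an assumption of this sketch, since the text does not specify one.

```python
from collections import Counter, defaultdict


def majority_vote(vector_labels):
    """Assign each article to the cluster holding most of its feature vectors.

    vector_labels: iterable of (article, cluster_label) pairs, one pair per
    feature vector. Ties are broken in favour of the smallest label, which
    is an assumption of this sketch.
    """
    votes = defaultdict(Counter)
    for article, label in vector_labels:
        votes[article][label] += 1

    assignment = {}
    for article, counts in votes.items():
        best = max(counts.values())
        assignment[article] = min(l for l, c in counts.items() if c == best)
    return assignment


# Hypothetical k-means output: one (article, cluster) pair per feature vector.
labels = [
    (101, 0), (101, 0), (101, 1),
    (201, 1), (201, 1), (201, 0), (201, 1),
]
article_cluster = majority_vote(labels)
```

Here article 101 has two of its three vectors in cluster 0 and article 201 has three of its four vectors in cluster 1, so they end up in different groups.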
3.3.1.2 Article Clustering Evaluation
We apply the proposed clustering approach to dataset v2, which contains 6 buns (ids: 101-106) and 5 breads (ids: 201-205). The hierarchical cluster analysis reveals that the demand patterns of working days are distinguishable from those of weekends. Moreover, we observe that the two groups of articles match the article groups buns and breads (see Figure 3.2), i.e., the demand patterns of buns and breads are clearly distinguishable. Based on these observations, we decide to split the feature vectors into one set that contains feature vectors of working days and another set that contains all feature vectors of weekend sales patterns. For each set of vectors, we apply k-means with k = 2.
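The procedure of this evaluation (split by weekday type, inspect the structure with Ward linkage, then run k-means with k = 2 on each set) can be sketched with SciPy and scikit-learn. The feature vectors below are synthetic stand-ins and not taken from dataset v2. The base patterns, the noise level, the reduced set of articles and weekdays, and all function names are assumptions made for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans

WORKING_DAYS = {"Mon", "Tue", "Wed", "Thu", "Fri"}

# Synthetic intraday patterns with T = 5 hours (assumed, not thesis data).
BASE = {
    ("bun", "work"): [0.40, 0.25, 0.15, 0.12, 0.08],
    ("bun", "sat"): [0.45, 0.30, 0.15, 0.07, 0.03],
    ("bread", "work"): [0.15, 0.20, 0.20, 0.30, 0.15],
    ("bread", "sat"): [0.25, 0.25, 0.25, 0.15, 0.10],
}


def make_features(seed=0):
    """Create noisy pattern vectors keyed by (article, quarter, weekday)."""
    rng = np.random.default_rng(seed)
    features = {}
    for article in (101, 102, 103, 201, 202):
        kind = "bun" if article < 200 else "bread"
        for quarter in (1, 2):
            for weekday in ("Mon", "Tue", "Sat"):
                day_type = "work" if weekday in WORKING_DAYS else "sat"
                base = np.array(BASE[(kind, day_type)])
                noisy = np.clip(base + rng.normal(0, 0.01, base.size), 0, None)
                features[(article, quarter, weekday)] = noisy / noisy.sum()
    return features


def cluster_patterns(features, k=2, seed=0):
    """Cluster one set of vectors with Ward linkage and with k-means."""
    keys = list(features)
    X = np.array([features[key] for key in keys])
    # Cutting the Ward tree into k groups previews the cluster structure.
    ward = fcluster(linkage(X, method="ward"), t=k, criterion="maxclust")
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    return keys, ward, kmeans


features = make_features()
subsets = {
    "working": {k: v for k, v in features.items() if k[2] in WORKING_DAYS},
    "weekend": {k: v for k, v in features.items() if k[2] not in WORKING_DAYS},
}
results = {name: cluster_patterns(subset) for name, subset in subsets.items()}
```

On real data one would inspect the dendrogram before fixing k. Here k = 2 is simply taken from the text.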
The results of the cluster analysis are depicted in Figure 3.3. Overall, the resulting clusters are quite pure and accurate compared to the given article category assignment. In this case, we use the original category assignment as the gold standard, as the two categories already contain substitutable goods and thus are reasonable clusters. However, this need not be the case in other scenarios. The cluster analysis shows that the demand patterns for buns (see Figures 3.3a & 3.3b) are distinguishable from breads (see Figures 3.3c & 3.3d). Based on these results, articles 101-106 (buns) and articles 201-205 (breads) can be grouped. Those clusters can be used for hierarchical forecasting.
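One common way to quantify how pure a clustering is with respect to a gold standard is the purity measure: the share of items whose category equals the majority category of their cluster. The text does not state which measure was used, so the following is only an illustrative sketch. The article ids and categories follow the description of dataset v2, whereas both cluster assignments are hypothetical.

```python
from collections import Counter, defaultdict


def purity(cluster_of, category_of):
    """Share of articles that match the majority category of their cluster."""
    members = defaultdict(list)
    for article, cluster in cluster_of.items():
        members[cluster].append(category_of[article])
    matched = sum(
        Counter(categories).most_common(1)[0][1]
        for categories in members.values()
    )
    return matched / len(cluster_of)


# Gold standard: ids 101-106 are buns, ids 201-205 are breads.
category_of = {a: "bun" for a in range(101, 107)}
category_of.update({a: "bread" for a in range(201, 206)})

# Hypothetical assignment that reproduces the categories exactly.
perfect = {a: (0 if a < 200 else 1) for a in category_of}

# Hypothetical assignment with one bun placed in the bread cluster.
one_off = dict(perfect)
one_off[105] = 1

purity_perfect = purity(perfect, category_of)
purity_one_off = purity(one_off, category_of)
```

With eleven articles, a single misplaced article lowers the purity from 1 to 10/11.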
[Figure 3.3, panels: (a) Buns (working days), (b) Buns (weekend), (c) Breads (working days), (d) Breads (weekend); x-axis: hour, y-axis: pct.]
Figure 3.3: The intraday sales patterns are clustered using k-means. The clusters match the article categories and the type of the weekday.
It is also worth noting that the patterns for different buns (breads) are similar, which underlines the assumption that they have comparable characteristics with respect to the customer demand. The clusters show that buns are mostly sold in the mornings, while the demand for breads is higher in the afternoon. This suggests that buns are the preferred product in the morning, whereas bread sales are more evenly distributed over the day. Moreover, we observe peaks during lunchtime and in the afternoon, which seem to be related to the working times of employees in Germany. We also observe that the demand patterns of working days are different from the weekend. On Saturday, the demand for buns is very high in the mornings and drops continuously during the day. For breads, we do not observe the second peak in the afternoon that we see on working days. For all clusters, we observe that the sales drastically decrease during the last opening hours due to lower demand. Hence, running out of stock during the last opening hour may not have a big impact on the revenues and might be acceptable if it decreases the amount of discarded goods.