Perishable Products: Case 4 - Value Iteration for Perishable Inventory Control

4.4.1 Results

Case 4 differs from the other cases by the fact that the inventory is perishable. The products expire after being held for 1 day. This means two kinds of inventory exist (at the beginning of the day): 0-days old inventory I0, and 1-day old inventory I1. When keeping the perishability aside, the case has similarities with case 2. Table 4.7 shows the used input data in this case. However, multiple situations containing varied fixed costs k and varied holding costs h values are simulated in this case.

Table 4.7: Used input data.

Table 4.8: The average costs of order policies, when back-ordering is not permitted and selling price is taken into account.

Avg. Profit

k = 0 k = 1 k = 0 k = 1 k = 4

h = 0.1 h = 0.1 h = 0.25 h = 0.25 h = 0.25 Value Iteration 3.152 2.259 2.957 2.080 0.0

Simulation 3.157 2.259 2.963 2.088 0.0

δ[SIM − V I] 0.005 0 0.006 0.008 0

δ[ratio] 0.0016 0 0.002 0.0038 0

Value Iteration

The minimum inventory (I) used for the VI is 0, because back-ordering is not permit-ted. The maximum inventory is the Order-up-to level S, 5. The obtained optimal order quantity is depicted in Table 4.9, when k = 1 and h = 0.25. For both I0 and I1, the range of inventory is between I to S. Table 4.9 is an example of one of the possibilities of an optimal order quantity matrix, in this case the optimal order quantity is based on k = 1 and h = 0.25. This matrix exists for every k/h -case, depicted in a 3D-graph below.The frequency of occurrence is only calculated in the simulation. Therefore the figures below are based on the optimal order quantity of VI and the frequency is based on the simulation model.

Table 4.9: The optimal order quantity, when I₀and I₁have a certain amount of inventory

Table 4.8 compares the profit of VI against profit of the simulation, for multiple k and h values. The highest profit is obtained when k = 0 and h = 0.1. And the lowest profit is obtained when k = 1 and h = 0.25. Keeping the situation with k = 4 and h = 0.25 out of scope, when no profit was made. The difference in average profit, shows how close the simulation is to the optimal found after using VI. All differences are p(δ) < α, when α = 0.05. To know whether the order policy turns periodic, it is important to check the ordered order quantities. Table 4.10 seems to show only when k = 0 and h = 0.25, the order policy is not periodic. However, when expanding the array with ordered order quantities, it turns out all orders containing k ≤ 1 are non-periodic. If k > 1, the orders become periodic.

Table 4.10: Order quantities based on time t.

k = 4, h = 0.25 q=[0,0,0,..,0,0]

Figures 9.3a- 9.6b (Appendix), show the outcome of the optimal order quantity of VI, and the frequency as simulation output. Table 4.9 is the optimal order quantity when k = 1 and h = 0.25, this situation will be explained in this chapter, whereas the other situations are built upon the same strategy.

(a) Frequency of occurrence of inventory I₀and I1(k = 1 and h = 0.25).

(b) Optimal order quantity when containing two levels of perishability (I0and I1), the inventories could not be negative. For k = 1 and h = 0.25

Figure 4.10: Frequency and Optimal order quantity for k = 1 and h = 0.25.

4.4.2 Conclusion

The fixed costs are of great importance in determining the correct order policy. Due to perishable inventory it is not possible to order great amounts in advance to keep the fixed order costs low. For such high fixed costs it is not possible to make profit any more, no orders are placed (in this case when k ≥ 3). When it is still possible to make a slight profit, the orders become periodic. Which means that orders are made depending on perishability. Which in this case is every other day (when k = 2). When the fixed costs are low enough to order independent of the perishability (cost wise), an order policy could be found. Still the order quantities are highly bound to the expected demand.

It could differ slightly, because of its one-day perishability but only what the Poisson distribution allows. Changing the holding cost h does not affect the order policy.

It is important to keep the frequency of occurrence graph aside of the optimal order policy figure. To know which orders are most likely to happen, which in this thesis is 0 or 3. However, the sequence of ordering shows it is not periodic.

4.4.3 Discussion

In this case no Markov stationary state was made to compare with the frequency of the simulation. However, the average profits of the VI and the simulation do match.

Chapter 5 Curse of Dimensionality

To tackle the curse of dimensionality it is important to know how to deal with it theo-retically. In Chapter 1 the curse of dimensionality was described, this chapter explains how to deal with it and how to set-up correct limits for a VI. Powell (2007) wrote the state space, the outcome space and the action space should be kept small. This also means that the boundaries need to be strict, this prevents calculating infeasible so-lutions. In this thesis, the curse of dimensionality is mainly be tackled at the state space.

The state space s has to do with all states that a model can have (Kettenis and Van der Vorst, 2007). Both Powell (2007) and Hendriks (2007) recalled the state space as the reason for existing of the curse of dimensionality. Hendriks (2007) specifically states for the Bellman equation, which is used in this thesis, that an increase of dimension also results in an increase of state space. The Bellman equation encounters ∀s ∈ #S : V (s), in which #S is the number of elements in set S. When increasing with dimension n, the calculation time will increase as well. For example, if s could have 10 values, the stored #S = 10ⁿ. This means when the VI has 3 dimensions, the Bellman equation should encounter the calculation 10³ = 1000 times in each stage. Besides #S also #X influences the calculation time for a Bellman calculation (Hendriks, 2007).

2D is used when the inventory does not have an expiring date, 3D when the product expires after 1 day, 4D after 2 days etcetera. As stated earlier, the amount of calculations required per increase of dimension, increases exponentially. In this thesis, the state space contains inventory and order quantity, which goes up to three dimensions. The Poisson distribution is used to keep the state space of inventory as small as possible. The Poisson distribution is based on the replenishment cycle and two times the average demand µ.

difference until infinity. If the program is not running well, or requires too many loops, an can be introduced in a while-loop. This causes the model to stop after a specific number of iterations. This is also confirmed by Rust (1997), he states when a prob-lem (in their case a Markov Decision Probprob-lem, M DP ) is discrete, the probprob-lem could be solved exactly. However, when a M DP is continuous, an arbitrary small solution tolerance should be encountered, which is in this thesis α. This is part of the Bellman’s curse of dimensionality (Rust, 1997).

Pakes and McGuire (2001) wrote that two causes of the curse of dimensionality are found, 1) in the number of points evaluated at each iteration and 2) the time per point required. The first statement is tackled in this thesis by limiting the following input the average demand µ was set low, to avoid extra calculations. The second statement, the reduction of time per iteration, is tackled by replacing do-loops by minimization and maximization functions. max(d, 0) e.g. instead of a do-loop of multiple lines.

So, according to the references used in the text above, multiple reasons for the curse of dimensionality exist. From state space to time required per iteration, all possible causes of the ”curse” were encountered in this thesis, and decreased in size as much as possible.

Chapter 6 General Conclusion

According to the outcomes of the models of Case 1-3, when having non-perishable prod-ucts, the optimal order policy turns out to be an (s, S)- order policy. This is also in line with what Tekin et al. (2001) said (Chapter 2). The first case is allowed to have a negative inventory level, due to back-ordering. The found order policy follows exactly the behaviour of how Pauls-Worm (2007) did describe an (s, S)- order policy (Chapter 2). The next case has its inventory levels limited by setting the minimum inventory level at I = 0 and loss of sales is a possibility. Despite of these limitations, the (s, S)- order policy is still optimal, however it shows a dip at inventory level I = 0. The same happens when a customer service level is set and should be met. This happens by introducing a minimum re-order level s₀. So, for non-perishable inventory the optimal order policy is an (s, S)- order policy.

For perishable inventory it is more difficult to find an optimal order policy. This policy depends on the amount of fixed costs k, changing the holding cost h did not in-fluence the outcome of the model. High fixed costs cause periodic ordering, but too high fixed costs cause no ordering at all. However, when optimizing the order policy with low fixed costs, an order policy is shown. The inventory levels which are most likely to occur in case 4, are shown in figure 4.10a. These most occurring inventory levels are equal to: 0, µ or µ + 0.5µ. The values are concluded from the frequency of occurrence graph in combination with the optimal order quantity graph (figure 4.10b). This policy depends on the perishability of the product and the service level put into the distribution.

When comparing the VI with the results of the simulation, it turns out that the

To deal with the curse of dimensionality it is important to limit the state space (most important in this thesis). This curse of dimensionality especially happen when the Bell-man equation is used. When increasing the dimension, the number of state elements in the set will increase exponentially. In this thesis, the increase is caused by adding perishability. In case of non-perishability the model is 2D, do the products expire after 2 days, the model turns to 3D, etcetera. The first attempt to keep the state space as small as possible, was by limiting the inventory levels by using the Poisson distribution. The second attempt was to keep the state space of order quantity as small as possible was by first determining the EOQ to know what the global optimal order quantity will be.

These two approaches were used to keep the state space as small as possible, however by introducing a span the running time is reduced as well.

To wrap up, the value iteration is based on the Bellman equation. This equation is used in all cases. By changing the aim slightly per case, it was possible to see what happened step-by-step. A general pseudo-code was designed which applied for all cases and functions as back-bone of this research. The value iteration for perishable inventory was more difficult, because another dimension was required. Adding the dimension was the only big difference regarding the calculations of the used pseudo-code. So, value iteration of perishable inventory does have a lot of similarities compared to non-perishable inventory. However, dimensions should be added, which makes it complicated when the time of perishability increases. Complicated in sense of calculation, but also the number of states increase, which could result in problems caused by the curse of dimensionality.

Chapter 7 General Discussion

The optimized costs or profit, depending on the case, are (nearly) similar. This means the simulation is a close evaluation of the value iteration. However, when comparing the frequency of occurrence, obtained by the Markov stationary state, deviations are visible.

The probability of occurrence has a well shaped and distributed figure. Whereas the frequency of occurrence does show multiple outliers.

In Chapter 4 all cases were discussed. Case 1 had an increase at the end of the frequency of occurrence graph. Case 2 showed a dip at the inventory level I = 0, of the optimal order quantity Qop(I)-graph. This is not in line with the (s, S)-order policy. Another deviation in Case 2 was the frequency of occurrence at inventory level I = 10 of the frequency of occurrence graph. This deviation could be explained by the fact that the order-up-to level S = 12, but the leadtime is 1 day, this means that the expected de-mand µ should be withdrawn. Case 3 did show unexpected dede-mand patterns for fixed costs lower than 3. Case 4 does not have the probability of occurrence graph based on the Markov matrix. This makes it only possible to compare the outcomes of average profit by the simulation and the value iteration. And to make an estimation of the most frequent ordered order quantities.

Chapter 8 Further Research

For further research the increase of average demand could be included, it is worth know-ing what happens with the optimal order quantity and model when larger values are used. Another interesting topic is to find the influence of the fixed costs k, using VI and evaluation by simulation. The fixed costs turn out to be of great influence in this research. It is interesting to know what happens, using different fixed costs k, when the shelf-life of the inventory increases. This will influence the order policy, it could become periodic or even no orders placed.

Another case for further research are the declines at Qop(0), shown in the figures 4.5 and 4.8. These declines seem to be regularly, which means there must be an explanation for this behaviour.

In case 3, the order policies for low fixed costs show large deviations. It is interesting to find out what these deviations mean and where they come from. After checking the model, the orders were not periodic for small fixed costs. However, this assumption should be included in further research.

For all cases applies the simulation is close to the outcomes of the VI. Comparing the fre-quency of occurrence (simulation) and the probability of occurrence (Markov stationary state of VI), the figures do have many the same characteristics. However, some of the simulations do show strange outliers, case 2 e.g., for further research it is worth knowing why the simulations do show outliers and to check the reliability of these outliers.

A challenge for further research could be the comparison of the value iteration used in this thesis, with the (Q, r, T )-order policy of Tekin et al. (2001). Because it focusses on two triggers, the inventory level s and the shelf-life T . It is interesting to know whether the value iteration and the (Q, r, T )-order policy are close to each other when measuring costs. Both policies do take the shelf-life and the inventory level into account when determining the order quantity.

Bibliography

R. Bellman. A markovian decision process. Journal of Mathematics and Mechanics, pages 679–684, 1957.

R. Bellman. Dynamic programming. 153(3731):34–37, 1966. doi: 10.1126/sci-ence.153.3731.34.

L. Bertazzi, G. Paletta, and M. Speranza. Deterministic order-up-to level policies in an inventory routing problem. Transportation Science, 36(1):119–132, 2002.

R. A. Broekmeulen and K. H. Van Donselaar. A heuristic to manage perishable inventory with batch ordering, positive lead-times, and time-varying demand. Computers &

Operations Research, 36(11):3013–3018, 2009.

S. Chopra and P. Meindl. Supply Chain Management: Strategy, Planning, and Operation, volume 6. 2016.

EC. Preparatory study on food waste across eu 27, 2010.

M. Eriksson. Retail Food Wastage. PhD thesis, 12 2012.

FAO. Save food: Global initiative on food loss and waste reduction, 2011. URL http://www.fao.org/docrep/014/mb060e/mb060e.pdf.

S. Fawcett, L. Ellram, and J. Ogden. Supply Chain Management From Vision to Imple-mentation. Pearson Education Limited, 2014.

H. Fischer. History of the central limit theorem : from classical to modern probability theory, chapter Markov Processes, pages 19–27. Springer, 2011a.

H. Fischer. History of the central limit theorem : from classical to modern probability theory, chapter Poisson Processes, pages 17–19. Springer, 2011b.

T. Hendriks. Inventory management. In G. Claassen, T. Hendriks, and E. Hendrix, editors, Decision Science: Theory and Applications, chapter 8, pages 181–183. Wa-geningen Acad. Publ., 2007.

E. Hendrix, G. Ortega, R. Haijema, M. Buisman, and I. Garcia. On computing optimal policies in perishable inventory control using value iteration. 2018.

E. Hendrix, C. Kortenhorst, and G. Ortega. On computational procedures for value iteration in inventory control. IFAC, 2019.

R. M. Hill and S. G. Johansen. Optimal and near-optimal policies for lost sales inven-tory models with at most one replenishment order outstanding. European journal of operational research, 169(1):111–132, 2006.

D. Kettenis and J. Van der Vorst. Inventory management. In G. Claassen, T. Hendriks, and E. Hendrix, editors, Decision Science: Theory and Applications, chapter 13, page 344. Wageningen Acad. Publ., 2007.

S. Nahmias. Perishable inventory systems, volume 160. Springer Science & Business Media, 2011.

A. Pakes and P. McGuire. Stochastic algorithms, symmetric markov perfect equilibrium, and the ‘curse’of dimensionality. Econometrica, 69(5):1261–1281, 2001.

J. Parfitt, M. Barthel, and S. Macnaughton. Food waste within food supply chains:

quantification and potential for change to 2050. Philosophical transactions of the royal society B: biological sciences, 365(1554):3065–3081, 2010.

K. Pauls-Worm. Inventory management. In G. Claassen, T. Hendriks, and E. Hen-drix, editors, Decision Science: Theory and Applications, chapter 11, pages 257–284.

Wageningen Acad. Publ., 2007.

V. Powell and L. Lehe. Markov chains. URL http://setosa.io/ev/markov-chains/.

W. Powell. Approximate Dynamic Programming: Solving the curses of dimensionality, volume 703. John Wiley & Sons, 2007.

D. Prak, R. Teunter, and J. Riezebos. Periodic review and continuous ordering. European Journal of Operational Research, 242(3):820–827, 2015.

Prasanth. Simple statistics with scipy, 2011. URL

https://oneau.wordpress.com/2011/02/28/simple-statistics-with-scipy/.

J. Rust. Using randomization to break the curse of dimensionality. Econometrica:

Journal of the Econometric Society, pages 487–516, 1997.

E. Tekin, ¨U. G¨urler, and E. Berk. Age-based vs. stock level control policies for a per-ishable inventory system. European Journal of Operational Research, 134(2):309–329,

H. Tijms and H. Groenevelt. Simple approximations for the reorder point in periodic and continuous review (s, s) inventory systems with service level constraints. European Journal of Operational Research, 17(2):175 – 190, 1984. ISSN 0377-2217.

K. Van Donselaar, T. van Woensel, R. Broekmeulen, and J. Fransoo. Inventory control of perishables in supermarkets. International Journal of Production Economics, 104 (2):462–472, 2006.

M. Verleysen and D. Fran¸cois. The curse of dimensionality in data mining and time series prediction. In International Work-Conference on Artificial Neural Networks, pages 761–767. Springer, 2005.

Chapter 9 Appendix

9.1 List of Symbols

Value function over state s V_s

State s

Previous value function over state s Ws

Contribution of state s and action x C(s, x)

Expected contribution of state s and action x and expectation e EC(s, x, e)

Transition of state s and action x T (s, x)

Alpha service level α

Minimum order quantity q or qmin

Maximum order quantity Q or Qmax

Chapter 3

Valuation calculation per order quantity vs

Maximum difference for optimum

In document Value Iteration for Perishable Inventory Control (Page 49-63)