Printed from https://ideas.repec.org/a/eee/proeco/v252y2022ics0925527322001670.html

Bandit-based inventory optimisation: Reinforcement learning in multi-echelon supply chains

Author

Listed:
  • Preil, Deniz
  • Krapp, Michael

Abstract

Even though base-stock policies are per se straightforward, determining them in complex, stochastic multi-echelon supply chains is often cumbersome or even analytically impossible. Therefore, a wide range of heuristics has been proposed for this purpose. This is the first study to treat the problem as a multi-armed bandit problem. In this context, we investigate two algorithms. First, we propose an approach based on upper confidence bounds and priority queues. This so-called PQ-UCB algorithm allows us to drastically reduce the runtime of upper confidence bound allocation strategies in problems with large action spaces. Second, we apply the parameter-free sequential halving (SH) algorithm. We investigate various scenarios to compare the performance of both algorithms with that of a genetic algorithm and a simulated annealing algorithm taken from the literature. Both PQ-UCB and SH outperform the two benchmark metaheuristics and require substantially less parameter-tuning effort (none at all in the case of SH). As multi-armed bandits are not yet common in inventory optimisation, we aim to emphasise their strengths and hope to promote their use in other domains of supply chain management as well.
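
To illustrate the bandit framing in the abstract, the following is a minimal sketch of the generic sequential halving idea, in which each candidate base-stock level is one arm and each pull is one noisy simulation run. It is not the authors' implementation. The single-stage simulator `simulate_cost`, its cost parameters, the uniform demand distribution, and the budget are all hypothetical stand-ins for a multi-echelon simulation model.

```python
import math
import random


def simulate_cost(base_stock, rng, periods=50, holding=1.0, backorder=9.0):
    """Toy single-stage simulation (hypothetical, not the paper's model).

    Returns the average cost per period for one base-stock level,
    with integer demand drawn uniformly from 0..20.
    """
    total = 0.0
    for _ in range(periods):
        demand = rng.randint(0, 20)
        net = base_stock - demand  # assumes zero lead time: stock is restored each period
        total += holding * max(net, 0) + backorder * max(-net, 0)
    return total / periods


def sequential_halving(arms, budget, pull, rng):
    """Return the arm with the lowest estimated cost using roughly `budget` pulls.

    Each round splits its share of the budget evenly across the surviving
    arms, then discards the worse half. No exploration parameter is needed.
    """
    survivors = list(arms)
    rounds = max(1, math.ceil(math.log2(len(survivors))))
    for _ in range(rounds):
        if len(survivors) == 1:
            break
        pulls_per_arm = max(1, budget // (len(survivors) * rounds))
        means = {
            arm: sum(pull(arm, rng) for _ in range(pulls_per_arm)) / pulls_per_arm
            for arm in survivors
        }
        survivors.sort(key=means.get)  # lowest estimated cost first
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


rng = random.Random(0)
best = sequential_halving(range(0, 31), budget=5000, pull=simulate_cost, rng=rng)
print("selected base-stock level:", best)
```

Under these toy assumptions the selected level should land close to the newsvendor quantile for a 9:1 backorder-to-holding cost ratio. The only inputs are the arm set and the simulation budget, which is consistent with the abstract's description of SH as parameter-free.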

Suggested Citation

  • Preil, Deniz & Krapp, Michael, 2022. "Bandit-based inventory optimisation: Reinforcement learning in multi-echelon supply chains," International Journal of Production Economics, Elsevier, vol. 252(C).
  • Handle: RePEc:eee:proeco:v:252:y:2022:i:c:s0925527322001670
    DOI: 10.1016/j.ijpe.2022.108578

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0925527322001670
    Download Restriction: Full text for ScienceDirect subscribers only

    Citations

    Cited by:

    1. Jung, Seung Hwan & Yang, Yunsi, 2023. "On the value of operational flexibility in the trailer shipment and assignment problem: Data-driven approaches and reinforcement learning," International Journal of Production Economics, Elsevier, vol. 264(C).
    2. Park, Hyungjun & Choi, Dong Gu & Min, Daiki, 2023. "Adaptive inventory replenishment using structured reinforcement learning by exploiting a policy structure," International Journal of Production Economics, Elsevier, vol. 266(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Deniz Preil & Michael Krapp, 2022. "Artificial intelligence-based inventory management: a Monte Carlo tree search approach," Annals of Operations Research, Springer, vol. 308(1), pages 415-439, January.
    2. de Kok, Ton & Grob, Christopher & Laumanns, Marco & Minner, Stefan & Rambau, Jörg & Schade, Konrad, 2018. "A typology and literature review on stochastic multi-echelon inventory models," European Journal of Operational Research, Elsevier, vol. 269(3), pages 955-983.
    3. Jordan Tong & Gregory DeCroix & Jing-Sheng Song, 2020. "Modeling Payment Timing in Multiechelon Inventory Systems with Applications to Supply Chain Coordination," Manufacturing & Service Operations Management, INFORMS, vol. 22(2), pages 346-363, March.
    4. Peidro, David & Mula, Josefa & Jiménez, Mariano & del Mar Botella, Ma, 2010. "A fuzzy linear programming based approach for tactical supply chain planning in an uncertainty environment," European Journal of Operational Research, Elsevier, vol. 205(1), pages 65-80, August.
    5. Gregory A. DeCroix, 2006. "Optimal Policy for a Multiechelon Inventory System with Remanufacturing," Operations Research, INFORMS, vol. 54(3), pages 532-543, June.
    6. Kevin H. Shang & Jing-Sheng Song, 2007. "Serial Supply Chains with Economies of Scale: Bounds and Approximations," Operations Research, INFORMS, vol. 55(5), pages 843-853, October.
    7. Huaxiao Shen & Tian Tian & Han Zhu, 2019. "A Two-Echelon Inventory System with a Minimum Order Quantity Requirement," Sustainability, MDPI, vol. 11(18), pages 1-22, September.
    8. Warsing, Donald P. & Wangwatcharakul, Worawut & King, Russell E., 2019. "Computing base-stock levels for a two-stage supply chain with uncertain supply," Omega, Elsevier, vol. 89(C), pages 92-109.
    9. Xiuli Chao & Sean X. Zhou, 2009. "Optimal Policy for a Multiechelon Inventory System with Batch Ordering and Fixed Replenishment Intervals," Operations Research, INFORMS, vol. 57(2), pages 377-390, April.
    10. Hossein Abouee-Mehrizi & Opher Baron & Oded Berman, 2014. "Exact Analysis of Capacitated Two-Echelon Inventory Systems with Priorities," Manufacturing & Service Operations Management, INFORMS, vol. 16(4), pages 561-577, October.
    11. Jean Respen & Nicolas Zufferey & Philippe Wieser, 2017. "Three-level inventory deployment for a luxury watch company facing various perturbations," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 68(10), pages 1195-1210, October.
    12. Alexandar Angelus, 2011. "A Multiechelon Inventory Problem with Secondary Market Sales," Management Science, INFORMS, vol. 57(12), pages 2145-2162, December.
    13. Gumus, Alev Taskin & Guneri, Ali Fuat & Ulengin, Fusun, 2010. "A new methodology for multi-echelon inventory management in stochastic and neuro-fuzzy environments," International Journal of Production Economics, Elsevier, vol. 128(1), pages 248-260, November.
    14. Geert-Jan van Houtum & Alan Scheller-Wolf & Jinxin Yi, 2007. "Optimal Control of Serial Inventory Systems with Fixed Replenishment Intervals," Operations Research, INFORMS, vol. 55(4), pages 674-687, August.
    15. Lingxiu Dong & Hau L. Lee, 2003. "Optimal Policies and Approximations for a Serial Multiechelon Inventory System with Time-Correlated Demand," Operations Research, INFORMS, vol. 51(6), pages 969-980, December.
    16. Peter Berling & Victor Martínez-de-Albéniz, 2016. "Dynamic Speed Optimization in Supply Chains with Stochastic Demand," Transportation Science, INFORMS, vol. 50(3), pages 1114-1127, August.
    17. Retsef Levi & Robin Roundy & Van Anh Truong & Xinshang Wang, 2017. "Provably Near-Optimal Balancing Policies for Multi-Echelon Stochastic Inventory Control Models," Mathematics of Operations Research, INFORMS, vol. 42(1), pages 256-276, January.
    18. Fangruo Chen & Jing-Sheng Song, 2001. "Optimal Policies for Multiechelon Inventory Problems with Markov-Modulated Demand," Operations Research, INFORMS, vol. 49(2), pages 226-234, April.
    19. Kevin H. Shang & Jing-Sheng Song & Paul H. Zipkin, 2009. "Coordination Mechanisms in Decentralized Serial Inventory Systems with Batch Ordering," Management Science, INFORMS, vol. 55(4), pages 685-695, April.
    20. Zhanwei Tian & Guoqing Zhang, 2021. "Multi-echelon fulfillment warehouse rent and production allocation for online direct selling," Annals of Operations Research, Springer, vol. 304(1), pages 427-451, September.
