Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities

Author

Listed:

Yan, Yimo
Chow, Andy H.F.
Ho, Chin Pang
Kuo, Yong-Hong
Wu, Qihao
Ying, Chengshuo

Abstract

With advances in technologies, data science techniques, and computing equipment, there has been rapidly increasing interest in the applications of reinforcement learning (RL) to address the challenges resulting from the evolving business and organisational operations in logistics and supply chain management (SCM). This paper aims to provide a comprehensive review of the development and applications of RL techniques in the field of logistics and SCM. We first provide an introduction to RL methodologies, followed by a classification of previous research studies by application. The state-of-the-art research is reviewed and the current challenges are discussed. It is found that Q-learning (QL) is the most popular RL approach adopted by these studies and the research on RL for urban logistics is growing in recent years due to the prevalence of E-commerce and last mile delivery. Finally, some potential directions are presented for future research.
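
For readers unfamiliar with Q-learning, which the abstract identifies as the most widely adopted RL approach in the reviewed studies, the following is a minimal tabular sketch. It is not taken from the paper: the toy single-item inventory setting, the demand range, the reward coefficients, and the names `q_update` and `train` are all assumptions chosen purely for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q[next_state])            # greedy value of the next state
    td_target = reward + gamma * best_next    # one-step bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def train(episodes=2000, horizon=20, max_stock=5, epsilon=0.1, seed=0):
    """Learn order quantities for a hypothetical single-item inventory problem.

    State: stock on hand (0..max_stock). Action: units ordered (0..max_stock).
    All prices and costs below are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_states = max_stock + 1
    n_actions = max_stock + 1
    q = [[0.0] * n_actions for _ in range(n_states)]

    for _ in range(episodes):
        stock = 0
        for _ in range(horizon):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                order = rng.randrange(n_actions)
            else:
                order = max(range(n_actions), key=lambda a: q[stock][a])

            available = min(max_stock, stock + order)  # storage is capped
            demand = rng.randint(0, 3)                 # assumed uniform demand
            sold = min(available, demand)
            next_stock = available - sold

            # Assumed economics: revenue 4 per unit sold, ordering cost 1 per
            # unit ordered, holding cost 0.5 per unit left in stock.
            reward = 4.0 * sold - 1.0 * order - 0.5 * next_stock

            q_update(q, stock, order, reward, next_stock)
            stock = next_stock
    return q


if __name__ == "__main__":
    table = train()
    for stock_level, row in enumerate(table):
        best_order = max(range(len(row)), key=lambda a: row[a])
        print(f"stock={stock_level}: greedy order quantity={best_order}")
```

The update in `q_update` moves the stored estimate towards the observed reward plus the discounted best estimate for the next state, which is what distinguishes Q-learning from on-policy alternatives. The deep RL methods among the paper's keywords replace the table with a neural network approximator.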

Suggested Citation

Yan, Yimo & Chow, Andy H.F. & Ho, Chin Pang & Kuo, Yong-Hong & Wu, Qihao & Ying, Chengshuo, 2022. "Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 162(C).

Handle: RePEc:eee:transe:v:162:y:2022:i:c:s136655452200103x
DOI: 10.1016/j.tre.2022.102712

References listed on IDEAS

Lafkihi, Mariam & Pan, Shenle & Ballot, Eric, 2019. "Freight transportation service procurement: A literature review and future research opportunities in omnichannel E-commerce," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 125(C), pages 348-365.
Wang, Xin & Kuo, Yong-Hong & Shen, Houcai & Zhang, Lianmin, 2021. "Target-oriented robust location–transportation problem with service-level measure," Transportation Research Part B: Methodological, Elsevier, vol. 153(C), pages 1-20.
Rana, Rupal & Oliveira, Fernando S., 2014. "Real-time dynamic pricing in a non-stationary environment using model-free reinforcement learning," Omega, Elsevier, vol. 47(C), pages 116-126.
Martin, Layla & Minner, Stefan, 2021. "Feature-based selection of carsharing relocation modes," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 149(C).
Mitręga, Maciej & Choi, Tsan-Ming, 2021. "How small-and-medium transportation companies handle asymmetric customer relationships under COVID-19 pandemic: A multi-method study," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 148(C).
Ying, Cheng-shuo & Chow, Andy H.F. & Chin, Kwai-Sang, 2020. "An actor-critic deep reinforcement learning approach for metro train scheduling with rolling stock circulation under stochastic demand," Transportation Research Part B: Methodological, Elsevier, vol. 140(C), pages 210-235.
Byeongseop Kim & Yongkuk Jeong & Jong Gye Shin, 2020. "Spatial arrangement using deep reinforcement learning to minimise rearrangement in ship block stockyards," International Journal of Production Research, Taylor & Francis Journals, vol. 58(16), pages 5062-5076, July.
Kim, Kap Hwan & Lee, Keung Mo & Hwang, Hark, 2003. "Sequencing delivery and receiving operations for yard cranes in port container terminals," International Journal of Production Economics, Elsevier, vol. 84(3), pages 283-292, June.
Shenle Pan & Damien Trentesaux & Duncan Mcfarlane & Benoit Montreuil & Eric Ballot & George Huang, 2021. "Digital interoperability and transformation in logistics and supply chain management: Editorial," Post-Print hal-03195695, HAL.
Enayati, Shakiba & Özaltın, Osman Y., 2020. "Optimal influenza vaccine distribution with equity," European Journal of Operational Research, Elsevier, vol. 283(2), pages 714-725.
Rameshwar Dubey & Angappa Gunasekaran & Thanos Papadopoulos, 2019. "Disaster relief operations: past, present and future," Annals of Operations Research, Springer, vol. 283(1), pages 1-8, December.
Cheung, Kam-Fung & Bell, Michael G.H. & Bhattacharjya, Jyotirmoyee, 2021. "Cybersecurity in logistics and supply chain management: An overview and future research directions," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 146(C).
Asadi, Amin & Nurre Pinkley, Sarah, 2021. "A stochastic scheduling, allocation, and inventory replenishment problem for battery swap stations," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 146(C).
Li, Xueping & Wang, Jiao & Sawhney, Rapinder, 2012. "Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems," European Journal of Operational Research, Elsevier, vol. 221(1), pages 99-109.
Martin, Simon & Ouelhadj, Djamila & Beullens, Patrick & Ozcan, Ender & Juan, Angel A. & Burke, Edmund K., 2016. "A multi-agent based cooperative approach to scheduling and routing," European Journal of Operational Research, Elsevier, vol. 254(1), pages 169-178.
Wolfram Wiesemann & Daniel Kuhn & Berç Rustem, 2013. "Robust Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 38(1), pages 153-183, February.
Hau L. Lee & V. Padmanabhan & Seungjin Whang, 1997. "Information Distortion in a Supply Chain: The Bullwhip Effect," Management Science, INFORMS, vol. 43(4), pages 546-558, April.
Giannoccaro, Ilaria & Pontrandolfo, Pierpaolo, 2002. "Inventory management in supply chains: a reinforcement learning approach," International Journal of Production Economics, Elsevier, vol. 78(2), pages 153-161, July.
Yin, Jiateng & Tang, Tao & Yang, Lixing & Gao, Ziyou & Ran, Bin, 2016. "Energy-efficient metro train rescheduling with uncertain time-variant passenger demands: An approximate dynamic programming approach," Transportation Research Part B: Methodological, Elsevier, vol. 91(C), pages 178-210.
Ahamed, Tanvir & Zou, Bo & Farazi, Nahid Parvez & Tulabandhula, Theja, 2021. "Deep Reinforcement Learning for Crowdsourced Urban Delivery," Transportation Research Part B: Methodological, Elsevier, vol. 152(C), pages 227-257.
Galindo, Gina & Batta, Rajan, 2013. "Review of recent developments in OR/MS research in disaster operations management," European Journal of Operational Research, Elsevier, vol. 230(2), pages 201-211.
Yong-Hong Kuo & Andrew Kusiak, 2019. "From data to big data in production research: the past and future trends," International Journal of Production Research, Taylor & Francis Journals, vol. 57(15-16), pages 4828-4853, August.
Firdausiyah, N. & Taniguchi, E. & Qureshi, A.G., 2019. "Modeling city logistics using adaptive dynamic programming based multi-agent simulation," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 125(C), pages 74-96.
Amir Ardestani-Jaafari & Erick Delage, 2018. "The Value of Flexibility in Robust Location–Transportation Problems," Transportation Science, INFORMS, vol. 52(1), pages 189-209, January.
Dimitris Bertsimas & Aurélie Thiele, 2006. "A Robust Optimization Approach to Inventory Theory," Operations Research, INFORMS, vol. 54(1), pages 150-168, February.
Shenle Pan & Damien Trentesaux & Duncan Mcfarlane & Benoit Montreuil & Eric Ballot & George Huang, 2021. "Digital interoperability in logistics and supply chain management: state-of-the-art and research avenues towards Physical Internet," Post-Print hal-03161524, HAL.
Cleophas, Catherine & Cottrill, Caitlin & Ehmke, Jan Fabian & Tierney, Kevin, 2019. "Collaborative urban transportation: Recent advances in theory and practice," European Journal of Operational Research, Elsevier, vol. 273(3), pages 801-816.
Mariam Lafkihi & Shenle Pan & Eric Ballot, 2019. "Freight transportation service procurement: A literature review and future research opportunities in omnichannel E-commerce," Post-Print hal-02086154, HAL.
Fotuhi, Fateme & Huynh, Nathan & Vidal, Jose M. & Xie, Yuanchang, 2013. "Modeling yard crane operators as reinforcement learning agents," Research in Transportation Economics, Elsevier, vol. 42(1), pages 3-12.
Al Hajj Hassan, Lama & Mahmassani, Hani S. & Chen, Ying, 2020. "Reinforcement learning framework for freight demand forecasting to support operational planning decisions," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 137(C).
Volodymyr Mnih & Koray Kavukcuoglu & David Silver & Andrei A. Rusu & Joel Veness & Marc G. Bellemare & Alex Graves & Martin Riedmiller & Andreas K. Fidjeland & Georg Ostrovski & Stig Petersen & Charle, 2015. "Human-level control through deep reinforcement learning," Nature, Nature, vol. 518(7540), pages 529-533, February.
Arnab Nilim & Laurent El Ghaoui, 2005. "Robust Control of Markov Decision Processes with Uncertain Transition Matrices," Operations Research, INFORMS, vol. 53(5), pages 780-798, October.
Choi, Tsan-Ming, 2020. "Internet based elastic logistics platforms for fashion quick response systems in the digital era," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 143(C).
Illhoe Hwang & Young Jae Jang, 2020. "Q(λ) learning-based dynamic route guidance algorithm for overhead hoist transport systems in semiconductor fabs," International Journal of Production Research, Taylor & Francis Journals, vol. 58(4), pages 1199-1221, February.
Chen-Fu Chien & Yun-Siang Lin & Sheng-Kai Lin, 2020. "Deep reinforcement learning for selecting demand forecast models to empower Industry 3.5 and an empirical study for a semiconductor component distributor," International Journal of Production Research, Taylor & Francis Journals, vol. 58(9), pages 2784-2804, May.
Bruzzone, Francesco & Cavallaro, Federico & Nocera, Silvio, 2021. "The integration of passenger and freight transport for first-last mile operations," Transport Policy, Elsevier, vol. 100(C), pages 31-48.
Kyuree Ahn & Jinkyoo Park, 2021. "Cooperative zone-based rebalancing of idle overhead hoist transportations using multi-agent reinforcement learning with graph representation learning," IISE Transactions, Taylor & Francis Journals, vol. 53(10), pages 1140-1156, October.
Nie, Yu (Marco) & Wu, Xing, 2009. "Shortest path problem considering on-time arrival probability," Transportation Research Part B: Methodological, Elsevier, vol. 43(6), pages 597-613, July.
Liu, Shan & Jiang, Hai & Chen, Shuiping & Ye, Jing & He, Renqing & Sun, Zhizhao, 2020. "Integrating Dijkstra’s algorithm into deep inverse reinforcement learning for food delivery route planning," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 142(C).
Chiang, Chi, 2003. "Optimal replenishment for a periodic review inventory system with two supply modes," European Journal of Operational Research, Elsevier, vol. 149(1), pages 229-244, August.

Citations

Cited by:

Amine Masmoudi, M. & Mancini, Simona & Baldacci, Roberto & Kuo, Yong-Hong, 2022. "Vehicle routing problems with drones equipped with multi-package payload compartments," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 164(C).
Wang, Haibo & Alidaee, Bahram, 2023. "White-glove service delivery: A quantitative analysis," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
Li, Huanhuan & Jiao, Hang & Yang, Zaili, 2023. "AIS data-driven ship trajectory prediction modelling and analysis based on machine learning and deep learning methods," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
Fang, Chao & Han, Zonglei & Wang, Wei & Zio, Enrico, 2023. "Routing UAVs in landslides Monitoring: A neural network heuristic for team orienteering with mandatory visits," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
Dubey, Rameshwar & Gunasekaran, Angappa & Papadopoulos, Thanos, 2024. "Benchmarking operations and supply chain management practices using Generative AI: Towards a theoretical framework," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 189(C).
Ding, Yida & Wandelt, Sebastian & Wu, Guohua & Xu, Yifan & Sun, Xiaoqian, 2023. "Towards efficient airline disruption recovery with reinforcement learning," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 179(C).
Guo, Feng & Wei, Qu & Wang, Miao & Guo, Zhaoxia & Wallace, Stein W., 2023. "Deep attention models with dimension-reduction and gate mechanisms for solving practical time-dependent vehicle routing problems," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 173(C).
Wadi Khalid Anuar & Lai Soon Lee & Hsin-Vonn Seow & Stefan Pickl, 2022. "A Multi-Depot Dynamic Vehicle Routing Problem with Stochastic Road Capacity: An MDP Model and Dynamic Policy for Post-Decision State Rollout Algorithm in Reinforcement Learning," Mathematics, MDPI, vol. 10(15), pages 1-70, July.
Stranieri, Francesco & Fadda, Edoardo & Stella, Fabio, 2024. "Combining deep reinforcement learning and multi-stage stochastic programming to address the supply chain inventory management problem," International Journal of Production Economics, Elsevier, vol. 268(C).
Bootaki, Behrang & Zhang, Guoqing, 2024. "A location-production-routing problem for distributed manufacturing platforms: A neural genetic algorithm solution methodology," International Journal of Production Economics, Elsevier, vol. 275(C).
Li, Huanhuan & Xing, Wenbin & Jiao, Hang & Yang, Zaili & Li, Yan, 2024. "Deep bi-directional information-empowered ship trajectory prediction for maritime autonomous surface ships," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 181(C).
Kuo, Yong-Hong & Leung, Janny M.Y. & Yan, Yimo, 2023. "Public transport for smart cities: Recent innovations and future challenges," European Journal of Operational Research, Elsevier, vol. 306(3), pages 1001-1026.
Winkelmann, Jonas & Spinler, Stefan & Neukirchen, Thomas, 2024. "Green transport fleet renewal using approximate dynamic programming: A case study in German heavy-duty road transportation," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 186(C).

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Xiaoyan Xu & Suresh P. Sethi & Sai‐Ho Chung & Tsan‐Ming Choi, 2023. "Reforming global supply chain management under pandemics: The GREAT‐3Rs framework," Production and Operations Management, Production and Operations Management Society, vol. 32(2), pages 524-546, February.
Kong, Xiang T.R. & Kang, Kai & Zhong, Ray Y. & Luo, Hao & Xu, Su Xiu, 2021. "Cyber physical system-enabled on-demand logistics trading," International Journal of Production Economics, Elsevier, vol. 233(C).
Qi, Mingyao & Yang, Ying & Cheng, Chun, 2023. "Location and inventory pre-positioning problem under uncertainty," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 177(C).
Arthur Flajolet & Sébastien Blandin & Patrick Jaillet, 2018. "Robust Adaptive Routing Under Uncertainty," Operations Research, INFORMS, vol. 66(1), pages 210-229, January.
Li, Wenqing & Ni, Shaoquan, 2022. "Train timetabling with the general learning environment and multi-agent deep reinforcement learning," Transportation Research Part B: Methodological, Elsevier, vol. 157(C), pages 230-251.
Xin, Linwei & Goldberg, David A., 2021. "Time (in)consistency of multistage distributionally robust inventory models with moment constraints," European Journal of Operational Research, Elsevier, vol. 289(3), pages 1127-1141.
Wang, Haibo & Alidaee, Bahram, 2023. "White-glove service delivery: A quantitative analysis," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 175(C).
Sun, Xuting & Kuo, Yong-Hong & Xue, Weili & Li, Yanzhi, 2024. "Technology-driven logistics and supply chain management for societal impacts," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 185(C).
Wang, Xuekai & D’Ariano, Andrea & Su, Shuai & Tang, Tao, 2023. "Cooperative train control during the power supply shortage in metro system: A multi-agent reinforcement learning approach," Transportation Research Part B: Methodological, Elsevier, vol. 170(C), pages 244-278.
Guo, Chaojie & Thompson, Russell G. & Foliente, Greg & Kong, Xiang T.R., 2021. "An auction-enabled collaborative routing mechanism for omnichannel on-demand logistics through transshipment," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 146(C).
Martin W.P Savelsbergh & Marlin W. Ulmer, 2022. "Challenges and opportunities in crowdsourced delivery planning and operations," 4OR, Springer, vol. 20(1), pages 1-21, March.
Maximilian Blesch & Philipp Eisenhauer, 2021. "Robust decision-making under risk and ambiguity," Papers 2104.12573, arXiv.org, revised Oct 2021.
Aliakbari Sani, Sajad & Bahn, Olivier & Delage, Erick, 2022. "Affine decision rule approximation to address demand response uncertainty in smart Grids’ capacity planning," European Journal of Operational Research, Elsevier, vol. 303(1), pages 438-455.
Roberto Gomes de Mattos & Fabricio Oliveira & Adriana Leiras & Abdon Baptista de Paula Filho & Paulo Gonçalves, 2019. "Robust optimization of the insecticide-treated bed nets procurement and distribution planning under uncertainty for malaria prevention and control," Annals of Operations Research, Springer, vol. 283(1), pages 1045-1078, December.
Jihane El Ouadi & Hanae Errousso & Nicolas Malhene & Siham Benhadou & Hicham Medromi, 2022. "A machine-learning based hybrid algorithm for strategic location of urban bundling hubs to support shared public transport," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(5), pages 3215-3258, October.
Rameshwar Dubey & David J. Bryde & Cyril Foropon & Gary Graham & Mihalis Giannakis & Deepa Bhatt Mishra, 2022. "Agility in humanitarian supply chain: an organizational information processing perspective and relational view," Annals of Operations Research, Springer, vol. 319(1), pages 559-579, December.
- Rameshwar Dubey & David Bryde & Cyril Foropon & Gary Graham & Mihalis Giannakis & Deepa Bhatt Mishra, 2022. "Agility in humanitarian supply chain: an organizational information processing perspective and relational view," Post-Print hal-03931989, HAL.
Nathalie Touratier-Muller & Jacques Jaussaud, 2021. "Development of Road Freight Transport Indicators Focused on Sustainability to Assist Shippers: An Analysis Conducted in France through the FRET 21 Programme," Sustainability, MDPI, vol. 13(17), pages 1-17, August.
- Nathalie Touratier-Muller & Jacques Jaussaud, 2021. "Development of Road Freight Transport Indicators Focused on Sustainability to Assist Shippers: An Analysis Conducted in France through the FRET 21 Programme," Post-Print hal-03338852, HAL.
Su, Z.C. & Chow, Andy H.F. & Fang, C.L. & Liang, E.M. & Zhong, R.X., 2023. "Hierarchical control for stochastic network traffic with reinforcement learning," Transportation Research Part B: Methodological, Elsevier, vol. 167(C), pages 196-216.
Li, Feng & Du, Timon C. & Wei, Ying, 2020. "Enhancing supply chain decisions with consumers’ behavioral factors: An illustration of decoy effect," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 144(C).
Ji, Yuxiong & Zhou, Minhang & Zheng, Yujing & Shen, Yu & Du, Yuchuan, 2024. "Urban passenger-and-package sharing transportation by e-hailing taxis: A simulation-based pricing analysis in shanghai," Transport Policy, Elsevier, vol. 156(C), pages 138-151.

More about this item

Keywords

Reinforcement learning; Logistics; Supply chain; Markov decision process; Q-learning; Actor-critic methods; Neural network
