Dynamic resource matching in manufacturing using deep reinforcement learning

Author

Panda, Saunak Kumar
Xiang, Yisha
Liu, Ruiqi

Abstract

Matching plays an important role in the logical allocation of resources across a wide range of industries, and its benefits have been increasingly recognized in manufacturing. Capacity sharing in particular has received much attention recently. In this paper, we consider the problem of dynamically matching demand-capacity types of manufacturing resources. We formulate the multi-period, many-to-many manufacturing resource-matching problem as a sequential decision process. The formulated problem involves large state and action spaces, and it is not practical to accurately model the joint distribution of the various types of demand. To address the curse of dimensionality and the difficulty of explicitly modeling the transition dynamics, we use a model-free deep reinforcement learning approach to find optimal matching policies. Moreover, to tackle infeasible actions and the slow convergence due to initial biased estimates caused by the maximum operator in Q-learning, we introduce two penalties to the traditional Q-learning algorithm: a domain knowledge-based penalty based on a prior policy, and an infeasibility penalty that conforms to the demand–supply constraints. We establish theoretical results on the convergence of our domain knowledge-informed Q-learning, which provide a performance guarantee for small-size problems. For large-size problems, we further inject our modified approach into the deep deterministic policy gradient (DDPG) algorithm and refer to the resulting algorithm as domain knowledge-informed DDPG (DKDDPG). In our computational study, which includes small- and large-scale experiments, DKDDPG consistently outperformed traditional DDPG and other RL algorithms, yielding higher rewards and demonstrating greater efficiency in time and episodes.
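
The two penalties described in the abstract can be illustrated with a minimal tabular sketch in Python. This is not the authors' algorithm: the abstract does not give the functional form of either penalty, so everything below is an assumption made for illustration. The state is taken to be a single (demand, capacity) pair, the action is the number of units matched, the prior policy is a hypothetical greedy rule, and both penalties are simply subtracted from the Q-learning target with hypothetical weights `lam_prior` and `lam_infeasible`.

```python
def max_feasible(state):
    """Largest match allowed by the demand-supply constraint (assumed form)."""
    demand, capacity = state
    return min(demand, capacity)


def prior_policy(state):
    """Hypothetical domain-knowledge rule: match as many units as possible."""
    return max_feasible(state)


def penalties(state, action, lam_prior=0.5, lam_infeasible=10.0):
    """Sum of the two penalties; both functional forms are assumptions."""
    # Domain-knowledge penalty: distance from the prior policy's action.
    prior_gap = abs(action - prior_policy(state))
    # Infeasibility penalty: units matched beyond available demand or capacity.
    violation = max(0, action - max_feasible(state))
    return lam_prior * prior_gap + lam_infeasible * violation


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step with the penalties subtracted from the target."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward - penalties(state, action) + gamma * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]
```

For example, with demand 3 and capacity 2, calling `q_update` for the feasible action 2 raises that Q-value, while the infeasible action 3 is pushed down by the infeasibility penalty even if its raw reward is higher. In this sketch the prior-policy penalty steers early exploration toward the greedy rule, which is one plausible reading of how such a penalty could offset initially biased estimates.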

Suggested Citation

Panda, Saunak Kumar & Xiang, Yisha & Liu, Ruiqi, 2024. "Dynamic resource matching in manufacturing using deep reinforcement learning," European Journal of Operational Research, Elsevier, vol. 318(2), pages 408-423.

Handle: RePEc:eee:ejores:v:318:y:2024:i:2:p:408-423
DOI: 10.1016/j.ejor.2024.05.027

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Kadam, Sangram V. & Kotowski, Maciej H., 2018. "Time horizons, lattice structures, and welfare in multi-period matching markets," Games and Economic Behavior, Elsevier, vol. 112(C), pages 1-20.
- Kadam, Sangram V. & Kotowski, Maciej H., 2015. "Time Horizons, Lattice Structures, and Welfare in Multi-period Matching Markets," Working Paper Series rwp15-031, Harvard University, John F. Kennedy School of Government.
Julien Combe & Vladyslav Nora & Olivier Tercieux, 2021. "Dynamic assignment without money: Optimality of spot mechanisms," Working Papers 2021-11, Center for Research in Economics and Statistics.
Paula Jaramillo & Çaǧatay Kayı & Flip Klijn, 2014. "On the exhaustiveness of truncation and dropping strategies in many-to-many matching markets," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 42(4), pages 793-811, April.
- Paula Jaramillo & Kagi Cagatay & Flip Klijn, 2012. "On the exhaustiveness of truncation and dropping strategies in many-to-many matching markets," Documentos de Trabajo 9997, Universidad del Rosario.
- Paula Jaramillo & Çaǧatay Kayı & Flip Klijn, 2015. "On the Exhaustiveness of Truncation and Dropping Strategies in Many-to-Many Matching Markets," Working Papers 632, Barcelona School of Economics.
- Paula Jaramillo & Cagatay Kay & Flip Klijn, 2012. "On the Exhaustiveness of Truncation and Dropping Strategies in Many-to-Many Matching Markets," Documentos CEDE 10316, Universidad de los Andes, Facultad de Economía, CEDE.
Itai Ashlagi & Flip Klijn, 2012. "Manipulability in matching markets: conflict and coincidence of interests," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 39(1), pages 23-33, June.
- Itai Ashlagi & Flip Klijn, 2010. "Manipulability in Matching Markets: Conflict and Coincidence of Interests," UFAE and IAE Working Papers 835.10, Unitat de Fonaments de l'Anàlisi Econòmica (UAB) and Institut d'Anàlisi Econòmica (CSIC).
- Itai Ashlagi & Flip Klijn, 2010. "Manipulability in Matching Markets: Conflict and Coincidence of Interests," Working Papers 479, Barcelona School of Economics.
Eliana Pepa Risma, 2022. "Matching with contracts: calculation of the complete set of stable allocations," Theory and Decision, Springer, vol. 93(3), pages 449-461, October.
Dimakopoulos, Philipp D. & Heller, C.-Philipp, 2015. "Matching with Waiting Times: The German Entry-Level Labour Market for Lawyers," VfS Annual Conference 2015 (Muenster): Economic Development - Theory and Policy 113153, Verein für Socialpolitik / German Economic Association.
Hatfield, John William & Kojima, Fuhito, 2010. "Substitutes and stability for matching with contracts," Journal of Economic Theory, Elsevier, vol. 145(5), pages 1704-1723, September.
Hatfield, John William & Kominers, Scott Duke, 2017. "Contract design and stability in many-to-many matching," Games and Economic Behavior, Elsevier, vol. 101(C), pages 78-97.
Ruth Martínez & Jordi Massó & Alejandro Neme & Jorge Oviedo, "undated". "An Algorithm To Compute The Set Of Many-To-Many Stable Matchings," UFAE and IAE Working Papers 457.00, Unitat de Fonaments de l'Anàlisi Econòmica (UAB) and Institut d'Anàlisi Econòmica (CSIC).
Okumura, Yasunori, 2017. "A one-sided many-to-many matching problem," Journal of Mathematical Economics, Elsevier, vol. 72(C), pages 104-111.
Neme, Pablo & Oviedo, Jorge, 2021. "On the set of many-to-one strongly stable fractional matchings," Mathematical Social Sciences, Elsevier, vol. 110(C), pages 1-13.
- Pablo Neme & Jorge Oviedo, 2020. "On the set of many-to-one strongly stable fractional matchings," Working Papers 19, Red Nacional de Investigadores en Economía (RedNIE).
Zhengyong Jiang & Jeyan Thiayagalingam & Jionglong Su & Jinjun Liang, 2023. "CAD: Clustering And Deep Reinforcement Learning Based Multi-Period Portfolio Management Strategy," Papers 2310.01319, arXiv.org.
Afodjo, Nabil & Pongou, Roland, 2024. "Efficiency and maximality in anonymous two-sided economies," Games and Economic Behavior, Elsevier, vol. 146(C), pages 184-195.
John William Hatfield & Paul R. Milgrom, 2005. "Matching with Contracts," American Economic Review, American Economic Association, vol. 95(4), pages 913-935, September.
- Paul Milgrom, 2003. "Matching with Contracts," Working Papers 03003, Stanford University, Department of Economics.
Tamás Fleiner, 2003. "A Fixed-Point Approach to Stable Matchings and Some Applications," Mathematics of Operations Research, INFORMS, vol. 28(1), pages 103-126, February.
Tamás Fleiner & Ravi Jagadeesan & Zsuzsanna Jankó & Alexander Teytelboym, 2019. "Trading Networks With Frictions," Econometrica, Econometric Society, vol. 87(5), pages 1633-1661, September.
- Tamas Fleiner & Ravi Jagadeesan & Zsuzsanna Janko & Alexander Teytelboym, 2020. "Trading Networks with Frictions," CERS-IE WORKING PAPERS 2008, Institute of Economics, Centre for Economic and Regional Studies.
Ma, Jinpeng, 2010. "The singleton core in the college admissions problem and its application to the National Resident Matching Program (NRMP)," Games and Economic Behavior, Elsevier, vol. 69(1), pages 150-164, May.
Vilmos Komornik & Zsolt Komornik & Christelle Viauroux, 2010. "Stable Schedule Matchings," UMBC Economics Department Working Papers 10-120, UMBC Department of Economics, revised 01 Jul 2011.
Roth, Alvin E. & Sotomayor, Marilda, 1996. "Stable Outcomes in Discrete and Continuous Models of Two-Sided Matching: a Unified Treatment," Brazilian Review of Econometrics, Sociedade Brasileira de Econometria - SBE, vol. 16(2), November.
Vilmos Komornik & Christelle Viauroux, 2012. "Conditional Stable Matchings," UMBC Economics Department Working Papers 12-03, UMBC Department of Economics.

More about this item

Keywords

Assignment; Matching problem; Manufacturing; Markov decision process; Deep reinforcement learning;