
Dynamic resource matching in manufacturing using deep reinforcement learning

Author

Listed:
  • Panda, Saunak Kumar
  • Xiang, Yisha
  • Liu, Ruiqi

Abstract

Matching plays an important role in the logical allocation of resources across a wide range of industries. The benefits of matching have been increasingly recognized in manufacturing industries. In particular, capacity sharing has received much attention recently. In this paper, we consider the problem of dynamically matching demand-capacity types of manufacturing resources. We formulate the multi-period, many-to-many manufacturing resource-matching problem as a sequential decision process. The formulated manufacturing resource-matching problem involves large state and action spaces, and it is not practical to accurately model the joint distribution of various types of demands. To address the curse of dimensionality and the difficulty of explicitly modeling the transition dynamics, we use a model-free deep reinforcement learning approach to find optimal matching policies. Moreover, to tackle the issue of infeasible actions and slow convergence due to initial biased estimates caused by the maximum operator in Q-learning, we introduce two penalties to the traditional Q-learning algorithm: a domain knowledge-based penalty based on a prior policy and an infeasibility penalty that conforms to the demand–supply constraints. We establish theoretical results on the convergence of our domain knowledge-informed Q-learning, providing a performance guarantee for small-size problems. For large-size problems, we further inject our modified approach into the deep deterministic policy gradient (DDPG) algorithm, which we refer to as domain knowledge-informed DDPG (DKDDPG). In our computational study, including small- and large-scale experiments, DKDDPG consistently outperformed traditional DDPG and other RL algorithms, yielding higher rewards and demonstrating greater efficiency in time and episodes.
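
The idea of adding two penalties to Q-learning can be illustrated with a minimal tabular sketch. This is not the authors' formulation: the abstract does not give the penalty forms, so the additive penalties, the weights, the greedy prior policy, the single demand-capacity pair, and the toy dynamics below are all assumptions chosen for illustration, and every name in the code is hypothetical.

```python
import random

MAX_UNITS = 3                          # toy bound on demand, capacity, and matched units
ACTIONS = list(range(MAX_UNITS + 1))   # action = number of units to match this period


def prior_policy(demand, capacity):
    """Hypothetical domain-knowledge prior: match as many units as possible."""
    return min(demand, capacity)


def shaped_reward(demand, capacity, action, infeasible_penalty=10.0, prior_weight=0.5):
    """Base reward minus two illustrative penalties (assumed forms, not from the paper)."""
    feasible = action <= min(demand, capacity)
    base = float(action) if feasible else 0.0        # one unit of reward per matched unit
    penalty = 0.0 if feasible else infeasible_penalty  # violates demand-supply constraints
    penalty += prior_weight * abs(action - prior_policy(demand, capacity))
    return base - penalty


def train(steps=50_000, alpha=0.05, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy (off-policy)."""
    rng = random.Random(seed)
    states = [(d, c) for d in ACTIONS for c in ACTIONS]
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    state = rng.choice(states)
    for _ in range(steps):
        action = rng.choice(ACTIONS)
        reward = shaped_reward(state[0], state[1], action)
        # Toy dynamics (assumption): demand and capacity are redrawn each period.
        next_state = rng.choice(states)
        target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = next_state
    return q


def greedy_action(q, state):
    """Action with the highest learned Q-value in the given (demand, capacity) state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q = train()
    for state in [(3, 1), (2, 2), (0, 3)]:
        print(state, "->", greedy_action(q, state))
```

In this sketch the infeasibility penalty makes over-matching unattractive, while the prior-policy penalty nudges the estimates toward the greedy rule from the first updates. A continuous-action method such as DDPG would need a different treatment, which the sketch does not attempt.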

Suggested Citation

  • Panda, Saunak Kumar & Xiang, Yisha & Liu, Ruiqi, 2024. "Dynamic resource matching in manufacturing using deep reinforcement learning," European Journal of Operational Research, Elsevier, vol. 318(2), pages 408-423.
  • Handle: RePEc:eee:ejores:v:318:y:2024:i:2:p:408-423
    DOI: 10.1016/j.ejor.2024.05.027

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221724003862
    Download Restriction: Full text for ScienceDirect subscribers only

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Julien Combe & Vladyslav Nora & Olivier Tercieux, 2021. "Dynamic assignment without money: Optimality of spot mechanisms," Working Papers 2021-11, Center for Research in Economics and Statistics.
    2. Kadam, Sangram V. & Kotowski, Maciej H., 2018. "Time horizons, lattice structures, and welfare in multi-period matching markets," Games and Economic Behavior, Elsevier, vol. 112(C), pages 1-20.
    3. Paula Jaramillo & Çaǧatay Kayı & Flip Klijn, 2014. "On the exhaustiveness of truncation and dropping strategies in many-to-many matching markets," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 42(4), pages 793-811, April.
    4. Itai Ashlagi & Flip Klijn, 2012. "Manipulability in matching markets: conflict and coincidence of interests," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 39(1), pages 23-33, June.
    5. Eliana Pepa Risma, 2022. "Matching with contracts: calculation of the complete set of stable allocations," Theory and Decision, Springer, vol. 93(3), pages 449-461, October.
    6. Dimakopoulos, Philipp D. & Heller, C.-Philipp, 2015. "Matching with Waiting Times: The German Entry-Level Labour Market for Lawyers," VfS Annual Conference 2015 (Muenster): Economic Development - Theory and Policy 113153, Verein für Socialpolitik / German Economic Association.
    7. Hatfield, John William & Kojima, Fuhito, 2010. "Substitutes and stability for matching with contracts," Journal of Economic Theory, Elsevier, vol. 145(5), pages 1704-1723, September.
    8. Hatfield, John William & Kominers, Scott Duke, 2017. "Contract design and stability in many-to-many matching," Games and Economic Behavior, Elsevier, vol. 101(C), pages 78-97.
    9. Ruth Martínez & Jordi Massó & Alejandro Neme & Jorge Oviedo, "undated". "An Algorithm To Compute The Set Of Many-To-Many Stable Matchings," UFAE and IAE Working Papers 457.00, Unitat de Fonaments de l'Anàlisi Econòmica (UAB) and Institut d'Anàlisi Econòmica (CSIC).
    10. Konishi, Hideo & Unver, M. Utku, 2006. "Credible group stability in many-to-many matching problems," Journal of Economic Theory, Elsevier, vol. 129(1), pages 57-80, July.
    11. Okumura, Yasunori, 2017. "A one-sided many-to-many matching problem," Journal of Mathematical Economics, Elsevier, vol. 72(C), pages 104-111.
    12. Neme, Pablo & Oviedo, Jorge, 2021. "On the set of many-to-one strongly stable fractional matchings," Mathematical Social Sciences, Elsevier, vol. 110(C), pages 1-13.
    13. Zhengyong Jiang & Jeyan Thiayagalingam & Jionglong Su & Jinjun Liang, 2023. "CAD: Clustering And Deep Reinforcement Learning Based Multi-Period Portfolio Management Strategy," Papers 2310.01319, arXiv.org.
    14. Afodjo, Nabil & Pongou, Roland, 2024. "Efficiency and maximality in anonymous two-sided economies," Games and Economic Behavior, Elsevier, vol. 146(C), pages 184-195.
    15. John William Hatfield & Paul R. Milgrom, 2005. "Matching with Contracts," American Economic Review, American Economic Association, vol. 95(4), pages 913-935, September.
    16. Bettina Klaus & Flip Klijn, 2010. "Smith and Rawls share a room: stability and medians," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 35(4), pages 647-667, October.
    17. Tamás Fleiner, 2003. "A Fixed-Point Approach to Stable Matchings and Some Applications," Mathematics of Operations Research, INFORMS, vol. 28(1), pages 103-126, February.
    18. Tamás Fleiner & Ravi Jagadeesan & Zsuzsanna Jankó & Alexander Teytelboym, 2019. "Trading Networks With Frictions," Econometrica, Econometric Society, vol. 87(5), pages 1633-1661, September.
    19. Philipp D. Dimakopoulos & Christian-Philipp Heller, "undated". "Matching with Waiting Times: The German Entry-Level Labour Market for Lawyers," BDPEMS Working Papers 2014005, Berlin School of Economics.
    20. Martinez, Ruth & Masso, Jordi & Neme, Alejandro & Oviedo, Jorge, 2004. "An algorithm to compute the full set of many-to-many stable matchings," Mathematical Social Sciences, Elsevier, vol. 47(2), pages 187-210, March.
