
A K-means Supported Reinforcement Learning Framework to Multi-dimensional Knapsack

Author

Listed:
  • Sabah Bushaj

    (SUNY Plattsburgh)

  • İ. Esra Büyüktahtakın

    (Grado Department of Industrial and Systems Engineering, Virginia Tech)

Abstract

In this paper, we address the difficulty of solving large-scale instances of the multi-dimensional knapsack problem (MKP) by presenting a novel deep reinforcement learning (DRL) framework. In this framework, we train different agents compatible with a discrete action space for sequential decision-making while still satisfying every resource constraint of the MKP. The framework incorporates the decision variable values into a 2D DRL environment in which the agent is responsible for assigning a value of 1 or 0 to each variable. To the best of our knowledge, this is the first DRL model of its kind in which a 2D environment is formulated and an element of the DRL solution matrix represents an item of the MKP. Our framework is configured to solve MKP instances of different dimensions and distributions. We propose a K-means approach to obtain an initial feasible solution that is used to train the DRL agent. We train four different agents in our framework and present results comparing each of them with the CPLEX commercial solver. The results show that our agents can learn and generalize over instances with different sizes and distributions. Compared with CPLEX, our DRL framework solves medium-sized instances at least 45 times faster in CPU solution time and large instances at least 10 times faster, with a maximum solution gap of 0.28%. Furthermore, at least 95% of the items are predicted in line with the CPLEX solution. Computations with DRL also provide a better optimality gap than state-of-the-art approaches.
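
For readers unfamiliar with the setting, the MKP in its standard form selects binary variables x_j to maximize sum_j p_j x_j subject to sum_j w_ij x_j <= c_i for every resource i. The abstract does not say how the K-means step builds its initial feasible solution, so the following is only a minimal sketch of one plausible reading, not the authors' procedure: items are clustered by a profit-to-load efficiency score, and clusters are then packed greedily from most to least efficient. The efficiency score, the function names `kmeans_1d` and `initial_feasible_solution`, and the parameters `k` and `seed` are all assumptions made for illustration.

```python
import random


def kmeans_1d(values, k, iters=25, seed=0):
    """Tiny 1-D k-means (Lloyd's algorithm). Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # initial centers drawn from the data
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


def initial_feasible_solution(profits, weights, capacities, k=2, seed=0):
    """Hypothetical K-means-guided greedy start for an MKP instance.

    weights[i][j] is the amount of resource i used by item j.
    Assumes positive capacities and non-negative weights.
    """
    n, m = len(profits), len(capacities)
    # Assumed score: profit per unit of capacity-normalised resource load.
    eff = []
    for j in range(n):
        load = sum(weights[i][j] / capacities[i] for i in range(m))
        eff.append(profits[j] / max(load, 1e-12))
    labels, centers = kmeans_1d(eff, min(k, n), seed=seed)
    # Visit the most efficient cluster first, best items first within it.
    order = sorted(range(n), key=lambda j: (-centers[labels[j]], -eff[j]))
    x = [0] * n
    used = [0] * m
    for j in order:
        # Accept an item only if every resource constraint still holds.
        if all(used[i] + weights[i][j] <= capacities[i] for i in range(m)):
            x[j] = 1
            for i in range(m):
                used[i] += weights[i][j]
    return x


if __name__ == "__main__":
    profits = [10, 13, 7, 8, 4]
    weights = [[3, 4, 2, 3, 1], [2, 3, 4, 1, 2]]
    capacities = [7, 6]
    print(initial_feasible_solution(profits, weights, capacities))
```

Because an item is accepted only when all constraints remain satisfied, the returned 0/1 vector is feasible by construction; whether it is a good starting point for training an agent would depend on details the abstract does not give.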

Suggested Citation

  • Sabah Bushaj & İ. Esra Büyüktahtakın, 2024. "A K-means Supported Reinforcement Learning Framework to Multi-dimensional Knapsack," Journal of Global Optimization, Springer, vol. 89(3), pages 655-685, July.
  • Handle: RePEc:spr:jglopt:v:89:y:2024:i:3:d:10.1007_s10898-024-01364-6
    DOI: 10.1007/s10898-024-01364-6



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Arnaud Fréville & Saïd Hanafi, 2005. "The Multidimensional 0-1 Knapsack Problem—Bounds and Computational Aspects," Annals of Operations Research, Springer, vol. 139(1), pages 195-227, October.
    2. Dimitris Bertsimas & Ramazan Demir, 2002. "An Approximate Dynamic Programming Approach to Multidimensional Knapsack Problems," Management Science, INFORMS, vol. 48(4), pages 550-565, April.
    3. Yalçın Akçay & Haijun Li & Susan Xu, 2007. "Greedy algorithm for the general multidimensional knapsack problem," Annals of Operations Research, Springer, vol. 150(1), pages 17-29, March.
    4. Freville, Arnaud, 2004. "The multidimensional 0-1 knapsack problem: An overview," European Journal of Operational Research, Elsevier, vol. 155(1), pages 1-21, May.
    5. Yalçin Akçay & Susan H. Xu, 2004. "Joint Inventory Replenishment and Component Allocation Optimization in an Assemble-to-Order System," Management Science, INFORMS, vol. 50(1), pages 99-116, January.
    6. Jakob Puchinger & Günther R. Raidl & Ulrich Pferschy, 2010. "The Multidimensional Knapsack Problem: Structure and Algorithms," INFORMS Journal on Computing, INFORMS, vol. 22(2), pages 250-265, May.
    7. José García & Paola Moraga & Matias Valenzuela & Hernan Pinto, 2020. "A db-Scan Hybrid Algorithm: An Application to the Multidimensional Knapsack Problem," Mathematics, MDPI, vol. 8(4), pages 1-22, April.
    8. Yoon, Yourim & Kim, Yong-Hyuk & Moon, Byung-Ro, 2012. "A theoretical and empirical investigation on the Lagrangian capacities of the 0-1 multidimensional knapsack problem," European Journal of Operational Research, Elsevier, vol. 218(2), pages 366-376.
    9. Ivan Derpich & Carlos Herrera & Felipe Sepúlveda & Hugo Ubilla, 2021. "Complexity indices for the multidimensional knapsack problem," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 29(2), pages 589-609, June.
    10. Cao, Chengxuan & Gao, Ziyou & Li, Keping, 2012. "Capacity allocation problem with random demands for the rail container carrier," European Journal of Operational Research, Elsevier, vol. 217(1), pages 214-221.
    11. Hanafi, Said & Freville, Arnaud, 1998. "An efficient tabu search approach for the 0-1 multidimensional knapsack problem," European Journal of Operational Research, Elsevier, vol. 106(2-3), pages 659-675, April.
    12. Lin, Feng-Tse & Yao, Jing-Shing, 2001. "Using fuzzy numbers in knapsack problems," European Journal of Operational Research, Elsevier, vol. 135(1), pages 158-176, November.
    13. Yilmaz, Dogacan & Büyüktahtakın, İ. Esra, 2024. "An expandable machine learning-optimization framework to sequential decision-making," European Journal of Operational Research, Elsevier, vol. 314(1), pages 280-296.
    14. Edward Y H Lin & Chung-Min Wu, 2004. "The multiple-choice multi-period knapsack problem," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 55(2), pages 187-197, February.
    15. A Volgenant & I Y Zwiers, 2007. "Partial enumeration in heuristics for some combinatorial optimization problems," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 58(1), pages 73-79, January.
    16. Paola Cappanera & Marco Trubian, 2005. "A Local-Search-Based Heuristic for the Demand-Constrained Multidimensional Knapsack Problem," INFORMS Journal on Computing, INFORMS, vol. 17(1), pages 82-98, February.
    17. Enrique Garza-Escalante & Arturo de la Torre, 2015. "Nacional Monte de Piedad Uses a Novel Social-Value Measure for Allocating Grants Among Charities," Interfaces, INFORMS, vol. 45(6), pages 514-528, December.
    18. Oliver Bastert & Benjamin Hummel & Sven de Vries, 2010. "A Generalized Wedelin Heuristic for Integer Programming," INFORMS Journal on Computing, INFORMS, vol. 22(1), pages 93-107, February.
    19. G. Edward Fox & Christopher J. Nachtsheim, 1990. "An analysis of six greedy selection rules on a class of zero‐one integer programming models," Naval Research Logistics (NRL), John Wiley & Sons, vol. 37(2), pages 299-307, April.
    20. Bahram Alidaee & Vijay P. Ramalingam & Haibo Wang & Bryan Kethley, 2018. "Computational experiment of critical event tabu search for the general integer multidimensional knapsack problem," Annals of Operations Research, Springer, vol. 269(1), pages 3-19, October.
