New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system

Authors

  • Ohno, Katsuhisa
  • Boh, Toshitaka
  • Nakade, Koichi
  • Tamura, Takayoshi

Abstract

Undiscounted Markov decision processes (UMDPs) can be used to formulate optimal stochastic control problems that minimize the expected total cost per period for various systems. We propose new approximate dynamic programming (ADP) algorithms for large-scale UMDPs that can overcome the curses of dimensionality. These algorithms, called simulation-based modified policy iteration (SBMPI) algorithms, extend the simulation-based modified policy iteration method (SBMPIM) (Ohno, 2011) for optimal control problems of multistage JIT-based production and distribution systems with stochastic demand and production capacity. The SBMPI algorithms introduce two main new concepts: the simulation-based policy evaluation step of the SBMPIM is replaced by the partial policy evaluation step of the modified policy iteration method (MPIM), and the algorithms start from the expected total cost per period and the relative value estimated by simulating the system under a reasonable initial policy.
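
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' SBMPI algorithm: it is a textbook-style modified policy iteration for an average-cost MDP, started from relative values estimated by simulating an initial policy. The two-state maintenance problem, the names COST, PROB, REF, and the parameters m, eps, steps are all assumptions made for illustration only.

```python
import random

# Hypothetical toy machine-maintenance MDP (not taken from the paper).
# States: 0 = good, 1 = worn.  Actions: 0 = keep running, 1 = repair.
COST = [[0.0, 1.0],   # COST[s][a]: one-period cost of action a in state s
        [4.0, 2.0]]
PROB = [[[0.5, 0.5], [1.0, 0.0]],   # PROB[s][a][s2]: transition probabilities
        [[0.0, 1.0], [1.0, 0.0]]]
REF = 0  # reference state whose relative value is pinned to zero


def backup(s, a, v):
    """One-step lookahead: immediate cost plus expected next relative value."""
    return COST[s][a] + sum(p * v[s2] for s2, p in enumerate(PROB[s][a]))


def simulate_initial_values(policy, steps, rng):
    """Estimate the average cost and relative values of `policy` by simulation."""
    states, costs, s = [], [], REF
    for _ in range(steps):
        states.append(s)
        costs.append(COST[s][policy[s]])
        s = rng.choices(range(len(COST)), weights=PROB[s][policy[s]])[0]
    g = sum(costs) / steps
    # Walk backwards from the last visit to REF, accumulating (cost - g)
    # from each visited state until the next return to REF.
    last_ref = max(t for t, x in enumerate(states) if x == REF)
    sums, counts, acc = [0.0] * len(COST), [0] * len(COST), 0.0
    for t in range(last_ref - 1, -1, -1):
        if states[t] == REF:
            acc = 0.0
        else:
            acc += costs[t] - g
            sums[states[t]] += acc
            counts[states[t]] += 1
    v = [sums[i] / counts[i] if counts[i] else 0.0 for i in range(len(COST))]
    return g, v


def modified_policy_iteration(v, m=5, eps=1e-8, max_iter=1000):
    """Average-cost modified policy iteration with m partial-evaluation sweeps."""
    n = len(COST)
    for _ in range(max_iter):
        # Policy improvement: greedy action with respect to the current v.
        policy = [min(range(len(COST[s])), key=lambda a: backup(s, a, v))
                  for s in range(n)]
        w = [backup(s, policy[s], v) for s in range(n)]
        # min/max of (w - v) bracket the optimal average cost per period.
        diff = [w[s] - v[s] for s in range(n)]
        lower, upper = min(diff), max(diff)
        if upper - lower < eps:
            break
        # Partial policy evaluation: m extra sweeps under the fixed policy.
        for _ in range(m):
            w = [backup(s, policy[s], w) for s in range(n)]
        v = [x - w[REF] for x in w]  # renormalise so that v[REF] == 0
    return (lower + upper) / 2, v, policy


rng = random.Random(0)
g0, v0 = simulate_initial_values([0, 1], steps=20000, rng=rng)
g, v, policy = modified_policy_iteration(v0)
print("simulated start:", g0, v0)
print("after MPI:", g, v, policy)
```

For this toy problem the optimal policy can be checked by hand: keep running in the good state and repair in the worn state, with an average cost of 2/3 per period and a relative value of 4/3 for the worn state. Partial evaluation runs only m sweeps per iteration instead of solving the evaluation equations exactly, which is the property that makes the approach attractive when the state space is too large for exact evaluation. The sketch still enumerates every state, so it does not by itself address large-scale problems.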

Suggested Citation

  • Ohno, Katsuhisa & Boh, Toshitaka & Nakade, Koichi & Tamura, Takayoshi, 2016. "New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system," European Journal of Operational Research, Elsevier, vol. 249(1), pages 22-31.
  • Handle: RePEc:eee:ejores:v:249:y:2016:i:1:p:22-31
    DOI: 10.1016/j.ejor.2015.07.026

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221715006591
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Andrew J. Clark & Herbert Scarf, 2004. "Optimal Policies for a Multi-Echelon Inventory Problem," Management Science, INFORMS, vol. 50(12_supple), pages 1782-1790, December.
    2. Ohno, Katsuhisa, 2011. "The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems," European Journal of Operational Research, Elsevier, vol. 213(1), pages 124-133, August.
    3. Richard Bellman, 1957. "On a Dynamic Programming Approach to the Caterer Problem--I," Management Science, INFORMS, vol. 3(3), pages 270-278, April.
    4. Katsuhisa Ohno & Kuniyoshi Ichiki, 1987. "Computing Optimal Policies for Controlled Tandem Queueing Systems," Operations Research, INFORMS, vol. 35(1), pages 121-126, February.
    5. Vijay V. Desai & Vivek F. Farias & Ciamac C. Moallemi, 2012. "Approximate Dynamic Programming via a Smoothed Linear Program," Operations Research, INFORMS, vol. 60(3), pages 655-674, June.
    6. Tapas K. Das & Abhijit Gosavi & Sridhar Mahadevan & Nicholas Marchalleck, 1999. "Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning," Management Science, INFORMS, vol. 45(4), pages 560-574, April.

    Citations


    Cited by:

    1. de Kok, Ton & Grob, Christopher & Laumanns, Marco & Minner, Stefan & Rambau, Jörg & Schade, Konrad, 2018. "A typology and literature review on stochastic multi-echelon inventory models," European Journal of Operational Research, Elsevier, vol. 269(3), pages 955-983.
    2. Annear, Luis Mauricio & Akhavan-Tabatabaei, Raha & Schmid, Verena, 2023. "Dynamic assignment of a multi-skilled workforce in job shops: An approximate dynamic programming approach," European Journal of Operational Research, Elsevier, vol. 306(3), pages 1109-1125.
    3. Cerqueti, Roy & Falbo, Paolo & Pelizzari, Cristian, 2017. "Relevant states and memory in Markov chain bootstrapping and simulation," European Journal of Operational Research, Elsevier, vol. 256(1), pages 163-177.
    4. Sankaranarayanan, Sriram & Feijoo, Felipe & Siddiqui, Sauleh, 2018. "Sensitivity and covariance in stochastic complementarity problems with an application to North American natural gas markets," European Journal of Operational Research, Elsevier, vol. 268(1), pages 25-36.
    5. Barlow, E. & Bedford, T. & Revie, M. & Tan, J. & Walls, L., 2021. "A performance-centred approach to optimising maintenance of complex systems," European Journal of Operational Research, Elsevier, vol. 292(2), pages 579-595.
    6. Cheng, Bayi & Leung, Joseph Y.-T. & Li, Kai & Yang, Shanlin, 2019. "Integrated optimization of material supplying, manufacturing, and product distribution: Models and fast algorithms," European Journal of Operational Research, Elsevier, vol. 277(1), pages 100-111.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ohno, Katsuhisa, 2011. "The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems," European Journal of Operational Research, Elsevier, vol. 213(1), pages 124-133, August.
    2. Noordhoek, Marije & Dullaert, Wout & Lai, David S.W. & de Leeuw, Sander, 2018. "A simulation–optimization approach for a service-constrained multi-echelon distribution network," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 114(C), pages 292-311.
    3. Tan Wang & L. Jeff Hong, 2023. "Large-Scale Inventory Optimization: A Recurrent Neural Networks–Inspired Simulation Approach," INFORMS Journal on Computing, INFORMS, vol. 35(1), pages 196-215, January.
    4. Pierre Bernhard & Marc Deschamps, 2017. "Kalman on dynamics and control, Linear System Theory, Optimal Control, and Filter," Working Papers 2017-10, CRESE.
    5. Jones, Randall E. & Cacho, Oscar J., 2000. "A Dynamic Optimisation Model of Weed Control," 2000 Conference (44th), January 23-25, 2000, Sydney, Australia 123685, Australian Agricultural and Resource Economics Society.
    6. Qu, Zhan & Raff, Horst & Schmitt, Nicolas, 2018. "Incentives through inventory control in supply chains," International Journal of Industrial Organization, Elsevier, vol. 59(C), pages 486-513.
    7. Boissiere, J. & Frein, Y. & Rapine, C., 2008. "Optimal stationary policies in a 3-stage serial production-distribution logistic chain facing constant and continuous demand," European Journal of Operational Research, Elsevier, vol. 186(2), pages 608-619, April.
    8. Voelkel, Michael A. & Sachs, Anna-Lena & Thonemann, Ulrich W., 2020. "An aggregation-based approximate dynamic programming approach for the periodic review model with random yield," European Journal of Operational Research, Elsevier, vol. 281(2), pages 286-298.
    9. Pam Norton & Ravi Phatarfod, 2008. "Optimal Strategies In One-Day Cricket," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 25(04), pages 495-511.
    10. Jan A. Van Mieghem & Nils Rudi, 2002. "Newsvendor Networks: Inventory Management and Capacity Investment with Discretionary Activities," Manufacturing & Service Operations Management, INFORMS, vol. 4(4), pages 313-335, August.
    11. Wang, Zhaodong & Wang, Xin & Ouyang, Yanfeng, 2015. "Bounded growth of the bullwhip effect under a class of nonlinear ordering policies," European Journal of Operational Research, Elsevier, vol. 247(1), pages 72-82.
    12. Hill, R.M. & Seifbarghy, M. & Smith, D.K., 2007. "A two-echelon inventory model with lost sales," European Journal of Operational Research, Elsevier, vol. 181(2), pages 753-766, September.
    13. Aghayi, Nazila & Maleki, Bentolhoda, 2016. "Efficiency measurement of DMUs with undesirable outputs under uncertainty based on the directional distance function: Application on bank industry," Energy, Elsevier, vol. 112(C), pages 376-387.
    14. Tan, Madeleine Sui-Lay, 2016. "Policy coordination among the ASEAN-5: A global VAR analysis," Journal of Asian Economics, Elsevier, vol. 44(C), pages 20-40.
    15. Carole Camisullis & Vincent Giard, 2010. "Détermination des stocks de sécurité dans une chaîne logistique-amont dédiée à une production de masse de produits fortement diversifiés," Working Papers hal-00876986, HAL.
    16. Preil, Deniz & Krapp, Michael, 2022. "Bandit-based inventory optimisation: Reinforcement learning in multi-echelon supply chains," International Journal of Production Economics, Elsevier, vol. 252(C).
    17. D. W. K. Yeung, 2008. "Dynamically Consistent Solution For A Pollution Management Game In Collaborative Abatement With Uncertain Future Payoffs," International Game Theory Review (IGTR), World Scientific Publishing Co. Pte. Ltd., vol. 10(04), pages 517-538.
    18. Korfhage, Thorben & Fischer-Weckemann, Björn, 2024. "Long-run consequences of informal elderly care and implications of public long-term care insurance," Journal of Health Economics, Elsevier, vol. 96(C).
    19. Kevin H. Shang & Jing-Sheng Song & Paul H. Zipkin, 2009. "Coordination Mechanisms in Decentralized Serial Inventory Systems with Batch Ordering," Management Science, INFORMS, vol. 55(4), pages 685-695, April.
    20. Zhanwei Tian & Guoqing Zhang, 2021. "Multi-echelon fulfillment warehouse rent and production allocation for online direct selling," Annals of Operations Research, Springer, vol. 304(1), pages 427-451, September.
