Authors
- Chengyi Lyu (Leeds School of Business, University of Colorado Boulder, Boulder, Colorado 80309)
- Huanan Zhang (Leeds School of Business, University of Colorado Boulder, Boulder, Colorado 80309)
- Linwei Xin (Booth School of Business, University of Chicago, Chicago, Illinois 60637)
Abstract
In this paper, we consider a classic periodic-review lost-sales inventory system with lead times, which has a wide range of real-world applications and is notoriously challenging to optimize. We consider a joint learning and optimization problem in which the decision maker does not know the demand distribution a priori and can only use past sales information (i.e., censored demand). Departing from existing learning algorithms for this problem, which require convexity of the underlying system, we develop an upper confidence bound (UCB)-type learning framework that incorporates simulations with the Kaplan–Meier estimator. We demonstrate its applicability to learning not only the optimal capped base-stock policy, for which convexity no longer holds, but also the optimal base-stock policy, with a regret that matches the best existing result. Compared with a classic multi-armed bandit problem, our problem poses unique challenges that stem from the nature of the inventory system: (1) each action has long-term impacts on future costs, and (2) the system state space is exponentially large in the lead time. As such, our learning algorithms are not naive adoptions of the classic UCB algorithm; in fact, the design of the simulation steps with the Kaplan–Meier estimator and of the averaging steps is novel, and the confidence width in the UCB index also differs from the classic one. We prove that the regrets of our learning algorithms are tight up to a logarithmic term in the planning horizon T. Our extensive numerical experiments suggest that the proposed algorithms (almost) dominate existing learning algorithms. We also demonstrate how to select which learning algorithm to use with limited demand data.
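To make the idea of estimating demand from censored sales concrete, the following is a minimal Python sketch of a Kaplan–Meier-style estimator for integer demand. It is an illustration only and not the paper's algorithm: the function name `kaplan_meier_demand`, the inputs `sales` and `on_hand`, and the convention that a period with sales equal to on-hand inventory is censored (only "demand is at least the on-hand level" is learned) are all assumptions made here, and the abstract does not specify how the authors handle these details.

```python
from collections import Counter


def kaplan_meier_demand(sales, on_hand):
    """Estimate P(D > v) for integer demand D from censored sales data.

    Assumed data model (hypothetical, for illustration):
      sales[t] = min(demand[t], on_hand[t]).
    A period with sales[t] < on_hand[t] reveals the demand exactly.
    A period with sales[t] == on_hand[t] is treated as censored: we
    only learn that demand[t] >= on_hand[t].

    Returns a dict {v: estimate of P(D > v)} for v = 0 .. largest
    fully observed demand value. Returns {} if every period is censored.
    """
    if len(sales) != len(on_hand):
        raise ValueError("sales and on_hand must have the same length")

    observed = Counter()   # demand values seen exactly
    censored = Counter()   # stockout periods, keyed by the on-hand level
    for s, q in zip(sales, on_hand):
        if s < q:
            observed[s] += 1
        else:
            censored[s] += 1

    if not observed:
        return {}

    survival = {}
    surv = 1.0
    for v in range(max(observed) + 1):
        # Risk set at v: periods in which a demand of exactly v could have
        # been observed. A stockout at level q cannot reveal demand == q,
        # so censored periods count only while q > v.
        at_risk = sum(c for s, c in observed.items() if s >= v)
        at_risk += sum(c for q, c in censored.items() if q > v)
        if at_risk > 0:
            surv *= 1.0 - observed[v] / at_risk
        survival[v] = surv
    return survival


if __name__ == "__main__":
    # Five periods; periods 2 and 5 stock out (sales == on_hand).
    est = kaplan_meier_demand(sales=[2, 3, 3, 1, 4], on_hand=[5, 3, 5, 5, 4])
    for v, p in est.items():
        print(f"P(D > {v}) ~ {p:.3f}")
```

With no stockouts, the estimate reduces to the ordinary empirical survival function of demand. An estimate of this kind could then drive simulated cost evaluations of candidate policies, which is the role the abstract attributes to the simulation steps; how the paper combines the two is not described here.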
Suggested Citation
Chengyi Lyu & Huanan Zhang & Linwei Xin, 2024.
"UCB-Type Learning Algorithms with Kaplan–Meier Estimator for Lost-Sales Inventory Models with Lead Times,"
Operations Research, INFORMS, vol. 72(4), pages 1317-1332, July.
Handle: RePEc:inm:oropre:v:72:y:2024:i:4:p:1317-1332
DOI: 10.1287/opre.2022.0273