Suboptimal Policies, with Bounds, for Parameter Adaptive Decision Processes

Author

Listed:
  • William S. Lovejoy

    (Stanford University, Stanford, California)

Abstract

A parameter adaptive decision process is a sequential decision process where some parameter or parameter set impacting the rewards and/or transitions of the process is not known with certainty. Signals from the performance of the system can be processed by the decision maker as time progresses, yielding information regarding which parameter set is operative. Active learning is an essential feature of these processes, and the decision maker must choose actions that simultaneously guide the system in a preferred direction, as well as yield information that can be used to better prescribe future actions. If the operative parameter set is known with certainty, the parameter adaptive problem reduces to a conventional stochastic dynamic program, which is presumed solvable. Previous authors have shown how to use these solutions to generate suboptimal policies with performance bounds for the parameter adaptive problem. Here it is shown that some desirable characteristics of those bounds are shared by a larger class of functions than those generated from fully observed problems, and that this generalization allows for iterative tightening of the bounds in a manner that preserves those attributes. An example inventory stocking problem demonstrates the technique.
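The two ingredients the abstract describes, processing signals to learn which parameter set is operative and using the solutions of the known-parameter problems to bound performance, can be illustrated with a deliberately small single-period stocking sketch. This is not the paper's algorithm or its iterative tightening procedure. The two demand distributions, the cost rates, and all function names below are hypothetical, chosen only for illustration. The sketch shows a Bayes update of the belief over parameter sets, and the elementary fact that the belief-weighted average of the known-parameter optimal costs cannot exceed the cost achievable by any single stocking decision made under uncertainty.

```python
from typing import Dict, List, Tuple

# Hypothetical demand distributions (pmf over demand 0..3), one per candidate parameter set.
DEMAND_PMFS: List[Dict[int, float]] = [
    {0: 0.50, 1: 0.30, 2: 0.15, 3: 0.05},  # assumed "low demand" parameter set
    {0: 0.05, 1: 0.15, 2: 0.30, 3: 0.50},  # assumed "high demand" parameter set
]
HOLDING_COST = 1.0   # assumed cost per unit left over
SHORTAGE_COST = 3.0  # assumed cost per unit of unmet demand
STOCK_LEVELS = range(0, 4)  # candidate stocking decisions


def update_belief(belief: List[float], demand: int) -> List[float]:
    """Bayes' rule: posterior over parameter sets after observing one demand."""
    weights = [b * pmf.get(demand, 0.0) for b, pmf in zip(belief, DEMAND_PMFS)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observed demand is impossible under every parameter set")
    return [w / total for w in weights]


def expected_cost(stock: int, pmf: Dict[int, float]) -> float:
    """Expected single-period cost of a stock level when the parameter set is known."""
    return sum(
        p * (HOLDING_COST * max(stock - d, 0) + SHORTAGE_COST * max(d - stock, 0))
        for d, p in pmf.items()
    )


def one_period_bounds(belief: List[float]) -> Tuple[float, float]:
    """Return (lower bound, achievable cost) for one period under the given belief."""
    # Lower bound: solve each known-parameter problem, then average by the belief.
    lower = sum(
        b * min(expected_cost(q, pmf) for q in STOCK_LEVELS)
        for b, pmf in zip(belief, DEMAND_PMFS)
    )
    # Achievable cost: the best single stock level against the belief mixture.
    achievable = min(
        sum(b * expected_cost(q, pmf) for b, pmf in zip(belief, DEMAND_PMFS))
        for q in STOCK_LEVELS
    )
    return lower, achievable


if __name__ == "__main__":
    prior = [0.5, 0.5]
    print("bounds under prior:", one_period_bounds(prior))
    posterior = update_belief(prior, demand=3)
    print("posterior after observing demand 3:", posterior)
    print("bounds under posterior:", one_period_bounds(posterior))
```

The gap between the two returned numbers is the single-period cost of not knowing the operative parameter set. It closes when the belief concentrates on one parameter set, which is the sense in which informative observations have value in this toy setting.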

Suggested Citation

  • William S. Lovejoy, 1993. "Suboptimal Policies, with Bounds, for Parameter Adaptive Decision Processes," Operations Research, INFORMS, vol. 41(3), pages 583-599, June.
  • Handle: RePEc:inm:oropre:v:41:y:1993:i:3:p:583-599
    DOI: 10.1287/opre.41.3.583

    Citations



    Cited by:

    1. Larson, C. Erik & Olson, Lars J. & Sharma, Sunil, 2001. "Optimal Inventory Policies when the Demand Distribution Is Not Known," Journal of Economic Theory, Elsevier, vol. 101(1), pages 281-300, November.
    2. Arnab Bisi & Maqbool Dada, 2007. "Dynamic learning, pricing, and ordering by a censored newsvendor," Naval Research Logistics (NRL), John Wiley & Sons, vol. 54(4), pages 448-461, June.
    3. James T. Treharne & Charles R. Sox, 2002. "Adaptive Inventory Control for Nonstationary Demand and Partial Information," Management Science, INFORMS, vol. 48(5), pages 607-624, May.
    4. Anyan Qi & Hyun-Soo Ahn & Amitabh Sinha, 2017. "Capacity Investment with Demand Learning," Operations Research, INFORMS, vol. 65(1), pages 145-164, February.
    5. Xiaomei Ding & Martin L. Puterman & Arnab Bisi, 2002. "The Censored Newsvendor and the Optimal Acquisition of Information," Operations Research, INFORMS, vol. 50(3), pages 517-527, June.
    6. Yossi Aviv & Amit Pazgal, 2005. "A Partially Observed Markov Decision Process for Dynamic Pricing," Management Science, INFORMS, vol. 51(9), pages 1400-1416, September.
    7. Martin A. Lariviere & Evan L. Porteus, 1999. "Stalking Information: Bayesian Inventory Management with Unobserved Lost Sales," Management Science, INFORMS, vol. 45(3), pages 346-363, March.
    8. Glenn, David & Bisi, Arnab & Puterman, Martin L., 2004. "The Bayesian Newsvendors in Supply Chains with Unobserved Lost Sales," Working Papers 04-0110, University of Illinois at Urbana-Champaign, College of Business.
