IDEAS home Printed from https://ideas.repec.org/a/spr/joptap/v153y2012i3d10.1007_s10957-012-9989-5.html

An Online Actor–Critic Algorithm with Function Approximation for Constrained Markov Decision Processes

Author

Listed:
  • Shalabh Bhatnagar

    (Indian Institute of Science)

  • K. Lakshmanan

    (Indian Institute of Science)

Abstract

We develop an online actor–critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost Markov decision process (MDP) framework in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multi-stage queueing network with constraints on long-run average queue lengths. We observe that our algorithm exhibits good performance in this setting and converges to a feasible point.
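
The Lagrange multiplier idea mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: instead of sampled actor and critic estimates, it uses closed-form M/M/1 average queue lengths, and every number and name in it (the arrival rate, the two service rates, the threshold `ALPHA`, the step sizes, the function `solve`) is a hypothetical choice made for illustration. It routes a fraction `p` of arrivals to the first of two parallel queues, minimises the total average queue length, and requires the first queue's average length to stay at or below `ALPHA`. The routing parameter is updated by gradient descent on the Lagrangian, and the multiplier by projected ascent on the constraint violation with a smaller step size.

```python
import math

# Hypothetical toy setting (not taken from the paper): one arrival stream
# split between two parallel M/M/1 queues.
ARRIVAL = 1.0          # total arrival rate
MU1, MU2 = 2.0, 1.5    # service rates of queue 1 and queue 2
ALPHA = 0.25           # bound on the long-run average length of queue 1


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def queue_lengths(p):
    """Average M/M/1 queue lengths rho / (1 - rho) when a fraction p goes to queue 1."""
    rho1 = ARRIVAL * p / MU1
    rho2 = ARRIVAL * (1.0 - p) / MU2
    return rho1 / (1.0 - rho1), rho2 / (1.0 - rho2)


def queue_length_grads(p):
    """Derivatives of the two average queue lengths with respect to p."""
    rho1 = ARRIVAL * p / MU1
    rho2 = ARRIVAL * (1.0 - p) / MU2
    d1 = (ARRIVAL / MU1) / (1.0 - rho1) ** 2
    d2 = -(ARRIVAL / MU2) / (1.0 - rho2) ** 2
    return d1, d2


def solve(steps=5000, a=0.5, b=0.2):
    """Descent on theta and projected ascent on lam for
    L(theta, lam) = N1 + N2 + lam * (N1 - ALPHA), with p = sigmoid(theta)."""
    theta, lam = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(theta)
        n1, n2 = queue_lengths(p)
        d1, d2 = queue_length_grads(p)
        # Chain rule: dL/dtheta = dL/dp * dp/dtheta, with dp/dtheta = p * (1 - p).
        grad_theta = (d1 + d2 + lam * d1) * p * (1.0 - p)
        theta -= a * grad_theta
        # Multiplier grows while the constraint is violated; projected onto lam >= 0.
        lam = max(0.0, lam + b * (n1 - ALPHA))
    return sigmoid(theta), lam


if __name__ == "__main__":
    p, lam = solve()
    n1, n2 = queue_lengths(p)
    print(f"p={p:.4f}  lam={lam:.4f}  N1={n1:.4f}  N2={n2:.4f}")
```

With these assumed rates the constraint is active at the solution: the unconstrained minimiser would send roughly two thirds of the traffic to queue 1, while the constrained one settles where the first queue's average length equals `ALPHA`. In the setting the abstract describes, the exact averages and gradients used here would be replaced by estimates learned online from sample paths.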

Suggested Citation

  • Shalabh Bhatnagar & K. Lakshmanan, 2012. "An Online Actor–Critic Algorithm with Function Approximation for Constrained Markov Decision Processes," Journal of Optimization Theory and Applications, Springer, vol. 153(3), pages 688-708, June.
  • Handle: RePEc:spr:joptap:v:153:y:2012:i:3:d:10.1007_s10957-012-9989-5
    DOI: 10.1007/s10957-012-9989-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10957-012-9989-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations

    Cited by:

    1. Yuqing Zheng & Guoshan Zhang, 2020. "Suboptimal Control for Nonlinear Systems with Disturbance via Integral Sliding Mode Control and Policy Iteration," Journal of Optimization Theory and Applications, Springer, vol. 185(2), pages 652-677, May.
    2. Thomas Spooner & Rahul Savani, 2020. "A Natural Actor-Critic Algorithm with Downside Risk Constraints," Papers 2007.04203, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wright, Austin L. & Sonin, Konstantin & Driscoll, Jesse & Wilson, Jarnickae, 2020. "Poverty and economic dislocation reduce compliance with COVID-19 shelter-in-place protocols," Journal of Economic Behavior & Organization, Elsevier, vol. 180(C), pages 544-554.
    2. Jolian McHardy & Michael Reynolds & Stephen Trotter, 2012. "The Stackelberg Model as a Partial Solution to the Problem of Pricing in a Network," Working Paper series 19_12, Rimini Centre for Economic Analysis.
    3. Janvier D. Nkurunziza, 2005. "Reputation and Credit without Collateral in Africa's Formal Banking," Economics Series Working Papers WPS/2005-02, University of Oxford, Department of Economics.
    4. Stephanie Rosenkranz & Patrick W. Schmitz, 2007. "Can Coasean Bargaining Justify Pigouvian Taxation?," Economica, London School of Economics and Political Science, vol. 74(296), pages 573-585, November.
    5. Vadim Borokhov, 2014. "On the properties of nodal price response matrix in electricity markets," Papers 1404.3678, arXiv.org, revised Jan 2015.
    6. Yuzhou Jiang & Ramteen Sioshansi, 2023. "What Duality Theory Tells Us About Giving Market Operators the Authority to Dispatch Energy Storage," The Energy Journal, vol. 44(3), pages 89-110, May.
    7. Daniel Sutter & Daniel J. Smith, 2017. "Coordination in disaster: Nonprice learning and the allocation of resources after natural disasters," The Review of Austrian Economics, Springer;Society for the Development of Austrian Economics, vol. 30(4), pages 469-492, December.
    8. Hanming Fang & Peter Norman, 2014. "Toward an efficiency rationale for the public provision of private goods," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 56(2), pages 375-408, June.
    9. Gan, Li & Ju, Gaosheng & Zhu, Xi, 2015. "Nonparametric estimation of structural labor supply and exact welfare change under nonconvex piecewise-linear budget sets," Journal of Econometrics, Elsevier, vol. 188(2), pages 526-544.
    10. Peterson, Jeffrey M. & Boisvert, Richard N. & de Gorter, Harry, 1999. "Multifunctionality and Optimal Environmental Policies for Agriculture in an Open Economy," Working Papers 127701, Cornell University, Department of Applied Economics and Management.
    11. Tian, Guoqiang, 2009. "Implementation of Pareto efficient allocations," Journal of Mathematical Economics, Elsevier, vol. 45(1-2), pages 113-123, January.
    12. Ahmad Naimzada & Marina Pireddu, 2019. "The first fundamental theorem of welfare in a general equilibrium evolutionary setting," Working Papers 415, University of Milano-Bicocca, Department of Economics, revised 06 Jun 2019.
    13. Gajanan Panchal & Vipul Jain & Naoufel Cheikhrouhou & Matthias Gurtner, 2017. "Equilibrium analysis in multi-echelon supply chain with multi-dimensional utilities of inertial players," Journal of Revenue and Pricing Management, Palgrave Macmillan, vol. 16(4), pages 417-436, August.
    14. Aldasoro, Iñaki & Delli Gatti, Domenico & Faia, Ester, 2017. "Bank networks: Contagion, systemic risk and prudential policy," Journal of Economic Behavior & Organization, Elsevier, vol. 142(C), pages 164-188.
    15. Gatti, Nicolas & Cecil, Michael & Baylis, Kathy & Estes, Lyndon & Blekking, Jordan & Heckelei, Thomas & Vergopolan, Noemi & Evans, Tom, 2023. "Is closing the agricultural yield gap a “risky” endeavor?," Agricultural Systems, Elsevier, vol. 208(C).
    16. Aldo Montesano, 2018. "Social welfare for an economy of angelic agents," International Review of Economics, Springer;Happiness Economics and Interpersonal Relations (HEIRS), vol. 65(2), pages 185-200, June.
    17. Alexei A. Gaivoronski & Per Jonny Nesse & Olai Bendik Erdal, 2017. "Internet service provision and content services: paid peering and competition between internet providers," Netnomics, Springer, vol. 18(1), pages 43-79, May.
    18. Romero-Jordán, Desiderio & del Río, Pablo & Peñasco, Cristina, 2016. "An analysis of the welfare and distributive implications of factors influencing household electricity consumption," Energy Policy, Elsevier, vol. 88(C), pages 361-370.
    19. Araoz, Veronica & Jörnsten, Kurt, 2011. "Semi-Lagrangean approach for price discovery in markets with non-convexities," European Journal of Operational Research, Elsevier, vol. 214(2), pages 411-417, October.
    20. Chorvat, Terrence, 2006. "Taxing utility," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 35(1), pages 1-16, February.
