A structured pattern matrix algorithm for multichain Markov decision processes

Author

Tetsuichiro Iki
Masayuki Horiguchi
Masami Kurano

Abstract

This paper presents a new algorithm for multichain finite-state Markov decision processes that finds an average optimal policy by decomposing the state space into communicating classes and a transient class. For each communicating class, a relatively optimal policy is found; these policies are then used to find an optimal policy by applying the value iteration algorithm. A pattern matrix, which determines the behaviour pattern of the decision process, allows the decomposition of the state space to be carried out effectively, so that the proposed algorithm simplifies the structured algorithm given in Leizarowitz's excellent paper (Math Oper Res 28:553–586, 2003). A numerical example is given to illustrate the algorithm. Copyright Springer-Verlag 2007
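To make the decomposition idea in the abstract more concrete, here is a minimal Python sketch. It is not the authors' algorithm, and the paper's exact definitions of the pattern matrix and of a communicating class may differ. The sketch assumes a 0/1 pattern matrix whose entry (s, t) is 1 when some action moves state s to state t with positive probability, groups mutually reachable states by a transitive closure, and treats a group as a candidate communicating class when every state in it has at least one action that keeps all probability mass inside the group. All names (`pattern_matrix`, `reachability`, `decompose`, `EXAMPLE`) and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: P[s][a] maps next state -> transition probability.
MDP = List[Dict[str, Dict[int, float]]]


def pattern_matrix(P: MDP) -> List[List[int]]:
    """0/1 matrix: entry (s, t) is 1 if some action reaches t from s."""
    n = len(P)
    M = [[0] * n for _ in range(n)]
    for s, actions in enumerate(P):
        for dist in actions.values():
            for t, p in dist.items():
                if p > 0:
                    M[s][t] = 1
    return M


def reachability(M: List[List[int]]) -> List[List[bool]]:
    """Reflexive transitive closure of the pattern (Warshall's algorithm)."""
    n = len(M)
    R = [[bool(M[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            if R[i][k]:
                for j in range(n):
                    if R[k][j]:
                        R[i][j] = True
    return R


def decompose(P: MDP) -> Tuple[List[List[int]], List[int]]:
    """Split states into candidate communicating classes and the rest.

    Assumption of this sketch: a group of mutually reachable states is a
    candidate class if each of its states has an action whose support
    stays inside the group; all other states are labelled transient.
    """
    R = reachability(pattern_matrix(P))
    n = len(P)
    seen = set()
    classes: List[List[int]] = []
    transient: List[int] = []
    for i in range(n):
        if i in seen:
            continue
        comp = {j for j in range(n) if R[i][j] and R[j][i]}
        seen |= comp
        closable = all(
            any(
                all(t in comp for t, p in dist.items() if p > 0)
                for dist in P[s].values()
            )
            for s in comp
        )
        if closable:
            classes.append(sorted(comp))
        else:
            transient.extend(sorted(comp))
    return classes, sorted(transient)


# Toy instance (invented for illustration): states 0 and 1 cycle,
# state 2 is absorbing, state 3 leaks into both groups.
EXAMPLE: MDP = [
    {"go": {1: 1.0}},
    {"go": {0: 1.0}},
    {"stay": {2: 1.0}},
    {"leak": {0: 0.5, 2: 0.5}},
]

if __name__ == "__main__":
    print(decompose(EXAMPLE))
```

On the toy instance the sketch returns the groups {0, 1} and {2} as candidate classes and state 3 as transient. A per-class policy search followed by value iteration, as the abstract describes, would then operate on these pieces; that part is not shown here.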

Suggested Citation

Tetsuichiro Iki & Masayuki Horiguchi & Masami Kurano, 2007. "A structured pattern matrix algorithm for multichain Markov decision processes," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 66(3), pages 545-555, December.

Handle: RePEc:spr:mathme:v:66:y:2007:i:3:p:545-555
DOI: 10.1007/s00186-006-0138-5

References listed on IDEAS

Arie Leizarowitz, 2003. "An Algorithm to Identify and Compute Average Optimal Policies in Multichain Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 28(3), pages 553-586, August.
Richard Bellman, 1957. "On a Dynamic Programming Approach to the Caterer Problem--I," Management Science, INFORMS, vol. 3(3), pages 270-278, April.
Arie Hordijk & Martin L. Puterman, 1987. "On the Convergence of Policy Iteration in Finite State Undiscounted Markov Decision Processes: The Unichain Case," Mathematics of Operations Research, INFORMS, vol. 12(1), pages 163-176, February.
A. Hordijk & L. C. M. Kallenberg, 1979. "Linear Programming and Markov Decision Chains," Management Science, INFORMS, vol. 25(4), pages 352-362, April.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Guillot, Matthieu & Stauffer, Gautier, 2020. "The Stochastic Shortest Path Problem: A polyhedral combinatorics perspective," European Journal of Operational Research, Elsevier, vol. 285(1), pages 148-158.
Arie Leizarowitz & Alexander J. Zaslavski, 2007. "Uniqueness and Stability of Optimal Policies of Finite State Markov Decision Processes," Mathematics of Operations Research, INFORMS, vol. 32(1), pages 156-167, February.
Voelkel, Michael A. & Sachs, Anna-Lena & Thonemann, Ulrich W., 2020. "An aggregation-based approximate dynamic programming approach for the periodic review model with random yield," European Journal of Operational Research, Elsevier, vol. 281(2), pages 286-298.
Lodewijk Kallenberg, 2013. "Derman’s book as inspiration: some results on LP for MDPs," Annals of Operations Research, Springer, vol. 208(1), pages 63-94, September.
Tan, Madeleine Sui-Lay, 2016. "Policy coordination among the ASEAN-5: A global VAR analysis," Journal of Asian Economics, Elsevier, vol. 44(C), pages 20-40.
D. W. K. Yeung, 2008. "Dynamically Consistent Solution For A Pollution Management Game In Collaborative Abatement With Uncertain Future Payoffs," International Game Theory Review (IGTR), World Scientific Publishing Co. Pte. Ltd., vol. 10(04), pages 517-538.
Hanafi, Said & Freville, Arnaud, 1998. "An efficient tabu search approach for the 0-1 multidimensional knapsack problem," European Journal of Operational Research, Elsevier, vol. 106(2-3), pages 659-675, April.
Renato Cordeiro Amorim, 2016. "A Survey on Feature Weighting Based K-Means Algorithms," Journal of Classification, Springer;The Classification Society, vol. 33(2), pages 210-242, July.
Dmitri Blueschke & Ivan Savin, 2015. "No such thing like perfect hammer: comparing different objective function specifications for optimal control," Jena Economics Research Papers 2015-005, Friedrich-Schiller-University Jena.
Changming Ji & Chuangang Li & Boquan Wang & Minghao Liu & Liping Wang, 2017. "Multi-Stage Dynamic Programming Method for Short-Term Cascade Reservoirs Optimal Operation with Flow Attenuation," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 31(14), pages 4571-4586, November.
Ghassan, Hassan B. & Al-Jefri, Essam H., 2015. "الحساب الجاري في المدى البعيد عبر نموذج داخلي الزمن [The Current Account in the Long Run through the Intertemporal Model]," MPRA Paper 66527, University Library of Munich, Germany.
John Stachurski, 2009. "Economic Dynamics: Theory and Computation," MIT Press Books, The MIT Press, edition 1, volume 1, number 0262012774, December.
Mercedes Esteban-Bravo & Jose M. Vidal-Sanz & Gökhan Yildirim, 2014. "Valuing Customer Portfolios with Endogenous Mass and Direct Marketing Interventions Using a Stochastic Dynamic Programming Decomposition," Marketing Science, INFORMS, vol. 33(5), pages 621-640, September.
- Vidal-Sanz, Jose M. & Yildirim, Gökhan, 2012. "Valuing customer portfolios with endogenous mass-and-direct-marketing interventions using a stochastic dynamic programming decomposition," DEE - Working Papers. Business Economics. WB wb121304, Universidad Carlos III de Madrid. Departamento de Economía de la Empresa.
Ohno, Katsuhisa & Boh, Toshitaka & Nakade, Koichi & Tamura, Takayoshi, 2016. "New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system," European Journal of Operational Research, Elsevier, vol. 249(1), pages 22-31.
Dijk, N.M. van, 1989. "Truncation of Markov decision problems with a queueing network overflow control application," Serie Research Memoranda 0065, VU University Amsterdam, Faculty of Economics, Business Administration and Econometrics.
Oleg Malafeyev & Achal Awasthi, 2015. "A Dynamic Model of Functioning of a Bank," Papers 1511.01529, arXiv.org.
Bellemare, Charles, 2007. "A life-cycle model of outmigration and economic assimilation of immigrants in Germany," European Economic Review, Elsevier, vol. 51(3), pages 553-576, April.
- Bellemare, Charles, 2004. "A Life-Cycle Model of Outmigration and Economic Assimilation of Immigrants in Germany," IZA Discussion Papers 1012, Institute of Labor Economics (IZA).
- Charles Bellemare, 2004. "A Life-Cycle Model of Outmigration and Economic Assimilation of Immigrants in Germany," Cahiers de recherche 0430, CIRPEE.
- Bellemare, C., 2004. "A Life-Cycle Model of Outmigration and Economic Assimilation of Immigrants in Germany," Discussion Paper 2004-29, Tilburg University, Center for Economic Research.
Daniel Adelman & George L. Nemhauser & Mario Padron & Robert Stubbs & Ram Pandit, 1999. "Allocating Fibers in Cable Manufacturing," Manufacturing & Service Operations Management, INFORMS, vol. 1(1), pages 21-35.
Fosgerau, Mogens & Frejinger, Emma & Karlstrom, Anders, 2013. "A link based network route choice model with unrestricted choice set," Transportation Research Part B: Methodological, Elsevier, vol. 56(C), pages 70-80.
- Fosgerau, Mogens & Frejinger, Emma & Karlstrom, Anders, 2013. "A link based network route choice model with unrestricted choice set," MPRA Paper 48707, University Library of Munich, Germany.
- Fosgerau, Mogens & Frejinger, Emma & Karlström, Anders, 2013. "A link based network route choice model with unrestricted choice set," Working papers in Transport Economics 2013:10, CTS - Centre for Transport Studies Stockholm (KTH and VTI).
Alipanah, A. & Razzaghi, M. & Dehghan, M., 2007. "Nonclassical pseudospectral method for the solution of brachistochrone problem," Chaos, Solitons & Fractals, Elsevier, vol. 34(5), pages 1622-1628.

More about this item

Keywords

Multichain Markov decision processes; Structured algorithm; Communicating class; Transient class; Value iteration;
