Reinforcement Procedure for Randomized Machine Learning

Author

Listed:

Yuri S. Popkov
(Federal Research Center “Computer Science and Control” of Russian Academy of Sciences, 44/2 Vavilova, 119333 Moscow, Russia
Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, 65 Profsoyuznaya, 117997 Moscow, Russia)
Yuri A. Dubnov
(Federal Research Center “Computer Science and Control” of Russian Academy of Sciences, 44/2 Vavilova, 119333 Moscow, Russia
Faculty of Computer Science, National Research University “Higher School of Economics”, 20 Myasnitskaya, 109028 Moscow, Russia)
Alexey Yu. Popkov
(Federal Research Center “Computer Science and Control” of Russian Academy of Sciences, 44/2 Vavilova, 119333 Moscow, Russia)

Abstract

This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We develop a reinforcement procedure based on the agent approach and Bellman’s optimality principle. The procedure ensures that the sequence of local records produced by the iterative learning computation is strictly monotonic. We determine how the size of the neighborhood of the global minimum, and the probability of reaching it, depend on the parameters of the algorithm, and we prove that the algorithm converges to this neighborhood with the indicated probability.
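
To make the notion of a strictly monotonic sequence of local records concrete, the following is a minimal sketch of a generic random search that stores a new record only when the objective strictly improves. It is not the authors' reinforcement procedure: the function `record_sequence`, its parameters `objective`, `sample`, `n_iter`, and `seed`, and the uniform sampling rule are all assumptions made for illustration only.

```python
import random


def record_sequence(objective, sample, n_iter, seed=0):
    """Generic random search that keeps a strictly decreasing record list.

    Illustrative only: the names and the sampling rule are hypothetical
    and are not taken from the paper.
    """
    rng = random.Random(seed)
    best_x = sample(rng)
    best_f = objective(best_x)
    records = [best_f]  # the first evaluated point is the first record
    for _ in range(n_iter):
        x = sample(rng)
        f = objective(x)
        # Accept only strict improvements, so the stored records are
        # strictly decreasing by construction.
        if f < best_f:
            best_x, best_f = x, f
            records.append(f)
    return best_x, records


if __name__ == "__main__":
    # Hypothetical test problem: minimize (x - 1)^2 on [-5, 5].
    x_best, recs = record_sequence(
        objective=lambda x: (x - 1.0) ** 2,
        sample=lambda rng: rng.uniform(-5.0, 5.0),
        n_iter=1000,
    )
    print(x_best, recs[-1], len(recs))
```

Because a candidate is accepted only when it strictly improves on the current record, monotonicity holds here trivially; the paper's contribution concerns a reinforcement scheme with this property together with probabilistic convergence guarantees, which this sketch does not attempt to reproduce.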

Suggested Citation

Yuri S. Popkov & Yuri A. Dubnov & Alexey Yu. Popkov, 2023. "Reinforcement Procedure for Randomized Machine Learning," Mathematics, MDPI, vol. 11(17), pages 1-14, August.

Handle: RePEc:gam:jmathe:v:11:y:2023:i:17:p:3651-:d:1223667

References listed on IDEAS

Marco Avellaneda, 1998. "Minimum-Relative-Entropy Calibration of Asset-Pricing Models," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 1(04), pages 447-472.
Yuri S. Popkov & Alexey Yu. Popkov & Yuri A. Dubnov & Dimitri Solomatine, 2020. "Entropy-Randomized Forecasting of Stochastic Dynamic Regression Models," Mathematics, MDPI, vol. 8(7), pages 1-20, July.
Yuri S. Popkov & Yuri A. Dubnov & Alexey Yu. Popkov, 2016. "New Method of Randomized Forecasting Using Entropy-Robust Estimation: Application to the World Population Prediction," Mathematics, MDPI, vol. 4(1), pages 1-16, March.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Fard, Farzad Alavi & Siu, Tak Kuen, 2013. "Pricing participating products with Markov-modulated jump–diffusion process: An efficient numerical PIDE approach," Insurance: Mathematics and Economics, Elsevier, vol. 53(3), pages 712-721.
Gupta, Aparna & Palepu, Sai, 2024. "Designing risk-free service for renewable wind and solar resources," European Journal of Operational Research, Elsevier, vol. 315(2), pages 715-728.
Sebastian Jaimungal & Silvana M. Pesenti & Leandro Sánchez-Betancourt, 2022. "Minimal Kullback-Leibler Divergence for Constrained Lévy-Itô Processes," Papers 2206.14844, arXiv.org, revised Aug 2022.
Alexander Veremyev & Peter Tsyurmasto & Stan Uryasev & R. Rockafellar, 2014. "Calibrating probability distributions with convex-concave-convex functions: application to CDO pricing," Computational Management Science, Springer, vol. 11(4), pages 341-364, October.
Paul Glasserman & Bin Yu, 2005. "Large Sample Properties of Weighted Monte Carlo Estimators," Operations Research, INFORMS, vol. 53(2), pages 298-312, April.
Julien Guyon, 2024. "Dispersion-constrained martingale Schrödinger problems and the exact joint S&P 500/VIX smile calibration puzzle," Finance and Stochastics, Springer, vol. 28(1), pages 27-79, January.
José L. Vilar-Zanón & Olivia Peraita-Ezcurra, 2019. "A linear goal programming method to recover risk neutral probabilities from options prices by maximum entropy," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 42(1), pages 259-276, June.
Xiaohong Chen & Lars Peter Hansen & Peter G. Hansen, 2020. "Robust identification of investor beliefs," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 117(52), pages 33130-33140, December.
- Xiaohong Chen & Lars P. Hansen & Peter G. Hansen, 2020. "Robust Identification of Investor Beliefs," Cowles Foundation Discussion Papers 2236, Cowles Foundation for Research in Economics, Yale University.
- Xiaohong Chen & Lars Peter Hansen & Peter G. Hansen, 2020. "Robust Identification of Investor Beliefs," Working Papers 2020-69, Becker Friedman Institute for Research In Economics.
- Xiaohong Chen & Lars P. Hansen & Peter G. Hansen, 2020. "Robust Identification of Investor Beliefs," NBER Working Papers 27257, National Bureau of Economic Research, Inc.
Minzhi Wu & Emili Tortosa-Ausina & Paula Cruz-García, 2024. "The impact of diversification on the profitability and risk of Chinese banks: evidence from a semiparametric approach," Empirical Economics, Springer, vol. 67(6), pages 2565-2606, December.
Minzhi Wu & Emili Tortosa-Ausina, 2020. "Bank Diversification and Focus in Disruptive Times: China, 2007–2018," Working Papers 2020/21, Economics Department, Universitat Jaume I, Castellón (Spain).
Vladislav Kargin, 2003. "Consistent Estimation of Pricing Kernels from Noisy Price Data," Papers math/0310223, arXiv.org.
- Vladislav Kargin, 2003. "Consistent Estimation of Pricing Kernels from Noisy Price Data," Finance 0311001, University Library of Munich, Germany.
Shulin Lan & Ming-Lang Tseng, 2018. "Coordinated Development of Metropolitan Logistics and Economy Toward Sustainability," Computational Economics, Springer;Society for Computational Economics, vol. 52(4), pages 1113-1138, December.
Jorge P. Zubelli & Kuldeep Singh & Vinicius Albani & Ioannis Kourakis, 2024. "Travelling wave solutions of an equation of Harry Dym type arising in the Black-Scholes framework," Papers 2412.19020, arXiv.org.
Marcel Nutz & Johannes Wiesel & Long Zhao, 2022. "Martingale Schrödinger Bridges and Optimal Semistatic Portfolios," Papers 2204.12250, arXiv.org.
Evgeny Danilov, 2023. "Impact of Market Changes and Regulatory Measures on Accuracy of Bond Valuation in Portfolios of Russian Credit Institutions," Russian Journal of Money and Finance, Bank of Russia, vol. 82(4), pages 108-125, December.
Vinicius Albani & Adriano De Cezaro & Jorge P. Zubelli, 2017. "Convex Regularization Of Local Volatility Estimation," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 20(01), pages 1-37, February.
Meucci, A. & Nicolosi, M., 2016. "Dynamic portfolio management with views at multiple horizons," Applied Mathematics and Computation, Elsevier, vol. 274(C), pages 495-518.
Farzad Alavi Fard & Firmin Doko Tchatoka & Sivagowry Sriananthakumar, 2021. "Maximum Entropy Evaluation of Asymptotic Hedging Error under a Generalised Jump-Diffusion Model," JRFM, MDPI, vol. 14(3), pages 1-19, February.
- Farzad Alavi Fard & Firmin Doko Tchatoka & Sivagowry Sriananthakumar, 2015. "Maximum Entropy Evaluation of Asymptotic Hedging Error under a Generalised Jump-Diffusion Model," School of Economics and Public Policy Working Papers 2015-17, University of Adelaide, School of Economics and Public Policy.
Pierre Henry-Labordere, 2019. "From (Martingale) Schrodinger bridges to a new class of Stochastic Volatility Models," Papers 1904.04554, arXiv.org.
Lelièvre, Tony & Samaey, Giovanni & Zieliński, Przemysław, 2020. "Analysis of a micro–macro acceleration method with minimum relative entropy moment matching," Stochastic Processes and their Applications, Elsevier, vol. 130(6), pages 3753-3801.

More about this item

Keywords

randomized machine learning; reinforcement learning; utility function; payoff function; Bellman’s optimality principle
