Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games
References listed on IDEAS
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018.
"Double/debiased machine learning for treatment and structural parameters,"
Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2017. "Double/Debiased Machine Learning for Treatment and Structural Parameters," NBER Working Papers 23564, National Bureau of Economic Research, Inc.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey & James Robins, 2017. "Double/debiased machine learning for treatment and structural parameters," CeMMAP working papers CWP28/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey & James Robins, 2017. "Double/debiased machine learning for treatment and structural parameters," CeMMAP working papers 28/17, Institute for Fiscal Studies.
- Toru Kitagawa & Aleksey Tetenov, 2018.
"Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice,"
Econometrica, Econometric Society, vol. 86(2), pages 591-616, March.
- Toru Kitagawa & Aleksey Tetenov, 2015. "Who should be treated? Empirical welfare maximization methods for treatment choice," CeMMAP working papers CWP10/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Toru Kitagawa & Aleksey Tetenov, 2017. "Who should be treated? Empirical welfare maximization methods for treatment choice," CeMMAP working papers CWP24/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Toru Kitagawa & Aleksey Tetenov, 2015. "Who should be Treated? Empirical Welfare Maximization Methods for Treatment Choice," Carlo Alberto Notebooks 402, Collegio Carlo Alberto.
- Keisuke Hirano & Guido W. Imbens & Geert Ridder, 2003.
"Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score,"
Econometrica, Econometric Society, vol. 71(4), pages 1161-1189, July.
- Keisuke Hirano & Guido W. Imbens & Geert Ridder, 2000. "Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score," NBER Technical Working Papers 0251, National Bureau of Economic Research, Inc.
- Guido Imbens, 2000. "Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score," Econometric Society World Congress 2000 Contributed Papers 1166, Econometric Society.
- S. A. Murphy, 2003. "Optimal dynamic treatment regimes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 65(2), pages 331-355, May.
- David Silver & Aja Huang & Chris J. Maddison & Arthur Guez & Laurent Sifre & George van den Driessche & Julian Schrittwieser & Ioannis Antonoglou & Veda Panneershelvam & Marc Lanctot & Sander Dieleman, 2016. "Mastering the game of Go with deep neural networks and tree search," Nature, Nature, vol. 529(7587), pages 484-489, January.
- David Silver & Julian Schrittwieser & Karen Simonyan & Ioannis Antonoglou & Aja Huang & Arthur Guez & Thomas Hubert & Lucas Baker & Matthew Lai & Adrian Bolton & Yutian Chen & Timothy Lillicrap & Fan , 2017. "Mastering the game of Go without human knowledge," Nature, Nature, vol. 550(7676), pages 354-359, October.
- Athey, Susan & Wager, Stefan, 2017. "Efficient Policy Learning," Research Papers 3506, Stanford University, Graduate School of Business.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Andrew Bennett & Nathan Kallus, 2020. "Efficient Policy Learning from Surrogate-Loss Classification Reductions," Papers 2002.05153, arXiv.org.
- Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.
- Michael C. Knaus, 2021.
"A double machine learning approach to estimate the effects of musical practice on student’s skills,"
Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
- Knaus, Michael C., 2018. "A Double Machine Learning Approach to Estimate the Effects of Musical Practice on Student's Skills," IZA Discussion Papers 11547, Institute of Labor Economics (IZA).
- Michael C. Knaus, 2018. "A Double Machine Learning Approach to Estimate the Effects of Musical Practice on Student's Skills," Papers 1805.10300, arXiv.org, revised Jan 2019.
- Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021.
"Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence,"
The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
- Knaus, Michael C. & Lechner, Michael & Strittmatter, Anthony, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," IZA Discussion Papers 12039, Institute of Labor Economics (IZA).
- Lechner, Michael & Knaus, Michael C. & Strittmatter, Anthony, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," CEPR Discussion Papers 13402, C.E.P.R. Discussion Papers.
- Knaus, Michael C. & Lechner, Michael & Strittmatter, Anthony, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," Economics Working Paper Series 1817, University of St. Gallen, School of Economics and Political Science.
- Michael C. Knaus & Michael Lechner & Anthony Strittmatter, 2018. "Machine Learning Estimation of Heterogeneous Causal Effects: Empirical Monte Carlo Evidence," Papers 1810.13237, arXiv.org, revised Dec 2018.
- Luo, Yu & Graham, Daniel J. & McCoy, Emma J., 2023. "Semiparametric Bayesian doubly robust causal estimation," LSE Research Online Documents on Economics 117944, London School of Economics and Political Science, LSE Library.
- Anders Bredahl Kock & Martin Thyrsgaard, 2017. "Optimal sequential treatment allocation," Papers 1705.09952, arXiv.org, revised Aug 2018.
- Michael C Knaus, 2022.
"Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation],"
The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
- Knaus, Michael C., 2020. "Double Machine Learning based Program Evaluation under Unconfoundedness," Economics Working Paper Series 2004, University of St. Gallen, School of Economics and Political Science.
- Knaus, Michael C., 2020. "Double Machine Learning Based Program Evaluation under Unconfoundedness," IZA Discussion Papers 13051, Institute of Labor Economics (IZA).
- Michael C. Knaus, 2020. "Double Machine Learning based Program Evaluation under Unconfoundedness," Papers 2003.03191, arXiv.org, revised Jun 2022.
- Susan Athey & Stefan Wager, 2021.
"Policy Learning With Observational Data,"
Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
- Susan Athey & Stefan Wager, 2017. "Policy Learning with Observational Data," Papers 1702.02896, arXiv.org, revised Sep 2020.
- Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves," Papers 2010.04855, arXiv.org, revised Oct 2022.
- Shosei Sakaguchi, 2024. "Policy Learning for Optimal Dynamic Treatment Regimes with Observational Data," Papers 2404.00221, arXiv.org, revised Dec 2024.
- Huber, Martin, 2019.
"An introduction to flexible methods for policy evaluation,"
FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
- Martin Huber, 2019. "An introduction to flexible methods for policy evaluation," Papers 1910.00641, arXiv.org.
- Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
- Nathan Kallus, 2022. "Treatment Effect Risk: Bounds and Inference," Papers 2201.05893, arXiv.org, revised Jul 2022.
- Davide Viviano & Jelena Bradic, 2020. "Fair Policy Targeting," Papers 2005.12395, arXiv.org, revised Jun 2022.
- Davide Viviano & Jess Rudder, 2020. "Policy design in experiments with unknown interference," Papers 2011.08174, arXiv.org, revised May 2024.
- Masahiro Kato & Masatoshi Uehara & Shota Yasui, 2020. "Off-Policy Evaluation and Learning for External Validity under a Covariate Shift," Papers 2002.11642, arXiv.org, revised Oct 2020.
- Shi, Chengchun & Luo, Shikai & Le, Yuan & Zhu, Hongtu & Song, Rui, 2022. "Statistically efficient advantage learning for offline reinforcement learning in infinite horizons," LSE Research Online Documents on Economics 115598, London School of Economics and Political Science, LSE Library.
- Mert Demirer & Vasilis Syrgkanis & Greg Lewis & Victor Chernozhukov, 2019.
"Semi-Parametric Efficient Policy Learning with Continuous Actions,"
CeMMAP working papers CWP34/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Mert Demirer & Vasilis Syrgkanis & Greg Lewis & Victor Chernozhukov, 2019. "Semi-Parametric Efficient Policy Learning with Continuous Actions," Papers 1905.10116, arXiv.org, revised Jul 2019.
- Masahiro Kato, 2020. "Confidence Interval for Off-Policy Evaluation from Dependent Samples via Bandit Algorithm: Approach from Standardized Martingales," Papers 2006.06982, arXiv.org.
- Julia Hatamyar & Noemi Kreif, 2023. "Policy Learning with Rare Outcomes," Papers 2302.05260, arXiv.org, revised Oct 2023.
More about this item
NEP fields
This paper has been announced in the following NEP Reports:
- NEP-GTH-2020-09-07 (Game Theory)
Handle: RePEc:arx:papers:2007.02141