
Offline Multi-Action Policy Learning: Generalization and Optimization

Author

Listed:
  • Zhengyuan Zhou (Stern School of Business, New York University)
  • Susan Athey (Graduate School of Business, Stanford University)
  • Stefan Wager (Graduate School of Business, Stanford University)

Abstract

In many settings, a decision maker wishes to learn a rule, or policy, that maps from observable characteristics of an individual to an action. Examples include selecting offers, prices, advertisements, or emails to send to consumers, choosing a bid to submit in a contextual first-price auction, and determining which medication to prescribe to a patient. In this paper, we study the offline multi-action policy learning problem with observational data, where the policy may need to respect budget constraints or belong to a restricted policy class such as decision trees. By using the standard augmented inverse propensity weight estimator, we design and implement a policy learning algorithm that achieves asymptotically minimax-optimal regret. To the best of our knowledge, this is the first result of this type in the multi-action setup, and it provides a substantial performance improvement over the existing learning algorithms. We then consider additional computational challenges that arise in implementing our method for the case where the policy is restricted to take the form of a decision tree. We propose two different approaches: one using a mixed integer program formulation and the other using a tree-search-based algorithm.
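
As a rough illustration of the approach the abstract describes, the following Python sketch computes augmented inverse propensity weighted (AIPW) scores for several actions and then searches a very small policy class for the rule with the highest estimated value. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`aipw_scores`, `policy_value`, `best_stump`), the toy data, and the outcome and propensity estimates are all hypothetical, and the exhaustive search over depth-1 trees on a single feature only stands in for the mixed integer program and tree-search algorithms mentioned in the abstract.

```python
def aipw_scores(actions, rewards, propensities, mu_hat):
    """Per-sample AIPW score vectors.

    score[i][a] = mu_hat[i][a]
                  + 1{actions[i] == a} * (rewards[i] - mu_hat[i][a]) / propensities[i][a]

    mu_hat and propensities are assumed to be supplied by some earlier
    estimation step (hypothetical here); this function only combines them.
    """
    scores = []
    for a_i, y_i, e_i, m_i in zip(actions, rewards, propensities, mu_hat):
        row = list(m_i)                          # start from the outcome-model estimates
        row[a_i] += (y_i - m_i[a_i]) / e_i[a_i]  # correct only the observed action
        scores.append(row)
    return scores


def policy_value(scores, assigned):
    """Estimated value of a policy: the average score of the actions it assigns."""
    return sum(row[a] for row, a in zip(scores, assigned)) / len(scores)


def best_stump(xs, scores, n_actions):
    """Exhaustive search over depth-1 trees on one scalar feature.

    Each candidate policy plays `left` when x <= t and `right` otherwise.
    Returns (estimated value, threshold, left action, right action).
    """
    best = (float("-inf"), None, None, None)
    for t in sorted(set(xs)):
        for left in range(n_actions):
            for right in range(n_actions):
                assigned = [left if x <= t else right for x in xs]
                value = policy_value(scores, assigned)
                if value > best[0]:              # ties keep the first candidate found
                    best = (value, t, left, right)
    return best


# Toy data (hypothetical): four individuals, one feature, three actions.
xs = [0.1, 0.2, 0.8, 0.9]
actions = [0, 1, 2, 2]                           # action observed for each individual
rewards = [1.0, 0.0, 2.0, 1.0]                   # outcome observed for each individual
propensities = [[0.5, 0.25, 0.25] for _ in xs]   # assumed logging probabilities
mu_hat = [[0.5, 0.0, 1.0] for _ in xs]           # assumed outcome-model estimates

scores = aipw_scores(actions, rewards, propensities, mu_hat)
value, threshold, left_action, right_action = best_stump(xs, scores, n_actions=3)
print(value, threshold, left_action, right_action)
```

The score for an unobserved action is just the outcome-model estimate, while the observed action also receives an inverse-propensity-weighted residual. Maximizing the average score over a restricted class is then an ordinary, if combinatorial, optimization problem, which is why the choice of search procedure matters for deeper trees.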

Suggested Citation

  • Zhengyuan Zhou & Susan Athey & Stefan Wager, 2023. "Offline Multi-Action Policy Learning: Generalization and Optimization," Operations Research, INFORMS, vol. 71(1), pages 148-183, January.
  • Handle: RePEc:inm:oropre:v:71:y:2023:i:1:p:148-183
    DOI: 10.1287/opre.2022.2271

    References listed on IDEAS

    1. Yingqi Zhao & Donglin Zeng & A. John Rush & Michael R. Kosorok, 2012. "Estimating Individualized Treatment Rules Using Outcome Weighted Learning," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(499), pages 1106-1118, September.
    2. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    3. Toru Kitagawa & Aleksey Tetenov, 2018. "Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice," Econometrica, Econometric Society, vol. 86(2), pages 591-616, March.
    4. Newey, Whitney K, 1994. "The Asymptotic Variance of Semiparametric Estimators," Econometrica, Econometric Society, vol. 62(6), pages 1349-1382, November.
    5. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    6. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    7. Charles F. Manski, 2004. "Statistical Treatment Rules for Heterogeneous Populations," Econometrica, Econometric Society, vol. 72(4), pages 1221-1246, July.
    8. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    9. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    10. Xin Zhou & Nicole Mayer-Hamblett & Umer Khan & Michael R. Kosorok, 2017. "Residual Weighted Learning for Estimating Individualized Treatment Rules," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(517), pages 169-187, January.
    11. Guido W. Imbens, 2004. "Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review," The Review of Economics and Statistics, MIT Press, vol. 86(1), pages 4-29, February.
    12. Athey, Susan & Wager, Stefan, 2017. "Efficient Policy Learning," Research Papers 3506, Stanford University, Graduate School of Business.
    13. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, October.
    14. Daniel Russo & Benjamin Van Roy, 2014. "Learning to Optimize via Posterior Sampling," Mathematics of Operations Research, INFORMS, vol. 39(4), pages 1221-1243, November.

    Citations



    Cited by:

    1. Martin Huber, 2019. "An introduction to flexible methods for policy evaluation," Papers 1910.00641, arXiv.org.
    2. Yi Zhang & Eli Ben-Michael & Kosuke Imai, 2022. "Safe Policy Learning under Regression Discontinuity Designs with Multiple Cutoffs," Papers 2208.13323, arXiv.org, revised Sep 2024.
    3. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.
    4. Cockx, Bart & Lechner, Michael & Bollens, Joost, 2023. "Priority to unemployed immigrants? A causal machine learning evaluation of training in Belgium," Labour Economics, Elsevier, vol. 80(C).
    5. Zhaonan Qu & Isabella Qian & Zhengyuan Zhou, 2020. "Interpretable Personalization via Policy Learning with Linear Decision Boundaries," Papers 2003.07545, arXiv.org, revised Nov 2022.
    6. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    7. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.
    8. Daniel Goller & Tamara Harrer & Michael Lechner & Joachim Wolff, 2021. "Active labour market policies for the long-term unemployed: New evidence from causal machine learning," Papers 2106.10141, arXiv.org, revised May 2023.
    9. Weibin Mo & Yufeng Liu, 2022. "Efficient learning of optimal individualized treatment rules for heteroscedastic or misspecified treatment‐free effect models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 440-472, April.
    10. Masahiro Kato, 2020. "Confidence Interval for Off-Policy Evaluation from Dependent Samples via Bandit Algorithm: Approach from Standardized Martingales," Papers 2006.06982, arXiv.org.
    11. Kitagawa, Toru & Wang, Guanyi, 2023. "Who should get vaccinated? Individualized allocation of vaccines over SIR network," Journal of Econometrics, Elsevier, vol. 232(1), pages 109-131.
    12. Ruohan Zhan & Zhimei Ren & Susan Athey & Zhengyuan Zhou, 2021. "Policy Learning with Adaptively Collected Data," Papers 2105.02344, arXiv.org, revised Nov 2022.
    13. Tobias Cagala & Ulrich Glogowsky & Johannes Rincke & Anthony Strittmatter, 2021. "Optimal Targeting in Fundraising: A Causal Machine-Learning Approach," Papers 2103.10251, arXiv.org, revised Sep 2021.
    14. Jann Spiess & Vasilis Syrgkanis & Victor Yaneng Wang, 2021. "Finding Subgroups with Significant Treatment Effects," Papers 2103.07066, arXiv.org, revised Dec 2023.
    15. Shosei Sakaguchi, 2021. "Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints," Papers 2106.05031, arXiv.org, revised Aug 2024.
    16. Garbero, Alessandra & Sakos, Grayson & Cerulli, Giovanni, 2023. "Towards data-driven project design: Providing optimal treatment rules for development projects," Socio-Economic Planning Sciences, Elsevier, vol. 89(C).
    17. Susan Athey & Raj Chetty & Guido Imbens, 2020. "Combining Experimental and Observational Data to Estimate Treatment Effects on Long Term Outcomes," Papers 2006.09676, arXiv.org.
    18. Robert Donnelly & Francisco J.R. Ruiz & David Blei & Susan Athey, 2021. "Counterfactual inference for consumer choice across many product categories," Quantitative Marketing and Economics (QME), Springer, vol. 19(3), pages 369-407, December.
    19. Danielle Li & Lindsey R. Raymond & Peter Bergman, 2020. "Hiring as Exploration," NBER Working Papers 27736, National Bureau of Economic Research, Inc.
    20. Tobias Cagala & Ulrich Glogowsky & Johannes Rincke & Anthony Strittmatter, 2021. "Optimal Targeting in Fundraising: A Machine-Learning Approach," Economics working papers 2021-08, Department of Economics, Johannes Kepler University Linz, Austria.
    21. Toru Kitagawa & Guanyi Wang, 2020. "Who Should Get Vaccinated? Individualized Allocation of Vaccines Over SIR Network," Papers 2012.04055, arXiv.org, revised Jul 2021.
    22. Toru Kitagawa & Guanyi Wang, 2020. "Who should get vaccinated? Individualized allocation of vaccines over SIR network," CeMMAP working papers CWP59/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    2. Davide Viviano & Jelena Bradic, 2020. "Fair Policy Targeting," Papers 2005.12395, arXiv.org, revised Jun 2022.
    3. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    5. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    6. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    7. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    8. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    10. Davide Viviano, 2019. "Policy Targeting under Network Interference," Papers 1906.10258, arXiv.org, revised Apr 2024.
    11. Mert Demirer & Vasilis Syrgkanis & Greg Lewis & Victor Chernozhukov, 2019. "Semi-Parametric Efficient Policy Learning with Continuous Actions," CeMMAP working papers CWP34/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    13. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    14. Weibin Mo & Yufeng Liu, 2022. "Efficient learning of optimal individualized treatment rules for heteroscedastic or misspecified treatment‐free effect models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 440-472, April.
    15. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    16. Alejandro Sanchez-Becerra, 2023. "Robust inference for the treatment effect variance in experiments using machine learning," Papers 2306.03363, arXiv.org.
    17. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    18. Vira Semenova, 2020. "Generalized Lee Bounds," Papers 2008.12720, arXiv.org, revised Feb 2023.
    19. Nathan Kallus, 2022. "Treatment Effect Risk: Bounds and Inference," Papers 2201.05893, arXiv.org, revised Jul 2022.
    20. Michael Pollmann, 2020. "Causal Inference for Spatial Treatments," Papers 2011.00373, arXiv.org, revised Jan 2023.
