
Federated Offline Policy Learning

Author

Listed:
  • Aldo Gael Carranza
  • Susan Athey

Abstract

We consider the problem of learning personalized decision policies from observational bandit feedback data across multiple heterogeneous data sources. We introduce a novel regret analysis that establishes finite-sample upper bounds on two distinct notions of regret: global regret for all data sources in aggregate, and local regret for any given data source. We characterize these regret bounds through expressions of source heterogeneity and distribution shift. Moreover, we examine the practical considerations of this problem in the federated setting, where a central server aims to train a policy on data distributed across the heterogeneous sources without collecting any of their raw data. We present a policy learning algorithm amenable to federation, based on the aggregation of local policies trained with doubly robust offline policy evaluation strategies. Our analysis and supporting experimental results provide insights into the tradeoffs in the participation of heterogeneous data sources in offline policy learning.
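
The two ingredients named in the abstract, doubly robust offline policy evaluation and aggregation of locally trained policies, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the array layout, and the sample-size weighting used for aggregation are all assumptions made for illustration. The scores follow the standard augmented inverse propensity weighted (AIPW) form, which combines an outcome-model prediction with a propensity-weighted residual for the action actually logged.

```python
import numpy as np


def doubly_robust_scores(actions, rewards, propensities, mu_hat):
    """Standard AIPW scores for a discrete action set (hypothetical helper).

    actions:      (n,) integer array, logged action for each sample
    rewards:      (n,) float array, observed reward for the logged action
    propensities: (n, k) array, logging probability of each action
    mu_hat:       (n, k) array, outcome-model prediction for each action
    """
    n, k = mu_hat.shape
    observed = np.zeros((n, k))
    observed[np.arange(n), actions] = 1.0  # indicator of the logged action
    residual = rewards[:, None] - mu_hat
    # Model prediction plus inverse-propensity-weighted correction.
    return mu_hat + observed * residual / propensities


def policy_value(scores, chosen_actions):
    """Estimated value of a policy: mean score of the actions it selects."""
    return scores[np.arange(len(chosen_actions)), chosen_actions].mean()


def aggregate_parameters(local_params, sample_sizes):
    """Sample-size-weighted average of local policy parameters.

    The weighting scheme is an assumption of this sketch, not taken from
    the paper; only parameters, never raw data, reach the server.
    """
    weights = np.asarray(sample_sizes, dtype=float)
    weights = weights / weights.sum()
    return sum(w * p for w, p in zip(weights, local_params))
```

In this sketch, each data source would compute its own scores and fit a local policy against them, and the server would call `aggregate_parameters` on the resulting parameter vectors only.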

Suggested Citation

  • Aldo Gael Carranza & Susan Athey, 2023. "Federated Offline Policy Learning," Papers 2305.12407, arXiv.org, revised Oct 2024.
  • Handle: RePEc:arx:papers:2305.12407

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2305.12407
    File Function: Latest version
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Henrika Langen & Martin Huber, 2022. "How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign," Papers 2204.10820, arXiv.org, revised Jun 2022.
    2. Augustine Denteh & Helge Liebert, 2022. "Who Increases Emergency Department Use? New Insights from the Oregon Health Insurance Experiment," Papers 2201.07072, arXiv.org, revised Apr 2023.
    3. Alejandro Sanchez-Becerra, 2023. "Robust inference for the treatment effect variance in experiments using machine learning," Papers 2306.03363, arXiv.org.
    4. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    5. Davide Viviano & Jess Rudder, 2020. "Policy design in experiments with unknown interference," Papers 2011.08174, arXiv.org, revised May 2024.
    6. Susan Athey & Undral Byambadalai & Vitor Hadad & Sanath Kumar Krishnamurthy & Weiwen Leung & Joseph Jay Williams, 2022. "Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning," Papers 2211.12004, arXiv.org.
    7. Chunrong Ai & Yue Fang & Haitian Xie, 2024. "Data-driven Policy Learning for Continuous Treatments," Papers 2402.02535, arXiv.org, revised Nov 2024.
    8. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Yi Zhang & Kosuke Imai, 2023. "Individualized Policy Evaluation and Learning under Clustered Network Interference," Papers 2311.02467, arXiv.org, revised Feb 2024.
    10. Manski, Charles F., 2023. "Probabilistic prediction for binary treatment choice: With focus on personalized medicine," Journal of Econometrics, Elsevier, vol. 234(2), pages 647-663.
    11. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    12. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    13. Bokelmann, Björn & Lessmann, Stefan, 2024. "Improving uplift model evaluation on randomized controlled trial data," European Journal of Operational Research, Elsevier, vol. 313(2), pages 691-707.
    14. Garbero, Alessandra & Sakos, Grayson & Cerulli, Giovanni, 2023. "Towards data-driven project design: Providing optimal treatment rules for development projects," Socio-Economic Planning Sciences, Elsevier, vol. 89(C).
    15. Undral Byambadalai, 2022. "Identification and Inference for Welfare Gains without Unconfoundedness," Papers 2207.04314, arXiv.org.
    16. Black, Dan A. & Grogger, Jeffrey & Kirchmaier, Tom & Sanders, Koen, 2023. "Criminal charges, risk assessment and violent recidivism in cases of domestic abuse," LSE Research Online Documents on Economics 121374, London School of Economics and Political Science, LSE Library.
    17. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    18. Yuchen Hu & Henry Zhu & Emma Brunskill & Stefan Wager, 2024. "Minimax-Regret Sample Selection in Randomized Experiments," Papers 2403.01386, arXiv.org, revised Jun 2024.
    19. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    20. Jonas Metzger, 2022. "Adversarial Estimators," Papers 2204.10495, arXiv.org, revised Jun 2022.
