
CATE meets ML: Conditional average treatment effect and machine learning

Author

Listed:
  • Jacob, Daniel

Abstract

For treatment effects - one of the core issues in modern econometric analysis - prediction and estimation are two sides of the same coin. As it turns out, machine learning methods are the tool for generalized prediction models. Combined with econometric theory, they allow us to estimate not only the average treatment effect but also a personalized one - the conditional average treatment effect (CATE). In this tutorial, we give an overview of novel methods, explain them in detail, and apply them via Quantlets in real data applications. As two empirical examples, we study the effect that microcredit availability has on the amount of money borrowed and whether 401(k) pension plan eligibility has an impact on net financial assets. The presented toolbox of methods contains metalearners, such as the Doubly-Robust learner and the R-, T- and X-learners, as well as methods that are specially designed to estimate the CATE, such as the causal BART and the generalized random forest. In both the microcredit and the 401(k) examples, we find a positive treatment effect for all observations but diverse evidence of treatment effect heterogeneity. An additional simulation study, where the true treatment effect is known, allows us to compare the different methods and to observe patterns and similarities.
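As a rough illustration of the metalearner idea named in the abstract, the following is a minimal sketch of a T-learner on simulated data where the true CATE is known. It is not the paper's code: the data-generating process, the function names (`simulate`, `fit_linear`, `t_learner_cate`), and the use of ordinary least squares in place of a machine learning base learner are all assumptions made to keep the example short and self-contained.

```python
import numpy as np


def simulate(n=2000, seed=0):
    """Simulated data with a known CATE (hypothetical design, not from the paper)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)       # single covariate
    d = rng.integers(0, 2, size=n)           # randomized binary treatment
    tau = 1.0 + 2.0 * x                      # true CATE, assumed linear in x
    y = 0.5 * x + d * tau + rng.normal(0.0, 0.1, size=n)
    return x, d, y, tau


def fit_linear(x, y):
    """Least-squares fit of y on (1, x); stands in for any ML regressor."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef, x):
    return coef[0] + coef[1] * x


def t_learner_cate(x, d, y):
    """T-learner: fit separate outcome models on treated and control units,
    then take the difference of their predictions as the CATE estimate."""
    coef_treated = fit_linear(x[d == 1], y[d == 1])
    coef_control = fit_linear(x[d == 0], y[d == 0])
    return predict_linear(coef_treated, x) - predict_linear(coef_control, x)


x, d, y, tau = simulate()
tau_hat = t_learner_cate(x, d, y)

# Because the true effect is known in the simulation, the estimate can be scored.
rmse = float(np.sqrt(np.mean((tau_hat - tau) ** 2)))
ate_hat = float(np.mean(tau_hat))
print(f"RMSE of CATE estimate: {rmse:.4f}")
print(f"Estimated average effect: {ate_hat:.4f}")
```

In this sketch treatment is randomized, so no propensity model is needed; with observational data, the other learners listed in the abstract (for example the Doubly-Robust or R-learner) additionally rely on an estimated propensity score.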

Suggested Citation

  • Jacob, Daniel, 2021. "CATE meets ML: Conditional average treatment effect and machine learning," IRTG 1792 Discussion Papers 2021-005, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
  • Handle: RePEc:zbw:irtgdp:2021005

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/233509/1/1755344805.pdf
    Download Restriction: no


    Citations


    Cited by:

    1. Konstantin Häusler & Hongyu Xia, 2022. "Indices on cryptocurrencies: an evaluation," Digital Finance, Springer, vol. 4(2), pages 149-167, September.
    2. Vinish Shrestha, 2024. "Heterogeneous Impacts of ACA-Medicaid Expansion on Insurance and Labor Market Outcomes in the American South," Working Papers 2024-08, Towson University, Department of Economics, revised Jun 2024.
    3. Kushal S. Shah & Haoda Fu & Michael R. Kosorok, 2023. "Stabilized direct learning for efficient estimation of individualized treatment rules," Biometrics, The International Biometric Society, vol. 79(4), pages 2843-2856, December.
    4. repec:ags:aaea22:335586 is not listed on IDEAS
    5. Aaron Baird & Yusen Xia, 2024. "Precision Digital Health," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 66(3), pages 261-271, June.
    6. Olga Takács & János Vincze, 2023. "Heterogeneous wage structure effects: a partial European East-West comparison," CERS-IE WORKING PAPERS 2305, Institute of Economics, Centre for Economic and Regional Studies.
    7. Krantz, Sebastian, 2024. "Mapping Africa's infrastructure potential with geospatial big data and causal ML," Kiel Working Papers 2276, Kiel Institute for the World Economy (IfW Kiel).
    8. Gabriel Okasa, 2022. "Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance," Papers 2201.12692, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mireille E. Schnitzer & Erica E.M. Moodie & Mark J. van der Laan & Robert W. Platt & Marina B. Klein, 2014. "Modeling the impact of hepatitis C viral clearance on end-stage liver disease in an HIV co-infected cohort with targeted maximum likelihood estimation," Biometrics, The International Biometric Society, vol. 70(1), pages 144-152, March.
    2. repec:jss:jstsof:43:i13 is not listed on IDEAS
    3. Daniel Jacob, 2021. "CATE meets ML," Digital Finance, Springer, vol. 3(2), pages 99-148, June.
    4. Sara E Moore & Anna Decker & Alan Hubbard & Rachael A Callcut & Erin E Fox & Deborah J del Junco & John B Holcomb & Mohammad H Rahbar & Charles E Wade & Martin A Schreiber & Louis H Alarcon & Karen J , 2015. "Statistical Machines for Trauma Hospital Outcomes Research: Application to the PRospective, Observational, Multi-Center Major Trauma Transfusion (PROMMTT) Study," PLOS ONE, Public Library of Science, vol. 10(8), pages 1-16, August.
    5. Jason Roy & Kirsten J. Lum & Bret Zeldow & Jordan D. Dworkin & Vincent Lo Re & Michael J. Daniels, 2018. "Bayesian nonparametric generative models for causal inference with missing at random covariates," Biometrics, The International Biometric Society, vol. 74(4), pages 1193-1202, December.
    6. Rachael V. Phillips & Mark J. van der Laan, 2022. "Rachael V. Phillips and Mark J. van der Laan’s contribution to the Discussion of ‘Assumption‐lean inference for generalised linear model parameters’ by Vansteelandt and Dukes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(3), pages 717-718, July.
    7. Brathwaite, Timothy & Walker, Joan L., 2018. "Causal inference in travel demand modeling (and the lack thereof)," Journal of choice modelling, Elsevier, vol. 26(C), pages 1-18.
    8. Guo, Xu & Fang, Yun & Zhu, Xuehu & Xu, Wangli & Zhu, Lixing, 2018. "Semiparametric double robust and efficient estimation for mean functionals with response missing at random," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 325-339.

    More about this item

    Keywords

    Causal Inference; CATE; Machine Learning; Tutorial;

    JEL classification:

    • C00 - Mathematical and Quantitative Methods - General - General
