
Comprehensive Causal Machine Learning

Author

Listed:
  • Michael Lechner
  • Jana Mareckova

Abstract

Uncovering causal effects at various levels of granularity provides substantial value to decision makers. Comprehensive machine learning approaches to causal effect estimation allow the use of a single causal machine learning approach for estimation and inference of causal mean effects at all levels of granularity. Focusing on selection-on-observables, this paper compares three such approaches: the modified causal forest (mcf), the generalized random forest (grf), and double machine learning (dml). It also provides proven theoretical guarantees for the mcf and compares the theoretical properties of the approaches. The findings indicate that dml-based methods excel for average treatment effects at the population level (ATE) and group level (GATE) with few groups, when selection into treatment is not too strong. However, for finer causal heterogeneity, explicitly outcome-centred forest-based approaches are superior. The mcf has three additional benefits: (i) it is the most robust estimator in cases when dml-based approaches underperform because of substantial selectivity; (ii) it is the best estimator for GATEs when the number of groups gets larger; and (iii) it is the only estimator that is internally consistent, in the sense that low-dimensional causal ATEs and GATEs are obtained as aggregates of finer-grained causal parameters.
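
The notion of internal consistency in point (iii) can be illustrated with a minimal sketch. It is not the mcf estimator itself: it assumes that unit-level effect estimates are already available, and it only shows the aggregation step, in which GATEs are group means of those estimates and the ATE is their overall mean. The function name `aggregate_effects`, the group labels, and the toy numbers are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def aggregate_effects(unit_effects, groups):
    """Aggregate unit-level effect estimates into GATEs and an ATE.

    Illustrative only: the unit-level estimates are taken as given
    (assumption); no causal estimation is performed here.
    """
    if len(unit_effects) != len(groups):
        raise ValueError("unit_effects and groups must have the same length")
    if not unit_effects:
        raise ValueError("at least one unit is required")

    # Collect the unit-level effects belonging to each group.
    by_group = defaultdict(list)
    for effect, group in zip(unit_effects, groups):
        by_group[group].append(effect)

    # GATE: mean of the unit-level effects within a group.
    gates = {g: sum(v) / len(v) for g, v in by_group.items()}

    # ATE: mean of all unit-level effects.
    n = len(unit_effects)
    ate = sum(unit_effects) / n

    # The same ATE recovered as the group-share-weighted mean of the GATEs.
    ate_from_gates = sum(len(v) / n * gates[g] for g, v in by_group.items())
    return gates, ate, ate_from_gates


# Hypothetical toy data: six units in two groups.
unit_effects = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = ["a", "a", "b", "b", "b", "b"]
gates, ate, ate_from_gates = aggregate_effects(unit_effects, groups)
print(gates, ate, ate_from_gates)
```

Because every coarser parameter here is an average of the same unit-level estimates, the ATE and the share-weighted mean of the GATEs coincide up to floating-point rounding. Estimators that fit the ATE, GATEs, and finer effects separately need not satisfy this identity.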

Suggested Citation

  • Michael Lechner & Jana Mareckova, 2024. "Comprehensive Causal Machine Learning," Papers 2405.10198, arXiv.org.
  • Handle: RePEc:arx:papers:2405.10198

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2405.10198
    File Function: Latest version
    Download Restriction: no


    Citations

    Cited by:

    1. Hugo Bodory & Federica Mascolo & Michael Lechner, 2024. "Enabling Decision-Making with the Modified Causal Forest: Policy Trees for Treatment Assignment," Papers 2406.02241, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    2. Michael Lechner & Jana Mareckova, 2022. "Modified Causal Forest," Papers 2209.03744, arXiv.org.
    3. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    4. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    5. Nora Bearth & Michael Lechner, 2024. "Causal Machine Learning for Moderation Effects," Papers 2401.08290, arXiv.org, revised Apr 2024.
    6. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    7. Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
    8. Gabriel Okasa, 2022. "Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance," Papers 2201.12692, arXiv.org.
    9. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    10. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Phillip Heiler & Michael C. Knaus, 2021. "Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments," Papers 2110.01427, arXiv.org, revised Aug 2023.
    12. Simon Calmar Andersen & Louise Beuchert & Phillip Heiler & Helena Skyt Nielsen, 2023. "A Guide to Impact Evaluation under Sample Selection and Missing Data: Teacher's Aides and Adolescent Mental Health," Papers 2308.04963, arXiv.org.
    13. Goller, Daniel & Harrer, Tamara & Lechner, Michael & Wolff, Joachim, 2021. "Active labour market policies for the long-term unemployed: New evidence from causal machine learning," Economics Working Paper Series 2108, University of St. Gallen, School of Economics and Political Science.
    14. Strittmatter, Anthony, 2023. "What is the value added by using causal machine learning methods in a welfare experiment evaluation?," Labour Economics, Elsevier, vol. 84(C).
    15. Paul Clarke & Annalivia Polselli, 2023. "Double Machine Learning for Static Panel Models with Fixed Effects," Papers 2312.08174, arXiv.org, revised Sep 2024.
    16. Athey, Susan & Imbens, Guido W. & Metzger, Jonas & Munro, Evan, 2024. "Using Wasserstein Generative Adversarial Networks for the design of Monte Carlo simulations," Journal of Econometrics, Elsevier, vol. 240(2).
    17. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    18. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    19. Chunrong Ai & Oliver Linton & Kaiji Motegi & Zheng Zhang, 2021. "A unified framework for efficient estimation of general treatment models," Quantitative Economics, Econometric Society, vol. 12(3), pages 779-816, July.
    20. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
