Printed from https://ideas.repec.org/p/arx/papers/2409.01266.html

Double Machine Learning meets Panel Data -- Promises, Pitfalls, and Potential Solutions

Author

Listed:
  • Jonathan Fuhr
  • Dominik Papies

Abstract

Estimating causal effects using machine learning (ML) algorithms can help to relax functional form assumptions if used within appropriate frameworks. However, most of these frameworks assume settings with cross-sectional data, whereas researchers often have access to panel data, which in traditional methods helps to deal with unobserved heterogeneity between units. In this paper, we explore how we can adapt double/debiased machine learning (DML) (Chernozhukov et al., 2018) for panel data in the presence of unobserved heterogeneity. This adaptation is challenging because DML's cross-fitting procedure assumes independent data and the unobserved heterogeneity is not necessarily additively separable in settings with nonlinear observed confounding. We assess the performance of several intuitively appealing estimators in a variety of simulations. While we find violations of the cross-fitting assumptions to be largely inconsequential for the accuracy of the effect estimates, many of the considered methods fail to adequately account for the presence of unobserved heterogeneity. However, we find that using predictive models based on the correlated random effects approach (Mundlak, 1978) within DML leads to accurate coefficient estimates across settings, given a sample size that is large relative to the number of observed confounders. We also show that the influence of the unobserved heterogeneity on the observed confounders plays a significant role in the performance of most alternative methods.
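The idea of combining DML with a correlated random effects device can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a partially linear model, assumes that the unobserved heterogeneity enters only through each unit's time-averaged covariates (a Mundlak-style assumption), assigns cross-fitting folds by unit rather than by row, and uses a closed-form ridge regression as a stand-in for an ML learner. All function names (`add_unit_means`, `ridge_predict`, `dml_plr_cre`, `simulate_panel`) and the simulated data-generating process are hypothetical.

```python
import numpy as np


def add_unit_means(X, unit_ids):
    """Append each unit's time-averaged covariates to its rows (Mundlak-style)."""
    _, inv = np.unique(unit_ids, return_inverse=True)
    counts = np.bincount(inv)
    sums = np.zeros((counts.size, X.shape[1]))
    np.add.at(sums, inv, X)  # per-unit column sums
    means = sums / counts[:, None]
    return np.hstack([X, means[inv]])


def ridge_predict(X_train, y_train, X_test, penalty=1e-3):
    """Stand-in learner (assumption): closed-form ridge regression with intercept."""
    x_bar, y_bar = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_bar
    gram = Xc.T @ Xc + penalty * np.eye(Xc.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ (y_train - y_bar))
    return y_bar + (X_test - x_bar) @ coef


def dml_plr_cre(y, d, X, unit_ids, n_folds=5, seed=0):
    """Partialling-out DML estimate of the effect of d on y.

    Nuisance models see the covariates plus their unit means; folds are
    assigned per unit so a unit's rows never straddle train and test.
    """
    Z = add_unit_means(X, unit_ids)
    units, inv = np.unique(unit_ids, return_inverse=True)
    rng = np.random.default_rng(seed)
    unit_fold = rng.permutation(units.size) % n_folds
    fold = unit_fold[inv]
    y_res = np.empty(len(y))
    d_res = np.empty(len(d))
    for k in range(n_folds):
        test, train = fold == k, fold != k
        # Out-of-fold residuals for outcome and treatment
        y_res[test] = y[test] - ridge_predict(Z[train], y[train], Z[test])
        d_res[test] = d[test] - ridge_predict(Z[train], d[train], Z[test])
    # Residual-on-residual regression gives the effect estimate
    return float(d_res @ y_res / (d_res @ d_res))


def simulate_panel(n_units=300, n_periods=5, theta=0.5, seed=1):
    """Toy panel (assumption): heterogeneity is a linear function of unit means."""
    rng = np.random.default_rng(seed)
    unit_ids = np.repeat(np.arange(n_units), n_periods)
    X = rng.normal(size=(n_units * n_periods, 3))
    x_means = add_unit_means(X, unit_ids)[:, 3:]
    alpha = x_means @ np.array([2.0, -1.0, 1.5])  # unit-level heterogeneity
    d = X @ np.array([0.5, -0.3, 0.2]) + alpha + rng.normal(size=len(X))
    y = theta * d + X @ np.array([1.0, 0.5, -0.5]) + alpha + rng.normal(size=len(X))
    return y, d, X, unit_ids
```

Under these assumptions, `dml_plr_cre(*simulate_panel())` should land close to the simulated effect of 0.5. If the heterogeneity also depends on something other than the covariate means, this simple augmentation would not be expected to remove the resulting bias.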

Suggested Citation

  • Jonathan Fuhr & Dominik Papies, 2024. "Double Machine Learning meets Panel Data -- Promises, Pitfalls, and Potential Solutions," Papers 2409.01266, arXiv.org.
  • Handle: RePEc:arx:papers:2409.01266

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2409.01266
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Harold D. Chiang & Kengo Kato & Yukun Ma & Yuya Sasaki, 2022. "Multiway Cluster Robust Double/Debiased Machine Learning," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1046-1056, June.
    2. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    3. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    4. Neng-Chieh Chang, 2020. "Double/debiased machine learning for difference-in-differences models," The Econometrics Journal, Royal Economic Society, vol. 23(2), pages 177-191.
    5. Hugo Bodory & Martin Huber & Lukáš Lafférs, 2022. "Evaluating (weighted) dynamic treatment effects by double machine learning," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 628-648.
    6. Molei Liu & Yi Zhang & Doudou Zhou, 2021. "Double/debiased machine learning for logistic partially linear model," The Econometrics Journal, Royal Economic Society, vol. 24(3), pages 559-588.
    7. Chamberlain, Gary, 1982. "Multivariate regression models for panel data," Journal of Econometrics, Elsevier, vol. 18(1), pages 5-46, January.
    8. Victor Chernozhukov & Whitney K. Newey & Rahul Singh, 2022. "Automatic Debiased Machine Learning of Causal and Structural Effects," Econometrica, Econometric Society, vol. 90(3), pages 967-1027, May.
    9. Brett R. Gordon & Robert Moakler & Florian Zettelmeyer, 2022. "Close Enough? A Large-Scale Exploration of Non-Experimental Approaches to Advertising Measurement," Papers 2201.07055, arXiv.org, revised Oct 2022.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jonathan Fuhr & Philipp Berens & Dominik Papies, 2024. "Estimating Causal Effects with Double Machine Learning -- A Method Evaluation," Papers 2403.14385, arXiv.org, revised Apr 2024.
    2. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    3. Paul Clarke & Annalivia Polselli, 2023. "Double Machine Learning for Static Panel Models with Fixed Effects," Papers 2312.08174, arXiv.org, revised Sep 2024.
    4. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Soren Blomquist & Anil Kumar & Che-Yuan Liang & Whitney K. Newey, 2022. "Nonlinear Budget Set Regressions for the Random Utility Model," Working Papers 2219, Federal Reserve Bank of Dallas.
    6. Guo, Jiaqi & Wang, Qiang & Li, Rongrong, 2024. "Can official development assistance promote renewable energy in sub-Saharan Africa countries? A matter of institutional transparency of recipient countries," Energy Policy, Elsevier, vol. 186(C).
    7. Oyenubi, Adeola & Kollamparambil, Umakrishnan, 2023. "Does noncompliance with COVID-19 regulations impact the depressive symptoms of others?," Economic Modelling, Elsevier, vol. 120(C).
    8. Juan Carlos Escanciano & Telmo Pérez-Izquierdo, 2023. "Automatic Locally Robust Estimation with Generated Regressors," Papers 2301.10643, arXiv.org, revised Nov 2023.
    9. Sander Gerritsen & Mark Kattenberg & Sonny Kuijpers, 2019. "The impact of age at arrival on education and mental health," CPB Discussion Paper 389.rdf, CPB Netherlands Bureau for Economic Policy Analysis.
    10. Martin Huber & Eva-Maria Oeß, 2024. "A joint test of unconfoundedness and common trends," Papers 2404.16961, arXiv.org, revised Jun 2024.
    11. Sander Gerritsen & Mark Kattenberg & Sonny Kuijpers, 2019. "The impact of age at arrival on education and mental health," CPB Discussion Paper 389, CPB Netherlands Bureau for Economic Policy Analysis.
    12. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    13. Yuya Sasaki & Takuya Ura & Yichong Zhang, 2022. "Unconditional quantile regression with high‐dimensional data," Quantitative Economics, Econometric Society, vol. 13(3), pages 955-978, July.
    14. Esfandiar Maasoumi & Jianqiu Wang & Zhuo Wang & Ke Wu, 2024. "Identifying factors via automatic debiased machine learning," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(3), pages 438-461, April.
    15. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    16. Henrika Langen & Martin Huber, 2022. "How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign," Papers 2204.10820, arXiv.org, revised Jun 2022.
    17. Nan Liu & Yanbo Liu & Yuya Sasaki, 2024. "Estimation and Inference for Causal Functions with Multiway Clustered Data," Papers 2409.06654, arXiv.org.
    18. Brett R. Gordon & Robert Moakler & Florian Zettelmeyer, 2023. "Predictive Incrementality by Experimentation (PIE) for Ad Measurement," Papers 2304.06828, arXiv.org.
    19. Alejandro Sanchez-Becerra, 2023. "Robust inference for the treatment effect variance in experiments using machine learning," Papers 2306.03363, arXiv.org.
    20. Zequn Jin & Lihua Lin & Zhengyu Zhang, 2022. "Identification and Auto-debiased Machine Learning for Outcome Conditioned Average Structural Derivatives," Papers 2211.07903, arXiv.org.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.