
Robust Machine Learning for Treatment Effects in Multilevel Observational Studies Under Cluster-level Unmeasured Confounding

Author

Listed:
  • Youmi Suk

    (University of Virginia)

  • Hyunseung Kang

    (University of Wisconsin-Madison)

Abstract

Recently, machine learning (ML) methods have been used in causal inference to estimate treatment effects in order to reduce concerns about model misspecification. However, many ML methods require that all confounders be measured in order to estimate treatment effects consistently. In this paper, we propose a family of ML methods that estimate treatment effects in the presence of cluster-level unmeasured confounders, that is, unmeasured confounders that are shared within each cluster and are common in multilevel observational studies. We show through simulation studies that our proposed methods are robust to bias from unmeasured cluster-level confounders in a variety of multilevel observational studies. We also use our methods to examine the effect of taking an algebra course on math achievement scores in the Early Childhood Longitudinal Study, a multilevel observational educational study. The proposed methods are available in the CURobustML R package.
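
To make the idea of a cluster-level unmeasured confounder concrete, the following is a minimal simulation sketch in Python. It is not the authors' method and does not use the CURobustML package. It swaps in the classical within-cluster demeaning (fixed-effects) estimator purely for illustration. The function names, the data-generating process, and all numeric settings (cluster counts, coefficients, the true effect of 2.0) are assumptions made for this example.

```python
import math
import random
from collections import defaultdict
from statistics import mean


def simulate(n_clusters=200, cluster_size=30, true_effect=2.0, seed=0):
    """Simulate (cluster_id, treatment, outcome) rows.

    Hypothetical data-generating process: each cluster has a hidden
    confounder u that raises both the chance of treatment and the outcome.
    """
    rng = random.Random(seed)
    rows = []
    for cluster_id in range(n_clusters):
        u = rng.gauss(0.0, 1.0)  # unmeasured, shared by the whole cluster
        p_treat = 1.0 / (1.0 + math.exp(-1.5 * u))
        for _ in range(cluster_size):
            t = 1 if rng.random() < p_treat else 0
            y = true_effect * t + 3.0 * u + rng.gauss(0.0, 1.0)
            rows.append((cluster_id, t, y))
    return rows


def naive_estimate(rows):
    """Difference in mean outcomes, ignoring clusters (confounded by u)."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def within_cluster_estimate(rows):
    """OLS slope of outcome on treatment after demeaning both within cluster.

    Demeaning subtracts everything that is constant within a cluster,
    including the unmeasured u, so u can no longer bias the slope.
    """
    by_cluster = defaultdict(list)
    for cluster_id, t, y in rows:
        by_cluster[cluster_id].append((t, y))
    num = 0.0
    den = 0.0
    for members in by_cluster.values():
        t_bar = mean(t for t, _ in members)
        y_bar = mean(y for _, y in members)
        for t, y in members:
            num += (t - t_bar) * (y - y_bar)
            den += (t - t_bar) ** 2
    return num / den


if __name__ == "__main__":
    data = simulate()
    print("naive estimate:         ", round(naive_estimate(data), 3))
    print("within-cluster estimate:", round(within_cluster_estimate(data), 3))
```

In this simulated setting the naive comparison overstates the effect, because treated units tend to sit in clusters with high values of the hidden confounder, while the within-cluster estimate stays close to the assumed true effect. The sketch relies on a linear, additive outcome model; the abstract's point is that ML-based estimators aim for this kind of robustness while reducing reliance on such modelling assumptions.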

Suggested Citation

  • Youmi Suk & Hyunseung Kang, 2022. "Robust Machine Learning for Treatment Effects in Multilevel Observational Studies Under Cluster-level Unmeasured Confounding," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 310-343, March.
  • Handle: RePEc:spr:psycho:v:87:y:2022:i:1:d:10.1007_s11336-021-09805-x
    DOI: 10.1007/s11336-021-09805-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11336-021-09805-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Shadish, William R. & Clark, M. H. & Steiner, Peter M., 2008. "Can Nonrandomized Experiments Yield Accurate Answers? A Randomized Experiment Comparing Random and Nonrandom Assignments," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1334-1344.
    2. Arpino, Bruno & Mealli, Fabrizia, 2011. "The specification of the propensity score in multilevel observational studies," Computational Statistics & Data Analysis, Elsevier, vol. 55(4), pages 1770-1780, April.
    3. Hansen, Lars Peter, 1982. "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, Econometric Society, vol. 50(4), pages 1029-1054, July.
    4. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    5. Glynn, Adam N. & Quinn, Kevin M., 2010. "An Introduction to the Augmented Inverse Propensity Weighted Estimator," Political Analysis, Cambridge University Press, vol. 18(1), pages 36-56, January.
    6. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    7. van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
    8. Peng Ding & Avi Feller & Luke Miratrix, 2019. "Decomposing Treatment Effect Variation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(525), pages 304-317, January.
    9. Lin, Zhongjian & Li, Qi & Sun, Yiguo, 2014. "A consistent nonparametric test of parametric regression functional form in fixed effects panel data models," Journal of Econometrics, Elsevier, vol. 178(P1), pages 167-179.
    10. Jee-Seon Kim & Edward Frees, 2006. "Omitted Variables in Multilevel Models," Psychometrika, Springer;The Psychometric Society, vol. 71(4), pages 659-690, December.
    11. Gruber, Susan & Laan, Mark van der, 2012. "tmle: An R Package for Targeted Maximum Likelihood Estimation," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 51(i13).
    12. Heejung Bang & James M. Robins, 2005. "Doubly Robust Estimation in Missing Data and Causal Inference Models," Biometrics, The International Biometric Society, vol. 61(4), pages 962-973, December.
    13. Bates, Douglas & Mächler, Martin & Bolker, Ben & Walker, Steve, 2015. "Fitting Linear Mixed-Effects Models Using lme4," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 67(i01).
    14. Jee-Seon Kim & Edward Frees, 2007. "Multilevel Modeling with Correlated Effects," Psychometrika, Springer;The Psychometric Society, vol. 72(4), pages 505-533, December.
    15. Henderson, Daniel J. & Carroll, Raymond J. & Li, Qi, 2008. "Nonparametric estimation and testing of fixed effects panel data models," Journal of Econometrics, Elsevier, vol. 144(1), pages 257-275, May.
    16. Dmitry Arkhangelsky & Guido Imbens, 2018. "The Role of the Propensity Score in Fixed Effect Models," NBER Working Papers 24814, National Bureau of Economic Research, Inc.
    17. Jordan H. Rickles, 2013. "Examining Heterogeneity in the Effect of Taking Algebra in Eighth Grade," The Journal of Educational Research, Taylor & Francis Journals, vol. 106(4), pages 251-268, July.
    18. Hong, Guanglei & Raudenbush, Stephen W., 2006. "Evaluating Kindergarten Retention Policy: A Case Study of Causal Inference for Multilevel Observational Data," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 901-910, September.
    19. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, April.
    20. Kosuke Imai & In Song Kim, 2019. "When Should We Use Unit Fixed Effects Regression Models for Causal Inference with Longitudinal Data?," American Journal of Political Science, John Wiley & Sons, vol. 63(2), pages 467-490, April.

    Citations

    Cited by:

    1. Youmi Suk, 2024. "A Within-Group Approach to Ensemble Machine Learning Methods for Causal Inference in Multilevel Studies," Journal of Educational and Behavioral Statistics, vol. 49(1), pages 61-91, February.
    2. Youmi Suk & Kyung T. Han, 2024. "A Psychometric Framework for Evaluating Fairness in Algorithmic Decision Making: Differential Algorithmic Functioning," Journal of Educational and Behavioral Statistics, vol. 49(2), pages 151-172, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Youmi Suk, 2024. "A Within-Group Approach to Ensemble Machine Learning Methods for Causal Inference in Multilevel Studies," Journal of Educational and Behavioral Statistics, vol. 49(1), pages 61-91, February.
    2. Mary Ying-Fang Wang & Paul Tuss & Lihong Qi, 2019. "Augmented Weighted Estimators Dealing with Practical Positivity Violation to Causal inferences in a Random Coefficient Model," Psychometrika, Springer;The Psychometric Society, vol. 84(2), pages 447-467, June.
    3. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    4. Jushan Bai & Sung Hoon Choi & Yuan Liao, 2021. "Feasible generalized least squares for panel data with cross-sectional and serial correlations," Empirical Economics, Springer, vol. 60(1), pages 309-326, January.
    5. Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Jun 2024.
    6. Lechner, Michael & Okasa, Gabriel, 2019. "Random Forest Estimation of the Ordered Choice Model," Economics Working Paper Series 1908, University of St. Gallen, School of Economics and Political Science.
    7. Aleksey Oshchepkov & Anna Shirokanova, 2020. "Multilevel Modeling For Economists: Why, When And How," HSE Working papers WP BRP 233/EC/2020, National Research University Higher School of Economics.
    8. Dmitry Arkhangelsky & Guido W. Imbens & Lihua Lei & Xiaoman Luo, 2021. "Design-Robust Two-Way-Fixed-Effects Regression For Panel Data," Papers 2107.13737, arXiv.org, revised Mar 2024.
    9. Kerda Varaku & Robin Sickles, 2023. "Public subsidies and innovation: a doubly robust machine learning approach leveraging deep neural networks," Empirical Economics, Springer, vol. 64(6), pages 3121-3165, June.
    10. Harsh Parikh & Carlos Varjao & Louise Xu & Eric Tchetgen Tchetgen, 2022. "Validating Causal Inference Methods," Papers 2202.04208, arXiv.org, revised Jul 2022.
    11. Ruoxuan Xiong & Allison Koenecke & Michael Powell & Zhu Shen & Joshua T. Vogelstein & Susan Athey, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Papers 2107.11732, arXiv.org, revised Apr 2023.
    12. Khashayar Khosravi & Greg Lewis & Vasilis Syrgkanis, 2019. "Non-Parametric Inference Adaptive to Intrinsic Dimension," Papers 1901.03719, arXiv.org, revised Jun 2019.
    13. Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.
    14. Huber, Martin & Lechner, Michael & Wunsch, Conny, 2013. "The performance of estimators based on the propensity score," Journal of Econometrics, Elsevier, vol. 175(1), pages 1-21.
    15. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    16. Youmi Suk & Jee-Seon Kim & Hyunseung Kang, 2021. "Hybridizing Machine Learning Methods and Finite Mixture Models for Estimating Heterogeneous Treatment Effects in Latent Classes," Journal of Educational and Behavioral Statistics, vol. 46(3), pages 323-347, June.
    17. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    18. Nathan Kallus, 2022. "What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment," Papers 2205.10327, arXiv.org, revised Nov 2022.
    19. Athey, Susan & Imbens, Guido W., 2022. "Design-based analysis in Difference-In-Differences settings with staggered adoption," Journal of Econometrics, Elsevier, vol. 226(1), pages 62-79.
    20. Heejun Shin & Joseph Antonelli, 2023. "Improved inference for doubly robust estimators of heterogeneous treatment effects," Biometrics, The International Biometric Society, vol. 79(4), pages 3140-3152, December.
