
Subgroup causal effect identification and estimation via matching tree

Author

Listed:
  • Zhang, Yuyang
  • Schnell, Patrick
  • Song, Chi
  • Huang, Bin
  • Lu, Bo

Abstract

Inferring causal effects from observational studies is a central topic in many scientific fields, including social science, health and medicine. The statistical methodology for estimating the population average causal effect is well established. However, methods for identifying and estimating subpopulation causal effects are less developed. Part of the challenge is that the subgroup structure is usually unknown, so methods that work well for population-level effects need to be modified. A tree method based on a matched design is proposed to identify subgroups with differential treatment effects. To remove observed confounding, propensity-score-matched pairs are first created. A classification and regression tree (CART) is then applied to the within-pair outcome differences to identify the subgroup structure. This nonparametric approach is robust to model misspecification, which is important because specifying a parametric outcome model becomes much harder in the presence of subgroup effects. In addition to describing the assumptions under which the matching estimator is unbiased, algorithms for identifying subgroup structures are provided. Simulation results indicate that the proposed approach compares favorably, in terms of the percentage of runs that correctly identify the true tree structure, with competing tree-based methods, including causal trees, causal inference trees and the virtual twins approach. Finally, the proposed method is applied to examine the potential subgroup effect of the timing of Tobramycin use on chronic infection among pediatric cystic fibrosis patients.
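
The two-step idea in the abstract (match on the propensity score, then grow a tree on the within-pair outcome differences) can be illustrated with a small sketch. This is not the authors' implementation. It rests on several simplifying assumptions: the data are simulated, the propensity score is taken as known rather than estimated, matching is greedy 1:1 nearest-neighbour rather than optimal, each pair is described by the treated unit's covariates, and the "tree" is a single CART-style split with no pruning. All function and variable names are hypothetical.

```python
import random

def greedy_match(score, treated):
    """Greedy 1:1 nearest-neighbour matching on a scalar score, without replacement."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t == 1 and controls:
            j = min(controls, key=lambda c: abs(score[c] - score[i]))
            controls.remove(j)          # each control is used at most once
            pairs.append((i, j))        # (treated index, control index)
    return pairs

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(X, d, min_leaf=20):
    """Depth-one regression tree: the (feature, threshold) that minimises the SSE of d."""
    best = (sse(d), None, None)
    for f in range(len(X[0])):
        order = sorted(range(len(d)), key=lambda k: X[k][f])
        for cut in range(min_leaf, len(d) - min_leaf + 1):
            lo, hi = X[order[cut - 1]][f], X[order[cut]][f]
            if lo == hi:                # cannot split between tied values
                continue
            left = [d[k] for k in order[:cut]]
            right = [d[k] for k in order[cut:]]
            total = sse(left) + sse(right)
            if total < best[0]:
                best = (total, f, (lo + hi) / 2)
    return best[1], best[2]

# Simulated data (assumption): x[1] confounds treatment and outcome,
# while x[0] defines the true subgroups (effect 2 if x[0] > 0.5, else 0).
rng = random.Random(0)
n = 600
X = [[rng.random(), rng.random()] for _ in range(n)]
score = [0.2 + 0.3 * x[1] for x in X]   # true propensity score, assumed known
treated = [1 if rng.random() < e else 0 for e in score]
y = [x[1] + (2.0 if x[0] > 0.5 else 0.0) * t + rng.gauss(0, 0.5)
     for x, t in zip(X, treated)]

# Step 1: propensity-score-matched pairs. Step 2: split the within-pair differences.
pairs = greedy_match(score, treated)
d = [y[i] - y[j] for i, j in pairs]     # treated minus control outcome
Xp = [X[i] for i, _ in pairs]           # pair covariates = treated unit's covariates
feature, threshold = best_split(Xp, d)

left = [v for x, v in zip(Xp, d) if x[feature] <= threshold]
right = [v for x, v in zip(Xp, d) if x[feature] > threshold]
left_mean = sum(left) / len(left)
right_mean = sum(right) / len(right)
print(feature, round(threshold, 3), round(left_mean, 2), round(right_mean, 2))
```

Because matching balances the confounder within pairs, the mean difference in each leaf serves as a rough estimate of that subgroup's treatment effect. A fuller version would estimate the propensity score from data, use an optimal matching routine, and grow and prune a complete tree.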

Suggested Citation

  • Zhang, Yuyang & Schnell, Patrick & Song, Chi & Huang, Bin & Lu, Bo, 2021. "Subgroup causal effect identification and estimation via matching tree," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
  • Handle: RePEc:eee:csdana:v:159:y:2021:i:c:s0167947321000220
    DOI: 10.1016/j.csda.2021.107188

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947321000220
    Download Restriction: Full text for ScienceDirect subscribers only.



