Subgroup causal effect identification and estimation via matching tree

Author

Listed:
  • Zhang, Yuyang
  • Schnell, Patrick
  • Song, Chi
  • Huang, Bin
  • Lu, Bo

Abstract

Inferring causal effects from observational studies is a central topic in many scientific fields, including social science, health, and medicine. The statistical methodology for estimating the population average causal effect is well established. However, methods for identifying and estimating subpopulation causal effects are less developed. Part of the challenge is that the subgroup structure is usually unknown, so methods that work well for population-level effects must be modified to accommodate it. A tree method based on a matched design is proposed to identify subgroups with differential treatment effects. To remove observed confounding, propensity-score-matched pairs are first created. A classification and regression tree is then applied to the within-pair outcome differences to identify the subgroup structure. This nonparametric approach is robust to model misspecification, which is important because specifying a parametric outcome model becomes much harder in the presence of subgroup effects. Assumptions under which the matching estimator is unbiased are described, and algorithms for identifying subgroup structures are provided. Simulation results indicate that the proposed approach compares favorably, in terms of how often the true tree structure is correctly identified, with competing tree-based methods, including causal trees, causal inference trees, and the virtual twins approach. Finally, the proposed method is applied to examine the potential subgroup effect of the timing of Tobramycin use on chronic infection among pediatric cystic fibrosis patients.
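
The two-step idea in the abstract (match on the propensity score, then grow a tree on the within-pair outcome differences) can be illustrated with a rough sketch. This is not the authors' implementation. It assumes the propensity scores are already estimated, swaps in greedy 1:1 nearest-neighbour matching for whatever matching design the paper uses, and replaces a full classification and regression tree with a single best split on one covariate. The function names (greedy_match, best_split), the caliper, the use of the pair-mean covariate, and the toy data are all hypothetical.

```python
# Illustrative sketch only: greedy matching and a single split stand in for
# the paper's matching design and a full CART. All names are hypothetical.

def greedy_match(ps_treated, ps_control, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching on the propensity score,
    without replacement. Returns (treated_index, control_index) pairs."""
    available = set(range(len(ps_control)))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        # Closest still-unmatched control on the propensity score.
        j = min(available, key=lambda k: abs(ps_control[k] - p))
        if abs(ps_control[j] - p) <= caliper:  # discard poor matches
            pairs.append((i, j))
            available.remove(j)
    return pairs


def best_split(x, d, min_leaf=2):
    """Depth-one regression tree on the pair differences d.
    Returns (threshold, left_mean, right_mean) for the split on x that
    minimises the within-leaf sum of squared errors, or None."""
    order = sorted(range(len(x)), key=lambda k: x[k])
    xs = [x[k] for k in order]
    ds = [d[k] for k in order]
    n, total = len(ds), sum(ds)
    best_score, best = float("-inf"), None
    left_sum = 0.0
    for m in range(1, n):  # m = number of pairs in the left leaf
        left_sum += ds[m - 1]
        if m < min_leaf or n - m < min_leaf or xs[m] == xs[m - 1]:
            continue
        right_sum = total - left_sum
        # Minimising SSE is equivalent to maximising this quantity.
        score = left_sum ** 2 / m + right_sum ** 2 / (n - m)
        if score > best_score:
            best_score = score
            best = ((xs[m - 1] + xs[m]) / 2, left_sum / m, right_sum / (n - m))
    return best


# Toy data (made up): the treatment effect is 2 when x > 0.5 and 0 otherwise.
n = 20
x = [(i + 0.5) / n for i in range(n)]
ps_treated = x[:]  # propensity scores assumed to be estimated already
ps_control = x[:]
y_control = x[:]
y_treated = [xi + (2.0 if xi > 0.5 else 0.0) for xi in x]

pairs = greedy_match(ps_treated, ps_control)
diffs = [y_treated[i] - y_control[j] for i, j in pairs]  # within-pair differences
pair_x = [(x[i] + x[j]) / 2 for i, j in pairs]           # pair-level covariate
split = best_split(pair_x, diffs)
print(split)  # (threshold, mean difference left, mean difference right)
```

Because the tree is fitted to the within-pair differences rather than to the raw outcomes, each leaf mean is itself a subgroup effect estimate, so no parametric outcome model is needed in this sketch.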

Suggested Citation

  • Zhang, Yuyang & Schnell, Patrick & Song, Chi & Huang, Bin & Lu, Bo, 2021. "Subgroup causal effect identification and estimation via matching tree," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
  • Handle: RePEc:eee:csdana:v:159:y:2021:i:c:s0167947321000220
    DOI: 10.1016/j.csda.2021.107188


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Qingyuan Zhao & Dylan S. Small & Ashkan Ertefaie, 2022. "Selective inference for effect modification via the lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 382-413, April.
    2. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    3. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    4. Yumou Qiu & Jing Tao & Xiao‐Hua Zhou, 2021. "Inference of heterogeneous treatment effects using observational data with high‐dimensional covariates," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 1016-1043, November.
    5. Difang Huang & Jiti Gao & Tatsushi Oka, 2022. "Semiparametric Single-Index Estimation for Average Treatment Effects," Papers 2206.08503, arXiv.org, revised Apr 2024.
    6. Eoghan O'Neill & Melvyn Weeks, 2018. "Causal Tree Estimation of Heterogeneous Household Response to Time-Of-Use Electricity Pricing Schemes," Papers 1810.09179, arXiv.org, revised Oct 2019.
    7. Phillip Heiler & Michael C. Knaus, 2021. "Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments," Papers 2110.01427, arXiv.org, revised Aug 2023.
    8. Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Jun 2024.
    9. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    10. Muxuan Liang & Menggang Yu, 2023. "Relative contrast estimation and inference for treatment recommendation," Biometrics, The International Biometric Society, vol. 79(4), pages 2920-2932, December.
    11. Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
    12. Kiran Tomlinson & Johan Ugander & Austin R. Benson, 2021. "Choice Set Confounding in Discrete Choice," Papers 2105.07959, arXiv.org, revised Aug 2021.
    13. Meyer, Birgit, 2020. "How deep is your love? Innovation, Upgrading and the Depth of Internationalization," VfS Annual Conference 2020 (Virtual Conference): Gender Economics 224584, Verein für Socialpolitik / German Economic Association.
    14. Susan Athey & Guido Imbens, 2015. "Recursive Partitioning for Heterogeneous Causal Effects," Papers 1504.01132, arXiv.org, revised Dec 2015.
    15. Graham, Bryan S. & Pinto, Cristine Campos de Xavier, 2022. "Semiparametrically efficient estimation of the average linear regression function," Journal of Econometrics, Elsevier, vol. 226(1), pages 115-138.
    16. Jason Poulos & Shuxi Zeng, 2021. "RNN‐based counterfactual prediction, with an application to homestead policy and public schooling," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(4), pages 1124-1139, August.
    17. Joshua B. Gilbert & Zachary Himmelsbach & James Soland & Mridul Joshi & Benjamin W. Domingue, 2024. "Estimating Heterogeneous Treatment Effects with Item-Level Outcome Data: Insights from Item Response Theory," Papers 2405.00161, arXiv.org, revised Aug 2024.
    18. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    19. Nathan Corder & Shu Yang, 2020. "Estimating Average Treatment Effects Utilizing Fractional Imputation when Confounders are Subject to Missingness," Journal of Causal Inference, De Gruyter, vol. 8(1), pages 249-271, January.
    20. Jinglong Zhao, 2023. "Adaptive Neyman Allocation," Papers 2309.08808, arXiv.org, revised Sep 2023.
