
Stable Discovery of Interpretable Subgroups via Calibration in Causal Studies

Authors
  • Raaz Dwivedi
  • Yan Shuo Tan
  • Briton Park
  • Mian Wei
  • Kevin Horgan
  • David Madigan
  • Bin Yu

Abstract

Building on Yu and Kumbier's predictability, computability and stability (PCS) framework, we introduce a novel methodology for randomised experiments: Stable Discovery of Interpretable Subgroups via Calibration (StaDISC), which identifies subgroups with large heterogeneous treatment effects. StaDISC was developed during our re-analysis of the 1999–2000 VIGOR study, an 8076-patient randomised controlled trial that compared the risk of adverse events from a then newly approved drug, rofecoxib (Vioxx), with that from an older drug, naproxen. Vioxx was found, on average and in comparison with naproxen, to reduce the risk of gastrointestinal events but increase the risk of thrombotic cardiovascular events. Applying StaDISC, we fit 18 popular conditional average treatment effect (CATE) estimators for both outcomes and use calibration to demonstrate their poor global performance. They are, however, locally well-calibrated and stable, enabling the identification of patient groups with larger than (estimated) average treatment effects. StaDISC discovers three clinically interpretable subgroups each for the gastrointestinal outcome (totalling 29.4% of the study size) and the thrombotic cardiovascular outcome (totalling 11.0%). Complementary analyses of the discovered subgroups using the 2001–2004 APPROVe study, a separate, independently conducted randomised controlled trial with 2587 patients, provide further supporting evidence for the promise of StaDISC.
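For readers unfamiliar with the calibration step described above, the sketch below illustrates the general idea in Python: fit a CATE estimator, bin units by predicted effect, and within each bin compare the mean predicted CATE with the observed treated-minus-control difference in mean outcomes, which is an unbiased estimate of that group's average treatment effect under randomisation. This is a minimal illustration on simulated data, not the authors' StaDISC implementation; the T-learner, the simulated trial and the four quantile bins are assumptions made purely for the example.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)

    # Hypothetical randomised-trial data: covariates X, binary treatment t, binary outcome y.
    n, d = 4000, 5
    X = rng.normal(size=(n, d))
    t = rng.integers(0, 2, size=n)
    tau_true = 0.1 * (X[:, 0] > 0)            # heterogeneous effect, for illustration only
    y = (rng.random(n) < 0.2 + tau_true * t).astype(float)

    # T-learner CATE estimate: separate outcome models for the treated and control arms.
    m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[t == 1], y[t == 1])
    m0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[t == 0], y[t == 0])
    cate_hat = m1.predict(X) - m0.predict(X)

    # Calibration check: within quantile bins of the predicted CATE, compare the mean
    # prediction with the observed difference in mean outcomes between the two arms.
    edges = np.quantile(cate_hat, np.linspace(0, 1, 5))
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = (cate_hat >= lo) & (cate_hat <= hi)
        pred = cate_hat[idx].mean()
        obs = y[idx & (t == 1)].mean() - y[idx & (t == 0)].mean()
        print(f"bin [{lo:+.3f}, {hi:+.3f}]: predicted CATE {pred:+.3f}, observed ATE {obs:+.3f}")

A globally well-calibrated estimator would show predicted and observed values tracking each other across all bins; the abstract's point is that, in the VIGOR data, agreement holds only locally, which is what motivates examining the corresponding subgroups.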

Suggested Citation

  • Raaz Dwivedi & Yan Shuo Tan & Briton Park & Mian Wei & Kevin Horgan & David Madigan & Bin Yu, 2020. "Stable Discovery of Interpretable Subgroups via Calibration in Causal Studies," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 135-178, December.
  • Handle: RePEc:bla:istatr:v:88:y:2020:i:s1:p:s135-s178
    DOI: 10.1111/insr.12427

    Download full text from publisher

    File URL: https://doi.org/10.1111/insr.12427
    Download Restriction: no


    References listed on IDEAS

    1. Lu Tian & Ash A. Alizadeh & Andrew J. Gentles & Robert Tibshirani, 2014. "A Simple Method for Estimating Interactions Between a Treatment and a Large Number of Covariates," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(508), pages 1517-1532, December.
    2. Bin Yu & Rebecca Barter, 2020. "The Data Science Process: One Culture," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(530), pages 672-674, April.
    3. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    4. Susan Athey, 2018. "The Impact of Machine Learning on Economics," NBER Chapters, in: The Economics of Artificial Intelligence: An Agenda, pages 507-547, National Bureau of Economic Research, Inc.
    5. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    6. Gerber, Alan S. & Green, Donald P. & Larimer, Christopher W., 2008. "Social Pressure and Voter Turnout: Evidence from a Large-Scale Field Experiment," American Political Science Review, Cambridge University Press, vol. 102(1), pages 33-48, February.
    7. Bin Yu & Rebecca Barter, 2020. "The Data Science Process: One Culture," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 83-86, December.
    8. Imbens, Guido W. & Rubin, Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, September.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.


    Cited by:

    1. Yu, Bin, 2023. "What is uncertainty in today’s practice of data science?," Journal of Econometrics, Elsevier, vol. 237(1).
    2. Hui Lan & Vasilis Syrgkanis, 2023. "Causal Q-Aggregation for CATE Model Selection," Papers 2310.16945, arXiv.org, revised Nov 2023.
    3. Jann Spiess & Vasilis Syrgkanis & Victor Yaneng Wang, 2021. "Finding Subgroups with Significant Treatment Effects," Papers 2103.07066, arXiv.org, revised Dec 2023.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gabriel Okasa, 2022. "Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance," Papers 2201.12692, arXiv.org.
    2. Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
    3. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.
    4. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    5. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    6. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    7. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    8. Denis Fougère & Nicolas Jacquemet, 2020. "Policy Evaluation Using Causal Inference Methods," SciencePo Working papers Main hal-03455978, HAL.
    9. Guido W. Imbens, 2020. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 1129-1179, December.
    10. Daniel Boller & Michael Lechner & Gabriel Okasa, 2021. "The Effect of Sport in Online Dating: Evidence from Causal Machine Learning," Papers 2104.04601, arXiv.org.
    11. Eoghan O'Neill & Melvyn Weeks, 2018. "Causal Tree Estimation of Heterogeneous Household Response to Time-Of-Use Electricity Pricing Schemes," Papers 1810.09179, arXiv.org, revised Oct 2019.
    12. Phillip Heiler & Michael C. Knaus, 2021. "Effect or Treatment Heterogeneity? Policy Evaluation with Aggregated and Disaggregated Treatments," Papers 2110.01427, arXiv.org, revised Aug 2023.
    13. Anna Baiardi & Andrea A. Naghi, 2021. "The Value Added of Machine Learning to Causal Inference: Evidence from Revisited Studies," Papers 2101.00878, arXiv.org.
    14. Michael C. Knaus & Michael Lechner & Anthony Strittmatter, 2022. "Heterogeneous Employment Effects of Job Search Programs: A Machine Learning Approach," Journal of Human Resources, University of Wisconsin Press, vol. 57(2), pages 597-636.
    15. Daniel Goller & Tamara Harrer & Michael Lechner & Joachim Wolff, 2021. "Active labour market policies for the long-term unemployed: New evidence from causal machine learning," Papers 2106.10141, arXiv.org, revised May 2023.
    16. Joshua B. Gilbert & Zachary Himmelsbach & James Soland & Mridul Joshi & Benjamin W. Domingue, 2024. "Estimating Heterogeneous Treatment Effects with Item-Level Outcome Data: Insights from Item Response Theory," Papers 2405.00161, arXiv.org, revised Aug 2024.
    17. Anthony Strittmatter, 2018. "What Is the Value Added by Using Causal Machine Learning Methods in a Welfare Experiment Evaluation?," Papers 1812.06533, arXiv.org, revised Dec 2021.
    18. Riccardo Di Francesco, 2022. "Aggregation Trees," CEIS Research Paper 546, Tor Vergata University, CEIS, revised 20 Nov 2023.
    19. Michael Lechner & Jana Mareckova, 2022. "Modified Causal Forest," Papers 2209.03744, arXiv.org.
    20. Hyung G. Park & Danni Wu & Eva Petkova & Thaddeus Tarpey & R. Todd Ogden, 2023. "Bayesian Index Models for Heterogeneous Treatment Effects on a Binary Outcome," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(2), pages 397-418, July.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:bla:istatr:v:88:y:2020:i:s1:p:s135-s178. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Wiley Content Delivery (email available below). General contact details of provider: https://edirc.repec.org/data/isiiinl.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.