
Bayesian causal inference for observational studies with missingness in covariates and outcomes

Author

Listed:
  • Huaiyu Zang
  • Hang J. Kim
  • Bin Huang
  • Rhonda Szczesniak

Abstract

Missing data are a pervasive issue in observational studies that use electronic health records or patient registries. They present unique challenges for statistical inference, especially causal inference: handling missing data inappropriately can bias causal estimates. Beyond missingness, observational health data typically contain mixed-type variables (continuous and categorical covariates) whose joint distribution is often too complex to be captured by simple parametric models. Missing values in both covariates and outcomes make causal inference even more challenging, yet most standard causal inference approaches either assume fully observed data or begin only after missing values have been imputed in a separate preprocessing stage. To address these problems, we introduce a Bayesian nonparametric causal model to estimate causal effects with missing data. The proposed approach can simultaneously impute missing values, account for multiple outcomes, and estimate causal effects under the potential outcomes framework. We provide three simulation studies to show the performance of the proposed method under complicated data settings whose features resemble those of our case studies. For example, Simulation Study 3 considers the case where missing values exist in both outcomes and covariates. In two case studies, we apply the method to evaluate the comparative effectiveness of treatments for chronic disease management in juvenile idiopathic arthritis and cystic fibrosis.

Suggested Citation

  • Huaiyu Zang & Hang J. Kim & Bin Huang & Rhonda Szczesniak, 2023. "Bayesian causal inference for observational studies with missingness in covariates and outcomes," Biometrics, The International Biometric Society, vol. 79(4), pages 3624-3636, December.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:4:p:3624-3636
    DOI: 10.1111/biom.13918

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13918
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Antonio R. Linero, 2023. "Prior and posterior checking of implicit causal assumptions," Biometrics, The International Biometric Society, vol. 79(4), pages 3153-3164, December.
    2. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    3. Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.
    4. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    5. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    6. Raphael Nishimura & James Wagner & Michael Elliott, 2016. "Alternative Indicators for the Risk of Non-response Bias: A Simulation Study," International Statistical Review, International Statistical Institute, vol. 84(1), pages 43-62, April.
    7. Mireille E. Schnitzer & Erica E.M. Moodie & Mark J. van der Laan & Robert W. Platt & Marina B. Klein, 2014. "Modeling the impact of hepatitis C viral clearance on end-stage liver disease in an HIV co-infected cohort with targeted maximum likelihood estimation," Biometrics, The International Biometric Society, vol. 70(1), pages 144-152, March.
    8. Silvia Coderoni & Roberto Esposti & Alessandro Varacca, 2024. "How Differently Do Farms Respond to Agri-environmental Policies? A Probabilistic Machine-Learning Approach," Land Economics, University of Wisconsin Press, vol. 100(2), pages 370-397.
    9. Yilin Li & Wang Miao & Ilya Shpitser & Eric J. Tchetgen Tchetgen, 2023. "A self‐censoring model for multivariate nonignorable nonmonotone missing data," Biometrics, The International Biometric Society, vol. 79(4), pages 3203-3214, December.
    10. Veronica Sciannameo & Gian Paolo Fadini & Daniele Bottigliengo & Angelo Avogaro & Ileana Baldi & Dario Gregori & Paola Berchialla, 2022. "Assessment of Glucose Lowering Medications’ Effectiveness for Cardiovascular Clinical Risk Management of Real-World Patients with Type 2 Diabetes: Targeted Maximum Likelihood Estimation under Model Mi," IJERPH, MDPI, vol. 19(22), pages 1-13, November.
    11. Xiaojun Mao & Zhonglei Wang & Shu Yang, 2023. "Matrix completion under complex survey sampling," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(3), pages 463-492, June.
    12. Susan Athey & Julie Tibshirani & Stefan Wager, 2016. "Generalized Random Forests," Papers 1610.01271, arXiv.org, revised Apr 2018.
    13. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    14. Danhyang Lee & Jae Kwang Kim, 2022. "Semiparametric imputation using conditional Gaussian mixture models under item nonresponse," Biometrics, The International Biometric Society, vol. 78(1), pages 227-237, March.
    15. Humera Razzak & Christian Heumann, 2019. "Hybrid Multiple Imputation In A Large Scale Complex Survey," Statistics in Transition New Series, Polish Statistical Association, vol. 20(4), pages 33-58, December.
    16. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.
    17. Jelena Bradic & Stefan Wager & Yinchu Zhu, 2019. "Sparsity Double Robust Inference of Average Treatment Effects," Papers 1905.00744, arXiv.org.
    18. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Dec 2017.
    19. Bryan Keller, 2020. "Variable Selection for Causal Effect Estimation: Nonparametric Conditional Independence Testing With Random Forests," Journal of Educational and Behavioral Statistics, vol. 45(2), pages 119-142, April.
