Printed from https://ideas.repec.org/a/eee/ejores/v300y2022i3p827-836.html

Network flow methods for the minimum covariate imbalance problem

Authors

  • Hochbaum, Dorit S.
  • Rao, Xu
  • Sauppe, Jason

Abstract

In an observational study, one is given disjoint samples of treatment units and control (untreated) units, and the goal is to compare outcomes between the two samples in order to estimate a treatment effect. A complication is that the treatment and control units often differ on important pre-treatment attributes, and these differences, referred to as covariate imbalance, can bias the estimate. One method to correct for covariate imbalance is to select a subset of the control sample that has minimum imbalance with respect to the treatment sample, and then use this control subset for estimating the treatment effect. While this optimization problem is NP-hard in general, certain special cases can be solved efficiently. Specifically, the variant of this optimization problem with one covariate is easy to solve, the variant with three or more covariates is NP-hard, and the variant with two covariates is solvable in polynomial time. We present several network flow formulations for the problem of minimizing imbalance on two nominal covariates. First, we present a minimum cost network flow formulation for solving the problem with the constraint that the control subset must have the same size as the treatment sample. We then derive an improved maximum flow formulation. For alternate size restrictions on the control subset, we use a proportional imbalance objective which leads to non-integral supplies and demands in the preceding network flow formulations. We then derive an alternate minimum cost network flow formulation that ensures integrality and solves the proportional imbalance problem in polynomial time.
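
The same-size, two-covariate case described above can be illustrated with a small minimum cost flow model. This is a minimal sketch under stated assumptions, not necessarily the formulation used in the paper. It assumes that imbalance on a nominal covariate is the sum, over its levels, of the absolute difference between the treatment count and the selected-control count (the abstract does not spell out the objective), and that exactly as many controls are selected as there are treatment units. The function names `min_cost_flow` and `min_imbalance_two_covariates` are hypothetical, and the solver is a plain successive-shortest-paths routine intended only for small examples.

```python
from collections import Counter

INF = float("inf")

def min_cost_flow(n_nodes, arcs, source, sink, amount):
    """Send `amount` units from source to sink at minimum total cost.

    arcs: list of (tail, head, capacity, cost) with integer capacities.
    Successive shortest paths with Bellman-Ford: simple, small instances only.
    Returns (total_cost, flow_on_each_arc).
    """
    # Residual graph entries: [head, residual_capacity, cost, reverse_index].
    graph = [[] for _ in range(n_nodes)]
    position = []  # (tail, index in graph[tail]) of every forward arc
    for u, v, cap, cost in arcs:
        position.append((u, len(graph[u])))
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
    total_cost, sent = 0, 0
    while sent < amount:
        dist = [INF] * n_nodes
        parent = [None] * n_nodes
        dist[source] = 0
        for _ in range(n_nodes - 1):
            changed = False
            for u in range(n_nodes):
                if dist[u] == INF:
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v], parent[v], changed = dist[u] + cost, (u, k), True
            if not changed:
                break
        if parent[sink] is None:
            raise ValueError("the control sample is too small")
        # Find the bottleneck along the shortest path, then augment.
        push, v = amount - sent, sink
        while v != source:
            u, k = parent[v]
            push, v = min(push, graph[u][k][1]), u
        v = sink
        while v != source:
            u, k = parent[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        sent += push
        total_cost += push * dist[sink]
    flows = [cap - graph[u][k][1] for (u, k), (_, _, cap, _) in zip(position, arcs)]
    return total_cost, flows

def min_imbalance_two_covariates(treatment, control):
    """Units are (level_of_covariate_1, level_of_covariate_2) pairs.

    Returns (minimum imbalance, {cell: number of controls to take from it}).
    """
    n = len(treatment)
    t1 = Counter(a for a, _ in treatment)   # treatment counts per level, covariate 1
    t2 = Counter(b for _, b in treatment)   # treatment counts per level, covariate 2
    cells = Counter(control)                # available controls per (level1, level2) cell
    rows = list(dict.fromkeys(a for a, _ in cells))
    cols = list(dict.fromkeys(b for _, b in cells))
    source, sink = 0, 1
    row_id = {a: 2 + i for i, a in enumerate(rows)}
    col_id = {b: 2 + len(rows) + j for j, b in enumerate(cols)}
    arcs = []
    for a in rows:
        arcs.append((source, row_id[a], t1[a], 0))  # up to the treatment count: free
        arcs.append((source, row_id[a], n, 1))      # every excess unit costs 1
    for b in cols:
        arcs.append((col_id[b], sink, t2[b], 0))
        arcs.append((col_id[b], sink, n, 1))
    first_cell_arc = len(arcs)
    cell_keys = list(cells)
    for a, b in cell_keys:
        arcs.append((row_id[a], col_id[b], cells[(a, b)], 0))
    cost, flows = min_cost_flow(2 + len(rows) + len(cols), arcs, source, sink, n)
    chosen = {key: f for key, f in zip(cell_keys, flows[first_cell_arc:]) if f}
    # Selected and treatment totals are equal, so each covariate's excess equals
    # its deficit and the imbalance is twice the flow cost.
    return 2 * cost, chosen

treatment = [("F", "young"), ("F", "old"), ("M", "young")]
control = [("M", "young"), ("M", "old"), ("M", "old"), ("F", "young")]
imbalance, chosen = min_imbalance_two_covariates(treatment, control)
print(imbalance, chosen)
```

In this toy instance the best subset keeps only one of the two ("M", "old") controls and has imbalance 2 under the assumed objective. Any controls may be drawn from a cell once the per-cell counts are known, because units in the same cell are interchangeable for this objective.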

Suggested Citation

  • Hochbaum, Dorit S. & Rao, Xu & Sauppe, Jason, 2022. "Network flow methods for the minimum covariate imbalance problem," European Journal of Operational Research, Elsevier, vol. 300(3), pages 827-836.
  • Handle: RePEc:eee:ejores:v:300:y:2022:i:3:p:827-836
    DOI: 10.1016/j.ejor.2021.10.041

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221721008924
    Download Restriction: Full text for ScienceDirect subscribers only


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jason J. Sauppe & Sheldon H. Jacobson, 2017. "The role of covariate balance in observational studies," Naval Research Logistics (NRL), John Wiley & Sons, vol. 64(4), pages 323-344, June.
    2. Md Saiful Islam & Md Sarowar Morshed & Md. Noor-E-Alam, 2022. "A Computational Framework for Solving Nonlinear Binary Optimization Problems in Robust Causal Inference," INFORMS Journal on Computing, INFORMS, vol. 34(6), pages 3023-3041, November.
    3. Cousineau, Martin & Verter, Vedat & Murphy, Susan A. & Pineau, Joelle, 2023. "Estimating causal effects with optimization-based methods: A review and empirical comparison," European Journal of Operational Research, Elsevier, vol. 304(2), pages 367-380.
    4. Jason J. Sauppe & Sheldon H. Jacobson & Edward C. Sewell, 2014. "Complexity and Approximation Results for the Balance Optimization Subset Selection Model for Causal Inference in Observational Studies," INFORMS Journal on Computing, INFORMS, vol. 26(3), pages 547-566, August.
    5. Hee Youn Kwon & Jason J. Sauppe & Sheldon H. Jacobson, 2019. "Treatment Effect Decomposition and Bootstrap Hypothesis Testing in Observational Studies," Annals of Data Science, Springer, vol. 6(3), pages 491-511, September.
    6. Ruoqi Yu, 2023. "How well can fine balance work for covariate balancing," Biometrics, The International Biometric Society, vol. 79(3), pages 2346-2356, September.
    7. Tian Heong Chan & Francis de Véricourt & Omar Besbes, 2019. "Contracting in Medical Equipment Maintenance Services: An Empirical Investigation," Management Science, INFORMS, vol. 65(3), pages 1136-1150, March.
    8. Martin Cousineau & Vedat Verter & Susan A. Murphy & Joelle Pineau, 2022. "Estimating causal effects with optimization-based methods: A review and empirical comparison," Papers 2203.00097, arXiv.org.
    9. Pierluigi Montalbano & Silvia Nenci & Laura Dell'Agostino, 2022. "A non-parametric assessment of the effects of the Euro on GVC trade," International Economics, CEPII research center, issue 172, pages 56-76.
    10. Bikram Karmakar, 2022. "An approximation algorithm for blocking of an experimental design," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1726-1750, November.
    11. Md Saiful Islam & Md Sarowar Morshed & Gary J Young & Md Noor-E-Alam, 2019. "Robust policy evaluation from large-scale observational studies," PLOS ONE, Public Library of Science, vol. 14(10), pages 1-19, October.
    12. Glazer, Amanda K. & Pimentel, Samuel D., 2023. "Robust inference for matching under rolling enrollment," Journal of Causal Inference, De Gruyter, vol. 11(1), pages 1-19, January.
    13. Luke Keele & Steve Harris & Samuel D. Pimentel & Richard Grieve, 2020. "Stronger instruments and refined covariate balance in an observational study of the effectiveness of prompt admission to intensive care units," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(4), pages 1501-1521, October.
    14. Shouvik Dutta & Jason Sauppe & Sheldon Jacobson, 2016. "Targeted Marketing Using Balance Optimization Subset Selection," Annals of Data Science, Springer, vol. 3(4), pages 423-444, December.
    15. Dutta, Shouvik & Jacobson, Sheldon H. & Sauppe, Jason J., 2017. "Identifying NCAA tournament upsets using Balance Optimization Subset Selection," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 13(2), pages 79-93, June.
    16. Yu, Haiyan & Yang, Ching-Chi & Yu, Ping, 2023. "Constrained optimization for stratified treatment rules in reducing hospital readmission rates of diabetic patients," European Journal of Operational Research, Elsevier, vol. 308(3), pages 1355-1364.
    17. Ruoqi Yu, 2021. "Evaluating and improving a matched comparison of antidepressants and bone density," Biometrics, The International Biometric Society, vol. 77(4), pages 1276-1288, December.
    18. Florian Gunsilius & Yuliang Xu, 2021. "Matching for causal effects via multimarginal unbalanced optimal transport," Papers 2112.04398, arXiv.org, revised Jul 2022.
    19. Gensler, Sonja & Leeflang, Peter & Skiera, Bernd, 2012. "Impact of online channel use on customer revenues and costs to serve: Considering product portfolios and self-selection," International Journal of Research in Marketing, Elsevier, vol. 29(2), pages 192-201.
    20. Díaz, Juan & Grau, Nicolás & Reyes, Tatiana & Rivera, Jorge, 2021. "The impact of grade retention on juvenile crime," Economics of Education Review, Elsevier, vol. 84(C).
