Printed from https://ideas.repec.org/a/bla/biomet/v79y2023i4p2961-2973.html

Combining observational and experimental datasets using shrinkage estimators

Author

Listed:
  • Evan T.R. Rosenman
  • Guillaume Basse
  • Art B. Owen
  • Mike Baiocchi

Abstract

We consider the problem of combining data from observational and experimental sources to draw causal conclusions. To derive combined estimators with desirable properties, we extend results from the Stein shrinkage literature. Our contributions are threefold. First, we propose a generic procedure for deriving shrinkage estimators in this setting, making use of a generalized unbiased risk estimate. Second, we develop two new estimators, prove finite sample conditions under which they have lower risk than an estimator using only experimental data, and show that each achieves a notion of asymptotic optimality. Third, we draw connections between our approach and results in sensitivity analysis, including proposing a method for evaluating the feasibility of our estimators.
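As a rough illustration of the shrinkage idea described in the abstract, the sketch below pulls stratum-level experimental estimates toward their observational counterparts. It is a generic positive-part Stein-type rule and is not claimed to reproduce either estimator from the paper. The function name, the argument names, and the specific weight (total experimental variance divided by the squared distance between the two estimate vectors) are assumptions made for this example.

```python
def shrink_toward_observational(tau_rct, tau_obs, var_rct):
    """Shrink experimental stratum estimates toward observational ones.

    tau_rct: per-stratum treatment effect estimates from the experiment
    tau_obs: per-stratum estimates from the observational data
    var_rct: per-stratum sampling variances of the experimental estimates

    Hypothetical weight (an assumption, not taken from the paper):
        lam = min(1, sum(var_rct) / ||tau_rct - tau_obs||^2)
    """
    diff = [r - o for r, o in zip(tau_rct, tau_obs)]
    dist_sq = sum(d * d for d in diff)
    if dist_sq == 0.0:
        # The two sources agree exactly, so there is nothing to shrink.
        return list(tau_rct)
    # Clip at 1 so the result never overshoots the observational estimate
    # (the "positive-part" correction).
    lam = min(1.0, sum(var_rct) / dist_sq)
    return [r - lam * d for r, d in zip(tau_rct, diff)]


# Sources disagree strongly relative to the experimental noise: little shrinkage.
far = shrink_toward_observational([1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [0.25, 0.25, 0.25])

# Sources disagree by about as much as the experimental noise: full shrinkage.
near = shrink_toward_observational([1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [0.25, 0.25, 0.25])
```

The weight grows when the experimental estimates are noisy or when the two sources nearly agree, so the combined estimate leans on the observational data only when doing so looks cheap in terms of bias.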

Suggested Citation

  • Evan T.R. Rosenman & Guillaume Basse & Art B. Owen & Mike Baiocchi, 2023. "Combining observational and experimental datasets using shrinkage estimators," Biometrics, The International Biometric Society, vol. 79(4), pages 2961-2973, December.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:4:p:2961-2973
    DOI: 10.1111/biom.13827

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13827
    Download Restriction: no


    References listed on IDEAS

    1. Xianchao Xie & S. C. Kou & Lawrence D. Brown, 2012. "SURE Estimates for a Heteroscedastic Hierarchical Model," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1465-1479, December.
    2. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    3. Timothy B. Armstrong & Michal Kolesár, 2018. "Optimal Inference in a Class of Regression Models," Econometrica, Econometric Society, vol. 86(2), pages 655-683, March.
    4. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg‐Møller, 2022. "Robust Empirical Bayes Confidence Intervals," Econometrica, Econometric Society, vol. 90(6), pages 2567-2602, November.
    5. Zhiqiang Tan, 2006. "A Distributional Approach for Causal Inference Using Propensity Scores," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1619-1637, December.
    6. Bruce E. Hansen, 2016. "Efficient shrinkage in parametric models," Journal of Econometrics, Elsevier, vol. 190(1), pages 115-132.
    7. Qingyuan Zhao & Dylan S. Small & Bhaswar B. Bhattacharya, 2019. "Sensitivity analysis for inverse probability weighting estimators via the percentile bootstrap," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(4), pages 735-761, September.
    8. Xinran Li & Peng Ding, 2017. "General Forms of Finite Population Central Limit Theorems with Applications to Causal Inference," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1759-1769, October.
    9. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg-Møller, 2020. "Robust Empirical Bayes Confidence Intervals," Papers 2004.03448, arXiv.org, revised May 2022.
    10. Guido W. Imbens & Donald B. Rubin, 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, October.

    Citations

    Cited by:

    1. Irina Degtiar & Tim Layton & Jacob Wallace & Sherri Rose, 2023. "Conditional cross‐design synthesis estimators for generalizability in Medicaid," Biometrics, The International Biometric Society, vol. 79(4), pages 3859-3872, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg‐Møller, 2022. "Robust Empirical Bayes Confidence Intervals," Econometrica, Econometric Society, vol. 90(6), pages 2567-2602, November.
    2. Nathan Kallus, 2023. "Treatment Effect Risk: Bounds and Inference," Management Science, INFORMS, vol. 69(8), pages 4579-4590, August.
    3. Nathan Kallus, 2022. "Treatment Effect Risk: Bounds and Inference," Papers 2201.05893, arXiv.org, revised Jul 2022.
    4. Tom Boot, 2023. "Joint inference based on Stein-type averaging estimators in the linear regression model," Journal of Econometrics, Elsevier, vol. 235(2), pages 1542-1563.
    5. Nathan Kallus, 2022. "What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment," Papers 2205.10327, arXiv.org, revised Nov 2022.
    6. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg-Møller, 2020. "Robust Empirical Bayes Confidence Intervals," Papers 2004.03448, arXiv.org, revised May 2022.
    7. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. Dimitris Bertsimas & Agni Orfanoudaki & Rory B. Weiner, 2020. "Personalized treatment for coronary artery disease patients: a machine learning approach," Health Care Management Science, Springer, vol. 23(4), pages 482-506, December.
    9. Ruoxuan Xiong & Allison Koenecke & Michael Powell & Zhu Shen & Joshua T. Vogelstein & Susan Athey, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Papers 2107.11732, arXiv.org, revised Apr 2023.
    10. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    11. Chenchuan (Mark) Li & Ulrich K. Müller, 2021. "Linear regression with many controls of limited explanatory power," Quantitative Economics, Econometric Society, vol. 12(2), pages 405-442, May.
    12. Rina Friedberg & Julie Tibshirani & Susan Athey & Stefan Wager, 2018. "Local Linear Forests," Papers 1807.11408, arXiv.org, revised Sep 2020.
    13. Bernard Koch & Tim Sainburg & Pablo Geraldo & Song Jiang & Yizhou Sun & Jacob G. Foster, 2021. "Deep Learning of Potential Outcomes," SocArXiv aeszf, Center for Open Science.
    14. Clément de Chaisemartin, 2022. "Trading-off Bias and Variance in Stratified Experiments and in Staggered Adoption Designs, Under a Boundedness Condition on the Magnitude of the Treatment Effect," Working Papers hal-03873919, HAL.
    15. Undral Byambadalai & Tatsushi Oka & Shota Yasui, 2024. "Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction," Papers 2407.16037, arXiv.org.
    16. Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.
    17. Julius Owusu, 2023. "Randomization Inference of Heterogeneous Treatment Effects under Network Interference," Papers 2308.00202, arXiv.org, revised Jan 2024.
    18. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    19. Miruna Oprescu & Vasilis Syrgkanis & Zhiwei Steven Wu, 2018. "Orthogonal Random Forest for Causal Inference," Papers 1806.03467, arXiv.org, revised Sep 2019.
    20. Kevin Han & Guillaume Basse & Iavor Bojinov, 2024. "Population interference in panel experiments," Journal of Econometrics, Elsevier, vol. 238(1).
