
Combining Experimental and Observational Data for Identification and Estimation of Long-Term Causal Effects

Authors

  • AmirEmad Ghassami
  • Alan Yang
  • David Richardson
  • Ilya Shpitser
  • Eric Tchetgen Tchetgen

Abstract

We consider the task of identifying and estimating the causal effect of a treatment variable on a long-term outcome variable using data from an observational domain and an experimental domain. The observational domain is subject to unobserved confounding. Furthermore, subjects in the experiment are only followed for a short period of time; hence, long-term effects of treatment are unobserved, whereas short-term effects are observed. Therefore, data from neither domain alone suffice for causal inference about the effect of the treatment on the long-term outcome; the two sources must instead be pooled in a principled way. Athey et al. (2020) proposed a method for systematically combining such data to identify the downstream causal effect of interest. Their approach is based on the assumptions of internal and external validity of the experimental data, and an additional novel assumption called latent unconfoundedness. In this paper, we first review their proposed approach, and then we propose three alternative approaches to data fusion for the purpose of identifying and estimating the average treatment effect as well as the effect of treatment on the treated. Our first approach is based on assuming equi-confounding bias for the short-term and long-term outcomes. Our second approach is based on a relaxed version of the equi-confounding bias assumption, where we assume the existence of an observed confounder such that the short-term and long-term potential outcome variables have the same partial additive association with that confounder. Our third approach is based on the proximal causal inference framework, in which we assume the existence of an extra variable in the system that is a proxy of the latent confounder of the treatment-outcome relation. We propose influence-function-based estimation strategies for each of our data fusion frameworks and study the robustness properties of the proposed estimators.
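
As a rough illustration of the first approach, the following Python sketch simulates a toy version of the two-domain setting and recovers the effect of treatment on the treated in the observational domain. Everything in it is an assumption made for illustration rather than the authors' estimator: the names `simulate` and `ett_equi_confounding`, the linear data-generating process with a single latent confounder `u`, the absence of covariates, and the reading of equi-confounding as equality of the additive confounding bias in the untreated short-term and long-term potential outcomes. It uses plain sample means, not the influence-function-based estimators proposed in the paper.

```python
import math
import random
from statistics import fmean


def simulate(n, tau_s, tau_y, seed):
    """Draw a toy observational sample (a, s, y) and experimental sample (a, s)."""
    rng = random.Random(seed)
    obs, exp = [], []
    for _ in range(n):
        # Observational domain: latent u drives treatment uptake and both outcomes.
        u = rng.gauss(0.0, 1.0)
        a = 1 if rng.random() < 1.0 / (1.0 + math.exp(-u)) else 0
        s = u + tau_s * a + rng.gauss(0.0, 1.0)  # short-term outcome
        y = u + tau_y * a + rng.gauss(0.0, 1.0)  # long-term outcome
        obs.append((a, s, y))
        # Experimental domain: same u distribution (external validity),
        # randomized treatment (internal validity), long-term outcome unobserved.
        u = rng.gauss(0.0, 1.0)
        a = 1 if rng.random() < 0.5 else 0
        s = u + tau_s * a + rng.gauss(0.0, 1.0)
        exp.append((a, s))
    return obs, exp


def ett_equi_confounding(obs, exp):
    """Return (naive contrast, bias-corrected ETT) for the observational domain."""
    p1 = fmean(a for a, _, _ in obs)                # P(A = 1) in observational data
    y1 = fmean(y for a, _, y in obs if a == 1)      # E[Y | A = 1]
    y0 = fmean(y for a, _, y in obs if a == 0)      # E[Y | A = 0]
    s0_ctrl = fmean(s for a, s, _ in obs if a == 0)  # E[S(0) | A = 0]
    s0_all = fmean(s for a, s in exp if a == 0)      # E[S(0)], from the experiment
    # Law of total expectation, solved for the unobserved E[S(0) | A = 1].
    s0_treated = (s0_all - (1.0 - p1) * s0_ctrl) / p1
    # Confounding bias in S(0); assumed equal to the bias in Y(0).
    bias = s0_treated - s0_ctrl
    naive = y1 - y0
    return naive, naive - bias


if __name__ == "__main__":
    obs, exp = simulate(n=100_000, tau_s=0.5, tau_y=2.0, seed=0)
    naive, ett = ett_equi_confounding(obs, exp)
    print(f"naive contrast: {naive:.3f}, corrected ETT: {ett:.3f}, truth: 2.0")
```

In this toy design the naive treated-versus-control contrast in the observational data is expected to overstate the effect, because `u` raises both treatment uptake and the long-term outcome; the corrected estimate should land close to the simulated effect `tau_y` when the sample is large.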

Suggested Citation

  • AmirEmad Ghassami & Alan Yang & David Richardson & Ilya Shpitser & Eric Tchetgen Tchetgen, 2022. "Combining Experimental and Observational Data for Identification and Estimation of Long-Term Causal Effects," Papers 2201.10743, arXiv.org, revised Apr 2022.
  • Handle: RePEc:arx:papers:2201.10743

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2201.10743
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. David Card, 1990. "The Impact of the Mariel Boatlift on the Miami Labor Market," ILR Review, Cornell University, ILR School, vol. 43(2), pages 245-257, January.
    2. Susan Athey & Guido W. Imbens, 2006. "Identification and Inference in Nonlinear Difference-in-Differences Models," Econometrica, Econometric Society, vol. 74(2), pages 431-497, March.
    3. Wang Miao & Zhi Geng & Eric J Tchetgen Tchetgen, 2018. "Identifying causal effects with proxy variables of an unmeasured confounder," Biometrika, Biometrika Trust, vol. 105(4), pages 987-993.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Guido Imbens & Nathan Kallus & Xiaojie Mao & Yuhao Wang, 2022. "Long-term Causal Inference Under Persistent Confounding via Data Combination," Papers 2202.07234, arXiv.org, revised Aug 2024.
    2. Harsh Parikh & Marco Morucci & Vittorio Orlandi & Sudeepa Roy & Cynthia Rudin & Alexander Volfovsky, 2023. "A Double Machine Learning Approach to Combining Experimental and Observational Data," Papers 2307.01449, arXiv.org, revised Apr 2024.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rahul Singh, 2020. "Kernel Methods for Unobserved Confounding: Negative Controls, Proxies, and Instruments," Papers 2012.10315, arXiv.org, revised Mar 2023.
    2. Susan Athey & Raj Chetty & Guido Imbens, 2020. "Combining Experimental and Observational Data to Estimate Treatment Effects on Long Term Outcomes," Papers 2006.09676, arXiv.org.
    3. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    4. Rösner, Anja & Haucap, Justus & Heimeshoff, Ulrich, 2020. "The impact of consumer protection in the digital age: Evidence from the European Union," International Journal of Industrial Organization, Elsevier, vol. 73(C).
    5. Peter Hull & Michal Kolesár & Christopher Walters, 2022. "Labor by design: contributions of David Card, Joshua Angrist, and Guido Imbens," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 603-645, July.
    6. Athey, Susan & Imbens, Guido W., 2022. "Design-based analysis in Difference-In-Differences settings with staggered adoption," Journal of Econometrics, Elsevier, vol. 226(1), pages 62-79.
    7. Erlend E. Bø & Joel Slemrod & Thor O. Thoresen, 2015. "Taxes on the Internet: Deterrence Effects of Public Disclosure," American Economic Journal: Economic Policy, American Economic Association, vol. 7(1), pages 36-62, February.
    8. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    9. Christian Aleman & Christopher Busch & Alexander Ludwig & Raul Santaeulalia-Llopis, 2022. "A Stage-Based Identification of Policy Effects," PIER Working Paper Archive 22-026, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
    10. Lechner, Michael, 2011. "The Estimation of Causal Effects by Difference-in-Difference Methods," Foundations and Trends® in Econometrics, now publishers, vol. 4(3), pages 165-224, November.
    11. Herrero Prieto, Luis César, 2009. "La investigación en economía de la cultura en España: un estudio bibliométrico/Research in Cultural Economics in Spain: A Bibliometric Study," Estudios de Economia Aplicada, Estudios de Economia Aplicada, vol. 27, pages 35-62, April.
    12. Bryan S. Graham & James Powell, 2008. "Identification and Estimation of 'Irregular' Correlated Random Coefficient Models," NBER Working Papers 14469, National Bureau of Economic Research, Inc.
    13. van der Klaauw, Bas, 2014. "From micro data to causality: Forty years of empirical labor economics," Labour Economics, Elsevier, vol. 30(C), pages 88-97.
    14. Timothy G. Conley & Christopher R. Taber, 2011. "Inference with 'Difference in Differences' with a Small Number of Policy Changes," The Review of Economics and Statistics, MIT Press, vol. 93(1), pages 113-125, February.
    15. Dionysia Lambiri & Alessandra Faggian & Neil Wrigley, 2017. "Linked-trip effects of ‘town-centre-first’ era foodstore development: An assessment using difference-in-differences," Environment and Planning B, vol. 44(1), pages 160-179, January.
    16. Committee, Nobel Prize, 2021. "Answering causal questions using observational data," Nobel Prize in Economics documents 2021-2, Nobel Prize Committee.
    17. Nikolay Doudchenko & Guido W. Imbens, 2016. "Balancing, Regression, Difference-In-Differences and Synthetic Control Methods: A Synthesis," NBER Working Papers 22791, National Bureau of Economic Research, Inc.
    18. Greta Laage & Emma Frejinger & Andrea Lodi & Guillaume Rabusseau, 2021. "Assessing the Impact: Does an Improvement to a Revenue Management System Lead to an Improved Revenue?," Papers 2101.10249, arXiv.org, revised Jun 2021.
    19. Michael Zimmert, 2018. "Efficient Difference-in-Differences Estimation with High-Dimensional Common Trend Confounding," Papers 1809.01643, arXiv.org, revised Aug 2020.
    20. Fernando Díaz & Pablo A Henríquez, 2021. "Social sentiment segregation: Evidence from Twitter and Google Trends in Chile during the COVID-19 dynamic quarantine strategy," PLOS ONE, Public Library of Science, vol. 16(7), pages 1-29, July.
