
Causal Matrix Completion

Author

Listed:
  • Anish Agarwal
  • Munther Dahleh
  • Devavrat Shah
  • Dennis Shen

Abstract

Matrix completion is the study of recovering an underlying matrix from a sparse subset of noisy observations. Traditionally, it is assumed that the entries of the matrix are "missing completely at random" (MCAR), i.e., each entry is revealed at random, independent of everything else, with uniform probability. This is likely unrealistic due to the presence of "latent confounders", i.e., unobserved factors that determine both the entries of the underlying matrix and the missingness pattern in the observed matrix. For example, in the context of movie recommender systems -- a canonical application for matrix completion -- a user who vehemently dislikes horror films is unlikely to ever watch horror films. In general, these confounders yield "missing not at random" (MNAR) data, which can severely impact any inference procedure that does not correct for this bias. We develop a formal causal model for matrix completion through the language of potential outcomes, and provide novel identification arguments for a variety of causal estimands of interest. We design a procedure, which we call "synthetic nearest neighbors" (SNN), to estimate these causal estimands. We prove finite-sample consistency and asymptotic normality of our estimator. Our analysis also leads to new theoretical results for the matrix completion literature. In particular, we establish entry-wise, i.e., max-norm, finite-sample consistency and asymptotic normality results for matrix completion with MNAR data. As a special case, this also provides entry-wise bounds for matrix completion with MCAR data. Across simulated and real data, we demonstrate the efficacy of our proposed estimator.
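The confounding described in the abstract can be illustrated with a small simulation. The following is only a sketch under assumed settings and does not reproduce the paper's SNN procedure: the rank-1 model, the logistic reveal probability, the noise level, and the function name `observed_mean_bias` are all hypothetical choices made for illustration. It compares the naive mean of the observed entries under MCAR sampling with the same statistic when the latent factors that generate the ratings also drive which ratings are revealed (MNAR).

```python
import numpy as np


def observed_mean_bias(n_users=500, n_items=200, seed=0):
    """Compare the naive observed-entry mean under MCAR and MNAR sampling."""
    rng = np.random.default_rng(seed)

    # Hypothetical rank-1 "taste" model: the same latent factors drive
    # both the ratings and (under MNAR) which ratings get revealed.
    user_factor = rng.normal(size=n_users)
    item_factor = rng.normal(size=n_items)
    signal = np.outer(user_factor, item_factor)
    noisy = signal + 0.1 * rng.normal(size=signal.shape)

    # MCAR: every entry is revealed independently with probability 0.3.
    mcar_mask = rng.random(signal.shape) < 0.3

    # MNAR (assumed mechanism): users are more likely to rate items they
    # would enjoy, so the reveal probability increases with the signal.
    reveal_prob = 1.0 / (1.0 + np.exp(-2.0 * signal))
    mnar_mask = rng.random(signal.shape) < reveal_prob

    true_mean = signal.mean()
    return {
        "mcar": noisy[mcar_mask].mean() - true_mean,
        "mnar": noisy[mnar_mask].mean() - true_mean,
    }


bias = observed_mean_bias()
print(f"MCAR bias: {bias['mcar']:+.3f}")
print(f"MNAR bias: {bias['mnar']:+.3f}")
```

Under these assumptions the MCAR bias should sit close to zero, while the MNAR bias should be clearly positive, because high-rating entries are over-represented among the observations. Any procedure that treats the observed entries as a uniform sample inherits this bias.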

Suggested Citation

  • Anish Agarwal & Munther Dahleh & Devavrat Shah & Dennis Shen, 2021. "Causal Matrix Completion," Papers 2109.15154, arXiv.org.
  • Handle: RePEc:arx:papers:2109.15154

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2109.15154
    File Function: Latest version
    Download Restriction: no


    Citations



    Cited by:

    1. Choi, Jungjun & Kwon, Hyukjun & Liao, Yuan, 2024. "Inference for low-rank completion without sample splitting with application to treatment effect estimation," Journal of Econometrics, Elsevier, vol. 240(1).
    2. Sandro Heiniger, 2024. "Data-driven model selection within the matrix completion method for causal panel data models," Papers 2402.01069, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dennis Shen & Peng Ding & Jasjeet Sekhon & Bin Yu, 2022. "Same Root Different Leaves: Time Series and Cross-Sectional Methods in Panel Data," Papers 2207.14481, arXiv.org, revised Oct 2022.
    2. Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Jun 2024.
    3. Anish Agarwal & Vasilis Syrgkanis, 2022. "Synthetic Blip Effects: Generalizing Synthetic Controls for the Dynamic Treatment Regime," Papers 2210.11003, arXiv.org.
    4. Stefano, Roberta di & Mellace, Giovanni, 2020. "The inclusive synthetic control method," Discussion Papers on Economics 14/2020, University of Southern Denmark, Department of Economics.
    5. Sandro Heiniger, 2024. "Data-driven model selection within the matrix completion method for causal panel data models," Papers 2402.01069, arXiv.org.
    6. David Gilchrist & Thomas Emery & Nuno Garoupa & Rok Spruk, 2023. "Synthetic Control Method: A tool for comparative case studies in economic history," Journal of Economic Surveys, Wiley Blackwell, vol. 37(2), pages 409-445, April.
    7. Florian Gunsilius, 2020. "Distributional synthetic controls," Papers 2001.06118, arXiv.org, revised Dec 2021.
    8. repec:oup:emjrnl:v:25:y:2022:i:1:p:46-70. is not listed on IDEAS
    9. Luis Costa & Vivek F. Farias & Patricio Foncea & Jingyuan (Donna) Gan & Ayush Garg & Ivo Rosa Montenegro & Kumarjit Pathak & Tianyi Peng & Dusan Popovic, 2023. "Generalized Synthetic Control for TestOps at ABI: Models, Algorithms, and Infrastructure," Interfaces, INFORMS, vol. 53(5), pages 336-349, September.
    10. Guido W. Imbens & Davide Viviano, 2023. "Identification and Inference for Synthetic Controls with Confounding," Papers 2312.00955, arXiv.org.
    11. Keegan Harris & Anish Agarwal & Chara Podimata & Zhiwei Steven Wu, 2022. "Strategyproof Decision-Making in Panel Data Settings and Beyond," Papers 2211.14236, arXiv.org, revised Dec 2023.
    12. Dmitry Arkhangelsky & Aleksei Samkov, 2024. "Sequential Synthetic Difference in Differences," Papers 2404.00164, arXiv.org.
    13. Viviano, Davide & Bradic, Jelena, 2023. "Synthetic Learner: Model-free inference on treatments over time," Journal of Econometrics, Elsevier, vol. 234(2), pages 691-713.
    14. Li, Xingyu & Shen, Yan & Zhou, Qiankun, 2024. "Confidence intervals of treatment effects in panel data models with interactive fixed effects," Journal of Econometrics, Elsevier, vol. 240(1).
    15. Nicolaj N. Mühlbach, 2020. "Tree-based Synthetic Control Methods: Consequences of moving the US Embassy," CREATES Research Papers 2020-04, Department of Economics and Business Economics, Aarhus University.
    16. Sadeghi, Ali & Kibler, Ewald, 2022. "Do bankruptcy laws matter for entrepreneurship? A Synthetic Control Method analysis of a bankruptcy reform in Finland," Journal of Business Venturing Insights, Elsevier, vol. 18(C).
    17. Gonzalez, Felipe & Prem, Mounu, 2020. "Police Repression and Protest Behavior: Evidence from Student Protests in Chile," SocArXiv 3xk5r, Center for Open Science.
    18. Di, Wenhua & Pattison, Nathaniel, 2023. "Industry Specialization and Small Business Lending," Journal of Banking & Finance, Elsevier, vol. 149(C).
    19. Cummins, Joseph & Miller, Douglas L. & Smith, Brock & Simon, David, 2024. "Matching on Noise: Finite Sample Bias in the Synthetic Control Estimator," Journal of Econometric Methods, De Gruyter, vol. 13(1), pages 67-95, January.
    20. Tomasz Serwach, 2023. "The European Union and within‐country income inequalities. The case of the new member states," The World Economy, Wiley Blackwell, vol. 46(7), pages 1890-1939, July.
    21. Michał Marcin Kobierecki & Michał Pierzgalski, 2022. "Sports Mega-Events and Economic Growth: A Synthetic Control Approach," Journal of Sports Economics, vol. 23(5), pages 567-597, June.
