
Identification and Formal Privacy Guarantees

Authors

  • Tatiana Komarova
  • Denis Nekipelov

Abstract

Empirical economic research crucially relies on highly sensitive individual datasets. At the same time, the increasing availability of public individual-level data makes it possible for adversaries to re-identify anonymized records in sensitive research datasets. The most commonly accepted formal definition of an individual non-disclosure guarantee is differential privacy. It restricts researchers' interaction with the data to issuing queries. The differential privacy mechanism then replaces the actual outcome of each query with a randomised outcome. The impact of differential privacy on the identification of empirical economic models and on the performance of estimators in nonlinear empirical econometric models has not been sufficiently studied. Since privacy protection mechanisms are inherently finite-sample procedures, we define identifiability of the parameter of interest under differential privacy as a property of the limit of experiments. It is naturally characterized by concepts from random set theory. We show that particular instances of regression discontinuity design may be problematic for inference with differential privacy, as the parameters turn out to be neither point nor partially identified. The set of differentially private estimators converges weakly to a random set. Our analysis suggests that many other estimators that rely on nuisance parameters may have similar properties under the requirement of differential privacy. We show that identification becomes possible if the target parameter can be deterministically located within the random set. In that case, a full exploration of the random set of weak limits of differentially private estimators can allow the data curator to select a sequence of instances of differentially private estimators converging to the target parameter in probability.
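
As a rough illustration of the query-and-randomise step described in the abstract, the Python sketch below answers a mean query with added Laplace noise. The Laplace mechanism is a standard textbook construction and is an assumption here: the abstract does not say which mechanism the paper analyses. The names `laplace_noise` and `private_mean`, the clipping bounds, and the convention that neighbouring datasets differ by replacing one record (with the sample size treated as public) are all hypothetical choices made for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws, multiplied by `scale`,
    # follows a Laplace distribution with location 0 and that scale.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_mean(values, lower, upper, epsilon, rng):
    """Return a noisy mean of `values` clipped to [lower, upper].

    Illustrative sketch only: assumes the Laplace mechanism and
    replace-one-record neighbouring datasets of known size.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not values:
        raise ValueError("values must be non-empty")
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    true_mean = sum(clipped) / n
    # Replacing one record moves the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [0.2, 0.4, 0.6, 0.8]
    # The curator releases only the randomised answer, never the exact mean.
    print(private_mean(data, lower=0.0, upper=1.0, epsilon=1.0, rng=rng))
```

In this sketch the noise scale is `(upper - lower) / (n * epsilon)`, so it depends on the sample size `n`; that dependence is one simple way to see why such mechanisms are described as finite-sample procedures.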

Suggested Citation

  • Tatiana Komarova & Denis Nekipelov, 2020. "Identification and Formal Privacy Guarantees," Papers 2006.14732, arXiv.org, revised May 2021.
  • Handle: RePEc:arx:papers:2006.14732

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2006.14732
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    2. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    3. Ivan A Canay & Vishal Kamat, 2018. "Approximate Permutation Tests and Induced Order Statistics in the Regression Discontinuity Design," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 85(3), pages 1577-1608.
    4. Christopher S. Carpenter & Carlos Dobkin & Casey Warman, 2016. "The Mechanisms of Alcohol Control," Journal of Human Resources, University of Wisconsin Press, vol. 51(2), pages 328-356.
    5. Prakash, Nishith & Rockmore, Marc & Uppal, Yogesh, 2019. "Do criminally accused politicians affect economic outcomes? Evidence from India," Journal of Development Economics, Elsevier, vol. 141(C).
    6. Joaquín Artés & Ignacio Jurado, 2018. "Government fragmentation and fiscal deficits: a regression discontinuity approach," Public Choice, Springer, vol. 175(3), pages 367-391, June.
    7. Gurgand, Marc & Lorenceau, Adrien & Mélonio, Thomas, 2023. "Student loans: Credit constraints and higher education in South Africa," Journal of Development Economics, Elsevier, vol. 161(C).
    8. Carta, Francesca & Rizzica, Lucia, 2018. "Early kindergarten, maternal labor supply and children's outcomes: Evidence from Italy," Journal of Public Economics, Elsevier, vol. 158(C), pages 79-102.
    9. Kaniel, Ron & Parham, Robert, 2017. "WSJ Category Kings – The impact of media attention on consumer and mutual fund investment decisions," Journal of Financial Economics, Elsevier, vol. 123(2), pages 337-356.
    10. Crespo Cristian, 2020. "Beyond Manipulation: Administrative Sorting in Regression Discontinuity Designs," Journal of Causal Inference, De Gruyter, vol. 8(1), pages 164-181, January.
    11. Mauricio Villamizar‐Villegas & Freddy A. Pinzon‐Puerto & Maria Alejandra Ruiz‐Sanchez, 2022. "A comprehensive history of regression discontinuity designs: An empirical survey of the last 60 years," Journal of Economic Surveys, Wiley Blackwell, vol. 36(4), pages 1130-1178, September.
    12. Dehejia Rajeev, 2015. "Experimental and Non-Experimental Methods in Development Economics: A Porous Dialectic," Journal of Globalization and Development, De Gruyter, vol. 6(1), pages 47-69, June.
    13. Chad D. Meyerhoefer & Muzhe Yang, 2011. "The Relationship between Food Assistance and Health: A Review of the Literature and Empirical Strategies for Identifying Program Effects," Applied Economic Perspectives and Policy, Agricultural and Applied Economics Association, vol. 33(3), pages 304-344.
    14. Hsu, Yu-Chin & Shiu, Ji-Liang & Wan, Yuanyuan, 2024. "Testing identification conditions of LATE in fuzzy regression discontinuity designs," Journal of Econometrics, Elsevier, vol. 241(1).
    15. Montoya, Ana Maria & Noton, Carlos & Solis, Alex, 2018. "The Returns to College Choice: Loans, Scholarships and Labor Outcomes," Working Paper Series 2018:12, Uppsala University, Department of Economics.
    16. Blaise Melly & Rafael Lalive, 2020. "Estimation, Inference, and Interpretation in the Regression Discontinuity Design," Diskussionsschriften dp2016, Universitaet Bern, Departement Volkswirtschaft.
    17. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    18. Davezies, Laurent & Le Barbanchon, Thomas, 2017. "Regression discontinuity design with continuous measurement error in the running variable," Journal of Econometrics, Elsevier, vol. 200(2), pages 260-281.
    19. Yoichi Arai & Yu‐Chin Hsu & Toru Kitagawa & Ismael Mourifié & Yuanyuan Wan, 2022. "Testing identifying assumptions in fuzzy regression discontinuity designs," Quantitative Economics, Econometric Society, vol. 13(1), pages 1-28, January.
    20. Bartalotti Otávio, 2019. "Regression Discontinuity and Heteroskedasticity Robust Standard Errors: Evidence from a Fixed-Bandwidth Approximation," Journal of Econometric Methods, De Gruyter, vol. 8(1), pages 1-26, January.
