IDEAS home Printed from https://ideas.repec.org/p/nbr/nberwo/32979.html
   My bibliography  Save this paper

Synthetic Data and Social Science Research: Accuracy Assessments and Practical Considerations from the SIPP Synthetic Beta

Author

Listed:
  • Jordan C. Stanley
  • Evan S. Totty

Abstract

Synthetic microdata – data retaining the structure of original microdata while replacing original values with modeled values for the sake of privacy – presents an opportunity to increase access to useful microdata for data users while meeting the privacy and confidentiality requirements for data providers. Synthetic data could be sufficient for many purposes, but lingering accuracy concerns could be addressed with a validation system through which the data providers run the external researcher’s code on the internal data and share cleared output with the researcher. The U.S. Census Bureau has experience running such systems. In this chapter, we first describe the role of synthetic data within a tiered data access system and the importance of synthetic data accuracy in achieving a viable synthetic data product. Next, we review results from a recent set of empirical analyses we conducted to assess accuracy in the Survey of Income & Program Participation (SIPP) Synthetic Beta (SSB), a Census Bureau product that made linked survey-administrative data publicly available. Given this analysis and our experience working on the SSB project, we conclude with thoughts and questions regarding future implementations of synthetic data with validation.

Suggested Citation

  • Jordan C. Stanley & Evan S. Totty, 2024. "Synthetic Data and Social Science Research: Accuracy Assessments and Practical Considerations from the SIPP Synthetic Beta," NBER Working Papers 32979, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:32979
    Note: LS TWP
    as

    Download full text from publisher

    File URL: http://www.nber.org/papers/w32979.pdf
    Download Restriction: Access to the full text is generally limited to series subscribers, however if the top level domain of the client browser is in a developing country or transition economy free access is provided. More information about subscriptions and free access is available at http://www.nber.org/wwphelp.html. Free access is also available to older working papers.
    ---><---

    As the access to this document is restricted, you may want to look for a different version below or search for a different version of it.

    Other versions of this item:

    References listed on IDEAS

    as
    1. Abel Brodeur & Scott Carrell & David Figlio & Lester Lusher, 2023. "Unpacking P-hacking and Publication Bias," American Economic Review, American Economic Association, vol. 113(11), pages 2974-3002, November.
    2. Abel Brodeur & Nikolai Cook & Anthony Heyes, 2020. "Methods Matter: p-Hacking and Publication Bias in Causal Analysis in Economics," American Economic Review, American Economic Association, vol. 110(11), pages 3634-3660, November.
    3. Abel Brodeur & Mathias Lé & Marc Sangnier & Yanos Zylberberg, 2016. "Star Wars: The Empirics Strike Back," American Economic Journal: Applied Economics, American Economic Association, vol. 8(1), pages 1-32, January.
    4. Drechsler, Jörg & Reiter, Jerome P., 2011. "An empirical evaluation of easily implemented, nonparametric methods for generating synthetic datasets," Computational Statistics & Data Analysis, Elsevier, vol. 55(12), pages 3232-3243, December.
    5. Lucas C. Coffman & Muriel Niederle, 2015. "Pre-analysis Plans Have Limited Upside, Especially Where Replications Are Feasible," Journal of Economic Perspectives, American Economic Association, vol. 29(3), pages 81-98, Summer.
    6. Thibaut Arpinon & Romain Espinosa, 2023. "A practical guide to Registered Reports for economists," Journal of the Economic Science Association, Springer;Economic Science Association, vol. 9(1), pages 90-122, June.
    7. Thibaut Arpinon & Romain Espinosa, 2023. "A Practical Guide to Registered Reports for Economists," Post-Print halshs-03897719, HAL.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anna Dreber & Magnus Johannesson & Yifan Yang, 2024. "Selective reporting of placebo tests in top economics journals," Economic Inquiry, Western Economic Association International, vol. 62(3), pages 921-932, July.
    2. Thibaut Arpinon & Marianne Lefebvre, 2024. "Registered Reports and Associated Benefits for Agricultural Economics," Post-Print hal-04635986, HAL.
    3. Dreber, Anna & Johannesson, Magnus, 2023. "A framework for evaluating reproducibility and replicability in economics," Ruhr Economic Papers 1055, RWI - Leibniz-Institut für Wirtschaftsforschung, Ruhr-University Bochum, TU Dortmund University, University of Duisburg-Essen.
    4. Abel Brodeur & Nikolai M. Cook & Jonathan S. Hartley & Anthony Heyes, 2024. "Do Preregistration and Preanalysis Plans Reduce p-Hacking and Publication Bias? Evidence from 15,992 Test Statistics and Suggestions for Improvement," Journal of Political Economy Microeconomics, University of Chicago Press, vol. 2(3), pages 527-561.
    5. Brodeur, Abel & Cook, Nikolai & Hartley, Jonathan & Heyes, Anthony, 2022. "Do Pre-Registration and Pre-analysis Plans Reduce p-Hacking and Publication Bias?," MetaArXiv uxf39, Center for Open Science.
    6. Sarah A. Janzen & Jeffrey D. Michler, 2021. "Ulysses' pact or Ulysses' raft: Using pre‐analysis plans in experimental and nonexperimental research," Applied Economic Perspectives and Policy, John Wiley & Sons, vol. 43(4), pages 1286-1304, December.
    7. Bruno Ferman & Cristine Pinto & Vitor Possebom, 2020. "Cherry Picking with Synthetic Controls," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 39(2), pages 510-532, March.
    8. Romain Espinosa & Thibaut Arpinon & Paco Maginot & Sébastien Demange & Florimond Peureux, 2024. "Removing barriers to plant-based diets: assisting doctors with vegan patients," Post-Print hal-04479493, HAL.
    9. Stefano DellaVigna & Elizabeth Linos, 2022. "RCTs to Scale: Comprehensive Evidence From Two Nudge Units," Econometrica, Econometric Society, vol. 90(1), pages 81-116, January.
    10. Bruns, Stephan & Herwartz, Helmut & Ioannidis, John P.A. & Islam, Chris-Gabriel & Raters, Fabian H. C., 2023. "Statistical reporting errors in economics," MetaArXiv mbx62, Center for Open Science.
    11. Christoph Huber & Christian König-Kersting & Matteo M. Marini, 2022. "Experimenting with Financial Professionals," Working Papers 2022-07, Faculty of Economics and Statistics, Universität Innsbruck, revised Jun 2024.
    12. Abel Brodeur, Nikolai M. Cook, Anthony Heyes, 2022. "We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell Us about Publication Bias and p-Hacking in Online Experiments," LCERPA Working Papers am0133, Laurier Centre for Economic Research and Policy Analysis.
    13. Uwe Hassler & Marc‐Oliver Pohle, 2022. "Unlucky Number 13? Manipulating Evidence Subject to Snooping," International Statistical Review, International Statistical Institute, vol. 90(2), pages 397-410, August.
    14. Adam Gorajek & Joel Bank & Andrew Staib & Benjamin Malin & Hamish Fitchett, 2021. "Star Wars at Central Banks," RBA Research Discussion Papers rdp2021-02, Reserve Bank of Australia.
    15. Jasper Brinkerink, 2023. "When Shooting for the Stars Becomes Aiming for Asterisks: P-Hacking in Family Business Research," Entrepreneurship Theory and Practice, , vol. 47(2), pages 304-343, March.
    16. Gechert, Sebastian & Mey, Bianka & Opatrny, Matej & Havranek, Tomas & Stanley, T. D. & Bom, Pedro R. D. & Doucouliagos, Hristos & Heimberger, Philipp & Irsova, Zuzana & Rachinger, Heiko J., 2023. "Conventional Wisdom, Meta-Analysis, and Research Revision in Economics," EconStor Preprints 280745, ZBW - Leibniz Information Centre for Economics.
    17. Lucas C. Coffman & Muriel Niederle & Alistair J. Wilson, 2017. "A Proposal to Organize and Promote Replications," American Economic Review, American Economic Association, vol. 107(5), pages 41-45, May.
    18. Graham Elliott & Nikolay Kudrin & Kaspar Wüthrich, 2022. "Detecting p‐Hacking," Econometrica, Econometric Society, vol. 90(2), pages 887-906, March.
    19. Alexander L. Brown & Taisuke Imai & Ferdinand M. Vieider & Colin F. Camerer, 2024. "Meta-analysis of Empirical Estimates of Loss Aversion," Journal of Economic Literature, American Economic Association, vol. 62(2), pages 485-516, June.
    20. Burlig, Fiona, 2018. "Improving transparency in observational social science research: A pre-analysis plan approach," Economics Letters, Elsevier, vol. 168(C), pages 56-60.

    More about this item

    JEL classification:

    • C42 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Survey Methods
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • J01 - Labor and Demographic Economics - - General - - - Labor Economics: General
    • J1 - Labor and Demographic Economics - - Demographic Economics

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:nbr:nberwo:32979. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: the person in charge (email available below). General contact details of provider: https://edirc.repec.org/data/nberrus.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.