
Synthetic Data Generation for Economists

Authors

Listed:
  • Allison Koenecke
  • Hal Varian

Abstract

As more tech companies engage in rigorous economic analyses, we are confronted with a data problem: in-house papers cannot be replicated due to use of sensitive, proprietary, or private data. Readers are left to assume that the obscured true data (e.g., internal Google information) indeed produced the results given, or they must seek out comparable public-facing data (e.g., Google Trends) that yield similar results. One way to ameliorate this reproducibility issue is to have researchers release synthetic datasets based on their true data; this allows external parties to replicate an internal researcher's methodology. In this brief overview, we explore synthetic data generation at a high level for economic analyses.

Suggested Citation

  • Allison Koenecke & Hal Varian, 2020. "Synthetic Data Generation for Economists," Papers 2011.01374, arXiv.org, revised Nov 2020.
  • Handle: RePEc:arx:papers:2011.01374

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2011.01374
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Athey, Susan & Imbens, Guido W. & Metzger, Jonas & Munro, Evan, 2024. "Using Wasserstein Generative Adversarial Networks for the design of Monte Carlo simulations," Journal of Econometrics, Elsevier, vol. 240(2).
    2. Seth Stephens-Davidowitz & Hal Varian & Michael D. Smith, 2017. "Super returns to Super Bowl ads?," Quantitative Marketing and Economics (QME), Springer, vol. 15(1), pages 1-28, March.

    Citations


    Cited by:

    1. Yves-Cédric Bauwelinckx & Jan Dhaene & Tim Verdonck & Milan van den Heuvel, 2023. "On the causality-preservation capabilities of generative modelling," Papers 2301.01109, arXiv.org.
    2. Sonan Memon, 2022. "Inflation in Pakistan: High-Frequency Estimation and Forecasting," PIDE-Working Papers 2022:12, Pakistan Institute of Development Economics.
    3. Stefan Wimmer & Robert Finger, 2023. "A note on synthetic data for replication purposes in agricultural economics," Journal of Agricultural Economics, Wiley Blackwell, vol. 74(1), pages 316-323, February.
    4. Vansh Murad Kalia, 2024. "Packing Peanuts: The Role Synthetic Data Can Play in Enhancing Conventional Economic Prediction Models," Papers 2405.07431, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nir Billfeld & Moshe Kim, 2024. "Context-dependent Causality (the Non-Monotonic Case)," Papers 2404.05021, arXiv.org.
    2. Jesus Fernandez-Villaverde, 2020. "Simple Rules for a Complex World with Artificial Intelligence," PIER Working Paper Archive 20-010, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
    3. Samuel Nocito & Marcello Sartarelli & Francesco Sobbrio, 2021. "A Beam of Light: Media, Tourism & Economic Development," CESifo Working Paper Series 9055, CESifo.
    4. Wesley R. Hartmann & Daniel Klapper, 2018. "Super Bowl Ads," Marketing Science, INFORMS, vol. 37(1), pages 78-96, January.
    5. Chen He & Tobias J. Klein, 2023. "Advertising as a Reminder: Evidence from the Dutch State Lottery," Marketing Science, INFORMS, vol. 42(5), pages 892-909, September.
    6. Nocito, Samuel & Sartarelli, Marcello & Sobbrio, Francesco, 2023. "A beam of light: Media, tourism and economic development," Journal of Urban Economics, Elsevier, vol. 137(C).
    7. Jonas Metzger, 2022. "Adversarial Estimators," Papers 2204.10495, arXiv.org, revised Jun 2022.
    8. Christian M. Dahl & Emil N. Sørensen, 2021. "Time Series (re)sampling using Generative Adversarial Networks," Papers 2102.00208, arXiv.org.
    9. Yves-Cédric Bauwelinckx & Jan Dhaene & Tim Verdonck & Milan van den Heuvel, 2023. "On the causality-preservation capabilities of generative modelling," Papers 2301.01109, arXiv.org.
    10. Michael Thomas, 2020. "Spillovers from Mass Advertising: An Identification Strategy," Marketing Science, INFORMS, vol. 39(4), pages 807-826, July.
    11. Tengyuan Liang, 2020. "How Well Generative Adversarial Networks Learn Distributions," Working Papers 2020-154, Becker Friedman Institute for Research In Economics.
    12. Jesús Fernández-Villaverde, 2021. "Has machine learning rendered simple rules obsolete?," European Journal of Law and Economics, Springer, vol. 52(2), pages 251-265, December.
    13. Jiaying Gu & Roger Koenker, 2020. "Invidious Comparisons: Ranking and Selection as Compound Decisions," Papers 2012.12550, arXiv.org, revised Sep 2021.
    14. Luis Aguiar & Zhizhong Chen, 2024. "Let that Sync in: The Effect of Music Reuse on Product Discovery," CESifo Working Paper Series 11249, CESifo.
    15. Max H. Farrell & Tengyuan Liang & Sanjog Misra, 2021. "Deep Neural Networks for Estimation and Inference," Econometrica, Econometric Society, vol. 89(1), pages 181-213, January.
    16. Jiafeng Chen & Xiaohong Chen & Elie Tamer, 2021. "Efficient Estimation in NPIV Models: A Comparison of Various Neural Networks-Based Estimators," Papers 2110.06763, arXiv.org, revised Oct 2022.
    17. He, Chen, 2018. "Essays on the role and effects of advertising," Other publications TiSEM 47a3272a-54f1-4a90-9714-c, Tilburg University, School of Economics and Management.
    18. Jiaying Gu & Roger Koenker, 2023. "Invidious Comparisons: Ranking and Selection as Compound Decisions," Econometrica, Econometric Society, vol. 91(1), pages 1-41, January.
    19. Christian M. Dahl & Torben S. D. Johansen & Emil N. Sørensen & Christian E. Westermann & Simon F. Wittrock, 2021. "Applications of Machine Learning in Document Digitisation," Papers 2102.03239, arXiv.org.
    20. Kevin Han & Han Wu & Linjia Wu & Yu Shi & Canyao Liu, 2024. "Estimating Treatment Effects Using Observational Data and Experimental Data with Non-Overlapping Support," Econometrics, MDPI, vol. 12(3), pages 1-11, September.
