
“Re-make/Re-model”: Should big data change the modelling paradigm in official statistics?

Author

Listed:
  • Braaksma, Barteld
  • Zeelenberg, Kees

Abstract

Big data offer many opportunities for official statistics: for example, increased resolution, better timeliness, and new statistical outputs. But there are also many challenges: uncontrolled changes in sources that threaten continuity, a lack of identifiers that impedes linking to population frames, and data that refer only indirectly to phenomena of statistical interest. We discuss two approaches to dealing with these challenges and opportunities. First, we may accept big data for what they are: an imperfect, yet timely, indicator of phenomena in society. These data exist, and that is why they are interesting. Second, we may extend this approach by explicit modelling. New methods such as machine-learning techniques can be considered alongside more traditional methods such as Bayesian techniques. National statistical institutes (NSIs) have always been reluctant to use models, apart from specific cases such as small-area estimates. Based on the experience at Statistics Netherlands, we argue that NSIs should not be afraid to use models, provided that their use is documented and made transparent to users. Moreover, the primary purpose of an NSI is to describe society; we should refrain from making forecasts. The models used should therefore rely on actually observed data, and they should be validated extensively.

Suggested Citation

  • Braaksma, Barteld & Zeelenberg, Kees, 2015. "“Re-make/Re-model”: Should big data change the modelling paradigm in official statistics?," MPRA Paper 87741, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:87741

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/87741/1/MPRA_paper_87741.pdf
    File Function: original version
    Download Restriction: no

    References listed on IDEAS

    1. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    2. Christian Reimsbach-Kounatze, 2015. "The Proliferation of “Big Data” and Implications for Official Statistics and Statistical Agencies: A Preliminary Analysis," OECD Digital Economy Papers 245, OECD Publishing.
    3. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    4. David W. Nickerson & Todd Rogers, 2014. "Political Campaigns and Big Data," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 51-74, Spring.
    5. Daas, Piet J.H. & Puts, Marco J.H., 2014. "Social media sentiment and consumer confidence," Statistics Paper Series 5, European Central Bank.

    Citations


    Cited by:

    1. Iacus Stefano M. & Salini Silvia & Siletti Elena & Porro Giuseppe, 2020. "Controlling for Selection Bias in Social Media Indicators through Official Statistics: a Proposal," Journal of Official Statistics, Sciendo, vol. 36(2), pages 315-338, June.
    2. Kuurstra, Douwe & Zeelenberg, Kees, 2018. "Statistical quality by design: certification, rules and culture," MPRA Paper 88227, University Library of Munich, Germany.
    3. George Kapetanios & Fotis Papailias, 2018. "Big Data & Macroeconomic Nowcasting: Methodological Review," Economic Statistics Centre of Excellence (ESCoE) Discussion Papers ESCoE DP-2018-12, Economic Statistics Centre of Excellence (ESCoE).
    4. Andrés Vallone & Coro Chasco & Beatriz Sánchez, 2020. "Strategies to access web-enabled urban spatial data for socioeconomic research using R functions," Journal of Geographical Systems, Springer, vol. 22(2), pages 217-239, April.
    5. Zeelenberg, Kees & Ypma, Winfried & Struijs, Peter, 2018. "Quality management of methodology and process development for official statistics," MPRA Paper 88610, University Library of Munich, Germany.
    6. Markus Zwick, 2016. "Statistikausbildung in Zeiten von Big Data [Statistical education in times of Big Data]," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 10(2), pages 127-139, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ajay Agrawal & Joshua Gans & Avi Goldfarb, 2018. "Prediction, Judgment, and Complexity: A Theory of Decision-Making and Artificial Intelligence," NBER Chapters, in: The Economics of Artificial Intelligence: An Agenda, pages 89-110, National Bureau of Economic Research, Inc.
    2. Joyce P Jacobsen & Laurence M Levin & Zachary Tausanovitch, 2016. "Comparing Standard Regression Modeling to Ensemble Modeling: How Data Mining Software Can Improve Economists’ Predictions," Eastern Economic Journal, Palgrave Macmillan;Eastern Economic Association, vol. 42(3), pages 387-398, June.
    3. Michael C. Knaus & Michael Lechner & Anthony Strittmatter, 2022. "Heterogeneous Employment Effects of Job Search Programs: A Machine Learning Approach," Journal of Human Resources, University of Wisconsin Press, vol. 57(2), pages 597-636.
    4. Serena Ng, 2017. "Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data," NBER Working Papers 23673, National Bureau of Economic Research, Inc.
    5. Whitaker, Stephan D., 2018. "Big Data versus a survey," The Quarterly Review of Economics and Finance, Elsevier, vol. 67(C), pages 285-296.
    6. Achim Ahrens, 2015. "Civil conflicts in Africa: Climate, economic shocks, nighttime lights and spill-over effects," SEEC Discussion Papers 1501, Spatial Economics and Econometrics Centre, Heriot Watt University.
    7. Byron Botha & Rulof Burger & Kevin Kotzé & Neil Rankin & Daan Steenkamp, 2023. "Big data forecasting of South African inflation," Empirical Economics, Springer, vol. 65(1), pages 149-188, July.
    8. Georges, Christophre & Pereira, Javier, 2021. "Market stability with machine learning agents," Journal of Economic Dynamics and Control, Elsevier, vol. 122(C).
    9. Cerulli, Giovanni, 2020. "A Super-Learning Machine for Predicting Economic Outcomes," MPRA Paper 99111, University Library of Munich, Germany.
    10. Andreas Fuster & Paul Goldsmith‐Pinkham & Tarun Ramadorai & Ansgar Walther, 2022. "Predictably Unequal? The Effects of Machine Learning on Credit Markets," Journal of Finance, American Finance Association, vol. 77(1), pages 5-47, February.
    11. Shengying Zhai & Qihui Chen & Wenxin Wang, 2019. "What Drives Green Fodder Supply in China?—A Nerlovian Analysis with LASSO Variable Selection," Sustainability, MDPI, vol. 11(23), pages 1-17, November.
    12. Francesco Bloise & Paolo Brunori & Patrizio Piraino, 2021. "Estimating intergenerational income mobility on sub-optimal data: a machine learning approach," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 19(4), pages 643-665, December.
    13. Arthur Acolin & Ari Decter-Frain & Matt Hall, 2022. "Small-area estimates from consumer trace data," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 47(27), pages 843-882.
    14. Akash Malhotra, 2021. "A hybrid econometric–machine learning approach for relative importance analysis: prioritizing food policy," Eurasian Economic Review, Springer;Eurasia Business and Economics Society, vol. 11(3), pages 549-581, September.
    15. Jermain C. Kaminski & Christian Hopp, 2020. "Predicting outcomes in crowdfunding campaigns with textual, visual, and linguistic signals," Small Business Economics, Springer, vol. 55(3), pages 627-649, October.
    16. Croux, Christophe & Jagtiani, Julapa & Korivi, Tarunsai & Vulanovic, Milos, 2020. "Important factors determining Fintech loan default: Evidence from a lendingclub consumer platform," Journal of Economic Behavior & Organization, Elsevier, vol. 173(C), pages 270-296.
    17. Green, Gareth & Richards, Timothy, 2016. "Interpreting Results of Demand Estimation from Machine Learning Models," 2016 Annual Meeting, July 31-August 2, Boston, Massachusetts 236147, Agricultural and Applied Economics Association.
    18. McKenzie, David & Sansone, Dario, 2017. "Man vs. Machine in Predicting Successful Entrepreneurs: Evidence from a Business Plan Competition in Nigeria," CEPR Discussion Papers 12523, C.E.P.R. Discussion Papers.
    19. Andini, Monica & Boldrini, Michela & Ciani, Emanuele & de Blasio, Guido & D'Ignazio, Alessio & Paladini, Andrea, 2022. "Machine learning in the service of policy targeting: The case of public credit guarantees," Journal of Economic Behavior & Organization, Elsevier, vol. 198(C), pages 434-475.
    20. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.

    More about this item

    Keywords

    Big data; model-based statistics

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.