Printed from https://ideas.repec.org/p/fip/fednls/87258.html

Economic Predictions with Big Data: The Illusion of Sparsity

Authors

  • Domenico Giannone
  • Michele Lenza
  • Giorgio E. Primiceri

Abstract

The availability of large data sets, combined with advances in the fields of statistics, machine learning, and econometrics, has generated interest in forecasting models that include many possible predictive variables. Are economic data sufficiently informative to warrant selecting a handful of the most useful predictors from this larger pool of variables? This post documents that they usually are not, based on applications in macroeconomics, microeconomics, and finance.

Suggested Citation

  • Domenico Giannone & Michele Lenza & Giorgio E. Primiceri, 2018. "Economic Predictions with Big Data: The Illusion of Sparsity," Liberty Street Economics 20180521, Federal Reserve Bank of New York.
  • Handle: RePEc:fip:fednls:87258

    Download full text from publisher

    File URL: https://libertystreeteconomics.newyorkfed.org/2018/05/economic-predictions-with-big-data-the-illusion-of-sparsity.html
    Download Restriction: no

    References listed on IDEAS

    1. Ivo Welch & Amit Goyal, 2008. "A Comprehensive Look at The Empirical Performance of Equity Premium Prediction," The Review of Financial Studies, Society for Financial Studies, vol. 21(4), pages 1455-1508, July.
    2. Domenico Giannone & Michele Lenza & Giorgio E. Primiceri, 2021. "Economic Predictions With Big Data: The Illusion of Sparsity," Econometrica, Econometric Society, vol. 89(5), pages 2409-2437, September.
    3. David Rapach & Jack Strauss, 2010. "Bagging or Combining (or Both)? An Analysis Based on Forecasting U.S. Employment Growth," Econometric Reviews, Taylor & Francis Journals, vol. 29(5-6), pages 511-533.
    4. Leamer, Edward E, 1973. "Multicollinearity: A Bayesian Interpretation," The Review of Economics and Statistics, MIT Press, vol. 55(3), pages 371-380, August.
    5. Carlos M. Carvalho & Nicholas G. Polson & James G. Scott, 2010. "The horseshoe estimator for sparse signals," Biometrika, Biometrika Trust, vol. 97(2), pages 465-480.
    6. A. Chudik & G. Kapetanios & M. Hashem Pesaran, 2018. "A One Covariate at a Time, Multiple Testing Approach to Variable Selection in High‐Dimensional Linear Regression Models," Econometrica, Econometric Society, vol. 86(4), pages 1479-1512, July.
    7. Michael W. McCracken & Serena Ng, 2016. "FRED-MD: A Monthly Database for Macroeconomic Research," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 574-589, October.
    8. Inoue, Atsushi & Kilian, Lutz, 2008. "How Useful Is Bagging in Forecasting Economic Time Series? A Case Study of U.S. Consumer Price Inflation," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 511-522, June.
    9. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Sparse estimators and the oracle property, or the return of Hodges' estimator," Journal of Econometrics, Elsevier, vol. 142(1), pages 201-211, January.
    10. Robert J. Barro, 1991. "Economic Growth in a Cross Section of Countries," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 106(2), pages 407-443.
    11. Barro, Robert J. & Lee, Jong-Wha, 1994. "Sources of economic growth," Carnegie-Rochester Conference Series on Public Policy, Elsevier, vol. 40(1), pages 1-46, June.
    12. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    13. John J. Donohue III & Steven D. Levitt, 2001. "The Impact of Legalized Abortion on Crime," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 116(2), pages 379-420.
    14. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    15. Jonathan H. Wright, 2009. "Forecasting US inflation by Bayesian model averaging," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 28(2), pages 131-144.
    16. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    17. Jon Faust & Simon Gilchrist & Jonathan H. Wright & Egon Zakrajšek, 2013. "Credit Spreads as Predictors of Real-Time Economic Activity: A Bayesian Model-Averaging Approach," The Review of Economics and Statistics, MIT Press, vol. 95(5), pages 1501-1519, December.
    18. Carmen Fernandez & Eduardo Ley & Mark F. J. Steel, 2001. "Model uncertainty in cross-country growth regressions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 16(5), pages 563-576.
    19. Serena Ng, 2014. "Viewpoint: Boosting Recessions," Canadian Journal of Economics, Canadian Economics Association, vol. 47(1), pages 1-34, February.
    20. Bańbura, Marta & Giannone, Domenico & Lenza, Michele, 2015. "Conditional forecasts and scenario analysis with vector autoregressions for large cross-sections," International Journal of Forecasting, Elsevier, vol. 31(3), pages 739-756.
    21. Susan Athey & Mohsen Bayati & Guido Imbens & Zhaonan Qu, 2019. "Ensemble Methods for Causal Effects in Panel Data Settings," AEA Papers and Proceedings, American Economic Association, vol. 109, pages 65-70, May.
    22. Alberto Abadie & Maximilian Kasy, 2019. "Choosing Among Regularized Estimators in Empirical Economics: The Risk of Machine Learning," The Review of Economics and Statistics, MIT Press, vol. 101(5), pages 743-762, December.
    23. A. Belloni & V. Chernozhukov & L. Wang, 2011. "Square-root lasso: pivotal recovery of sparse signals via conic programming," Biometrika, Biometrika Trust, vol. 98(4), pages 791-806.
    24. Park, Trevor & Casella, George, 2008. "The Bayesian Lasso," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 681-686, June.
    25. Kozak, Serhiy & Nagel, Stefan & Santosh, Shrihari, 2020. "Shrinking the cross-section," Journal of Financial Economics, Elsevier, vol. 135(2), pages 271-292.
    26. Joachim Freyberger & Andreas Neuhierl & Michael Weber, 2020. "Dissecting Characteristics Nonparametrically," The Review of Financial Studies, Society for Financial Studies, vol. 33(5), pages 2326-2377.
    27. De Mol, Christine & Giannone, Domenico & Reichlin, Lucrezia, 2008. "Forecasting using a large number of predictors: Is Bayesian shrinkage a valid alternative to principal components?," Journal of Econometrics, Elsevier, vol. 146(2), pages 318-328, October.
    28. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Can One Estimate The Unconditional Distribution Of Post-Model-Selection Estimators?," Econometric Theory, Cambridge University Press, vol. 24(2), pages 338-376, April.
    29. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2011. "Inference for High-Dimensional Sparse Econometric Models," Papers 1201.0220, arXiv.org.
    30. Liang, Feng & Paulo, Rui & Molina, German & Clyde, Merlise A. & Berger, Jim O., 2008. "Mixtures of g Priors for Bayesian Variable Selection," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 410-423, March.
    31. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    32. Jushan Bai & Serena Ng, 2009. "Boosting diffusion indices," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(4), pages 607-629.
    33. Leeb, Hannes & Pötscher, Benedikt M., 2005. "Model Selection And Inference: Facts And Fiction," Econometric Theory, Cambridge University Press, vol. 21(1), pages 21-59, February.
    34. Ng, Serena, 2013. "Variable Selection in Predictive Regressions," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 752-789, Elsevier.
    35. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach," Annual Review of Economics, Annual Reviews, vol. 7(1), pages 649-688, August.
    36. Sainan Jin & Liangjun Su & Aman Ullah, 2014. "Robustify Financial Time Series Forecasting with Bagging," Econometric Reviews, Taylor & Francis Journals, vol. 33(5-6), pages 575-605, August.
    37. Xavier Sala-I-Martin & Gernot Doppelhofer & Ronald I. Miller, 2004. "Determinants of Long-Term Growth: A Bayesian Averaging of Classical Estimates (BACE) Approach," American Economic Review, American Economic Association, vol. 94(4), pages 813-835, September.
    38. K. J. Martijn Cremers, 2002. "Stock Return Predictability: A Bayesian Model Selection Perspective," The Review of Financial Studies, Society for Financial Studies, vol. 15(4), pages 1223-1249.
    39. Stock J.H. & Watson M.W., 2002. "Forecasting Using Principal Components From a Large Number of Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1167-1179, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lenza, Michele & Moutachaker, Inès & Paredes, Joan, 2023. "Density forecasts of inflation: a quantile regression forest approach," CEPR Discussion Papers 18298, C.E.P.R. Discussion Papers.
    2. Borup, Daniel & Christensen, Bent Jesper & Mühlbach, Nicolaj Søndergaard & Nielsen, Mikkel Slot, 2023. "Targeting predictors in random forest regression," International Journal of Forecasting, Elsevier, vol. 39(2), pages 841-868.
    3. Lee, Ji Hyung & Shi, Zhentao & Gao, Zhan, 2022. "On LASSO for predictive regression," Journal of Econometrics, Elsevier, vol. 229(2), pages 322-349.
    4. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    5. Daniele Bianchi & Kenichiro McAlinn, 2018. "Large-Scale Dynamic Predictive Regressions," Papers 1803.06738, arXiv.org.
    6. Cheng, Xu & Hansen, Bruce E., 2015. "Forecasting with factor-augmented regression: A frequentist model averaging approach," Journal of Econometrics, Elsevier, vol. 186(2), pages 280-293.
    7. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2019. "Valid Post-Selection Inference in High-Dimensional Approximately Sparse Quantile Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 749-758, April.
    9. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.
    10. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    11. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    12. Adamek, Robert & Smeekes, Stephan & Wilms, Ines, 2023. "Lasso inference for high-dimensional time series," Journal of Econometrics, Elsevier, vol. 235(2), pages 1114-1143.
    13. Byron Botha & Rulof Burger & Kevin Kotzé & Neil Rankin & Daan Steenkamp, 2023. "Big data forecasting of South African inflation," Empirical Economics, Springer, vol. 65(1), pages 149-188, July.
    14. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Robust inference in high-dimensional approximately sparse quantile regression models," CeMMAP working papers 70/13, Institute for Fiscal Studies.
    15. Dimitris Korobilis, 2018. "Machine Learning Macroeconometrics: A Primer," Working Paper series 18-30, Rimini Centre for Economic Analysis.
    16. Mykola Babiak & Jozef Barunik, 2020. "Deep Learning, Predictability, and Optimal Portfolio Returns," CERGE-EI Working Papers wp677, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    17. Hoang, Daniel & Wiegratz, Kevin, 2022. "Machine learning methods in finance: Recent applications and prospects," Working Paper Series in Economics 158, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
    18. Croux, Christophe & Jagtiani, Julapa & Korivi, Tarunsai & Vulanovic, Milos, 2020. "Important factors determining Fintech loan default: Evidence from a lendingclub consumer platform," Journal of Economic Behavior & Organization, Elsevier, vol. 173(C), pages 270-296.
    19. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    20. Yuan Liao & Xinjie Ma & Andreas Neuhierl & Zhentao Shi, 2023. "Economic Forecasts Using Many Noises," Papers 2312.05593, arXiv.org, revised Dec 2023.

    More about this item

    Keywords

    Shrinkage; High Dimensional Data; Model Selection;

    JEL classification:

    • E17 - Macroeconomics and Monetary Economics - - General Aggregative Models - - - Forecasting and Simulation: Models and Applications
