
Forecasting the 2013–2014 Influenza Season Using Wikipedia

Author

Listed:
  • Kyle S Hickmann
  • Geoffrey Fairchild
  • Reid Priedhorsky
  • Nicholas Generous
  • James M Hyman
  • Alina Deshpande
  • Sara Y Del Valle

Abstract

Infectious diseases are one of the leading causes of morbidity and mortality around the world; thus, forecasting their impact is crucial for planning an effective response strategy. According to the Centers for Disease Control and Prevention (CDC), seasonal influenza affects 5% to 20% of the U.S. population and causes major economic impacts resulting from hospitalization and absenteeism. Understanding influenza dynamics and forecasting its impact is fundamental for developing prevention and mitigation strategies. We combine modern data assimilation methods with Wikipedia access logs and CDC influenza-like illness (ILI) reports to create a weekly forecast for seasonal influenza. The methods are applied to the 2013–2014 influenza season but are sufficiently general to forecast any disease outbreak, given incidence or case count data. We adjust the initialization and parametrization of a disease model and show that this allows us to determine systematic model bias. In addition, we provide a way to determine where the model diverges from observation and evaluate forecast accuracy. Wikipedia article access logs are shown to be highly correlated with historical ILI records and allow for accurate prediction of ILI data several weeks before it becomes available. The results show that prior to the peak of the flu season, our forecasting method produced 50% and 95% credible intervals for the 2013–2014 ILI observations that contained the actual observations for most weeks in the forecast. However, since our model does not account for re-infection or multiple strains of influenza, the tail of the epidemic is not predicted well after the peak of flu season has passed.

Author Summary

We use modern methods for injecting current data into epidemiological models in order to offer a probabilistic evaluation of the future influenza state in the U.S. population. This type of disease forecasting is still in its infancy, but as these methods become more developed they will allow for increasingly robust control measures to react to and prevent large disease outbreaks. While weather forecasting has steadily improved over the last half century and become ubiquitous in modern life, there is surprisingly little work on infectious disease forecasting. Although there has been a great deal of work in modeling disease dynamics, these models have seldom been used to generate a probabilistic description of expected future dynamics, given current public health data. Moreover, the mechanism to update expected disease outcomes as new data becomes available is just beginning to receive attention from the public health community. Using CDC influenza-like illness reports and digital monitoring sources, such as observations of Wikipedia article access logs, we are now at a point where forecasting for the influenza season can begin to offer useful information for disease monitoring and mitigation.
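
The idea of updating a disease model with incoming observations and reporting 50% and 95% credible intervals can be illustrated with a small sketch. The text above does not name the disease model or the assimilation algorithm, so everything below is an assumption made for illustration: a discrete-time SIR model, uniform priors on hypothetical parameters (beta, gamma, i0), a Gaussian observation error (obs_sd), and simple importance weighting of ensemble members in place of whatever data assimilation scheme the authors used. It is not the method of the paper.

```python
import math
import random


def simulate_sir(beta, gamma, i0, weeks):
    """Discrete-time SIR model (assumed); returns weekly infectious fractions."""
    s, i = 1.0 - i0, i0
    series = []
    for _ in range(weeks):
        new_inf = min(beta * s * i, s)  # cannot infect more than are susceptible
        new_rec = gamma * i
        s -= new_inf
        i += new_inf - new_rec
        series.append(i)
    return series


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalized weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cum = 0.0
    for value, weight in pairs:
        cum += weight / total
        if cum >= q:
            return value
    return pairs[-1][0]


def forecast_intervals(observed, horizon, n_members=500, obs_sd=0.005, seed=0):
    """Weight an ensemble by its fit to past data, then summarize future weeks."""
    rng = random.Random(seed)
    weeks = len(observed) + horizon
    members, log_weights = [], []
    for _ in range(n_members):
        # Hypothetical prior ranges, chosen only for this illustration.
        beta = rng.uniform(0.5, 3.0)
        gamma = rng.uniform(0.2, 0.9)
        i0 = rng.uniform(0.0005, 0.01)
        traj = simulate_sir(beta, gamma, i0, weeks)
        sq_err = sum((m - o) ** 2 for m, o in zip(traj, observed))
        log_weights.append(-sq_err / (2.0 * obs_sd ** 2))  # Gaussian log-likelihood
        members.append(traj)
    # Subtract the maximum before exponentiating to avoid underflow to all zeros.
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    bands = []
    for week in range(len(observed), weeks):
        values = [traj[week] for traj in members]
        bands.append({
            "lo95": weighted_quantile(values, weights, 0.025),
            "lo50": weighted_quantile(values, weights, 0.25),
            "median": weighted_quantile(values, weights, 0.5),
            "hi50": weighted_quantile(values, weights, 0.75),
            "hi95": weighted_quantile(values, weights, 0.975),
        })
    return bands


if __name__ == "__main__":
    # Synthetic "observations" generated from the same model, not real ILI data.
    past = simulate_sir(1.5, 0.6, 0.002, 8)
    for step, band in enumerate(forecast_intervals(past, horizon=4), start=1):
        print(step, band)
```

Running the script prints one set of interval bounds per forecast week. A forecast of the kind described above would be judged by how often later observations fall inside the 50% and 95% bands; a simple SIR model like this one, which has no re-infection or multiple strains, would be expected to share the limitation the abstract notes for the tail of the epidemic.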

Suggested Citation

  • Kyle S Hickmann & Geoffrey Fairchild & Reid Priedhorsky & Nicholas Generous & James M Hyman & Alina Deshpande & Sara Y Del Valle, 2015. "Forecasting the 2013–2014 Influenza Season Using Wikipedia," PLOS Computational Biology, Public Library of Science, vol. 11(5), pages 1-29, May.
  • Handle: RePEc:plo:pcbi00:1004239
    DOI: 10.1371/journal.pcbi.1004239

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004239
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004239&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1004239?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Phillip Stroud & Sara Del Valle & Stephen Sydoriak & Jane Riese & Susan Mniszewski, 2007. "Spatial Dynamics of Pandemic Influenza in a Massive Artificial Society," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 10(4), pages 1-9.
    2. Elaine O Nsoesie & Richard J Beckman & Sara Shashaani & Kalyani S Nagaraj & Madhav V Marathe, 2013. "A Simulation Optimization Approach to Epidemic Forecasting," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-10, June.
    3. Luís M A Bettencourt & Ruy M Ribeiro, 2008. "Real Time Bayesian Estimation of the Epidemic Potential of Emerging Infectious Diseases," PLOS ONE, Public Library of Science, vol. 3(5), pages 1-9, May.
    4. Drew Creal, 2012. "A Survey of Sequential Monte Carlo Methods for Economics and Finance," Econometric Reviews, Taylor & Francis Journals, vol. 31(3), pages 245-296.
    5. Paolo Bajardi & Chiara Poletto & Jose J Ramasco & Michele Tizzoni & Vittoria Colizza & Alessandro Vespignani, 2011. "Human Mobility Networks, Travel Restrictions, and the Global Spread of 2009 H1N1 Pandemic," PLOS ONE, Public Library of Science, vol. 6(1), pages 1-8, January.
    6. Wan Yang & Alicia Karspeck & Jeffrey Shaman, 2014. "Comparison of Filtering Methods for the Modeling and Retrospective Forecasting of Influenza Epidemics," PLOS Computational Biology, Public Library of Science, vol. 10(4), pages 1-15, April.
    7. Jean-Paul Chretien & Dylan George & Jeffrey Shaman & Rohit A Chitale & F Ellis McKenzie, 2014. "Influenza Forecasting in Human Populations: A Scoping Review," PLOS ONE, Public Library of Science, vol. 9(4), pages 1-8, April.
    8. Nicholas Generous & Geoffrey Fairchild & Alina Deshpande & Sara Y Del Valle & Reid Priedhorsky, 2014. "Global Disease Monitoring and Forecasting with Wikipedia," PLOS Computational Biology, Public Library of Science, vol. 10(11), pages 1-16, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christoph Zimmer & Reza Yaesoubi & Ted Cohen, 2017. "A Likelihood Approach for Real-Time Calibration of Stochastic Compartmental Epidemic Models," PLOS Computational Biology, Public Library of Science, vol. 13(1), pages 1-21, January.
    2. Zeynep Ertem & Dorrie Raymond & Lauren Ancel Meyers, 2018. "Optimal multi-source forecasting of seasonal influenza," PLOS Computational Biology, Public Library of Science, vol. 14(9), pages 1-16, September.
    3. Ibrahim Musa & Hyun Woo Park & Lkhagvadorj Munkhdalai & Keun Ho Ryu, 2018. "Global Research on Syndromic Surveillance from 1993 to 2017: Bibliometric Analysis and Visualization," Sustainability, MDPI, vol. 10(10), pages 1-20, September.
    4. Nicholas G Reich & Craig J McGowan & Teresa K Yamana & Abhinav Tushar & Evan L Ray & Dave Osthus & Sasikiran Kandula & Logan C Brooks & Willow Crawford-Crudell & Graham Casey Gibson & Evan Moore & Reb, 2019. "Accuracy of real-time multi-model ensemble forecasts for seasonal influenza in the U.S," PLOS Computational Biology, Public Library of Science, vol. 15(11), pages 1-19, November.
    5. Michal Ben-Nun & Pete Riley & James Turtle & David P Bacon & Steven Riley, 2019. "Forecasting national and regional influenza-like illness for the USA," PLOS Computational Biology, Public Library of Science, vol. 15(5), pages 1-20, May.
    6. Logan C Brooks & David C Farrow & Sangwon Hyun & Ryan J Tibshirani & Roni Rosenfeld, 2015. "Flexible Modeling of Epidemics with an Empirical Bayes Framework," PLOS Computational Biology, Public Library of Science, vol. 11(8), pages 1-18, August.
    7. Teresa K Yamana & Sasikiran Kandula & Jeffrey Shaman, 2017. "Individual versus superensemble forecasts of seasonal influenza outbreaks in the United States," PLOS Computational Biology, Public Library of Science, vol. 13(11), pages 1-17, November.
    8. Evan L Ray & Nicholas G Reich, 2018. "Prediction of infectious disease epidemics via weighted density ensembles," PLOS Computational Biology, Public Library of Science, vol. 14(2), pages 1-23, February.
    9. Logan C Brooks & David C Farrow & Sangwon Hyun & Ryan J Tibshirani & Roni Rosenfeld, 2018. "Nonmechanistic forecasts of seasonal influenza with iterative one-week-ahead distributions," PLOS Computational Biology, Public Library of Science, vol. 14(6), pages 1-29, June.
    10. Avanzi, Benjamin & Taylor, Greg & Vu, Phuong Anh & Wong, Bernard, 2020. "A multivariate evolutionary generalised linear model framework with adaptive estimation for claims reserving," Insurance: Mathematics and Economics, Elsevier, vol. 93(C), pages 50-71.
    11. Mumtaz, Haroon & Theodoridis, Konstantinos, 2017. "Common and country specific economic uncertainty," Journal of International Economics, Elsevier, vol. 105(C), pages 205-216.
    12. Wen Xu, 2016. "Estimation of Dynamic Panel Data Models with Stochastic Volatility Using Particle Filters," Econometrics, MDPI, vol. 4(4), pages 1-13, October.
    13. S. M. Mniszewski & S. Y. Del Valle & P. D. Stroud & J. M. Riese & S. J. Sydoriak, 2008. "Pandemic simulation of antivirals + school closures: buying time until strain-specific vaccine is available," Computational and Mathematical Organization Theory, Springer, vol. 14(3), pages 209-221, September.
    14. Ioannis Bournakis & Mike Tsionas, 2024. "A Non‐parametric Estimation of Productivity with Idiosyncratic and Aggregate Shocks: The Role of Research and Development (R&D) and Corporate Tax," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 86(3), pages 641-671, June.
    15. S. Bogan Aruoba & Pablo Cuba-Borda & Kenji Higa-Flores & Frank Schorfheide & Sergio Villalvazo, 2021. "Piecewise-Linear Approximations and Filtering for DSGE Models with Occasionally Binding Constraints," Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 41, pages 96-120, July.
    16. Arellano, Manuel & Blundell, Richard & Bonhomme, Stéphane & Light, Jack, 2024. "Heterogeneity of consumption responses to income shocks in the presence of nonlinear persistence," Journal of Econometrics, Elsevier, vol. 240(2).
    17. O. Samimi & Z. Mardani & S. Sharafpour & F. Mehrdoust, 2017. "LSM Algorithm for Pricing American Option Under Heston–Hull–White’s Stochastic Volatility Model," Computational Economics, Springer;Society for Computational Economics, vol. 50(2), pages 173-187, August.
    18. Kuchler, Theresa & Russel, Dominic & Stroebel, Johannes, 2022. "JUE Insight: The geographic spread of COVID-19 correlates with the structure of social networks as measured by Facebook," Journal of Urban Economics, Elsevier, vol. 127(C).
    19. De Simone, Andrea & Piangerelli, Marco, 2020. "A Bayesian approach for monitoring epidemics in presence of undetected cases," Chaos, Solitons & Fractals, Elsevier, vol. 140(C).
    20. Leif Anders Thorsrud, 2016. "Nowcasting using news topics Big Data versus big bank," Working Papers No 6/2016, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
