
Optimal multi-source forecasting of seasonal influenza

Author

Listed:
  • Zeynep Ertem
  • Dorrie Raymond
  • Lauren Ancel Meyers

Abstract

Forecasting the emergence and spread of influenza viruses is an important public health challenge. Timely and accurate estimates of influenza prevalence, particularly of severe cases requiring hospitalization, can improve control measures to reduce transmission and mortality. Here, we extend a previously published machine learning method for influenza forecasting to integrate multiple diverse data sources, including traditional surveillance data, electronic health records, internet search traffic, and social media activity. Our hierarchical framework uses multi-linear regression to combine forecasts from multiple data sources and greedy optimization with forward selection to sequentially choose the most predictive combinations of data sources. We show that the systematic integration of complementary data sources can substantially improve forecast accuracy over single data sources. When forecasting the Centers for Disease Control and Prevention (CDC) influenza-like-illness reports (ILINet) from week 48 through week 20, the optimal combination of predictors includes public health surveillance data and commercially available electronic medical records, but neither search engine nor social media data.

Author summary: In the United States, seasonal influenza causes thousands of deaths and hundreds of thousands of hospitalizations. The annual timing and burden of the flu season vary considerably with the severity of the circulating viruses. Epidemic forecasting can inform early and effective countermeasures to limit the human toll of severe seasonal and pandemic influenza. With a growing toolkit of sophisticated statistical methods and the recent explosion of influenza-related data, we can now systematically match models to data to achieve timely and accurate warning as flu epidemics emerge, peak and subside. Here, we introduce a framework for identifying optimal combinations of data sources, and show that public health surveillance data and electronic health records collectively forecast seasonal influenza better than any single data source alone and better than influenza-related search engine and social media data.
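
The abstract describes combining data sources by multi-linear regression and choosing them by greedy forward selection. The following is a minimal sketch of that general idea, not the authors' implementation: the data are synthetic, and the function names (`holdout_rmse`, `forward_select`), the source labels, the chronological train/test split, and the `min_gain` stopping threshold are all assumptions made for illustration.

```python
import numpy as np

def holdout_rmse(X_train, y_train, X_test, y_test):
    """Fit least squares with an intercept; return RMSE on the held-out rows."""
    A_train = np.column_stack([np.ones(len(y_train)), X_train])
    A_test = np.column_stack([np.ones(len(y_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.sqrt(np.mean((y_test - A_test @ coef) ** 2)))

def forward_select(sources, y, n_train, min_gain=0.01):
    """Greedy forward selection over named predictor series.

    At each step, add the source that gives the lowest held-out RMSE when
    regressed jointly with the sources already chosen. Stop when the best
    candidate improves the RMSE by no more than `min_gain` (hypothetical rule).
    """
    chosen, best = [], float("inf")
    remaining = list(sources)
    while remaining:
        scores = {}
        for name in remaining:
            cols = np.column_stack([sources[s] for s in chosen + [name]])
            scores[name] = holdout_rmse(
                cols[:n_train], y[:n_train], cols[n_train:], y[n_train:]
            )
        candidate = min(scores, key=scores.get)
        if best - scores[candidate] <= min_gain:
            break
        chosen.append(candidate)
        best = scores[candidate]
        remaining.remove(candidate)
    return chosen, best

# Synthetic weekly data (illustrative only, not real influenza data).
rng = np.random.default_rng(0)
n_weeks, n_train = 208, 156
weeks = np.arange(n_weeks)
seasonal = np.sin(2 * np.pi * weeks / 52)   # component seen by one source
local = rng.normal(size=n_weeks)            # component seen by another source
target = seasonal + local + rng.normal(scale=0.1, size=n_weeks)

sources = {
    "surveillance": seasonal + rng.normal(scale=0.1, size=n_weeks),
    "ehr": local + rng.normal(scale=0.1, size=n_weeks),
    "search": rng.normal(size=n_weeks),     # uninformative by construction
    "social": rng.normal(size=n_weeks),     # uninformative by construction
}

chosen, best = forward_select(sources, target, n_train)
print("selected sources:", chosen)
print("held-out RMSE:", round(best, 3))
```

In this toy setup the two complementary sources each capture a different part of the target, so the combination has a much lower held-out error than either alone. Whether the uninformative sources are excluded depends on the stopping threshold, which is a design choice of this sketch and not something the abstract specifies.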

Suggested Citation

  • Zeynep Ertem & Dorrie Raymond & Lauren Ancel Meyers, 2018. "Optimal multi-source forecasting of seasonal influenza," PLOS Computational Biology, Public Library of Science, vol. 14(9), pages 1-16, September.
  • Handle: RePEc:plo:pcbi00:1006236
    DOI: 10.1371/journal.pcbi.1006236

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006236
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006236&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Samuel V Scarpino & Nedialko B Dimitrov & Lauren Ancel Meyers, 2012. "Optimizing Provider Recruitment for Influenza Surveillance Networks," PLOS Computational Biology, Public Library of Science, vol. 8(4), pages 1-12, April.
    2. Andrea Freyer Dugas & Mehdi Jalalpour & Yulia Gel & Scott Levin & Fred Torcaso & Takeru Igusa & Richard E Rothman, 2013. "Influenza Forecasting with Google Flu Trends," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-7, February.
    3. Jose L Herrera & Ravi Srinivasan & John S Brownstein & Alison P Galvani & Lauren Ancel Meyers, 2016. "Disease Surveillance on Complex Social Networks," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-16, July.
    4. Samir Bhatt & Peter W. Gething & Oliver J. Brady & Jane P. Messina & Andrew W. Farlow & Catherine L. Moyes & John M. Drake & John S. Brownstein & Anne G. Hoen & Osman Sankoh & Monica F. Myers & Dylan , 2013. "The global distribution and burden of dengue," Nature, Nature, vol. 496(7446), pages 504-507, April.
    5. Logan C Brooks & David C Farrow & Sangwon Hyun & Ryan J Tibshirani & Roni Rosenfeld, 2015. "Flexible Modeling of Epidemics with an Empirical Bayes Framework," PLOS Computational Biology, Public Library of Science, vol. 11(8), pages 1-18, August.
    6. Jeremy Ginsberg & Matthew H. Mohebbi & Rajan S. Patel & Lynnette Brammer & Mark S. Smolinski & Larry Brilliant, 2009. "Detecting influenza epidemics using search engine query data," Nature, Nature, vol. 457(7232), pages 1012-1014, February.
    7. Cynthia Chew & Gunther Eysenbach, 2010. "Pandemics in the Age of Twitter: Content Analysis of Tweets during the 2009 H1N1 Outbreak," PLOS ONE, Public Library of Science, vol. 5(11), pages 1-13, November.
    8. Declan Butler, 2013. "When Google got flu wrong," Nature, Nature, vol. 494(7436), pages 155-156, February.
    9. Jean-Paul Chretien & Dylan George & Jeffrey Shaman & Rohit A Chitale & F Ellis McKenzie, 2014. "Influenza Forecasting in Human Populations: A Scoping Review," PLOS ONE, Public Library of Science, vol. 9(4), pages 1-8, April.
    10. Nicholas Generous & Geoffrey Fairchild & Alina Deshpande & Sara Y Del Valle & Reid Priedhorsky, 2014. "Global Disease Monitoring and Forecasting with Wikipedia," PLOS Computational Biology, Public Library of Science, vol. 10(11), pages 1-16, November.
    11. David A Broniatowski & Michael J Paul & Mark Dredze, 2013. "National and Local Influenza Surveillance through Twitter: An Analysis of the 2012-2013 Influenza Epidemic," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-1, December.

    Citations

    Cited by:

    1. Jialiang Liu & Sumihiro Suzuki, 2022. "Real-Time Detection of Flu Season Onset: A Novel Approach to Flu Surveillance," IJERPH, MDPI, vol. 19(6), pages 1-9, March.
    2. John M Drake & Tobias S Brett & Shiyang Chen & Bogdan I Epureanu & Matthew J Ferrari & Éric Marty & Paige B Miller & Eamon B O’Dea & Suzanne M O’Regan & Andrew W Park & Pejman Rohani, 2019. "The statistics of epidemic transitions," PLOS Computational Biology, Public Library of Science, vol. 15(5), pages 1-14, May.
    3. Prashant Rangarajan & Sandeep K Mody & Madhav Marathe, 2019. "Forecasting dengue and influenza incidences using a sparse representation of Google trends, electronic health records, and time series data," PLOS Computational Biology, Public Library of Science, vol. 15(11), pages 1-24, November.
    4. Samuel V Scarpino & James G Scott & Rosalind M Eggo & Bruce Clements & Nedialko B Dimitrov & Lauren Ancel Meyers, 2020. "Socioeconomic bias in influenza surveillance," PLOS Computational Biology, Public Library of Science, vol. 16(7), pages 1-19, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ibrahim Musa & Hyun Woo Park & Lkhagvadorj Munkhdalai & Keun Ho Ryu, 2018. "Global Research on Syndromic Surveillance from 1993 to 2017: Bibliometric Analysis and Visualization," Sustainability, MDPI, vol. 10(10), pages 1-20, September.
    2. Logan C Brooks & David C Farrow & Sangwon Hyun & Ryan J Tibshirani & Roni Rosenfeld, 2018. "Nonmechanistic forecasts of seasonal influenza with iterative one-week-ahead distributions," PLOS Computational Biology, Public Library of Science, vol. 14(6), pages 1-29, June.
    3. Jose L Herrera & Ravi Srinivasan & John S Brownstein & Alison P Galvani & Lauren Ancel Meyers, 2016. "Disease Surveillance on Complex Social Networks," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-16, July.
    4. Samuel V Scarpino & James G Scott & Rosalind M Eggo & Bruce Clements & Nedialko B Dimitrov & Lauren Ancel Meyers, 2020. "Socioeconomic bias in influenza surveillance," PLOS Computational Biology, Public Library of Science, vol. 16(7), pages 1-19, July.
    5. Fantazzini, Dean, 2020. "Short-term forecasting of the COVID-19 pandemic using Google Trends data: Evidence from 158 countries," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 59, pages 33-54.
    6. Valentina Lorenzoni & Gianni Andreozzi & Andrea Bazzani & Virginia Casigliani & Salvatore Pirri & Lara Tavoschi & Giuseppe Turchetti, 2022. "How Italy Tweeted about COVID-19: Detecting Reactions to the Pandemic from Social Media," IJERPH, MDPI, vol. 19(13), pages 1-14, June.
    7. Logan C Brooks & David C Farrow & Sangwon Hyun & Ryan J Tibshirani & Roni Rosenfeld, 2015. "Flexible Modeling of Epidemics with an Empirical Bayes Framework," PLOS Computational Biology, Public Library of Science, vol. 11(8), pages 1-18, August.
    8. Teresa K Yamana & Sasikiran Kandula & Jeffrey Shaman, 2017. "Individual versus superensemble forecasts of seasonal influenza outbreaks in the United States," PLOS Computational Biology, Public Library of Science, vol. 13(11), pages 1-17, November.
    9. Sequoia I Leuba & Reza Yaesoubi & Marina Antillon & Ted Cohen & Christoph Zimmer, 2020. "Tracking and predicting U.S. influenza activity with a real-time surveillance network," PLOS Computational Biology, Public Library of Science, vol. 16(11), pages 1-14, November.
    10. Kuchler, Theresa & Russel, Dominic & Stroebel, Johannes, 2022. "JUE Insight: The geographic spread of COVID-19 correlates with the structure of social networks as measured by Facebook," Journal of Urban Economics, Elsevier, vol. 127(C).
    11. David C Farrow & Logan C Brooks & Sangwon Hyun & Ryan J Tibshirani & Donald S Burke & Roni Rosenfeld, 2017. "A human judgment approach to epidemiological forecasting," PLOS Computational Biology, Public Library of Science, vol. 13(3), pages 1-19, March.
    12. Khatri, Vijay, 2016. "Managerial work in the realm of the digital universe: The role of the data triad," Business Horizons, Elsevier, vol. 59(6), pages 673-688.
    13. Hongying Dai & Brian R. Lee & Jianqiang Hao, 2017. "Predicting Asthma Prevalence by Linking Social Media Data and Traditional Surveys," The ANNALS of the American Academy of Political and Social Science, , vol. 669(1), pages 75-92, January.
    14. Baki Cakici & Pedro Sanches, 2014. "Detecting the Visible: The Discursive Construction of Health Threats in a Syndromic Surveillance System Design," Societies, MDPI, vol. 4(3), pages 1-15, July.
    15. Pi Guo & Tao Liu & Qin Zhang & Li Wang & Jianpeng Xiao & Qingying Zhang & Ganfeng Luo & Zhihao Li & Jianfeng He & Yonghui Zhang & Wenjun Ma, 2017. "Developing a dengue forecast model using machine learning: A case study in China," PLOS Neglected Tropical Diseases, Public Library of Science, vol. 11(10), pages 1-22, October.
    16. Rivera, Roberto, 2016. "A dynamic linear model to forecast hotel registrations in Puerto Rico using Google Trends data," Tourism Management, Elsevier, vol. 57(C), pages 12-20.
    17. Mansour Ebrahimi & Parisa Aghagolzadeh & Narges Shamabadi & Ahmad Tahmasebi & Mohammed Alsharifi & David L Adelson & Farhid Hemmatzadeh & Esmaeil Ebrahimie, 2014. "Understanding the Underlying Mechanism of HA-Subtyping in the Level of Physic-Chemical Characteristics of Protein," PLOS ONE, Public Library of Science, vol. 9(5), pages 1-14, May.
    18. Victor Olsavszky & Mihnea Dosius & Cristian Vladescu & Johannes Benecke, 2020. "Time Series Analysis and Forecasting with Automated Machine Learning on a National ICD-10 Database," IJERPH, MDPI, vol. 17(14), pages 1-17, July.
    19. Svitlana Volkova & Ellyn Ayton & Katherine Porterfield & Courtney D Corley, 2017. "Forecasting influenza-like illness dynamics for military populations using neural networks and social media," PLOS ONE, Public Library of Science, vol. 12(12), pages 1-22, December.
    20. Jiachen Sun & Peter A. Gloor, 2021. "Assessing the Predictive Power of Online Social Media to Analyze COVID-19 Outbreaks in the 50 U.S. States," Future Internet, MDPI, vol. 13(7), pages 1-13, July.
