
Forecasting influenza in Hong Kong with Google search queries and statistical model fusion

Author

Listed:
  • Qinneng Xu
  • Yulia R Gel
  • L Leticia Ramirez Ramirez
  • Kusha Nezafati
  • Qingpeng Zhang
  • Kwok-Leung Tsui

Abstract

Background: The objective of this study is to investigate the predictive utility of online social media and web search queries, particularly Google search data, for forecasting new cases of influenza-like illness (ILI) in general outpatient clinics (GOPC) in Hong Kong. To mitigate sensitivity to self-excitement (i.e., fickle media interest) and other artifacts of online social media data, our approach fuses multiple offline and online data sources.

Methods: Four individual models, namely a generalized linear model (GLM), the least absolute shrinkage and selection operator (LASSO), an autoregressive integrated moving average (ARIMA) model, and deep learning (DL) with feedforward neural networks (FNN), are employed to forecast ILI-GOPC both one week and two weeks in advance. The covariates include Google search queries, meteorological data, and previously recorded offline ILI. To our knowledge, this is the first study that introduces deep learning methodology into surveillance of infectious diseases and investigates its predictive utility. Furthermore, to exploit the strengths of each individual forecasting model, we use statistical model fusion based on Bayesian model averaging (BMA), which allows a systematic integration of multiple forecast scenarios. For each model, an adaptive approach is used to capture the recent relationship between ILI and covariates.

Results: DL with FNN appears to deliver the most competitive predictive performance among the four individual models considered. Combining all four models in a comprehensive BMA framework further improves predictive evaluation metrics such as root mean squared error (RMSE) and mean absolute predictive error (MAPE). Nevertheless, DL with FNN remains the preferred method for predicting the locations of influenza peaks.

Conclusions: The proposed approach can be viewed as a feasible alternative for forecasting ILI in Hong Kong or in other countries where ILI has no constant seasonal trend and influenza data resources are limited. The proposed methodology is easily tractable and computationally efficient.
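
The model-fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes equal prior model probabilities and Gaussian forecast errors, and it weights each model by the likelihood of its past forecasts. The function names (`rmse`, `mean_abs_error`, `bma_weights`, `fuse`) and all numbers are hypothetical, and because the abstract does not give a formula for MAPE, the sketch uses a plain mean absolute error. Passing only the most recent weeks as `actual` would loosely mimic the adaptive approach.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between observed and forecast values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mean_abs_error(actual, predicted):
    """Mean absolute error of the forecasts (assumed stand-in for the paper's MAPE)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def bma_weights(actual, forecasts):
    """Simplified BMA-style weights: equal priors, Gaussian errors (both assumptions).

    forecasts maps a model name to its list of past forecasts for `actual`.
    """
    n = len(actual)
    log_liks = {}
    for name, preds in forecasts.items():
        sse = sum((a - p) ** 2 for a, p in zip(actual, preds))
        sigma2 = max(sse / n, 1e-12)  # guard against a zero variance estimate
        # Gaussian log-likelihood evaluated at the maximum-likelihood variance
        log_liks[name] = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1.0)
    top = max(log_liks.values())  # subtract the maximum for numerical stability
    unnorm = {k: math.exp(v - top) for k, v in log_liks.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}

def fuse(weights, next_forecasts):
    """Weighted average of the individual models' next-step forecasts."""
    return sum(weights[k] * next_forecasts[k] for k in weights)

# Made-up weekly ILI counts and past one-week-ahead forecasts from two models.
observed = [10.0, 12.0, 15.0, 11.0, 9.0]
past = {
    "glm": [12.0, 10.0, 18.0, 13.0, 7.0],
    "fnn": [10.5, 11.5, 14.0, 11.5, 9.5],
}
weights = bma_weights(observed, past)
fused = fuse(weights, {"glm": 13.0, "fnn": 11.0})
```

In this toy setup the model with the smaller past errors receives most of the weight, so the fused forecast sits close to that model's forecast while still drawing on the other.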

Suggested Citation

  • Qinneng Xu & Yulia R Gel & L Leticia Ramirez Ramirez & Kusha Nezafati & Qingpeng Zhang & Kwok-Leung Tsui, 2017. "Forecasting influenza in Hong Kong with Google search queries and statistical model fusion," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-17, May.
  • Handle: RePEc:plo:pone00:0176690
    DOI: 10.1371/journal.pone.0176690

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0176690
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0176690&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Jeremy Ginsberg & Matthew H. Mohebbi & Rajan S. Patel & Lynnette Brammer & Mark S. Smolinski & Larry Brilliant, 2009. "Detecting influenza epidemics using search engine query data," Nature, Nature, vol. 457(7232), pages 1012-1014, February.

    Citations


    Cited by:

    1. Schaer, Oliver & Kourentzes, Nikolaos & Fildes, Robert, 2019. "Demand forecasting with user-generated online information," International Journal of Forecasting, Elsevier, vol. 35(1), pages 197-212.
    2. Sangwon Chae & Sungjun Kwon & Donghyun Lee, 2018. "Predicting Infectious Disease Using Deep Learning and Big Data," IJERPH, MDPI, vol. 15(8), pages 1-20, July.
    3. Songhee Cheon & Jungyoon Kim & Jihye Lim, 2019. "The Use of Deep Learning to Predict Stroke Patient Mortality," IJERPH, MDPI, vol. 16(11), pages 1-12, May.
    4. Fan Li & Hao Zhou & De-Sheng Huang & Peng Guan, 2020. "Global Research Output and Theme Trends on Climate Change and Infectious Diseases: A Restrospective Bibliometric and Co-Word Biclustering Investigation of Papers Indexed in PubMed (1999–2018)," IJERPH, MDPI, vol. 17(14), pages 1-14, July.
    5. Victor Olsavszky & Mihnea Dosius & Cristian Vladescu & Johannes Benecke, 2020. "Time Series Analysis and Forecasting with Automated Machine Learning on a National ICD-10 Database," IJERPH, MDPI, vol. 17(14), pages 1-17, July.
    6. Jungyoon Kim & Jihye Lim, 2021. "A Deep Neural Network-Based Method for Prediction of Dementia Using Big Data," IJERPH, MDPI, vol. 18(10), pages 1-13, May.
    7. Abay, Kibrom A. & Hirfrfot, Kibrom Tafere & Woldemichael, Andinet, 2020. "Winners and Losers from COVID-19: Global Evidence from Google Search," Policy Research Working Paper Series 9268, The World Bank.
    8. Srinka Basu & Sugata Sen, 2023. "COVID 19 Pandemic, Socio-Economic Behaviour and Infection Characteristics: An Inter-Country Predictive Study Using Deep Learning," Computational Economics, Springer;Society for Computational Economics, vol. 61(2), pages 645-676, February.
    9. Francis Rathinam & Sayak Khatua & Zeba Siddiqui & Manya Malik & Pallavi Duggal & Samantha Watson & Xavier Vollenweider, 2021. "Using big data for evaluating development outcomes: A systematic map," Campbell Systematic Reviews, John Wiley & Sons, vol. 17(3), September.
    10. Monday Osayande & Osagie Osifo, 2024. "Application Of Covid-19 Data: Investigating The Impact On Weekly Stock Market Returns In Nigeria," Journal of Academic Research in Economics, Spiru Haret University, Faculty of Accounting and Financial Management Constanta, vol. 16(2 (July)), pages 403-416.
    11. Jing Cong & Mengmeng Ren & Shuyang Xie & Pingyu Wang, 2019. "Predicting Seasonal Influenza Based on SARIMA Model, in Mainland China from 2005 to 2018," IJERPH, MDPI, vol. 16(23), pages 1-8, November.
    12. Daniel Alejandro Gónzalez-Bandala & Juan Carlos Cuevas-Tello & Daniel E. Noyola & Andreu Comas-García & Christian A García-Sepúlveda, 2020. "Computational Forecasting Methodology for Acute Respiratory Infectious Disease Dynamics," IJERPH, MDPI, vol. 17(12), pages 1-20, June.
    13. Corrado Lanera & Ileana Baldi & Andrea Francavilla & Elisa Barbieri & Lara Tramontan & Antonio Scamarcia & Luigi Cantarutti & Carlo Giaquinto & Dario Gregori, 2022. "A Deep Learning Approach to Estimate the Incidence of Infectious Disease Cases for Routinely Collected Ambulatory Records: The Example of Varicella-Zoster," IJERPH, MDPI, vol. 19(10), pages 1-13, May.
    14. Taichi Murayama & Nobuyuki Shimizu & Sumio Fujita & Shoko Wakamiya & Eiji Aramaki, 2020. "Robust two-stage influenza prediction model considering regular and irregular trends," PLOS ONE, Public Library of Science, vol. 15(5), pages 1-14, May.
    15. Ibrahim Musa & Hyun Woo Park & Lkhagvadorj Munkhdalai & Keun Ho Ryu, 2018. "Global Research on Syndromic Surveillance from 1993 to 2017: Bibliometric Analysis and Visualization," Sustainability, MDPI, vol. 10(10), pages 1-20, September.
    16. Beytía, Pablo & Infante, Carlos Cruz, 2020. "Digital Pathways, Pandemic Trajectories. Using Google Trends to Track Social Responses to COVID-19," SocArXiv yndb7, Center for Open Science.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. David H Chae & Sean Clouston & Mark L Hatzenbuehler & Michael R Kramer & Hannah L F Cooper & Sacoby M Wilson & Seth I Stephens-Davidowitz & Robert S Gold & Bruce G Link, 2015. "Association between an Internet-Based Measure of Area Racism and Black Mortality," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-12, April.
    2. Xiaoli Wang & Shuangsheng Wu & C Raina MacIntyre & Hongbin Zhang & Weixian Shi & Xiaomin Peng & Wei Duan & Peng Yang & Yi Zhang & Quanyi Wang, 2015. "Using an Adjusted Serfling Regression Model to Improve the Early Warning at the Arrival of Peak Timing of Influenza in Beijing," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-14, March.
    3. Ishani Chaudhuri & Parthajit Kayal, 2022. "Predicting Power of Ticker Search Volume in Indian Stock Market," Working Papers 2022-214, Madras School of Economics,Chennai,India.
    4. Yang, Xin & Pan, Bing & Evans, James A. & Lv, Benfu, 2015. "Forecasting Chinese tourist volume with search engine data," Tourism Management, Elsevier, vol. 46(C), pages 386-397.
    5. Kuchler, Theresa & Russel, Dominic & Stroebel, Johannes, 2022. "JUE Insight: The geographic spread of COVID-19 correlates with the structure of social networks as measured by Facebook," Journal of Urban Economics, Elsevier, vol. 127(C).
    6. Markowitz, Sara & Nesson, Erik & Robinson, Joshua J., 2019. "The effects of employment on influenza rates," Economics & Human Biology, Elsevier, vol. 34(C), pages 286-295.
    7. Bentzen, Jeanet Sinding, 2021. "In crisis, we pray: Religiosity and the COVID-19 pandemic," Journal of Economic Behavior & Organization, Elsevier, vol. 192(C), pages 541-583.
    8. Jesse T. Richman & Ryan J. Roberts, 2023. "Assessing Spurious Correlations in Big Search Data," Forecasting, MDPI, vol. 5(1), pages 1-12, February.
    9. Linus Schiöler & Marianne Frisén, 2012. "Multivariate outbreak detection," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(2), pages 223-242, April.
    10. Sasikiran Kandula & Jeffrey Shaman, 2019. "Reappraising the utility of Google Flu Trends," PLOS Computational Biology, Public Library of Science, vol. 15(8), pages 1-16, August.
    11. Daniel E. O'Leary, 2024. "Toward an extended framework of exhaust data for predictive analytics: An empirical approach," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 31(2), June.
    12. Yangkun Huang & Xiaoping Xu & Sini Su, 2021. "Diverging from News Media: An Exploratory Study on the Changing Dynamics between Media and Public Attention on Cancer in China from 2011–2020," IJERPH, MDPI, vol. 18(16), pages 1-13, August.
    13. Vosen, Simeon & Schmidt, Torsten, 2012. "A monthly consumption indicator for Germany based on Internet search query data," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 19(7), pages 683-687.
    14. Klaus Ackermann & Simon D Angus & Paul A Raschky, 2017. "The Internet as Quantitative Social Science Platform: Insights from a Trillion Observations," Papers 1701.05632, arXiv.org.
    15. Edward L. Glaeser & Scott Duke Kominers & Michael Luca & Nikhil Naik, 2018. "Big Data And Big Cities: The Promises And Limitations Of Improved Measures Of Urban Life," Economic Inquiry, Western Economic Association International, vol. 56(1), pages 114-137, January.
    16. Sean Coogan & Zhixian Sui & David Raubenheimer, 2018. "Gluttony and guilt: monthly trends in internet search query data are comparable with national-level energy intake and dieting behavior," Palgrave Communications, Palgrave Macmillan, vol. 4(1), pages 1-9, December.
    17. Tobias Preis & Federico Botta & Helen Susannah Moat, 2020. "Sensing global tourism numbers with millions of publicly shared online photographs," Environment and Planning A, vol. 52(3), pages 471-477, May.
    18. D'Amuri, Francesco & Marcucci, Juri, 2009. "‘Google it!’ Forecasting the US unemployment rate with a Google job search index," ISER Working Paper Series 2009-32, Institute for Social and Economic Research.
    19. Liwen Ling & Dabin Zhang & Shanying Chen & Amin W. Mugera, 2020. "Can online search data improve the forecast accuracy of pork price in China?," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 39(4), pages 671-686, July.
    20. Klaus Ackermann & Simon D Angus & Paul A Raschky, 2020. "Estimating Sleep and Work Hours from Alternative Data by Segmented Functional Classification Analysis, SFCA," SoDa Laboratories Working Paper Series 2020-04, Monash University, SoDa Laboratories.
