Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i4p421-d503070.html

Web Traffic Time Series Forecasting Using LSTM Neural Networks with Distributed Asynchronous Training

Authors

Listed:
  • Roberto Casado-Vara

    (BISITE Research Group, University of Salamanca, 37008 Salamanca, Spain)

  • Angel Martin del Rey

    (Department of Applied Mathematics, Institute of Fundamental Physics and Mathematics, University of Salamanca, 37008 Salamanca, Spain)

  • Daniel Pérez-Palau

    (Escuela Superior de Ingeniería y Tecnología, Universidad Internacional de La Rioja, Av. La Paz 137, 26006 Logroño, Spain)

  • Luis de-la-Fuente-Valentín

    (Escuela Superior de Ingeniería y Tecnología, Universidad Internacional de La Rioja, Av. La Paz 137, 26006 Logroño, Spain)

  • Juan M. Corchado

    (BISITE Research Group, University of Salamanca, 37008 Salamanca, Spain)

Abstract

Evaluating web traffic on a web server is critical for web service providers because, without a proper demand forecast, customers may face lengthy waiting times and abandon the website. The task is challenging, however, because it requires reliable predictions about inherently arbitrary human behavior. We introduce an architecture that collects source data and forecasts the page-view time series in a supervised way. Based on the Wikipedia page views dataset proposed in a Kaggle competition in 2017, we created an updated version covering the years 2018–2020. This dataset is processed to extract its features and hidden patterns, which then guide the design of an advanced variant of a recurrent neural network called Long Short-Term Memory (LSTM). The model is trained in a distributed fashion, following the data parallelism paradigm and using the Downpour training strategy. Predictions for the seven dominant languages in the dataset are accurate, with the loss function and measurement error in reasonable ranges. Although the analyzed time series show fairly poor seasonality and trend patterns, the predictions are good, which indicates that analyzing hidden patterns and extracting features before designing the model improves its accuracy. Distributed training also improves the model's accuracy markedly. Because predicting web traffic as precisely as possible normally requires large datasets, we designed the forecasting system to remain accurate with limited data. We tested the proposed model on the new Wikipedia page views dataset we created and obtained a highly accurate prediction: the mean absolute error of the predictions relative to the original data is, on average, below 30. This represents a significant step forward in time series prediction for web traffic forecasting.
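
The training scheme named in the abstract (data parallelism with a Downpour-style strategy) can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions: a linear autoregressive model stands in for the LSTM so that only the standard library is needed, the asynchronous workers are simulated sequentially in round-robin order rather than run on separate machines, and the series is synthetic. All function names and parameters (make_windows, downpour_train, fetch_every, push_every, and so on) are hypothetical.

```python
import math


def make_windows(series, lag):
    """Frame a series as supervised pairs: `lag` past values -> next value."""
    return [(series[i:i + lag], series[i + lag]) for i in range(len(series) - lag)]


def predict(weights, x):
    """Linear stand-in for the LSTM (assumption); the last weight is the bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + weights[-1]


def gradient(weights, x, y):
    """Gradient of the squared error 0.5 * (prediction - y)**2 for one sample."""
    err = predict(weights, x) - y
    return [err * xi for xi in x] + [err]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def downpour_train(samples, n_workers=4, epochs=40, lr=0.02,
                   fetch_every=2, push_every=2):
    """Sequential simulation of Downpour-style asynchronous SGD."""
    dim = len(samples[0][0]) + 1
    server = [0.0] * dim  # parameters held by the parameter server
    # Data parallelism: every worker trains on its own shard of the data.
    shards = [samples[k::n_workers] for k in range(n_workers)]
    workers = [{"local": list(server), "acc": [0.0] * dim, "step": 0}
               for _ in shards]
    for _ in range(epochs):
        for i in range(max(len(s) for s in shards)):
            for w, shard in zip(workers, shards):
                if i >= len(shard):
                    continue
                # Pull parameters only every `fetch_every` steps, so the
                # local replica can be stale relative to the server.
                if w["step"] % fetch_every == 0:
                    w["local"] = list(server)
                g = gradient(w["local"], *shard[i])
                w["local"] = [p - lr * gi for p, gi in zip(w["local"], g)]
                w["acc"] = [a + gi for a, gi in zip(w["acc"], g)]
                w["step"] += 1
                # Push accumulated gradients without waiting for other workers.
                if w["step"] % push_every == 0:
                    server[:] = [p - lr * a for p, a in zip(server, w["acc"])]
                    w["acc"] = [0.0] * dim
    return server


if __name__ == "__main__":
    # Synthetic, noise-free series with weekly seasonality (not the paper's data).
    series = [0.5 + 0.3 * math.sin(2 * math.pi * t / 7) for t in range(200)]
    samples = make_windows(series, lag=7)
    train, test = samples[:150], samples[150:]
    weights = downpour_train(train)
    mae = mean_absolute_error([y for _, y in test],
                              [predict(weights, x) for x, _ in test])
    print(f"hold-out MAE: {mae:.4f}")
```

The point of the sketch is the update pattern: each worker computes gradients against a possibly outdated copy of the parameters and applies them to the shared state independently, which is what distinguishes asynchronous training from synchronous gradient averaging. The mean absolute error is computed on a hold-out segment, mirroring the metric the abstract reports.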

Suggested Citation

  • Roberto Casado-Vara & Angel Martin del Rey & Daniel Pérez-Palau & Luis de-la-Fuente-Valentín & Juan M. Corchado, 2021. "Web Traffic Time Series Forecasting Using LSTM Neural Networks with Distributed Asynchronous Training," Mathematics, MDPI, vol. 9(4), pages 1-21, February.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:4:p:421-:d:503070

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/4/421/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/4/421/
    Download Restriction: no

    References listed on IDEAS

    1. De Gooijer, Jan G. & Hyndman, Rob J., 2006. "25 years of time series forecasting," International Journal of Forecasting, Elsevier, vol. 22(3), pages 443-473.
    2. Montero-Manso, Pablo & Athanasopoulos, George & Hyndman, Rob J. & Talagala, Thiyanga S., 2020. "FFORMA: Feature-based forecast model averaging," International Journal of Forecasting, Elsevier, vol. 36(1), pages 86-92.
    3. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios, 2020. "The M4 Competition: 100,000 time series and 61 forecasting methods," International Journal of Forecasting, Elsevier, vol. 36(1), pages 54-74.
    4. Chujie Tian & Jian Ma & Chunhong Zhang & Panpan Zhan, 2018. "A Deep Neural Network Model for Short-Term Load Forecast Based on Long Short-Term Memory Network and Convolutional Neural Network," Energies, MDPI, vol. 11(12), pages 1-13, December.
    5. Spyros Makridakis & Evangelos Spiliotis & Vassilios Assimakopoulos, 2018. "Statistical and Machine Learning forecasting methods: Concerns and ways forward," PLOS ONE, Public Library of Science, vol. 13(3), pages 1-26, March.
    6. Boone, Tonya & Ganeshan, Ram & Jain, Aditya & Sanders, Nada R., 2019. "Forecasting sales in the supply chain: Consumer analytics in the big data era," International Journal of Forecasting, Elsevier, vol. 35(1), pages 170-180.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    2. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios & Chen, Zhi & Gaba, Anil & Tsetlin, Ilia & Winkler, Robert L., 2022. "The M5 uncertainty competition: Results, findings and conclusions," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1365-1385.
    3. Makridakis, Spyros & Hyndman, Rob J. & Petropoulos, Fotios, 2020. "Forecasting in social settings: The state of the art," International Journal of Forecasting, Elsevier, vol. 36(1), pages 15-28.
    4. Montero-Manso, Pablo & Hyndman, Rob J., 2021. "Principles and algorithms for forecasting groups of time series: Locality and globality," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1632-1653.
    5. Semenoglou, Artemios-Anargyros & Spiliotis, Evangelos & Makridakis, Spyros & Assimakopoulos, Vassilios, 2021. "Investigating the accuracy of cross-learning time series forecasting methods," International Journal of Forecasting, Elsevier, vol. 37(3), pages 1072-1084.
    6. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios, 2022. "M5 accuracy competition: Results, findings, and conclusions," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1346-1364.
    7. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios, 2022. "Predicting/hypothesizing the findings of the M5 competition," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1337-1345.
    8. Büttner, Daniel & Scheidler, Anne Antonia & Rabe, Markus, 2021. "A reference model for data-driven sales planning: Development of the model's framework and functionality," Chapters from the Proceedings of the Hamburg International Conference of Logistics (HICL), in: Kersten, Wolfgang & Ringle, Christian M. & Blecker, Thorsten (ed.), Adapting to the Future: How Digitalization Shapes Sustainable Logistics and Resilient Supply Chain Management. Proceedings of the Hamburg Internationa, volume 31, pages 441-476, Hamburg University of Technology (TUHH), Institute of Business Logistics and General Management.
    9. Hewamalage, Hansika & Bergmeir, Christoph & Bandara, Kasun, 2021. "Recurrent Neural Networks for Time Series Forecasting: Current status and future directions," International Journal of Forecasting, Elsevier, vol. 37(1), pages 388-427.
    10. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    11. Bojer, Casper Solheim & Meldgaard, Jens Peder, 2021. "Kaggle forecasting competitions: An overlooked learning opportunity," International Journal of Forecasting, Elsevier, vol. 37(2), pages 587-603.
    12. Winita Sulandari & Yudho Yudhanto & Sri Subanti & Crisma Devika Setiawan & Riskhia Hapsari & Paulo Canas Rodrigues, 2023. "Comparing the Simple to Complex Automatic Methods with the Ensemble Approach in Forecasting Electrical Time Series Data," Energies, MDPI, vol. 16(22), pages 1-16, November.
    13. Van Belle, Jente & Guns, Tias & Verbeke, Wouter, 2021. "Using shared sell-through data to forecast wholesaler demand in multi-echelon supply chains," European Journal of Operational Research, Elsevier, vol. 288(2), pages 466-479.
    14. Odin Foldvik Eikeland & Filippo Maria Bianchi & Harry Apostoleris & Morten Hansen & Yu-Cheng Chiou & Matteo Chiesa, 2021. "Predicting Energy Demand in Semi-Remote Arctic Locations," Energies, MDPI, vol. 14(4), pages 1-17, February.
    15. Bojer, Casper Solheim, 2022. "Understanding machine learning-based forecasting methods: A decomposition framework and research opportunities," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1555-1561.
    16. Wang, Xiaoqian & Hyndman, Rob J. & Li, Feng & Kang, Yanfei, 2023. "Forecast combinations: An over 50-year review," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1518-1547.
    17. Wellens, Arnoud P. & Boute, Robert N. & Udenio, Maximiliano, 2024. "Simplifying tree-based methods for retail sales forecasting with explanatory variables," European Journal of Operational Research, Elsevier, vol. 314(2), pages 523-539.
    18. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios, 2020. "The M4 Competition: 100,000 time series and 61 forecasting methods," International Journal of Forecasting, Elsevier, vol. 36(1), pages 54-74.
    19. Qi, Lingzhi & Li, Xixi & Wang, Qiang & Jia, Suling, 2023. "fETSmcs: Feature-based ETS model component selection," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1303-1317.
    20. Nasios, Ioannis & Vogklis, Konstantinos, 2022. "Blending gradient boosted trees and neural networks for point and probabilistic forecasting of hierarchical time series," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1448-1459.
