
Artificial data in sports forecasting: a simulation framework for analysing predictive models in sports

Authors

  • Marc Garnica-Caparrós (German Sport University Cologne)
  • Daniel Memmert (German Sport University Cologne)
  • Fabian Wunderlich (German Sport University Cologne)

Abstract

Far-reaching decisions in organizations often rely on sophisticated methods of data analysis. However, data are not always available in complex real-world systems, and even available data may not fully reflect all the underlying processes. In these cases, artificial data can help shed light on pitfalls in decision making and provide insights into optimized methods. The present paper uses the example of forecasts of sports event outcomes, a domain where, despite the increasing complexity and coverage of models, the proposed methods may fail to identify the main sources of inaccuracy. While the actual outcome of an event provides a basis for validation, it remains unknown whether inaccurate forecasts stem from misestimating the strength of each competitor, from inaccurate forecasting methods, or simply from inherently random processes. To disentangle these sources, the present paper proposes the design of a comprehensive simulation framework that models the sports forecasting process while retaining full control of all the underlying unknowns. A generalized model of the sports forecasting process is presented as the conceptual basis of the system and is supported by the main challenges of real-world data applications. The framework aims to provide a better understanding of rating procedures and forecasting techniques, to support new developments, and to serve as a robust validation system for assessing the predictive quality of forecasts. As a proof of concept, a full data generation run is showcased together with the main analytical advantages of using artificial data.
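
The idea of separating forecast error into its sources can be illustrated with a minimal Python sketch. This is not the framework described in the paper: the logistic (Bradley-Terry-style) outcome model, the double round-robin schedule, the noise levels, and all function and variable names are assumptions chosen only for illustration. Because the true strengths are generated by the simulation, the error caused by misestimated ratings can be measured directly, which is not possible with real match data.

```python
import math
import random


def win_probability(strength_a, strength_b):
    """Assumed logistic (Bradley-Terry-style) probability that A beats B."""
    return 1.0 / (1.0 + math.exp(-(strength_a - strength_b)))


def simulate_season(true_strengths, rng):
    """Play a double round robin; return (a, b, true_prob, outcome) records."""
    matches = []
    teams = list(true_strengths)
    for a in teams:
        for b in teams:
            if a == b:
                continue
            p = win_probability(true_strengths[a], true_strengths[b])
            outcome = 1 if rng.random() < p else 0  # 1 means A wins
            matches.append((a, b, p, outcome))
    return matches


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)


rng = random.Random(42)

# Hypothetical true strengths, known only because the data are artificial.
true_strengths = {f"team_{i}": rng.gauss(0.0, 1.0) for i in range(10)}
matches = simulate_season(true_strengths, rng)

# A forecaster whose ratings are noisy estimates of the true strengths.
noisy_ratings = {t: s + rng.gauss(0.0, 0.5) for t, s in true_strengths.items()}

outcomes = [m[3] for m in matches]
true_probs = [m[2] for m in matches]
noisy_probs = [win_probability(noisy_ratings[a], noisy_ratings[b])
               for a, b, _, _ in matches]

# Error of a forecaster who knows the true probabilities: pure randomness.
oracle_error = brier_score(true_probs, outcomes)
# Error of the noisy forecaster against observed outcomes.
noisy_error = brier_score(noisy_probs, outcomes)
# Error against the true probabilities: attributable to rating misestimation.
rating_error = sum((q - p) ** 2
                   for q, p in zip(noisy_probs, true_probs)) / len(matches)

print(f"oracle vs outcomes: {oracle_error:.3f}")
print(f"noisy  vs outcomes: {noisy_error:.3f}")
print(f"noisy  vs truth:    {rating_error:.3f}")
```

With real data only the second quantity is observable. In the simulated setting the first and third can be computed as well, so inherent randomness and rating error can be inspected separately. The sketch does not cover the third source named in the abstract, an inaccurate forecasting method, since both forecasters here use the same outcome model that generated the data.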

Suggested Citation

  • Marc Garnica-Caparrós & Daniel Memmert & Fabian Wunderlich, 2022. "Artificial data in sports forecasting: a simulation framework for analysing predictive models in sports," Information Systems and e-Business Management, Springer, vol. 20(3), pages 551-580, September.
  • Handle: RePEc:spr:infsem:v:20:y:2022:i:3:d:10.1007_s10257-022-00560-9
    DOI: 10.1007/s10257-022-00560-9


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wunderlich, Fabian & Memmert, Daniel, 2020. "Are betting returns a useful measure of accuracy in (sports) forecasting?," International Journal of Forecasting, Elsevier, vol. 36(2), pages 713-722.
    2. Hubáček, Ondřej & Šír, Gustav, 2023. "Beating the market with a bad predictive model," International Journal of Forecasting, Elsevier, vol. 39(2), pages 691-719.
    3. da Costa, Igor Barbosa & Marinho, Leandro Balby & Pires, Carlos Eduardo Santos, 2022. "Forecasting football results and exploiting betting markets: The case of “both teams to score”," International Journal of Forecasting, Elsevier, vol. 38(3), pages 895-909.
    4. Sung, Ming-Chien & McDonald, David C.J. & Johnson, Johnnie E.V. & Tai, Chung-Ching & Cheah, Eng-Tuck, 2019. "Improving prediction market forecasts by detecting and correcting possible over-reaction to price movements," European Journal of Operational Research, Elsevier, vol. 272(1), pages 389-405.
    5. Lasek, Jan & Gagolewski, Marek, 2021. "Interpretable sports team rating models based on the gradient descent algorithm," International Journal of Forecasting, Elsevier, vol. 37(3), pages 1061-1071.
    6. J. James Reade & Carl Singleton & Alasdair Brown, 2021. "Evaluating strange forecasts: The curious case of football match scorelines," Scottish Journal of Political Economy, Scottish Economic Society, vol. 68(2), pages 261-285, May.
    7. Baboota, Rahul & Kaur, Harleen, 2019. "Predictive analysis and modelling football results using machine learning approach for English Premier League," International Journal of Forecasting, Elsevier, vol. 35(2), pages 741-755.
    8. Angelini, Giovanni & Candila, Vincenzo & De Angelis, Luca, 2022. "Weighted Elo rating for tennis match predictions," European Journal of Operational Research, Elsevier, vol. 297(1), pages 120-132.
    9. Angelini, Giovanni & De Angelis, Luca, 2019. "Efficiency of online football betting markets," International Journal of Forecasting, Elsevier, vol. 35(2), pages 712-721.
    10. Green, Lawrence & Sung, Ming-Chien & Ma, Tiejun & Johnson, Johnnie E. V., 2019. "To what extent can new web-based technology improve forecasts? Assessing the economic value of information derived from Virtual Globes and its rate of diffusion in a financial market," European Journal of Operational Research, Elsevier, vol. 278(1), pages 226-239.
    11. Raffaele Mattera, 2023. "Forecasting binary outcomes in soccer," Annals of Operations Research, Springer, vol. 325(1), pages 115-134, June.
    12. Singleton, Carl & Reade, J. James & Brown, Alasdair, 2020. "Going with your gut: The (In)accuracy of forecast revisions in a football score prediction game," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 89(C).
    13. He, Xue-Zhong & Treich, Nicolas, 2017. "Prediction market prices under risk aversion and heterogeneous beliefs," Journal of Mathematical Economics, Elsevier, vol. 70(C), pages 105-114.
    14. Ramirez, Philip & Reade, J. James & Singleton, Carl, 2023. "Betting on a buzz: Mispricing and inefficiency in online sportsbooks," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1413-1423.
    15. Peeters, Thomas, 2018. "Testing the Wisdom of Crowds in the field: Transfermarkt valuations and international soccer results," International Journal of Forecasting, Elsevier, vol. 34(1), pages 17-29.
    16. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    17. Gross, Johannes & Rebeggiani, Luca, 2018. "Chance or Ability? The Efficiency of the Football Betting Market Revisited," MPRA Paper 87230, University Library of Munich, Germany.
    18. Holmes, Benjamin & McHale, Ian G., 2024. "Forecasting football match results using a player rating based model," International Journal of Forecasting, Elsevier, vol. 40(1), pages 302-312.
    19. Vaughan Williams Leighton & Liu Chunping & Dixon Lerato & Gerrard Hannah, 2021. "How well do Elo-based ratings predict professional tennis matches?," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 17(2), pages 91-105, June.
    20. Szczecinski Leszek, 2022. "G-Elo: generalization of the Elo algorithm by modeling the discretized margin of victory," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 18(1), pages 1-14, March.
