
A mixture-of-modelers approach to forecasting NCAA tournament outcomes

Authors

  • Yuan Lo-Hua
  • Liu Anthony
  • Yeh Alec
  • Franks Alex
  • Wang Sherrie
  • Illushin Dmitri
  • Bornn Luke

    (Harvard University – Statistics, Cambridge, Massachusetts, USA)

  • Kaufman Aaron

    (Harvard University – Government, Cambridge, Massachusetts, USA)

  • Reece Andrew

    (Harvard University – Psychology, Cambridge, Massachusetts, USA)

  • Bull Peter

    (Harvard University – Institute for Applied Computational Science, Cambridge, Massachusetts, USA)

Abstract

Predicting the outcome of a single sporting event is difficult; predicting all of the outcomes for an entire tournament is a monumental challenge. Despite the difficulties, millions of people compete each year to forecast the outcome of the NCAA men’s basketball tournament, which spans 63 games over 3 weeks. Statistical prediction of game outcomes involves a multitude of possible covariates and information sources, large performance variations from game to game, and a scarcity of detailed historical data. In this paper, we present the results of a team of modelers working together to forecast the 2014 NCAA men’s basketball tournament. We present not only the methods and data used, but also several novel ideas for post-processing statistical forecasts and decontaminating data sources. In particular, we highlight the difficulties in using publicly available data and suggest techniques for improving their relevance.

Suggested Citation

  • Yuan Lo-Hua & Liu Anthony & Yeh Alec & Franks Alex & Wang Sherrie & Illushin Dmitri & Bornn Luke & Kaufman Aaron & Reece Andrew & Bull Peter, 2015. "A mixture-of-modelers approach to forecasting NCAA tournament outcomes," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 11(1), pages 13-27, March.
  • Handle: RePEc:bpj:jqsprt:v:11:y:2015:i:1:p:13-27:n:4
    DOI: 10.1515/jqas-2014-0056

    Download full text from publisher

    File URL: https://doi.org/10.1515/jqas-2014-0056
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    References listed on IDEAS

    1. Nicolò Cesa Bianchi & Gábor Lugosi, 1999. "Worst-case bounds for the logarithmic loss of predictors," Economics Working Papers 418, Department of Economics and Business, Universitat Pompeu Fabra.
    2. Boulier, Bryan L. & Stekler, H. O., 1999. "Are sports seedings good predictors?: an evaluation," International Journal of Forecasting, Elsevier, vol. 15(1), pages 83-91, February.
    3. Hamilton Howard H, 2011. "An Extension of the Pythagorean Expectation for Association Football," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 7(2), pages 1-18, May.

    Citations

    Cited by:

    1. Ludden Ian G. & Jacobson Sheldon H. & Khatibi Arash & King Douglas M., 2020. "Models for generating NCAA men’s basketball tournament bracket pools," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 16(1), pages 1-15, March.
    2. Kovalchik, Stephanie & Reid, Machar, 2019. "A calibration method with dynamic updates for within-match forecasting of wins in tennis," International Journal of Forecasting, Elsevier, vol. 35(2), pages 756-766.
    3. Alessandro Chessa & Pierpaolo D’Urso & Livia Giovanni & Vincenzina Vitale & Alfonso Gebbia, 2023. "Complex networks for community detection of basketball players," Annals of Operations Research, Springer, vol. 325(1), pages 363-389, June.
    4. Paola Zuccolotto & Marco Sandri & Marica Manisera, 2023. "Spatial performance analysis in basketball with CART, random forest and extremely randomized trees," Annals of Operations Research, Springer, vol. 325(1), pages 495-519, June.
    5. Jun Woo Kim & Mar Magnusen & Seunghoon Jeong, 2023. "March Madness prediction: Different machine learning approaches with non‐box score statistics," Managerial and Decision Economics, John Wiley & Sons, Ltd., vol. 44(4), pages 2223-2236, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kovalchik, Stephanie, 2020. "Extension of the Elo rating system to margin of victory," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1329-1341.
    2. Stekler Herman O. & Klein Andrew, 2012. "Predicting the Outcomes of NCAA Basketball Championship Games," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 8(1), pages 1-10, March.
    3. Ludden Ian G. & Jacobson Sheldon H. & Khatibi Arash & King Douglas M., 2020. "Models for generating NCAA men’s basketball tournament bracket pools," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 16(1), pages 1-15, March.
    4. Vaughan Williams, Leighton & Stekler, Herman O., 2010. "Sports forecasting," International Journal of Forecasting, Elsevier, vol. 26(3), pages 445-447, July.
      • Herman O. Stekler, 2007. "Sports Forecasting," Working Papers 2007-001, The George Washington University, Department of Economics, H. O. Stekler Research Program on Forecasting, revised Jan 2007.
    5. Angelini, Giovanni & Candila, Vincenzo & De Angelis, Luca, 2022. "Weighted Elo rating for tennis match predictions," European Journal of Operational Research, Elsevier, vol. 297(1), pages 120-132.
    6. Andrew J. Leach, 2003. "SubGame, set and match. Identifying Incentive Response in a Tournament," Cahiers de recherche 04-02, HEC Montréal, Institut d'économie appliquée.
    7. Stekler, H.O. & Sendor, David & Verlander, Richard, 2010. "Issues in sports forecasting," International Journal of Forecasting, Elsevier, vol. 26(3), pages 606-621, July.
      • Herman O. Stekler & David Sendor & Richard Verlander, 2009. "Issues in Sports Forecasting," Working Papers 2009-002, The George Washington University, Department of Economics, H. O. Stekler Research Program on Forecasting.
    8. Caudill, Steven B., 2003. "Predicting discrete outcomes with the maximum score estimator: the case of the NCAA men's basketball tournament," International Journal of Forecasting, Elsevier, vol. 19(2), pages 313-317.
    9. Selçuk Özaydın & Thomas Könecke, 2024. "Match-Level Uncertainty in Professional Tennis Revisited—A Novel Approach Applied for the Time Between 2010 and 2019," Journal of Sports Economics, vol. 25(4), pages 507-532, May.
    10. Michael Cary & Heather Stephens, 2023. "Gendered Consequences of COVID-19 Among Professional Tennis Players," Journal of Sports Economics, vol. 24(2), pages 241-266, February.
    11. Nicholas G. Hall & Chris N. Potts, 2012. "A Proposal for Redesign of the FedEx Cup Playoff Series on the PGA TOUR," Interfaces, INFORMS, vol. 42(2), pages 166-179, April.
    12. Blackburn McKinley L., 2013. "Ranking the performance of tennis players: an application to women’s professional tennis," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 9(4), pages 367-378, December.
    13. Kovalchik Stephanie Ann, 2016. "Searching for the GOAT of tennis win prediction," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 12(3), pages 127-138, September.
    14. Alessio Sancetta, 2010. "Bootstrap model selection for possibly dependent and heterogeneous data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 62(3), pages 515-546, June.
    15. Lopez Michael J. & Matthews Gregory J., 2015. "Building an NCAA men’s basketball predictive model and quantifying its success," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 11(1), pages 5-12, March.
    16. Bryan Clair & David Letscher, 2007. "Optimal Strategies for Sports Betting Pools," Operations Research, INFORMS, vol. 55(6), pages 1163-1177, December.
    17. Ferda Halicioglu, 2005. "Can We Predict The Outcome Of The International Football Tournaments : The Case Of Euro 2000?," Microeconomics 0503008, University Library of Munich, Germany.
    18. McHale, Ian & Morton, Alex, 2011. "A Bradley-Terry type model for forecasting tennis match results," International Journal of Forecasting, Elsevier, vol. 27(2), pages 619-630.
    19. Steven Caudill & Norman Godwin, 2002. "Heterogeneous skewness in binary choice models: Predicting outcomes in the men's NCAA basketball tournament," Journal of Applied Statistics, Taylor & Francis Journals, vol. 29(7), pages 991-1001.
