IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0211776.html

Overcoming the problem of multicollinearity in sports performance data: A novel application of partial least squares correlation analysis

Author

Listed:
  • Dan Weaving
  • Ben Jones
  • Matt Ireton
  • Sarah Whitehead
  • Kevin Till
  • Clive B Beggs

Abstract

Objectives: Professional sporting organisations invest considerable resources collecting and analysing data in order to better understand the factors that influence performance. Recent advances in non-invasive technologies, such as global positioning systems (GPS), mean that large volumes of data are now readily available to coaches and sport scientists. However, analysing such data can be challenging, particularly when sample sizes are small and data sets contain multiple highly correlated variables, as is often the case in a sporting context. Multicollinearity in particular, if not treated appropriately, can be problematic and might lead to erroneous conclusions. In this paper we present a novel ‘leave one variable out’ (LOVO) partial least squares correlation analysis (PLSCA) methodology, designed to overcome the problem of multicollinearity, and show how this can be used to identify the training load (TL) variables that most influence ‘end fitness’ in young rugby league players.
Methods: The accumulated TL of sixteen male professional youth rugby league players (17.7 ± 0.9 years) was quantified via GPS, a micro-electrical-mechanical-system (MEMS), and players’ session-rating-of-perceived-exertion (sRPE) over a 6-week pre-season training period. Immediately prior to and following this training period, participants undertook a 30–15 intermittent fitness test (30-15IFT), which was used to determine a player’s ‘starting fitness’ and ‘end fitness’. In total, twelve TL variables were collected, and these, along with ‘starting fitness’ as a covariate, were regressed against ‘end fitness’. However, considerable multicollinearity in the data (variance inflation factor, VIF > 1000 for nine variables) meant that the multiple linear regression (MLR) process was unstable, and so we developed a novel LOVO PLSCA adaptation to quantify the relative importance of the predictor variables and thus minimise multicollinearity issues. As such, the LOVO PLSCA was used as a tool to inform and refine the MLR process.
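As an illustration of the multicollinearity diagnostic mentioned in the Methods, the following is a minimal sketch of how a variance inflation factor can be computed with NumPy, using the standard definition VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on all the other predictors. The function name `variance_inflation_factors` and the synthetic data are assumptions made for this example; this is not the authors' code or data.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of every column of X (n_samples x n_features).

    Each column is regressed (with an intercept) on the remaining columns;
    VIF = 1 / (1 - R^2), which equals SS_total / SS_residual.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # add intercept column
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        vifs[j] = ss_tot / ss_res if ss_res > 0 else np.inf
    return vifs

# Hypothetical data: 16 "players", two nearly identical predictors and one
# unrelated predictor (illustrative only, not the study's measurements).
rng = np.random.default_rng(0)
base = rng.normal(size=16)
X_demo = np.column_stack([
    base,
    base + 0.01 * rng.normal(size=16),  # almost a copy of the first column
    rng.normal(size=16),                # independent of the other two
])
vifs_demo = variance_inflation_factors(X_demo)
print(vifs_demo)
```

In this toy setting the two near-duplicate columns should show very large VIFs while the independent column stays small, which is the pattern that makes coefficient estimates in a multiple linear regression unstable.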
Results: The LOVO PLSCA identified the distance accumulated at very-high speed (>7 m·s⁻¹) as being the most important TL variable to influence improvement in player fitness, with this variable causing the largest decrease in singular value inertia (5.93). When included in a refined linear regression model, this variable, along with ‘starting fitness’ as a covariate, explained 73% of the variance in v30-15IFT ‘end fitness’ (p
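The ‘leave one variable out’ idea described in the abstract can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: it assumes that PLSCA means taking the singular value decomposition of the cross-correlation matrix between z-scored predictors and outcomes, and that ‘singular value inertia’ means the sum of the singular values (the abstract does not define it). The function names `plsca_inertia` and `lovo_inertia_drop` and the synthetic data are hypothetical.

```python
import numpy as np

def zscore(A):
    """Centre each column and scale it to unit sample standard deviation."""
    A = np.asarray(A, dtype=float)
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

def plsca_inertia(X, Y):
    """Sum of singular values of the X-Y cross-correlation matrix.

    Assumption: this sum is what the abstract calls 'singular value inertia'.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float).reshape(len(X), -1)  # allow a 1-D outcome
    n = X.shape[0]
    R = zscore(X).T @ zscore(Y) / (n - 1)               # p x q correlations
    return float(np.linalg.svd(R, compute_uv=False).sum())

def lovo_inertia_drop(X, Y):
    """Decrease in inertia when each predictor column is left out in turn."""
    X = np.asarray(X, dtype=float)
    full = plsca_inertia(X, Y)
    return np.array([
        full - plsca_inertia(np.delete(X, j, axis=1), Y)
        for j in range(X.shape[1])
    ])

# Hypothetical data: 16 observations, 4 predictors; the outcome is driven
# almost entirely by the predictor in column index 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(16, 4))
y = 2.0 * X[:, 2] + 0.1 * rng.normal(size=16)

drops = lovo_inertia_drop(X, y)
most_important = int(np.argmax(drops))
print(drops, most_important)
```

Under these assumptions, the variable whose removal causes the largest drop in inertia is treated as the most important one, and it can then be carried forward into a smaller regression model in place of the full, collinear predictor set.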

Suggested Citation

  • Dan Weaving & Ben Jones & Matt Ireton & Sarah Whitehead & Kevin Till & Clive B Beggs, 2019. "Overcoming the problem of multicollinearity in sports performance data: A novel application of partial least squares correlation analysis," PLOS ONE, Public Library of Science, vol. 14(2), pages 1-16, February.
  • Handle: RePEc:plo:pone00:0211776
    DOI: 10.1371/journal.pone.0211776

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0211776
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0211776&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0211776?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations



    Cited by:

    1. Vinicius Francisco Rofatto & Marcelo Tomio Matsuoka & Ivandro Klein & Maurício Roberto Veronez & Luiz Gonzaga da Silveira Junior, 2020. "On the effects of hard and soft equality constraints in the iterative outlier elimination procedure," PLOS ONE, Public Library of Science, vol. 15(8), pages 1-29, August.
    2. Tae‐Hyoung T. Gim, 2021. "Partial least squares regression and importance–satisfaction analyses of the strategic drivers of happiness: A quality of life survey in Seoul, Korea," Growth and Change, Wiley Blackwell, vol. 52(1), pages 567-599, March.
    3. Nancy Cherotich Sitienei & Meshack Misoi & Chidozie Ibeneme, 2023. "Supply Chain Management Practices on Organizational Performance: A Case Study of Tea Industries in North Rift Valley, Kenya," International Journal of Research and Innovation in Social Science, International Journal of Research and Innovation in Social Science (IJRISS), vol. 7(7), pages 846-865, July.
