Printed from https://ideas.repec.org/a/plo/pone00/0213653.html

Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants

Authors

Listed:
  • Ahmed M Alaa
  • Thomas Bolton
  • Emanuele Di Angelantonio
  • James H F Rudd
  • Mihaela van der Schaar

Abstract

Background: Identifying people at risk of cardiovascular diseases (CVD) is a cornerstone of preventative cardiology. Risk prediction models currently recommended by clinical guidelines are typically based on a limited number of predictors with sub-optimal performance across all patient groups. Data-driven techniques based on machine learning (ML) might improve the performance of risk predictions by agnostically discovering novel risk predictors and learning the complex interactions between them. We tested (1) whether ML techniques based on a state-of-the-art automated ML framework (AutoPrognosis) could improve CVD risk prediction compared to traditional approaches, and (2) whether considering non-traditional variables could increase the accuracy of CVD risk predictions.

Methods and findings: Using data on 423,604 participants without CVD at baseline in UK Biobank, we developed an ML-based model for predicting CVD risk based on 473 available variables. Our ML-based model was derived using AutoPrognosis, an algorithmic tool that automatically selects and tunes ensembles of ML modeling pipelines (comprising data imputation, feature processing, classification and calibration algorithms). We compared our model with three baselines: a well-established risk prediction algorithm based on conventional CVD risk factors (Framingham score); a Cox proportional hazards (PH) model based on familiar risk factors (i.e., age, gender, smoking status, systolic blood pressure, history of diabetes, receipt of treatment for hypertension, and body mass index); and a Cox PH model based on all of the 473 available variables. Predictive performance was assessed using the area under the receiver operating characteristic curve (AUC-ROC). Overall, our AutoPrognosis model improved risk prediction (AUC-ROC: 0.774, 95% CI: 0.768-0.780) compared to Framingham score (AUC-ROC: 0.724, 95% CI: 0.720-0.728, p
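The abstract reports each model's discrimination as an AUC-ROC with a 95% confidence interval. As a minimal sketch of how such a figure can be computed, the Python below evaluates AUC-ROC through its rank-based (Mann-Whitney) form and attaches a percentile bootstrap interval. The record does not say how the authors obtained their intervals, so the bootstrap is an assumption here, and the function names `auc_roc` and `bootstrap_ci` are hypothetical, as is the toy data.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC: probability that a random positive case is scored above
    a random negative case, with ties counted as one half."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # 1-based ranks i+1..j share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC-ROC (assumed method,
    not necessarily the one used in the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost a class
            stats.append(auc_roc(ys, [scores[k] for k in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data: 1 = event occurred, 0 = no event; scores are predicted risks.
    y = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
    risk = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.30, 0.60, 0.50, 0.15]
    print("AUC-ROC:", auc_roc(y, risk))
    print("95% CI:", bootstrap_ci(y, risk))
```

An interval built this way narrows as the sample grows, which is consistent with the tight intervals reported for a cohort of several hundred thousand participants.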

Suggested Citation

  • Ahmed M Alaa & Thomas Bolton & Emanuele Di Angelantonio & James H F Rudd & Mihaela van der Schaar, 2019. "Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants," PLOS ONE, Public Library of Science, vol. 14(5), pages 1-17, May.
  • Handle: RePEc:plo:pone00:0213653
    DOI: 10.1371/journal.pone.0213653

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0213653
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0213653&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0213653?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Victor Olsavszky & Mihnea Dosius & Cristian Vladescu & Johannes Benecke, 2020. "Time Series Analysis and Forecasting with Automated Machine Learning on a National ICD-10 Database," IJERPH, MDPI, vol. 17(14), pages 1-17, July.
    2. Shelda Sajeev & Stephanie Champion & Alline Beleigoli & Derek Chew & Richard L. Reed & Dianna J. Magliano & Jonathan E. Shaw & Roger L. Milne & Sarah Appleton & Tiffany K. Gill & Anthony Maeder, 2021. "Predicting Australian Adults at High Risk of Cardiovascular Disease Mortality Using Standard Risk Factors and Machine Learning," IJERPH, MDPI, vol. 18(6), pages 1-14, March.
    3. Mira Kim & Kyunghee Chae & Seungwoo Lee & Hong-Jun Jang & Sukil Kim, 2020. "Automated Classification of Online Sources for Infectious Disease Occurrences Using Machine-Learning-Based Natural Language Processing Approaches," IJERPH, MDPI, vol. 17(24), pages 1-13, December.
    4. Menteş, Nurettin & Çakmak, Mehmet Aziz & Kurt, Mehmet Emin, 2023. "Estimation of service length with the machine learning algorithms and neural networks for patients who receiving home health care," Evaluation and Program Planning, Elsevier, vol. 100(C).
    5. Ervasti, Jenni & Pentti, Jaana & Seppälä, Piia & Ropponen, Annina & Virtanen, Marianna & Elovainio, Marko & Chandola, Tarani & Kivimäki, Mika & Airaksinen, Jaakko, 2023. "Prediction of bullying at work: A data-driven analysis of the Finnish public sector cohort study," Social Science & Medicine, Elsevier, vol. 317(C).
