
Prediction of Incident Diabetes in the Jackson Heart Study Using High-Dimensional Machine Learning

Author

Listed:
  • Ramon Casanova
  • Santiago Saldana
  • Sean L Simpson
  • Mary E Lacy
  • Angela R Subauste
  • Chad Blackshear
  • Lynne Wagenknecht
  • Alain G Bertoni

Abstract

Statistical models to predict incident diabetes are often based on a limited set of variables. Here we pursued two main goals: 1) investigate the relative performance of a machine learning method, Random Forests (RF), for detecting incident diabetes in a high-dimensional setting defined by a large set of observational data, and 2) uncover potential predictors of diabetes. The Jackson Heart Study collected data at baseline and in two follow-up visits from 5,301 African Americans. We excluded those with baseline diabetes and no follow-up, leaving 3,633 individuals for analyses. Over a mean 8-year follow-up, 584 participants developed diabetes. The full RF model evaluated 93 variables including demographic, anthropometric, blood biomarker, medical history, and echocardiogram data. We also used RF metrics of variable importance to rank variables according to their contribution to diabetes prediction. We implemented other models based on logistic regression and RF where features were preselected. The performance of the full RF model (AUC = 0.82) was similar to that of the more parsimonious models. The top-ranked variables according to RF included hemoglobin A1C, fasting plasma glucose, waist circumference, adiponectin, C-reactive protein, triglycerides, leptin, left ventricular mass, high-density lipoprotein cholesterol, and aldosterone. This work shows the potential of RF for incident diabetes prediction while dealing with high-dimensional data.
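As a rough illustration of the quantities reported in the abstract, the sketch below computes the cumulative incidence implied by the cohort figures (584 of 3,633) and shows one common way to compute an AUC: the fraction of positive/negative pairs in which the positive case receives the higher score. The function name `auc` and the toy labels and scores are assumptions made up for this example; they are not the study's data or the authors' code.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Cohort figures taken from the abstract.
n_analyzed = 3633
n_incident = 584
cumulative_incidence = n_incident / n_analyzed  # roughly 16% over follow-up

# Hypothetical toy data (1 = developed diabetes), NOT from the study.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
toy_auc = auc(toy_labels, toy_scores)

print(f"cumulative incidence: {cumulative_incidence:.3f}")
print(f"toy AUC: {toy_auc:.3f}")
```

Read this way, an AUC of 0.82 means that in about 82% of such pairs the model scores the participant who later developed diabetes above the one who did not.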

Suggested Citation

  • Ramon Casanova & Santiago Saldana & Sean L Simpson & Mary E Lacy & Angela R Subauste & Chad Blackshear & Lynne Wagenknecht & Alain G Bertoni, 2016. "Prediction of Incident Diabetes in the Jackson Heart Study Using High-Dimensional Machine Learning," PLOS ONE, Public Library of Science, vol. 11(10), pages 1-12, October.
  • Handle: RePEc:plo:pone00:0163942
    DOI: 10.1371/journal.pone.0163942

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0163942
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0163942&type=printable
    Download Restriction: no



    Citations



    Cited by:

    1. Chao-Yu Guo & Min-Yang Wu & Hao-Min Cheng, 2021. "The Comprehensive Machine Learning Analytics for Heart Failure," IJERPH, MDPI, vol. 18(9), pages 1-17, May.
    2. Micheal O. Olusanya & Ropo Ebenezer Ogunsakin & Meenu Ghai & Matthew Adekunle Adeleke, 2022. "Accuracy of Machine Learning Classification Models for the Prediction of Type 2 Diabetes Mellitus: A Systematic Survey and Meta-Analysis Approach," IJERPH, MDPI, vol. 19(21), pages 1-19, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Raymond Salvador & Joaquim Radua & Erick J Canales-Rodríguez & Aleix Solanes & Salvador Sarró & José M Goikolea & Alicia Valiente & Gemma C Monté & María del Carmen Natividad & Amalia Guerrero-Pedraza, 2017. "Evaluation of machine learning algorithms and structural features for optimal MRI-based diagnostic prediction in psychosis," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-24, April.
    2. Viswanadham Sridhara & Austin G Meyer & Piyush Rai & Jeffrey E Barrick & Pradeep Ravikumar & Daniel Segrè & Claus O Wilke, 2014. "Predicting Growth Conditions from Internal Metabolic Fluxes in an In-Silico Model of E. coli," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-22, December.
