IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0220765.html

Multi-group diagnostic classification of high-dimensional data using differential scanning calorimetry plasma thermograms

Author

Listed:
  • Shesh N Rai
  • Sudhir Srivastava
  • Jianmin Pan
  • Xiaoyong Wu
  • Somesh P Rai
  • Chongkham S Mekmaysy
  • Lynn DeLeeuw
  • Jonathan B Chaires
  • Nichola C Garbett

Abstract

The thermoanalytical technique differential scanning calorimetry (DSC) has been applied to characterize protein denaturation patterns (thermograms) in blood plasma samples and to relate these to a subject’s health status. The analysis and classification of thermograms is challenging because of the high dimensionality of the data. Various methods exist for group classification using high-dimensional data sets; however, the impact of using high-dimensional data sets for cancer classification is poorly understood. In the present article, we propose a statistical approach for data reduction and a parametric method (PM) for modeling high-dimensional data sets for two- and three-group classification using DSC and demographic data. We compared the PM to the non-parametric classification method K-nearest neighbors (KNN) and the semi-parametric classification method KNN with dynamic time warping (DTW-KNN). We evaluated the performance of these methods for multiple two-group classifications: (i) normal versus cervical cancer, (ii) normal versus lung cancer, (iii) normal versus cancer (cervical + lung), and (iv) lung cancer versus cervical cancer, as well as for three-group classification: normal versus cervical cancer versus lung cancer. In general, performance for two-group classification was high, whereas three-group classification was more challenging, with all three methods predicting normal samples more accurately than cancer samples. Moreover, the specificity of the PM was mostly equal to or higher than that of KNN and DTW-KNN, but its sensitivity was lower. The performance of KNN and DTW-KNN decreased with the inclusion of demographic data, whereas the performance of the PM remained similar. This could be explained by the fact that the PM uses fewer parameters than KNN and DTW-KNN and is thus less susceptible to overfitting. More importantly, the accuracy of the PM can be increased by using a greater number of quantile data points and by including additional demographic and clinical data, providing a substantial advantage over the KNN and DTW-KNN methods.
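
As a rough illustration of the DTW-KNN comparison method named in the abstract, the following minimal sketch classifies a curve by majority vote among its nearest neighbors under a dynamic time warping distance, then computes sensitivity and specificity. It is not the authors' implementation: the function names, the absolute-difference cost, the choice of k, and the toy curves are all assumptions made for illustration, and the numbers are not real thermogram data.

```python
from collections import Counter


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_predict(train, labels, query, k=3, dist=dtw_distance):
    """Label a query curve by majority vote among its k nearest training curves."""
    ranked = sorted(range(len(train)), key=lambda i: dist(train[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity), treating `positive` as the disease class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy curves (NOT real thermograms): one sharp peak vs. a plateau.
train = [[0, 1, 4, 1, 0], [0, 0, 1, 4, 1], [0, 2, 2, 2, 0], [0, 0, 2, 2, 2]]
labels = ["normal", "normal", "cancer", "cancer"]

# A shifted, longer copy of the first curve; DTW absorbs the shift.
query = [0, 0, 1, 4, 1, 0]
predicted = knn_predict(train, labels, query, k=3)

# Hypothetical predictions for four held-out samples.
sens, spec = sensitivity_specificity(
    ["cancer", "cancer", "normal", "normal"],
    ["cancer", "normal", "normal", "normal"],
    positive="cancer",
)
```

Passing a plain pointwise distance as `dist` would turn the same function into an ordinary KNN classifier, which is one way to compare the two approaches on identical data.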

Suggested Citation

  • Shesh N Rai & Sudhir Srivastava & Jianmin Pan & Xiaoyong Wu & Somesh P Rai & Chongkham S Mekmaysy & Lynn DeLeeuw & Jonathan B Chaires & Nichola C Garbett, 2019. "Multi-group diagnostic classification of high-dimensional data using differential scanning calorimetry plasma thermograms," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-17, August.
  • Handle: RePEc:plo:pone00:0220765
    DOI: 10.1371/journal.pone.0220765

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0220765
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0220765&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0220765?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Nichola C Garbett & Michael L Merchant & C William Helm & Alfred B Jenson & Jon B Klein & Jonathan B Chaires, 2014. "Detection of Cervical Cancer Biomarker Patterns in Blood Plasma and Urine by Differential Scanning Calorimetry and Mass Spectrometry," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-12, January.
    2. Sarah K Kendrick & Qi Zheng & Nichola C Garbett & Guy N Brock, 2017. "Application and interpretation of functional data analysis techniques to differential scanning calorimetry data from lupus patients," PLOS ONE, Public Library of Science, vol. 12(11), pages 1-21, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Francisca Barceló & Rosa Gomila & Ivan de Paul & Xavier Gili & Jaume Segura & Albert Pérez-Montaña & Teresa Jimenez-Marco & Antonia Sampol & José Portugal, 2018. "MALDI-TOF analysis of blood serum proteome can predict the presence of monoclonal gammopathy of undetermined significance," PLOS ONE, Public Library of Science, vol. 13(8), pages 1-14, August.
