Printed from https://ideas.repec.org/a/plo/pone00/0146413.html

Model Comparison for Breast Cancer Prognosis Based on Clinical Data

Author

Listed:
  • Sabri Boughorbel
  • Rashid Al-Ali
  • Naser Elkum

Abstract

We compared the performance of several prediction techniques for breast cancer prognosis, based on the area under the ROC curve (AU-ROC) for different prognosis periods. The analyzed dataset contained 1,981 patients; from an initial 25 variables, the 11 most common clinical predictors were retained. We compared eight models from a wide spectrum of predictive models, namely: Generalized Linear Model (GLM), GLM-Net, Partial Least Squares (PLS), Support Vector Machines (SVM), Random Forests (RF), Neural Networks, k-Nearest Neighbors (k-NN) and Boosted Trees. To compare these models, a paired t-test was applied to the model performance differences obtained from data resampling. Random Forests, Boosted Trees, Partial Least Squares and GLM-Net have superior overall performance, although their performance is only slightly higher than that of the other models. The comparative analysis also allowed us to define a relative variable importance as the average of the variable importance from the different models. Two sets of variables are identified from this analysis. The first includes the number of positive lymph nodes, tumor size, cancer grade and estrogen receptor, all of which have an important influence on model predictability. The second set includes variables related to histological parameters and treatment types. The short-term vs. long-term contributions of the clinical variables are also analyzed from the comparative models. Among the various cancer treatment plans, the combination of Chemo/Radio therapy has the largest impact on cancer prognosis.
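
As a rough illustration of the comparison described in the abstract, the following Python sketch computes an AU-ROC per resample, a paired t statistic on the per-resample differences between two models, and a relative variable importance averaged across models. It is not the authors' code: the function names, the toy inputs, and the choice to rescale each model's importances by its maximum before averaging are assumptions made for this example only.

```python
from math import sqrt
from statistics import mean, stdev


def auc(labels, scores):
    """AU-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes labels are 0/1 and that
    both classes are present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_t_statistic(auc_a, auc_b):
    """Paired t statistic and degrees of freedom for per-resample AUCs.

    auc_a[i] and auc_b[i] must come from the same resample i.
    """
    diffs = [a - b for a, b in zip(auc_a, auc_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def relative_importance(per_model):
    """Average importance per variable across models.

    Assumption (not from the abstract): each model's scores are first
    divided by that model's largest score so the models share a scale.
    Assumes every model has at least one strictly positive score.
    """
    variables = sorted({v for imp in per_model.values() for v in imp})
    scaled = []
    for imp in per_model.values():
        top = max(imp.values())
        scaled.append({v: imp.get(v, 0.0) / top for v in variables})
    return {v: mean(s[v] for s in scaled) for v in variables}


# Toy usage with made-up numbers (hypothetical, not results from the paper).
t_stat, dof = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
importance = relative_importance({
    "rf": {"nodes": 10.0, "size": 5.0},
    "glm": {"nodes": 2.0, "size": 2.0},
})
```

Turning the t statistic into a p-value requires the Student t distribution with `dof` degrees of freedom, which the standard library does not provide; a statistics package would normally handle that step.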

Suggested Citation

  • Sabri Boughorbel & Rashid Al-Ali & Naser Elkum, 2016. "Model Comparison for Breast Cancer Prognosis Based on Clinical Data," PLOS ONE, Public Library of Science, vol. 11(1), pages 1-15, January.
  • Handle: RePEc:plo:pone00:0146413
    DOI: 10.1371/journal.pone.0146413

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0146413
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0146413&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bellotti, Anthony & Brigo, Damiano & Gambetti, Paolo & Vrins, Frédéric, 2021. "Forecasting recovery rates on non-performing loans with machine learning," International Journal of Forecasting, Elsevier, vol. 37(1), pages 428-444.
    2. Štefan Lyócsa & Petra Vašaničová & Branka Hadji Misheva & Marko Dávid Vateha, 2022. "Default or profit scoring credit systems? Evidence from European and US peer-to-peer lending markets," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-21, December.
    3. Van Belle, Jente & Guns, Tias & Verbeke, Wouter, 2021. "Using shared sell-through data to forecast wholesaler demand in multi-echelon supply chains," European Journal of Operational Research, Elsevier, vol. 288(2), pages 466-479.
    4. Jun Wang & Jinyong Huang & Yunlong Hu & Qianwen Guo & Shasha Zhang & Jinglin Tian & Yanqin Niu & Ling Ji & Yuzhong Xu & Peijun Tang & Yaqin He & Yuna Wang & Shuya Zhang & Hao Yang & Kang Kang & Xinchu, 2024. "Terminal modifications independent cell-free RNA sequencing enables sensitive early cancer detection and classification," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    5. Siddharth Sethi & David Zhang & Sebastian Guelfi & Zhongbo Chen & Sonia Garcia-Ruiz & Emmanuel O. Olagbaju & Mina Ryten & Harpreet Saini & Juan A. Botia, 2022. "Leveraging omic features with F3UTER enables identification of unannotated 3’UTRs for synaptic genes," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    6. Michimasa Fujiogi & Yoshihiko Raita & Marcos Pérez-Losada & Robert J. Freishtat & Juan C. Celedón & Jonathan M. Mansbach & Pedro A. Piedra & Zhaozhong Zhu & Carlos A. Camargo & Kohei Hasegawa, 2022. "Integrated relationship of nasopharyngeal airway host response and microbiome associates with bronchiolitis severity," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    7. Erik Duijvelaar & Jack Gisby & James E. Peters & Harm Jan Bogaard & Jurjan Aman, 2024. "Longitudinal plasma proteomics reveals biomarkers of alveolar-capillary barrier disruption in critically ill COVID-19 patients," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    8. Paweł Teisseyre & Robert A. Kłopotek & Jan Mielniczuk, 2016. "Random Subspace Method for high-dimensional regression with the R package regRSM," Computational Statistics, Springer, vol. 31(3), pages 943-972, September.
    9. Satre-Meloy, Aven & Diakonova, Marina & Grünewald, Philipp, 2020. "Cluster analysis and prediction of residential peak demand profiles using occupant activity data," Applied Energy, Elsevier, vol. 260(C).
    10. María Bueno Álvez & Fredrik Edfors & Kalle Feilitzen & Martin Zwahlen & Adil Mardinoglu & Per-Henrik Edqvist & Tobias Sjöblom & Emma Lundin & Natallia Rameika & Gunilla Enblad & Henrik Lindman & Marti, 2023. "Next generation pan-cancer blood proteome profiling using proximity extension assay," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    11. Fitzpatrick, Trevor & Mues, Christophe, 2016. "An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market," European Journal of Operational Research, Elsevier, vol. 249(2), pages 427-439.
    12. Hokuto Nakata & Akifumi Eguchi & Shouta M. M. Nakayama & John Yabe & Kaampwe Muzandu & Yoshinori Ikenaka & Chisato Mori & Mayumi Ishizuka, 2022. "Metabolomic Alteration in the Plasma of Wild Rodents Environmentally Exposed to Lead: A Preliminary Study," IJERPH, MDPI, vol. 19(1), pages 1-14, January.
    13. Patrick C Eschenfeldt & Uri Kartoun & Curtis R Heberle & Chung Yin Kong & Norman S Nishioka & Kenney Ng & Sagar Kamarthi & Chin Hur, 2018. "Analysis of factors associated with extended recovery time after colonoscopy," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-16, June.
    14. Rachel Sippy & Daniel F Farrell & Daniel A Lichtenstein & Ryan Nightingale & Megan A Harris & Joseph Toth & Paris Hantztidiamantis & Nicholas Usher & Cinthya Cueva Aponte & Julio Barzallo Aguilar & An, 2020. "Severity Index for Suspected Arbovirus (SISA): Machine learning for accurate prediction of hospitalization in subjects suspected of arboviral infection," PLOS Neglected Tropical Diseases, Public Library of Science, vol. 14(2), pages 1-20, February.
    15. Joshua P White & Simon Dennis & Martin Tomko & Jessica Bell & Stephan Winter, 2021. "Paths to social licence for tracking-data analytics in university research and services," PLOS ONE, Public Library of Science, vol. 16(5), pages 1-19, May.
    16. Denis S. Zavorotnyuk & Anatoly A. Sorokin & Stanislav I. Pekov & Denis S. Bormotov & Vasiliy A. Eliferov & Konstantin V. Bocharov & Eugene N. Nikolaev & Igor A. Popov, 2023. "Shapley Value as a Quality Control for Mass Spectra of Human Glioblastoma Tissues," Data, MDPI, vol. 8(1), pages 1-9, January.
    17. Jack S. Gisby & Norzawani B. Buang & Artemis Papadaki & Candice L. Clarke & Talat H. Malik & Nicholas Medjeral-Thomas & Damiola Pinheiro & Paige M. Mortimer & Shanice Lewis & Eleanor Sandhu & Stephen , 2022. "Multi-omics identify falling LRRC15 as a COVID-19 severity marker and persistent pro-thrombotic signals in convalescence," Nature Communications, Nature, vol. 13(1), pages 1-21, December.
    18. Jimmy Semakula & Rene A. Corner-Thomas & Stephen T. Morris & Hugh T. Blair & Paul R. Kenyon, 2021. "Application of Machine Learning Algorithms to Predict Body Condition Score from Liveweight Records of Mature Romney Ewes," Agriculture, MDPI, vol. 11(2), pages 1-20, February.
    19. A. Jiran Meitei & Akanksha Saini & Bibhuti Bhusan Mohapatra & Kh. Jitenkumar Singh, 2022. "Predicting child anaemia in the North-Eastern states of India: a machine learning approach," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(6), pages 2949-2962, December.
    20. Schroeders, Ulrich & Watrin, Luc & Wilhelm, Oliver, 2021. "Age-related nuances in knowledge assessment," Intelligence, Elsevier, vol. 85(C).
