IDEAS home Printed from https://ideas.repec.org/a/gam/jdataj/v8y2023i1p21-d1037357.html

Shapley Value as a Quality Control for Mass Spectra of Human Glioblastoma Tissues

Author

Listed:
  • Denis S. Zavorotnyuk

    (The Moscow Institute of Physics and Technology, National Research University, 141701 Dolgoprudny, Russia)

  • Anatoly A. Sorokin

    (The Moscow Institute of Physics and Technology, National Research University, 141701 Dolgoprudny, Russia)

  • Stanislav I. Pekov

    (Skolkovo Institute of Science and Technology, 121205 Moscow, Russia;
    Siberian State Medical University, 634050 Tomsk, Russia)

  • Denis S. Bormotov

    (The Moscow Institute of Physics and Technology, National Research University, 141701 Dolgoprudny, Russia)

  • Vasiliy A. Eliferov

    (The Moscow Institute of Physics and Technology, National Research University, 141701 Dolgoprudny, Russia)

  • Konstantin V. Bocharov

    (V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia)

  • Eugene N. Nikolaev

    (Skolkovo Institute of Science and Technology, 121205 Moscow, Russia)

  • Igor A. Popov

    (The Moscow Institute of Physics and Technology, National Research University, 141701 Dolgoprudny, Russia)

Abstract

The automatic processing of high-dimensional mass spectrometry data is required for the clinical implementation of ambient ionization molecular profiling methods. However, the complex algorithms needed to analyze peak-rich spectra are sensitive to the quality of the input data. An objective, quantitative quality indicator that is insensitive to the experimental conditions is therefore in high demand for the automated treatment of mass spectrometric data. In this work, we demonstrate the utility of the Shapley value as an indicator of the quality of an individual mass spectrum in a classification task: the discrimination of human brain tumor tissue. The Shapley values are calculated on a training set of spectra from glioblastoma and nontumor pathological tissues and are used as feedback to build a random forest regression model that estimates the contributions of all spectra of each specimen. The results show that using Shapley values significantly accelerates the analysis of negative mode mass spectrometry data while simultaneously improving the accuracy of the regression models.
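
The idea of scoring each training spectrum by its Shapley value can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy two-peak "spectra", the nearest-centroid classifier standing in for whatever classifier the authors used, the chance-level utility of 0.5 for a training subset that lacks one class, and the names `centroid_accuracy` and `shapley_values`. The utility of a subset of training spectra is the validation accuracy of a classifier trained on that subset, and the Shapley value of a spectrum is its weighted average marginal contribution over all subsets. The subsequent random forest regression step is not shown.

```python
from itertools import combinations
from math import comb


def centroid_accuracy(train, valid):
    """Utility of a training subset: validation accuracy of a nearest-centroid
    classifier (a stand-in classifier, assumed for this sketch)."""
    labels = sorted({y for _, y in train})
    if len(labels) < 2:
        return 0.5  # assumed convention: chance level when a class is missing
    centroids = {}
    for label in labels:
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        # Label of the centroid with the smallest squared Euclidean distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return sum(predict(x) == y for x, y in valid) / len(valid)


def shapley_values(train, valid, utility=centroid_accuracy):
    """Exact Shapley value of every training spectrum (exponential cost,
    so only feasible for very small training sets)."""
    n = len(train)
    cache = {}

    def v(indices):
        key = frozenset(indices)
        if key not in cache:
            cache[key] = utility([train[j] for j in sorted(key)], valid)
        return cache[key]

    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            # Shapley weight k!(n-k-1)!/n! written as 1 / (n * C(n-1, k)).
            weight = 1.0 / (n * comb(n - 1, k))
            for subset in combinations(others, k):
                phi += weight * (v(subset + (i,)) - v(subset))
        values.append(phi)
    return values


# Toy data: relative intensities of two hypothetical peaks per spectrum.
train = [
    ((1.0, 0.0), "tumor"),
    ((0.0, 1.0), "nontumor"),
    ((0.2, 0.8), "nontumor"),
    ((0.05, 0.95), "tumor"),  # deliberately inconsistent with its label
]
valid = [
    ((0.9, 0.1), "tumor"),
    ((0.8, 0.2), "tumor"),
    ((0.25, 0.75), "nontumor"),
    ((0.12, 0.88), "nontumor"),
]

values = shapley_values(train, valid)
for (spectrum, label), phi in zip(train, values):
    print(spectrum, label, round(phi, 4))
```

In this toy setting the spectrum that contradicts its label is the only one with a negative value, which is the kind of signal a quality filter could act on. Exact enumeration scales as 2^n, so a realistic data set would need a sampling approximation.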

Suggested Citation

  • Denis S. Zavorotnyuk & Anatoly A. Sorokin & Stanislav I. Pekov & Denis S. Bormotov & Vasiliy A. Eliferov & Konstantin V. Bocharov & Eugene N. Nikolaev & Igor A. Popov, 2023. "Shapley Value as a Quality Control for Mass Spectra of Human Glioblastoma Tissues," Data, MDPI, vol. 8(1), pages 1-9, January.
  • Handle: RePEc:gam:jdataj:v:8:y:2023:i:1:p:21-:d:1037357

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/8/1/21/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/8/1/21/
    Download Restriction: no


