Printed from https://ideas.repec.org/a/nat/natcom/v15y2024i1d10.1038_s41467-024-46771-1.html

Prediction of glycopeptide fragment mass spectra by deep learning

Author

Listed:
  • Yi Yang

    (Zhejiang University)

  • Qun Fang

    (Zhejiang University)

Abstract

Deep learning has achieved notable success in mass spectrometry-based proteomics and is now emerging in glycoproteomics. While various deep learning models can predict fragment mass spectra of peptides with good accuracy, they cannot cope with the non-linear glycan structure in an intact glycopeptide. Herein, we present DeepGlyco, a deep learning-based approach for the prediction of fragment spectra of intact glycopeptides. Our model adopts tree-structured long short-term memory networks to process the glycan moiety and a graph neural network architecture to incorporate potential fragmentation pathways of a specific glycan structure. This design benefits model explainability and the ability to differentiate glycan structural isomers. We further demonstrate that predicted spectral libraries can be used in data-independent acquisition glycoproteomics as a supplement to improve library completeness. We expect that this work will provide a valuable deep learning resource for glycoproteomics.
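
To illustrate why a tree representation of the glycan matters for telling structural isomers apart, here is a minimal sketch. It is not the DeepGlyco model or its code: the Node class, the function names, and the two toy glycan structures are hypothetical, and the fragmentation is simplified to a single glycosidic bond cleavage on the neutral glycan alone (no peptide, no charge, no intensities). The residue masses are standard monoisotopic values for the monosaccharides named.

```python
from dataclasses import dataclass, field
from typing import List

# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "HexNAc": 203.0794,
    "Hex": 162.0528,
    "Fuc": 146.0579,
    "NeuAc": 291.0954,
}


@dataclass
class Node:
    """One monosaccharide in a glycan tree (hypothetical helper type)."""
    sugar: str
    children: List["Node"] = field(default_factory=list)


def subtree_mass(node: Node) -> float:
    """Sum residue masses bottom-up over the subtree rooted at `node`."""
    return RESIDUE_MASS[node.sugar] + sum(subtree_mass(c) for c in node.children)


def single_cleavage_losses(root: Node) -> List[float]:
    """Mass lost when each glycosidic bond is cleaved once.

    Every non-root node hangs on exactly one bond, so cleaving that bond
    removes the whole subtree below it.
    """
    losses = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            losses.append(round(subtree_mass(child), 4))
            stack.append(child)
    return sorted(losses)


# Two toy isomers with the same composition: HexNAc(2) Hex(1) Fuc(1).
# Isomer A carries the fucose on the root HexNAc; isomer B on the terminal Hex.
isomer_a = Node("HexNAc", [Node("Fuc"), Node("HexNAc", [Node("Hex")])])
isomer_b = Node("HexNAc", [Node("HexNAc", [Node("Hex", [Node("Fuc")])])])

losses_a = single_cleavage_losses(isomer_a)
losses_b = single_cleavage_losses(isomer_b)

print("total mass A:", round(subtree_mass(isomer_a), 4))
print("total mass B:", round(subtree_mass(isomer_b), 4))
print("losses A:", losses_a)
print("losses B:", losses_b)
```

Both toy structures have the same total mass, yet their sets of single-cleavage losses differ, which is the kind of structure-dependent signal a composition-only encoding cannot capture.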

Suggested Citation

  • Yi Yang & Qun Fang, 2024. "Prediction of glycopeptide fragment mass spectra by deep learning," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-46771-1
    DOI: 10.1038/s41467-024-46771-1

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-46771-1
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kevin L. Yang & Fengchao Yu & Guo Ci Teo & Kai Li & Vadim Demichev & Markus Ralser & Alexey I. Nesvizhskii, 2023. "MSBooster: improving peptide identification rates using deep learning-based features," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    2. Charlotte Adams & Wassim Gabriel & Kris Laukens & Mario Picciani & Mathias Wilhelm & Wout Bittremieux & Kurt Boonen, 2024. "Fragment ion intensity prediction improves the identification rate of non-tryptic peptides in timsTOF," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    3. Wen-Feng Zeng & Xie-Xuan Zhou & Sander Willems & Constantin Ammar & Maria Wahle & Isabell Bludau & Eugenia Voytik & Maximillian T. Strauss & Matthias Mann, 2022. "AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    4. Fengchao Yu & Guo Ci Teo & Andy T. Kong & Klemens Fröhlich & Ginny Xiaohe Li & Vadim Demichev & Alexey I. Nesvizhskii, 2023. "Analysis of DIA proteomics data using MSFragger-DIA and FragPipe computational platform," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    5. David Gomez-Zepeda & Danielle Arnold-Schild & Julian Beyrle & Arthur Declercq & Ralf Gabriels & Elena Kumm & Annica Preikschat & Mateusz Krzysztof Łącki & Aurélie Hirschler & Jeewan Babu Rijal & Chris, 2024. "Thunder-DDA-PASEF enables high-coverage immunopeptidomics and is boosted by MS2Rescore with MS2PIP timsTOF fragmentation prediction model," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    6. Henry Webel & Lili Niu & Annelaura Bach Nielsen & Marie Locard-Paulet & Matthias Mann & Lars Juhl Jensen & Simon Rasmussen, 2024. "Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    7. Lei Xin & Rui Qiao & Xin Chen & Hieu Tran & Shengying Pan & Sahar Rabinoviz & Haibo Bian & Xianliang He & Brenton Morse & Baozhen Shan & Ming Li, 2022. "A streamlined platform for analyzing tera-scale DDA and DIA mass spectrometry data enables highly sensitive immunopeptidomics," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    8. Daniela Klaproth-Andrade & Johannes Hingerl & Yanik Bruns & Nicholas H. Smith & Jakob Träuble & Mathias Wilhelm & Julien Gagneur, 2024. "Deep learning-driven fragment ion series classification enables highly precise and sensitive de novo peptide sequencing," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    9. Klemens Fröhlich & Eva Brombacher & Matthias Fahrner & Daniel Vogele & Lucas Kook & Niko Pinter & Peter Bronsert & Sylvia Timme-Bronsert & Alexander Schmidt & Katja Bärenfaller & Clemens Kreutz & Oliv, 2022. "Benchmarking of analysis strategies for data-independent acquisition proteomics using a large-scale dataset comprising inter-patient heterogeneity," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    10. Weiping Sun & Qianqiu Zhang & Xiyue Zhang & Ngoc Hieu Tran & M. Ziaur Rahman & Zheng Chen & Chao Peng & Jun Ma & Ming Li & Lei Xin & Baozhen Shan, 2023. "Glycopeptide database search and de novo sequencing with PEAKS GlycanFinder enable highly sensitive glycoproteomics," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    11. Ronghui Lou & Weizhen Liu & Rongjie Li & Shanshan Li & Xuming He & Wenqing Shui, 2021. "DeepPhospho accelerates DIA phosphoproteome profiling through in silico library generation," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    12. Humberto J. Ferreira & Brian J. Stevenson & HuiSong Pak & Fengchao Yu & Jessica Almeida Oliveira & Florian Huber & Marie Taillandier-Coindard & Justine Michaux & Emma Ricart-Altimiras & Anne I. Kraeme, 2024. "Immunopeptidomics-based identification of naturally presented non-canonical circRNA-derived peptides," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    13. Hanqing Liao & Carolina Barra & Zhicheng Zhou & Xu Peng & Isaac Woodhouse & Arun Tailor & Robert Parker & Alexia Carré & Persephone Borrow & Michael J. Hogan & Wayne Paes & Laurence C. Eisenlohr & Rob, 2024. "MARS an improved de novo peptide candidate selection method for non-canonical antigen target discovery in cancer," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    14. Celina Tretter & Niklas Andrade Krätzig & Matteo Pecoraro & Sebastian Lange & Philipp Seifert & Clara Frankenberg & Johannes Untch & Gabriela Zuleger & Mathias Wilhelm & Daniel P. Zolg & Florian S. Dr, 2023. "Proteogenomic analysis reveals RNA as a source for tumor-agnostic neoantigen identification," Nature Communications, Nature, vol. 14(1), pages 1-22, December.
    15. Valdemaras Petrosius & Pedro Aragon-Fernandez & Nil Üresin & Gergo Kovacs & Teeradon Phlairaharn & Benjamin Furtwängler & Jeff Op De Beeck & Sarah L. Skovbakke & Steffen Goletz & Simon Francis Thomsen, 2023. "Exploration of cell state heterogeneity using single-cell proteomics through sensitivity-tailored data-independent acquisition," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    16. Wen-Feng Zeng & Guoquan Yan & Huan-huan Zhao & Chao Liu & Weiqian Cao, 2024. "Uncovering missing glycans and unexpected fragments with pGlycoNovo for site-specific glycosylation analysis across species," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    17. Eduardo Vieira de Souza & Angie L. Bookout & Christopher A. Barnes & Brendan Miller & Pablo Machado & Luiz A. Basso & Cristiano V. Bizarro & Alan Saghatelian, 2024. "Rp3: Ribosome profiling-assisted proteogenomics improves coverage and confidence during microprotein discovery," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    18. Siyuan Kong & Pengyun Gong & Wen-Feng Zeng & Biyun Jiang & Xinhang Hou & Yang Zhang & Huanhuan Zhao & Mingqi Liu & Guoquan Yan & Xinwen Zhou & Xihua Qiao & Mengxi Wu & Pengyuan Yang & Chao Liu & Weiqi, 2022. "pGlycoQuant with a deep residual network for quantitative glycoproteomics at intact glycopeptide level," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    19. Guilherme Reis-de-Oliveira & Victor Corasolla Carregari & Gabriel Rodrigues dos Reis de Sousa & Daniel Martins-de-Souza, 2024. "OmicScope unravels systems-level insights from quantitative proteomics data," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    20. Sofani Tafesse Gebreyesus & Asad Ali Siyal & Reta Birhanu Kitata & Eric Sheng-Wen Chen & Bayarmaa Enkhbayar & Takashi Angata & Kuo-I Lin & Yu-Ju Chen & Hsiung-Lin Tu, 2022. "Streamlined single-cell proteomics by an integrated microfluidic chip and data-independent acquisition mass spectrometry," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
