IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-43010-x.html

Accurate de novo peptide sequencing using fully convolutional neural networks

Author

Listed:
  • Kaiyuan Liu

    (Indiana University)

  • Yuzhen Ye

    (Indiana University)

  • Sujun Li

    (Indiana University; Dengding BioAI Co., Ltd.)

  • Haixu Tang

    (Indiana University)

Abstract

De novo peptide sequencing, which does not rely on a comprehensive target sequence database, provides a way to identify novel peptides from tandem mass spectra. However, current de novo sequencing algorithms suffer from low accuracy and coverage, which hinders their application in proteomics. In this paper, we present PepNet, a fully convolutional neural network for high-accuracy de novo peptide sequencing. PepNet takes an MS/MS spectrum (represented as a high-dimensional vector) as input and outputs the optimal peptide sequence along with its confidence score. The PepNet model is trained on a total of 3 million high-energy collisional dissociation MS/MS spectra from multiple human peptide spectral libraries. Evaluation results show that PepNet significantly outperforms the current best-performing de novo sequencing algorithms (e.g., PointNovo and DeepNovo) in both peptide-level and positional-level accuracy. PepNet can sequence a large fraction of the spectra that database search engines fail to identify, and could therefore complement database search engines for peptide identification in proteomics. In addition, PepNet runs around 3x faster than PointNovo and 7x faster than DeepNovo on GPUs, making it more suitable for the analysis of large-scale proteomics data.
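
As an illustration of what "represented as a high-dimensional vector" can mean in practice, the following is a minimal sketch that bins the (m/z, intensity) peaks of one spectrum into a fixed-length vector. The function name spectrum_to_vector, the 2000 m/z range, the 0.1 bin width, and the max-intensity normalization are all assumptions chosen for illustration; the abstract does not specify PepNet's actual preprocessing.

```python
from typing import List, Sequence, Tuple


def spectrum_to_vector(
    peaks: Sequence[Tuple[float, float]],
    max_mz: float = 2000.0,   # assumed upper m/z limit (hypothetical)
    bin_width: float = 0.1,   # assumed bin width in m/z units (hypothetical)
) -> List[float]:
    """Bin (m/z, intensity) peaks into a fixed-length vector.

    Illustrative only: these defaults are not taken from the PepNet paper.
    """
    n_bins = int(round(max_mz / bin_width))
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz < 0 or intensity <= 0:
            continue  # skip invalid or empty peaks
        index = int(mz / bin_width)
        if index >= n_bins:
            continue  # peak lies beyond the modelled m/z range
        # Keep the strongest peak that falls into each bin.
        vector[index] = max(vector[index], intensity)
    top = max(vector)
    if top > 0:
        # Scale so the most intense bin equals 1.0.
        vector = [v / top for v in vector]
    return vector


# Toy spectrum: two peaks share a bin, one peak is out of range.
peaks = [(175.119, 1200.0), (276.155, 300.0), (276.171, 600.0), (2500.0, 50.0)]
vec = spectrum_to_vector(peaks)
print(len(vec), vec[1751], vec[2761])
```

A vector like this has a fixed length regardless of how many peaks the spectrum contains, which is what allows a convolutional network to take it as input directly.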

Suggested Citation

  • Kaiyuan Liu & Yuzhen Ye & Sujun Li & Haixu Tang, 2023. "Accurate de novo peptide sequencing using fully convolutional neural networks," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-43010-x
    DOI: 10.1038/s41467-023-43010-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-43010-x
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-023-43010-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Sangtae Kim & Pavel A. Pevzner, 2014. "MS-GF+ makes progress towards a universal database search tool for proteomics," Nature Communications, Nature, vol. 5(1), pages 1-10, December.
    2. Chen, Gong-meng & Firth, Michael & Rui, Oliver M, 2001. "The Dynamic Relation between Stock Returns, Trading Volume, and Volatility," The Financial Review, Eastern Finance Association, vol. 36(3), pages 153-173, August.
    Full references (including those not matched with items on IDEAS)

    Citations

    Cited by:

    1. Melih Yilmaz & William E. Fondrie & Wout Bittremieux & Carlo F. Melendez & Rowan Nelson & Varun Ananth & Sewoong Oh & William Stafford Noble, 2024. "Sequence-to-sequence translation from mass spectra to peptides with a transformer model," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    2. Thierry Bihan & Teresa Nunez de Villavicencio Diaz & Chelsea Reitzel & Victoria Lange & Minyoung Park & Emma Beadle & Lin Wu & Marko Jovic & Rosalin M. Dubois & Amber L. Couzens & Jin Duan & Xiaobing , 2024. "De novo protein sequencing of antibodies for identification of neutralizing antibodies in human plasma post SARS-CoV-2 vaccination," Nature Communications, Nature, vol. 15(1), pages 1-13, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Brian Sing Fan Chan & Andy Cheuk Hin Cheng & Alfred Ka Chun Ma, 2018. "Stock Market Volatility and Trading Volume: A Special Case in Hong Kong With Stock Connect Turnover," JRFM, MDPI, vol. 11(4), pages 1-17, October.
    2. Dinh, Minh Thi Hong, 2018. "The relationship between volume imbalance and spread," Research in International Business and Finance, Elsevier, vol. 44(C), pages 76-87.
    3. Hau, Liya & Zhu, Huiming & Shahbaz, Muhammad & Sun, Wuqin, 2021. "Does transaction activity predict Bitcoin returns? Evidence from quantile-on-quantile analysis," The North American Journal of Economics and Finance, Elsevier, vol. 55(C).
    4. Zou, Yongjie & Li, Honggang, 2014. "Time spans between price maxima and price minima in stock markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 395(C), pages 303-309.
    5. Do, Hung Xuan & Brooks, Robert & Treepongkaruna, Sirimon & Wu, Eliza, 2014. "How does trading volume affect financial return distributions?," International Review of Financial Analysis, Elsevier, vol. 35(C), pages 190-206.
    6. Abhinava Tripathi, 2021. "The Arrival of Information and Price Adjustment Across Extreme Quantiles: Global Evidence," IIM Kozhikode Society & Management Review, , vol. 10(1), pages 7-19, January.
    7. Pankaj Sinha & Shalini Agnihotri, 2016. "Investigating Impact of Volatility Persistence and Information Inflow on Volatility of Stock Indices Using Bivarite GJR-GARCH," Global Business Review, International Management Institute, vol. 17(5), pages 1145-1161, October.
    8. Tazhikul Mashirova & Karlygash Tastanbekova & Murat Nurgabylov & Gulnar Lukhmanova & Kundyz Myrzabekkyzy, 2023. "Analysis of the Relationship between the Highest Price and the Trading Volume of the Energy Company Shares in Kazakhstan with Frequency Domain Causality Method," International Journal of Energy Economics and Policy, Econjournals, vol. 13(4), pages 22-27, July.
    9. Kausik Chaudhuri & Alok Kumar, 2015. "A Markov-Switching Model for Indian Stock Price and Volume," Journal of Emerging Market Finance, Institute for Financial Management and Research, vol. 14(3), pages 239-257, December.
    10. Hervé, Fabrice & Zouaoui, Mohamed & Belvaux, Bertrand, 2019. "Noise traders and smart money: Evidence from online searches," Economic Modelling, Elsevier, vol. 83(C), pages 141-149.
    11. Yao, Yi & Yang, Rong & Liu, Zhiyuan & Hasan, Iftekhar, 2013. "Government intervention and institutional trading strategy: Evidence from a transition country," Global Finance Journal, Elsevier, vol. 24(1), pages 44-68.
    12. Alizadeh, Amir H. & Tamvakis, Michael, 2016. "Market conditions, trader types and price–volume relation in energy futures markets," Energy Economics, Elsevier, vol. 56(C), pages 134-149.
    13. repec:dau:papers:123456789/5069 is not listed on IDEAS
    14. Wang, Danxia, 2024. "Beyond active share: Boosting fund performance through common holdings with same-benchmark mutual funds," International Review of Financial Analysis, Elsevier, vol. 92(C).
    15. Sun, Changyou, 2013. "Price variation and volume dynamics of securitized timberlands," Forest Policy and Economics, Elsevier, vol. 27(C), pages 44-53.
    16. Farag, Hisham & Cressy, Robert, 2011. "Do regulatory policies affect the flow of information in emerging markets?," Research in International Business and Finance, Elsevier, vol. 25(3), pages 238-254, September.
    17. Rashid, Abdul, 2007. "Stock prices and trading volume: An assessment for linear and nonlinear Granger causality," Journal of Asian Economics, Elsevier, vol. 18(4), pages 595-612, August.
    18. Cathy W.S. Chen & Mike K.P. So & Thomas C. Chiang, 2016. "Evidence of Stock Returns and Abnormal Trading Volume: A Threshold Quantile Regression Approach," The Japanese Economic Review, Japanese Economic Association, vol. 67(1), pages 96-124, March.
    19. Nikolaos Antonakakis & Ioannis Chatziantoniou & David Gabauer, 2021. "A regional decomposition of US housing prices and volume: market dynamics and Portfolio diversification," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 66(2), pages 279-307, April.
    20. Kyritsis Konstantinos & Sotiropoulos Ioannis & Gogos Christos & Kypriotelis Efstratios, 2007. "Note On The Effect Of The 11 Years Global Climate Cycle On The Prices Of The Capital Markets," Post-Print hal-01552349, HAL.
    21. Hussain, Syed Mujahid, 2011. "Intraday trading volume and international spillover effects," Research in International Business and Finance, Elsevier, vol. 25(2), pages 183-194, June.
