Adapting nanopore sequencing basecalling models for modification detection via incremental learning and anomaly detection

Author

Listed:
  • Ziyuan Wang

    (University of Arizona)

  • Yinshan Fang

    (Columbia University Medical Center)

  • Ziyang Liu

    (University of Arizona)

  • Ning Hao

    (University of Arizona)

  • Hao Helen Zhang

    (University of Arizona)

  • Xiaoxiao Sun

    (University of Arizona)

  • Jianwen Que

    (Columbia University Medical Center)

  • Hongxu Ding

    (University of Arizona)

Abstract

We leverage machine learning approaches to adapt nanopore sequencing basecallers for nucleotide modification detection. We first apply the incremental learning (IL) technique to improve the basecalling of modification-rich sequences, which are usually of high biological interest. With sequence backbones resolved, we then run anomaly detection (AD) on individual nucleotides to determine their modification status. In this way, our pipeline promises single-molecule, single-nucleotide, and sequence context-free detection of modifications. We benchmark the pipeline using control oligos, and further apply it to the basecalling of densely modified yeast tRNAs and E. coli genomic DNAs, the cross-species detection of N6-methyladenosine (m6A) in mammalian mRNAs, and the simultaneous detection of N1-methyladenosine (m1A) and m6A in human mRNAs. Our IL-AD workflow is available at: https://github.com/wangziyuan66/IL-AD .
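
The anomaly detection step in the abstract can be illustrated with a toy sketch. This is not the authors' IL-AD implementation. It assumes a single hypothetical per-nucleotide feature (for example, a normalized signal level), models canonical nucleotides with a mean and standard deviation, and flags observations that deviate strongly from that canonical model. The function names, the synthetic data, and the threshold of 3.0 are all illustrative assumptions.

```python
import random
import statistics


def fit_canonical(values):
    """Estimate mean and standard deviation from canonical (unmodified) observations."""
    return statistics.fmean(values), statistics.stdev(values)


def anomaly_score(value, mean, std):
    """Absolute z-score of one observation under the canonical model."""
    return abs(value - mean) / std


def call_modified(values, mean, std, threshold=3.0):
    """Flag each observation whose anomaly score exceeds the (assumed) threshold."""
    return [anomaly_score(v, mean, std) > threshold for v in values]


# Synthetic stand-in for canonical training data (hypothetical, not real nanopore signal).
rng = random.Random(0)
canonical = [rng.gauss(0.0, 1.0) for _ in range(1000)]
mean, std = fit_canonical(canonical)

# Two observations close to the canonical distribution and one far from it.
calls = call_modified([0.1, -0.4, 6.0], mean, std)
print(calls)
```

Because such a detector is fitted only on canonical data, it needs no labeled examples of any particular modification, which is one plausible reading of how an AD-based approach could remain independent of sequence context. The workflow actually used is in the repository linked in the abstract.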

Suggested Citation

  • Ziyuan Wang & Yinshan Fang & Ziyang Liu & Ning Hao & Hao Helen Zhang & Xiaoxiao Sun & Jianwen Que & Hongxu Ding, 2024. "Adapting nanopore sequencing basecalling models for modification detection via incremental learning and anomaly detection," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-51639-5
    DOI: 10.1038/s41467-024-51639-5

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-51639-5
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. P Acera Mateos & A J Sethi & A Ravindran & A Srivastava & K Woodward & S Mahmud & M Kanchi & M Guarnacci & J Xu & Z W S Yuen & Y Zhou & A Sneddon & W Hamilton & J Gao & L M Starrs & R Hayashi & V Wick, 2024. "Prediction of m6A and m5C at single-molecule resolution reveals a transcriptome-wide co-occurrence of RNA modifications," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    2. Jianheng Liu & Tao Huang & Jing Yao & Tianxuan Zhao & Yusen Zhang & Rui Zhang, 2023. "Epitranscriptomic subtyping, visualization, and denoising by global motif visualization," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    3. Eseosa Halima Ighile & Hiroaki Shirakawa & Hiroki Tanikawa, 2022. "Application of GIS and Machine Learning to Predict Flood Areas in Nigeria," Sustainability, MDPI, vol. 14(9), pages 1-33, April.
    4. Zhiyuan Luo & Jiacheng Zhang & Jingyi Fei & Shengdong Ke, 2022. "Deep learning modeling m6A deposition reveals the importance of downstream cis-element sequences," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    5. Mian Umair Ahsan & Anagha Gouru & Joe Chan & Wanding Zhou & Kai Wang, 2024. "A signal processing and deep learning framework for methylation detection using Oxford Nanopore sequencing," Nature Communications, Nature, vol. 15(1), pages 1-21, December.
    6. Abdur Rasool & Qiang Qu & Yang Wang & Qingshan Jiang, 2022. "Bio-Constrained Codes with Neural Network for Density-Based DNA Data Storage," Mathematics, MDPI, vol. 10(5), pages 1-21, March.
    7. Adrian Chan & Isabel S. Naarmann-de Vries & Carolin P. M. Scheitl & Claudia Höbartner & Christoph Dieterich, 2024. "Detecting m6A at single-molecular resolution via direct RNA sequencing and realistic training data," Nature Communications, Nature, vol. 15(1), pages 1-8, December.
    8. Dominik Stanojević & Zhe Li & Sara Bakić & Roger Foo & Mile Šikić, 2024. "Rockfish: A transformer-based model for accurate 5-methylcytosine prediction from nanopore sequencing," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    9. Katja Hartstock & Nadine A. Kueck & Petr Spacek & Anna Ovcharenko & Sabine Hüwel & Nicolas V. Cornelissen & Amarnath Bollu & Christoph Dieterich & Andrea Rentmeister, 2023. "MePMe-seq: antibody-free simultaneous m6A and m5C mapping in mRNA by metabolic propargyl labeling and sequencing," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    10. Belinda Baquero-Pérez & Ivaylo D. Yonchev & Anna Delgado-Tejedor & Rebeca Medina & Mireia Puig-Torrents & Ian Sudbery & Oguzhan Begik & Stuart A. Wilson & Eva Maria Novoa & Juana Díez, 2024. "N6-methyladenosine modification is not a general trait of viral RNA genomes," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    11. You Wu & Wenna Shao & Mengxiao Yan & Yuqin Wang & Pengfei Xu & Guoqiang Huang & Xiaofei Li & Brian D. Gregory & Jun Yang & Hongxia Wang & Xiang Yu, 2024. "Transfer learning enables identification of multiple types of RNA modifications using nanopore direct RNA sequencing," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    12. Adrien Leger & Paulo P. Amaral & Luca Pandolfini & Charlotte Capitanchik & Federica Capraro & Valentina Miano & Valentina Migliori & Patrick Toolan-Kerr & Theodora Sideri & Anton J. Enright & Konstant, 2021. "RNA modifications detection by comparative Nanopore direct RNA sequencing," Nature Communications, Nature, vol. 12(1), pages 1-17, December.
    13. Anna Delgado-Tejedor & Rebeca Medina & Oguzhan Begik & Luca Cozzuto & Judith López & Sandra Blanco & Julia Ponomarenko & Eva Maria Novoa, 2024. "Native RNA nanopore sequencing reveals antibiotic-induced loss of rRNA modifications in the A- and P-sites," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    14. Zhangli Su & Ida Monshaugen & Briana Wilson & Fengbin Wang & Arne Klungland & Rune Ougland & Anindya Dutta, 2022. "TRMT6/61A-dependent base methylation of tRNA-derived fragments regulates gene-silencing activity and the unfolded protein response in bladder cancer," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    15. Gaolian Xu & Hao Yang & Jiani Qiu & Julien Reboud & Linqing Zhen & Wei Ren & Hong Xu & Jonathan M. Cooper & Hongchen Gu, 2023. "Sequence terminus dependent PCR for site-specific mutation and modification detection," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    16. Hongna Zuo & Aiwei Wu & Mingwei Wang & Liquan Hong & Hu Wang, 2024. "tRNA m1A modification regulate HSC maintenance and self-renewal via mTORC1 signaling," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
