IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v15y2024i1d10.1038_s41467-024-52771-y.html

Taxometer: Improving taxonomic classification of metagenomics contigs

Authors

Listed:
  • Svetlana Kutuzova

    (University of Copenhagen)

  • Mads Nielsen

    (University of Copenhagen)

  • Pau Piera

    (University of Copenhagen)

  • Jakob Nybo Nissen

    (University of Copenhagen)

  • Simon Rasmussen

    (University of Copenhagen
    Broad Institute of MIT and Harvard)

Abstract

For taxonomy-based classification of metagenomics assembled contigs, current methods use sequence similarity to identify their most likely taxonomy. However, in the related field of metagenomic binning, contigs are routinely clustered using information from both the contig sequences and their abundance. We introduce Taxometer, a neural-network-based method that improves the annotations and estimates the quality of any taxonomic classifier using contig abundance profiles and tetra-nucleotide frequencies. We apply Taxometer to five short-read CAMI2 datasets and find that it increases the average share of correct species-level contig annotations of the MMSeqs2 tool from 66.6% to 86.2%. Additionally, it reduces the share of wrong species-level annotations in the CAMI2 Rhizosphere dataset two-fold on average for Metabuli, Centrifuge, and Kraken2. Furthermore, we use Taxometer for benchmarking taxonomic classifiers on two complex long-read metagenomics datasets where ground truth is not known. Taxometer is available as open-source software and can enhance any taxonomic annotation of metagenomic contigs.
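
As an illustration of the two input signals named in the abstract, the sketch below computes tetra-nucleotide frequencies for a contig and appends a normalized per-sample abundance profile. It is a minimal sketch under stated assumptions, not Taxometer's implementation: the function names `tetranucleotide_frequencies` and `feature_vector` are hypothetical, all 256 tetramers are counted separately (real tools may merge reverse complements or use a reduced representation), and windows containing ambiguous bases such as N are simply skipped.

```python
from itertools import product

# All 256 possible tetramers over A, C, G, T, in a fixed order.
BASES = "ACGT"
KMERS = ["".join(p) for p in product(BASES, repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tetranucleotide_frequencies(sequence: str) -> list[float]:
    """Return the relative frequency of each tetramer in `sequence`.

    Hypothetical helper for illustration only. Windows containing
    characters other than A, C, G, T are ignored.
    """
    counts = [0] * len(KMERS)
    seq = sequence.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # Sequence too short or made up entirely of ambiguous bases.
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


def feature_vector(sequence: str, abundances: list[float]) -> list[float]:
    """Concatenate tetramer frequencies with a normalized abundance profile.

    `abundances` holds one value per sample (for example mean read depth);
    the normalization to sum 1 is an assumption made for this sketch.
    """
    total = sum(abundances)
    if total > 0:
        profile = [a / total for a in abundances]
    else:
        profile = [0.0] * len(abundances)
    return tetranucleotide_frequencies(sequence) + profile


if __name__ == "__main__":
    features = feature_vector("ACGTACGTTTGACA", [12.0, 0.5, 3.5])
    print(len(features))  # 256 tetramer frequencies + 3 abundance values
```

A vector of this kind, one per contig, is the sort of input a classifier could combine with existing taxonomic labels; how Taxometer actually encodes and weights these features is described in the full text, not here.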

Suggested Citation

  • Svetlana Kutuzova & Mads Nielsen & Pau Piera & Jakob Nybo Nissen & Simon Rasmussen, 2024. "Taxometer: Improving taxonomic classification of metagenomics contigs," Nature Communications, Nature, vol. 15(1), pages 1-9, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-52771-y
    DOI: 10.1038/s41467-024-52771-y

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-52771-y
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Vincent Somerville & Nadine Thierer & Remo S. Schmidt & Alexandra Roetschi & Lauriane Braillard & Monika Haueter & Hélène Berthoud & Noam Shani & Ueli Ah & Florent Mazel & Philipp Engel, 2024. "Genomic and phenotypic imprints of microbial domestication on cheese starter cultures," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    2. Ernestina Hauptfeld & Nikolaos Pappas & Sandra Iwaarden & Basten L. Snoek & Andrea Aldas-Vargas & Bas E. Dutilh & F. A. Bastiaan Meijenfeldt, 2024. "Integrating taxonomic signals from MAGs and contigs improves read annotation and taxonomic profiling of metagenomes," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    3. Corentin Hochart & Lucas Paoli & Hans-Joachim Ruscheweyh & Guillem Salazar & Emilie Boissin & Sarah Romac & Julie Poulain & Guillaume Bourdin & Guillaume Iwankow & Clémentine Moulin & Maren Ziegler & , 2023. "Ecology of Endozoicomonadaceae in three coral genera across the Pacific Ocean," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    4. Trine Zachariasen & Jakob Russel & Charisse Petersen & Gisle A. Vestergaard & Shiraz Shah & Pablo Atienza Lopez & Moschoula Passali & Stuart E. Turvey & Søren J. Sørensen & Ole Lund & Jakob Stokholm &, 2024. "MAGinator enables accurate profiling of de novo MAGs with strain-level phylogenies," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    5. Suguru Nishijima & Naoyoshi Nagata & Yuya Kiguchi & Yasushi Kojima & Tohru Miyoshi-Akiyama & Moto Kimura & Mitsuru Ohsugi & Kohjiro Ueki & Shinichi Oka & Masashi Mizokami & Takao Itoi & Takashi Kawai , 2022. "Extensive gut virome variation and its associations with host and environmental factors in a population-level cohort," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    6. Rocky D. Payet & Lorelei J. Bilham & Shah Md Tamim Kabir & Serena Monaco & Ash R. Norcott & Mellieha G. E. Allen & Xiao-Yu Zhu & Anthony J. Davy & Charles A. Brearley & Jonathan D. Todd & J. Benjamin , 2024. "Elucidation of Spartina dimethylsulfoniopropionate synthesis genes enables engineering of stress tolerant plants," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    7. Laura Nies & Susheel Bhanu Busi & Mina Tsenkova & Rashi Halder & Elisabeth Letellier & Paul Wilmes, 2022. "Evolution of the murine gut resistome following broad-spectrum antibiotic treatment," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
