
Removal of false positives in metagenomics-based taxonomy profiling via targeting Type IIB restriction sites

Authors
  • Zheng Sun

    (Brigham and Women’s Hospital and Harvard Medical School)

  • Jiang Liu

    (Qingdao OE Biotechnology Company Limited)

  • Meng Zhang

    (Inner Mongolia Agricultural University)

  • Tong Wang

    (Brigham and Women’s Hospital and Harvard Medical School)

  • Shi Huang

    (The University of Hong Kong)

  • Scott T. Weiss

    (Brigham and Women’s Hospital and Harvard Medical School)

  • Yang-Yu Liu

    (Brigham and Women’s Hospital and Harvard Medical School; University of Illinois at Urbana-Champaign)

Abstract

Accurate species identification and abundance estimation are critical for the interpretation of whole metagenome sequencing (WMS) data. Yet, existing metagenomic profilers suffer from false-positive identifications, which can account for more than 90% of total identified species. Here, by leveraging species-specific Type IIB restriction endonuclease digestion sites as the reference instead of universal markers or whole microbial genomes, we present a metagenomic profiler, MAP2B (MetAgenomic Profiler based on type IIB restriction sites), to address this problem. We first illustrate the pitfalls of using relative abundance as the only feature in determining false positives. We then propose a feature set to distinguish false positives from true positives, and using simulated metagenomes from CAMI2, we establish a false-positive recognition model. By benchmarking the performance in metagenomic profiling using a simulation dataset with varying sequencing depth and species richness, we illustrate the superior performance of MAP2B over existing metagenomic profilers in species identification. We further test the performance of MAP2B using real WMS data from an ATCC mock community, confirming its superior precision against sequencing depth. Finally, by leveraging WMS data from an IBD cohort, we demonstrate that the taxonomic features generated by MAP2B can better discriminate IBD and predict metabolomic profiles.

Suggested Citation

  • Zheng Sun & Jiang Liu & Meng Zhang & Tong Wang & Shi Huang & Scott T. Weiss & Yang-Yu Liu, 2023. "Removal of false positives in metagenomics-based taxonomy profiling via targeting Type IIB restriction sites," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-41099-8
    DOI: 10.1038/s41467-023-41099-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-41099-8
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kaiyuan Zhu & Alejandro A. Schäffer & Welles Robinson & Junyan Xu & Eytan Ruppin & A. Funda Ergun & Yuzhen Ye & S. Cenk Sahinalp, 2022. "Strain level microbial detection and quantification with applications to single cell metagenomics," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    2. Pengfei Song & Wen Qin & YanGan Huang & Lei Wang & Zhenyuan Cai & Tongzuo Zhang, 2020. "Grazing Management Influences Gut Microbial Diversity of Livestock in the Same Area," Sustainability, MDPI, vol. 12(10), pages 1-12, May.
    3. Shilan Li & Jianxin Shi & Paul Albert & Hong-Bin Fang, 2022. "Dependence Structure Analysis and Its Application in Human Microbiome," Mathematics, MDPI, vol. 11(1), pages 1-14, December.
    4. Allison G. White & George S. Watts & Zhenqiang Lu & Maria M. Meza-Montenegro & Eric A. Lutz & Philip Harber & Jefferey L. Burgess, 2014. "Environmental Arsenic Exposure and Microbiota in Induced Sputum," IJERPH, MDPI, vol. 11(2), pages 1-15, February.
    5. Leandro C. Hermida & E. Michael Gertz & Eytan Ruppin, 2022. "Predicting cancer prognosis and drug response from the tumor microbiome," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    6. Yunmin Yang & Binbin Chu & Jiayi Cheng & Jiali Tang & Bin Song & Houyu Wang & Yao He, 2022. "Bacteria eat nanoprobes for aggregation-enhanced imaging and killing diverse microorganisms," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    7. Amanda Sörensen Ristinmaa & Albert Tafur Rangel & Alexander Idström & Sebastian Valenzuela & Eduard J. Kerkhoven & Phillip B. Pope & Merima Hasani & Johan Larsbrink, 2023. "Resin acids play key roles in shaping microbial communities during degradation of spruce bark," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    8. Yong Li & Jiejie Zhang & Jianqiang Zhang & Wenlai Xu & Zishen Mou, 2019. "Microbial Community Structure in the Sediments and Its Relation to Environmental Factors in Eutrophicated Sancha Lake," IJERPH, MDPI, vol. 16(11), pages 1-15, May.
    9. Monica Vera-Lise Tulstrup & Ellen Gerd Christensen & Vera Carvalho & Caroline Linninge & Siv Ahrné & Ole Højberg & Tine Rask Licht & Martin Iain Bahl, 2015. "Antibiotic Treatment Affects Intestinal Permeability and Gut Microbial Composition in Wistar Rats Dependent on Antibiotic Class," PLOS ONE, Public Library of Science, vol. 10(12), pages 1-17, December.
    10. Zhenqiu Liu & Dechang Chen & Li Sheng & Amy Y Liu, 2013. "Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data," PLOS ONE, Public Library of Science, vol. 8(3), pages 1-7, March.
    11. Peng Liu & Jessica Ewald & Zhiqiang Pang & Elena Legrand & Yeon Seon Jeon & Jonathan Sangiovanni & Orcun Hacariz & Guangyan Zhou & Jessica A. Head & Niladri Basu & Jianguo Xia, 2023. "ExpressAnalyst: A unified platform for RNA-sequencing analysis in non-model species," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    12. Paul J McMurdie & Susan Holmes, 2014. "Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible," PLOS Computational Biology, Public Library of Science, vol. 10(4), pages 1-12, April.
    13. Candice R. Gurbatri & Georgette A. Radford & Laura Vrbanac & Jongwon Im & Elaine M. Thomas & Courtney Coker & Samuel R. Taylor & YoungUk Jang & Ayelet Sivan & Kyu Rhee & Anas A. Saleh & Tiffany Chien , 2024. "Engineering tumor-colonizing E. coli Nissle 1917 for detection and treatment of colorectal neoplasia," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    14. Wei Ding & Shougang Wang & Peng Qin & Shen Fan & Xiaoyan Su & Peiyan Cai & Jie Lu & Han Cui & Meng Wang & Yi Shu & Yongming Wang & Hui-Hui Fu & Yu-Zhong Zhang & Yong-Xin Li & Weipeng Zhang, 2023. "Anaerobic thiosulfate oxidation by the Roseobacter group is prevalent in marine biofilms," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    15. Daniela S. Aliaga Goltsman & Lisa M. Alexander & Jyun-Liang Lin & Rodrigo Fregoso Ocampo & Benjamin Freeman & Rebecca C. Lamothe & Andres Perez Rivas & Morayma M. Temoche-Diaz & Shailaja Chadha & Nata, 2022. "Compact Cas9d and HEARO enzymes for genome editing discovered from uncultivated microbes," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    16. Ernestina Hauptfeld & Nikolaos Pappas & Sandra Iwaarden & Basten L. Snoek & Andrea Aldas-Vargas & Bas E. Dutilh & F. A. Bastiaan Meijenfeldt, 2024. "Integrating taxonomic signals from MAGs and contigs improves read annotation and taxonomic profiling of metagenomes," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    17. Edoardo Pasolli & Duy Tin Truong & Faizan Malik & Levi Waldron & Nicola Segata, 2016. "Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-26, July.
    18. J. L. Rolando & M. Kolton & T. Song & Y. Liu & P. Pinamang & R. Conrad & J. T. Morris & K. T. Konstantinidis & J. E. Kostka, 2024. "Sulfur oxidation and reduction are coupled to nitrogen fixation in the roots of the salt marsh foundation plant Spartina alterniflora," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    19. Omary Mzava & Alexandre Pellan Cheng & Adrienne Chang & Sami Smalling & Liz-Audrey Kounatse Djomnang & Joan Sesing Lenz & Randy Longman & Amy Steadman & Luis G. Gómez-Escobar & Edward J. Schenck & Mir, 2022. "A metagenomic DNA sequencing assay that is robust against environmental DNA contamination," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    20. Bin Wang, 2020. "A Zipf-plot based normalization method for high-throughput RNA-seq data," PLOS ONE, Public Library of Science, vol. 15(4), pages 1-15, April.
