Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-34381-8.html

VeChat: correcting errors in long reads using variation graphs

Author

Listed:
  • Xiao Luo

    (Bielefeld University
    Centrum Wiskunde & Informatica)

  • Xiongbin Kang

    (Bielefeld University)

  • Alexander Schönhuth

    (Bielefeld University
    Centrum Wiskunde & Informatica)

Abstract

Error correction is the canonical first step in long-read sequencing data analysis. Current self-correction methods, however, are affected by consensus-sequence-induced biases that mask true variants in lower-frequency haplotypes present in mixed samples. Unlike consensus sequence templates, graph-based reference systems are not affected by such biases and so do not mistakenly mask true variants as errors. We present VeChat as an approach that implements this idea: VeChat is based on variation graphs, a popular type of data structure for pangenome reference systems. Extensive benchmarking experiments demonstrate that long reads corrected by VeChat contain 4 to 15 times (Pacific Biosciences) and 1 to 10 times (Oxford Nanopore Technologies) fewer errors than reads corrected by state-of-the-art approaches. Further, using VeChat prior to long-read assembly significantly improves the haplotype awareness of the assemblies. VeChat is an easy-to-use, open-source tool that is publicly available at https://github.com/HaploKit/vechat.
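The consensus bias described in the abstract above can be illustrated with a toy sketch. This is not VeChat's algorithm: it assumes reads that are already aligned and of equal length, and it replaces the variation graph with simple per-column allele counts. The function names, the `min_support` threshold, and the example reads are all hypothetical. The point is only that correcting toward a single consensus erases a minor-haplotype variant, whereas keeping every allele with enough read support removes the isolated error but preserves the variant.

```python
from collections import Counter

def column_counts(reads):
    """Count bases per column across equal-length, pre-aligned reads."""
    return [Counter(col) for col in zip(*reads)]

def correct_to_consensus(reads):
    """Replace every read with the per-column majority sequence."""
    consensus = "".join(c.most_common(1)[0][0] for c in column_counts(reads))
    return [consensus for _ in reads]

def correct_keeping_supported(reads, min_support=2):
    """Replace a base only if fewer than min_support reads share it in that column."""
    counts = column_counts(reads)
    corrected = []
    for read in reads:
        bases = []
        for base, col in zip(read, counts):
            if col[base] >= min_support:
                bases.append(base)                      # supported allele: keep
            else:
                bases.append(col.most_common(1)[0][0])  # likely error: use majority base
        corrected.append("".join(bases))
    return corrected

# Hypothetical mixed sample: two haplotypes at roughly 8:2 abundance.
major = "ACGTACGT"
minor = "ACGTTCGT"            # true variant at index 4 (A -> T)
noisy = "ACGAACGT"            # major-haplotype read with one error at index 3
reads = [major] * 7 + [noisy] + [minor] * 2

consensus_out = correct_to_consensus(reads)
supported_out = correct_keeping_supported(reads)
```

In this example `consensus_out` contains only the major haplotype, so the two minor-haplotype reads lose their true variant, while `supported_out` fixes the single erroneous base and leaves both minor-haplotype reads unchanged.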

Suggested Citation

  • Xiao Luo & Xiongbin Kang & Alexander Schönhuth, 2022. "VeChat: correcting errors in long reads using variation graphs," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-34381-8
    DOI: 10.1038/s41467-022-34381-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-34381-8
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Joanna Hård & Jeff E. Mold & Jesper Eisfeldt & Christian Tellgren-Roth & Susana Häggqvist & Ignas Bunikis & Orlando Contreras-Lopez & Chen-Shan Chin & Jessica Nordlund & Carl-Johan Rubin & Lars Feuk &, 2023. "Long-read whole-genome analysis of human single cells," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    2. Sarah Morrison-Smith & Christina Boucher & Aleksandra Sarcevic & Noelle Noyes & Catherine O’Brien & Nazaret Cuadros & Jaime Ruiz, 2022. "Challenges in large-scale bioinformatics projects," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-9, December.
    3. Zhikun Wu & Zehang Jiang & Tong Li & Chuanbo Xie & Liansheng Zhao & Jiaqi Yang & Shuai Ouyang & Yizhi Liu & Tao Li & Zhi Xie, 2021. "Structural variants in the Chinese population and their impact on phenotypes, diseases and population adaptation," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    4. Iliana Bista & Jonathan M. D. Wood & Thomas Desvignes & Shane A. McCarthy & Michael Matschiner & Zemin Ning & Alan Tracey & James Torrance & Ying Sims & William Chow & Michelle Smith & Karen Oliver & , 2023. "Genomics of cold adaptations in the Antarctic notothenioid fish radiation," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    5. Heiner Kuhl & Kang Du & Manfred Schartl & Lukáš Kalous & Matthias Stöck & Dunja K. Lamatsch, 2022. "Equilibrated evolution of the mixed auto-/allopolyploid haplotype-resolved genome of the invasive hexaploid Prussian carp," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    6. Kunpeng Li & Peng Xu & Jinpeng Wang & Xin Yi & Yuannian Jiao, 2023. "Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    7. Mohamed Awad & Xiangchao Gan, 2023. "GALA: a computational framework for de novo chromosome-by-chromosome assembly with long reads," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    8. Zhen Huang & Ivanete De O. Furo & Jing Liu & Valentina Peona & Anderson J. B. Gomes & Wan Cen & Hao Huang & Yanding Zhang & Duo Chen & Ting Xue & Qiujin Zhang & Zhicao Yue & Quanxi Wang & Lingyu Yu & , 2022. "Recurrent chromosome reshuffling and the evolution of neo-sex chromosomes in parrots," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    9. M. C. Rühlemann & C. Bang & J. F. Gogarten & B. M. Hermes & M. Groussin & S. Waschina & M. Poyet & M. Ulrich & C. Akoua-Koffi & T. Deschner & J. J. Muyembe-Tamfum & M. M. Robbins & M. Surbeck & R. M. , 2024. "Functional host-specific adaptation of the intestinal microbiome in hominids," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    10. Lucas Serra Moncadas & Cyrill Hofer & Paul-Adrian Bulzu & Jakob Pernthaler & Adrian-Stefan Andrei, 2024. "Freshwater genome-reduced bacteria exhibit pervasive episodes of adaptive stasis," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    11. Ling Zhong & Menghan Zhang & Libing Sun & Yu Yang & Bo Wang & Haibing Yang & Qiang Shen & Yu Xia & Jiarui Cui & Hui Hang & Yi Ren & Bo Pang & Xiangyu Deng & Yahui Zhan & Heng Li & Zhemin Zhou, 2023. "Distributed genotyping and clustering of Neisseria strains reveal continual emergence of epidemic meningococcus over a century," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    12. Aleksandar Stanojković & Svatopluk Skoupý & Hanna Johannesson & Petr Dvořák, 2024. "The global speciation continuum of the cyanobacterium Microcoleus," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    13. Xin Fan & Rong-Chen Dai & Shu Zhang & Yuan-Yuan Geng & Mei Kang & Da-Wen Guo & Ya-Ning Mei & Yu-Hong Pan & Zi-Yong Sun & Ying-Chun Xu & Jie Gong & Meng Xiao, 2023. "Tandem gene duplications contributed to high-level azole resistance in a rapidly expanding Candida tropicalis population," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    14. Lucie Semenec & Amy K. Cain & Catherine J. Dawson & Qi Liu & Hue Dinh & Hannah Lott & Anahit Penesyan & Ram Maharjan & Francesca L. Short & Karl A. Hassan & Ian T. Paulsen, 2023. "Cross-protection and cross-feeding between Klebsiella pneumoniae and Acinetobacter baumannii promotes their co-existence," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    15. Corentin Hochart & Lucas Paoli & Hans-Joachim Ruscheweyh & Guillem Salazar & Emilie Boissin & Sarah Romac & Julie Poulain & Guillaume Bourdin & Guillaume Iwankow & Clémentine Moulin & Maren Ziegler & , 2023. "Ecology of Endozoicomonadaceae in three coral genera across the Pacific Ocean," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    16. Daniel P. Morreale & Eric A. Porsch & Brad K. Kern & Joseph W. Geme & Paul J. Planet, 2023. "Acquisition, co-option, and duplication of the rtx toxin system and the emergence of virulence in Kingella," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    17. Xiyang Dong & Yongyi Peng & Muhua Wang & Laura Woods & Wenxue Wu & Yong Wang & Xi Xiao & Jiwei Li & Kuntong Jia & Chris Greening & Zongze Shao & Casey R. J. Hubert, 2023. "Evolutionary ecology of microbial populations inhabiting deep sea sediments associated with cold seeps," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    18. Xuanji Li & Asker Brejnrod & Jonathan Thorsen & Trine Zachariasen & Urvish Trivedi & Jakob Russel & Gisle Alberg Vestergaard & Jakob Stokholm & Morten Arendt Rasmussen & Søren Johannes Sørensen, 2023. "Differential responses of the gut microbiome and resistome to antibiotic exposures in infants and adults," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    19. Sanjam S. Sawhney & Rhiannon C. Vargas & Meghan A. Wallace & Carol E. Muenks & Brian V. Lubbers & Stephanie A. Fritz & Carey-Ann D. Burnham & Gautam Dantas, 2023. "Diagnostic and commensal Staphylococcus pseudintermedius genomes reveal niche adaptation through parallel selection of defense mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    20. Matteo Sebastianelli & Sifiso M. Lukhele & Simona Secomandi & Stacey G. Souza & Bettina Haase & Michaella Moysi & Christos Nikiforou & Alexander Hutfluss & Jacquelyn Mountcastle & Jennifer Balacco & S, 2024. "A genomic basis of vocal rhythm in birds," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
