VeChat: correcting errors in long reads using variation graphs

Author

Listed:
  • Xiao Luo

    (Bielefeld University
    Centrum Wiskunde & Informatica)

  • Xiongbin Kang

    (Bielefeld University)

  • Alexander Schönhuth

    (Bielefeld University
    Centrum Wiskunde & Informatica)

Abstract

Error correction is the canonical first step in long-read sequencing data analysis. Current self-correction methods, however, are affected by biases induced by consensus sequences: these biases mask true variants in lower-frequency haplotypes present in mixed samples. Unlike consensus sequence templates, graph-based reference systems are not affected by such biases, and so do not mistakenly mask true variants as errors. We present VeChat, an approach that implements this idea. VeChat is based on variation graphs, a popular type of data structure for pangenome reference systems. Extensive benchmarking experiments demonstrate that long reads corrected by VeChat contain 4 to 15 times (Pacific Biosciences) and 1 to 10 times (Oxford Nanopore Technologies) fewer errors than reads corrected by state-of-the-art approaches. Further, using VeChat prior to long-read assembly significantly improves the haplotype awareness of the assemblies. VeChat is an easy-to-use, open-source tool that is publicly available at https://github.com/HaploKit/vechat.
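
The consensus bias described in the abstract can be illustrated with a toy sketch. This is not VeChat's algorithm, which operates on variation graphs; it assumes reads that are already aligned and of equal length, and the function names and the `min_support` threshold are hypothetical choices made only for illustration. It contrasts majority-vote correction, which overwrites a minority haplotype, with a frequency-aware rule that keeps any base supported by several reads.

```python
from collections import Counter


def column_counts(reads):
    """Count bases per column of equally long, pre-aligned reads."""
    return [Counter(column) for column in zip(*reads)]


def consensus_correct(reads):
    """Replace every read with the per-column majority (consensus) sequence."""
    counts = column_counts(reads)
    consensus = "".join(c.most_common(1)[0][0] for c in counts)
    return [consensus for _ in reads]


def frequency_aware_correct(reads, min_support=2):
    """Keep a base if at least `min_support` reads share it in that column.

    Bases with less support are treated as errors and replaced by the
    column majority. `min_support` is an illustrative assumption.
    """
    counts = column_counts(reads)
    corrected = []
    for read in reads:
        bases = [
            base if c[base] >= min_support else c.most_common(1)[0][0]
            for base, c in zip(read, counts)
        ]
        corrected.append("".join(bases))
    return corrected


# Five reads from a major haplotype (one carrying a singleton error)
# and three reads from a minor haplotype with a true variant at index 4.
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACCTACGT",  # sequencing error at index 2 (G -> C)
    "ACGTACGT",
    "ACGTACGT",
    "ACGTTCGT",  # minor haplotype, true variant at index 4 (A -> T)
    "ACGTTCGT",
    "ACGTTCGT",
]

print(Counter(consensus_correct(reads)))
print(Counter(frequency_aware_correct(reads)))
```

In this toy input, the consensus rule collapses all eight reads onto the major haplotype, whereas the frequency-aware rule removes the singleton error and leaves the three minor-haplotype reads intact.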

Suggested Citation

  • Xiao Luo & Xiongbin Kang & Alexander Schönhuth, 2022. "VeChat: correcting errors in long reads using variation graphs," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-34381-8
    DOI: 10.1038/s41467-022-34381-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-34381-8
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sarah Morrison-Smith & Christina Boucher & Aleksandra Sarcevic & Noelle Noyes & Catherine O’Brien & Nazaret Cuadros & Jaime Ruiz, 2022. "Challenges in large-scale bioinformatics projects," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-9, December.
    2. Joanna Hård & Jeff E. Mold & Jesper Eisfeldt & Christian Tellgren-Roth & Susana Häggqvist & Ignas Bunikis & Orlando Contreras-Lopez & Chen-Shan Chin & Jessica Nordlund & Carl-Johan Rubin & Lars Feuk &, 2023. "Long-read whole-genome analysis of human single cells," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    3. Zhikun Wu & Zehang Jiang & Tong Li & Chuanbo Xie & Liansheng Zhao & Jiaqi Yang & Shuai Ouyang & Yizhi Liu & Tao Li & Zhi Xie, 2021. "Structural variants in the Chinese population and their impact on phenotypes, diseases and population adaptation," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    4. Max E. Schön & Vasily V. Zlatogursky & Rohan P. Singh & Camille Poirier & Susanne Wilken & Varsha Mathur & Jürgen F. H. Strassert & Jarone Pinhassi & Alexandra Z. Worden & Patrick J. Keeling & Thijs J, 2021. "Single cell genomics reveals plastid-lacking Picozoa are close relatives of red algae," Nature Communications, Nature, vol. 12(1), pages 1-10, December.
    5. Cheng-Kai Shiau & Lina Lu & Rachel Kieser & Kazutaka Fukumura & Timothy Pan & Hsiao-Yun Lin & Jie Yang & Eric L. Tong & GaHyun Lee & Yuanqing Yan & Jason T. Huse & Ruli Gao, 2023. "High throughput single cell long-read sequencing analyses of same-cell genotypes and phenotypes in human tumors," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    6. Qian Zhou & Fahu Ji & Dongxiao Lin & Xianming Liu & Zexuan Zhu & Jue Ruan, 2024. "KSNP: a fast de Bruijn graph-based haplotyping tool approaching data-in time cost," Nature Communications, Nature, vol. 15(1), pages 1-7, December.
    7. Iliana Bista & Jonathan M. D. Wood & Thomas Desvignes & Shane A. McCarthy & Michael Matschiner & Zemin Ning & Alan Tracey & James Torrance & Ying Sims & William Chow & Michelle Smith & Karen Oliver & , 2023. "Genomics of cold adaptations in the Antarctic notothenioid fish radiation," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    8. Nenad Macesic & Jane Hawkey & Ben Vezina & Jessica A. Wisniewski & Hugh Cottingham & Luke V. Blakeway & Taylor Harshegyi & Katherine Pragastis & Gnei Zweena Badoordeen & Amanda Dennison & Denis W. Spe, 2023. "Genomic dissection of endemic carbapenem resistance reveals metallo-beta-lactamase dissemination through clonal, plasmid and integron transfer," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    9. Minghui Cheng & Yingjie Xu & Xiao Cui & Xin Wei & Yundi Chang & Jun Xu & Cheng Lei & Lei Xue & Yifan Zheng & Zhang Wang & Lingtong Huang & Min Zheng & Hong Luo & Yuxin Leng & Chao Jiang, 2024. "Deep longitudinal lower respiratory tract microbiome profiling reveals genome-resolved functional and evolutionary dynamics in critical illness," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    10. M. Mahmoud & Y. Huang & K. Garimella & P. A. Audano & W. Wan & N. Prasad & R. E. Handsaker & S. Hall & A. Pionzio & M. C. Schatz & M. E. Talkowski & E. E. Eichler & S. E. Levy & F. J. Sedlazeck, 2024. "Utility of long-read sequencing for All of Us," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    11. Anna Zimmermann & Julian E. Prieto-Vivas & Charlotte Cautereels & Anton Gorkovskiy & Jan Steensels & Yves Peer & Kevin J. Verstrepen, 2023. "A Cas3-base editing tool for targetable in vivo mutagenesis," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    12. Jean-Sebastien Gounot & Minghao Chia & Denis Bertrand & Woei-Yuh Saw & Aarthi Ravikrishnan & Adrian Low & Yichen Ding & Amanda Hui Qi Ng & Linda Wei Lin Tan & Yik-Ying Teo & Henning Seedorf & Niranjan, 2022. "Genome-centric analysis of short and long read metagenomes reveals uncharacterized microbiome diversity in Southeast Asians," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    13. Vincent Somerville & Nadine Thierer & Remo S. Schmidt & Alexandra Roetschi & Lauriane Braillard & Monika Haueter & Hélène Berthoud & Noam Shani & Ueli Ah & Florent Mazel & Philipp Engel, 2024. "Genomic and phenotypic imprints of microbial domestication on cheese starter cultures," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    14. Heiner Kuhl & Kang Du & Manfred Schartl & Lukáš Kalous & Matthias Stöck & Dunja K. Lamatsch, 2022. "Equilibrated evolution of the mixed auto-/allopolyploid haplotype-resolved genome of the invasive hexaploid Prussian carp," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    15. Ashley T. Sendell-Price & Frank J. Tulenko & Mats Pettersson & Du Kang & Margo Montandon & Sylke Winkler & Kathleen Kulb & Gavin P. Naylor & Adam Phillippy & Olivier Fedrigo & Jacquelyn Mountcastle & , 2023. "Low mutation rate in epaulette sharks is consistent with a slow rate of evolution in sharks," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    16. Kunpeng Li & Peng Xu & Jinpeng Wang & Xin Yi & Yuannian Jiao, 2023. "Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    17. Temitayo A. Olagunju & Benjamin D. Rosen & Holly L. Neibergs & Gabrielle M. Becker & Kimberly M. Davenport & Christine G. Elsik & Tracy S. Hadfield & Sergey Koren & Kristen L. Kuhn & Arang Rhie & Kati, 2024. "Telomere-to-telomere assemblies of cattle and sheep Y-chromosomes uncover divergent structure and gene content," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    18. Mohamed Awad & Xiangchao Gan, 2023. "GALA: a computational framework for de novo chromosome-by-chromosome assembly with long reads," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    19. Jordy Evan Sulaiman & Jaron Thompson & Yili Qian & Eugenio I. Vivas & Christian Diener & Sean M. Gibbons & Nasia Safdar & Ophelia S. Venturelli, 2024. "Elucidating human gut microbiota interactions that robustly inhibit diverse Clostridioides difficile strains across different nutrient landscapes," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
    20. Zhen Huang & Ivanete De O. Furo & Jing Liu & Valentina Peona & Anderson J. B. Gomes & Wan Cen & Hao Huang & Yanding Zhang & Duo Chen & Ting Xue & Qiujin Zhang & Zhicao Yue & Quanxi Wang & Lingyu Yu & , 2022. "Recurrent chromosome reshuffling and the evolution of neo-sex chromosomes in parrots," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
