
Efficient assembly of nanopore reads via highly accurate and intact error correction

Author

Listed:
  • Ying Chen

    (Sun Yat-sen University)

  • Fan Nie

    (Central South University)

  • Shang-Qian Xie

    (Hainan University)

  • Ying-Feng Zheng

    (Sun Yat-sen University)

  • Qi Dai

    (Zhejiang Sci-Tech University)

  • Thomas Bray

    (Oxford Nanopore Technologies)

  • Yao-Xin Wang

    (Zhejiang Sci-Tech University)

  • Jian-Feng Xing

    (Hainan University)

  • Zhi-Jian Huang

    (Sun Yat-sen University)

  • De-Peng Wang

    (Nextomics Biosciences Co., Ltd)

  • Li-Juan He

    (Sun Yat-sen University)

  • Feng Luo

    (Clemson University)

  • Jian-Xin Wang

    (Central South University)

  • Yi-Zhi Liu

    (Sun Yat-sen University
    Chinese Academy of Medical Sciences)

  • Chuan-Le Xiao

    (Sun Yat-sen University)

Abstract

Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have a broad error distribution and contain high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences during error correction, which reduces both the length of the reads and the contiguity of the final assembly. Here, we develop an error correction and de novo assembly tool designed to overcome complex errors in nanopore reads. We propose an adaptive read selection and two-step progressive method to quickly correct nanopore reads to high accuracy. We introduce a two-stage assembler to utilize the full length of nanopore reads. Our tool achieves superior performance in both error correction and de novo assembly of nanopore reads. It requires only 8122 hours to assemble a 35X coverage human genome and achieves a 2.47-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line shows an NG50 of 22 Mbp. The high-quality assembly of nanopore reads can significantly reduce false positives in structural variation detection.
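
The abstract reports assembly contiguity as NG50. For readers unfamiliar with the metric, the following is a minimal sketch of how NG50 is conventionally computed: contigs are sorted from longest to shortest, and NG50 is the length of the contig at which the running total first reaches half of the estimated genome size. The function name `ng50` and the toy contig lengths are illustrative assumptions, not code or data from the paper.

```python
from typing import Iterable, Optional


def ng50(contig_lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return NG50 for a set of contig lengths.

    NG50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of genome_size.
    Returns None if the contigs cover less than half of the genome.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return None


# Toy contig lengths (hypothetical values, not results from the paper).
baseline = [40, 30, 20, 10]
improved = [80, 15, 5]
genome_size = 100

print(ng50(baseline, genome_size))  # prints 30
print(ng50(improved, genome_size))  # prints 80
```

Unlike N50, which is computed against the total assembly length, NG50 uses a fixed genome size, so values from different assemblies of the same genome can be compared directly. A fold improvement in NG50 is the ratio of two such values.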

Suggested Citation

  • Ying Chen & Fan Nie & Shang-Qian Xie & Ying-Feng Zheng & Qi Dai & Thomas Bray & Yao-Xin Wang & Jian-Feng Xing & Zhi-Jian Huang & De-Peng Wang & Li-Juan He & Feng Luo & Jian-Xin Wang & Yi-Zhi Liu & Chu, 2021. "Efficient assembly of nanopore reads via highly accurate and intact error correction," Nature Communications, Nature, vol. 12(1), pages 1-10, December.
  • Handle: RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-020-20236-7
    DOI: 10.1038/s41467-020-20236-7

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-020-20236-7
    File Function: Abstract
    Download Restriction: no


    Citations

    Cited by:

    1. Rubén Barcia-Cruz & David Goudenège & Jorge A. Moura de Sousa & Damien Piel & Martial Marbouty & Eduardo P. C. Rocha & Frédérique Roux, 2024. "Phage-inducible chromosomal minimalist islands (PICMIs), a novel family of small marine satellites of virulent phages," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    2. Kunpeng Li & Peng Xu & Jinpeng Wang & Xin Yi & Yuannian Jiao, 2023. "Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    3. Ana Paula Zotta Mota & Georgios D. Koutsovoulos & Laetitia Perfus-Barbeoch & Evelin Despot-Slade & Karine Labadie & Jean-Marc Aury & Karine Robbe-Sermesant & Marc Bailly-Bechet & Caroline Belser & Art, 2024. "Unzipped genome assemblies of polyploid root-knot nematodes reveal unusual and clade-specific telomeric repeats," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
