
Ranbow: A fast and accurate method for polyploid haplotype reconstruction

Author

Listed:
  • M-Hossein Moeinzadeh
  • Jun Yang
  • Evgeny Muzychenko
  • Giuseppe Gallone
  • David Heller
  • Knut Reinert
  • Stefan Haas
  • Martin Vingron

Abstract

Reconstructing haplotypes from sequencing data is one of the major challenges in genetics. Haplotypes play a crucial role in many analyses, including genome-wide association studies and population genetics. Haplotype reconstruction becomes more difficult as the number of homologous chromosomes increases, as is often the case for polyploid plants. This complexity is compounded by higher heterozygosity, that is, the frequent presence of variants between haplotypes. We have designed Ranbow, a new tool for haplotype reconstruction of polyploid genomes from short-read sequencing data. Ranbow integrates all types of small variants in bi- and multi-allelic sites to reconstruct haplotypes. To evaluate Ranbow and currently available competing methods on real data, we have created and released a real gold-standard dataset from sweet potato sequencing data. Our evaluations on real and simulated data show that Ranbow outperforms the competing methods in terms of accuracy, haplotype length, memory usage, and running time. Specifically, Ranbow is one order of magnitude faster than the next best method. The efficiency and accuracy of Ranbow make whole-genome haplotype reconstruction of complex genomes with higher ploidy feasible.

Author summary: We focus on the problem of reconstructing haplotypes for polyploid genomes. We explored the use of short-read sequence data from a highly heterozygous hexaploid genome. We observed that short-read data from strongly heterozygous organisms open up a way for haplotype reconstruction by supplying overlap information between reads. We therefore investigated the role of heterozygosity and ploidy number. Although higher heterozygosity provides more useful reads for reconstructing haplotypes, polyploidy makes it harder to assemble reads into longer sequences. We call this the problem of “Ambiguity of Merging” fragments. We addressed this problem by designing a new algorithm called Ranbow.
Ranbow was evaluated on real and simulated data from the genomes of tetraploid Capsella bursa-pastoris (Shepherd’s Purse) and hexaploid Ipomoea batatas (sweet potato). We were able to show that our method achieved high accuracy and long assembled haplotypes in a feasible amount of time, performing at a level consistently superior to other algorithms.
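
The “Ambiguity of Merging” problem named in the author summary can be illustrated with a toy example. The sketch below is not Ranbow's algorithm; it is a minimal illustration under assumptions of our own. Fragments are modelled as mappings from a variant site index to the allele observed there, and all names (`Fragment`, `compatible`, `conflict`, `merge`, `is_ambiguous`) are hypothetical. A merge is treated as ambiguous when a fragment overlaps consistently with two other fragments that disagree with each other, which can happen when several haplotypes are identical over the overlapping sites.

```python
from typing import Dict, List

# Hypothetical representation: variant site index -> allele observed at that site.
Fragment = Dict[int, int]


def compatible(a: Fragment, b: Fragment) -> bool:
    """True if the fragments share at least one site and agree on every shared site."""
    shared = a.keys() & b.keys()
    return bool(shared) and all(a[s] == b[s] for s in shared)


def conflict(a: Fragment, b: Fragment) -> bool:
    """True if the fragments disagree on at least one shared site."""
    return any(a[s] != b[s] for s in a.keys() & b.keys())


def merge(a: Fragment, b: Fragment) -> Fragment:
    """Union of two fragments; only meaningful when they are compatible."""
    return {**a, **b}


def is_ambiguous(seed: Fragment, others: List[Fragment]) -> bool:
    """True if the seed can be extended by two fragments that conflict with each other."""
    candidates = [f for f in others if compatible(seed, f)]
    return any(
        conflict(c1, c2)
        for i, c1 in enumerate(candidates)
        for c2 in candidates[i + 1:]
    )


seed = {0: 1, 1: 0}

# Both candidates agree with the seed at site 1 but differ at site 2,
# so the overlap alone cannot decide which one continues the seed's haplotype.
ambiguous_case = [{1: 0, 2: 1}, {1: 0, 2: 0}]

# Only the first candidate agrees with the seed at site 1, so the merge is unique.
unique_case = [{1: 0, 2: 1}, {1: 1, 2: 0}]

print(is_ambiguous(seed, ambiguous_case))
print(is_ambiguous(seed, unique_case))
print(merge(seed, unique_case[0]))
```

In the first case, a longer overlap or additional evidence would be needed to choose between the two extensions. The abstract does not describe how Ranbow resolves such cases, so the sketch stops at detecting them.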

Suggested Citation

  • M-Hossein Moeinzadeh & Jun Yang & Evgeny Muzychenko & Giuseppe Gallone & David Heller & Knut Reinert & Stefan Haas & Martin Vingron, 2020. "Ranbow: A fast and accurate method for polyploid haplotype reconstruction," PLOS Computational Biology, Public Library of Science, vol. 16(5), pages 1-23, May.
  • Handle: RePEc:plo:pcbi00:1007843
    DOI: 10.1371/journal.pcbi.1007843

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007843
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007843&type=printable
    Download Restriction: no

