Authors
- Zev N. Kronenberg
(Phase Genomics; Pacific Biosciences)
- Arang Rhie
(National Human Genome Research Institute)
- Sergey Koren
(National Human Genome Research Institute)
- Gregory T. Concepcion
(Pacific Biosciences)
- Paul Peluso
(Pacific Biosciences)
- Katherine M. Munson
(University of Washington School of Medicine)
- David Porubsky
(University of Washington School of Medicine)
- Kristen Kuhn
(Clay Center)
- Kathryn A. Mueller
(Phase Genomics)
- Wai Yee Low
(The University of Adelaide)
- Stefan Hiendleder
(The University of Adelaide)
- Olivier Fedrigo
(The Rockefeller University)
- Ivan Liachko
(Phase Genomics)
- Richard J. Hall
(Pacific Biosciences)
- Adam M. Phillippy
(National Human Genome Research Institute)
- Evan E. Eichler
(University of Washington School of Medicine; University of Washington)
- John L. Williams
(The University of Adelaide; Università Cattolica del Sacro Cuore)
- Timothy P. L. Smith
(Clay Center)
- Erich D. Jarvis
(The Rockefeller University; Howard Hughes Medical Institute)
- Shawn T. Sullivan
(Phase Genomics)
- Sarah B. Kingan
(Pacific Biosciences)
Abstract
Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without parental data, and performance is better in samples with higher heterozygosity. For cow and zebra finch, the accuracy is 97%, compared to 80–91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.
Suggested Citation
Zev N. Kronenberg & Arang Rhie & Sergey Koren & Gregory T. Concepcion & Paul Peluso & Katherine M. Munson & David Porubsky & Kristen Kuhn & Kathryn A. Mueller & Wai Yee Low & Stefan Hiendleder & Olivi, 2021.
"Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C,"
Nature Communications, Nature, vol. 12(1), pages 1-10, December.
Handle:
RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-020-20536-y
DOI: 10.1038/s41467-020-20536-y