Authors
Listed:
- Emily Berger
(Massachusetts Institute of Technology
UC Berkeley)
- Deniz Yorukoglu
(Massachusetts Institute of Technology)
- Lillian Zhang
(Massachusetts Institute of Technology)
- Sarah K. Nyquist
(Massachusetts Institute of Technology)
- Alex K. Shalek
(Massachusetts Institute of Technology)
- Manolis Kellis
(Massachusetts Institute of Technology)
- Ibrahim Numanagić
(Massachusetts Institute of Technology
University of Victoria)
- Bonnie Berger
(Massachusetts Institute of Technology)
Abstract
Haplotype reconstruction of distant genetic variants remains an unsolved problem due to the short read length of common sequencing data. Here, we introduce HapTree-X, a probabilistic framework that utilizes latent long-range information to reconstruct unspecified haplotypes in diploid and polyploid organisms. It builds on the observation that differential allele-specific expression can link genetic variants from the same physical chromosome, which makes it possible to use even reads that cover only a single variant. We demonstrate HapTree-X’s feasibility on in-house sequenced Genome in a Bottle RNA-seq and various whole exome, genome, and 10X Genomics datasets. HapTree-X produces more complete phases (up to 25%), even in clinically important genes, and phases more variants than other methods while maintaining similar or higher accuracy and running up to 10× faster than other tools. The advantage of HapTree-X’s ability to use multiple lines of evidence, and to phase polyploid genomes in a single integrative framework, grows substantially as the amount of diverse data increases.
Suggested Citation
Emily Berger & Deniz Yorukoglu & Lillian Zhang & Sarah K. Nyquist & Alex K. Shalek & Manolis Kellis & Ibrahim Numanagić & Bonnie Berger, 2020.
"Improved haplotype inference by exploiting long-range linking and allelic imbalance in RNA-seq datasets,"
Nature Communications, Nature, vol. 11(1), pages 1-9, December.
Handle:
RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-18320-z
DOI: 10.1038/s41467-020-18320-z