Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-30930-3.html

Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing

Author

Listed:
  • Timofey Prodanov

    (University of California)

  • Vikas Bansal

    (University of California)

Abstract

The human genome contains hundreds of low-copy repeats (LCRs) that are challenging to analyze using short-read sequencing technologies due to extensive copy number variation and ambiguity in read mapping. Copy number and sequence variants in more than 150 duplicated genes that overlap LCRs have been implicated in monogenic and complex human diseases. We describe a computational tool, Parascopy, for estimating the aggregate and paralog-specific copy number of duplicated genes using whole-genome sequencing (WGS). Parascopy is an efficient method that jointly analyzes reads mapped to different repeat copies without the need for global realignment. It leverages multiple samples to mitigate sequencing bias and to identify reliable paralogous sequence variants (PSVs) that differentiate repeat copies. Analysis of WGS data for 2504 individuals from diverse populations showed that Parascopy is robust to sequencing bias, has higher accuracy compared to existing methods and enables prioritization of pathogenic copy number changes in duplicated genes.
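The two quantities named in the abstract can be illustrated with a toy calculation. The sketch below is not Parascopy's algorithm or interface; it is a deliberately simplified illustration that assumes a two-copy repeat, a known background read depth for a diploid single-copy region, and PSV allele fractions that are already reliable. All function names, parameters, and numbers are hypothetical.

```python
from statistics import median
from typing import Sequence, Tuple


def aggregate_copy_number(pooled_depth: float, background_depth: float,
                          ploidy: int = 2) -> int:
    """Toy aggregate copy number across all repeat copies.

    pooled_depth: read depth summed over every copy of the repeat, so that
        ambiguous read placement between copies does not matter.
    background_depth: depth of an ordinary region present in `ploidy` copies.
    """
    if background_depth <= 0:
        raise ValueError("background_depth must be positive")
    depth_per_copy = background_depth / ploidy  # depth contributed by one copy
    return round(pooled_depth / depth_per_copy)


def paralog_copy_numbers(agg_cn: int,
                         psv_copy1_fractions: Sequence[float]) -> Tuple[int, int]:
    """Toy split of an aggregate copy number between two repeat copies.

    psv_copy1_fractions: for each PSV, the fraction of pooled reads that
        carry the allele characteristic of copy 1 (assumed reliable here).
    """
    if not psv_copy1_fractions:
        raise ValueError("at least one PSV is required")
    copy1 = round(median(psv_copy1_fractions) * agg_cn)
    copy1 = max(0, min(agg_cn, copy1))  # keep the estimate within [0, agg_cn]
    return copy1, agg_cn - copy1


# Hypothetical sample: pooled depth 75x over the repeat, 30x background.
agg = aggregate_copy_number(pooled_depth=75.0, background_depth=30.0)
cn1, cn2 = paralog_copy_numbers(agg, [0.62, 0.58, 0.60])
print(agg, cn1, cn2)
```

Pooling depth over all copies sidesteps ambiguous read mapping for the aggregate estimate, while the PSV fractions apportion that total between copies. The published method additionally uses multiple samples to correct sequencing bias and to select reliable PSVs, which this sketch does not attempt.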

Suggested Citation

  • Timofey Prodanov & Vikas Bansal, 2022. "Robust and accurate estimation of paralog-specific copy number for duplicated genes using whole-genome sequencing," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-30930-3
    DOI: 10.1038/s41467-022-30930-3

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-30930-3
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marsha M. Wheeler & Adrienne M. Stilp & Shuquan Rao & Bjarni V. Halldórsson & Doruk Beyter & Jia Wen & Anna V. Mihkaylova & Caitlin P. McHugh & John Lane & Min-Zhi Jiang & Laura M. Raffield & Goo Jun , 2022. "Whole genome sequencing identifies structural variants contributing to hematologic traits in the NHLBI TOPMed program," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    2. Vincent Michaud & Eulalie Lasseaux & David J. Green & Dave T. Gerrard & Claudio Plaisant & Tomas Fitzgerald & Ewan Birney & Benoît Arveiler & Graeme C. Black & Panagiotis I. Sergouniotis, 2022. "The contribution of common regulatory and protein-coding TYR variants to the genetic architecture of albinism," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    3. Natalie DeForest & Yuqi Wang & Zhiyi Zhu & Jacqueline S. Dron & Ryan Koesterer & Pradeep Natarajan & Jason Flannick & Tiffany Amariuta & Gina M. Peloso & Amit R. Majithia, 2024. "Genome-wide discovery and integrative genomic characterization of insulin resistance loci using serum triglycerides to HDL-cholesterol ratio as a proxy," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    4. Dick Schijven & Sourena Soheili-Nezhad & Simon E. Fisher & Clyde Francks, 2024. "Exome-wide analysis implicates rare protein-altering variants in human handedness," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    5. Sean A. Misek & Aaron Fultineer & Jeremie Kalfon & Javad Noorbakhsh & Isabella Boyle & Priyanka Roy & Joshua Dempster & Lia Petronio & Katherine Huang & Alham Saadat & Thomas Green & Adam Brown & John, 2024. "Germline variation contributes to false negatives in CRISPR-based experiments with varying burden across ancestries," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    6. Mit Shah & Marco H. A. Inácio & Chang Lu & Pierre-Raphaël Schiratti & Sean L. Zheng & Adam Clement & Antonio Marvao & Wenjia Bai & Andrew P. King & James S. Ware & Martin R. Wilkins & Johanna Mielke &, 2023. "Environmental and genetic predictors of human cardiovascular ageing," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    7. Yash Pershad & Taralynn Mack & Hannah Poisner & Yasminka A. Jakubek & Adrienne M. Stilp & Braxton D. Mitchell & Joshua P. Lewis & Eric Boerwinkle & Ruth J. F. Loos & Nathalie Chami & Zhe Wang & Kathle, 2024. "Determinants of mosaic chromosomal alteration fitness," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    8. Elena V. Feofanova & Michael R. Brown & Taryn Alkis & Astrid M. Manuel & Xihao Li & Usman A. Tahir & Zilin Li & Kevin M. Mendez & Rachel S. Kelly & Qibin Qi & Han Chen & Martin G. Larson & Rozenn N. L, 2023. "Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    9. Yu Chen & Amy Y. Wang & Courtney A. Barkley & Yixin Zhang & Xinyang Zhao & Min Gao & Mick D. Edmonds & Zechen Chong, 2023. "Deciphering the exact breakpoints of structural variations using long sequencing reads with DeBreak," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    10. Naman S. Shetty & Mokshad Gaonkar & Nirav Patel & Akhil Pampana & Nehal Vekariya & Peng Li & Garima Arora & Pankaj Arora, 2024. "Determinants of transthyretin levels and their association with adverse clinical outcomes among UK Biobank participants," Nature Communications, Nature, vol. 15(1), pages 1-7, December.
    11. Parsa Akbari & Olukayode A. Sosina & Jonas Bovijn & Karl Landheer & Jonas B. Nielsen & Minhee Kim & Senem Aykul & Tanima De & Mary E. Haas & George Hindy & Nan Lin & Ian R. Dinsmore & Jonathan Z. Luo , 2022. "Multiancestry exome sequencing reveals INHBE mutations associated with favorable fat distribution and protection from diabetes," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    12. Injeong Shim & Hiroyuki Kuwahara & NingNing Chen & Mais O. Hashem & Lama AlAbdi & Mohamed Abouelhoda & Hong-Hee Won & Pradeep Natarajan & Patrick T. Ellinor & Amit V. Khera & Xin Gao & Fowzan S. Alkur, 2023. "Clinical utility of polygenic scores for cardiometabolic disease in Arabs," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    13. Mateus H. Gouveia & Amy R. Bentley & Thiago P. Leal & Eduardo Tarazona-Santos & Carlos D. Bustamante & Adebowale A. Adeyemo & Charles N. Rotimi & Daniel Shriner, 2023. "Unappreciated subcontinental admixture in Europeans and European Americans and implications for genetic epidemiology studies," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    14. Zihuai He & Linxi Liu & Michael E. Belloy & Yann Guen & Aaron Sossin & Xiaoxia Liu & Xinran Qi & Shiyang Ma & Prashnna K. Gyawali & Tony Wyss-Coray & Hua Tang & Chiara Sabatti & Emmanuel Candès & Mich, 2022. "GhostKnockoff inference empowers identification of putative causal variants in genome-wide association studies," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    15. Pol Solé-Navais & Julius Juodakis & Karin Ytterberg & Xiaoping Wu & Jonathan P. Bradfield & Marc Vaudel & Abigail L. LaBella & Øyvind Helgeland & Christopher Flatley & Frank Geller & Moshe Finel & Men, 2024. "Genome-wide analyses of neonatal jaundice reveal a marked departure from adult bilirubin metabolism," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    16. Yanjun Guo & Quanhong Liu & Zhilin Zheng & Mengxia Qing & Tianci Yao & Bin Wang & Min Zhou & Dongming Wang & Qinmei Ke & Jixuan Ma & Zhilei Shan & Weihong Chen, 2024. "Genetic association of inflammatory marker GlycA with lung function and respiratory diseases," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    17. Peter H. Dixon & Adam P. Levine & Inês Cebola & Melanie M. Y. Chan & Aliya S. Amin & Anshul Aich & Monika Mozere & Hannah Maude & Alice L. Mitchell & Jun Zhang & Jenny Chambers & Argyro Syngelaki & Je, 2022. "GWAS meta-analysis of intrahepatic cholestasis of pregnancy implicates multiple hepatic genes and regulatory elements," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    18. Marcin Kierczak & Nima Rafati & Julia Höglund & Hadrien Gourlé & Valeria Lo Faro & Daniel Schmitz & Weronica E. Ek & Ulf Gyllensten & Stefan Enroth & Diana Ekman & Björn Nystedt & Torgny Karlsson & Ås, 2022. "Contribution of rare whole-genome sequencing variants to plasma protein levels and the missing heritability," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    19. Ozvan Bocher & Cristen J. Willer & Eleftheria Zeggini, 2023. "Unravelling the genetic architecture of human complex traits through whole genome sequencing," Nature Communications, Nature, vol. 14(1), pages 1-4, December.
    20. Bárbara Sousa da Mota & Simone Rubinacci & Diana Ivette Cruz Dávalos & Carlos Eduardo G. Amorim & Martin Sikora & Niels N. Johannsen & Marzena H. Szmyt & Piotr Włodarczak & Anita Szczepanek & Marcin M, 2023. "Imputation of ancient human genomes," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
