IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1009960.html

Bayesian inference of ancestral recombination graphs

Author

Listed:
  • Ali Mahmoudi
  • Jere Koskela
  • Jerome Kelleher
  • Yao-ban Chan
  • David Balding

Abstract

We present a novel algorithm, implemented in the software ARGinfer, for probabilistic inference of the Ancestral Recombination Graph under the Coalescent with Recombination. Our Markov Chain Monte Carlo algorithm takes advantage of the Succinct Tree Sequence data structure that has allowed great advances in simulation and point estimation, but not yet probabilistic inference. Unlike previous methods, which employ the Sequentially Markov Coalescent approximation, ARGinfer uses the Coalescent with Recombination, allowing more accurate inference of key evolutionary parameters. We show using simulations that ARGinfer can accurately estimate many properties of the evolutionary history of the sample, including the topology and branch lengths of the genealogical tree at each sequence site, and the times and locations of mutation and recombination events. ARGinfer approximates posterior probability distributions for these and other quantities, providing interpretable assessments of uncertainty that we show to be well calibrated. ARGinfer is currently limited to tens of DNA sequences of several hundreds of kilobases, but has scope for further computational improvements to increase its applicability.

Author summary: One of the important challenges in population genetics is to reconstruct the historical mutation, recombination, and shared ancestor events that underlie a sample of DNA sequences drawn from a population. Aspects of this history can inform us about evolutionary processes, ages of mutations and times of common ancestors, and historical population sizes and migration rates. Performing such inferences is difficult, and progress has been slow over the past two decades. Recently, a new and more efficient way to store sequence data has led to improved simulations and also a fast way to reconstruct some aspects of the history. We augment the new data structure to infer many more details of the history, including the times of events. We also provide approximations of the full probability distributions for all the unknowns, not just plausible values. Because this task is highly challenging, we are limited to relatively small data sets, but we show that our inference algorithm represents an important step forward over those currently available in terms of the accuracy of its inferences.
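
The idea of storing a genealogy as edges that each apply to a genomic interval, which underlies the succinct tree sequence mentioned in the abstract, can be illustrated with a toy sketch. This is neither ARGinfer's code nor the API of any tree sequence library. The names `Edge`, `local_tree`, `mrca`, and `EDGES`, and all the values in the example, are assumptions made for illustration only.

```python
from typing import Dict, List, NamedTuple


class Edge(NamedTuple):
    """One parent-child relationship that holds over [left, right)."""
    left: float   # inclusive start of the genomic interval
    right: float  # exclusive end of the genomic interval
    parent: int   # ancestral node ID
    child: int    # descendant node ID


def local_tree(edges: List[Edge], position: float) -> Dict[int, int]:
    """Return the child -> parent map of the tree covering `position`."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def mrca(tree: Dict[int, int], a: int, b: int) -> int:
    """Most recent common ancestor of nodes a and b in one local tree.

    Raises KeyError if the two nodes share no ancestor in `tree`.
    """
    ancestors = {a}
    while a in tree:          # walk from a up to the root
        a = tree[a]
        ancestors.add(a)
    while b not in ancestors:  # walk from b until the paths meet
        b = tree[b]
    return b


# Hypothetical toy data: samples 0, 1, 2 on a sequence of length 10,
# with a single recombination breakpoint at position 5.
EDGES = [
    Edge(0, 10, 3, 1),  # sample 1 descends from node 3 everywhere
    Edge(0, 5, 3, 0),   # left of the breakpoint, 0 joins 1 at node 3
    Edge(5, 10, 3, 2),  # right of the breakpoint, 2 joins 1 at node 3
    Edge(0, 5, 4, 2),   # left tree: root is node 4
    Edge(0, 5, 4, 3),
    Edge(5, 10, 5, 0),  # right tree: root is node 5
    Edge(5, 10, 5, 3),
]

if __name__ == "__main__":
    for pos in (2.0, 7.0):
        tree = local_tree(EDGES, pos)
        print(pos, tree, "MRCA(0, 1) =", mrca(tree, 0, 1))
```

In this toy example, the edge for sample 1 is stored once although it appears in both local trees. Sharing edges between adjacent trees in this way is what makes the representation compact.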

Suggested Citation

  • Ali Mahmoudi & Jere Koskela & Jerome Kelleher & Yao-ban Chan & David Balding, 2022. "Bayesian inference of ancestral recombination graphs," PLOS Computational Biology, Public Library of Science, vol. 18(3), pages 1-15, March.
  • Handle: RePEc:plo:pcbi00:1009960
    DOI: 10.1371/journal.pcbi.1009960

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009960
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1009960&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1009960?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Jerome Kelleher & Alison M Etheridge & Gilean McVean, 2016. "Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes," PLOS Computational Biology, Public Library of Science, vol. 12(5), pages 1-22, May.
    2. Matthew D Rasmussen & Melissa J Hubisz & Ilan Gronau & Adam Siepel, 2014. "Genome-Wide Inference of Ancestral Recombination Graphs," PLOS Genetics, Public Library of Science, vol. 10(5), pages 1-27, May.
    3. Jerome Kelleher & Kevin R Thornton & Jaime Ashander & Peter L Ralph, 2018. "Efficient pedigree recording for fast population genetics simulation," PLOS Computational Biology, Public Library of Science, vol. 14(11), pages 1-21, November.
    4. Ying-Xin Zhang & Kim Perry & Victor A. Vinci & Keith Powell & Willem P. C. Stemmer & Stephen B. del Cardayré, 2002. "Genome shuffling leads to rapid phenotypic improvement in bacteria," Nature, Nature, vol. 415(6872), pages 644-646, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ralph, Peter L., 2019. "An empirical approach to demographic inference with genomic data," Theoretical Population Biology, Elsevier, vol. 127(C), pages 91-101.
    2. Deng, Yun & Song, Yun S. & Nielsen, Rasmus, 2021. "The distribution of waiting distances in ancestral recombination graphs," Theoretical Population Biology, Elsevier, vol. 141(C), pages 34-43.
    3. Nicola F. Müller & Kathryn E. Kistler & Trevor Bedford, 2022. "A Bayesian approach to infer recombination patterns in coronaviruses," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    4. Sergio F. Nigenda-Morales & Meixi Lin & Paulina G. Nuñez-Valencia & Christopher C. Kyriazis & Annabel C. Beichman & Jacqueline A. Robinson & Aaron P. Ragsdale & Jorge Urbán R. & Frederick I. Archer & , 2023. "The genomic footprint of whaling and isolation in fin whale populations," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    5. Zihao Wang & Wenxi Wang & Xiaoming Xie & Yongfa Wang & Zhengzhao Yang & Huiru Peng & Mingming Xin & Yingyin Yao & Zhaorong Hu & Jie Liu & Zhenqi Su & Chaojie Xie & Baoyun Li & Zhongfu Ni & Qixin Sun &, 2022. "Dispersed emergence and protracted domestication of polyploid wheat uncovered by mosaic ancestral haploblock inference," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    6. Vasili Pankratov & Milyausha Yunusbaeva & Sergei Ryakhovsky & Maksym Zarodniuk & Bayazit Yunusbayev, 2022. "Prioritizing autoimmunity risk variants for functional analyses by fine-mapping mutations under natural selection," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    7. Michael DeGiorgio & Zachary A Szpiech, 2022. "A spatially aware likelihood test to detect sweeps from haplotype distributions," PLOS Genetics, Public Library of Science, vol. 18(4), pages 1-37, April.
    8. Kerdoncuff, Elise & Lambert, Amaury & Achaz, Guillaume, 2020. "Testing for population decline using maximal linkage disequilibrium blocks," Theoretical Population Biology, Elsevier, vol. 134(C), pages 171-181.
    9. Bing Guo & Victor Borda & Roland Laboulaye & Michele D. Spring & Mariusz Wojnarski & Brian A. Vesely & Joana C. Silva & Norman C. Waters & Timothy D. O’Connor & Shannon Takala-Harrison, 2024. "Strong positive selection biases identity-by-descent-based inferences of recent demography and population structure in Plasmodium falciparum," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    10. Parul Johri & Wolfgang Stephan & Jeffrey D Jensen, 2022. "Soft selective sweeps: Addressing new definitions, evaluating competing models, and interpreting empirical outliers," PLOS Genetics, Public Library of Science, vol. 18(2), pages 1-12, February.
    11. Wu, Bo & Wang, Yan-Wei & Dai, Yong-Hua & Song, Chao & Zhu, Qi-Li & Qin, Han & Tan, Fu-Rong & Chen, Han-Cheng & Dai, Li-Chun & Hu, Guo-Quan & He, Ming-Xiong, 2021. "Current status and future prospective of bio-ethanol industry in China," Renewable and Sustainable Energy Reviews, Elsevier, vol. 145(C).
    12. Simone Rubinacci & Olivier Delaneau & Jonathan Marchini, 2020. "Genotype imputation using the Positional Burrows Wheeler Transform," PLOS Genetics, Public Library of Science, vol. 16(11), pages 1-19, November.
    13. Andrea Fulgione & Célia Neto & Ahmed F. Elfarargi & Emmanuel Tergemina & Shifa Ansari & Mehmet Göktay & Herculano Dinis & Nina Döring & Pádraic J. Flood & Sofia Rodriguez-Pacheco & Nora Walden & Marcu, 2022. "Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    14. Hayman, Elizabeth & Ignatieva, Anastasia & Hein, Jotun, 2023. "Recoverability of ancestral recombination graph topologies," Theoretical Population Biology, Elsevier, vol. 154(C), pages 27-39.
    15. Miró Pina, Verónica & Joly, Émilien & Siri-Jégousse, Arno, 2023. "Estimating the Lambda measure in multiple-merger coalescents," Theoretical Population Biology, Elsevier, vol. 154(C), pages 94-101.
    16. Sam Tallman & Maria das Dores Sungo & Sílvio Saranga & Sandra Beleza, 2023. "Whole genomes from Angola and Mozambique inform about the origins and dispersals of major African migrations," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    17. Victoria L. Sork & Shawn J. Cokus & Sorel T. Fitz-Gibbon & Aleksey V. Zimin & Daniela Puiu & Jesse A. Garcia & Paul F. Gugger & Claudia L. Henriquez & Ying Zhen & Kirk E. Lohmueller & Matteo Pellegrin, 2022. "High-quality genome and methylomes illustrate features underlying evolutionary success of oaks," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    18. Tan, Taison & Frenkel, Daan & Gupta, Vishal & Deem, Michael W., 2005. "Length, protein–protein interactions, and complexity," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 350(1), pages 52-62.
    19. Max Lundberg & Alexander Mackintosh & Anna Petri & Staffan Bensch, 2023. "Inversions maintain differences between migratory phenotypes of a songbird," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    20. Philippe Gambette & Leo van Iersel & Mark Jones & Manuel Lafond & Fabio Pardi & Celine Scornavacca, 2017. "Rearrangement moves on rooted phylogenetic networks," PLOS Computational Biology, Public Library of Science, vol. 13(8), pages 1-21, August.
