
Robust Demographic Inference from Genomic and SNP Data

Author

Listed:
  • Laurent Excoffier
  • Isabelle Dupanloup
  • Emilia Huerta-Sánchez
  • Vitor C Sousa
  • Matthieu Foll

Abstract

We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.

Author Summary

We present a new likelihood-based method to infer the past demography of a set of populations from large genomic datasets. Our method can be applied to arbitrarily complex models as the likelihood is estimated by coalescent simulations. Under simple scenarios, our method behaves similarly to a widely used diffusion-based method while showing better convergence properties. In addition, our approach can be applied to very complex models including as many as a dozen populations, and still retrieve parameters very accurately in a reasonable time. We apply our approach to estimate the past demography of four human populations for which non-coding whole genome diversity is available, estimating the degree of European admixture of a southwest African American population and that of a Kenyan population with an unsampled East African population. We also show the versatility of our framework by inferring the demographic history of African populations from SNP chip data with known ascertainment bias, and find a very old divergence time (>110 Ky) between Yoruba from Western Africa and San from Southern Africa.
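
The idea of scoring a demographic model by a composite likelihood on the SFS can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a multinomial form for the composite likelihood, and it replaces the coalescent simulations described above with the closed-form neutral expectation for a single constant-size population, where SFS entry i is proportional to 1/i. The function names and the observed counts are hypothetical.

```python
import math


def expected_neutral_sfs(n):
    """Expected unfolded SFS proportions for n sampled chromosomes.

    Assumption of this sketch: a single constant-size neutral population,
    for which entry i (1 <= i <= n - 1) is proportional to 1 / i. In a
    simulation-based framework these proportions would instead be
    estimated from coalescent simulations under the model being tested.
    """
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def composite_log_likelihood(observed_counts, expected_probs):
    """Multinomial composite log-likelihood, up to an additive constant.

    SNPs are treated as independent, which is what makes this a
    composite likelihood rather than a full likelihood.
    """
    if len(observed_counts) != len(expected_probs):
        raise ValueError("observed and expected SFS must have equal length")
    log_lik = 0.0
    for count, prob in zip(observed_counts, expected_probs):
        if count == 0:
            continue  # empty entries contribute nothing
        if prob <= 0.0:
            return float("-inf")  # model cannot produce this entry
        log_lik += count * math.log(prob)
    return log_lik


# Hypothetical observed SFS for n = 5 chromosomes (entries i = 1..4).
observed = [50, 25, 17, 12]

neutral = expected_neutral_sfs(5)
uniform = [0.25, 0.25, 0.25, 0.25]

ll_neutral = composite_log_likelihood(observed, neutral)
ll_uniform = composite_log_likelihood(observed, uniform)
print(ll_neutral > ll_uniform)
```

In a parameter search, a function like `composite_log_likelihood` would be evaluated repeatedly as the demographic parameters change the expected SFS, and the parameter values giving the highest score would be retained.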

Suggested Citation

  • Laurent Excoffier & Isabelle Dupanloup & Emilia Huerta-Sánchez & Vitor C Sousa & Matthieu Foll, 2013. "Robust Demographic Inference from Genomic and SNP Data," PLOS Genetics, Public Library of Science, vol. 9(10), pages 1-17, October.
  • Handle: RePEc:plo:pgen00:1003905
    DOI: 10.1371/journal.pgen.1003905

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1003905
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1003905&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1003905?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Heng Li & Richard Durbin, 2011. "Inference of human population history from individual whole-genome sequences," Nature, Nature, vol. 475(7357), pages 493-496, July.
    2. Augustine Kong & Michael L. Frigge & Gisli Masson & Soren Besenbacher & Patrick Sulem & Gisli Magnusson & Sigurjon A. Gudjonsson & Asgeir Sigurdsson & Aslaug Jonasdottir & Adalbjorg Jonasdottir & Wend, 2012. "Rate of de novo mutations and the importance of father’s age to disease risk," Nature, Nature, vol. 488(7412), pages 471-475, August.
    3. Joseph K. Pickrell & Nick Patterson & Chiara Barbieri & Falko Berthold & Linda Gerlach & Tom Güldemann & Blesswell Kure & Sununguko Wata Mpoloka & Hirosi Nakagawa & Christfried Naumann & Mark Lipson &, 2012. "The genetic prehistory of southern Africa," Nature Communications, Nature, vol. 3(1), pages 1-6, January.
    4. Chen, Hua, 2012. "The joint allele frequency spectrum of multiple populations: A coalescent theory approach," Theoretical Population Biology, Elsevier, vol. 81(2), pages 179-195.
    5. Ryan N Gutenkunst & Ryan D Hernandez & Scott H Williamson & Carlos D Bustamante, 2009. "Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data," PLOS Genetics, Public Library of Science, vol. 5(10), pages 1-11, October.
    6. David Reich & Richard E. Green & Martin Kircher & Johannes Krause & Nick Patterson & Eric Y. Durand & Bence Viola & Adrian W. Briggs & Udo Stenzel & Philip L. F. Johnson & Tomislav Maricic & Jeffrey M, 2010. "Genetic history of an archaic hominin group from Denisova Cave in Siberia," Nature, Nature, vol. 468(7327), pages 1053-1060, December.
    7. Myers, Simon & Fefferman, Charles & Patterson, Nick, 2008. "Can one learn history from the allelic spectrum?," Theoretical Population Biology, Elsevier, vol. 73(3), pages 342-348.
    8. Rasmus Nielsen & Thorfinn Korneliussen & Anders Albrechtsen & Yingrui Li & Jun Wang, 2012. "SNP Calling, Genotype Calling, and Sample Allele Frequency Estimation from New-Generation Sequencing Data," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-10, July.
    9. Paul A Jenkins & Yun S Song & Rachel B Brem, 2012. "Genealogy-Based Methods for Inference of Historical Recombination and Gene Flow and Their Application in Saccharomyces cerevisiae," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-13, November.
    10. Mark A. Beaumont & Jean-Marie Cornuet & Jean-Michel Marin & Christian P. Robert, 2009. "Adaptive approximate Bayesian computation," Biometrika, Biometrika Trust, vol. 96(4), pages 983-990.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kelley Harris & Rasmus Nielsen, 2013. "Inferring Demographic History from a Spectrum of Shared Haplotype Lengths," PLOS Genetics, Public Library of Science, vol. 9(6), pages 1-20, June.
    2. Steinrücken, Matthias & Paul, Joshua S. & Song, Yun S., 2013. "A sequentially Markov conditional sampling distribution for structured populations with migration and recombination," Theoretical Population Biology, Elsevier, vol. 87(C), pages 51-61.
    3. Melissa J Hubisz & Amy L Williams & Adam Siepel, 2020. "Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph," PLOS Genetics, Public Library of Science, vol. 16(8), pages 1-24, August.
    4. Legried, Brandon & Terhorst, Jonathan, 2022. "Rates of convergence in the two-island and isolation-with-migration models," Theoretical Population Biology, Elsevier, vol. 147(C), pages 16-27.
    5. Jörn Bethune & April Kleppe & Søren Besenbacher, 2022. "A method to build extended sequence context models of point mutations and indels," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    6. Johndrow, James E. & Palacios, Julia A., 2019. "Exact limits of inference in coalescent models," Theoretical Population Biology, Elsevier, vol. 125(C), pages 75-93.
    7. Krystyna Nadachowska-Brzyska & Reto Burri & Pall I Olason & Takeshi Kawakami & Linnéa Smeds & Hans Ellegren, 2013. "Demographic Divergence History of Pied Flycatcher and Collared Flycatcher Inferred from Whole-Genome Re-sequencing Data," PLOS Genetics, Public Library of Science, vol. 9(11), pages 1-14, November.
    8. Kim, Junhyong & Mossel, Elchanan & Rácz, Miklós Z. & Ross, Nathan, 2015. "Can one hear the shape of a population history?," Theoretical Population Biology, Elsevier, vol. 100(C), pages 26-38.
    9. Francois Olivier & Laval Guillaume, 2011. "Deviance Information Criteria for Model Selection in Approximate Bayesian Computation," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-25, July.
    10. Ryan N Gutenkunst & Ryan D Hernandez & Scott H Williamson & Carlos D Bustamante, 2009. "Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data," PLOS Genetics, Public Library of Science, vol. 5(10), pages 1-11, October.
    11. Gideon S Bradburd & Peter L Ralph & Graham M Coop, 2016. "A Spatial Framework for Understanding Population Structure and Admixture," PLOS Genetics, Public Library of Science, vol. 12(1), pages 1-38, January.
    12. Marina Muzzio & Josefina M B Motti & Paula B Paz Sepulveda & Muh-ching Yee & Thomas Cooke & María R Santos & Virginia Ramallo & Emma L Alfaro & Jose E Dipierri & Graciela Bailliet & Claudio M Bravi & , 2018. "Population structure in Argentina," PLOS ONE, Public Library of Science, vol. 13(5), pages 1-13, May.
    13. Baharian, Soheil & Gravel, Simon, 2018. "On the decidability of population size histories from finite allele frequency spectra," Theoretical Population Biology, Elsevier, vol. 120(C), pages 42-51.
    14. Xing Ju Lee & Christopher C. Drovandi & Anthony N. Pettitt, 2015. "Model choice problems using approximate Bayesian computation with applications to pathogen transmission data sets," Biometrics, The International Biometric Society, vol. 71(1), pages 198-207, March.
    15. McKinley, Trevelyan J. & Ross, Joshua V. & Deardon, Rob & Cook, Alex R., 2014. "Simulation-based Bayesian inference for epidemic models," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 434-447.
    16. Aryal, Nanda R. & Jones, Owen D., 2020. "Fitting the Bartlett–Lewis rainfall model using Approximate Bayesian Computation," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 175(C), pages 153-163.
    17. Juraj Bergman & Rasmus Ø. Pedersen & Erick J. Lundgren & Rhys T. Lemoine & Sophie Monsarrat & Elena A. Pearce & Mikkel H. Schierup & Jens-Christian Svenning, 2023. "Worldwide Late Pleistocene and Early Holocene population declines in extant megafauna are associated with Homo sapiens expansion rather than climate change," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    18. Fiona A. Hagenbeek & Jana S. Hirzinger & Sophie Breunig & Susanne Bruins & Dmitry V. Kuznetsov & Kirsten Schut & Veronika V. Odintsova & Dorret I. Boomsma, 2023. "Maximizing the value of twin studies in health and behaviour," Nature Human Behaviour, Nature, vol. 7(6), pages 849-860, June.
    19. Mikula, Lynette Caitlin & Vogl, Claus, 2024. "The expected sample allele frequencies from populations of changing size via orthogonal polynomials," Theoretical Population Biology, Elsevier, vol. 157(C), pages 55-85.
    20. Costa, Rui J. & Wilkinson-Herbots, Hilde M., 2021. "Inference of gene flow in the process of speciation: Efficient maximum-likelihood implementation of a generalised isolation-with-migration model," Theoretical Population Biology, Elsevier, vol. 140(C), pages 1-15.
