
Inferring Human Colonization History Using a Copying Model

Author

Listed:
  • Garrett Hellenthal
  • Adam Auton
  • Daniel Falush

Abstract

Genome-wide scans of genetic variation can potentially provide detailed information on how modern humans colonized the world but require new methods of analysis. We introduce a statistical approach that uses Single Nucleotide Polymorphism (SNP) data to identify sharing of chromosomal segments between populations and uses the pattern of sharing to reconstruct a detailed colonization scenario. We apply our model to the SNP data for the 53 populations of the Human Genome Diversity Project described in Conrad et al. (Nature Genetics 38, 1251-60, 2006). Our results are consistent with the consensus view of a single “Out-of-Africa” bottleneck and serial dilution of diversity during global colonization, including a prominent East Asian bottleneck. They also suggest novel details including: (1) the most northerly East Asian population in the sample (Yakut) has received a significant genetic contribution from the ancestors of the most northerly European one (Orcadian). (2) Native South Americans have received ancestry from a source closely related to modern North-East Asians (Mongolians and Oroquen) that is distinct from the sources for native North Americans, implying multiple waves of migration into the Americas. A detailed depiction of the peopling of the world is available in animated form.

Author Summary: Humans like to tell stories. Amongst the most captivating is the story of the global spread of modern humans from their original homeland in Africa. Traditionally this has been the preserve of anthropologists, but geneticists are starting to make an important contribution. However, genetic evidence is typically analyzed in the context of anthropological preconceptions. For genetics to provide an accurate and detailed history without reference to anthropology, methods are required that translate DNA sequence data into histories. We introduce a statistical method that has three virtues. First, it is based on a copying model that incorporates the block-by-block inheritance of DNA from one generation to the next. This allows it to capture the rich information provided by patterns of DNA sharing across the whole genome. Second, its parameter space includes an enormous number of possible colonization scenarios, meaning that inferences are correspondingly rich in detail. Third, the inferred colonization scenario is determined algorithmically. We have applied this method to data from 53 human populations and find that while the current consensus is broadly supported, some populations have surprising histories. This scenario can be viewed as a movie, making it transparent where statistical analysis ends and where interpretation begins.
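
The idea of a copying model with block-by-block inheritance can be illustrated with a minimal sketch of a generic haplotype-copying hidden Markov model. This is not the authors' implementation. The function name `copying_log_likelihood`, the parameters `switch_prob` and `mismatch_prob`, and all numeric values are assumptions chosen for illustration. The hidden state at each SNP is the donor haplotype currently being copied; the copied donor persists along the chromosome until a switch occurs, which is what produces shared blocks.

```python
import math


def copying_log_likelihood(target, donors, switch_prob=0.1, mismatch_prob=0.01):
    """Log-likelihood of `target` as an imperfect mosaic of `donors`.

    Illustrative assumptions (not taken from the paper):
      - alleles are coded 0/1 and all haplotypes have equal length
      - the first copied donor is chosen uniformly at random
      - between adjacent SNPs the copied donor switches with probability
        `switch_prob`, to a uniformly chosen donor
      - the target allele differs from the copied allele with probability
        `mismatch_prob`
    """
    n_donors = len(donors)
    n_sites = len(target)

    def emit(k, s):
        # Probability of the observed target allele given donor k at site s.
        return 1.0 - mismatch_prob if donors[k][s] == target[s] else mismatch_prob

    # Forward variables at the first site: uniform prior times emission.
    forward = [emit(k, 0) / n_donors for k in range(n_donors)]
    log_lik = 0.0
    for s in range(n_sites):
        if s > 0:
            total = sum(forward)  # equals 1 because of the rescaling below
            forward = [
                ((1.0 - switch_prob) * forward[k] + switch_prob * total / n_donors)
                * emit(k, s)
                for k in range(n_donors)
            ]
        # Rescale at every site to avoid numerical underflow on long haplotypes.
        scale = sum(forward)
        log_lik += math.log(scale)
        forward = [f / scale for f in forward]
    return log_lik


# Toy data: the target matches the first haplotype of panel_a exactly.
target = [0, 1, 1, 0, 1]
panel_a = [[0, 1, 1, 0, 1], [1, 0, 0, 1, 0]]
panel_b = [[1, 0, 0, 1, 0], [1, 1, 0, 1, 0]]
ll_a = copying_log_likelihood(target, panel_a)
ll_b = copying_log_likelihood(target, panel_b)
```

In this toy setting a panel that contains long matching segments (`panel_a`) gives a higher likelihood than one that does not (`panel_b`), which is the kind of signal a copying model uses to compare candidate source populations.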

Suggested Citation

  • Garrett Hellenthal & Adam Auton & Daniel Falush, 2008. "Inferring Human Colonization History Using a Copying Model," PLOS Genetics, Public Library of Science, vol. 4(5), pages 1-11, May.
  • Handle: RePEc:plo:pgen00:1000078
    DOI: 10.1371/journal.pgen.1000078

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1000078
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1000078&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Noah A Rosenberg & Saurabh Mahajan & Sohini Ramachandran & Chengfeng Zhao & Jonathan K Pritchard & Marcus W Feldman, 2005. "Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure," PLOS Genetics, Public Library of Science, vol. 1(6), pages 1-12, December.
    2. Matthew Stephens & Peter Donnelly, 2000. "Inference in molecular population genetics," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 62(4), pages 605-635.
    3. Mattias Jakobsson & Sonja W. Scholz & Paul Scheet & J. Raphael Gibbs & Jenna M. VanLiere & Hon-Chung Fung & Zachary A. Szpiech & James H. Degnan & Kai Wang & Rita Guerreiro & Jose M. Bras & Jennifer C, 2008. "Genotype, haplotype and copy-number variation in worldwide human populations," Nature, Nature, vol. 451(7181), pages 998-1003, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Eric R Londin & Margaret A Keller & Cathleen Maista & Gretchen Smith & Laura A Mamounas & Ran Zhang & Steven J Madore & Katrina Gwinn & Roderick A Corriveau, 2010. "CoAIMs: A Cost-Effective Panel of Ancestry Informative Markers for Determining Continental Origins," PLOS ONE, Public Library of Science, vol. 5(10), pages 1-12, October.
    2. Ricardo Kanitz & Elsa G Guillot & Sylvain Antoniazza & Samuel Neuenschwander & Jérôme Goudet, 2018. "Complex genetic patterns in human arise from a simple range-expansion model over continental landmasses," PLOS ONE, Public Library of Science, vol. 13(2), pages 1-16, February.
    3. Ryan N Gutenkunst & Ryan D Hernandez & Scott H Williamson & Carlos D Bustamante, 2009. "Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data," PLOS Genetics, Public Library of Science, vol. 5(10), pages 1-11, October.
    4. Mikula, Lynette Caitlin & Vogl, Claus, 2024. "The expected sample allele frequencies from populations of changing size via orthogonal polynomials," Theoretical Population Biology, Elsevier, vol. 157(C), pages 55-85.
    5. Griffiths, Robert C. & Tavaré, Simon, 2018. "Ancestral inference from haplotypes and mutations," Theoretical Population Biology, Elsevier, vol. 122(C), pages 12-21.
    6. Steinrücken, Matthias & Paul, Joshua S. & Song, Yun S., 2013. "A sequentially Markov conditional sampling distribution for structured populations with migration and recombination," Theoretical Population Biology, Elsevier, vol. 87(C), pages 51-61.
    7. Liu Xiran & Ahsan Zarif & Martheswaran Tarun K. & Rosenberg Noah A., 2023. "When is the allele-sharing dissimilarity between two populations exceeded by the allele-sharing dissimilarity of a population with itself?," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 22(1), pages 1-24, January.
    8. Nick Patterson & Alkes L Price & David Reich, 2006. "Population Structure and Eigenanalysis," PLOS Genetics, Public Library of Science, vol. 2(12), pages 1-20, December.
    9. Xuchao Li & Shengpei Chen & Weiwei Xie & Ida Vogel & Kwong Wai Choy & Fang Chen & Rikke Christensen & Chunlei Zhang & Huijuan Ge & Haojun Jiang & Chang Yu & Fang Huang & Wei Wang & Hui Jiang & Xiuqing, 2014. "PSCC: Sensitive and Reliable Population-Scale Copy Number Variation Detection Method Based on Low Coverage Sequencing," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-9, January.
    10. Birkner, Matthias & Blath, Jochen & Steinrücken, Matthias, 2011. "Importance sampling for Lambda-coalescents in the infinitely many sites model," Theoretical Population Biology, Elsevier, vol. 79(4), pages 155-173.
    11. Larribe Fabrice & Lessard Sabin, 2008. "A Composite-Conditional-Likelihood Approach for Gene Mapping Based on Linkage Disequilibrium in Windows of Marker Loci," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-33, August.
    12. Vaughan, Laura K. & Divers, Jasmin & Padilla, Miguel A. & Redden, David T. & Tiwari, Hemant K. & Pomp, Daniel & Allison, David B., 2009. "The use of plasmodes as a supplement to simulations: A simple example evaluating individual admixture estimation methodologies," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1755-1766, March.
    13. Blath, Jochen & Buzzoni, Eugenio & Koskela, Jere & Wilke Berenguer, Maite, 2020. "Statistical tools for seed bank detection," Theoretical Population Biology, Elsevier, vol. 132(C), pages 1-15.
    14. Peristera Paschou & Petros Drineas & Jamey Lewis & Caroline M Nievergelt & Deborah A Nickerson & Joshua D Smith & Paul M Ridker & Daniel I Chasman & Ronald M Krauss & Elad Ziv, 2008. "Tracing Sub-Structure in the European American Population with PCA-Informative Markers," PLOS Genetics, Public Library of Science, vol. 4(7), pages 1-13, July.
    15. Uyenoyama, Marcy K. & Takebayashi, Naoki & Kumagai, Seiji, 2020. "Allele frequency spectra in structured populations: Novel-allele probabilities under the labelled coalescent," Theoretical Population Biology, Elsevier, vol. 133(C), pages 130-140.
    16. Spilimbergo, Antonio & Giuliano, Paola & Tonon, Giovanni, 2006. "Genetic, Cultural and Geographical Distances," CEPR Discussion Papers 5807, C.E.P.R. Discussion Papers.
    17. Chaolong Wang & Sebastian Zöllner & Noah A Rosenberg, 2012. "A Quantitative Comparison of the Similarity between Genes and Geography in Worldwide Human Populations," PLOS Genetics, Public Library of Science, vol. 8(8), pages 1-16, August.
    18. Wang Chaolong & Szpiech Zachary A & Degnan James H & Jakobsson Mattias & Pemberton Trevor J & Hardy John A & Singleton Andrew B & Rosenberg Noah A, 2010. "Comparing Spatial Maps of Human Population-Genetic Variation Using Procrustes Analysis," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-22, January.
    19. Ramachandran, Sohini & Rosenberg, Noah A. & Feldman, Marcus W. & Wakeley, John, 2008. "Population differentiation and migration: Coalescence times in a two-sex island model for autosomal and X-linked loci," Theoretical Population Biology, Elsevier, vol. 74(4), pages 291-301.
    20. Hössjer Ola & Hartman Linda & Humphreys Keith, 2009. "Ancestral Recombination Graphs under Non-Random Ascertainment, with Applications to Gene Mapping," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-46, September.
