A global test of hybrid ancestry from genome-scale data

Authors
  • Haque Md Rejuan

    (Division of Biostatistics, College of Public Health, and Department of Statistics, The Ohio State University, Columbus, OH 43210, USA)

  • Kubatko Laura

    (Department of Statistics and Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA)

Abstract

Methods based on the multi-species coalescent have been widely used in phylogenetic tree estimation using genome-scale DNA sequence data to understand the underlying evolutionary relationship between the sampled species. Evolutionary processes such as hybridization, which creates new species through interbreeding between two different species, necessitate inferring a species network instead of a species tree. A species tree is strictly bifurcating and thus fails to incorporate hybridization events which require an internal node of degree three. Hence, it is crucial to decide whether a tree or network analysis should be performed given a DNA sequence data set, a decision that is based on the presence of hybrid species in the sampled species. Although many methods have been proposed for hybridization detection, it is rare to find a technique that does so globally while considering a data generation mechanism that allows both hybridization and incomplete lineage sorting. In this paper, we consider hybridization and coalescence in a unified framework and propose a new test that can detect whether there are any hybrid species in a set of species of arbitrary size. Based on this global test of hybridization, one can decide whether a tree or network analysis is appropriate for a given data set.

Suggested Citation

  • Haque Md Rejuan & Kubatko Laura, 2024. "A global test of hybrid ancestry from genome-scale data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 23(1), pages 1-18, January.
  • Handle: RePEc:bpj:sagmbi:v:23:y:2024:i:1:p:18:n:1
    DOI: 10.1515/sagmb-2022-0061