
The site frequency spectrum of dispensable genes

Author

Listed:
  • Baumdicker, Franz

Abstract

Differences between DNA sequences within a population are the basis for inferring the ancestral relationships of the individuals. Within the classical infinitely many sites model, the mutation rate can be estimated from the site frequency spectrum, which consists of the numbers $C_1, \dots, C_{n-1}$, where $n$ is the sample size and $C_s$ is the number of site mutations (single nucleotide polymorphisms, SNPs) that are seen in $s$ genomes. Classical results can be used to compare the observed site frequency spectrum with its neutral expectation, $E[C_s] = \theta_2/s$, where $\theta_2$ is the scaled site mutation rate. In this paper, we relax the assumption of the infinitely many sites model that all individuals carry only homologous genetic material. In particular, it is now well known that bacterial genomes can gain and lose genes, so that every single genome is a mosaic of genes, and genes are present or absent in a random fashion, giving rise to the dispensable genome. While this presence and absence has been modeled under neutral evolution within the infinitely many genes model in Baumdicker et al. (2010), we link the presence and absence of genes with the number of site mutations seen within each gene. We derive a formula for the expectation of the joint gene and site frequency spectrum, denoted by $G_{k,s}$, the number of mutated sites occurring in exactly $s$ gene sequences while the corresponding gene is present in exactly $k$ individuals. We show that standard estimators of $\theta_2$ for dispensable genes are biased and that the site frequency spectrum for dispensable genes differs from the classical result.
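
As a rough illustration of the classical setting only, the neutral expectation $E[C_s] = \theta_2/s$ can be tabulated and inverted with Watterson's estimator, a standard estimator that the abstract does not name explicitly. The function names below are hypothetical, and the sketch does not reproduce the paper's formula for the joint spectrum $G_{k,s}$; according to the abstract, an estimator of this kind is biased when applied to dispensable genes.

```python
def expected_sfs(n, theta):
    """Neutral expectation E[C_s] = theta / s for s = 1, ..., n-1.

    n is the sample size, theta the scaled site mutation rate.
    """
    return [theta / s for s in range(1, n)]


def watterson_theta(sfs):
    """Watterson's estimator: number of segregating sites divided by
    the harmonic number H_{n-1}, where n = len(sfs) + 1.

    Assumes every sampled genome carries the locus, i.e. the classical
    infinitely many sites setting, not the dispensable-gene setting.
    """
    n = len(sfs) + 1
    harmonic = sum(1.0 / s for s in range(1, n))
    return sum(sfs) / harmonic


if __name__ == "__main__":
    spectrum = expected_sfs(n=10, theta=2.0)
    # Plugging the expected spectrum back in recovers theta
    # (up to floating-point rounding).
    print(watterson_theta(spectrum))
```

Feeding the expected spectrum into the estimator returns the input rate because the expected number of segregating sites is $\theta_2 \sum_{s=1}^{n-1} 1/s$; this is only a consistency check of the classical relation, not a statement about real data.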

Suggested Citation

  • Baumdicker, Franz, 2015. "The site frequency spectrum of dispensable genes," Theoretical Population Biology, Elsevier, vol. 100(C), pages 13-25.
  • Handle: RePEc:eee:thpobi:v:100:y:2015:i:c:p:13-25
    DOI: 10.1016/j.tpb.2014.12.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0040580914000975
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.tpb.2014.12.001?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
