Author
Listed:
- Caroline M Nievergelt
- Ondrej Libiger
- Nicholas J Schork
Abstract
Many studies in the fields of genetic epidemiology and applied population genetics are predicated on, or require, an assessment of the genetic background diversity of the individuals chosen for study. A number of strategies have been developed for assessing genetic background diversity. These strategies typically focus on genotype data collected on the individuals in the study, based on a panel of DNA markers. However, many of these strategies are either rooted in cluster analysis techniques, and hence suffer from problems inherent to the assignment of the biological and statistical meaning to resulting clusters, or have formulations that do not permit easy and intuitive extensions. We describe a very general approach to the problem of assessing genetic background diversity that extends the analysis of molecular variance (AMOVA) strategy introduced by Excoffier and colleagues some time ago. As in the original AMOVA strategy, the proposed approach, termed generalized AMOVA (GAMOVA), requires a genetic similarity matrix constructed from the allelic profiles of individuals under study and/or allele frequency summaries of the populations from which the individuals have been sampled. The proposed strategy can be used to either estimate the fraction of genetic variation explained by grouping factors such as country of origin, race, or ethnicity, or to quantify the strength of the relationship of the observed genetic background variation to quantitative measures collected on the subjects, such as blood pressure levels or anthropometric measures. Since the formulation of our test statistic is rooted in multivariate linear models, sets of variables can be related to genetic background in multiple regression-like contexts. GAMOVA can also be used to complement graphical representations of genetic diversity such as tree diagrams (dendrograms) or heatmaps. We examine features, advantages, and power of the proposed procedure and showcase its flexibility by using it to analyze a wide variety of published data sets, including data from the Human Genome Diversity Project, classical anthropometry data collected by Howells, and the International HapMap Project.: Humans exhibit great genetic diversity. Understanding the factors that contribute to and sustain this diversity is an important research area. Not only can such understanding shed light on human origins, but it can also assist in the discovery of genes and genetic factors that contribute to debilitating diseases. Statistical analysis methods that can facilitate the identification of factors contributing to or associated with human genetic diversity are growing in number as new high-throughput molecular genetic assays and technologies are developed. We consider the use of an analysis method termed generalized analysis of molecular variance (GAMOVA), which builds off of previously proposed analysis methods for testing hypotheses about the factors associated with genetic background diversity. We apply the method in a wide variety of settings and show that it is both flexible and powerful. GAMOVA has great potential to assist in population-based human genetic studies, as it can be used to address questions such as: Is a sample of affected cases and unaffected controls from a homogeneous population, or is there evidence of heterogeneity that could affect the results of an association study? Is there reason to believe that the ancestry of a set of individuals influences the traits that they have?
Suggested Citation
Caroline M Nievergelt & Ondrej Libiger & Nicholas J Schork, 2007.
"Generalized Analysis of Molecular Variance,"
PLOS Genetics, Public Library of Science, vol. 3(4), pages 1-12, April.
Handle:
RePEc:plo:pgen00:0030051
DOI: 10.1371/journal.pgen.0030051
Download full text from publisher
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pgen00:0030051. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: plosgenetics (email available below). General contact details of provider: https://journals.plos.org/plosgenetics/ .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.