Printed from https://ideas.repec.org/a/plo/pone00/0029115.html

Bayesian Variable Selection in Searching for Additive and Dominant Effects in Genome-Wide Data

Authors

  • Tomi Peltola
  • Pekka Marttinen
  • Antti Jula
  • Veikko Salomaa
  • Markus Perola
  • Aki Vehtari

Abstract

Although complex diseases and traits are thought to have a multifactorial genetic basis, the common methods in genome-wide association analyses test each variant for association independently of the others. This computational simplification may reduce the power to identify variants with small effect sizes, and it requires correcting for multiple hypothesis tests with complex relationships. However, advances in computational methods and increases in computational resources are enabling the computation of models that adhere more closely to the theory of multifactorial inheritance. Here, a Bayesian variable selection and model averaging approach is formulated for searching for additive and dominant genetic effects. The approach simultaneously considers all available variants for inclusion as predictors in a linear genotype-phenotype mapping and averages over the uncertainty in the variable selection. This leads to naturally interpretable summary quantities on the significance of the variants and their contribution to the genetic basis of the studied trait. We first characterize the behavior of the approach in simulations. The results indicate a gain in causal variant identification performance when both additive and dominant variation are simulated, with a negligible loss of power in the purely additive case. An application to the analysis of high- and low-density lipoprotein cholesterol levels in a dataset of 3895 Finns is then presented, demonstrating the feasibility of the approach at the current scale of single-nucleotide polymorphism data. We describe a Markov chain Monte Carlo algorithm for the computation and give suggestions on the specification of prior parameters using commonly available prior information. Open-source software implementing the method is available at http://www.lce.hut.fi/research/mm/bmagwa/ and https://github.com/to-mi/.
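
The idea of averaging over variable selection with additive and dominant predictors can be illustrated on a toy problem. The sketch below is not the authors' method or software: it swaps the Markov chain Monte Carlo algorithm for exhaustive enumeration of all models (feasible only for a handful of predictors) and assumes a Zellner g-prior with independent Bernoulli inclusion priors. The genotype coding (minor-allele count for the additive term, heterozygote indicator for the dominant term), the function names, the simulated data, and all parameter values are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def encode_additive_dominant(genotypes):
    """Build [additive | dominant] columns from 0/1/2 minor-allele counts (assumed coding)."""
    g = np.asarray(genotypes, dtype=float)
    additive = g                       # 0, 1 or 2 copies of the minor allele
    dominant = (g == 1).astype(float)  # 1 for heterozygotes, 0 otherwise
    return np.hstack([additive, dominant])

def log_bayes_factor(X, y, g_prior):
    """Log Bayes factor of a linear model against the intercept-only model (Zellner g-prior)."""
    n, p = X.shape
    if p == 0:
        return 0.0
    Xc = X - X.mean(axis=0)            # centering absorbs the intercept
    yc = y - y.mean()
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    r2 = 1.0 - (resid @ resid) / (yc @ yc)
    return (0.5 * (n - 1 - p) * np.log1p(g_prior)
            - 0.5 * (n - 1) * np.log1p(g_prior * (1.0 - r2)))

def inclusion_probabilities(X, y, prior_inclusion=0.2, g_prior=None):
    """Posterior inclusion probability of each column, averaged over all 2^p models."""
    n, p = X.shape
    g_prior = float(n) if g_prior is None else g_prior  # unit-information choice (assumption)
    models = list(itertools.product([0, 1], repeat=p))
    log_post = np.empty(len(models))
    for i, gamma in enumerate(models):
        idx = np.flatnonzero(gamma)
        k = idx.size
        log_prior = k * np.log(prior_inclusion) + (p - k) * np.log1p(-prior_inclusion)
        log_post[i] = log_bayes_factor(X[:, idx], y, g_prior) + log_prior
    weights = np.exp(log_post - log_post.max())  # subtract the max for numerical stability
    weights /= weights.sum()
    return weights @ np.array(models, dtype=float)

# Simulated toy data: 3 SNPs, so 6 candidate predictors and 64 models.
rng = np.random.default_rng(0)
n_samples, n_snps = 400, 3
genotypes = rng.binomial(2, 0.3, size=(n_samples, n_snps))
X = encode_additive_dominant(genotypes)
# SNP 0 has an additive effect (column 0); SNP 1 has a dominant effect (column 4).
y = 0.8 * X[:, 0] + 1.0 * X[:, n_snps + 1] + rng.normal(size=n_samples)

pip = inclusion_probabilities(X, y)
for j, prob in enumerate(pip):
    kind = "additive" if j < n_snps else "dominant"
    print(f"SNP {j % n_snps} {kind}: posterior inclusion probability {prob:.3f}")
```

Each printed value is a model-averaged inclusion probability, the kind of per-variant summary the abstract calls naturally interpretable. Enumeration grows as 2^p, which is why a sampling scheme such as MCMC is needed at genome-wide scale.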

Suggested Citation

  • Tomi Peltola & Pekka Marttinen & Antti Jula & Veikko Salomaa & Markus Perola & Aki Vehtari, 2012. "Bayesian Variable Selection in Searching for Additive and Dominant Effects in Genome-Wide Data," PLOS ONE, Public Library of Science, vol. 7(1), pages 1-11, January.
  • Handle: RePEc:plo:pone00:0029115
    DOI: 10.1371/journal.pone.0029115

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029115
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0029115&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0029115?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Tanya M. Teslovich & Kiran Musunuru & Albert V. Smith & Andrew C. Edmondson & Ioannis M. Stylianou & Masahiro Koseki & James P. Pirruccello & Samuli Ripatti & Daniel I. Chasman & Cristen J. Willer & C, 2010. "Biological, clinical and population relevance of 95 loci for blood lipids," Nature, Nature, vol. 466(7307), pages 707-713, August.
    2. Yongtao Guan & Matthew Stephens, 2008. "Practical Issues in Imputation-Based Association Mapping," PLOS Genetics, Public Library of Science, vol. 4(12), pages 1-11, December.
    3. Brendan Maher, 2008. "Personal genomes: The case of the missing heritability," Nature, Nature, vol. 456(7218), pages 18-21, November.
    4. Clive J Hoggart & John C Whittaker & Maria De Iorio & David J Balding, 2008. "Simultaneous Analysis of All SNPs in Genome-Wide and Re-Sequencing Association Studies," PLOS Genetics, Public Library of Science, vol. 4(7), pages 1-8, July.
    5. David J. Nott & Robert Kohn, 2005. "Adaptive sampling for Bayesian variable selection," Biometrika, Biometrika Trust, vol. 92(4), pages 747-763, December.
    6. Eric S. Lander, 2011. "Initial impact of the sequencing of the human genome," Nature, Nature, vol. 470(7333), pages 187-197, February.
    7. P. J. Brown & M. Vannucci & T. Fearn, 2002. "Bayes model averaging with selection of regressors," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(3), pages 519-536, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gabriel E Hoffman & Benjamin A Logsdon & Jason G Mezey, 2013. "PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-19, June.
    2. Ouysse, Rachida & Kohn, Robert, 2010. "Bayesian variable selection and model averaging in the arbitrage pricing theory model," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3249-3268, December.
    3. Tomi Peltola & Pekka Marttinen & Aki Vehtari, 2012. "Finite Adaptation and Multistep Moves in the Metropolis-Hastings Algorithm for Variable Selection in Genome-Wide Association Analysis," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-11, November.
    4. Nagel, Mats, 2020. "Changing perspectives: Towards detailed phenotyping in genetics," Thesis Commons a4nz2_v1, Center for Open Science.
    5. Chuong B Do & David A Hinds & Uta Francke & Nicholas Eriksson, 2012. "Comparison of Family History and SNPs for Predicting Risk of Complex Disease," PLOS Genetics, Public Library of Science, vol. 8(10), pages 1-16, October.
    6. Frommlet, Florian & Ruhaltinger, Felix & Twaróg, Piotr & Bogdan, Małgorzata, 2012. "Modified versions of Bayesian Information Criterion for genome-wide association studies," Computational Statistics & Data Analysis, Elsevier, vol. 56(5), pages 1038-1051.
    7. Li, Feng & Kang, Yanfei, 2018. "Improving forecasting performance using covariate-dependent copula models," International Journal of Forecasting, Elsevier, vol. 34(3), pages 456-476.
    8. Ahmed Ismaïl & Hartikainen Anna-Liisa & Järvelin Marjo-Riitta & Richardson Sylvia, 2011. "False Discovery Rate Estimation for Stability Selection: Application to Genome-Wide Association Studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-20, November.
    9. Szefer Elena & Lu Donghuan & Nathoo Farouk & Beg Mirza Faisal & Graham Jinko, 2017. "Multivariate association between single-nucleotide polymorphisms in Alzgene linkage regions and structural changes in the brain: discovery, refinement and validation," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 16(5-6), pages 367-386, December.
    10. Ley, Eduardo & Steel, Mark F.J., 2012. "Mixtures of g-priors for Bayesian model averaging with economic applications," Journal of Econometrics, Elsevier, vol. 171(2), pages 251-266.
    11. Ohad Manor & Eran Segal, 2013. "Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms," PLOS Computational Biology, Public Library of Science, vol. 9(8), pages 1-10, August.
    12. Ley, Eduardo & Steel, Mark F. J., 2007. "On the effect of prior assumptions in Bayesian model averaging with applications to growth regression," Policy Research Working Paper Series 4238, The World Bank.
    13. Ruixue Fan & Shaw-Hwa Lo, 2013. "A Robust Model-free Approach for Rare Variants Association Studies Incorporating Gene-Gene and Gene-Environmental Interactions," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-14, December.
    14. Pasanisi, Alberto & Fu, Shuai & Bousquet, Nicolas, 2012. "Estimating discrete Markov models from various incomplete data schemes," Computational Statistics & Data Analysis, Elsevier, vol. 56(9), pages 2609-2625.
    15. Gael M. Martin & David T. Frazier & Christian P. Robert, 2020. "Computing Bayes: Bayesian Computation from 1763 to the 21st Century," Monash Econometrics and Business Statistics Working Papers 14/20, Monash University, Department of Econometrics and Business Statistics.
    16. Ander Wilson & Brian J. Reich, 2014. "Confounder selection via penalized credible regions," Biometrics, The International Biometric Society, vol. 70(4), pages 852-861, December.
    17. Hannes Rothe & Katharina Barbara Lauer & Callum Talbot-Cooper & Daniel Juan Sivizaca Conde, 2023. "Digital entrepreneurship from cellular data: How omics afford the emergence of a new wave of digital ventures in health," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-17, December.
    18. Villani, Mattias & Kohn, Robert & Giordani, Paolo, 2009. "Regression density estimation using smooth adaptive Gaussian mixtures," Journal of Econometrics, Elsevier, vol. 153(2), pages 155-173, December.
    19. Iuliana Ionita-Laza & Joseph D Buxbaum & Nan M Laird & Christoph Lange, 2011. "A New Testing Strategy to Identify Rare Variants with Either Risk or Protective Effect on Disease," PLOS Genetics, Public Library of Science, vol. 7(2), pages 1-6, February.
    20. Gael M. Martin & David T. Frazier & Ruben Loaiza-Maya & Florian Huber & Gary Koop & John Maheu & Didier Nibbering & Anastasios Panagiotelis, 2023. "Bayesian Forecasting in the 21st Century: A Modern Review," Monash Econometrics and Business Statistics Working Papers 1/23, Monash University, Department of Econometrics and Business Statistics.
