
Multiple Regression Methods Show Great Potential for Rare Variant Association Tests

Authors

  • ChangJiang Xu
  • Martin Ladouceur
  • Zari Dastani
  • J Brent Richards
  • Antonio Ciampi
  • Celia M T Greenwood

Abstract

The investigation of associations between rare genetic variants and diseases or phenotypes has two goals: first, to identify which genes or genomic regions are associated, and second, to discriminate associated variants from background noise within each region. Over the last few years, many new methods have been developed that associate genomic regions with phenotypes. However, classical methods for high-dimensional data have received little attention. Here we investigate whether several classical statistical methods for high-dimensional data, namely ridge regression (RR), principal components regression (PCR), partial least squares regression (PLS), a sparse version of PLS (SPLS), and the LASSO, are able to detect associations with rare genetic variants. These approaches have been extensively used in statistics to identify the true associations in data sets containing many predictor variables. Using genetic variants identified in three genes that were Sanger sequenced in 1,998 individuals, we simulated continuous phenotypes under several different models, and we show that these feature selection and feature extraction methods can substantially outperform several popular methods for rare variant analysis. Furthermore, these approaches can identify which variants contribute most to the model fit, so both goals of rare variant analysis can be achieved simultaneously by using regularized regression methods. These methods are briefly illustrated with an analysis of adiponectin levels and variants in the ADIPOQ gene.
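
To make the idea of regularized regression on rare variants concrete, the following is a minimal sketch of the LASSO fitted by cyclic coordinate descent to simulated genotypes. It is not the authors' code or simulation design: the sample size, minor allele frequency, effect size, penalty `lam`, and all function names are illustrative assumptions. The objective minimised is (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1, with the genotype columns and the phenotype centred first.

```python
import random


def simulate_rare_variants(n, p, maf, causal, effect, rng):
    """Simulate additive genotypes (0, 1 or 2 minor alleles) and a continuous
    phenotype. All parameters are illustrative, not taken from the paper."""
    X = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(p)]
         for _ in range(n)]
    y = [effect * sum(row[j] for j in causal) + rng.gauss(0.0, 1.0) for row in X]
    return X, y


def centre(X, y):
    """Subtract column means from X and the mean from y (removes the intercept)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    return Xc, yc


def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, max_iter=100, tol=1e-8):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 on centred data."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, kept up to date incrementally
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic variant: no information, leave at 0
            # Correlation of variant j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


rng = random.Random(1)
causal = [0, 1]  # hypothetical causal variants
X, y = simulate_rare_variants(n=1000, p=20, maf=0.03,
                              causal=causal, effect=3.0, rng=rng)
Xc, yc = centre(X, y)
beta = lasso_coordinate_descent(Xc, yc, lam=0.02)

# Rank variants by absolute coefficient: large values flag likely contributors.
ranked = sorted(range(len(beta)), key=lambda j: -abs(beta[j]))
print("top variants by |coefficient|:", ranked[:5])
```

In this toy setting a single fit addresses both goals named in the abstract: a non-zero fit suggests the region is associated, and the ranking of coefficients points to the variants driving that association. In practice the penalty would normally be chosen by cross-validation rather than fixed in advance.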

Suggested Citation

  • ChangJiang Xu & Martin Ladouceur & Zari Dastani & J Brent Richards & Antonio Ciampi & Celia M T Greenwood, 2012. "Multiple Regression Methods Show Great Potential for Rare Variant Association Tests," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-10, August.
  • Handle: RePEc:plo:pone00:0041694
    DOI: 10.1371/journal.pone.0041694

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0041694
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0041694&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wan-Yu Lin & Xiang-Yang Lou & Guimin Gao & Nianjun Liu, 2014. "Rare Variant Association Testing by Adaptive Combination of P-values," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-7, January.
    2. Ruixue Fan & Shaw-Hwa Lo, 2013. "A Robust Model-free Approach for Rare Variants Association Studies Incorporating Gene-Gene and Gene-Environmental Interactions," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-14, December.
    3. Chung-Feng Kao & Jia-Rou Liu & Hung Hung & Po-Hsiu Kuo, 2015. "A Robust GWSS Method to Simultaneously Detect Rare and Common Variants for Complex Disease," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-14, April.
    4. Elodie Persyn & Richard Redon & Lise Bellanger & Christian Dina, 2018. "The impact of a fine-scale population stratification on rare variant association test results," PLOS ONE, Public Library of Science, vol. 13(12), pages 1-17, December.
    5. Diana Chang & Alon Keinan, 2012. "Predicting Signatures of “Synthetic Associations” and “Natural Associations” from Empirical Patterns of Human Genetic Variation," PLOS Computational Biology, Public Library of Science, vol. 8(7), pages 1-9, July.
    6. Nanye Long & Samuel P Dickson & Jessica M Maia & Hee Shin Kim & Qianqian Zhu & Andrew S Allen, 2013. "Leveraging Prior Information to Detect Causal Variants via Multi-Variant Regression," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-11, June.
    7. Silviu-Alin Bacanu & Matthew R Nelson & John C Whittaker, 2012. "Comparison of Statistical Tests for Association between Rare Variants and Binary Traits," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-7, August.
    8. Yao-Hwei Fang & Yen-Feng Chiu, 2013. "A Novel Support Vector Machine-Based Approach for Rare Variant Detection," PLOS ONE, Public Library of Science, vol. 8(8), pages 1-9, August.
    9. Daniel D Kinnamon & Ray E Hershberger & Eden R Martin, 2012. "Reconsidering Association Testing Methods Using Single-Variant Test Statistics as Alternatives to Pooling Tests for Sequence Data with Rare Variants," PLOS ONE, Public Library of Science, vol. 7(2), pages 1-15, February.
    10. Loukas Moutsianas & Vineeta Agarwala & Christian Fuchsberger & Jason Flannick & Manuel A Rivas & Kyle J Gaulton & Patrick K Albers & GoT2D Consortium & Gil McVean & Michael Boehnke & David Altshuler &, 2015. "The Power of Gene-Based Rare Variant Methods to Detect Disease-Associated Variation and Test Hypotheses About Complex Disease," PLOS Genetics, Public Library of Science, vol. 11(4), pages 1-24, April.
    11. Fan, Jianqing & Jiang, Bai & Sun, Qiang, 2022. "Bayesian factor-adjusted sparse regression," Journal of Econometrics, Elsevier, vol. 230(1), pages 3-19.
    12. Julieta Fuentes & Pilar Poncela & Julio Rodríguez, 2015. "Sparse Partial Least Squares in Time Series for Macroeconomic Forecasting," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 30(4), pages 576-595, June.
    13. repec:jss:jstsof:23:i12 is not listed on IDEAS
    14. Tommaso Proietti, 2016. "On the Selection of Common Factors for Macroeconomic Forecasting," Advances in Econometrics, in: Dynamic Factor Models, volume 35, pages 593-628, Emerald Group Publishing Limited.
    15. Faming Liang & Momiao Xiong, 2013. "Bayesian Detection of Causal Rare Variants under Posterior Consistency," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-16, July.
    16. Kawano, Shuichi & Fujisawa, Hironori & Takada, Toyoyuki & Shiroishi, Toshihiko, 2015. "Sparse principal component regression with adaptive loading," Computational Statistics & Data Analysis, Elsevier, vol. 89(C), pages 192-203.
    17. Debamita Kundu & Riten Mitra & Jeremy T. Gaskins, 2021. "Bayesian variable selection for multioutcome models through shared shrinkage," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(1), pages 295-320, March.
    18. Elton Mammadov & Michael Denk & Frank Riedel & Cezary Kaźmierowski & Karolina Lewinska & Remigiusz Łukowiak & Witold Grzebisz & Amrakh I. Mamedov & Cornelia Glaesser, 2022. "Determination of Mehlich 3 Extractable Elements with Visible and Near Infrared Spectroscopy in a Mountainous Agricultural Land, the Caucasus Mountains," Land, MDPI, vol. 11(3), pages 1-24, March.
    19. Giacomo Crucil & Fabio Castaldi & Emilien Aldana-Jague & Bas van Wesemael & Andy Macdonald & Kristof Van Oost, 2019. "Assessing the Performance of UAS-Compatible Multispectral and Hyperspectral Sensors for Soil Organic Carbon Prediction," Sustainability, MDPI, vol. 11(7), pages 1-18, March.
    20. Zhang Yuping & Tibshirani Robert J. & Davis Ronald W., 2010. "Predicting Patient Survival from Longitudinal Gene Expression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-23, November.
    21. Qiang Sun & Hongtu Zhu & Yufeng Liu & Joseph G. Ibrahim, 2015. "SPReM: Sparse Projection Regression Model For High-Dimensional Linear Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(509), pages 289-302, March.
