Printed from https://ideas.repec.org/a/bpj/sagmbi/v13y2014i5p21n4.html

Robustness of the linear mixed effects model to error distribution assumptions and the consequences for genome-wide association studies

Author

Listed:
  • Warrington Nicole M.

    (School of Women’s and Infants’ Health, The University of Western Australia, Perth, Western Australia, Australia; University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, Queensland, Australia)

  • Tilling Kate

    (School of Social and Community Medicine, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, UK)

  • Howe Laura D.

    (School of Social and Community Medicine, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, UK)

  • Paternoster Lavinia

    (School of Social and Community Medicine, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, UK)

  • Pennell Craig E.

    (School of Women’s and Infants’ Health, The University of Western Australia, Perth, Western Australia, Australia)

  • Wu Yan Yan

    (Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada)

  • Briollais Laurent

    (Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada)

Abstract

Genome-wide association studies have been successful in uncovering novel genetic variants that are associated with disease status or cross-sectional phenotypic traits. Researchers are beginning to investigate how genes play a role in the development of a trait over time. Linear mixed effects models (LMMs) are commonly used to model longitudinal data; however, it is unclear whether failure to meet the model's distributional assumptions affects the conclusions of a genome-wide association study. In an extensive simulation study, we compare coverage probabilities, bias, type 1 error rates and statistical power when the error of the LMM is either heteroscedastic or has a non-Gaussian distribution. We conclude that the model is robust to misspecification if the same function of age is included in the fixed and random effects. However, the type 1 error of the genetic effect over time is inflated, regardless of the model misspecification, if the polynomial function for age differs between the fixed and random effects. In situations where the model will not converge with a high-order polynomial function in the random effects, a reduced function can be used, but a robust standard error needs to be calculated to avoid inflation of the type 1 error. As an illustration, an LMM was applied to longitudinal body mass index (BMI) data over childhood in the ALSPAC cohort; the results emphasised the need for the robust standard error to ensure correct inference of associations of longitudinal BMI with chromosome 16 single nucleotide polymorphisms.
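
The abstract does not state the model itself. As a minimal sketch in standard textbook notation, a longitudinal LMM with the same degree-p polynomial in age in the fixed and random effects, together with the usual model-based and robust (sandwich) variance estimators, might be written as follows. Everything here is an assumption made for illustration and is not taken from the paper: the symbols, the degree p, the additive genotype coding g_i, and the choice of the sandwich form as the robust standard error.

```latex
% Illustrative notation only (assumed, not from the paper):
%   y_{ij}  trait (e.g. BMI) of individual i at visit j
%   t_{ij}  age at that visit;  g_i  genotype of individual i
%   b_i = (b_{0i}, ..., b_{pi})  random effects;  theta = (beta, gamma)
\begin{align}
y_{ij} &= \sum_{k=0}^{p} (\beta_k + b_{ki})\, t_{ij}^{k}
        + \sum_{k=0}^{p} \gamma_k\, g_i\, t_{ij}^{k}
        + \varepsilon_{ij},
\qquad b_i \sim N(0, D), \quad \varepsilon_{ij} \sim N(0, \sigma^2) \\
y_i &= X_i \theta + Z_i b_i + \varepsilon_i,
\qquad V_i = Z_i D Z_i^{\top} + \sigma^2 I_{n_i} \\
\hat\theta &= A^{-1} \sum_i X_i^{\top} \hat V_i^{-1} y_i,
\qquad A = \sum_i X_i^{\top} \hat V_i^{-1} X_i \\
\widehat{\operatorname{Var}}_{\text{model}}(\hat\theta) &= A^{-1} \\
\widehat{\operatorname{Var}}_{\text{robust}}(\hat\theta) &= A^{-1}
  \Big( \sum_i X_i^{\top} \hat V_i^{-1} r_i r_i^{\top} \hat V_i^{-1} X_i \Big) A^{-1},
\qquad r_i = y_i - X_i \hat\theta
\end{align}
```

In this notation, the coefficients \gamma_k with k >= 1 play the role of the genetic effect over time, and the "reduced function" case corresponds to Z_i containing fewer polynomial terms than X_i. The working covariance \hat V_i may then misstate the true within-individual covariance, so the model-based variance is no longer guaranteed to be valid, whereas the sandwich form stays consistent as the number of individuals grows, provided the mean model is correctly specified.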

Suggested Citation

  • Warrington Nicole M. & Tilling Kate & Howe Laura D. & Paternoster Lavinia & Pennell Craig E. & Wu Yan Yan & Briollais Laurent, 2014. "Robustness of the linear mixed effects model to error distribution assumptions and the consequences for genome-wide association studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(5), pages 567-587, October.
  • Handle: RePEc:bpj:sagmbi:v:13:y:2014:i:5:p:21:n:4
    DOI: 10.1515/sagmb-2013-0066

    Download full text from publisher

    File URL: https://doi.org/10.1515/sagmb-2013-0066
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel McNeish & Jeffrey R. Harring & Denis Dumas, 2023. "A multilevel structured latent curve model for disaggregating student and school contributions to learning," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 545-575, June.
    2. Leonardo Grilli & Carla Rampichini, 2015. "Specification of random effects in multilevel models: a review," Quality & Quantity: International Journal of Methodology, Springer, vol. 49(3), pages 967-976, May.
    3. Peng Zhang & Peter X.-K. Song & Annie Qu & Tom Greene, 2008. "Efficient Estimation for Patient-Specific Rates of Disease Progression Using Nonnormal Linear Mixed Models," Biometrics, The International Biometric Society, vol. 64(1), pages 29-38, March.
    4. Ye, Rendao & Wang, Tonghui & Gupta, Arjun K., 2014. "Distribution of matrix quadratic forms under skew-normal settings," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 229-239.
    5. Francis K. C. Hui & Samuel Müller & Alan H. Welsh, 2021. "Random Effects Misspecification Can Have Severe Consequences for Random Effects Inference in Linear Mixed Models," International Statistical Review, International Statistical Institute, vol. 89(1), pages 186-206, April.
    6. Ng'ombe, John, 2019. "Economics of the Greenseeder Hand Planter, Discrete Choice Modeling, and On-Farm Field Experimentation," Thesis Commons jckt7, Center for Open Science.
    7. Jacqmin-Gadda, Helene & Sibillot, Solenne & Proust, Cecile & Molina, Jean-Michel & Thiebaut, Rodolphe, 2007. "Robustness of the linear mixed model to misspecified error distribution," Computational Statistics & Data Analysis, Elsevier, vol. 51(10), pages 5142-5154, June.
    8. Li, Erning & Pourahmadi, Mohsen, 2013. "An alternative REML estimation of covariance matrices in linear mixed models," Statistics & Probability Letters, Elsevier, vol. 83(4), pages 1071-1077.
    9. R. N. Rattihalli, 2023. "A Class of Multivariate Power Skew Symmetric Distributions: Properties and Inference for the Power-Parameter," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(2), pages 1356-1393, August.
    10. Jara, Alejandro & Quintana, Fernando & San Martín, Ernesto, 2008. "Linear mixed models with skew-elliptical distributions: A Bayesian approach," Computational Statistics & Data Analysis, Elsevier, vol. 52(11), pages 5033-5045, July.
    11. Reyhaneh Rikhtehgaran & Iraj Kazemi, 2013. "Semi-parametric Bayesian estimation of mixed-effects models using the multivariate skew-normal distribution," Computational Statistics, Springer, vol. 28(5), pages 2007-2027, October.
    12. Rendao Ye & Tonghui Wang & Saowanit Sukparungsee & Arjun Gupta, 2015. "Tests in variance components models under skew-normal settings," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 78(7), pages 885-904, October.
    13. Weiping Zhang & MengMeng Zhang & Yu Chen, 2020. "A Copula-Based GLMM Model for Multivariate Longitudinal Data with Mixed-Types of Responses," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 82(2), pages 353-379, November.
    14. Loy, Adam & Hofmann, Heike, 2014. "HLMdiag: A Suite of Diagnostics for Hierarchical Linear Models in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 56(i05).
    15. Xiao Song & Marie Davidian & Anastasios A. Tsiatis, 2002. "A Semiparametric Likelihood Approach to Joint Modeling of Longitudinal and Time-to-Event Data," Biometrics, The International Biometric Society, vol. 58(4), pages 742-753, December.
    16. Philip S. Boonstra & Bhramar Mukherjee & Jeremy M. G. Taylor & Mef Nilbert & Victor Moreno & Stephen B. Gruber, 2011. "Bayesian Modeling for Genetic Anticipation in Presence of Mutational Heterogeneity: A Case Study in Lynch Syndrome," Biometrics, The International Biometric Society, vol. 67(4), pages 1627-1637, December.
    17. Huang, Pei & McCarl, Bruce A., 2014. "Estimating Decadal Climate Variability Effects on Crop Yields: A Bayesian Hierarchical Approach," 2014 Annual Meeting, July 27-29, 2014, Minneapolis, Minnesota 169828, Agricultural and Applied Economics Association.
    18. Zeinolabedin Najafi & Karim Zare & Mohammad Reza Mahmoudi & Soheil Shokri & Amir Mosavi, 2022. "Inference and Local Influence Assessment in a Multifactor Skew-Normal Linear Mixed Model," Mathematics, MDPI, vol. 10(15), pages 1-21, August.
    19. Wendimagegn Ghidey & Emmanuel Lesaffre & Paul Eilers, 2004. "Smooth Random Effects Distribution in a Linear Mixed Model," Biometrics, The International Biometric Society, vol. 60(4), pages 945-953, December.
    20. Charles E. McCulloch & John M. Neuhaus, 2011. "Prediction of Random Effects in Linear and Generalized Linear Models under Model Misspecification," Biometrics, The International Biometric Society, vol. 67(1), pages 270-279, March.
