Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-32885-x.html

Efficient and accurate frailty model approach for genome-wide survival association analysis in large-scale biobanks

Author

Listed:
  • Rounak Dey

    (Harvard T.H. Chan School of Public Health)

  • Wei Zhou

    (Massachusetts General Hospital
    Broad Institute of Harvard and MIT
    University of Helsinki)

  • Tuomo Kiiskinen

    (University of Helsinki
    Finnish Institute for Health and Welfare)

  • Aki Havulinna

    (University of Helsinki
    Finnish Institute for Health and Welfare)

  • Amanda Elliott

    (Harvard T.H. Chan School of Public Health
    Massachusetts General Hospital
    Broad Institute of Harvard and MIT)

  • Juha Karjalainen

    (Massachusetts General Hospital
    Broad Institute of Harvard and MIT
    University of Helsinki)

  • Mitja Kurki

    (Massachusetts General Hospital
    Broad Institute of Harvard and MIT
    University of Helsinki)

  • Ashley Qin

    (Harvard T.H. Chan School of Public Health)

  • Seunggeun Lee

    (Seoul National University)

  • Aarno Palotie

    (Massachusetts General Hospital
    Broad Institute of Harvard and MIT
    University of Helsinki)

  • Benjamin Neale

    (Massachusetts General Hospital
    Broad Institute of Harvard and MIT)

  • Mark Daly

    (Massachusetts General Hospital
    Broad Institute of Harvard and MIT
    University of Helsinki)

  • Xihong Lin

    (Harvard T.H. Chan School of Public Health
    Broad Institute of Harvard and MIT
    Harvard University)

Abstract

With decades of electronic health records linked to genetic data, large biobanks provide unprecedented opportunities for systematically understanding the genetics of the natural history of complex diseases. Genome-wide survival association analysis can identify genetic variants associated with ages of onset, disease progression and lifespan. We propose an efficient and accurate frailty model approach for genome-wide survival association analysis of censored time-to-event (TTE) phenotypes by accounting for both population structure and relatedness. Our method utilizes state-of-the-art optimization strategies to reduce the computational cost. The saddlepoint approximation is used to allow for analysis of heavily censored phenotypes (>90%) and low frequency variants (down to minor allele count 20). We demonstrate the performance of our method through extensive simulation studies and analysis of five TTE phenotypes, including lifespan, with heavy censoring rates (90.9% to 99.8%) on ~400,000 UK Biobank participants with white British ancestry and ~180,000 individuals in FinnGen. We further analyzed 871 TTE phenotypes in the UK Biobank and presented the genome-wide scale phenome-wide association results with the PheWeb browser.
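
The abstract credits the saddlepoint approximation with keeping tests calibrated under heavy censoring and low minor allele counts, where a normal approximation to the score statistic tends to understate the tail. The sketch below illustrates that general idea only. It is not the paper's frailty-model implementation: it assumes a simpler score statistic S = sum_i g_i (Y_i - mu_i) with independent Bernoulli outcomes Y_i and 0 < mu_i < 1, and applies the standard Lugannani-Rice tail formula. All function names are hypothetical.

```python
import math


def _tilted_prob(t, g_i, mu_i):
    """P(Y_i = 1) after exponential tilting by t, computed on the logit scale."""
    z = math.log(mu_i / (1.0 - mu_i)) + g_i * t
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cgf(t, g, mu):
    """K(t) = sum_i [log(1 - mu_i + mu_i * exp(g_i * t)) - t * g_i * mu_i]."""
    total = 0.0
    for g_i, mu_i in zip(g, mu):
        z = math.log(mu_i / (1.0 - mu_i)) + g_i * t
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += math.log(1.0 - mu_i) + softplus - t * g_i * mu_i
    return total


def cgf_d1(t, g, mu):
    """First derivative K'(t); it is increasing in t."""
    return sum(g_i * (_tilted_prob(t, g_i, mu_i) - mu_i) for g_i, mu_i in zip(g, mu))


def cgf_d2(t, g, mu):
    """Second derivative K''(t), the variance of S under tilting."""
    total = 0.0
    for g_i, mu_i in zip(g, mu):
        p = _tilted_prob(t, g_i, mu_i)
        total += g_i * g_i * p * (1.0 - p)
    return total


def _norm_sf(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def _norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def spa_upper_tail(s, g, mu):
    """Approximate P(S >= s) with the Lugannani-Rice saddlepoint formula."""
    lo_lim = sum(min(g_i * (1.0 - m), -g_i * m) for g_i, m in zip(g, mu))
    hi_lim = sum(max(g_i * (1.0 - m), -g_i * m) for g_i, m in zip(g, mu))
    if not lo_lim < s < hi_lim:
        raise ValueError("s lies outside the attainable range of S")

    # Bracket the saddlepoint, i.e. the root of K'(t) = s.
    lo, hi = -1.0, 1.0
    for _ in range(60):
        if cgf_d1(lo, g, mu) < s:
            break
        lo *= 2.0
    else:
        raise RuntimeError("could not bracket the saddlepoint from below")
    for _ in range(60):
        if cgf_d1(hi, g, mu) > s:
            break
        hi *= 2.0
    else:
        raise RuntimeError("could not bracket the saddlepoint from above")

    # Bisection is slow but safe because K' is monotone.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cgf_d1(mid, g, mu) < s:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)

    # Near the mean the formula is numerically unstable (w and v both tend
    # to zero), so fall back to the normal approximation there.
    if abs(t) < 1e-4:
        return _norm_sf(s / math.sqrt(cgf_d2(0.0, g, mu)))

    w = math.copysign(math.sqrt(2.0 * (t * s - cgf(t, g, mu))), t)
    v = t * math.sqrt(cgf_d2(t, g, mu))
    return _norm_sf(w) + _norm_pdf(w) * (1.0 / v - 1.0 / w)
```

With equal weights g_i = 1 and a common mu_i, S is a centered binomial count, so the approximation can be checked against an exact binomial tail. For integer-valued S, evaluating at the observed count minus 0.5 is a simple continuity correction.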

Suggested Citation

  • Rounak Dey & Wei Zhou & Tuomo Kiiskinen & Aki Havulinna & Amanda Elliott & Juha Karjalainen & Mitja Kurki & Ashley Qin & Seunggeun Lee & Aarno Palotie & Benjamin Neale & Mark Daly & Xihong Lin, 2022. "Efficient and accurate frailty model approach for genome-wide survival association analysis in large-scale biobanks," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-32885-x
    DOI: 10.1038/s41467-022-32885-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-32885-x
    File Function: Abstract
    Download Restriction: no


    Citations


    Cited by:

    1. Emil M. Pedersen & Esben Agerbo & Oleguer Plana-Ripoll & Jette Steinbach & Morten D. Krebs & David M. Hougaard & Thomas Werge & Merete Nordentoft & Anders D. Børglum & Katherine L. Musliner & Andrea G, 2023. "ADuLT: An efficient and robust time-to-event GWAS," Nature Communications, Nature, vol. 14(1), pages 1-12, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jaap H. Abbring & Gerard J. Van Den Berg, 2007. "The unobserved heterogeneity distribution in duration analysis," Biometrika, Biometrika Trust, vol. 94(1), pages 87-99.
    2. Yashin, Anatoli I. & Iachine, Ivan A., 1999. "Dependent Hazards in Multivariate Survival Problems," Journal of Multivariate Analysis, Elsevier, vol. 71(2), pages 241-261, November.
    3. Andreas Wienke & Konstantin G. Arbeev & Isabella Locatelli & Anatoli I. Yashin, 2003. "A simulation study of different correlated frailty models and estimation strategies," MPIDR Working Papers WP-2003-018, Max Planck Institute for Demographic Research, Rostock, Germany.
    4. P. Sankaran & V. Gleeja, 2008. "Proportional reversed hazard and frailty models," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 68(3), pages 333-342, November.
    5. David B. Dunson & Zhen Chen, 2004. "Selecting Factors Predictive of Heterogeneity in Multivariate Event Time Data," Biometrics, The International Biometric Society, vol. 60(2), pages 352-358, June.
    6. Il Do Ha & Maengseok Noh & Youngjo Lee, 2010. "Bias Reduction of Likelihood Estimators in Semiparametric Frailty Models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 37(2), pages 307-320, June.
    7. Jing Wang, 2019. "Weighted estimation for multivariate shared frailty models for complex surveys," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 469-479, July.
    8. Djeundje, Viani Biatat & Crook, Jonathan, 2018. "Incorporating heterogeneity and macroeconomic variables into multi-state delinquency models for credit cards," European Journal of Operational Research, Elsevier, vol. 271(2), pages 697-709.
    9. Sukhmani Sidhu & Kanchan Jain & Suresh Kumar Sharma, 2018. "Bayesian estimation of generalized gamma shared frailty model," Computational Statistics, Springer, vol. 33(1), pages 277-297, March.
    10. Mitra Rahimzadeh & Ebrahim Hajizadeh & Farzad Eskandari, 2011. "Non-mixture cure correlated frailty models in Bayesian approach," Journal of Applied Statistics, Taylor & Francis Journals, vol. 38(8), pages 1651-1663, August.
    11. Abrahantes, Jose Cortinas & Legrand, Catherine & Burzykowski, Tomasz & Janssen, Paul & Ducrocq, Vincent & Duchateau, Luc, 2007. "Comparison of different estimation procedures for proportional hazards model with random effects," Computational Statistics & Data Analysis, Elsevier, vol. 51(8), pages 3913-3930, May.
    12. Matteo Di Scipio & Mohammad Khan & Shihong Mao & Michael Chong & Conor Judge & Nazia Pathan & Nicolas Perrot & Walter Nelson & Ricky Lali & Shuang Di & Robert Morton & Jeremy Petch & Guillaume Paré, 2023. "A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    13. Jie Huang & Haiming Zhou & Nader Ebrahimi, 2022. "Bayesian Bivariate Cure Rate Models Using Copula Functions," International Journal of Statistics and Probability, Canadian Center of Science and Education, vol. 11(3), pages 1-9, May.
    14. Bagdonavicius, Vilijandas & Nikulin, Mikhail, 2000. "On goodness-of-fit for the linear transformation and frailty models," Statistics & Probability Letters, Elsevier, vol. 47(2), pages 177-188, April.
    15. Heng Du & Lei Zhou & Zhen Liu & Yue Zhuo & Meilin Zhang & Qianqian Huang & Shiyu Lu & Kai Xing & Li Jiang & Jian-Feng Liu, 2024. "The 1000 Chinese Indigenous Pig Genomes Project provides insights into the genomic architecture of pigs," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    16. Feehan, Dennis & Wrigley-Field, Elizabeth, 2020. "How do populations aggregate?," SocArXiv 2fkw3, Center for Open Science.
    17. Vincent Michaud & Eulalie Lasseaux & David J. Green & Dave T. Gerrard & Claudio Plaisant & Tomas Fitzgerald & Ewan Birney & Benoît Arveiler & Graeme C. Black & Panagiotis I. Sergouniotis, 2022. "The contribution of common regulatory and protein-coding TYR variants to the genetic architecture of albinism," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    18. Natalie DeForest & Yuqi Wang & Zhiyi Zhu & Jacqueline S. Dron & Ryan Koesterer & Pradeep Natarajan & Jason Flannick & Tiffany Amariuta & Gina M. Peloso & Amit R. Majithia, 2024. "Genome-wide discovery and integrative genomic characterization of insulin resistance loci using serum triglycerides to HDL-cholesterol ratio as a proxy," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    19. Dick Schijven & Sourena Soheili-Nezhad & Simon E. Fisher & Clyde Francks, 2024. "Exome-wide analysis implicates rare protein-altering variants in human handedness," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    20. Filipe Costa Souza & Wilton Bernardino & Silvio C. Patricio, 2024. "How life-table right-censoring affected the Brazilian social security factor: an application of the gamma-Gompertz-Makeham model," Journal of Population Research, Springer, vol. 41(3), pages 1-38, September.
