Author
Listed:
- Jeffery A Goldstein
- Joshua S Weinstock
- Lisa A Bastarache
- Daniel B Larach
- Lars G Fritsche
- Ellen M Schmidt
- Chad M Brummett
- Sachin Kheterpal
- Goncalo R Abecasis
- Joshua C Denny
- Matthew Zawistowski
Abstract
Phenotypes extracted from Electronic Health Records (EHRs) are increasingly prevalent in genetic studies. EHRs contain hundreds of distinct clinical laboratory test results, providing a trove of health data beyond diagnoses. Such lab data is complex and lacks a ubiquitous coding scheme, making it more challenging to work with than diagnosis data. Here we describe the first large-scale cross-health-system genome-wide association study (GWAS) of EHR-based quantitative laboratory-derived phenotypes. We meta-analyzed 70 lab traits matched between the BioVU cohort from the Vanderbilt University Health System and the Michigan Genomics Initiative (MGI) cohort from Michigan Medicine. We show high replication of known associations for these traits, validating EHR-based measurements as high-quality phenotypes for genetic analysis. Notably, our analysis provides the first replication for 699 previous GWAS associations across 46 different traits. We discovered 31 novel associations at genome-wide significance for 22 distinct traits, including the first reported associations for two lab-based traits. We replicated 22 of these novel associations in an independent tranche of BioVU samples. The summary statistics for all association tests are freely available to benefit other researchers. Finally, we performed mirrored analyses in BioVU and MGI to assess competing analytic practices for EHR lab traits. We find that using the mean of all available lab measurements provides a robust summary value, but alternate summarizations can improve power in certain circumstances. This study provides a proof-of-principle for cross-health-system GWAS and is a framework for future studies of quantitative EHR lab traits.
Author summary
Electronic Health Records (EHRs) have emerged as an abundant data source for deriving phenotypes used in genetic association studies. EHRs provide a broad range of clinical data in large health system cohorts and are readily incorporated into large-scale meta-analyses. The abundance of available data in EHRs introduces unique technical challenges, particularly for longitudinal clinical lab measurements, which lack the structure of more commonly used disease diagnosis codes. Conflicting strategies exist in the literature, and it is not clear how portable these strategies are across health systems. In this study we performed a proof-of-principle meta-analysis of 70 clinical lab traits in two large-scale health systems: BioVU from Vanderbilt University and the Michigan Genomics Initiative from Michigan Medicine. Despite the challenges of matching labs across the two health systems, we observed a high replication rate for known genetic variants. Further, we identified 31 novel associations, 22 of which replicated in an independent BioVU cohort, indicating the potential for future meta-analyses. Finally, we explored the impact of various analytic strategies, looking for consistent effects between our two cohorts, to determine optimal strategies for future genetic analysis of EHR-derived lab traits.
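As a rough illustration of two ideas mentioned in the abstract (collapsing each person's repeated lab measurements into a single summary value, and combining per-cohort association estimates), the following Python sketch may help. It is only a sketch under stated assumptions: the function names and the toy numbers are hypothetical, and the fixed-effect inverse-variance weighting shown here is a standard textbook approach that the abstract does not confirm as the authors' method.

```python
from math import sqrt
from statistics import mean, median


def summarize_labs(measurements, how="mean"):
    """Collapse each person's repeated lab values into one phenotype value.

    `measurements` maps a person ID to a list of lab values (hypothetical layout).
    People with no measurements are dropped.
    """
    reducer = {"mean": mean, "median": median}[how]
    return {pid: reducer(vals) for pid, vals in measurements.items() if vals}


def inverse_variance_meta(estimates):
    """Fixed-effect meta-analysis of (beta, standard_error) pairs.

    Assumption: standard inverse-variance weighting, not necessarily
    the procedure used in the paper.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    return beta, se, beta / se  # pooled effect, pooled SE, z-score


# Toy, made-up data for illustration only.
labs = {"p1": [4.0, 5.0, 6.0], "p2": [10.0, 30.0], "p3": []}
summary = summarize_labs(labs)

# Hypothetical per-cohort effect estimates for one variant and one lab trait.
pooled_beta, pooled_se, z = inverse_variance_meta([(0.10, 0.02), (0.14, 0.04)])
print(summary, round(pooled_beta, 3), round(pooled_se, 4))
```

Passing `how="median"` swaps in an alternate summarization, which is the kind of choice the abstract says can matter in certain circumstances.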
Suggested Citation
Jeffery A Goldstein & Joshua S Weinstock & Lisa A Bastarache & Daniel B Larach & Lars G Fritsche & Ellen M Schmidt & Chad M Brummett & Sachin Kheterpal & Goncalo R Abecasis & Joshua C Denny & Matthew Zawistowski, 2020.
"LabWAS: Novel findings and study design recommendations from a meta-analysis of clinical labs in two independent biobanks,"
PLOS Genetics, Public Library of Science, vol. 16(11), pages 1-23, November.
Handle:
RePEc:plo:pgen00:1009077
DOI: 10.1371/journal.pgen.1009077