
Genetic architecture of gene expression traits across diverse populations

Author

Listed:
  • Lauren S Mogil
  • Angela Andaleon
  • Alexa Badalamenti
  • Scott P Dickinson
  • Xiuqing Guo
  • Jerome I Rotter
  • W Craig Johnson
  • Hae Kyung Im
  • Yongmei Liu
  • Heather E Wheeler

Abstract

For many complex traits, gene regulation is likely to play a crucial mechanistic role. How the genetic architectures of complex traits vary between populations and subsequent effects on genetic prediction are not well understood, in part due to the historical paucity of GWAS in populations of non-European ancestry. We used data from the MESA (Multi-Ethnic Study of Atherosclerosis) cohort to characterize the genetic architecture of gene expression within and between diverse populations. Genotype and monocyte gene expression were available in individuals with African American (AFA, n = 233), Hispanic (HIS, n = 352), and European (CAU, n = 578) ancestry. We performed expression quantitative trait loci (eQTL) mapping in each population and show genetic correlation of gene expression depends on shared ancestry proportions. Using elastic net modeling with cross validation to optimize genotypic predictors of gene expression in each population, we show the genetic architecture of gene expression for most predictable genes is sparse. We found the best predicted gene in each population, TACSTD2 in AFA and CHURC1 in CAU and HIS, had similar prediction performance across populations with R² > 0.8 in each population. However, we identified a subset of genes that are well-predicted in one population, but poorly predicted in another. We show these differences in predictive performance are due to allele frequency differences between populations. Using genotype weights trained in MESA to predict gene expression in independent populations showed that a training set with ancestry similar to the test set is better at predicting gene expression in test populations, demonstrating an urgent need for diverse population sampling in genomics. Our predictive models and performance statistics in diverse cohorts are made publicly available for use in transcriptome mapping methods at https://github.com/WheelerLab/DivPop.

Author summary

Most genome-wide association studies (GWAS) have been conducted in populations of European ancestry leading to a disparity in understanding the genetics of complex traits between populations. For many complex traits, gene regulation is critical, given the consistent enrichment of regulatory variants among trait-associated variants. However, it is still unknown how the effects of these key variants differ across populations. We used data from MESA to study the underlying genetic architecture of gene expression by optimizing gene expression prediction within and across diverse populations. The populations with genotype and gene expression data available are from individuals with African American (AFA, n = 233), Hispanic (HIS, n = 352), and European (CAU, n = 578) ancestry. After calculating the prediction performance, we found that many genes that were well predicted in one population are poorly predicted in another. We further show that a training set with ancestry similar to the test set resulted in better gene expression predictions, demonstrating the need to incorporate diverse populations in genomic studies. Our gene expression prediction models and performance statistics are publicly available to facilitate future transcriptome mapping studies in diverse populations.
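
The abstract attributes population differences in prediction performance to allele frequency differences. The following is a minimal illustrative sketch of why that can happen, not the authors' analysis. It assumes a single additive variant with the same effect size and the same residual variance in both populations, and genotypes in Hardy-Weinberg equilibrium, so that the genetic variance is 2p(1 - p)β². The function name `expected_r2`, the population labels, and the allele frequencies are hypothetical.

```python
def expected_r2(allele_freq: float, effect: float, resid_var: float = 1.0) -> float:
    """Expected R^2 of a single additive variant (illustrative assumption).

    Under Hardy-Weinberg equilibrium the genotype dosage (0, 1, 2) has
    variance 2p(1 - p), so the variance explained by the variant is
    2p(1 - p) * effect^2.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be in [0, 1]")
    if resid_var <= 0.0:
        raise ValueError("resid_var must be positive")
    genetic_var = 2.0 * allele_freq * (1.0 - allele_freq) * effect ** 2
    return genetic_var / (genetic_var + resid_var)


# Hypothetical allele frequencies for the same variant in two populations.
frequencies = {"population_A": 0.40, "population_B": 0.02}

for name, p in frequencies.items():
    r2 = expected_r2(p, effect=1.0)
    print(f"{name}: allele frequency = {p:.2f}, expected R^2 = {r2:.3f}")
```

Under these assumptions, the same variant with the same effect explains far less expression variance where it is rare, so a gene driven by that variant would look well predicted in one population and poorly predicted in the other.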

Suggested Citation

  • Lauren S Mogil & Angela Andaleon & Alexa Badalamenti & Scott P Dickinson & Xiuqing Guo & Jerome I Rotter & W Craig Johnson & Hae Kyung Im & Yongmei Liu & Heather E Wheeler, 2018. "Genetic architecture of gene expression traits across diverse populations," PLOS Genetics, Public Library of Science, vol. 14(8), pages 1-21, August.
  • Handle: RePEc:plo:pgen00:1007586
    DOI: 10.1371/journal.pgen.1007586

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1007586
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1007586&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Alice B. Popejoy & Stephanie M. Fullerton, 2016. "Genomics is failing on diversity," Nature, Nature, vol. 538(7624), pages 161-164, October.

    Citations



    Cited by:

    1. Kevin L Keys & Angel C Y Mak & Marquitta J White & Walter L Eckalbar & Andrew W Dahl & Joel Mefford & Anna V Mikhaylova & María G Contreras & Jennifer R Elhawary & Celeste Eng & Donglei Hu & Scott Hun, 2020. "On the cross-population generalizability of gene expression prediction models," PLOS Genetics, Public Library of Science, vol. 16(8), pages 1-28, August.
    2. Lulu Shang & Wei Zhao & Yi Zhe Wang & Zheng Li & Jerome J. Choi & Minjung Kho & Thomas H. Mosley & Sharon L. R. Kardia & Jennifer A. Smith & Xiang Zhou, 2023. "meQTL mapping in the GENOA study reveals genetic determinants of DNA methylation in African Americans," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    3. Randy L. Parrish & Aron S. Buchman & Shinya Tasaki & Yanling Wang & Denis Avey & Jishu Xu & Philip L. De Jager & David A. Bennett & Michael P. Epstein & Jingjing Yang, 2024. "SR-TWAS: leveraging multiple reference panels to improve transcriptome-wide association study power by ensemble machine learning," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    4. Angela Andaleon & Lauren S Mogil & Heather E Wheeler, 2019. "Genetically regulated gene expression underlies lipid traits in Hispanic cohorts," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-21, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nadine R. Caron & Wilf Adam & Kate Anderson & Brooke T. Boswell & Meck Chongo & Viktor Deineko & Alexanne Dick & Shannon E. Hall & Jessica T. Hatcher & Patricia Howard & Megan Hunt & Kevin Linn & Ashl, 2023. "Partnering with First Nations in Northern British Columbia Canada to Reduce Inequity in Access to Genomic Research," IJERPH, MDPI, vol. 20(10), pages 1-31, May.
    2. Rohini Chakravarthy & Sarah C Stallings & Michael Williams & Megan Hollister & Mario Davidson & Juan Canedo & Consuelo H Wilkins, 2020. "Factors influencing precision medicine knowledge and attitudes," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-14, November.
    3. Michel S. Naslavsky & Marilia O. Scliar & Guilherme L. Yamamoto & Jaqueline Yu Ting Wang & Stepanka Zverinova & Tatiana Karp & Kelly Nunes & José Ricardo Magliocco Ceroni & Diego Lima Carvalho & Carlo, 2022. "Whole-genome sequencing of 1,171 elderly admixed individuals from Brazil," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    4. Pei-Kuan Cong & Wei-Yang Bai & Jin-Chen Li & Meng-Yuan Yang & Saber Khederzadeh & Si-Rui Gai & Nan Li & Yu-Heng Liu & Shi-Hui Yu & Wei-Wei Zhao & Jun-Quan Liu & Yi Sun & Xiao-Wei Zhu & Pian-Pian Zhao , 2022. "Genomic analyses of 10,376 individuals in the Westlake BioBank for Chinese (WBBC) pilot project," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    5. Jingning Zhang & Jianan Zhan & Jin Jin & Cheng Ma & Ruzhang Zhao & Jared O’Connell & Yunxuan Jiang & Bertram L. Koelsch & Haoyu Zhang & Nilanjan Chatterjee, 2024. "An ensemble penalized regression method for multi-ancestry polygenic risk prediction," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    6. Brenton R Swenson & Tin Louie & Henry J Lin & Raúl Méndez-Giráldez & Jennifer E Below & Cathy C Laurie & Kathleen F Kerr & Heather Highland & Timothy A Thornton & Kelli K Ryckman & Charles Kooperberg , 2019. "GWAS of QRS duration identifies new loci specific to Hispanic/Latino populations," PLOS ONE, Public Library of Science, vol. 14(6), pages 1-15, June.
    7. Wei Fu & Shin-Yi Chou & Li-San Wang, 2022. "NIH Grant Expansion, Ancestral Diversity and Scientific Discovery in Genomics Research," NBER Working Papers 30155, National Bureau of Economic Research, Inc.
    8. Alesha A. Hatton & Fei-Fei Cheng & Tian Lin & Ren-Juan Shen & Jie Chen & Zhili Zheng & Jia Qu & Fan Lyu & Sarah E. Harris & Simon R. Cox & Zi-Bing Jin & Nicholas G. Martin & Dongsheng Fan & Grant W. M, 2024. "Genetic control of DNA methylation is largely shared across European and East Asian populations," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    9. Shim, Janet K. & Bentz, Michael & Vasquez, Emily & Jeske, Melanie & Saperstein, Aliya & Fullerton, Stephanie M. & Foti, Nicole & McMahon, Caitlin & Lee, Sandra Soo-Jin, 2022. "Strategies of inclusion: The tradeoffs of pursuing “baked in” diversity through place-based recruitment," Social Science & Medicine, Elsevier, vol. 306(C).
    10. Jiacheng Miao & Hanmin Guo & Gefei Song & Zijie Zhao & Lin Hou & Qiongshi Lu, 2023. "Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    11. Surina Singh & Ananyo Choudhury & Scott Hazelhurst & Nigel J. Crowther & Palwendé R. Boua & Hermann Sorgho & Godfred Agongo & Engelbert A. Nonterah & Lisa K. Micklesfield & Shane A. Norris & Isaac Kis, 2023. "Genome-wide association study meta-analysis of blood pressure traits and hypertension in sub-Saharan African populations: an AWI-Gen study," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    12. Ananyo Choudhury & Jean-Tristan Brandenburg & Tinashe Chikowore & Dhriti Sengupta & Palwende Romuald Boua & Nigel J. Crowther & Godfred Agongo & Gershim Asiki & F. Xavier Gómez-Olivé & Isaac Kisiangan, 2022. "Meta-analysis of sub-Saharan African studies provides insights into genetic architecture of lipid traits," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    13. Naser Ansari-Pour & Yonglan Zheng & Toshio F. Yoshimatsu & Ayodele Sanni & Mustapha Ajani & Jean-Baptiste Reynier & Avraam Tapinos & Jason J. Pitt & Stefan Dentro & Anna Woodard & Padma Sheila Rajagop, 2021. "Whole-genome analysis of Nigerian patients with breast cancer reveals ethnic-driven somatic evolution and distinct genomic subtypes," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    14. Jie Ping & Guochong Jia & Qiuyin Cai & Xingyi Guo & Ran Tao & Christine Ambrosone & Dezheng Huo & Stefan Ambs & Mollie E. Barnard & Yu Chen & Montserrat Garcia-Closas & Jian Gu & Jennifer J. Hu & Esth, 2024. "Using genome and transcriptome data from African-ancestry female participants to identify putative breast cancer susceptibility genes," Nature Communications, Nature, vol. 15(1), pages 1-8, December.
    15. Kevin L Keys & Angel C Y Mak & Marquitta J White & Walter L Eckalbar & Andrew W Dahl & Joel Mefford & Anna V Mikhaylova & María G Contreras & Jennifer R Elhawary & Celeste Eng & Donglei Hu & Scott Hun, 2020. "On the cross-population generalizability of gene expression prediction models," PLOS Genetics, Public Library of Science, vol. 16(8), pages 1-28, August.
    16. Brandy M Mapes & Christopher S Foster & Sheila V Kusnoor & Marcia I Epelbaum & Mona AuYoung & Gwynne Jenkins & Maria Lopez-Class & Dara Richardson-Heron & Ahmed Elmi & Karl Surkan & Robert M Cronin & , 2020. "Diversity and inclusion for the All of Us research program: A scoping review," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-14, July.
    17. Brieuc Lehmann & Maxine Mackintosh & Gil McVean & Chris Holmes, 2023. "Optimal strategies for learning multi-ancestry polygenic scores vary across traits," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    18. Irene Unceta & Jordi Nin & Oriol Pujol, 2020. "Risk mitigation in algorithmic accountability: The role of machine learning copies," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-26, November.
    19. Evans, Linnea & Engelman, Michal & Mikulas, Alex & Malecki, Kristen, 2021. "How are social determinants of health integrated into epigenetic research? A systematic review," Social Science & Medicine, Elsevier, vol. 273(C).
