Printed from https://ideas.repec.org/a/plo/pgen00/1004722.html

Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies

Author

Listed:
  • Gleb Kichaev
  • Wen-Yun Yang
  • Sara Lindstrom
  • Farhad Hormozdiari
  • Eleazar Eskin
  • Alkes L Price
  • Peter Kraft
  • Bogdan Pasaniuc

Abstract

Standard statistical approaches for prioritizing variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulated data. We validate our findings using a large-scale meta-analysis of four blood lipid traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in these data.

Author Summary

Genome-wide association studies (GWAS) have successfully identified numerous regions in the genome that harbor genetic variants that increase risk for various complex traits and diseases. However, GWAS risk variants generally do not causally affect the trait themselves; rather, they are correlated with the true causal variant through linkage disequilibrium (LD). Plausible causal variants are identified in fine-mapping studies through targeted sequencing followed by prioritization of variants for functional validation. In this work, we propose methods that leverage two independent sources of information, association strength and genomic functional location, to prioritize causal variants. We demonstrate in simulations and empirical data that our approach reduces the number of SNPs that need to be selected for follow-up to identify the true causal variants at GWAS risk loci.
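The idea of combining association strength with functional annotations, and then reading off a 90% confidence set, can be illustrated with a small sketch. This is not the model from the article: the article allows multiple causal variants per locus and estimates annotation effects from the data, whereas the sketch below assumes exactly one causal variant per locus, ignores LD, and takes the annotation weights as given. The function names, the annotation weights `gamma`, and the prior variance `w` are all hypothetical choices made for illustration.

```python
import math


def log_bayes_factor(z, w=10.0):
    """Log Bayes factor for z ~ N(0, 1 + w) (causal) versus z ~ N(0, 1) (null).

    w is an assumed prior variance of the effect on the z-score scale.
    """
    return -0.5 * math.log(1.0 + w) + 0.5 * z * z * w / (1.0 + w)


def posterior_probabilities(z_scores, annotations, gamma, w=10.0):
    """Posterior probability that each variant is the single causal one.

    The prior weight of a variant is proportional to exp(gamma . annotation),
    so a positive gamma raises the prior for annotated variants.
    """
    log_weights = []
    for z, a in zip(z_scores, annotations):
        log_prior = sum(g * x for g, x in zip(gamma, a))
        log_weights.append(log_prior + log_bayes_factor(z, w))
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_weights)
    weights = [math.exp(lw - m) for lw in log_weights]
    total = sum(weights)
    return [wt / total for wt in weights]


def credible_set(posteriors, coverage=0.9):
    """Smallest set of variant indices whose posterior mass reaches coverage."""
    order = sorted(range(len(posteriors)),
                   key=lambda i: posteriors[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += posteriors[i]
        if cumulative >= coverage:
            break
    return chosen


# Toy locus: four variants, one binary annotation (say, "in an exon").
z_scores = [5.0, 4.8, 4.7, 1.0]
annotations = [[0], [1], [0], [0]]

flat = posterior_probabilities(z_scores, annotations, gamma=[0.0])
informed = posterior_probabilities(z_scores, annotations, gamma=[1.5])

print("without annotation:", credible_set(flat))
print("with annotation:   ", credible_set(informed))
```

In this toy locus the annotated variant has the second-largest z-score. With a flat prior the 90% set needs three variants; with the assumed annotation weight the annotated variant moves to the top and two variants suffice, which is the kind of reduction in follow-up effort the abstract describes.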

Suggested Citation

  • Gleb Kichaev & Wen-Yun Yang & Sara Lindstrom & Farhad Hormozdiari & Eleazar Eskin & Alkes L Price & Peter Kraft & Bogdan Pasaniuc, 2014. "Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies," PLOS Genetics, Public Library of Science, vol. 10(10), pages 1-16, October.
  • Handle: RePEc:plo:pgen00:1004722
    DOI: 10.1371/journal.pgen.1004722

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004722
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1004722&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Laura L Faye & Mitchell J Machiela & Peter Kraft & Shelley B Bull & Lei Sun, 2013. "Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification," PLOS Genetics, Public Library of Science, vol. 9(8), pages 1-16, August.
    2. Ying Wu & Lindsay L Waite & Anne U Jackson & Wayne H-H Sheu & Steven Buyske & Devin Absher & Donna K Arnett & Eric Boerwinkle & Lori L Bonnycastle & Cara L Carty & Iona Cheng & Barbara Cochran & Damie, 2013. "Trans-Ethnic Fine-Mapping of Lipid Loci Identifies Population-Specific Signals and Allelic Heterogeneity That Increases the Trait Variance Explained," PLOS Genetics, Public Library of Science, vol. 9(3), pages 1-16, March.

    Citations



    Cited by:

    1. Sijie Wu & Manfei Zhang & Xinzhou Yang & Fuduan Peng & Juan Zhang & Jingze Tan & Yajun Yang & Lina Wang & Yanan Hu & Qianqian Peng & Jinxi Li & Yu Liu & Yaqun Guan & Chen Chen & Merel A Hamer & Tamar , 2018. "Genome-wide association studies and CRISPR/Cas9-mediated gene editing identify regulatory variants influencing eyebrow thickness in humans," PLOS Genetics, Public Library of Science, vol. 14(9), pages 1-22, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nathan LaPierre & Kodi Taraszka & Helen Huang & Rosemary He & Farhad Hormozdiari & Eleazar Eskin, 2021. "Identifying causal variants by fine mapping across multiple studies," PLOS Genetics, Public Library of Science, vol. 17(9), pages 1-19, September.
    2. Martijn van de Bunt & Adrian Cortes & IGAS Consortium & Matthew A Brown & Andrew P Morris & Mark I McCarthy, 2015. "Evaluating the Performance of Fine-Mapping Strategies at Common Variant GWAS Loci," PLOS Genetics, Public Library of Science, vol. 11(9), pages 1-14, September.
