Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data

Authors

  • Ciarrah Barry
  • Junxi Liu
  • Rebecca Richmond
  • Martin K Rutter
  • Deborah A Lawlor
  • Frank Dudbridge
  • Jack Bowden

Abstract

Over the last decade the availability of SNP-trait associations from genome-wide association studies has led to an array of methods for performing Mendelian randomization studies using only summary statistics. A common feature of these methods, besides their intuitive simplicity, is the ability to combine data from several sources, incorporate multiple variants and account for biases due to weak instruments and pleiotropy. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well-developed summary data methods to individual level data, and to explore the use of more sophisticated causal methods allowing for non-linearity and effect modification.

In this paper we describe a general procedure for optimally applying any two-sample summary data method using one-sample data. Our procedure first performs a meta-analysis of summary data estimates that are intentionally contaminated by collider bias between the genetic instruments and unmeasured confounders, due to conditioning on the observed exposure. These estimates are then used to correct the standard observational association between an exposure and outcome. Simulations are conducted to demonstrate the method’s performance against naive applications of two-sample summary data MR. We apply the approach to the UK Biobank cohort to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes.

Our approach can be viewed as a generalization of Dudbridge et al. (Nat. Comm. 10: 1561), who developed a technique to adjust for index event bias when uncovering genetic predictors of disease progression based on case-only data. Our work serves to clarify that in any one-sample MR analysis, it can be advantageous to estimate causal relationships by artificially inducing and then correcting for collider bias.

Author summary

Uncovering causal mechanisms between risk factors and disease is challenging with observational data because of unobserved confounding. Mendelian randomization (MR) offers a potential solution by replacing an individual’s observed risk factor data with an unconfounded genetic proxy measure. Over the last decade an array of methods has been developed for performing MR studies using publicly available summary statistics gleaned from two separate genome-wide association studies. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well-developed summary data methods to individual level data. In this paper we describe a general procedure for optimally applying any summary data MR method using individual level data from one cohort study. Our approach may at first seem nonsensical: we create summary statistics that are intentionally biased by confounding. This bias can, however, be very accurately estimated, and the estimate then used to correct the results of a standard observational analysis. We apply our new way of performing an MR analysis to data from UK Biobank to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes.
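
The two-stage idea described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under strong assumptions that are not taken from the paper: a linear model, independent SNPs, no pleiotropy, and inverse-variance weighting (IVW) as the summary data method. The function name `collider_correction_ivw`, the simulated effect sizes and the sample size are all hypothetical, and this is not the authors' implementation. Under these assumptions, regressing the outcome on the exposure and the SNPs gives a confounded exposure coefficient (true effect plus bias) and SNP coefficients equal to minus the bias times the SNP-exposure effects, so the slope relating the two sets of SNP coefficients estimates minus the bias and can be added back to the observational estimate.

```python
import numpy as np

def collider_correction_ivw(G, X, Y):
    """Hypothetical helper: correct the observational X-Y estimate using
    SNP-outcome associations that are deliberately conditioned on X."""
    n = G.shape[0]
    ones = np.ones((n, 1))

    # Step 1: SNP-exposure associations from a regression of X on G.
    D1 = np.hstack([ones, G])
    coef_x, *_ = np.linalg.lstsq(D1, X, rcond=None)
    gamma_hat = coef_x[1:]

    # Step 2: regress Y on X and G together. Conditioning on X induces
    # collider bias in the SNP coefficients; the X coefficient is the
    # usual confounded observational estimate.
    D2 = np.hstack([ones, X[:, None], G])
    coef_y, *_ = np.linalg.lstsq(D2, Y, rcond=None)
    resid = Y - D2 @ coef_y
    sigma2 = resid @ resid / (n - D2.shape[1])
    cov = sigma2 * np.linalg.inv(D2.T @ D2)
    beta_star = coef_y[1]
    alpha_star = coef_y[2:]
    se_alpha = np.sqrt(np.diag(cov)[2:])

    # Step 3: IVW regression (through the origin) of the collider-biased
    # SNP-outcome estimates on the SNP-exposure estimates. Under the
    # assumed model this slope estimates minus the confounding bias.
    w = 1.0 / se_alpha**2
    slope = np.sum(w * gamma_hat * alpha_star) / np.sum(w * gamma_hat**2)

    # Step 4: add the slope to the observational estimate to remove the bias.
    beta_hat = beta_star + slope
    return beta_star, slope, beta_hat

# Simulated one-sample data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p, true_beta = 20000, 20, 0.5
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # SNP allele counts
gamma = rng.uniform(0.1, 0.3, size=p)                # SNP-exposure effects
U = rng.normal(size=n)                               # unmeasured confounder
X = G @ gamma + U + rng.normal(size=n)
Y = true_beta * X + U + rng.normal(size=n)

beta_star, slope, beta_hat = collider_correction_ivw(G, X, Y)
print(f"observational estimate: {beta_star:.3f}")
print(f"estimated correction:   {slope:.3f}")
print(f"corrected estimate:     {beta_hat:.3f}")
```

In this sketch the IVW slope in step 3 could be swapped for another summary data estimator (for example a median-based or Egger-type fit) without changing steps 1, 2 and 4, which is the sense in which the procedure lets summary data methods be reused on one-sample data.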

Suggested Citation

  • Ciarrah Barry & Junxi Liu & Rebecca Richmond & Martin K Rutter & Deborah A Lawlor & Frank Dudbridge & Jack Bowden, 2021. "Exploiting collider bias to apply two-sample summary data Mendelian randomization methods to one-sample individual level data," PLOS Genetics, Public Library of Science, vol. 17(8), pages 1-26, August.
  • Handle: RePEc:plo:pgen00:1009703
    DOI: 10.1371/journal.pgen.1009703

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1009703
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1009703&type=printable
    Download Restriction: no

