Leveraging a surrogate outcome to improve inference on a partially missing target outcome

Authors

  • Zachary R. McCaw
  • Sheila M. Gaynor
  • Ryan Sun
  • Xihong Lin

Abstract

Sample sizes vary substantially across tissues in the Genotype‐Tissue Expression (GTEx) project, where considerably fewer samples are available from certain inaccessible tissues, such as the substantia nigra (SSN), than from accessible tissues, such as blood. This severely limits power for identifying tissue‐specific expression quantitative trait loci (eQTL) in undersampled tissues. Here we propose Surrogate Phenotype Regression Analysis (Spray) for leveraging information from a correlated surrogate outcome (eg, expression in blood) to improve inference on a partially missing target outcome (eg, expression in SSN). Rather than regarding the surrogate outcome as a proxy for the target outcome, Spray jointly models the target and surrogate outcomes within a bivariate regression framework. Unobserved values of either outcome are treated as missing data. We describe and implement an expectation conditional maximization algorithm for performing estimation in the presence of bilateral outcome missingness. Spray estimates the same association parameter estimated by standard eQTL mapping and controls the type I error even when the target and surrogate outcomes are truly uncorrelated. We demonstrate analytically and empirically, using simulations and GTEx data, that in comparison with marginally modeling the target outcome, jointly modeling the target and surrogate outcomes increases estimation precision and improves power.
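
The joint-modelling idea described in the abstract can be illustrated with a small sketch. It is not the authors' Spray implementation, and it is not their expectation conditional maximization algorithm: it assumes the simpler special case in which the surrogate is fully observed, only the target is missing (at random, coded as NaN), and both outcomes share one design matrix, so that a plain EM iteration is enough. The function name `fit_joint_em` and every simulation setting below are hypothetical choices made for illustration.

```python
import numpy as np


def fit_joint_em(t, s, X, n_iter=500, tol=1e-12):
    """EM for a bivariate normal regression of (target, surrogate) on X.

    Assumptions of this sketch (not taken from the paper): the surrogate `s`
    is fully observed, missing targets in `t` are NaN and missing at random,
    and both outcomes share the design matrix X.
    Returns (beta_t, beta_s, sigma); sigma is the 2x2 residual covariance
    ordered as (target, surrogate).
    """
    t = np.asarray(t, dtype=float)
    s = np.asarray(s, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(s)
    miss = np.isnan(t)

    # The surrogate regression uses every subject, so it is fitted once.
    beta_s = np.linalg.lstsq(X, s, rcond=None)[0]
    res_s = s - X @ beta_s
    s_ss = (res_s @ res_s) / n

    # Initialise the target regression and covariance from complete cases.
    beta_t = np.linalg.lstsq(X[~miss], t[~miss], rcond=None)[0]
    res_t_obs = t[~miss] - X[~miss] @ beta_t
    sigma = np.cov(np.vstack([res_t_obs, res_s[~miss]]))

    for _ in range(n_iter):
        # E-step: conditional mean and variance of each missing target
        # given that subject's surrogate residual.
        slope = sigma[0, 1] / sigma[1, 1]
        cond_var = sigma[0, 0] - slope * sigma[0, 1]
        t_fill = t.copy()
        t_fill[miss] = X[miss] @ beta_t + slope * res_s[miss]

        # M-step: with a shared design, the regression update is OLS on the
        # completed data; the covariance update adds back the conditional
        # variance of the imputed targets.
        beta_new = np.linalg.lstsq(X, t_fill, rcond=None)[0]
        res_t = t_fill - X @ beta_new
        s_tt = (res_t @ res_t + miss.sum() * cond_var) / n
        s_ts = (res_t @ res_s) / n
        sigma_new = np.array([[s_tt, s_ts], [s_ts, s_ss]])

        delta = np.max(np.abs(beta_new - beta_t)) + np.max(np.abs(sigma_new - sigma))
        beta_t, sigma = beta_new, sigma_new
        if delta < tol:
            break
    return beta_t, beta_s, sigma


if __name__ == "__main__":
    # Simulated data: a genotype coded 0/1/2, correlated outcomes, and a
    # target that is missing for roughly 70% of subjects.
    rng = np.random.default_rng(0)
    n = 2000
    g = rng.binomial(2, 0.3, size=n).astype(float)
    X = np.column_stack([np.ones(n), g])
    eps = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
    t = X @ np.array([0.0, 0.5]) + eps[:, 0]
    s = X @ np.array([0.0, 0.3]) + eps[:, 1]
    t[rng.random(n) < 0.7] = np.nan

    beta_t, beta_s, sigma = fit_joint_em(t, s, X)
    print("joint estimate of the target effect:", beta_t[1])
```

Under these assumptions the EM fixed point coincides with the factored-likelihood estimate, which regresses the surrogate on X in all subjects and the target on (X, surrogate) in the complete cases. The bilateral missingness handled in the paper does not reduce to this closed form.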

Suggested Citation

  • Zachary R. McCaw & Sheila M. Gaynor & Ryan Sun & Xihong Lin, 2023. "Leveraging a surrogate outcome to improve inference on a partially missing target outcome," Biometrics, The International Biometric Society, vol. 79(2), pages 1472-1484, June.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:2:p:1472-1484
    DOI: 10.1111/biom.13629

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13629
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yuto Hasegawa & Juhyun Kim & Gianluca Ursini & Yan Jouroukhin & Xiaolei Zhu & Yu Miyahara & Feiyi Xiong & Samskruthi Madireddy & Mizuho Obayashi & Beat Lutz & Akira Sawa & Solange P. Brown & Mikhail V, 2023. "Microglial cannabinoid receptor type 1 mediates social memory deficits in mice produced by adolescent THC exposure and 16p11.2 duplication," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    2. Charlotte Soneson & Sarah Gerster & Mauro Delorenzi, 2014. "Batch Effect Confounding Leads to Strong Bias in Performance Estimates Obtained by Cross-Validation," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-13, June.
    3. Arjun Bhattacharya & Anastasia N. Freedman & Vennela Avula & Rebeca Harris & Weifang Liu & Calvin Pan & Aldons J. Lusis & Robert M. Joseph & Lisa Smeester & Hadley J. Hartwell & Karl C. K. Kuban & Car, 2022. "Placental genomics mediates genetic associations with complex health traits and disease," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    4. Sudhir Varma, 2020. "Blind estimation and correction of microarray batch effect," PLOS ONE, Public Library of Science, vol. 15(4), pages 1-15, April.
    5. Yuna Blum & Magalie Houée-Bigot & David Causeur, 2016. "Sparse factor model for co-expression networks with an application using prior biological knowledge," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 15(3), pages 253-272, June.
    6. Hillary Koch & Cheryl A. Keller & Guanjue Xiang & Belinda Giardine & Feipeng Zhang & Yicheng Wang & Ross C. Hardison & Qunhua Li, 2022. "CLIMB: High-dimensional association detection in large scale genomic data," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    7. Angela Tung & Megan M. Sperry & Wesley Clawson & Ananya Pavuluri & Sydney Bulatao & Michelle Yue & Ramses Martinez Flores & Vaibhav P. Pai & Patrick McMillen & Franz Kuchling & Michael Levin, 2024. "Embryos assist morphogenesis of others through calcium and ATP signaling mechanisms in collective teratogen resistance," Nature Communications, Nature, vol. 15(1), pages 1-22, December.
    8. Chloé Friguet & David Causeur, 2011. "Estimation of the proportion of true null hypotheses in high-dimensional data under dependence," Computational Statistics & Data Analysis, Elsevier, vol. 55(9), pages 2665-2676, September.
    9. Xiaoquan Wen, 2017. "Robust Bayesian FDR Control Using Bayes Factors, with Applications to Multi-tissue eQTL Discovery," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 28-49, June.
    10. Jonathan M. Dreyfuss & Yixing Yuchi & Xuehong Dong & Vissarion Efthymiou & Hui Pan & Donald C. Simonson & Ashley Vernon & Florencia Halperin & Pratik Aryal & Anish Konkar & Yinong Sebastian & Brandon , 2021. "High-throughput mediation analysis of human proteome and metabolome identifies mediators of post-bariatric surgical diabetes control," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    11. repec:jss:jstsof:40:i14 is not listed on IDEAS
    12. Xuemeng Zhou & Tsz Wing Sam & Ah Young Lee & Danny Leung, 2021. "Mouse strain-specific polymorphic provirus functions as cis-regulatory element leading to epigenomic and transcriptomic variations," Nature Communications, Nature, vol. 12(1), pages 1-18, December.
    13. Michael W Nagle & Jeanne C Latourelle & Adam Labadorf & Alexandra Dumitriu & Tiffany C Hadzi & Thomas G Beach & Richard H Myers, 2016. "The 4p16.3 Parkinson Disease Risk Locus Is Associated with GAK Expression and Genes Involved with the Synaptic Vesicle Membrane," PLOS ONE, Public Library of Science, vol. 11(8), pages 1-14, August.
    14. Hilary S. Parker & Jeffrey T. Leek, 2012. "The practical effect of batch on genomic prediction," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-22, April.
    15. Won Jun Lee & Sang Cheol Kim & Jung-Ho Yoon & Sang Jun Yoon & Johan Lim & You-Sun Kim & Sung Won Kwon & Jeong Hill Park, 2016. "Meta-Analysis of Tumor Stem-Like Breast Cancer Cells Using Gene Set and Network Analysis," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-20, February.
    16. Kynon J. M. Benjamin & Ria Arora & Arthur S. Feltrin & Geo Pertea & Hunter H. Giles & Joshua M. Stolz & Laura D’Ignazio & Leonardo Collado-Torres & Joo Heon Shin & William S. Ulrich & Thomas M. Hyde &, 2024. "Sex affects transcriptional associations with schizophrenia across the dorsolateral prefrontal cortex, hippocampus, and caudate nucleus," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    17. Harshita Dogra & Shengxian Ding & Miyeon Yeon & Rongjie Liu & Chao Huang, 2023. "Confounder Adjustment in Shape-on-Scalar Regression Model: Corpus Callosum Shape Alterations in Alzheimer’s Disease," Stats, MDPI, vol. 6(4), pages 1-10, September.
    18. Oliver Stegle & Leopold Parts & Richard Durbin & John Winn, 2010. "A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies," PLOS Computational Biology, Public Library of Science, vol. 6(5), pages 1-11, May.
    19. Balasubramanian Narasimhan & Daniel L. Rubin & Samuel M. Gross & Marina Bendersky & Philip W. Lavori, 2017. "Software for Distributed Computation on Medical Databases: A Demonstration Project," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(13).
    20. Sean M Gibbons & Claire Duvallet & Eric J Alm, 2018. "Correcting for batch effects in case-control microbiome studies," PLOS Computational Biology, Public Library of Science, vol. 14(4), pages 1-17, April.
    21. Emanuele Aliverti & Kristian Lum & James E. Johndrow & David B. Dunson, 2021. "Removing the influence of group variables in high‐dimensional predictive modelling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(3), pages 791-811, July.
