Estimating a Structured Covariance Matrix From Multilab Measurements in High-Throughput Biology

Authors

  • Alexander M. Franks
  • Gábor Csárdi
  • D. Allan Drummond
  • Edoardo M. Airoldi

Abstract

We consider the problem of quantifying the degree of coordination between transcription and translation, in yeast. Several studies have reported a surprising lack of coordination over the years, in organisms as different as yeast and humans, using diverse technologies. However, a close look at this literature suggests that the lack of reported correlation may not reflect the biology of regulation. These reports do not control for between-study biases and structure in the measurement errors, ignore key aspects of how the data connect to the estimand, and systematically underestimate the correlation as a consequence. Here, we design a careful meta-analysis of 27 yeast datasets, supported by a multilevel model, full uncertainty quantification, a suite of sensitivity analyses, and novel theory, to produce a more accurate estimate of the correlation between mRNA and protein levels, a proxy for coordination. From a statistical perspective, this problem motivates new theory on the impact of noise, model misspecifications, and nonignorable missing data on estimates of the correlation between high-dimensional responses. We find that the correlation between mRNA and protein levels is quite high under the studied conditions, in yeast, suggesting that post-transcriptional regulation plays a less prominent role than previously thought.
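
The attenuation the abstract describes, where unmodeled measurement noise biases a correlation estimate toward zero, can be illustrated with a small simulation. This is a minimal sketch and not the authors' multilevel model: the latent correlation of 0.9, the unit noise standard deviation, the sample size, and the function names `pearson` and `simulate` are all illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, true_rho=0.9, noise_sd=1.0, seed=0):
    """Draw latent (log-scale) mRNA and protein levels with correlation
    true_rho, then add independent measurement noise to each.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    mrna, protein, mrna_obs, protein_obs = [], [], [], []
    for _ in range(n):
        # Latent levels: standard normal pair with correlation true_rho.
        m = rng.gauss(0.0, 1.0)
        p = true_rho * m + math.sqrt(1.0 - true_rho ** 2) * rng.gauss(0.0, 1.0)
        mrna.append(m)
        protein.append(p)
        # Observed levels: latent value plus independent measurement error.
        mrna_obs.append(m + rng.gauss(0.0, noise_sd))
        protein_obs.append(p + rng.gauss(0.0, noise_sd))
    return pearson(mrna, protein), pearson(mrna_obs, protein_obs)


latent_r, observed_r = simulate()
print(f"latent correlation:   {latent_r:.3f}")
print(f"observed correlation: {observed_r:.3f}")
```

Under these assumptions each noisy measurement has variance 2 while the covariance between the two is unchanged, so the naive correlation is expected to be near 0.9 / 2 = 0.45, well below the latent value. Correcting for that kind of bias requires a model of the measurement error, which is the role the abstract assigns to the multilevel model.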

Suggested Citation

  • Alexander M. Franks & Gábor Csárdi & D. Allan Drummond & Edoardo M. Airoldi, 2015. "Estimating a Structured Covariance Matrix From Multilab Measurements in High-Throughput Biology," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(509), pages 27-44, March.
  • Handle: RePEc:taf:jnlasa:v:110:y:2015:i:509:p:27-44
    DOI: 10.1080/01621459.2014.964404

    References listed on IDEAS

    1. Sina Ghaemmaghami & Won-Ki Huh & Kiowa Bower & Russell W. Howson & Archana Belle & Noah Dephoure & Erin K. O'Shea & Jonathan S. Weissman, 2003. "Global analysis of protein expression in yeast," Nature, Nature, vol. 425(6959), pages 737-741, October.
    2. Lyris M. F. de Godoy & Jesper V. Olsen & Jürgen Cox & Michael L. Nielsen & Nina C. Hubner & Florian Fröhlich & Tobias C. Walther & Matthias Mann, 2008. "Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast," Nature, Nature, vol. 455(7217), pages 1251-1254, October.
    3. John R. S. Newman & Sina Ghaemmaghami & Jan Ihmels & David K. Breslow & Matthew Noble & Joseph L. DeRisi & Jonathan S. Weissman, 2006. "Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise," Nature, Nature, vol. 441(7095), pages 840-846, June.
    4. Joseph G. Ibrahim & Ming-Hui Chen & Stuart R. Lipsitz & Amy H. Herring, 2005. "Missing-Data Methods for Generalized Linear Models: A Comparative Review," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 332-346, March.
    5. Giovanni Parmigiani & Elizabeth S. Garrett & Ramaswamy Anbazhagan & Edward Gabrielson, 2002. "A statistical framework for expression‐based molecular classification in cancer," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(4), pages 717-736, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marc S Sherman & Barak A Cohen, 2014. "A Computational Framework for Analyzing Stochasticity in Gene Expression," PLOS Computational Biology, Public Library of Science, vol. 10(5), pages 1-13, May.
    2. Alexey A Gritsenko & Marc Hulsman & Marcel J T Reinders & Dick de Ridder, 2015. "Unbiased Quantitative Models of Protein Translation Derived from Ribosome Profiling Data," PLOS Computational Biology, Public Library of Science, vol. 11(8), pages 1-26, August.
    3. Mohammad Soltani & Cesar A Vargas-Garcia & Duarte Antunes & Abhyudai Singh, 2016. "Intercellular Variability in Protein Levels from Stochastic Expression and Noisy Cell Cycle Processes," PLOS Computational Biology, Public Library of Science, vol. 12(8), pages 1-23, August.
    4. Jae Kyoung Kim & Eduardo D Sontag, 2017. "Reduction of multiscale stochastic biochemical reaction networks using exact moment derivation," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-24, June.
    5. Kazunari Iwamoto & Yuki Shindo & Koichi Takahashi, 2016. "Modeling Cellular Noise Underlying Heterogeneous Cell Responses in the Epidermal Growth Factor Signaling Pathway," PLOS Computational Biology, Public Library of Science, vol. 12(11), pages 1-18, November.
    6. Ryo Kato & Takahiro Hoshino, 2020. "Semiparametric Bayesian multiple imputation for regression models with missing mixed continuous–discrete covariates," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(3), pages 803-825, June.
    7. Li Cai & Lijie Gu & Qihua Wang & Suojin Wang, 2021. "Simultaneous confidence bands for nonparametric regression with missing covariate data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(6), pages 1249-1279, December.
    8. Lee, Julian, 2023. "Poisson distributions in stochastic dynamics of gene expression: What events do they count?," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 630(C).
    9. Brian Caffo & Liu Dongmei & Giovanni Parmigiani, 2004. "Power Conjugate Multilevel Models with Applications to Genomics," Johns Hopkins University Dept. of Biostatistics Working Paper Series 1062, Berkeley Electronic Press.
    10. Louis-François Handfield & Yolanda T Chong & Jibril Simmons & Brenda J Andrews & Alan M Moses, 2013. "Unsupervised Clustering of Subcellular Protein Expression Patterns in High-Throughput Microscopy Images Reveals Protein Complexes and Functional Relationships between Proteins," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-19, June.
    11. McDonough, Ian K. & Millimet, Daniel L., 2017. "Missing data, imputation, and endogeneity," Journal of Econometrics, Elsevier, vol. 199(2), pages 141-155.
    12. Anneke Brümmer & Carlos Salazar & Vittoria Zinzalla & Lilia Alberghina & Thomas Höfer, 2010. "Mathematical Modelling of DNA Replication Reveals a Trade-off between Coherence of Origin Activation and Robustness against Rereplication," PLOS Computational Biology, Public Library of Science, vol. 6(5), pages 1-13, May.
    13. J. Andrew Royle, 2009. "Analysis of Capture–Recapture Models with Individual Covariates Using Data Augmentation," Biometrics, The International Biometric Society, vol. 65(1), pages 267-274, March.
    14. Xie Yanmei & Zhang Biao, 2017. "Empirical Likelihood in Nonignorable Covariate-Missing Data Problems," The International Journal of Biostatistics, De Gruyter, vol. 13(1), pages 1-20, May.
    15. Stuart Aitken & Marie-Cécile Robert & Ross D Alexander & Igor Goryanin & Edouard Bertrand & Jean D Beggs, 2010. "Processivity and Coupling in Messenger RNA Transcription," PLOS ONE, Public Library of Science, vol. 5(1), pages 1-12, January.
    16. Emma Pierson & the GTEx Consortium & Daphne Koller & Alexis Battle & Sara Mostafavi, 2015. "Sharing and Specificity of Co-expression Networks across 35 Human Tissues," PLOS Computational Biology, Public Library of Science, vol. 11(5), pages 1-19, May.
    17. Donatello Telesca & Lurdes Y.T. Inoue & Mauricio Neira & Ruth Etzioni & Martin Gleave & Colleen Nelson, 2009. "Differential Expression and Network Inferences through Functional Data Modeling," Biometrics, The International Biometric Society, vol. 65(3), pages 793-804, September.
    18. Jiang, Depeng & Zhao, Puying & Tang, Niansheng, 2016. "A propensity score adjustment method for regression models with nonignorable missing covariates," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 98-119.
    19. Lei Jin & Suojin Wang, 2010. "A Model Validation Procedure when Covariate Data are Missing at Random," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 37(3), pages 403-421, September.
    20. J. F. Lawless, 2018. "Two-phase outcome-dependent studies for failure times and testing for effects of expensive covariates," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 24(1), pages 28-44, January.
