
Proportionality: A Valid Alternative to Correlation for Relative Data

Author

Listed:
  • David Lovell
  • Vera Pawlowsky-Glahn
  • Juan José Egozcue
  • Samuel Marguerat
  • Jürg Bähler

Abstract

In the life sciences, many measurement methods yield only the relative abundances of different components in a sample. With such relative (or compositional) data, differential expression needs careful interpretation, and correlation, a statistical workhorse for analyzing pairwise relationships, is an inappropriate measure of association. Using yeast gene expression data we show how correlation can be misleading and present proportionality as a valid alternative for relative data. We show how the strength of proportionality between two variables can be meaningfully and interpretably described by a new statistic ϕ which can be used instead of correlation as the basis of familiar analyses and visualisation methods, including co-expression networks and clustered heatmaps. While the main aim of this study is to present proportionality as a means to analyse relative data, it also raises intriguing questions about the molecular mechanisms underlying the proportional regulation of a range of yeast genes.

Author Summary: Relative abundance data is common in the life sciences, but appreciation that it needs special analysis and interpretation is scarce. Correlation is popular as a statistical measure of pairwise association but should not be used on data that carry only relative information. Using timecourse yeast gene expression data, we show how correlation of relative abundances can lead to conclusions opposite to those drawn from absolute abundances, and that its value changes when different components are included in the analysis. Once all absolute information has been removed, only a subset of those associations will reliably endure in the remaining relative data, specifically, associations where pairs of values behave proportionally across observations. We propose a new statistic ϕ to describe the strength of proportionality between two variables and demonstrate how it can be straightforwardly used instead of correlation as the basis of familiar analyses and visualization methods.
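
The idea of measuring proportionality can be illustrated with a minimal sketch. The abstract does not state the formula for ϕ, so the form used here, var(log(x/y)) / var(log x), is an assumption that should be checked against the full text; the function name `phi`, the toy abundances, and the per-sample totals are all hypothetical. Under that assumption, ϕ is 0 when two components are exactly proportional, and it stays 0 after each sample is divided by an arbitrary total, which is the sense in which proportional associations endure in relative data.

```python
import math
from statistics import variance


def phi(x, y):
    """Assumed form of the proportionality statistic: var(log(x/y)) / var(log(x)).

    Requires strictly positive values. A value of 0 means y is exactly
    proportional to x; larger values mean weaker proportionality.
    """
    log_x = [math.log(a) for a in x]
    log_ratio = [math.log(a / b) for a, b in zip(x, y)]
    return variance(log_ratio) / variance(log_x)


# Hypothetical absolute abundances across four samples.
x = [1.0, 2.0, 4.0, 8.0]
y_prop = [3.0 * a for a in x]      # exactly proportional to x
y_other = [8.0, 1.0, 4.0, 2.0]     # not proportional to x

phi_prop = phi(x, y_prop)
phi_other = phi(x, y_other)

# Convert to relative data by dividing each sample by a hypothetical total.
totals = [10.0, 50.0, 20.0, 100.0]
x_rel = [a / t for a, t in zip(x, totals)]
y_rel = [b / t for b, t in zip(y_prop, totals)]
phi_prop_rel = phi(x_rel, y_rel)

print(phi_prop, phi_other, phi_prop_rel)
```

In this toy example the proportional pair gives a value of (numerically) zero both before and after the per-sample rescaling, while the non-proportional pair gives a clearly positive value.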

Suggested Citation

  • David Lovell & Vera Pawlowsky-Glahn & Juan José Egozcue & Samuel Marguerat & Jürg Bähler, 2015. "Proportionality: A Valid Alternative to Correlation for Relative Data," PLOS Computational Biology, Public Library of Science, vol. 11(3), pages 1-12, March.
  • Handle: RePEc:plo:pcbi00:1004075
    DOI: 10.1371/journal.pcbi.1004075

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004075
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004075&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1004075?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Zhang Bin & Horvath Steve, 2005. "A General Framework for Weighted Gene Co-Expression Network Analysis," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 4(1), pages 1-45, August.
    2. Darrel C. Ince & Leslie Hatton & John Graham-Cumming, 2012. "The case for open computer programs," Nature, Nature, vol. 482(7386), pages 485-488, February.

    Citations

    Cited by:

    1. Huang Lin & Merete Eggesbø & Shyamal Das Peddada, 2022. "Linear and nonlinear correlation estimators unveil undescribed taxa interactions in microbiome data," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    2. Lucas Czech & Alexandros Stamatakis, 2019. "Scalable methods for analyzing and visualizing phylogenetic placement of metagenomic samples," PLOS ONE, Public Library of Science, vol. 14(5), pages 1-50, May.
    3. Maria Rita Perrone & Salvatore Romano & Giuseppe De Maria & Paolo Tundo & Anna Rita Bruno & Luigi Tagliaferro & Michele Maffia & Mattia Fragola, 2022. "Compositional Data Analysis of 16S rRNA Gene Sequencing Results from Hospital Airborne Microbiome Samples," IJERPH, MDPI, vol. 19(16), pages 1-21, August.
    4. Colignatus, Thomas, 2017. "Comparing votes and seats with a diagonal (dis-) proportionality measure, using the slope-diagonal deviation (SDD) with cosine, sine and sign," MPRA Paper 80833, University Library of Munich, Germany, revised 17 Aug 2017.
    5. Juan José Egozcue & Vera Pawlowsky-Glahn, 2019. "Compositional data: the sample space and its structure," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 599-638, September.
    6. Colignatus, Thomas, 2017. "Comparing votes and seats with a diagonal (dis-) proportionality measure, using the slope-diagonal deviation (SDD) with cosine, sine and sign," MPRA Paper 80965, University Library of Munich, Germany, revised 24 Aug 2017.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yixuan Qiu & Jing Lei & Kathryn Roeder, 2023. "Gradient-based sparse principal component analysis with extensions to online learning," Biometrika, Biometrika Trust, vol. 110(2), pages 339-360.
    2. Ruiz Vargas, E. & Mitchell, D.G.V. & Greening, S.G. & Wahl, L.M., 2014. "Topology of whole-brain functional MRI networks: Improving the truncated scale-free model," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 405(C), pages 151-158.
    3. Yan Guo & Hui Yu & Haocan Song & Jiapeng He & Olufunmilola Oyebamiji & Huining Kang & Jie Ping & Scott Ness & Yu Shyr & Fei Ye, 2021. "MetaGSCA: A tool for meta-analysis of gene set differential coexpression," PLOS Computational Biology, Public Library of Science, vol. 17(5), pages 1-15, May.
    4. Xue Jiang & Han Zhang & Xiongwen Quan & Zhandong Liu & Yanbin Yin, 2017. "Disease-related gene module detection based on a multi-label propagation clustering algorithm," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-17, May.
    5. Mandel, Antoine & Landini, Simone & Gallegati, Mauro & Gintis, Herbert, 2015. "Price dynamics, financial fragility and aggregate volatility," Journal of Economic Dynamics and Control, Elsevier, vol. 51(C), pages 257-277.
    6. Peter Langfelder & Rui Luo & Michael C Oldham & Steve Horvath, 2011. "Is My Network Module Preserved and Reproducible?," PLOS Computational Biology, Public Library of Science, vol. 7(1), pages 1-29, January.
    7. Elva María Novoa-del-Toro & Efrén Mezura-Montes & Matthieu Vignes & Morgane Térézol & Frédérique Magdinier & Laurent Tichit & Anaïs Baudot, 2021. "A multi-objective genetic algorithm to find active modules in multiplex biological networks," PLOS Computational Biology, Public Library of Science, vol. 17(8), pages 1-24, August.
    8. Matias Nehuen Iglesias, 2021. "The Overlooked Insights from Correlation Structures in Economic Geography," Papers in Evolutionary Economic Geography (PEEG) 2105, Utrecht University, Department of Human Geography and Spatial Planning, Group Economic Geography, revised Jan 2021.
    9. Lingxue Zhang & Seyoung Kim, 2014. "Learning Gene Networks under SNP Perturbations Using eQTL Datasets," PLOS Computational Biology, Public Library of Science, vol. 10(2), pages 1-20, February.
    10. Benjamin A Samuels & E David Leonardo & Alex Dranovsky & Amanda Williams & Erik Wong & Addie May I Nesbitt & Richard D McCurdy & Rene Hen & Mark Alter, 2014. "Global State Measures of the Dentate Gyrus Gene Expression System Predict Antidepressant-Sensitive Behaviors," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-10, January.
    11. Tingting Bo & Jie Li & Ganlu Hu & Ge Zhang & Wei Wang & Qian Lv & Shaoling Zhao & Junjie Ma & Meng Qin & Xiaohui Yao & Meiyun Wang & Guang-Zhong Wang & Zheng Wang, 2023. "Brain-wide and cell-specific transcriptomic insights into MRI-derived cortical morphology in macaque monkeys," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    12. Hunter, Kevin & Sreepathi, Sarat & DeCarolis, Joseph F., 2013. "Modeling for insight using Tools for Energy Model Optimization and Analysis (Temoa)," Energy Economics, Elsevier, vol. 40(C), pages 339-349.
    13. Chang Su & Zichun Xu & Xinning Shan & Biao Cai & Hongyu Zhao & Jingfei Zhang, 2023. "Cell-type-specific co-expression inference from single cell RNA-sequencing data," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    14. Sahra Uygun & Cheng Peng & Melissa D Lehti-Shiu & Robert L Last & Shin-Han Shiu, 2016. "Utility and Limitations of Using Gene Expression Data to Identify Functional Associations," PLOS Computational Biology, Public Library of Science, vol. 12(12), pages 1-27, December.
    15. Li, Jie & Wang, Lidan & Zhou, Zhong-Qiang & Zhang, Yongjie, 2021. "Monitoring or tunneling? Information interaction among large shareholders and the crash risk of the stock price," Pacific-Basin Finance Journal, Elsevier, vol. 65(C).
    16. Khang Tsung Fei & Yap Von Bing, 2010. "The Apportionment of Total Genetic Variation by Categorical Analysis of Variance," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-34, January.
    17. Shaoshuo Li & Baixing Chen & Hao Chen & Zhen Hua & Yang Shao & Heng Yin & Jianwei Wang, 2021. "Analysis of potential genetic biomarkers and molecular mechanism of smoking-related postmenopausal osteoporosis using weighted gene co-expression network analysis and machine learning," PLOS ONE, Public Library of Science, vol. 16(9), pages 1-18, September.
    18. Peter Langfelder & Fuying Gao & Nan Wang & David Howland & Seung Kwak & Thomas F Vogt & Jeffrey S Aaronson & Jim Rosinski & Giovanni Coppola & Steve Horvath & X William Yang, 2018. "MicroRNA signatures of endogenous Huntingtin CAG repeat expansion in mice," PLOS ONE, Public Library of Science, vol. 13(1), pages 1-20, January.
    19. Renaud Tissier & Jeanine Houwing-Duistermaat & Mar Rodríguez-Girondo, 2018. "Improving stability of prediction models based on correlated omics data by using network approaches," PLOS ONE, Public Library of Science, vol. 13(2), pages 1-23, February.
    20. Shujuan Zhao & Kedous Y. Mekbib & Martijn A. Ent & Garrett Allington & Andrew Prendergast & Jocelyn E. Chau & Hannah Smith & John Shohfi & Jack Ocken & Daniel Duran & Charuta G. Furey & Le Thi Hao & P, 2023. "Mutation of key signaling regulators of cerebrovascular development in vein of Galen malformations," Nature Communications, Nature, vol. 14(1), pages 1-23, December.
