Printed from https://ideas.repec.org/a/plo/pone00/0230594.html

A Zipf-plot based normalization method for high-throughput RNA-seq data

Author

Listed:
  • Bin Wang

Abstract

Normalization is crucial in RNA-seq data analysis. Because the data contain excessive zeros and a large number of small measures, it is challenging to find reliable parameters for linear rescaling normalization. We propose a Zipf-plot-based normalization method (ZN) that assumes all gene profiles have similar upper-tail behavior in their expression distributions. The method uses global information from all genes in the same profile without altering gene-level expression. It does not require that the majority of genes not be differentially expressed (DE), and it can be applied to data in which the majority of genes are weakly expressed or not expressed. ZN implements two normalization schemes: a linear rescaling scheme and a non-linear transformation scheme. The linear rescaling scheme can be applied alone or together with the non-linear scheme. The performance of ZN is benchmarked against five popular linear normalization methods for RNA-seq data. Results show that the linear rescaling scheme by itself works well and is robust. The non-linear scheme can further improve the normalization outcomes, and it is optional if the Zipf plots show parallel patterns.
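
The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of a linear rescaling driven by upper-tail behavior, not the author's ZN procedure. It assumes that a Zipf plot is the log of expression against the log of rank, that the "upper tail" can be approximated by the `top_k` largest positive values, and that a scale factor can be taken from the average vertical gap between two such tails. The names `zipf_points`, `tail_scale_factor`, `rescale`, and the parameter `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def zipf_points(counts: Sequence[float], top_k: int) -> List[Tuple[float, float]]:
    """Return (log rank, log expression) pairs for the top_k largest positive values.

    Zeros are skipped, so samples dominated by unexpressed genes still
    contribute a usable upper tail.
    """
    positive = sorted((c for c in counts if c > 0), reverse=True)[:top_k]
    return [(math.log(rank), math.log(value))
            for rank, value in enumerate(positive, start=1)]


def tail_scale_factor(sample: Sequence[float],
                      reference: Sequence[float],
                      top_k: int = 100) -> float:
    """Estimate a linear scale factor that aligns the sample's upper tail
    with the reference's upper tail on the log scale.

    Assumption (not from the paper): the factor is exp of the mean
    difference in log expression at matching ranks.
    """
    s = zipf_points(sample, top_k)
    r = zipf_points(reference, top_k)
    n = min(len(s), len(r))
    if n == 0:
        raise ValueError("both profiles need at least one positive value")
    mean_shift = sum(r[i][1] - s[i][1] for i in range(n)) / n
    return math.exp(mean_shift)


def rescale(sample: Sequence[float], factor: float) -> List[float]:
    """Apply the linear rescaling; zeros stay zero and gene order is unchanged."""
    return [c * factor for c in sample]


# Toy profiles: the sample is the reference sequenced at half the depth.
reference = [0, 0, 1, 2, 5, 40, 300, 1000]
sample = [x * 0.5 for x in reference]
factor = tail_scale_factor(sample, reference, top_k=4)
normalized = rescale(sample, factor)
```

In this toy case the two tails are exactly parallel on the log scale, which corresponds to the situation the abstract describes as not needing the additional non-linear step.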

Suggested Citation

  • Bin Wang, 2020. "A Zipf-plot based normalization method for high-throughput RNA-seq data," PLOS ONE, Public Library of Science, vol. 15(4), pages 1-15, April.
  • Handle: RePEc:plo:pone00:0230594
    DOI: 10.1371/journal.pone.0230594

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0230594
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0230594&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. James Robert White & Niranjan Nagarajan & Mihai Pop, 2009. "Statistical Methods for Detecting Differentially Abundant Features in Clinical Metagenomic Samples," PLOS Computational Biology, Public Library of Science, vol. 5(4), pages 1-11, April.
    2. Joseph K. Pickrell & John C. Marioni & Athma A. Pai & Jacob F. Degner & Barbara E. Engelhardt & Everlyne Nkadori & Jean-Baptiste Veyrieras & Matthew Stephens & Yoav Gilad & Jonathan K. Pritchard, 2010. "Understanding mechanisms underlying human gene expression variation with RNA sequencing," Nature, Nature, vol. 464(7289), pages 768-772, April.
    3. Farnoosh Abbas-Aghababazadeh & Qian Li & Brooke L Fridley, 2018. "Comparison of normalization approaches for gene expression studies completed with high-throughput sequencing," PLOS ONE, Public Library of Science, vol. 13(10), pages 1-21, October.

    Citations

    Cited by:

    1. Chen Ge & Shu-Guang Zhang & Bin Wang, 2020. "Modeling the joint distribution of firm size and firm age based on grouped data," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-19, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pengfei Song & Wen Qin & YanGan Huang & Lei Wang & Zhenyuan Cai & Tongzuo Zhang, 2020. "Grazing Management Influences Gut Microbial Diversity of Livestock in the Same Area," Sustainability, MDPI, vol. 12(10), pages 1-12, May.
    2. Shilan Li & Jianxin Shi & Paul Albert & Hong-Bin Fang, 2022. "Dependence Structure Analysis and Its Application in Human Microbiome," Mathematics, MDPI, vol. 11(1), pages 1-14, December.
    3. Allison G. White & George S. Watts & Zhenqiang Lu & Maria M. Meza-Montenegro & Eric A. Lutz & Philip Harber & Jefferey L. Burgess, 2014. "Environmental Arsenic Exposure and Microbiota in Induced Sputum," IJERPH, MDPI, vol. 11(2), pages 1-15, February.
    4. Yong Li & Jiejie Zhang & Jianqiang Zhang & Wenlai Xu & Zishen Mou, 2019. "Microbial Community Structure in the Sediments and Its Relation to Environmental Factors in Eutrophicated Sancha Lake," IJERPH, MDPI, vol. 16(11), pages 1-15, May.
    5. Pingting Ying & Can Chen & Zequn Lu & Shuoni Chen & Ming Zhang & Yimin Cai & Fuwei Zhang & Jinyu Huang & Linyun Fan & Caibo Ning & Yanmin Li & Wenzhuo Wang & Hui Geng & Yizhuo Liu & Wen Tian & Zhiyong, 2023. "Genome-wide enhancer-gene regulatory maps link causal variants to target genes underlying human cancer risk," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    6. Paul J McMurdie & Susan Holmes, 2014. "Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible," PLOS Computational Biology, Public Library of Science, vol. 10(4), pages 1-12, April.
    7. Nicoló Fusi & Oliver Stegle & Neil D Lawrence, 2012. "Joint Modelling of Confounding Factors and Prominent Genetic Regulators Provides Increased Accuracy in Genetical Genomics Studies," PLOS Computational Biology, Public Library of Science, vol. 8(1), pages 1-9, January.
    8. Hongjian Wei & Yongqi Wang & Juming Zhang & Liangfa Ge & Tianzeng Liu, 2022. "Changes in Soil Bacterial Community Structure in Bermudagrass Turf under Short-Term Traffic Stress," Agriculture, MDPI, vol. 12(5), pages 1-18, May.
    9. Kai Qiu & Huiyi Cai & Xin Wang & Guohua Liu, 2023. "Effects of Peroral Microbiota Transplantation on the Establishment of Intestinal Microorganisms in a Newly-Hatched Chick Model," Agriculture, MDPI, vol. 13(5), pages 1-13, April.
    10. Amirhossein Shamsaddini & Kimia Dadkhah & Patrick M Gillevet, 2020. "BiomMiner: An advanced exploratory microbiome analysis and visualization pipeline," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-13, June.
    11. Gregor Gorkiewicz & Gerhard G Thallinger & Slave Trajanoski & Stefan Lackner & Gernot Stocker & Thomas Hinterleitner & Christian Gülly & Christoph Högenauer, 2013. "Alterations in the Colonic Microbiota in Response to Osmotic Diarrhea," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-17, February.
    12. Gabriele Bellotti & Eren Taskin & Simone Sello & Cristina Sudiro & Rossella Bortolaso & Francesca Bandini & Maria Chiara Guerrieri & Pier Sandro Cocconcelli & Francesco Vuolo & Edoardo Puglisi, 2022. "LABs Fermentation Side-Product Positively Influences Rhizosphere and Plant Growth in Greenhouse Lettuce and Tomatoes," Land, MDPI, vol. 11(9), pages 1-15, September.
    13. Claudia Giambartolomei & Damjan Vukcevic & Eric E Schadt & Lude Franke & Aroon D Hingorani & Chris Wallace & Vincent Plagnol, 2014. "Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics," PLOS Genetics, Public Library of Science, vol. 10(5), pages 1-15, May.
    14. Yuan Ge & Joshua P Schimel & Patricia A Holden, 2014. "Analysis of Run-to-Run Variation of Bar-Coded Pyrosequencing for Evaluating Bacterial Community Shifts and Individual Taxa Dynamics," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-6, June.
    15. Zheng Sun & Jiang Liu & Meng Zhang & Tong Wang & Shi Huang & Scott T. Weiss & Yang-Yu Liu, 2023. "Removal of false positives in metagenomics-based taxonomy profiling via targeting Type IIB restriction sites," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    16. Jiapei Yuan & Yang Tong & Le Wang & Xiaoxiao Yang & Xiaochuan Liu & Meng Shu & Zekun Li & Wen Jin & Chenchen Guan & Yuting Wang & Qiang Zhang & Yang Yang, 2024. "A compendium of genetic variations associated with promoter usage across 49 human tissues," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    17. Thanh Nguyen & Asim Bhatti & Samuel Yang & Saeid Nahavandi, 2016. "RNA-Seq Count Data Modelling by Grey Relational Analysis and Nonparametric Gaussian Process," PLOS ONE, Public Library of Science, vol. 11(10), pages 1-18, October.
    18. Asta Laiho & Laura L Elo, 2014. "A Note on an Exon-Based Strategy to Identify Differentially Expressed Genes in RNA-Seq Experiments," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-12, December.
    19. Lulu Shang & Wei Zhao & Yi Zhe Wang & Zheng Li & Jerome J. Choi & Minjung Kho & Thomas H. Mosley & Sharon L. R. Kardia & Jennifer A. Smith & Xiang Zhou, 2023. "meQTL mapping in the GENOA study reveals genetic determinants of DNA methylation in African Americans," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    20. Chuan Gao & Ian C McDowell & Shiwen Zhao & Christopher D Brown & Barbara E Engelhardt, 2016. "Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-39, July.
