
Ordering of Omics Features Using Beta Distributions on Montecarlo p-Values

Authors

  • Angela L. Riffo-Campos

    (Centro de Excelencia de Modelación y Computación Científica, Universidad de La Frontera, Temuco 01145, Chile)

  • Guillermo Ayala

    (Department of Statistics and Operation Research, Faculty of Mathematics, Universitat de Valencia, 46100 Burjasot, Spain)

  • Juan Domingo

    (Department of Computer Science, ETSE, Universitat de Valencia, Avda. de la Universidad, s/n, 46100 Burjasot, Spain)

Abstract

The current trend in genetic research is to study omics data as a whole, combining either studies or omics techniques. This raises the need for new, robust statistical methods that can integrate and order the relevant biological information. One way to approach the problem is to order the features studied according to the different kinds of data, so a key point is to assign each feature a value that permits a good sorting. These values are usually the p-values of a hypothesis test carried out for each feature. The Montecarlo method is certainly one of the most robust methods for hypothesis testing. However, a large number of simulations is needed to obtain a reliable p-value, so the method becomes computationally infeasible in many situations. We propose a new way to order genes according to their differential features, using a score defined from a beta distribution fitted to the generated p-values. Our approach has been tested on simulated data and on colorectal cancer datasets from the Infinium methylationEPIC array, Affymetrix gene expression array and Illumina RNA-seq platforms. The results show that this approach allows a proper ordering of genes using far fewer simulations than the Montecarlo method requires. Furthermore, the score can be interpreted as an estimated p-value and compared with the Montecarlo p-value and with other approaches such as the p-value of the moderated t-test. We have also identified a new expression pattern of eighteen genes common to all the colorectal cancer microarrays, i.e., 21 datasets. Thus, the proposed method is effective for obtaining biological results from different datasets. Our score shows a slightly smaller type I error than the Montecarlo p-value for small sample sizes. The type II error of the Montecarlo p-value is lower than that of the proposed score and of a moderated p-value, but these differences shrink considerably for larger sample sizes and higher false discovery rates. Similar performance in type I and type II errors, together with the score, enables a clear ordering of the features being evaluated.
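
The abstract does not state how the score is computed, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes (hypothetically) that several small Montecarlo batches are run per feature, that each batch yields a permutation p-value for a difference in means, that a beta distribution is fitted to those p-values by the method of moments, and that the score is the mean of the fitted beta. All function names and parameters (`montecarlo_p_value`, `fit_beta_moments`, `beta_score`, `n_batches`, `n_sim`) are invented for this example. With a moment fit the fitted mean coincides with the sample mean of the p-values; a different fitting method or a different summary of the fitted distribution would give a different score.

```python
import random
import statistics


def montecarlo_p_value(x, y, n_sim, rng):
    """Permutation p-value for the absolute difference in group means.

    Adding 1 to numerator and denominator keeps the estimate above zero.
    """
    observed = abs(statistics.fmean(x) - statistics.fmean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)  # random relabelling of the samples
        stat = abs(statistics.fmean(pooled[:n_x]) - statistics.fmean(pooled[n_x:]))
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


def fit_beta_moments(p_values):
    """Method-of-moments estimates (alpha, beta) of a beta distribution."""
    m = statistics.fmean(p_values)
    v = statistics.variance(p_values)
    if v <= 0 or v >= m * (1 - m):
        raise ValueError("moment estimates are undefined for these p-values")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


def beta_score(x, y, n_batches=20, n_sim=50, seed=0):
    """Assumed score: mean of a beta fitted to p-values from small batches."""
    rng = random.Random(seed)
    p_values = [montecarlo_p_value(x, y, n_sim, rng) for _ in range(n_batches)]
    if len(set(p_values)) == 1:
        # Degenerate case: every batch returned the same p-value, so no
        # beta can be fitted by moments; fall back to that common value.
        return p_values[0]
    alpha, beta = fit_beta_moments(p_values)
    return alpha / (alpha + beta)
```

Under these assumptions, features would be ordered by ascending `beta_score`, so that a feature whose groups are clearly separated ranks ahead of one whose groups overlap.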

Suggested Citation

  • Angela L. Riffo-Campos & Guillermo Ayala & Juan Domingo, 2021. "Ordering of Omics Features Using Beta Distributions on Montecarlo p-Values," Mathematics, MDPI, vol. 9(11), pages 1-18, June.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:11:p:1307-:d:570315

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/11/1307/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/11/1307/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aaron C Ericsson & J Wade Davis & William Spollen & Nathan Bivens & Scott Givan & Catherine E Hagan & Mark McIntosh & Craig L Franklin, 2015. "Effects of Vendor and Genetic Background on the Composition of the Fecal Microbiota of Inbred Mice," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-19, February.
    2. Hossain, Ahmed & Beyene, Joseph & Willan, Andrew R. & Hu, Pingzhao, 2009. "A flexible approximate likelihood ratio test for detecting differential expression in microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 53(10), pages 3685-3695, August.
    3. Xiaohong Li & Guy N Brock & Eric C Rouchka & Nigel G F Cooper & Dongfeng Wu & Timothy E O’Toole & Ryan S Gill & Abdallah M Eteleeb & Liz O’Brien & Shesh N Rai, 2017. "A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data," PLOS ONE, Public Library of Science, vol. 12(5), pages 1-22, May.
    4. Ambroise Jérôme & Bearzatto Bertrand & Robert Annie & Macq Benoit & Gala Jean-Luc, 2012. "Combining Multiple Laser Scans of Spotted Microarrays by Means of a Two-Way ANOVA Model," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-20, February.
    5. J. McClatchy & R. Strogantsev & E. Wolfe & H. Y. Lin & M. Mohammadhosseini & B. A. Davis & C. Eden & D. Goldman & W. H. Fleming & P. Conley & G. Wu & L. Cimmino & H. Mohammed & A. Agarwal, 2023. "Clonal hematopoiesis related TET2 loss-of-function impedes IL1β-mediated epigenetic reprogramming in hematopoietic stem and progenitor cells," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    6. Alexandra Gyurdieva & Stefan Zajic & Ya-Fang Chang & E. Andres Houseman & Shan Zhong & Jaegil Kim & Michael Nathenson & Thomas Faitg & Mary Woessner & David C. Turner & Aisha N. Hasan & John Glod & Ro, 2022. "Biomarker correlates with response to NY-ESO-1 TCR T cells in patients with synovial sarcoma," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    7. Yu Lianbo & Gulati Parul & Fernandez Soledad & Pennell Michael & Kirschner Lawrence & Jarjoura David, 2011. "Fully Moderated T-statistic for Small Sample Size Gene Expression Arrays," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-22, September.
    8. Sinan Xiong & Jianbiao Zhou & Tze King Tan & Tae-Hoon Chung & Tuan Zea Tan & Sabrina Hui-Min Toh & Nicole Xin Ning Tang & Yunlu Jia & Yi Xiang See & Melissa Jane Fullwood & Takaomi Sanda & Wee-Joo Chn, 2024. "Super enhancer acquisition drives expression of oncogenic PPP1R15B that regulates protein homeostasis in multiple myeloma," Nature Communications, Nature, vol. 15(1), pages 1-21, December.
    9. Chaofeng Yuan & Wensheng Zhu & Xuming He & Jianhua Guo, 2019. "A mixture factor model with applications to microarray data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 60-76, March.
    10. Nott, David J. & Yu, Zeming & Chan, Eva & Cotsapas, Chris & Cowley, Mark J. & Pulvers, Jeremy & Williams, Rohan & Little, Peter, 2007. "Hierarchical Bayes variable selection and microarray experiments," Journal of Multivariate Analysis, Elsevier, vol. 98(4), pages 852-872, April.
    11. Alexander Kaever & Manuel Landesfeind & Kirstin Feussner & Burkhard Morgenstern & Ivo Feussner & Peter Meinicke, 2014. "Meta-Analysis of Pathway Enrichment: Combining Independent and Dependent Omics Data Sets," PLOS ONE, Public Library of Science, vol. 9(2), pages 1-12, February.
    12. Iqbal Mahmud & Guimei Tian & Jia Wang & Tarun E. Hutchinson & Brandon J. Kim & Nikee Awasthee & Seth Hale & Chengcheng Meng & Allison Moore & Liming Zhao & Jessica E. Lewis & Aaron Waddell & Shangtao , 2023. "DAXX drives de novo lipogenesis and contributes to tumorigenesis," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    13. Erminia Donnarumma & Michael Kohlhaas & Elodie Vimont & Etienne Kornobis & Thibault Chaze & Quentin Giai Gianetto & Mariette Matondo & Maryse Moya-Nilges & Christoph Maack & Timothy Wai, 2022. "Mitochondrial Fission Process 1 controls inner membrane integrity and protects against heart failure," Nature Communications, Nature, vol. 13(1), pages 1-24, December.
    14. J. T. Gene Hwang & Jing Qiu & Zhigen Zhao, 2009. "Empirical Bayes confidence intervals shrinking both means and variances," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(1), pages 265-285, January.
    15. Long Qu & Dan Nettleton & Jack C. M. Dekkers, 2012. "Improved Estimation of the Noncentrality Parameter Distribution from a Large Number of t-Statistics, with Applications to False Discovery Rate Estimation in Microarray Data Analysis," Biometrics, The International Biometric Society, vol. 68(4), pages 1178-1187, December.
    16. Tobias Bexte & Nawid Albinger & Ahmad Al Ajami & Philipp Wendel & Leon Buchinger & Alec Gessner & Jamal Alzubi & Vinzenz Särchen & Meike Vogler & Hadeer Mohamed Rasheed & Beate Anahita Jung & Sebastia, 2024. "CRISPR/Cas9 editing of NKG2A improves the efficacy of primary CD33-directed chimeric antigen receptor natural killer cells," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    17. Saori Kashima & Masatoshi Matsumoto & Takahiko Ogawa & Akira Eboshida & Keisuke Takeuchi, 2012. "The Impact of Travel Time on Geographic Distribution of Dialysis Patients," PLOS ONE, Public Library of Science, vol. 7(10), pages 1-8, October.
    18. Sahra Uygun & Cheng Peng & Melissa D Lehti-Shiu & Robert L Last & Shin-Han Shiu, 2016. "Utility and Limitations of Using Gene Expression Data to Identify Functional Associations," PLOS Computational Biology, Public Library of Science, vol. 12(12), pages 1-27, December.
    19. Cherif Ben Hamda & Raphael Sangeda & Liberata Mwita & Ayton Meintjes & Siana Nkya & Sumir Panji & Nicola Mulder & Lamia Guizani-Tabbane & Alia Benkahla & Julie Makani & Kais Ghedira & H3ABioNet Consor, 2018. "A common molecular signature of patients with sickle cell disease revealed by microarray meta-analysis and a genome-wide association study," PLOS ONE, Public Library of Science, vol. 13(7), pages 1-21, July.
    20. Tony Marion & Husni Elbahesh & Paul G Thomas & John P DeVincenzo & Richard Webby & Klaus Schughart, 2016. "Respiratory Mucosal Proteome Quantification in Human Influenza Infections," PLOS ONE, Public Library of Science, vol. 11(4), pages 1-16, April.
