
Combining assumptions and graphical network into gene expression data analysis

Author

Listed:
  • Demba Fofana

    (University of Texas Rio Grande Valley)

  • E. O. George

    (University of Memphis)

  • Dale Bowman

    (University of Memphis)

Abstract

Background: Rigorous analysis of gene expression data requires taking test assumptions into consideration and also relies on information about the network relations that exist among genes. Combining these elements can not only improve statistical power but also provide a better framework for analyzing gene expression.

Material and methods: We propose a novel statistical model that combines assumptions and gene network information into the analysis. Assumptions matter because every test statistic is valid only when its required assumptions hold. We therefore propose hybrid p-values, which take assumptions into consideration, and show that, under the null hypothesis of primary interest, these p-values are uniformly distributed. We incorporate gene network information because neighboring genes share biological functions; this correlation is taken into account by assigning similar prior probabilities to neighboring genes.

Results: Our approach is compared with other approaches in a series of simulations. Areas under the ROC curve (AUCs) are computed to compare the methodologies; the AUC based on our methodology is larger than the others. For regression analysis, the AUC from our proposed method contains the AUCs of the Spearman test and of the Pearson test. In addition, true negative rates (TNRs), also known as specificities, are higher with our approach than with the other approaches. For two-group comparison analysis with a sample size of n=10, for instance, the specificity of our proposed methodology is 0.716146, while the specificities of the t-test and the rank-sum test are 0.689223 and 0.69797, respectively. Our method, which combines assumptions and network information in the analysis, is shown to be more powerful.

Conclusions: These procedures are introduced as a general class of methods that can incorporate procedure selection, account for multiple testing, and incorporate graphical network information into the analysis. We obtain very good performance in simulations and in real data analysis.
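
The two evaluation measures named in the abstract, specificity (true negative rate) and AUC, can be illustrated with a minimal sketch. This is not the authors' code or their hybrid p-value procedure. The function names `specificity` and `auc`, the significance level `alpha=0.05`, and the toy p-values are all assumptions made for illustration, and the numbers are not results from the paper.

```python
# Illustrative sketch only: standard definitions of specificity and AUC,
# applied to made-up per-gene p-values. Names and data are hypothetical.

def specificity(p_values, is_null, alpha=0.05):
    """True negative rate: share of true-null genes NOT rejected at level alpha."""
    null_ps = [p for p, null in zip(p_values, is_null) if null]
    if not null_ps:
        raise ValueError("need at least one true-null gene")
    return sum(p > alpha for p in null_ps) / len(null_ps)


def auc(p_values, is_null):
    """Area under the ROC curve in its Mann-Whitney form.

    A smaller p-value counts as stronger evidence, so the AUC is the
    probability that a non-null gene has a smaller p-value than a null
    gene, with ties counted as one half.
    """
    non_null_ps = [p for p, null in zip(p_values, is_null) if not null]
    null_ps = [p for p, null in zip(p_values, is_null) if null]
    if not non_null_ps or not null_ps:
        raise ValueError("need both null and non-null genes")
    wins = sum((a < b) + 0.5 * (a == b) for a in non_null_ps for b in null_ps)
    return wins / (len(non_null_ps) * len(null_ps))


# Toy example: six genes, the first three truly differentially expressed.
p_values = [0.001, 0.20, 0.03, 0.60, 0.04, 0.90]
is_null = [False, False, False, True, True, True]

spec = specificity(p_values, is_null)  # 2 of 3 null genes are not rejected
area = auc(p_values, is_null)          # 8 of 9 (non-null, null) pairs are ordered correctly
print(f"specificity = {spec:.3f}, AUC = {area:.3f}")
```

To compare several testing procedures in this way, one would compute both quantities from each procedure's p-values on the same simulated genes, where the true null status is known.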

Suggested Citation

  • Demba Fofana & E. O. George & Dale Bowman, 2021. "Combining assumptions and graphical network into gene expression data analysis," Journal of Statistical Distributions and Applications, Springer, vol. 8(1), pages 1-17, December.
  • Handle: RePEc:spr:jstada:v:8:y:2021:i:1:d:10.1186_s40488-021-00126-z
    DOI: 10.1186/s40488-021-00126-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1186/s40488-021-00126-z
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wiki, Jesse & Kingham, Simon & Campbell, Malcolm, 2021. "A geospatial analysis of Type 2 Diabetes Mellitus and the food environment in urban New Zealand," Social Science & Medicine, Elsevier, vol. 288(C).
    2. Ferreira, Marco A.R. & Porter, Erica M. & Franck, Christopher T., 2021. "Fast and scalable computations for Gaussian hierarchical models with intrinsic conditional autoregressive spatial random effects," Computational Statistics & Data Analysis, Elsevier, vol. 162(C).
    3. Håvard Rue & Ingelin Steinsland & Sveinung Erland, 2004. "Approximating hidden Gaussian Markov random fields," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(4), pages 877-892, November.
    4. Gamerman, Dani & Moreira, Ajax R. B. & Rue, Havard, 2003. "Space-varying regression models: specifications and simulation," Computational Statistics & Data Analysis, Elsevier, vol. 42(3), pages 513-533, March.
    5. Demetris Lamnisos & Nicos Middleton & Nikoletta Kyprianou & Michael A. Talias, 2019. "Geodemographic Area Classification and Association with Mortality: An Ecological Study of Small Areas of Cyprus," IJERPH, MDPI, vol. 16(16), pages 1-13, August.
    6. Mabel Morales-Otero & Vicente Núñez-Antón, 2021. "Comparing Bayesian Spatial Conditional Overdispersion and the Besag–York–Mollié Models: Application to Infant Mortality Rates," Mathematics, MDPI, vol. 9(3), pages 1-33, January.
    7. Mary Kathryn Cowles & Stephen Bonett & Michael Seedorff, 2018. "Independent sampling for Bayesian normal conditional autoregressive models with OpenCL acceleration," Computational Statistics, Springer, vol. 33(1), pages 159-177, March.
    8. Meen Chel Jung & Jaewoo Park & Sunghwan Kim, 2019. "Spatial Relationships between Urban Structures and Air Pollution in Korea," Sustainability, MDPI, vol. 11(2), pages 1-17, January.
    9. Duncan Lee & Richard Mitchell, 2013. "Locally adaptive spatial smoothing using conditional auto-regressive models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 62(4), pages 593-608, August.
    10. KWON, Heeeun & HWANG, Beom Seuk, 2023. "Do Spatial Characteristics Affect Housing Prices in Korea?: Evidence from Bayesian Spatial Models," Hitotsubashi Journal of Economics, Hitotsubashi University, vol. 64(2), pages 109-124, December.
    11. James S. Hodges & Bradley P. Carlin & Qiao Fan, 2003. "On the Precision of the Conditionally Autoregressive Prior in Spatial Models," Biometrics, The International Biometric Society, vol. 59(2), pages 317-322, June.
    12. Konstantinos Giannakou & Demetris Lamnisos, 2022. "Small-Area Geographic and Socioeconomic Inequalities in Colorectal Cancer in Cyprus," IJERPH, MDPI, vol. 20(1), pages 1-15, December.
    13. Dongkwan Lee & Jean-Michel Guldmann & Choongik Choi, 2019. "Factors Contributing to the Relationship between Driving Mileage and Crash Frequency of Older Drivers," Sustainability, MDPI, vol. 11(23), pages 1-13, November.
    14. Duncan Lee & Chris Robertson & Colin Ramsay & Kate Pyper, 2020. "Quantifying the impact of the modifiable areal unit problem when estimating the health effects of air pollution," Environmetrics, John Wiley & Sons, Ltd., vol. 31(8), December.
    15. Nikoline N. Knudsen & Jörg Schullehner & Birgitte Hansen & Lisbeth F. Jørgensen & Søren M. Kristiansen & Denitza D. Voutchkova & Thomas A. Gerds & Per K. Andersen & Kristine Bihrmann & Morten Grønbæk, 2017. "Lithium in Drinking Water and Incidence of Suicide: A Nationwide Individual-Level Cohort Study with 22 Years of Follow-Up," IJERPH, MDPI, vol. 14(6), pages 1-13, June.
    16. Katie Wilson & Jon Wakefield, 2022. "A probabilistic model for analyzing summary birth history data," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 47(11), pages 291-344.
    17. Pounds Stanley B. & Gao Cuilan L. & Zhang Hui, 2012. "Empirical Bayesian Selection of Hypothesis Testing Procedures for Analysis of Sequence Count Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(5), pages 1-32, October.
    18. Eibich, Peter & Ziebarth, Nicolas, 2014. "Examining the Structure of Spatial Health Effects in Germany Using Hierarchical Bayes Models," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 49, pages 305-320.
    19. Shreosi Sanyal & Thierry Rochereau & Cara Nichole Maesano & Laure Com-Ruelle & Isabella Annesi-Maesano, 2018. "Long-Term Effect of Outdoor Air Pollution on Mortality and Morbidity: A 12-Year Follow-Up Study for Metropolitan France," IJERPH, MDPI, vol. 15(11), pages 1-8, November.
    20. Mayer Alvo & Jingrui Mu, 2023. "COVID-19 Data Analysis Using Bayesian Models and Nonparametric Geostatistical Models," Mathematics, MDPI, vol. 11(6), pages 1-13, March.
