
A Comprehensive Performance Comparison Study of Various Statistical Models that Accommodate Challenges of the Gut Microbiome Data

Author

Listed:
  • Morteza Hajihosseini

    (University of Alberta)

  • Payam Amini

    (Keele University)

  • Alireza Saidi-Mehrabad

    (Division of Hydrological Sciences)

  • Nastaran Hajizadeh

    (University of Alberta)

  • Anita L. Kozyrskyj

    (University of Alberta)

  • Irina Dinu

    (University of Alberta)

Abstract

The human gut microbiome comprises trillions of symbiotic bacteria that colonize the gut after birth and play an essential role in maintaining human health. Various factors can influence the microbiome, delaying the normal maturation of the gut microbiota and leading to the onset of various diseases. Studying gut microbiome composition therefore offers evidence for early disease detection and opportunities for intervention. Stool samples analyzed by high-throughput sequencing of 16S ribosomal RNA usually yield a count table (number of reads) of detected species per sample in the form of amplicon sequence variants (ASVs). ASV count data pose several inherent challenges, such as over-dispersion, within-sample correlation, and a large number of zeros. Appropriate statistical methods are necessary to measure the effect of important factors on the gut microbial community while addressing these challenges. This paper compared the behavior of the most common statistical methods that accommodate the challenges of gut microbiome data in a comprehensive simulation study. Sixty-seven percent of our simulation scenarios indicated that the zero-inflated negative binomial (ZINB) model had a lower mean squared error than the other methods, and the zero-inflated Gaussian mixture model had better statistical power. The real-data application to the SKOT Cohorts dataset showed the effect of maternal obesity on the taxon abundance of infants at the 9- and 18-month assessments. Our study showed that some of the more recent methods could adequately accommodate the challenges in gut microbiome data without requiring data transformation or normalization.
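The two count-data challenges named in the abstract, over-dispersion and excess zeros, can be illustrated with a minimal simulation sketch. It assumes a standard zero-inflated negative binomial parameterization (mean `mu`, dispersion `size`, structural-zero probability `pi_zero`). The function names and all parameter values are hypothetical, chosen for illustration only; this is not the simulation design used in the paper.

```python
import numpy as np


def simulate_zinb(n_samples, mu, size, pi_zero, seed=0):
    """Draw counts from a zero-inflated negative binomial (illustrative only).

    mu      : mean of the negative binomial component (assumed value)
    size    : dispersion parameter; smaller means more over-dispersion
    pi_zero : probability that a count is a structural zero
    """
    rng = np.random.default_rng(seed)
    # NumPy parameterizes the negative binomial by (n, p);
    # p = size / (size + mu) gives the component a mean of mu.
    p = size / (size + mu)
    counts = rng.negative_binomial(size, p, size=n_samples)
    # Overwrite a random subset of draws with structural zeros.
    structural_zero = rng.random(n_samples) < pi_zero
    counts[structural_zero] = 0
    return counts


def summarize(counts):
    """Return the mean, variance-to-mean ratio, and fraction of zeros."""
    mean = counts.mean()
    return {
        "mean": float(mean),
        "variance_to_mean": float(counts.var(ddof=1) / mean),
        "zero_fraction": float((counts == 0).mean()),
    }


# Hypothetical settings for a single taxon across 5000 samples.
counts = simulate_zinb(n_samples=5000, mu=20.0, size=0.5, pi_zero=0.3)
stats = summarize(counts)

# A Poisson model with the same mean would predict this fraction of zeros.
poisson_zero_fraction = float(np.exp(-stats["mean"]))

print(stats)
print("zero fraction expected under Poisson:", poisson_zero_fraction)
```

A variance-to-mean ratio well above 1 signals over-dispersion, and an observed zero fraction far above the Poisson expectation signals zero inflation. These are the features that models such as ZINB are designed to absorb without prior transformation of the counts.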

Suggested Citation

  • Morteza Hajihosseini & Payam Amini & Alireza Saidi-Mehrabad & Nastaran Hajizadeh & Anita L. Kozyrskyj & Irina Dinu, 2025. "A Comprehensive Performance Comparison Study of Various Statistical Models that Accommodate Challenges of the Gut Microbiome Data," Statistics in Biosciences, Springer; International Chinese Statistical Association, vol. 17(1), pages 216-231, April.
  • Handle: RePEc:spr:stabio:v:17:y:2025:i:1:d:10.1007_s12561-024-09435-8
    DOI: 10.1007/s12561-024-09435-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s12561-024-09435-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

