
Bayesian variable selection for high‐dimensional rank data

Authors
  • Can Cui
  • Susheela P. Singh
  • Ana‐Maria Staicu
  • Brian J. Reich

Abstract

The study of microbiomes has become a topic of intense interest in the last several decades as the development of new sequencing technologies has made DNA data accessible across disciplines. In this paper, we analyze a global dataset to investigate environmental factors that affect the topsoil microbiome. To date, much related work has focused on linking indicators of microbial health to specific outcomes in various fields, rather than on understanding how external factors may influence the microbiome composition itself. This is partially due to the limited statistical methods available for modeling abundance counts. The counts are high‐dimensional, overdispersed, often zero‐inflated, and exhibit complex dependence structures. Additionally, the raw counts are often noisy and compositional, and thus are not directly comparable across samples. Practitioners often transform the counts to presence–absence indicators, but this transformation discards much of the data. As an alternative, we propose transforming the counts to taxa ranks and develop a Bayesian variable selection model that uses the ranks to identify covariates that influence microbiome composition. We show by simulation that the proposed model outperforms competitors across various settings, with particular improvement in recall for small‐magnitude and low‐prevalence covariates. When applied to the topsoil data, the proposed method identifies several factors that affect microbiome composition.
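
The contrast the abstract draws between presence–absence indicators and taxa ranks can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy counts, and the choice of average ranks for ties are all assumptions made for illustration, and the Bayesian variable selection model itself is not shown.

```python
from typing import List, Sequence


def average_ranks(counts: Sequence[float]) -> List[float]:
    """Rank taxa within one sample (rank 1 = least abundant).

    Tied counts share the average of the positions they occupy.
    The tie rule is an assumption; the abstract does not specify one.
    """
    order = sorted(range(len(counts)), key=lambda j: counts[j])
    ranks = [0.0] * len(counts)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next taxon has the same count (a tie).
        while end + 1 < len(order) and counts[order[end + 1]] == counts[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def presence_absence(counts: Sequence[float]) -> List[int]:
    """Reduce counts to 0/1 indicators, discarding all ordering information."""
    return [1 if c > 0 else 0 for c in counts]


# Hypothetical toy data: the same four taxa observed in three samples.
sample_a = [0, 5, 50, 500]   # deep sequencing
sample_b = [0, 1, 10, 100]   # shallow sequencing, same relative ordering
sample_c = [0, 0, 7, 3]      # different ordering, with a tie at zero

for name, sample in [("a", sample_a), ("b", sample_b), ("c", sample_c)]:
    print(name, presence_absence(sample), average_ranks(sample))
```

In this toy example, samples a and b receive identical ranks even though their raw counts differ by sequencing depth, while the presence–absence indicators cannot distinguish which of the detected taxa is more abundant in sample c.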

Suggested Citation

  • Can Cui & Susheela P. Singh & Ana‐Maria Staicu & Brian J. Reich, 2021. "Bayesian variable selection for high‐dimensional rank data," Environmetrics, John Wiley & Sons, Ltd., vol. 32(7), November.
  • Handle: RePEc:wly:envmet:v:32:y:2021:i:7:n:e2682
    DOI: 10.1002/env.2682

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. McCabe, Christopher & Brazier, John & Gilks, Peter & Tsuchiya, Aki & Roberts, Jennifer & O'Hagan, Anthony & Stevens, Katherine, 2006. "Using rank data to estimate health state utility models," Journal of Health Economics, Elsevier, vol. 25(3), pages 418-431, May.
    2. Duo Jiang & Thomas Sharpton & Yuan Jiang, 2021. "Microbial Interaction Network Estimation via Bias-Corrected Graphical Lasso," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(2), pages 329-350, July.
    3. Philip Yu, 2000. "Bayesian analysis of order-statistics models for ranking data," Psychometrika, Springer;The Psychometric Society, vol. 65(3), pages 281-299, September.
    4. Lijuan Kong & Qijin Zhao & Xiaojing Jiang & Jinping Hu & Qian Jiang & Li Sheng & Xiaohong Peng & Shusen Wang & Yibing Chen & Yanjun Wan & Shaocong Hou & Xingfeng Liu & Chunxiao Ma & Yan Li & Li Quan &, 2024. "Trimethylamine N-oxide impairs β-cell function and glucose tolerance," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    5. Andreas Wartel & Patrik Lindenfors & Johan Lind, 2019. "Whatever you want: Inconsistent results are the rule, not the exception, in the study of primate brain evolution," PLOS ONE, Public Library of Science, vol. 14(7), pages 1-15, July.
    6. Kerstin Thriene & Karin B. Michels, 2023. "Human Gut Microbiota Plasticity throughout the Life Course," IJERPH, MDPI, vol. 20(2), pages 1-14, January.
    7. Alain Carpentier & Karine Latouche & Pierre Rainelli & . Association of Environmental And Resource Economists, 2002. "Food safety in the demand for meat quality : the case of pork chops in France," Post-Print hal-01937048, HAL.
    8. Poirier, Dale J., 1996. "A Bayesian analysis of nested logit models," Journal of Econometrics, Elsevier, vol. 75(1), pages 163-181, November.
    9. Dennis Fok & Richard Paap & Bram Van Dijk, 2012. "A Rank‐Ordered Logit Model With Unobserved Heterogeneity In Ranking Capabilities," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 27(5), pages 831-846, August.
    10. Peyhardi, Jean & Fernique, Pierre & Durand, Jean-Baptiste, 2021. "Splitting models for multivariate count data," Journal of Multivariate Analysis, Elsevier, vol. 181(C).
    11. Seung Jin Han & Kyoung Hwa Ha & Ja Young Jeon & Hae Jin Kim & Kwan Woo Lee & Dae Jung Kim, 2015. "Impact of Cadmium Exposure on the Association between Lipopolysaccharide and Metabolic Syndrome," IJERPH, MDPI, vol. 12(9), pages 1-14, September.
    12. Magdalena Jastrzębska & Urszula Wachowska & Marta K. Kostrzewska, 2020. "Pathogenic and Non-Pathogenic Fungal Communities in Wheat Grain as Influenced by Recycled Phosphorus Fertilizers: A Case Study," Agriculture, MDPI, vol. 10(6), pages 1-15, June.
    13. Zengliang Jiang & Lai-bao Zhuo & Yan He & Yuanqing Fu & Luqi Shen & Fengzhe Xu & Wanglong Gou & Zelei Miao & Menglei Shuai & Yuhui Liang & Congmei Xiao & Xinxiu Liang & Yunyi Tian & Jiali Wang & Jun T, 2022. "The gut microbiota-bile acid axis links the positive association between chronic insomnia and cardiometabolic diseases," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    14. Xiaoxiao Yuan & Ruirui Wang & Bing Han & ChengJun Sun & Ruimin Chen & Haiyan Wei & Linqi Chen & Hongwei Du & Guimei Li & Yu Yang & Xiaojuan Chen & Lanwei Cui & Zhenran Xu & Junfen Fu & Jin Wu & Wei Gu, 2022. "Functional and metabolic alterations of gut microbiota in children with new-onset type 1 diabetes," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    15. Patrick LeBlanc & Li Ma, 2023. "Microbiome subcommunity learning with logistic‐tree normal latent Dirichlet allocation," Biometrics, The International Biometric Society, vol. 79(3), pages 2321-2332, September.
    16. Shuqi Qin & Dianye Zhang & Bin Wei & Yuanhe Yang, 2024. "Dual roles of microbes in mediating soil carbon dynamics in response to warming," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    17. Aibo Gao & Junlei Su & Ruixin Liu & Shaoqian Zhao & Wen Li & Xiaoqiang Xu & Danjie Li & Juan Shi & Bin Gu & Juan Zhang & Qi Li & Xiaolin Wang & Yifei Zhang & Yu Xu & Jieli Lu & Guang Ning & Jie Hong &, 2021. "Sexual dimorphism in glucose metabolism is shaped by androgen-driven gut microbiome," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    18. Tao Wang & Hongyu Zhao, 2017. "A Dirichlet-tree multinomial regression model for associating dietary nutrients with gut microorganisms," Biometrics, The International Biometric Society, vol. 73(3), pages 792-801, September.
    19. Alessandra N. Bazzano & Kaitlin S. Potts & Lydia A. Bazzano & John B. Mason, 2017. "The Life Course Implications of Ready to Use Therapeutic Food for Children in Low-Income Countries," IJERPH, MDPI, vol. 14(4), pages 1-19, April.
    20. Eryun Zhang & Lihua Jin & Yangmeng Wang & Jui Tu & Ruirong Zheng & Lili Ding & Zhipeng Fang & Mingjie Fan & Ismail Al-Abdullah & Rama Natarajan & Ke Ma & Zhengtao Wang & Arthur D. Riggs & Sarah C. Shu, 2022. "Intestinal AMPK modulation of microbiota mediates crosstalk with brown fat to control thermogenesis," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
