Printed from https://ideas.repec.org/a/oup/biomet/v110y2023i1p169-185..html

Robust differential abundance test in compositional data

Author

Listed:
  • Shulei Wang

Abstract

Summary: Differential abundance tests for compositional data are essential and fundamental in various biomedical applications, such as single-cell, bulk RNA-seq and microbiome data analysis. However, because of the compositional constraint and the prevalence of zero counts in the data, differential abundance analysis on compositional data remains a complicated and unsolved statistical problem. This article proposes a new differential abundance test, the robust differential abundance test, to address these challenges. Compared with existing methods, the robust differential abundance test is simple and computationally efficient, is robust to prevalent zero counts in compositional datasets, can take the data’s compositional nature into account, and has a theoretical guarantee of controlling false discoveries in a general setting. Furthermore, in the presence of observed covariates, the robust differential abundance test can work with covariate-balancing techniques to remove potential confounding effects and draw reliable conclusions. The proposed test is applied to several numerical examples, and its merits are demonstrated using both simulated and real datasets.

Suggested Citation

  • Shulei Wang, 2023. "Robust differential abundance test in compositional data," Biometrika, Biometrika Trust, vol. 110(1), pages 169-185.
  • Handle: RePEc:oup:biomet:v:110:y:2023:i:1:p:169-185.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/biomet/asac029
    Download Restriction: Access to full text is restricted to subscribers.


    References listed on IDEAS

    1. Ruoqi Yu & Paul R. Rosenbaum, 2019. "Directional penalties for optimal matching in observational studies," Biometrics, The International Biometric Society, vol. 75(4), pages 1380-1390, December.
    2. Yuanpei Cao & Anru Zhang & Hongzhe Li, 2020. "Multisample estimation of bacterial composition matrices in metagenomics data," Biometrika, Biometrika Trust, vol. 107(1), pages 75-92.
    3. Huang Lin & Shyamal Das Peddada, 2020. "Analysis of compositions of microbiomes with bias correction," Nature Communications, Nature, vol. 11(1), pages 1-11, December.
    4. Kosuke Imai & Marc Ratkovic, 2014. "Covariate balancing propensity score," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 243-263, January.
    5. Kim-Anh Lê Cao & Mary-Ellen Costello & Vanessa Anne Lakis & François Bartolo & Xin-Yi Chua & Rémi Brazeilles & Pascale Rondeau, 2016. "MixMC: A Multivariate Statistical Framework to Gain Insight into Microbial Communities," PLOS ONE, Public Library of Science, vol. 11(8), pages 1-21, August.
    6. Efron, Bradley, 2004. "Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 96-104, January.
    7. James T. Morton & Clarisse Marotz & Alex Washburne & Justin Silverman & Livia S. Zaramela & Anna Edlund & Karsten Zengler & Rob Knight, 2019. "Establishing microbial composition measurement standards with reference frames," Nature Communications, Nature, vol. 10(1), pages 1-11, December.
    8. Imbens, Guido W. & Rubin, Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruoqi Yu, 2021. "Evaluating and improving a matched comparison of antidepressants and bone density," Biometrics, The International Biometric Society, vol. 77(4), pages 1276-1288, December.
    2. Pedro H. C. Sant'Anna & Xiaojun Song & Qi Xu, 2022. "Covariate distribution balance via propensity scores," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(6), pages 1093-1120, September.
    3. Caloffi, Annalisa & Freo, Marzia & Ghinoi, Stefano & Mariani, Marco & Rossi, Federica, 2022. "Assessing the effects of a deliberate policy mix: The case of technology and innovation advisory services and innovation vouchers," Research Policy, Elsevier, vol. 51(6).
    4. Ott, Laurent & Weber, Sylvain, 2022. "How effective is carbon taxation on residential heating demand? A household-level analysis," Energy Policy, Elsevier, vol. 160(C).
    5. Soojin Park & Peter M. Steiner & David Kaplan, 2018. "Identification and Sensitivity Analysis for Average Causal Mediation Effects with Time-Varying Treatments and Mediators: Investigating the Underlying Mechanisms of Kindergarten Retention Policy," Psychometrika, Springer;The Psychometric Society, vol. 83(2), pages 298-320, June.
    6. Stefano Carattini & Suphi Sen, 2019. "Carbon Taxes and Stranded Assets: Evidence from Washington State," CESifo Working Paper Series 7785, CESifo.
    7. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    8. Francesca Caselli & Philippe Wingender, 2018. "Bunching at 3 Percent: The Maastricht Fiscal Criterion and Government Deficits," IMF Working Papers 2018/182, International Monetary Fund.
    9. Tamara Bischof & Boris Kaiser, 2021. "Who cares when you close down? The effects of primary care practice closures on patients," Health Economics, John Wiley & Sons, Ltd., vol. 30(9), pages 2004-2025, September.
    10. Tenglong Li & Jordan Lawson, 2021. "A generalized bootstrap procedure of the standard error and confidence interval estimation for inverse probability of treatment weighting," Papers 2109.00171, arXiv.org.
    11. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    12. Dmitry Arkhangelsky & Susan Athey & David A. Hirshberg & Guido W. Imbens & Stefan Wager, 2021. "Synthetic Difference-in-Differences," American Economic Review, American Economic Association, vol. 111(12), pages 4088-4118, December.
    13. Huang Lin & Merete Eggesbø & Shyamal Das Peddada, 2022. "Linear and nonlinear correlation estimators unveil undescribed taxa interactions in microbiome data," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    14. Kallus Nathan & Santacatterina Michele, 2021. "Optimal balancing of time-dependent confounders for marginal structural models," Journal of Causal Inference, De Gruyter, vol. 9(1), pages 345-369, January.
    15. Brian G. Vegetabile & Daniel L. Gillen & Hal S. Stern, 2020. "Optimally balanced Gaussian process propensity scores for estimating treatment effects," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 355-377, January.
    16. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    17. Sakaue, Katsuki & Wokadala, James, 2022. "Effects of including refugees in local government schools on pupils’ learning achievement: Evidence from West Nile, Uganda," International Journal of Educational Development, Elsevier, vol. 90(C).
    18. Orihara, Shunichiro & Hamada, Etsuo, 2021. "Determination of the optimal number of strata for propensity score subclassification," Statistics & Probability Letters, Elsevier, vol. 168(C).
    19. Lundberg, Ian & Brand, Jennie E. & Jeon, Nanum, 2022. "Researcher reasoning meets computational capacity: Machine learning for social science," SocArXiv s5zc8, Center for Open Science.
    20. Nikolay Doudchenko & Guido W. Imbens, 2016. "Balancing, Regression, Difference-In-Differences and Synthetic Control Methods: A Synthesis," NBER Working Papers 22791, National Bureau of Economic Research, Inc.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.