Printed from https://ideas.repec.org/a/plo/pone00/0167055.html

Into the Bowels of Depression: Unravelling Medical Symptoms Associated with Depression by Applying Machine-Learning Techniques to a Community Based Population Sample

Author

Listed:
  • Joanna F Dipnall
  • Julie A Pasco
  • Michael Berk
  • Lana J Williams
  • Seetal Dodd
  • Felice N Jacka
  • Denny Meyer

Abstract

Background: Depression is commonly comorbid with many other somatic diseases and symptoms. Identification of individuals in clusters with comorbid symptoms may reveal new pathophysiological mechanisms and treatment targets. The aim of this research was to combine machine-learning (ML) algorithms with traditional regression techniques by utilising self-reported medical symptoms to identify and describe clusters of individuals with increased rates of depression from a large cross-sectional community-based population epidemiological study.

Methods: A multi-staged methodology utilising ML and traditional statistical techniques was applied to the community-based population National Health and Nutrition Examination Survey (2009–2010) (N = 3,922). A Self-organised Mapping (SOM) ML algorithm, combined with hierarchical clustering, was used to create participant clusters based on 68 medical symptoms. Binary logistic regression, controlling for sociodemographic confounders, was then used to identify the key clusters of participants with higher levels of depression (PHQ-9≥10, n = 377). Finally, a Multiple Additive Regression Tree boosted ML algorithm was run to identify the important medical symptoms for each key cluster within 17 broad categories: heart, liver, thyroid, respiratory, diabetes, arthritis, fractures and osteoporosis, skeletal pain, blood pressure, blood transfusion, cholesterol, vision, hearing, psoriasis, weight, bowels and urinary.

Results: Five clusters of participants, based on medical symptoms, were identified as having significantly increased rates of depression compared to the cluster with the lowest rate: odds ratios ranged from 2.24 (95% CI 1.56, 3.24) to 6.33 (95% CI 1.67, 24.02). The ML boosted regression algorithm identified three key medical condition categories as being significantly more common in these clusters: bowel, pain and urinary symptoms. Bowel-related symptoms were found to dominate the relative importance of symptoms within the five key clusters.

Conclusion: This methodology shows promise for the identification of conditions in general populations and supports the current focus on the potential importance of bowel symptoms and the gut in mental health research.
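The odds ratios and confidence intervals quoted in the Results can be related to one another with a short worked sketch. It assumes the intervals are Wald-type intervals, symmetric on the log-odds scale and built with a normal quantile; the abstract does not say how they were computed, and a survey-adjusted analysis may well use a different quantile. The function names are hypothetical and are not taken from the paper or from any package.

```python
import math

# Two-sided 95% standard normal quantile (assumption: normal-based Wald interval).
Z_95 = 1.959964


def wald_ci_from_log_odds(beta, se, z=Z_95):
    """Return (odds ratio, lower, upper) for a logistic-regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_reported_ci(lower, upper, z=Z_95):
    """Back out the log-scale standard error implied by a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def implied_point_estimate(lower, upper):
    """Geometric mean of the bounds: the odds ratio at the centre of a
    log-symmetric interval."""
    return math.sqrt(lower * upper)


# Smallest and largest odds ratios quoted in the abstract: (OR, lower, upper).
reported = [(2.24, 1.56, 3.24), (6.33, 1.67, 24.02)]

for odds_ratio, lower, upper in reported:
    se = se_from_reported_ci(lower, upper)
    centre = implied_point_estimate(lower, upper)
    print(f"reported OR={odds_ratio}: implied SE(log OR)={se:.3f}, "
          f"interval centre={centre:.2f}")
```

Under these assumptions the interval centres land close to the reported odds ratios, and the implied log-scale standard errors are about 0.19 and 0.68, which is why the second interval is so much wider than the first.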

Suggested Citation

  • Joanna F Dipnall & Julie A Pasco & Michael Berk & Lana J Williams & Seetal Dodd & Felice N Jacka & Denny Meyer, 2016. "Into the Bowels of Depression: Unravelling Medical Symptoms Associated with Depression by Applying Machine-Learning Techniques to a Community Based Population Sample," PLOS ONE, Public Library of Science, vol. 11(12), pages 1-19, December.
  • Handle: RePEc:plo:pone00:0167055
    DOI: 10.1371/journal.pone.0167055

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0167055
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0167055&type=printable
    Download Restriction: no



