
Detecting Suicidal Ideation in Social Media: An Ensemble Method Based on Feature Fusion

Authors

  • Jingfang Liu (School of Management, Shanghai University, Shanghai 201800, China)
  • Mengshi Shi (School of Management, Shanghai University, Shanghai 201800, China)
  • Huihong Jiang (School of Management, Shanghai University, Shanghai 201800, China)

Abstract

Suicide has become a serious problem, and suicide prevention has become an important research topic. Social media provides an ideal platform for monitoring suicidal ideation. This paper presents an integrated model for multidimensional information fusion. By integrating the best classification models determined by single and multiple features, the model combines different feature information to better identify suicidal posts in online social media. This approach was assessed with a dataset formed from 40,222 posts annotated by Weibo. The proposed model ((BSC + RFS)-fs, WEC-fs), which integrates the best classification models for single features and for multidimensional features, achieved 80.61% accuracy and a 79.20% F1-score. Other representative text representation methods and demographic factors related to suicide, which were not considered in this study, may also be important predictors of suicide. To the best of our knowledge, this is a good attempt at fusing feature combination and ensemble algorithms to detect user-generated content with suicidal ideation. The findings suggest that feature combinations do not always work well, and that an appropriate combination strategy can make classification models work better. The information contained in different functional carriers differs, and a targeted choice of classification model may improve the detection rate of suicidal ideation.
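
The abstract does not spell out how the individual classifiers are fused, so the following is only a minimal sketch of one common ensemble strategy (hard majority voting) together with the two reported metrics, accuracy and F1-score. The three "models", their predictions, and the labels are hypothetical toy values and are not taken from the paper; the helper names `majority_vote`, `accuracy`, and `f1_score` are likewise illustrative assumptions.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Fuse per-model label lists into one label list by hard voting."""
    fused = []
    for votes in zip(*predictions_per_model):
        # Pick the label that most models agree on for this post.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


def accuracy(y_true, y_pred):
    """Share of posts whose predicted label matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = post flagged as suicidal ideation, 0 = not flagged.
y_true = [1, 0, 1, 1, 0, 0]
# Assumed outputs of three classifiers, each trained on a different feature set.
model_a = [1, 0, 0, 1, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 0, 1, 0, 0]

fused = majority_vote([model_a, model_b, model_c])
print("fused predictions:", fused)
print("accuracy:", round(accuracy(y_true, fused), 4))
print("F1-score:", round(f1_score(y_true, fused), 4))
```

With an odd number of binary classifiers a vote can never tie, which is why three models are used in the toy data. A weighted or probability-based (soft) vote would be a natural variation, but whether the paper uses either is not stated in the abstract.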

Suggested Citation

  • Jingfang Liu & Mengshi Shi & Huihong Jiang, 2022. "Detecting Suicidal Ideation in Social Media: An Ensemble Method Based on Feature Fusion," IJERPH, MDPI, vol. 19(13), pages 1-13, July.
  • Handle: RePEc:gam:jijerp:v:19:y:2022:i:13:p:8197-:d:855762

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/19/13/8197/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/19/13/8197/
    Download Restriction: no


    Citations


    Cited by:

    1. Wei Pan & Xianbin Wang & Wenwei Zhou & Bowen Hang & Liwen Guo, 2023. "Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches," IJERPH, MDPI, vol. 20(3), pages 1-12, February.
    2. Yun Gu & Deyuan Chen & Xiaoqian Liu, 2022. "Suicide Possibility Scale Detection via Sina Weibo Analytics: Preliminary Results," IJERPH, MDPI, vol. 20(1), pages 1-11, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Barry Moore & Peter Tyler & Dan Elliott, 1991. "The Influence of Regional Development Incentives and Infrastructure on the Location of Small and Medium Sized Companies in Europe," Urban Studies, Urban Studies Journal Limited, vol. 28(6), pages 1001-1026, December.
    2. Ariana Chang & Tian‐Shyug Lee & Hsiu‐Mei Lee, 2024. "Applying sustainable development goals in financial forecasting using machine learning techniques," Corporate Social Responsibility and Environmental Management, John Wiley & Sons, vol. 31(3), pages 2277-2289, May.
    3. Carpentier, Alain & Letort, Elodie, 2009. "Modeling acreage decisions within the multinomial Logit framework," Working Papers 211011, Institut National de la recherche Agronomique (INRA), Departement Sciences Sociales, Agriculture et Alimentation, Espace et Environnement (SAE2).
    4. A. Gupta & D. Kasturiratna & T. Nguyen & L. Pardo, 2006. "A New Family of BAN Estimators for Polytomous Logistic Regression Models based on ϕ- Divergence Measures," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 15(2), pages 159-176, August.
    5. Yang Zhang & Bora Cetin & Tuncer B. Edil, 2021. "Seasonal Performance Evaluation of Pavement Base Using Recycled Materials," Sustainability, MDPI, vol. 13(22), pages 1-15, November.
    6. Jaap Bikker & Adelina Popescu, 2014. "Efficiency and competition in the Dutch non-life insurance industry: Effects of the 2006 health care reform," Working Papers 14-12, Utrecht School of Economics.
    7. Farkas, Sébastien & Lopez, Olivier & Thomas, Maud, 2021. "Cyber claim analysis using Generalized Pareto regression trees with applications to insurance," Insurance: Mathematics and Economics, Elsevier, vol. 98(C), pages 92-105.
    8. Paolo Lazzeroni & Brunella Caroleo & Maurizio Arnone & Cristiana Botta, 2021. "A Simplified Approach to Estimate EV Charging Demand in Urban Area: An Italian Case Study," Energies, MDPI, vol. 14(20), pages 1-18, October.
    9. Emilio Aguirre & Federico García-Suárez & Gabriela Sicilia, 2021. "Eficiencia técnica en la ganadería de carne bovina pastoril. Medición y exploración de sus determinantes en Uruguay," Documentos de Trabajo (working papers) 1321, Department of Economics - dECON.
    10. NGUENA, Christian L., 2012. "Le Financement des PME au Cameroun dans un Contexte de Crise Financière [SMEs Financing issue in Cameroon in the context of Financial Crises]," MPRA Paper 49417, University Library of Munich, Germany, revised 01 Sep 2013.
    11. Lotfi Boudabsa & Damir Filipović, 2022. "Ensemble learning for portfolio valuation and risk management," Papers 2204.05926, arXiv.org.
    12. Eldar Yeskuatov & Sook-Ling Chua & Lee Kien Foo, 2022. "Leveraging Reddit for Suicidal Ideation Detection: A Review of Machine Learning and Natural Language Processing Techniques," IJERPH, MDPI, vol. 19(16), pages 1-20, August.
    13. Yan, Ran & Wang, Shuaian & Du, Yuquan, 2020. "Development of a two-stage ship fuel consumption prediction and reduction model for a dry bulk ship," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 138(C).
    14. Aksoy, Ozan & Yıldırım, Sinan, 2020. "A model of dynamic migration networks: Explaining Turkey's inter-provincial migration flows," SocArXiv rf724, Center for Open Science.
    15. Qi Chu & Guang Bao & Jiayu Sun, 2022. "Progress and Prospects of Destination Image Research in the Last Decade," Sustainability, MDPI, vol. 14(17), pages 1-21, August.
    16. Nystrom, Kaj & Skoglund, Jimmy, 2006. "A credit risk model for large dimensional portfolios with application to economic capital," Journal of Banking & Finance, Elsevier, vol. 30(8), pages 2163-2197, August.
    17. Shaheena Bashir & Edward Carter, 2010. "Penalized multinomial mixture logit model," Computational Statistics, Springer, vol. 25(1), pages 121-141, March.
    18. Israr Ullah & Bilal Aslam & Syed Hassan Iqbal Ahmad Shah & Aqil Tariq & Shujing Qin & Muhammad Majeed & Hans-Balder Havenith, 2022. "An Integrated Approach of Machine Learning, Remote Sensing, and GIS Data for the Landslide Susceptibility Mapping," Land, MDPI, vol. 11(8), pages 1-20, August.
    19. Mariusz Woszczyński & Joanna Rogala-Rojek & Krzysztof Stankiewicz, 2022. "Advancement of the Monitoring System for Arch Support Geometry and Loads," Energies, MDPI, vol. 15(6), pages 1-21, March.
    20. Fok, D. & Paap, R., 2019. "New Misspecification Tests for Multinomial Logit Models," Econometric Institute Research Papers EI2019-24, Erasmus University Rotterdam, Erasmus School of Economics (ESE), Econometric Institute.
