IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v10y2022i24p4815-d1007136.html

A New Text-Mining–Bayesian Network Approach for Identifying Chemical Safety Risk Factors

Author

Listed:
  • Zhiyong Zhou

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Jianhui Huang

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Yao Lu

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Hongcai Ma

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Wenwen Li

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Jianhong Chen

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

Abstract

Frequent accidents in the chemical industry have caused serious economic losses and negative social impacts. Chemical accident investigation reports are of great value for analyzing the risk factors involved. However, traditional manual analysis is time-consuming and labor-intensive, and existing keyword extraction methods still need improvement. This study proposes an improved text-mining method for analyzing a large number of chemical accident reports. A workflow was designed for building and updating the lexicons used for word segmentation. An improved keyword extraction algorithm was proposed to extract the top 100 keywords from 330 incident reports. A total of 51 safety risk factors were obtained by standardizing these keywords. In all, 294 strong association rules were obtained with the Apriori algorithm. Based on these rules, a Bayesian network was built to analyze the safety risk factors. In the comparison experiments, the mean accuracy and mean recall of the BM25 model were 10.5% and 14.38% higher, respectively, than those of TF-IDF. The results of association-rule mining and Bayesian network analysis clearly demonstrate the interrelationships between the safety risk factors. The methodology can quickly and efficiently extract key information from incident reports, which can provide managers with new insights and suggestions.
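
To make the association-rule step in the abstract concrete, the following is a minimal Python sketch of Apriori-style mining: frequent sets of risk factors are found level by level, and "strong" rules are those that meet both a minimum support and a minimum confidence. It is an illustration only, not the authors' implementation. The toy reports, the factor names, and both threshold values are hypothetical; the abstract does not state the thresholds used in the study.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of reports that contain every factor in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single factor seen in any report.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its k-1 subsets are frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) for each strong rule."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = sup / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, sup, confidence))
    return rules


# Hypothetical toy data: each set holds the risk factors tagged in one report.
reports = [
    {"leak", "ignition source", "explosion"},
    {"leak", "ignition source", "explosion"},
    {"leak", "poor ventilation"},
    {"ignition source", "explosion"},
    {"leak", "ignition source"},
]

frequent = apriori(reports, min_support=0.3)      # assumed threshold
rules = strong_rules(frequent, min_confidence=0.9)  # assumed threshold

for antecedent, consequent, sup, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(sup, 2), round(conf, 2))
```

In a pipeline like the one the abstract describes, rules of this kind could suggest candidate edges between risk factors when building a Bayesian network; how the study actually maps rules to network structure is not specified in the abstract.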

Suggested Citation

  • Zhiyong Zhou & Jianhui Huang & Yao Lu & Hongcai Ma & Wenwen Li & Jianhong Chen, 2022. "A New Text-Mining–Bayesian Network Approach for Identifying Chemical Safety Risk Factors," Mathematics, MDPI, vol. 10(24), pages 1-25, December.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:24:p:4815-:d:1007136

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/24/4815/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/24/4815/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Di Zhang & Xinping Yan & Zaili Yang & Jin Wang, 2014. "An accident data–based approach for congestion risk assessment of inland waterways: A Yangtze River case," Journal of Risk and Reliability, vol. 228(2), pages 176-188, April.
    2. Zhang, Quanzhong & Wei, Haiyan & Liu, Jing & Zhao, Zefang & Ran, Qiao & Gu, Wei, 2021. "A Bayesian network with fuzzy mathematics for species habitat suitability analysis: A case with limited Angelica sinensis (Oliv.) Diels data," Ecological Modelling, Elsevier, vol. 450(C).
    3. Jim Lewis & Kerrie Mengersen & Laurie Buys & Desley Vine & John Bell & Peter Morris & Gerard Ledwich, 2015. "Systems Modelling of the Socio-Technical Aspects of Residential Electricity Use and Network Peak Demand," PLOS ONE, Public Library of Science, vol. 10(7), pages 1-21, July.
    4. Nicholson, Ann E. & Flores, M. Julia, 2011. "Combining state and transition models with dynamic Bayesian networks," Ecological Modelling, Elsevier, vol. 222(3), pages 555-566.
    5. Moe, S. Jannicke & Haande, Sigrid & Couture, Raoul-Marie, 2016. "Climate change, cyanobacteria blooms and ecological status of lakes: A Bayesian network approach," Ecological Modelling, Elsevier, vol. 337(C), pages 330-347.
    6. Meineri, Eric & Dahlberg, C. Johan & Hylander, Kristoffer, 2015. "Using Gaussian Bayesian Networks to disentangle direct and indirect associations between landscape physiography, environmental variables and species distribution," Ecological Modelling, Elsevier, vol. 313(C), pages 127-136.
    7. Mostafa Shaaban & Carmen Schwartz & Joseph Macpherson & Annette Piorr, 2021. "A Conceptual Model Framework for Mapping, Analyzing and Managing Supply–Demand Mismatches of Ecosystem Services in Agricultural Landscapes," Land, MDPI, vol. 10(2), pages 1-19, January.
    8. De Iuliis, Melissa & Kammouh, Omar & Cimellaro, Gian Paolo & Tesfamariam, Solomon, 2021. "Quantifying restoration time of power and telecommunication lifelines after earthquakes using Bayesian belief network model," Reliability Engineering and System Safety, Elsevier, vol. 208(C).
    9. Dayong Li & Zengchuan Dong & Liyao Shi & Jintao Liu & Zhenye Zhu & Wei Xu, 2019. "Risk Probability Assessment of Sudden Water Pollution in the Plain River Network Based on Random Discharge from Multiple Risk Sources," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 33(12), pages 4051-4065, September.
    10. Arno de Caigny & Kristof Coussement & Koen W. de Bock & Stefan Lessmann, 2019. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," Post-Print hal-02275958, HAL.
    11. Tiller, Rachel Gjelsvik & Hansen, Lillian & Richards, Russell & Strand, Hillevi, 2015. "Work segmentation in the Norwegian salmon industry: The application of segmented labor market theory to work migrants on the island community of Frøya, Norway," Marine Policy, Elsevier, vol. 51(C), pages 563-572.
    12. Leonel Lara-Estrada & Livia Rasche & L. Enrique Sucar & Uwe A. Schneider, 2018. "Inferring Missing Climate Data for Agricultural Planning Using Bayesian Networks," Land, MDPI, vol. 7(1), pages 1-13, January.
    13. Iva Salov & Aleksandra Krajnovic & Ante Panjkota, 2017. "Relation between Data Mining and Business Fields in the Four Dimensional CRM Model," MIC 2017: Managing the Global Economy; Proceedings of the Joint International Conference, Monastier di Treviso, Italy, 24–27 May 2017,, University of Primorska Press.
    14. Jinjia Zhang & Yiping Zeng & Genserik Reniers & Jie Liu, 2022. "Analysis of the Interaction Mechanism of the Risk Factors of Gas Explosions in Chinese Underground Coal Mines," IJERPH, MDPI, vol. 19(2), pages 1-18, January.
    15. Nguyen, Minh-Hoang, 2023. "Investigating urban residents' involvement in biodiversity conservation in protected areas: Empirical evidence from Vietnam," Thesis Commons z2hjv, Center for Open Science.
    16. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W. & Lessmann, Stefan, 2020. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1563-1578.
    17. Johannes Habel & Sascha Alavi & Nicolas Heinitz, 2023. "A theory of predictive sales analytics adoption," AMS Review, Springer;Academy of Marketing Science, vol. 13(1), pages 34-54, June.
    18. Li, Gong & Shi, Jing, 2012. "Applications of Bayesian methods in wind energy conversion systems," Renewable Energy, Elsevier, vol. 43(C), pages 1-8.
    19. Adumene, Sidum & Khan, Faisal & Adedigba, Sunday & Zendehboudi, Sohrab & Shiri, Hodjat, 2021. "Dynamic risk analysis of marine and offshore systems suffering microbial induced stochastic degradation," Reliability Engineering and System Safety, Elsevier, vol. 207(C).
    20. Meyer, Spencer R. & Johnson, Michelle L. & Lilieholm, Robert J. & Cronan, Christopher S., 2014. "Development of a stakeholder-driven spatial modeling framework for strategic landscape planning using Bayesian networks across two urban-rural gradients in Maine, USA," Ecological Modelling, Elsevier, vol. 291(C), pages 42-57.
