
A New Text-Mining–Bayesian Network Approach for Identifying Chemical Safety Risk Factors

Authors

  • Zhiyong Zhou

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Jianhui Huang

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Yao Lu

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Hongcai Ma

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Wenwen Li

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

  • Jianhong Chen

    (School of Resources and Safety Engineering, Central South University, Changsha 410083, China)

Abstract

Frequent accidents in the chemical industry have caused serious economic losses and negative social impacts. Chemical accident investigation reports are of great value for analyzing the risk factors involved. However, traditional manual analysis is time-consuming and labor-intensive, and existing keyword extraction methods leave room for improvement. This study proposes an improved text-mining method for analyzing a large number of chemical accident reports. A workflow was designed for building and updating word-segmentation lexicons. An improved keyword extraction algorithm was proposed and used to extract the top 100 keywords from 330 incident reports. Standardizing these keywords yielded 51 safety risk factors. The Apriori algorithm produced 294 strong association rules, on which a Bayesian network was built to analyze the safety risk factors. In the comparison experiments, the mean accuracy and mean recall of the BM25 model were 10.5% and 14.38% higher, respectively, than those of TF-IDF. The results of association-rule mining and Bayesian network analysis clearly show the interrelationships among the safety risk factors. The methodology can quickly and efficiently extract key information from incident reports, which can provide managers with new insights and suggestions.
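
The association-rule step mentioned in the abstract can be illustrated with a minimal Python sketch of Apriori. This is not the authors' implementation: the toy reports, the risk-factor names, and the thresholds (min_support = 0.4, min_confidence = 0.8) are assumptions chosen only for illustration, and the function names apriori and strong_rules are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) for strong rules."""
    found = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so the
                # lookup of the antecedent's support always succeeds.
                confidence = supp / frequent[a]
                if confidence >= min_confidence:
                    found.append((a, itemset - a, supp, confidence))
    return found


# Hypothetical data: each set holds the risk factors tagged in one report.
reports = [
    {"illegal operation", "inadequate training", "explosion"},
    {"illegal operation", "inadequate training"},
    {"illegal operation", "explosion"},
    {"inadequate training", "equipment failure"},
    {"illegal operation", "inadequate training", "explosion"},
]

frequent = apriori(reports, min_support=0.4)
strong = strong_rules(frequent, min_confidence=0.8)
for antecedent, consequent, supp, conf in strong:
    print(sorted(antecedent), "->", sorted(consequent), round(supp, 2), round(conf, 2))
```

In a pipeline like the one the abstract describes, rules of this kind could suggest candidate edges between risk-factor nodes of a Bayesian network; how the authors map rules to network structure is not stated on this page.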

Suggested Citation

  • Zhiyong Zhou & Jianhui Huang & Yao Lu & Hongcai Ma & Wenwen Li & Jianhong Chen, 2022. "A New Text-Mining–Bayesian Network Approach for Identifying Chemical Safety Risk Factors," Mathematics, MDPI, vol. 10(24), pages 1-25, December.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:24:p:4815-:d:1007136

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/24/4815/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/24/4815/
    Download Restriction: no
