
Research on the Prediction of Sustainable Safety Production in Building Construction Based on Text Data

Author

Listed:
  • Jifei Fan

    (School of Civil Engineering, Lanzhou University of Technology, Lanzhou 730050, China)

  • Daopeng Wang

    (School of Civil Engineering, Lanzhou University of Technology, Lanzhou 730050, China)

  • Ping Liu

    (School of Civil Engineering, Lanzhou University of Technology, Lanzhou 730050, China)

  • Jiaming Xu

    (School of Civil Engineering, Lanzhou University of Technology, Lanzhou 730050, China)

Abstract

Given the complexity and variability of modern construction projects, safety risk management has become increasingly challenging, and traditional methods handle complex dynamic environments poorly, particularly those involving unstructured text data. This study therefore proposes a text data-based risk prediction method for building construction safety. First, heuristic Chinese automatic word segmentation, which incorporates mutual information, information entropy statistics, and the TF-IDF algorithm, is used to preprocess the text data, extract risk factor keywords, and construct accident attribute variables. At the same time, the Spearman correlation coefficient is used to eliminate multicollinearity among the feature variables. Next, the XGBoost algorithm is employed to build a model for predicting safety production risk, and its performance is optimized across three experimental scenarios. The results indicate that the model achieves satisfactory overall performance after hyperparameter tuning, with prediction accuracy and F1 score both reaching approximately 86%. Finally, the SHAP model interpretation technique identifies the critical factors influencing safety production risk in building construction, highlighting project managers’ attention to safety, government regulation, safety design, and emergency response as critical determinants of accident severity. The main objective of the study is to minimize human intervention in risk assessment by constructing a text data-based risk prediction model that draws on the rich empirical knowledge embedded in unstructured accident text, with the aim of reducing safety production accidents and promoting the sustainable development of construction safety in the industry. The model enables a shift toward intelligent risk control in safety production and provides theoretical and practical insights for decision-making and technical support in safety production.
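
The Spearman-based screening step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the threshold nor the selection rule, so the greedy keep-or-drop rule, the 0.8 cutoff, and the feature names below are all assumptions chosen for illustration.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treated as uncorrelated (an assumption)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def drop_collinear(features, threshold=0.8):
    """Greedy filter (assumed rule): keep a feature only if its |rho| with
    every previously kept feature does not exceed the threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(spearman(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical accident attribute variables, one value per accident report.
features = {
    "manager_attention": [1, 2, 3, 4, 5, 6],
    "safety_training": [2, 4, 6, 8, 10, 13],  # monotone in manager_attention
    "regulation": [3, 1, 6, 2, 5, 4],
}
kept = drop_collinear(features)
print(kept)
```

Because "safety_training" is a monotone function of "manager_attention" in this toy data, the filter treats the pair as redundant and keeps only the first of the two. The surviving columns would then be passed to the classifier.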

Suggested Citation

  • Jifei Fan & Daopeng Wang & Ping Liu & Jiaming Xu, 2024. "Research on the Prediction of Sustainable Safety Production in Building Construction Based on Text Data," Sustainability, MDPI, vol. 16(12), pages 1-21, June.
  • Handle: RePEc:gam:jsusta:v:16:y:2024:i:12:p:5081-:d:1415190

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/16/12/5081/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/16/12/5081/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ahmad Ibrahim Aljumah & Mohammed T. Nuseir & Md. Mahmudul Alam, 2021. "Traditional marketing analytics, big data analytics and big data system quality and the success of new product development," Post-Print hal-03538161, HAL.
    2. Cano-Marin, Enrique & Mora-Cantallops, Marçal & Sánchez-Alonso, Salvador, 2023. "Twitter as a predictive system: A systematic literature review," Journal of Business Research, Elsevier, vol. 157(C).
    3. de Camargo Fiorini, Paula & Roman Pais Seles, Bruno Michel & Chiappetta Jabbour, Charbel Jose & Barberio Mariano, Enzo & de Sousa Jabbour, Ana Beatriz Lopes, 2018. "Management theory and big data literature: From a review to a research agenda," International Journal of Information Management, Elsevier, vol. 43(C), pages 112-129.
    4. Amiri, Babak & Karimianghadim, Ramin, 2024. "A novel text clustering model based on topic modelling and social network analysis," Chaos, Solitons & Fractals, Elsevier, vol. 181(C).
    5. Aven, Terje & Renn, Ortwin, 2018. "Improving government policy on risk: Eight key principles," Reliability Engineering and System Safety, Elsevier, vol. 176(C), pages 230-241.
    6. Lutfi, Abdalwali & Alrawad, Mahmaod & Alsyouf, Adi & Almaiah, Mohammed Amin & Al-Khasawneh, Ahmad & Al-Khasawneh, Akif Lutfi & Alshira'h, Ahmad Farhan & Alshirah, Malek Hamed & Saad, Mohamed & Ibrahim, 2023. "Drivers and impact of big data analytic adoption in the retail industry: A quantitative investigation applying structural equation modeling," Journal of Retailing and Consumer Services, Elsevier, vol. 70(C).
    7. repec:arp:tjssrr:2019:p:69-75 is not listed on IDEAS
    8. Mohamed Gaber & Edward J. Lusk, 2019. "A Vetting Protocol for the Analytical Procedures Platform for the AP-Phase of PCAOB Audits," Accounting and Finance Research, Sciedu Press, vol. 8(4), pages 1-43, November.
    9. Acharya, Abhilash & Singh, Sanjay Kumar & Pereira, Vijay & Singh, Poonam, 2018. "Big data, knowledge co-creation and decision making in fashion industry," International Journal of Information Management, Elsevier, vol. 42(C), pages 90-101.
    10. Arno de Caigny & Kristof Coussement & Koen W. de Bock & Stefan Lessmann, 2019. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," Post-Print hal-02275958, HAL.
    11. Mussard, Stéphane & Pi Alperin, María Noel, 2021. "Accounting for risk factors on health outcomes: The case of Luxembourg," European Journal of Operational Research, Elsevier, vol. 291(3), pages 1180-1197.
    12. Harkaran Kava & Konstantina Spanaki & Thanos Papadopoulos & Stella Despoudi & Oscar Rodriguez-Espindola & Masoud Fakhimi, 2021. "Data Analytics Diffusion in the UK Renewable Energy Sector: An Innovation Perspective," Post-Print hal-03781046, HAL.
    13. Oesterreich, Thuy Duong & Anton, Eduard & Teuteberg, Frank & Dwivedi, Yogesh K, 2022. "The role of the social and technical factors in creating business value from big data analytics: A meta-analysis," Journal of Business Research, Elsevier, vol. 153(C), pages 128-149.
    14. Zio, E., 2018. "The future of risk assessment," Reliability Engineering and System Safety, Elsevier, vol. 177(C), pages 176-190.
    15. Tasneem Bani-Mustafa & Nicola Pedroni & Enrico Zio & Dominique Vasseur & Francois Beaudouin, 2020. "A hierarchical tree-based decision-making approach for assessing the relative trustworthiness of risk assessment models," Journal of Risk and Reliability, vol. 234(6), pages 748-763, December.
    16. Johannes Habel & Sascha Alavi & Nicolas Heinitz, 2023. "A theory of predictive sales analytics adoption," AMS Review, Springer;Academy of Marketing Science, vol. 13(1), pages 34-54, June.
    17. Judita Peterlin & Maja Meško & Vlado Dimovski & Vasja Roblek, 2021. "Automated content analysis: The review of the big data systemic discourse in tourism and hospitality," Systems Research and Behavioral Science, Wiley Blackwell, vol. 38(3), pages 377-385, May.
    18. Heiner Ackermann & Erik Diessel & Michael Helmling & Neil Jami & Johanna Münch, 2024. "Computing Optimal Mitigation Plans for Force-Majeure Scenarios in Dynamic Manufacturing Chains," SN Operations Research Forum, Springer, vol. 5(2), pages 1-35, June.
    19. Tiago Carneiro & Winnie Ng Picoto & Inês Pinto, 2023. "Big Data Analytics and Firm Performance in the Hotel Sector," Tourism and Hospitality, MDPI, vol. 4(2), pages 1-13, April.
    20. Aigner, Philipp & Schlütter, Sebastian, 2023. "Enhancing gradient capital allocation with orthogonal convexity scenarios," ICIR Working Paper Series 47/23, Goethe University Frankfurt, International Center for Insurance Regulation (ICIR).
    21. Mangirdas Morkunas & Gintaras Cernius & Gintare Giriuniene, 2019. "Assessing Business Risks of Natural Gas Trading Companies: Evidence from GET Baltic," Energies, MDPI, vol. 12(14), pages 1-14, July.
