Printed from https://ideas.repec.org/a/gam/jijerp/v21y2024i7p831-d1422484.html

A Review of Data Mining Strategies by Data Type, with a Focus on Construction Processes and Health and Safety Management

Author

Listed:
  • Antonella Pireddu

    (Department of Technological Innovations and Safety of Plants, Products and Anthropic Settlements (DIT), Italian National Institute for Insurance against Accidents at Work, Inail, 00144 Rome, Italy)

  • Angelico Bedini

    (Department of Technological Innovations and Safety of Plants, Products and Anthropic Settlements (DIT), Italian National Institute for Insurance against Accidents at Work, Inail, 00144 Rome, Italy)

  • Mara Lombardi

    (Department of Chemical Engineering Materials Environment (DICMA), Sapienza-University of Rome, 00184 Rome, Italy)

  • Angelo L. C. Ciribini

    (Department of Civil Engineering, Architecture, Land, Environment and Mathematics (DICATAM), Brescia University, 25121 Brescia, Italy)

  • Davide Berardi

    (Department of Chemical Engineering Materials Environment (DICMA), Sapienza-University of Rome, 00184 Rome, Italy)

Abstract

Increasingly, information technology facilitates the storage and management of data useful for risk analysis and event prediction. Studies on data extraction related to occupational health and safety are increasingly available; however, due to its variability, the construction sector warrants special attention. This review is conducted under the research programs of the National Institute for Occupational Accident Insurance (Inail). Objectives: The research question focuses on identifying which data mining (DM) methods, among supervised, unsupervised, and others, are most appropriate for certain investigation objectives, types, and sources of data, as defined by the authors. Methods: Scopus and ProQuest were the main sources from which we extracted studies in the field of construction, published between 2014 and 2023. The eligibility criteria applied in the selection of studies were based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). For exploratory purposes, we applied hierarchical clustering, while for in-depth analysis, we used principal component analysis (PCA) and meta-analysis. Results: The search strategy based on the PRISMA eligibility criteria provided us with 63 out of 2234 potential articles, 206 observations, 89 methodologies, 4 survey purposes, 3 data sources, 7 data types, and 3 resource types. Cluster analysis and PCA organized the information included in the paper dataset into two dimensions with the following labels: the first, “supervised methods, institutional dataset, and predictive and classificatory purposes” (correlation 0.97–8.18 × 10⁻¹; p-value 7.67 × 10⁻⁵⁵–1.28 × 10⁻²²), and the second, Dim2, “not-supervised methods; project, simulation, literature, text data; monitoring, decision-making processes; machinery and environment” (correlation 0.84–0.47; p-value 5.79 × 10⁻²⁵–3.59 × 10⁻⁶).
We answered the research question regarding which method, among supervised, unsupervised, or other, is most suitable for application to data in the construction industry. Conclusions: The meta-analysis provided an overall estimate of the better effectiveness of supervised methods (Odds Ratio = 0.71, Confidence Interval 0.53–0.96) compared to not-supervised methods.
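
The odds ratio and confidence interval reported above can be read more easily with a small worked example. The sketch below is not the authors' pooled meta-analysis: it computes an odds ratio and a Wald-type interval for a single 2×2 table. The function name `odds_ratio_ci`, the cell counts, and the choice of z = 1.96 (a 95% interval) are all assumptions made for illustration; the abstract states neither the interval level nor the underlying counts.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for one 2x2 table.

    a, b: events and non-events in the first group
    c, d: events and non-events in the second group
    z:    normal quantile; 1.96 gives a 95% interval (assumed here)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts, invented for illustration only (not from the paper)
estimate, lower, upper = odds_ratio_ci(20, 80, 30, 70)
print(f"OR = {estimate:.2f}, CI = {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1, such as the reported 0.53–0.96, indicates that the difference between the two groups is unlikely to be due to chance alone at the chosen level.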

Suggested Citation

  • Antonella Pireddu & Angelico Bedini & Mara Lombardi & Angelo L. C. Ciribini & Davide Berardi, 2024. "A Review of Data Mining Strategies by Data Type, with a Focus on Construction Processes and Health and Safety Management," IJERPH, MDPI, vol. 21(7), pages 1-26, June.
  • Handle: RePEc:gam:jijerp:v:21:y:2024:i:7:p:831-:d:1422484

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/21/7/831/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/21/7/831/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shouxiang Wang & Pengfei Dong & Yingjie Tian, 2017. "A Novel Method of Statistical Line Loss Estimation for Distribution Feeders Based on Feeder Cluster and Modified XGBoost," Energies, MDPI, vol. 10(12), pages 1-17, December.
    2. Pawel Dlotko & Wanling Qiu & Simon Rudkin, 2022. "Topological Data Analysis Ball Mapper for Finance," Papers 2206.03622, arXiv.org.
    3. J. Fernando Vera & Rodrigo Macías, 2021. "On the Behaviour of K-Means Clustering of a Dissimilarity Matrix by Means of Full Multidimensional Scaling," Psychometrika, Springer;The Psychometric Society, vol. 86(2), pages 489-513, June.
    4. Katarzyna Boczkowska & Konrad Niziołek & Elżbieta Roszko-Wójtowicz, 2022. "A multivariate approach towards the measurement of active employee participation in the area of occupational health and safety in different sectors of the economy," Equilibrium. Quarterly Journal of Economics and Economic Policy, Institute of Economic Research, vol. 17(4), pages 1051-1085, December.
    5. Sara Dolnicar & Friedrich Leisch, 2017. "Using segment level stability to select target segments in data-driven market segmentation studies," Marketing Letters, Springer, vol. 28(3), pages 423-436, September.
    6. Muhamad Rizki & Muhammad Zudhy Irawan & Puspita Dirgahayani & Prawira Fajarindra Belgiawan & Retno Wihanesta, 2022. "Low Emission Zone (LEZ) Expansion in Jakarta: Acceptability and Restriction Preference," Sustainability, MDPI, vol. 14(19), pages 1-22, September.
    7. Aslani, Mehrdad & Faraji, Jamal & Hashemi-Dezaki, Hamed & Ketabi, Abbas, 2022. "A novel clustering-based method for reliability assessment of cyber-physical microgrids considering cyber interdependencies and information transmission errors," Applied Energy, Elsevier, vol. 315(C).
    8. Mohanad Kamil Buniya & Idris Othman & Serdar Durdyev & Riza Yosia Sunindijo & Syuhaida Ismail & Ahmed Farouk Kineber, 2021. "Safety Program Elements in the Construction Industry: The Case of Iraq," IJERPH, MDPI, vol. 18(2), pages 1-13, January.
    9. Maryam Pishgar & Salah Fuad Issa & Margaret Sietsema & Preethi Pratap & Houshang Darabi, 2021. "REDECA: A Novel Framework to Review Artificial Intelligence and Its Applications in Occupational Safety and Health," IJERPH, MDPI, vol. 18(13), pages 1-42, June.
    10. Mei Liu & Boning Li & Hongjun Cui & Pin-Chao Liao & Yuecheng Huang, 2022. "Research Paradigm of Network Approaches in Construction Safety and Occupational Health," IJERPH, MDPI, vol. 19(19), pages 1-22, September.
    11. Yin Junjia & Aidi Hizami Alias & Nuzul Azam Haron & Nabilah Abu Bakar, 2023. "A Bibliometric Review on Safety Risk Assessment of Construction Based on CiteSpace Software and WoS Database," Sustainability, MDPI, vol. 15(15), pages 1-24, August.
    12. Xiaoming Yuan & Yueqi Bi & Mingrui Hao & Qiang Ji & Zhigeng Liu & Jiusheng Bao, 2022. "Research on Location Estimation for Coal Tunnel Vehicle Based on Ultra-Wide Band Equipment," Energies, MDPI, vol. 15(22), pages 1-17, November.
    13. Kyunghwan Kim & Kangeun Kim & Soyoon Jeong, 2023. "Application of YOLO v5 and v8 for Recognition of Safety Risk Factors at Construction Sites," Sustainability, MDPI, vol. 15(20), pages 1-17, October.
    14. J. Fernando Vera & Rodrigo Macías, 2017. "Variance-Based Cluster Selection Criteria in a K-Means Framework for One-Mode Dissimilarity Data," Psychometrika, Springer;The Psychometric Society, vol. 82(2), pages 275-294, June.
    15. Cristina Tortora & Mireille Gettler Summa & Marina Marino & Francesco Palumbo, 2016. "Factor probabilistic distance clustering (FPDC): a new clustering method," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 10(4), pages 441-464, December.
    16. Mohamed Zul Fadhli Khairuddin & Puat Lu Hui & Khairunnisa Hasikin & Nasrul Anuar Abd Razak & Khin Wee Lai & Ahmad Shakir Mohd Saudi & Siti Salwa Ibrahim, 2022. "Occupational Injury Risk Mitigation: Machine Learning Approach and Feature Optimization for Smart Workplace Surveillance," IJERPH, MDPI, vol. 19(21), pages 1-19, October.
    17. Jaehong Yu & Hua Zhong & Seoung Bum Kim, 2020. "An Ensemble Feature Ranking Algorithm for Clustering Analysis," Journal of Classification, Springer;The Classification Society, vol. 37(2), pages 462-489, July.
    18. Haoyang Ping & Zhuocheng Li & Xizhu Shen & Haizhen Sun, 2024. "Optimization of Vegetable Restocking and Pricing Strategies for Innovating Supermarket Operations Utilizing a Combination of ARIMA, LSTM, and FP-Growth Algorithms," Mathematics, MDPI, vol. 12(7), pages 1-17, March.
    19. Dogan Gursoy & Anna Maria Parroco & Raffaele Scuderi, 2013. "An Examination of Tourist Arrivals Dynamics Using Short-Term Time Series Data: A Space—Time Cluster Approach," Tourism Economics, vol. 19(4), pages 761-777, August.
    20. Arso M. Vukićević & Ivan Mačužić & Marko Djapan & Vladimir Milićević & Luiza Shamina, 2021. "Digital Training and Advanced Learning in Occupational Safety and Health Based on Modern and Affordable Technologies," Sustainability, MDPI, vol. 13(24), pages 1-13, December.
