Printed from https://ideas.repec.org/a/gam/jijerp/v19y2022i5p2925-d762712.html

Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHAPley Additive exPlanations

Author

Listed:
  • Sheng Dong

    (School of Civil and Transportation Engineering, Ningbo University of Technology, Fenghua Road No. 201, Ningbo 315211, China)

  • Afaq Khattak

    (The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, 4800 Cao’an Road, Jiading, Shanghai 201804, China)

  • Irfan Ullah

    (Department of Civil Engineering, International Islamic University, Sector H-10, Islamabad 1243, Pakistan)

  • Jibiao Zhou

    (College of Transportation Engineering, Tongji University, 4800 Cao’an Road, Jiading, Shanghai 201804, China)

  • Arshad Hussain

    (NUST Institute of Civil Engineering, National University of Sciences and Technology, Sector H-12, Islamabad 44000, Pakistan)

Abstract

Road traffic accidents are one of the world’s most serious problems, causing numerous fatalities and injuries as well as economic losses each year. Assessing the factors that contribute to the severity of road traffic injuries has proven insightful, and the findings may contribute to a better understanding, and potential mitigation, of the risk of serious injuries associated with crashes. Ensemble learning approaches can establish complex, non-linear relationships between input risk variables and outcomes for injury severity prediction and classification, but most of them share a critical limitation: their “black-box” nature. To develop interpretable predictive models for road traffic injury severity, this paper proposes four boosting-based ensemble learning models, namely a novel Natural Gradient Boosting, Adaptive Gradient Boosting, Categorical Gradient Boosting, and Light Gradient Boosting Machine (LightGBM), and uses the recently developed SHapley Additive exPlanations (SHAP) analysis to rank the risk variables and explain the optimal model. Among the four models, LightGBM achieved the highest classification accuracy (73.63%), precision (72.61%), recall (70.09%), F1-score (70.81%), and AUC (0.71) when tested on 2015–2019 accident data from Pakistan’s National Highway N-5 (Peshawar to Rahim Yar Khan Section). Incorporating the SHAP approach made it possible to interpret the model’s estimation results from both global and local perspectives. The interpretation showed that Month_of_Year, Cause_of_Accident, Driver_Age, and Collision_Type all played a significant role in the estimation process. According to the analysis, young drivers and pedestrians struck by a trailer have a higher risk of suffering fatal injuries. The combination of trailers and passenger vehicles, as well as at-fault drivers, pedestrian strikes, and rear-end collisions, significantly increases the risk of fatal injuries. This study suggests that combining LightGBM and SHAP has the potential to yield an interpretable model for predicting road traffic injury severity.
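
The additive attribution idea behind SHAP can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the paper applies SHAP to a trained LightGBM model, whereas the code below computes exact Shapley values by enumerating feature coalitions for a hypothetical toy scoring function. The function `toy_score`, its coefficients, and the three binary features are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for small toy problems.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy "fatality score"; NOT the model from the paper.
def toy_score(z):
    young_driver, pedestrian_hit, trailer = z
    return (0.1 + 0.2 * young_driver + 0.3 * pedestrian_hit
            + 0.25 * pedestrian_hit * trailer)


names = ["young_driver", "pedestrian_hit", "trailer"]
x = [1, 1, 1]          # the crash record being explained
baseline = [0, 0, 0]   # reference record with all risk factors absent

phi = shapley_values(toy_score, x, baseline)

# Global-style ranking: order features by absolute contribution.
ranking = sorted(zip(names, phi), key=lambda pair: -abs(pair[1]))
for name, contribution in ranking:
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the score for the explained record and the score for the baseline, which is the additivity property that makes the per-feature values interpretable. The interaction term between `pedestrian_hit` and `trailer` is split evenly between those two features.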

Suggested Citation

  • Sheng Dong & Afaq Khattak & Irfan Ullah & Jibiao Zhou & Arshad Hussain, 2022. "Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHAPley Additive exPlanations," IJERPH, MDPI, vol. 19(5), pages 1-23, March.
  • Handle: RePEc:gam:jijerp:v:19:y:2022:i:5:p:2925-:d:762712

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/19/5/2925/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/19/5/2925/
    Download Restriction: no

    References listed on IDEAS

    1. Khaled Assi & Syed Masiur Rahman & Umer Mansoor & Nedal Ratrout, 2020. "Predicting Crash Injury Severity with Machine Learning Algorithm Synergized with Clustering Technique: A Promising Protocol," IJERPH, MDPI, vol. 17(15), pages 1-17, July.
    2. Haorong Peng & Xiaoxiang Ma & Feng Chen, 2020. "Examining Injury Severity of Pedestrians in Vehicle–Pedestrian Crashes at Mid-Blocks Using Path Analysis," IJERPH, MDPI, vol. 17(17), pages 1-16, August.
    3. Morakot Worachairungreung & Sarawut Ninsawat & Apichon Witayangkurn & Matthew N. Dailey, 2021. "Identification of Road Traffic Injury Risk Prone Area Using Environmental Factors by Machine Learning Classification in Nonthaburi, Thailand," Sustainability, MDPI, vol. 13(7), pages 1-25, April.
    4. Feng Chen & Mingtao Song & Xiaoxiang Ma, 2019. "Investigation on the Injury Severity of Drivers in Rear-End Collisions Between Cars Using a Random Parameters Bivariate Ordered Probit Model," IJERPH, MDPI, vol. 16(14), pages 1-12, July.
    5. Natalia Casado-Sanz & Begoña Guirao & Maria Attard, 2020. "Analysis of the Risk Factors Affecting the Severity of Traffic Accidents on Spanish Crosstown Roads: The Driver’s Perspective," Sustainability, MDPI, vol. 12(6), pages 1-26, March.
    6. Feng Chen & Suren Chen & Xiaoxiang Ma, 2016. "Crash Frequency Modeling Using Real-Time Environmental and Traffic Data and Unbalanced Panel Data Models," IJERPH, MDPI, vol. 13(6), pages 1-16, June.
    7. Feng Chen & Xiaoxiang Ma & Suren Chen & Lin Yang, 2016. "Crash Frequency Analysis Using Hurdle Models with Random Effects Considering Short-Term Panel Data," IJERPH, MDPI, vol. 13(11), pages 1-11, October.
    8. Qiang Zeng & Wei Hao & Jaeyoung Lee & Feng Chen, 2020. "Investigating the Impacts of Real-Time Weather Conditions on Freeway Crash Severity: A Bayesian Spatial Analysis," IJERPH, MDPI, vol. 17(8), pages 1-15, April.
    9. Jian-feng Xi & Hai-zhu Liu & Wei Cheng & Zhong-hao Zhao & Tong-qiang Ding, 2014. "The Model of Severity Prediction of Traffic Crash on the Curve," Mathematical Problems in Engineering, Hindawi, vol. 2014, pages 1-5, January.
    10. Chen Zhang & Jie He & Yinhai Wang & Xintong Yan & Changjian Zhang & Yikai Chen & Ziyang Liu & Bojian Zhou, 2020. "A Crash Severity Prediction Method Based on Improved Neural Network and Factor Analysis," Discrete Dynamics in Nature and Society, Hindawi, vol. 2020, pages 1-13, June.

    Citations



    Cited by:

    1. Mansoor, Umer & Jamal, Arshad & Su, Junbiao & Sze, N.N. & Chen, Anthony, 2023. "Investigating the risk factors of motorcycle crash injury severity in Pakistan: Insights and policy recommendations," Transport Policy, Elsevier, vol. 139(C), pages 21-38.
    2. Aleksandar Aleksić & Milan Ranđelović & Dragan Ranđelović, 2023. "Using Machine Learning in Predicting the Impact of Meteorological Parameters on Traffic Incidents," Mathematics, MDPI, vol. 11(2), pages 1-30, January.
    3. Roksana Asadi & Afaq Khattak & Hossein Vashani & Hamad R. Almujibah & Helia Rabie & Seyedamirhossein Asadi & Branislav Dimitrijevic, 2023. "Self-Paced Ensemble-SHAP Approach for the Classification and Interpretation of Crash Severity in Work Zone Areas," Sustainability, MDPI, vol. 15(11), pages 1-23, June.
    4. Munim, Ziaul Haque & Sørli, Michael André & Kim, Hyungju & Alon, Ilan, 2024. "Predicting maritime accident risk using Automated Machine Learning," Reliability Engineering and System Safety, Elsevier, vol. 248(C).
    5. Afaq Khattak & Hamad Almujibah & Ahmed Elamary & Caroline Mongina Matara, 2022. "Interpretable Dynamic Ensemble Selection Approach for the Prediction of Road Traffic Injury Severity: A Case Study of Pakistan’s National Highway N-5," Sustainability, MDPI, vol. 14(19), pages 1-18, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Haorong Peng & Xiaoxiang Ma & Feng Chen, 2020. "Examining Injury Severity of Pedestrians in Vehicle–Pedestrian Crashes at Mid-Blocks Using Path Analysis," IJERPH, MDPI, vol. 17(17), pages 1-16, August.
    2. Khaled Assi, 2020. "Traffic Crash Severity Prediction—A Synergy by Hybrid Principal Component Analysis and Machine Learning Models," IJERPH, MDPI, vol. 17(20), pages 1-16, October.
    3. Ming Lv & Xiaojun Shao & Chimou Li & Feng Chen, 2022. "Driving Performance Evaluation of Shuttle Buses: A Case Study of Hong Kong–Zhuhai–Macau Bridge," IJERPH, MDPI, vol. 19(3), pages 1-13, January.
    4. Zhi Zhang & Yingshi Guo & Rui Fu & Wei Yuan & Chang Wang, 2020. "Linking executive functions to distracted driving, does it differ between young and mature drivers?," PLOS ONE, Public Library of Science, vol. 15(9), pages 1-12, September.
    5. Xiaojun Shao & Xiaoxiang Ma & Feng Chen & Mingtao Song & Xiaodong Pan & Kesi You, 2020. "A Random Parameters Ordered Probit Analysis of Injury Severity in Truck Involved Rear-End Collisions," IJERPH, MDPI, vol. 17(2), pages 1-18, January.
    6. Qiang Zeng & Wei Hao & Jaeyoung Lee & Feng Chen, 2020. "Investigating the Impacts of Real-Time Weather Conditions on Freeway Crash Severity: A Bayesian Spatial Analysis," IJERPH, MDPI, vol. 17(8), pages 1-15, April.
    7. Shuaiming Chen & Haipeng Shao & Ximing Ji, 2021. "Insights into Factors Affecting Traffic Accident Severity of Novice and Experienced Drivers: A Machine Learning Approach," IJERPH, MDPI, vol. 18(23), pages 1-20, December.
    8. Jaewoong Yun, 2023. "Strategies for Improving the Sustainability of Fare-Free Policy for the Elderly through Preferences by Travel Modes," Sustainability, MDPI, vol. 15(20), pages 1-14, October.
    9. Zhuanglin Ma & Mingjie Luo & Steven I-Jy Chien & Dawei Hu & Xue Zhao, 2020. "Analyzing drivers’ perceived service quality of variable message signs (VMS)," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-19, October.
    10. Vorapot Sapsirisavat & Wiriya Mahikul, 2021. "Drinking and Night-Time Driving May Increase the Risk of Severe Health Outcomes: A 5-Year Retrospective Study of Traffic Injuries among International Travelers at a University Hospital Emergency Cente," IJERPH, MDPI, vol. 18(18), pages 1-9, September.
    11. Changxi Ma & Jibiao Zhou & Dong Yang, 2020. "Causation Analysis of Hazardous Material Road Transportation Accidents Based on the Ordered Logit Regression Model," IJERPH, MDPI, vol. 17(4), pages 1-25, February.
    12. Afaq Khattak & Hamad Almujibah & Ahmed Elamary & Caroline Mongina Matara, 2022. "Interpretable Dynamic Ensemble Selection Approach for the Prediction of Road Traffic Injury Severity: A Case Study of Pakistan’s National Highway N-5," Sustainability, MDPI, vol. 14(19), pages 1-18, September.
    13. Zheng Chen & Huiying Wen & Qiang Zhu & Sheng Zhao, 2023. "Severity Analysis of Multi-Truck Crashes on Mountain Freeways Using a Mixed Logit Model," Sustainability, MDPI, vol. 15(8), pages 1-15, April.
    14. Yulong Bao & Yongle Li & Jiajie Ding, 2016. "A Case Study of Dynamic Response Analysis and Safety Assessment for a Suspended Monorail System," IJERPH, MDPI, vol. 13(11), pages 1-17, November.
    15. Kanghyun Kim & Jungyeol Hong, 2023. "Severity Predictions for Intercity Bus Crashes on Highway Using a Random Parameter Ordered Probit Model," Sustainability, MDPI, vol. 15(17), pages 1-15, August.
    16. Mahyar Madarshahian & Aditya Balaram & Fahim Ahmed & Nathan Huynh & Chowdhury K. A. Siddiqui & Mark Ferguson, 2023. "Analysis of Injury Severity of Work Zone Truck-Involved Crashes in South Carolina for Interstates and Non-Interstates," Sustainability, MDPI, vol. 15(9), pages 1-18, April.
    17. Sherif Shokry & Naglaa K. Rashwan & Seham Hemdan & Ali Alrashidi & Amr M. Wahaballa, 2023. "Characterization of Traffic Accidents Based on Long-Horizon Aggregated and Disaggregated Data," Sustainability, MDPI, vol. 15(2), pages 1-18, January.
    18. Feifeng Jiang & Kwok Kit Richard Yuen & Eric Wai Ming Lee & Jun Ma, 2020. "Analysis of Run-Off-Road Accidents by Association Rule Mining and Geographic Information System Techniques on Imbalanced Datasets," Sustainability, MDPI, vol. 12(12), pages 1-32, June.
    19. Mehari Beyene Teshome & Faisal Rasool & Guido Orzes, 2024. "Mountain Logistics: A Systematic Literature Review and Future Research Directions," Logistics, MDPI, vol. 8(4), pages 1-19, November.
    20. Turgut Karakose & Ramazan Yirci & Stamatis Papadakis, 2022. "Examining the Associations between COVID-19-Related Psychological Distress, Social Media Addiction, COVID-19-Related Burnout, and Depression among School Principals and Teachers through Structural Equ," IJERPH, MDPI, vol. 19(4), pages 1-19, February.
