
Comparison of Prediction Models for Mortality Related to Injuries from Road Traffic Accidents after Correcting for Undersampling

Authors

  • Yookyung Boo

    (Department of Health Administration, Dankook University, Cheonan 31116, Korea)

  • Youngjin Choi

    (Department of Healthcare Management, Eulji University, Seongnam 13135, Korea)

Abstract

In this study, four models, namely logistic regression (LR), random forest (RF), linear support vector machine (SVM), and radial basis function (RBF)-SVM, were compared for their accuracy in predicting mortality caused by road traffic injuries. They were tested using five years of national-level data (2013 to 2017) from the Korea Disease Control and Prevention Agency’s (KDCA) National Hospital Discharge In-Depth Survey. Model performance was measured by accuracy, precision, recall, F1 score, and Brier score in a classification analysis that included characteristics of patients, accidents, injuries, and illnesses. The data contained many variables in differing units, and the rates of survival and mortality related to road traffic accidents were imbalanced, so the data were corrected for imbalance and standardized before the classification models’ performances were compared. Using importance analysis, the main diagnosis, the type of injury, the site of the injury, the operation status, the type of accident, the role at the time of the accident, and sex were selected as the analysis factors. The biggest contributing factor was the role in the accident, specifically being the driver, and the major injuries were head injuries and deep injuries. Using the selected factors, a comparison of each model’s classification performance indicated that the RBF-SVM and RF models were superior to the others. Of the SVM models, the RBF kernel model outperformed the linear kernel model; it can be inferred that the RBF model, which transforms the data into a high-dimensional space, performs better when the use of multiple variables makes the feature space complex. The findings suggest there are limitations to analyses of imbalanced, multidimensional raw data, such as data on road traffic mortality. Thus, analyses must be performed after imbalances are corrected.
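
The preprocessing and scoring steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not say which undersampling procedure was used, so random undersampling of the majority class is an assumption here, and the function names (`undersample`, `standardize`, `classification_metrics`) and the 0.5 decision threshold are hypothetical choices made for illustration. The metric formulas are the standard definitions of accuracy, precision, recall, F1 score, and Brier score.

```python
import random
from statistics import mean, pstdev


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumption: plain random undersampling; the study's exact procedure
    is not described in the abstract.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def standardize(rows):
    """Rescale every column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns keep scale 1
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute the five metrics from true labels and predicted probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Brier score: mean squared gap between predicted probability and outcome.
    brier = mean([(p - t) ** 2 for t, p in zip(y_true, y_prob)])
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "brier": brier,
    }
```

In such a workflow, each of the four classifiers would be fitted on the balanced, standardized training data and its predicted mortality probabilities passed to `classification_metrics`. Lower Brier scores and higher values of the other four metrics indicate better performance.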

Suggested Citation

  • Yookyung Boo & Youngjin Choi, 2021. "Comparison of Prediction Models for Mortality Related to Injuries from Road Traffic Accidents after Correcting for Undersampling," IJERPH, MDPI, vol. 18(11), pages 1-14, May.
  • Handle: RePEc:gam:jijerp:v:18:y:2021:i:11:p:5604-:d:561131

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/18/11/5604/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/18/11/5604/
    Download Restriction: no


    Citations

    Cited by:

    1. Paulo Infante & Gonçalo Jacinto & Anabela Afonso & Leonor Rego & Pedro Nogueira & Marcelo Silva & Vitor Nogueira & José Saias & Paulo Quaresma & Daniel Santos & Patrícia Góis & Paulo Rebelo Manuel, 2023. "Factors That Influence the Type of Road Traffic Accidents: A Case Study in a District of Portugal," Sustainability, MDPI, vol. 15(3), pages 1-16, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gao, Lu & Lu, Pan & Ren, Yihao, 2021. "A deep learning approach for imbalanced crash data in predicting highway-rail grade crossings accidents," Reliability Engineering and System Safety, Elsevier, vol. 216(C).
    2. Risselada, Hans & Verhoef, Peter C. & Bijmolt, Tammo H.A., 2010. "Staying Power of Churn Prediction Models," Journal of Interactive Marketing, Elsevier, vol. 24(3), pages 198-208.
    3. Najaf, Pooya & Thill, Jean-Claude & Zhang, Wenjia & Fields, Milton Greg, 2018. "City-level urban form and traffic safety: A structural equation modeling analysis of direct and indirect effects," Journal of Transport Geography, Elsevier, vol. 69(C), pages 257-270.
    4. Buddhavarapu, Prasad & Bansal, Prateek & Prozzi, Jorge A., 2021. "A new spatial count data model with time-varying parameters," Transportation Research Part B: Methodological, Elsevier, vol. 150(C), pages 566-586.
    5. Hou, Hui & Liu, Chao & Wei, Ruizeng & He, Huan & Wang, Lei & Li, Weibo, 2023. "Outage duration prediction under typhoon disaster with stacking ensemble learning," Reliability Engineering and System Safety, Elsevier, vol. 237(C).
    6. Khondoker Billah & Qasim Adegbite & Hatim O. Sharif & Samer Dessouky & Lauren Simcic, 2021. "Analysis of Intersection Traffic Safety in the City of San Antonio, 2013–2017," Sustainability, MDPI, vol. 13(9), pages 1-18, May.
    7. Quintanilha, Igor M. & Elias, Vitor R.M. & da Silva, Felipe B. & Fonini, Pedro A.M. & da Silva, Eduardo A.B. & Netto, Sergio L. & Apolinário, José A. & de Campos, Marcello L.R. & Martins, Wallace A., 2021. "A fault detector/classifier for closed-ring power generators using machine learning," Reliability Engineering and System Safety, Elsevier, vol. 212(C).
    8. Baumann, Elias & Kern, Jana & Lessmann, Stefan, 2019. "Usage Continuance in Software-as-a-Service," IRTG 1792 Discussion Papers 2019-005, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    9. Yen-Chun Chou & Howard Hao-Chun Chuang, 2018. "A predictive investigation of first-time customer retention in online reservation services," Service Business, Springer;Pan-Pacific Business Association, vol. 12(4), pages 685-699, December.
    10. Koen W. de Bock & Arno de Caigny, 2021. "Spline-rule ensemble classifiers with structured sparsity regularization for interpretable customer churn modeling," Post-Print hal-03391564, HAL.
    11. Bo Yang & Yao Wu & Weihua Zhang & Jie Bao, 2020. "Modeling Collision Probability on Freeway: Accounting for Different Types and Severities in Various LOS," Sustainability, MDPI, vol. 12(18), pages 1-13, September.
    12. Bae, Bumjoon & Seo, Changbeom, 2022. "Do public-private partnerships help improve road safety? Finding empirical evidence using panel data models," Transport Policy, Elsevier, vol. 126(C), pages 336-342.
    13. Vorapot Sapsirisavat & Wiriya Mahikul, 2021. "Drinking and Night-Time Driving May Increase the Risk of Severe Health Outcomes: A 5-Year Retrospective Study of Traffic Injuries among International Travelers at a University Hospital Emergency Cente," IJERPH, MDPI, vol. 18(18), pages 1-9, September.
    14. Svetlana BAČKALIĆ & Dragan JOVANOVIĆ & Todor BAČKALIĆ & Boško MATOVIĆ & Miloš PLJAKIĆ, 2019. "The Application Of Reliability Reallocation Model In Traffic Safety Analysis On Rural Roads," Transport Problems, Silesian University of Technology, Faculty of Transport, vol. 14(1), pages 115-125, April.
    15. Slăvescu Ecaterina Oana & Panait Iulian, 2012. "Improving Customer Churn Models as one of Customer Relationship Management Business Solutions for the Telecommunication Industry," Ovidius University Annals, Economic Sciences Series, Ovidius University of Constantza, Faculty of Economic Sciences, vol. 0(1), pages 1156-1160, May.
    16. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W., 2018. "A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees," European Journal of Operational Research, Elsevier, vol. 269(2), pages 760-772.
    17. Arno de Caigny & Kristof Coussement & Koen W. de Bock & Stefan Lessmann, 2019. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," Post-Print hal-02275958, HAL.
    18. Izdebski, Mariusz & Jacyna-Gołda, Ilona & Gołda, Paweł, 2022. "Minimisation of the probability of serious road accidents in the transport of dangerous goods," Reliability Engineering and System Safety, Elsevier, vol. 217(C).
    19. Dong, Chunjiao & Shao, Chunfu & Clarke, David B. & Nambisan, Shashi S., 2018. "An innovative approach for traffic crash estimation and prediction on accommodating unobserved heterogeneities," Transportation Research Part B: Methodological, Elsevier, vol. 118(C), pages 407-428.
    20. Albrecht, Tobias & Rausch, Theresa Maria & Derra, Nicholas Daniel, 2021. "Call me maybe: Methods and practical implementation of artificial intelligence in call center arrivals’ forecasting," Journal of Business Research, Elsevier, vol. 123(C), pages 267-278.
