Printed from https://ideas.repec.org/a/gam/jsusta/v15y2023i13p10668-d1188179.html

Analyzing the Effectiveness of Imbalanced Data Handling Techniques in Predicting Driver Phone Use

Author

Listed:
  • Madhar M. Taamneh

    (Department of Civil Engineering, Yarmouk University, P.O. Box 566, Irbid 21163, Jordan)

  • Salah Taamneh

    (Department of Computer Science and Applications, Faculty of Prince Al-Hussien Bin Abdullah for IT, The Hashemite University, P.O. Box 330127, Zarqa 13133, Jordan)

  • Ahmad H. Alomari

    (Department of Civil Engineering, Yarmouk University, P.O. Box 566, Irbid 21163, Jordan)

  • Musab Abuaddous

    (Department of Civil Engineering, Yarmouk University, P.O. Box 566, Irbid 21163, Jordan)

Abstract

Distracted driving leads to a significant number of road crashes worldwide. Smartphone use is one of the most common causes of cognitive distraction among drivers. Available data on drivers’ phone use present an invaluable opportunity to identify the main factors behind this behavior, and machine learning (ML) techniques are among the most effective tools for this purpose. However, the potential and usefulness of these techniques are limited by the imbalance of the available data: the majority class of collected instances is drivers who do not use their phones, while the minority class is drivers who do. This paper evaluates two main approaches for handling imbalanced datasets on driver phone use: oversampling and undersampling. The effectiveness of each approach was evaluated using six ML techniques: Multilayer Perceptron (MLP), Support Vector Machine (SVM), Naive Bayes (NB), Bayesian Network (BayesNet), J48, and ID3. Both approaches were also evaluated on three Deep Learning (DL) models: Arch1 (5 hidden layers), Arch2 (10 hidden layers), and Arch3 (15 hidden layers). The data used in this study were collected through a direct observation study to explore a set of human, vehicle, and road surface characteristics. The results showed that all ML and DL methods achieved balanced accuracy values for both classes. ID3, J48, and MLP outperformed the rest of the ML methods in all scenarios, with ID3 achieving slightly better accuracy. The DL methods also performed well, especially on the undersampled data. Overall, the classification methods performed best on the undersampled data. It was concluded that road classification has the highest impact on cell phone use, followed by driver age group, driver gender, vehicle type, and, finally, driver seatbelt usage.
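To make the two resampling approaches named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say which oversampling or undersampling algorithm was used, so random duplication and random removal are assumed here. The feature names (`road`, `age`), the class labels (`phone`, `no_phone`), and all function names are hypothetical. The `per_class_recall` helper shows one simple way to check whether accuracy is balanced across both classes.

```python
import random
from collections import Counter


def _group_by_label(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(rows, labels, seed=0):
    """Randomly keep only as many rows per class as the smallest class has."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def per_class_recall(y_true, y_pred):
    """Fraction of each class's instances that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy, made-up observations: 8 drivers not using a phone, 2 using one.
rows = [{"road": "arterial", "age": "young"}] * 8 + [{"road": "local", "age": "young"}] * 2
labels = ["no_phone"] * 8 + ["phone"] * 2

_, under_labels = random_undersample(rows, labels)  # 2 rows per class
_, over_labels = random_oversample(rows, labels)    # 8 rows per class
print(Counter(under_labels))
print(Counter(over_labels))

# A classifier that always predicts the majority class scores 80% overall
# accuracy on the toy data, but its per-class recall exposes the imbalance.
print(per_class_recall(labels, ["no_phone"] * len(labels)))
```

In practice, resampling would be applied to the training split only, so that the evaluation data keep the class distribution that was actually observed.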

Suggested Citation

  • Madhar M. Taamneh & Salah Taamneh & Ahmad H. Alomari & Musab Abuaddous, 2023. "Analyzing the Effectiveness of Imbalanced Data Handling Techniques in Predicting Driver Phone Use," Sustainability, MDPI, vol. 15(13), pages 1-20, July.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:13:p:10668-:d:1188179

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/13/10668/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/13/10668/
    Download Restriction: no

    Citations

    Cited by:

    1. Yang Liu & Tianxing Yang & Liwei Tian & Bincheng Huang & Jiaming Yang & Zihan Zeng, 2024. "Ada-XG-CatBoost: A Combined Forecasting Model for Gross Ecosystem Product (GEP) Prediction," Sustainability, MDPI, vol. 16(16), pages 1-19, August.
