
Improving the Automatic Detection of Dropout Risk in Middle and High School Students: A Comparative Study of Feature Selection Techniques

Authors

Listed:
  • Daniel Zapata-Medina

    (Department of Computer and Decision Sciences, Faculty of Mines, Universidad Nacional de Colombia, Medellín 050034, Colombia)

  • Albeiro Espinosa-Bedoya

    (Department of Computer and Decision Sciences, Faculty of Mines, Universidad Nacional de Colombia, Medellín 050034, Colombia)

  • Jovani Alberto Jiménez-Builes

    (Department of Computer and Decision Sciences, Faculty of Mines, Universidad Nacional de Colombia, Medellín 050034, Colombia)

Abstract

The dropout rate in underdeveloped and emerging countries is a pressing social issue, as highlighted by studies conducted by the Organisation for Economic Co-operation and Development. To address this challenge and improve the automatic detection of dropout risk, this study compares five feature selection techniques. The methodological design involves three phases: data preparation, feature selection, and model evaluation using machine learning algorithms. The results show that (1) the top features identified by the feature selection techniques, namely those constructed through feature engineering, were among the most effective for classifying student dropout; (2) the F-score of the best model increased by 5% with feature selection; and (3) depending on the type of feature selection, the performance of the machine learning algorithm can increase or decrease according to its sensitivity to noisier features. Metaheuristic algorithms yielded significant improvements in precision, but at the risk of increasing errors and reducing recall.
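
The trade-off between precision and recall mentioned at the end of the abstract can be illustrated with a small worked example. The sketch below assumes the F-score is the balanced F1 (the harmonic mean of precision and recall); the abstract does not say which variant the authors used. The function name and all counts are hypothetical and are not taken from the paper.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts.

    tp: at-risk students correctly flagged
    fp: students flagged who did not drop out
    fn: students who dropped out but were not flagged
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a cohort in which 100 students actually dropped out.
# "baseline": a model trained on all features.
# "selective": a model trained on an aggressively reduced feature subset.
baseline = precision_recall_f1(tp=70, fp=30, fn=30)
selective = precision_recall_f1(tp=50, fp=5, fn=50)

for name, (p, r, f1) in (("baseline", baseline), ("selective", selective)):
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

With these illustrative counts, the second model flags fewer students, so its precision rises while its recall and F1 fall. This is the kind of outcome the abstract warns about when a feature subset improves precision alone.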

Suggested Citation

  • Daniel Zapata-Medina & Albeiro Espinosa-Bedoya & Jovani Alberto Jiménez-Builes, 2024. "Improving the Automatic Detection of Dropout Risk in Middle and High School Students: A Comparative Study of Feature Selection Techniques," Mathematics, MDPI, vol. 12(12), pages 1-20, June.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:12:p:1776-:d:1410836

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/12/1776/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/12/1776/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kazim Topuz & Behrooz Davazdahemami & Dursun Delen, 2024. "A Bayesian belief network-based analytics methodology for early-stage risk detection of novel diseases," Annals of Operations Research, Springer, vol. 341(1), pages 673-697, October.
    2. Diogo E. Moreira da Silva & Eduardo J. Solteiro Pires & Arsénio Reis & Paulo B. de Moura Oliveira & João Barroso, 2022. "Forecasting Students Dropout: A UTAD University Study," Future Internet, MDPI, vol. 14(3), pages 1-14, February.
    3. Kazim Topuz & Timothy L. Urban & Robert A. Russell & Mehmet B. Yildirim, 2024. "Decision support system for appointment scheduling and overbooking under patient no-show behavior," Annals of Operations Research, Springer, vol. 342(1), pages 845-873, November.
    4. Anne Parlina & Kalamullah Ramli & Hendri Murfi, 2021. "Exposing Emerging Trends in Smart Sustainable City Research Using Deep Autoencoders-Based Fuzzy C-Means," Sustainability, MDPI, vol. 13(5), pages 1-28, March.
    5. Benoit, Dries F. & Tsang, Wai Kit & Coussement, Kristof & Raes, Annelies, 2024. "High-stake student drop-out prediction using hidden Markov models in fully asynchronous subscription-based MOOCs," Technological Forecasting and Social Change, Elsevier, vol. 198(C).
    6. Rebai, Sonia & Ben Yahia, Fatma & Essid, Hédi, 2020. "A graphically based machine learning approach to predict secondary schools performance in Tunisia," Socio-Economic Planning Sciences, Elsevier, vol. 70(C).
    7. Vafadarnikjoo, Amin & Chalvatzis, Konstantinos & Botelho, Tiago & Bamford, David, 2023. "A stratified decision-making model for long-term planning: Application in flood risk management in Scotland," Omega, Elsevier, vol. 116(C).
    8. Bacon, Victoria R. & Kearney, Christopher A., 2020. "School climate and student-based contextual learning factors as predictors of school absenteeism severity at multiple levels via CHAID analysis," Children and Youth Services Review, Elsevier, vol. 118(C).
    9. Wang, Qiang & Zhang, Wen & Li, Jian & Ma, Zhenzhong, 2023. "Complements or confounders? A study of effects of target and non-target features on online fraudulent reviewer detection," Journal of Business Research, Elsevier, vol. 167(C).
    10. Qazi, Abroon, 2023. "Exploring Global Competitiveness Index 4.0 through the lens of country risk," Technological Forecasting and Social Change, Elsevier, vol. 196(C).
    11. Zhang, Mingming & Zhou, Simei & Wang, Qunwei & Liu, Liyun & Zhou, Dequn, 2023. "Will the carbon neutrality target impact China's energy security? A dynamic Bayesian network model," Energy Economics, Elsevier, vol. 125(C).
    12. Ahmed, Abdulaziz & Topuz, Kazim & Moqbel, Murad & Abdulrashid, Ismail, 2024. "What makes accidents severe! explainable analytics framework with parameter optimization," European Journal of Operational Research, Elsevier, vol. 317(2), pages 425-436.
    13. Thuy, Arthur & Benoit, Dries F., 2024. "Explainability through uncertainty: Trustworthy decision-making with neural networks," European Journal of Operational Research, Elsevier, vol. 317(2), pages 330-340.
    14. Joyce de Souza Zanirato Maia & Ana Paula Arantes Bueno & Joao Ricardo Sato, 2023. "Applications of Artificial Intelligence Models in Educational Analytics and Decision Making: A Systematic Review," World, MDPI, vol. 4(2), pages 1-26, May.
