IDEAS home Printed from https://ideas.repec.org/a/spr/aodasc/v9y2022i4d10.1007_s40745-021-00321-4.html

Statistical Learning for Predicting School Dropout in Elementary Education: A Comparative Study

Author

Listed:
  • Rafaella L. S. Nascimento

    (Universidade Federal de Pernambuco)

  • Roberta A. de A. Fagundes

    (Universidade de Pernambuco)

  • Renata M. C. R. Souza

    (Universidade Federal de Pernambuco)

Abstract

School dropout is a significant challenge for the education system, and it occurs across different environments, modalities, and stages of education. In Brazil, despite advances such as a reduction in dropout indices, combating dropout still demands significant effort. Decision support techniques such as statistical learning help identify the factors involved in school dropout. Statistical learning is a set of methods for exploring and understanding data in order to establish an association between explanatory and response variables and to develop an accurate model. We examine regression methods commonly used in the statistical learning literature for estimating school dropout in elementary schools in the state of Pernambuco. The data comprise educational indicators, and we organized the study into phases for understanding, preparing, and modeling the data. For prediction, we estimate school dropout using kernel-based and linear regression methods. We measured performance by the prediction error on the test data set, using Mean Absolute Error and Root Mean Square Error, and applied statistical tests to confirm the results. The findings show that kernel-based models are effective alternatives that provide greater precision in estimating school dropout within the scope studied. More accurate predictive models support interventions that target the students most at risk of school dropout. The study provides knowledge about the applied scenario, supporting policies to mitigate the problem.
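The evaluation described in the abstract (fit a linear and a kernel-based regressor, then score both on held-out data with Mean Absolute Error and Root Mean Square Error) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the specific kernel method, so a Nadaraya-Watson smoother with a Gaussian kernel stands in for it; the data are synthetic and unrelated to the Pernambuco indicators; and the helper names (`fit_linear`, `fit_kernel`, `bandwidth`) are hypothetical.

```python
import math
import random

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |y - y_hat| over the test set."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_linear(x, y):
    """Ordinary least squares with one predictor; returns a predict function."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x0: intercept + slope * x0

def fit_kernel(x, y, bandwidth):
    """Nadaraya-Watson estimator with a Gaussian kernel (an assumed stand-in
    for the paper's kernel-based models, which the abstract does not specify)."""
    def predict(x0):
        weights = [math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in x]
        return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    return predict

# Synthetic data with a nonlinear signal; NOT the study's educational indicators.
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
ys = [math.sin(2 * math.pi * xi) + rng.gauss(0.0, 0.1) for xi in xs]

# Hold out the last 50 points as the test set.
x_train, y_train = xs[:150], ys[:150]
x_test, y_test = xs[150:], ys[150:]

linear = fit_linear(x_train, y_train)
kernel = fit_kernel(x_train, y_train, bandwidth=0.05)

scores = {}
for name, model in [("linear", linear), ("kernel", kernel)]:
    preds = [model(x0) for x0 in x_test]
    scores[name] = (mae(y_test, preds), rmse(y_test, preds))
    print(name, "MAE:", round(scores[name][0], 3), "RMSE:", round(scores[name][1], 3))
```

Because the synthetic target is strongly nonlinear, the kernel smoother should score lower on both metrics here. That outcome reflects how the toy data were built and is not evidence about the study's results.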

Suggested Citation

  • Rafaella L. S. Nascimento & Roberta A. de A. Fagundes & Renata M. C. R. Souza, 2022. "Statistical Learning for Predicting School Dropout in Elementary Education: A Comparative Study," Annals of Data Science, Springer, vol. 9(4), pages 801-828, August.
  • Handle: RePEc:spr:aodasc:v:9:y:2022:i:4:d:10.1007_s40745-021-00321-4
    DOI: 10.1007/s40745-021-00321-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40745-021-00321-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Prashant Singh & Prashant Verma & Nikhil Singh, 2022. "Offline Signature Verification: An Application of GLCM Features in Machine Learning," Annals of Data Science, Springer, vol. 9(6), pages 1309-1321, December.
    2. Manoj Verma & Harish Kumar Ghritlahre & Ghrithanchi Chandrakar, 2023. "Wind Speed Prediction of Central Region of Chhattisgarh (India) Using Artificial Neural Network and Multiple Linear Regression Technique: A Comparative Study," Annals of Data Science, Springer, vol. 10(4), pages 851-873, August.
    3. David H. Bernstein & Christopher F. Parmeter, 2017. "Returns to Scale in Electricity Generation: Revisited and Replicated," Working Papers 2017-08, University of Miami, Department of Economics.
    4. El Ghouch, Anouar & Genton, Marc G. & Bouezmarni, Taoufik, 2012. "Measuring the Discrepancy of a Parametric Model via Local Polynomial Smoothing," LIDAM Discussion Papers ISBA 2012001, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    5. Kemp, Gordon C.R. & Santos Silva, J.M.C., 2012. "Regression towards the mode," Journal of Econometrics, Elsevier, vol. 170(1), pages 92-101.
    6. Requillart, Vincent & Nauges, Celine & Simioni, Michel & Bontemps, Christophe, 2012. "Food Safety Regulation and Firm Productivity: Evidence from the French Food Industry," 2012 First Congress, June 4-5, 2012, Trento, Italy 124378, Italian Association of Agricultural and Applied Economics (AIEAA).
    7. Degl’Innocenti, Marta & Matousek, Roman & Sevic, Zeljko & Tzeremes, Nickolaos G., 2017. "Bank efficiency and financial centres: Does geographical location matter?," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 46(C), pages 188-198.
    8. George Halkos & Roman Matousek & Nickolaos Tzeremes, 2016. "Pre-evaluating technical efficiency gains from possible mergers and acquisitions: evidence from Japanese regional banks," Review of Quantitative Finance and Accounting, Springer, vol. 46(1), pages 47-77, January.
    9. George E. Halkos & Nickolaos G. Tzeremes, 2015. "Measuring Seaports' Productivity: A Malmquist Productivity Index Decomposition Approach," Journal of Transport Economics and Policy, University of Bath, vol. 49(2), pages 355-376, April.
    10. Qi Li & Juan Lin & Jeffrey S. Racine, 2013. "Optimal Bandwidth Selection for Nonparametric Conditional Distribution and Quantile Functions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(1), pages 57-65, January.
    11. Bodory, Hugo & Huber, Martin, 2018. "The causalweight package for causal inference in R," FSES Working Papers 493, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    12. Heba Soltan Mohamed & M. Masoom Ali & Haitham M. Yousof, 2023. "The Lindley Gompertz Model for Estimating the Survival Rates: Properties and Applications in Insurance," Annals of Data Science, Springer, vol. 10(5), pages 1199-1216, October.
    13. Besstremyannaya, Galina, 2015. "Measuring the effect of health insurance companies on the quality of healthcare systems with kernel and parametric regressions," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 38(2), pages 3-20.
    14. Roberto Moro-Visconti & Salvador Cruz Rambaud & Joaquín López Pascual, 2023. "Artificial intelligence-driven scalability and its impact on the sustainability and valuation of traditional firms," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-14, December.
    15. Michael S. Delgado & Daniel J. Henderson & Christopher F. Parmeter, 2014. "Does Education Matter for Economic Growth?," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 76(3), pages 334-359, June.
    16. Crespo, Cristian, 2020. "Two become one: improving the targeting of conditional cash transfers with a predictive model of school dropout," LSE Research Online Documents on Economics 123139, London School of Economics and Political Science, LSE Library.
    17. Mansoureh Beheshti Nejad & Seyed Mahmoud Zanjirchi & Seyed Mojtaba Hosseini Bamakan & Negar Jalilian, 2024. "Blockchain Adoption in Operations Management: A Systematic Literature Review of 14 Years of Research," Annals of Data Science, Springer, vol. 11(4), pages 1361-1389, August.
    18. M. Sridharan, 2023. "Generalized Regression Neural Network Model Based Estimation of Global Solar Energy Using Meteorological Parameters," Annals of Data Science, Springer, vol. 10(4), pages 1107-1125, August.
    19. Michael Zschille, 2014. "Nonparametric measures of returns to scale: an application to German water supply," Empirical Economics, Springer, vol. 47(3), pages 1029-1053, November.
    20. Ye, Yuxiang & Koch, Steven F., 2021. "Measuring energy poverty in South Africa based on household required energy consumption," Energy Economics, Elsevier, vol. 103(C).
