
Predicting time to graduation at a large enrollment American university

Author

Listed:
  • John M Aiken
  • Riccardo De Bin
  • Morten Hjorth-Jensen
  • Marcos D Caballero

Abstract

The time it takes a student to graduate with a university degree is influenced by a variety of factors, such as their background, their academic performance at university, and their integration into the social communities of the university they attend. Universities differ in their populations, student services, instruction styles, and degree programs, but they all collect institutional data. This study presents data for 160,933 students attending a large American research university. The data include performance, enrollment, demographic, and preparation features. Discrete-time hazard models for time to graduation are presented in the context of Tinto’s Theory of Drop Out. Additionally, a novel machine learning method, gradient boosted trees, is applied and compared to the typical maximum likelihood method. We demonstrate that enrollment factors (such as changing a major) improve the models' ability to predict when a student graduates more than performance factors (such as grades) or preparation factors (such as high school GPA).
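
To make the term "discrete-time hazard" concrete, here is a minimal sketch in Python. It is not the authors' model: the abstract describes hazard models with covariates fit by maximum likelihood and by gradient boosted trees, whereas this example only computes the covariate-free life-table estimate, where the hazard in term t is the share of students still enrolled at the start of term t who graduate in term t. The toy records, the function names, and the use of numbered terms as the time unit are all assumptions made for illustration.

```python
from collections import Counter

def person_period(records):
    """Expand (student_id, last_term, graduated) into one row per student per term at risk."""
    rows = []
    for sid, last_term, graduated in records:
        for term in range(1, last_term + 1):
            # The event indicator is 1 only in the term the student graduates.
            event = 1 if (graduated and term == last_term) else 0
            rows.append((sid, term, event))
    return rows

def hazard_by_term(rows):
    """Life-table hazard: graduations in term t divided by students at risk in term t."""
    at_risk = Counter()
    events = Counter()
    for _, term, event in rows:
        at_risk[term] += 1
        events[term] += event
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}

def cumulative_graduation(hazard):
    """Probability of having graduated by the end of each term."""
    still_enrolled = 1.0
    out = {}
    for t in sorted(hazard):
        still_enrolled *= 1.0 - hazard[t]
        out[t] = 1.0 - still_enrolled
    return out

# Hypothetical toy data: (student id, last observed term, graduated in that term?).
# Student "d" leaves the data after term 6 without graduating (censored).
records = [
    ("a", 8, True),
    ("b", 8, True),
    ("c", 10, True),
    ("d", 6, False),
    ("e", 9, True),
]

rows = person_period(records)
hazard = hazard_by_term(rows)
cum = cumulative_graduation(hazard)

for term in sorted(hazard):
    print(term, round(hazard[term], 3), round(cum[term], 3))
```

The person-period layout is what lets a covariate-based model be fit with ordinary binary classifiers: each row is one student in one term, and the event column is the binary target. Adding features such as grades or a change of major to each row would be the natural next step, under the same assumptions.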

Suggested Citation

  • John M Aiken & Riccardo De Bin & Morten Hjorth-Jensen & Marcos D Caballero, 2020. "Predicting time to graduation at a large enrollment American university," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-28, November.
  • Handle: RePEc:plo:pone00:0242334
    DOI: 10.1371/journal.pone.0242334

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242334
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0242334&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0242334?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. L. Lombardo & M. Cama & C. Conoscenti & M. Märker & E. Rotigliano, 2015. "Binary logistic regression versus stochastic gradient boosted decision trees in assessing landslide susceptibility for multiple-occurring landslide events: application to the 2009 storm event in Messi," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 79(3), pages 1621-1648, December.
    2. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.
    3. Hongtao Yue & Xuanning Fu, 2017. "Rethinking Graduation and Time to Degree: A Fresh Perspective," Research in Higher Education, Springer;Association for Institutional Research, vol. 58(2), pages 184-213, March.
    4. DesJardins, S. L. & Ahlburg, D. A. & McCall, B. P., 1999. "An event history model of student departure," Economics of Education Review, Elsevier, vol. 18(3), pages 375-390, June.
    5. Jaison R. Abel & Richard Deitz & Yaquin Su, 2014. "Are recent college graduates finding good jobs?," Current Issues in Economics and Finance, Federal Reserve Bank of New York, vol. 20.
    6. James Vaupel & Kenneth Manton & Eric Stallard, 1979. "The impact of heterogeneity in individual frailty on the dynamics of mortality," Demography, Springer;Population Association of America (PAA), vol. 16(3), pages 439-454, August.

    Citations

    Cited by:

    1. Corsi, Matteo & di Bella, Enrico & Persico, Luca, 2024. "A reassessment of graduation modeling for policy design," Socio-Economic Planning Sciences, Elsevier, vol. 96(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shruti Sachdeva & Bijendra Kumar, 2020. "A Comparative Study between Frequency Ratio Model and Gradient Boosted Decision Trees with Greedy Dimensionality Reduction in Groundwater Potential Assessment," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 34(15), pages 4593-4615, December.
    2. Corsi, Matteo & di Bella, Enrico & Persico, Luca, 2024. "A reassessment of graduation modeling for policy design," Socio-Economic Planning Sciences, Elsevier, vol. 96(C).
    3. Bagdonavicius, Vilijandas & Nikulin, Mikhail, 2000. "On goodness-of-fit for the linear transformation and frailty models," Statistics & Probability Letters, Elsevier, vol. 47(2), pages 177-188, April.
    4. Feehan, Dennis & Wrigley-Field, Elizabeth, 2020. "How do populations aggregate?," SocArXiv 2fkw3, Center for Open Science.
    5. Jaison R. Abel & Richard Deitz, 2017. "Underemployment in the Early Careers of College Graduates following the Great Recession," NBER Chapters, in: Education, Skills, and Technical Change: Implications for Future US GDP Growth, pages 149-181, National Bureau of Economic Research, Inc.
    6. John M. Nunley & Adam Pugh & Nicholas Romero & Richard Alan Seals, Jr., 2014. "Unemployment, Underemployment, and Employment Opportunities: Results from a Correspondence Audit of the Labor Market for College Graduates," Auburn Economics Working Paper Series auwp2014-04, Department of Economics, Auburn University.
    7. Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
    8. Filipe Costa Souza & Wilton Bernardino & Silvio C. Patricio, 2024. "How life-table right-censoring affected the Brazilian social security factor: an application of the gamma-Gompertz-Makeham model," Journal of Population Research, Springer, vol. 41(3), pages 1-38, September.
    9. Nahushananda Chakravarthy H G & Karthik M Seenappa & Sujay Raghavendra Naganna & Dayananda Pruthviraja, 2023. "Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash," Sustainability, MDPI, vol. 15(18), pages 1-22, September.
    10. K. Motarjem & M. Mohammadzadeh & A. Abyar, 2020. "Geostatistical survival model with Gaussian random effect," Statistical Papers, Springer, vol. 61(1), pages 85-107, February.
    11. Xu, Linzhi & Zhang, Jiajia, 2010. "An EM-like algorithm for the semiparametric accelerated failure time gamma frailty model," Computational Statistics & Data Analysis, Elsevier, vol. 54(6), pages 1467-1474, June.
    12. Wen, Shaoting & Buyukada, Musa & Evrendilek, Fatih & Liu, Jingyong, 2020. "Uncertainty and sensitivity analyses of co-combustion/pyrolysis of textile dyeing sludge and incense sticks: Regression and machine-learning models," Renewable Energy, Elsevier, vol. 151(C), pages 463-474.
    13. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    14. Annamaria Olivieri & Ermanno Pitacco, 2016. "Frailty and Risk Classification for Life Annuity Portfolios," Risks, MDPI, vol. 4(4), pages 1-23, October.
    15. James W. Vaupel, 2002. "Post-Darwinian longevity," MPIDR Working Papers WP-2002-043, Max Planck Institute for Demographic Research, Rostock, Germany.
    16. Maxim S. Finkelstein, 2005. "Shocks in homogeneous and heterogeneous populations," MPIDR Working Papers WP-2005-024, Max Planck Institute for Demographic Research, Rostock, Germany.
    17. Al-Amin Abba Dabo & Amin Hosseinian-Far, 2023. "An Integrated Methodology for Enhancing Reverse Logistics Flows and Networks in Industry 5.0," Logistics, MDPI, vol. 7(4), pages 1-26, December.
    18. Luping Zhao & Timothy E. Hanson, 2011. "Spatially Dependent Polya Tree Modeling for Survival Data," Biometrics, The International Biometric Society, vol. 67(2), pages 391-403, June.
    19. Kusiak, Andrew & Zheng, Haiyang & Song, Zhe, 2009. "On-line monitoring of power curves," Renewable Energy, Elsevier, vol. 34(6), pages 1487-1493.
    20. Yeo, Keng Leong & Valdez, Emiliano A., 2006. "Claim dependence with common effects in credibility models," Insurance: Mathematics and Economics, Elsevier, vol. 38(3), pages 609-629, June.
