
The GPT Surprise: Offering Large Language Model Chat in a Massive Coding Class Reduced Engagement but Increased Adopters Exam Performances

Authors

  • Nie, Allen
  • Chandak, Yash
  • Suzara, Miroslav
  • Ali, Malika
  • Woodrow, Juliette
  • Peng, Matt
  • Sahami, Mehran
  • Brunskill, Emma
  • Piech, Chris

Abstract

Large language models (LLMs) are quickly being adopted in a wide range of learning experiences, especially via ubiquitous and broadly accessible chat interfaces like ChatGPT and Copilot. This type of interface is readily available to students and teachers around the world, yet relatively little research has been done to assess the impact of such generic tools on student learning. Coding education is an interesting test case, both because LLMs perform strongly on coding tasks and because LLM-powered support tools are rapidly becoming part of the workflow of professional software engineers. To help understand the impact of generic LLM use on coding education, we conducted a large-scale randomized controlled trial with 5,831 students from 146 countries in an online coding class, in which we provided some students with access to a chat interface with GPT-4. We estimate positive benefits on exam performance for adopters (the students who used the tool), but across all students, advertising GPT-4 led to a significant average decrease in exam participation. We observe similar decreases in other forms of course engagement. However, this decrease is modulated by the student's country of origin: offering access to LLMs to students from countries with a low Human Development Index increased their exam participation rate on average. Our results suggest there may be promising benefits to using LLMs in an introductory coding class, but also potential harms for engagement, which makes their longer-term impact on student success unclear. Our work highlights the need for additional investigations to help understand the potential impact of future adoption and integration of LLMs into classrooms.
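
The abstract separates two quantities: the average effect of being offered the tool across all students, and the effect on adopters who actually used it. One common way to relate the two in a randomized offer design is the Wald (instrumental-variable) ratio. The sketch below is only an illustration of that general idea and is not taken from the paper: the function names `itt_effect` and `wald_late` are hypothetical, the toy numbers are invented for the example, and the ratio is meaningful only under the usual assumptions (random assignment, the offer affecting outcomes only through use, and no one being discouraged from use by the offer).

```python
from statistics import mean


def itt_effect(assigned, outcome):
    """Intention-to-treat effect: mean outcome of students offered the tool
    minus mean outcome of students not offered it."""
    offered = [y for z, y in zip(assigned, outcome) if z == 1]
    not_offered = [y for z, y in zip(assigned, outcome) if z == 0]
    return mean(offered) - mean(not_offered)


def wald_late(assigned, adopted, outcome):
    """Wald ratio: effect of the offer on the outcome divided by the effect
    of the offer on adoption. Under the assumptions stated in the text, this
    estimates the average effect among students who adopt because of the offer."""
    first_stage = itt_effect(assigned, adopted)
    if first_stage == 0:
        raise ValueError("The offer did not change adoption; ratio is undefined.")
    return itt_effect(assigned, outcome) / first_stage


# Toy data, invented for illustration only (NOT from the study).
assigned = [1, 1, 1, 1, 0, 0, 0, 0]          # 1 = offered the chat tool
adopted = [1, 1, 0, 0, 0, 0, 0, 0]           # 1 = actually used the tool
score = [80, 90, 60, 70, 60, 70, 65, 65]     # hypothetical exam scores

print("ITT effect on score:", itt_effect(assigned, score))
print("Adoption rate change:", itt_effect(assigned, adopted))
print("Wald estimate for adopters:", wald_late(assigned, adopted, score))
```

In this toy example half of the offered students adopt, so the adopter-level estimate is twice the offer-level difference. A real analysis would also need to handle outcomes that are missing for students who skip the exam, which this sketch ignores.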

Suggested Citation

  • Nie, Allen & Chandak, Yash & Suzara, Miroslav & Ali, Malika & Woodrow, Juliette & Peng, Matt & Sahami, Mehran & Brunskill, Emma & Piech, Chris, 2024. "The GPT Surprise: Offering Large Language Model Chat in a Massive Coding Class Reduced Engagement but Increased Adopters Exam Performances," OSF Preprints qy8zd, Center for Open Science.
  • Handle: RePEc:osf:osfxxx:qy8zd
    DOI: 10.31219/osf.io/qy8zd

    Download full text from publisher

    File URL: https://osf.io/download/6628930d80d25c0de8f919e6/
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sun, Zhenting, 2023. "Instrument validity for heterogeneous causal effects," Journal of Econometrics, Elsevier, vol. 237(2).
    2. Huber Martin & Wüthrich Kaspar, 2019. "Local Average and Quantile Treatment Effects Under Endogeneity: A Review," Journal of Econometric Methods, De Gruyter, vol. 8(1), pages 1-27, January.
    3. Guanghui Pan, 2024. "Methodological Foundations of Modern Causal Inference in Social Science Research," Papers 2408.00032, arXiv.org.
    4. Xiaolin Sun, 2022. "Estimation of Heterogeneous Treatment Effects Using a Conditional Moment Based Approach," Papers 2210.15829, arXiv.org, revised Oct 2024.
    5. Joshua D. Angrist, 2022. "Empirical Strategies in Economics: Illuminating the Path From Cause to Effect," Econometrica, Econometric Society, vol. 90(6), pages 2509-2539, November.
    6. Phillip Heiler, 2020. "Efficient Covariate Balancing for the Local Average Treatment Effect," Papers 2007.04346, arXiv.org.
    7. Zhichao Jiang & Shu Yang & Peng Ding, 2022. "Multiply robust estimation of causal effects under principal ignorability," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1423-1445, September.
    8. Haitian Xie, 2020. "Efficient and Robust Estimation of the Generalized LATE Model," Papers 2001.06746, arXiv.org, revised Feb 2022.
    9. van Elk, Roel & van der Steeg, Marc & Webbink, Dinand, 2011. "Does the timing of tracking affect higher education completion?," Economics of Education Review, Elsevier, vol. 30(5), pages 1009-1021, October.
    10. Xintong Wang & Carlos A. Flores & Alfonso Flores-Lagunes, 2020. "The Effects of Vietnam-Era Military Service on the Long-Term Health of Veterans: A Bounds Analysis," Center for Policy Research Working Papers 234, Center for Policy Research, Maxwell School, Syracuse University.
    11. Alberto Abadie, 2000. "Semiparametric Estimation of Instrumental Variable Models for Causal Effects," NBER Technical Working Papers 0260, National Bureau of Economic Research, Inc.
    12. David S. Lee & Thomas Lemieux, 2009. "Regression Discontinuity Designs In Economics," Working Papers 1118, Princeton University, Department of Economics, Industrial Relations Section.
    13. Timothy F. Harris & Aaron Yelowitz, 2018. "Life Insurance Holdings And Well‐Being Of Surviving Spouses," Contemporary Economic Policy, Western Economic Association International, vol. 36(3), pages 526-538, July.
    14. Anna Piil Damm, 2009. "Ethnic Enclaves and Immigrant Labor Market Outcomes: Quasi-Experimental Evidence," Journal of Labor Economics, University of Chicago Press, vol. 27(2), pages 281-314, April.
    15. Nikolov, Plamen & Jimi, Nusrat & Chang, Jerray, 2020. "The Importance of Cognitive Domains and the Returns to Schooling in South Africa: Evidence from Two Labor Surveys," Labour Economics, Elsevier, vol. 65(C).
    16. Evans, William N. & Ringel, Jeanne S., 1999. "Can higher cigarette taxes improve birth outcomes?," Journal of Public Economics, Elsevier, vol. 72(1), pages 135-154, April.
    17. Markus Frölich, 2004. "Programme Evaluation with Multiple Treatments," Journal of Economic Surveys, Wiley Blackwell, vol. 18(2), pages 181-224, April.
    18. Manuel Denzer, 2019. "Estimating Causal Effects in Binary Response Models with Binary Endogenous Explanatory Variables - A Comparison of Possible Estimators," Working Papers 1916, Gutenberg School of Management and Economics, Johannes Gutenberg-Universität Mainz.
    19. David S. Lee & Thomas Lemieux, 2010. "Regression Discontinuity Designs in Economics," Journal of Economic Literature, American Economic Association, vol. 48(2), pages 281-355, June.
    20. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
