Recovering Overlooked Information in Categorical Variables with LLMs: An Application to Labor Market Mismatch

Author

Listed:
  • Yi Chen (ShanghaiTech University)
  • Hanming Fang (University of Pennsylvania)
  • Yi Zhao (Tsinghua University)
  • Zibo Zhao (ShanghaiTech University)

Abstract

Categorical variables have no intrinsic ordering, and researchers often adopt a fixed-effect (FE) approach in empirical analysis. However, this approach has two significant limitations: it overlooks textual information associated with the categorical variables, and it produces unstable results when a category contains only a few observations. In this paper, we propose a novel method that uses recent advances in large language models (LLMs) to recover overlooked information in categorical variables. We apply this method to investigate labor market mismatch. Specifically, we task LLMs with simulating the role of a human resources specialist to assess the suitability of an applicant with specific characteristics for a given job. Our main findings can be summarized in three parts. First, using comprehensive administrative data from an online job posting platform, we show that our new match quality measure is positively correlated with several traditional measures in the literature, and we highlight the LLM’s capability to provide additional information beyond that contained in the traditional measures. Second, we demonstrate the broad applicability of the new method using survey data that contain significantly less information than the administrative data, which makes it impossible to compute most of the traditional match quality measures. Our LLM measure successfully replicates, with easily accessible survey data, most of the salient patterns observed in the hard-to-access administrative dataset. Third, we investigate the gender gap in match quality and explore whether gender stereotypes exist in the hiring process. We simulate an audit study, examining whether revealing gender information to LLMs influences their assessment. We show that when gender information is disclosed to the LLMs, the models deem females better suited for traditionally female-dominated roles.
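
The prompting setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the applicant fields, the prompt wording, the 1-10 rating scale, and the names `Applicant`, `build_prompt`, `parse_score`, and `audit_gap` are all assumptions made for illustration, and `ask_llm` stands for any user-supplied function that sends a prompt to a model and returns its text reply.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Applicant:
    # Hypothetical applicant characteristics; the paper's actual fields may differ.
    education: str
    major: str
    experience_years: int
    gender: Optional[str] = None  # only shown to the model in the audit-style variant


def build_prompt(applicant: Applicant, job_title: str, reveal_gender: bool = False) -> str:
    """Build an HR-specialist style prompt (assumed wording, not the paper's)."""
    lines = [
        "You are a human resources specialist.",
        f"Job: {job_title}",
        f"Applicant education: {applicant.education}",
        f"Applicant major: {applicant.major}",
        f"Applicant experience (years): {applicant.experience_years}",
    ]
    if reveal_gender and applicant.gender is not None:
        lines.append(f"Applicant gender: {applicant.gender}")
    lines.append(
        "Rate the applicant's suitability for this job from 1 (poor) "
        "to 10 (excellent). Reply with the number only."
    )
    return "\n".join(lines)


def parse_score(reply: str, low: int = 1, high: int = 10) -> Optional[int]:
    """Extract the first integer from the reply; return None if absent or out of range."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    score = int(match.group())
    return score if low <= score <= high else None


def audit_gap(
    applicant: Applicant, job_title: str, ask_llm: Callable[[str], str]
) -> Optional[int]:
    """Score with gender revealed minus score with gender hidden."""
    blind = parse_score(ask_llm(build_prompt(applicant, job_title, reveal_gender=False)))
    revealed = parse_score(ask_llm(build_prompt(applicant, job_title, reveal_gender=True)))
    if blind is None or revealed is None:
        return None
    return revealed - blind
```

To try the sketch without a model, `ask_llm` can be replaced by a deterministic stand-in such as `lambda prompt: "7"`. Any numbers such a stand-in returns are arbitrary and say nothing about the paper's findings.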

Suggested Citation

  • Yi Chen & Hanming Fang & Yi Zhao & Zibo Zhao, 2024. "Recovering Overlooked Information in Categorical Variables with LLMs: An Application to Labor Market Mismatch," PIER Working Paper Archive 24-017, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
  • Handle: RePEc:pen:papers:24-017

    Download full text from publisher

    File URL: https://economics.sas.upenn.edu/system/files/working-papers/24-017%20PIER%20Paper%20Submission.pdf
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Carvajal, Daniel & Franco, Catalina & Isaksson, Siri, 2024. "Will Artificial Intelligence Get in the Way of Achieving Gender Equality?," Discussion Paper Series in Economics 3/2024, Norwegian School of Economics, Department of Economics, revised 05 Aug 2024.
    2. Evangelos Katsamakas & Oleg V. Pavlov & Ryan Saklad, 2024. "Artificial intelligence and the transformation of higher education institutions," Papers 2402.08143, arXiv.org.
    3. Caleb Peppiatt, 2024. "The Future of Work: Inequality, Artificial Intelligence, and What Can Be Done About It. A Literature Review," Papers 2408.13300, arXiv.org.
    4. D'Alessandro, Francesco & Santarelli, Enrico & Vivarelli, Marco, 2024. "The KSTE+I approach and the advent of AI technologies: evidence from the European regions," GLO Discussion Paper Series 1473, Global Labor Organization (GLO).
    5. Amali Matharaarachchi & Wishmitha Mendis & Kanishka Randunu & Daswin De Silva & Gihan Gamage & Harsha Moraliyage & Nishan Mills & Andrew Jennings, 2024. "Optimizing Generative AI Chatbots for Net-Zero Emissions Energy Internet-of-Things Infrastructure," Energies, MDPI, vol. 17(8), pages 1-19, April.
    6. Anna Davies & Betsy Donald & Mia Gray, 2023. "The power of platforms—precarity and place," Cambridge Journal of Regions, Economy and Society, Cambridge Political Economy Society, vol. 16(2), pages 245-256.
    7. Samir Huseynov, 2023. "ChatGPT and the Labor Market: Unraveling the Effect of AI Discussions on Students' Earnings Expectations," Papers 2305.11900, arXiv.org, revised Aug 2023.
    8. Christian Peukert & Florian Abeillon & Jérémie Haese & Franziska Kaiser & Alexander Staub, 2024. "Strategic Behavior and AI Training Data," CESifo Working Paper Series 11099, CESifo.
    9. Thomas Cantens, 2023. "How will the State think with the assistance of ChatGPT? The case of customs as an example of generative artificial intelligence in public administrations," CERDI Working papers hal-04233370, HAL.
    10. Avi Goldfarb, 2024. "Pause artificial intelligence research? Understanding AI policy challenges," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 57(2), pages 363-377, May.
    11. Anil R. Doshi & Oliver P. Hauser, 2023. "Generative artificial intelligence enhances creativity but reduces the diversity of novel content," Papers 2312.00506, arXiv.org, revised Mar 2024.
    12. Ekaterina Novozhilova & Kate Mays & James E. Katz, 2024. "Looking towards an automated future: U.S. attitudes towards future artificial intelligence instantiations and their effect," Palgrave Communications, Palgrave Macmillan, vol. 11(1), pages 1-11, December.
    13. Korinek, Anton & Suh, Donghyun, 2024. "Scenarios for the Transition to AGI," CEPR Discussion Papers 18928, C.E.P.R. Discussion Papers.
    14. D’Alessandro, Francesco & Santarelli, Enrico & Vivarelli, Marco, 2024. "The Knowledge Spillover Theory of Entrepreneurship and Innovation (KSTE+I) Approach and the Advent of AI Technologies: Evidence from the European Regions," IZA Discussion Papers 17206, Institute of Labor Economics (IZA).
    15. Ali Merali, 2024. "Scaling Laws for Economic Productivity: Experimental Evidence in LLM-Assisted Translation," Papers 2409.02391, arXiv.org.
    16. Kristina McElheran & J. Frank Li & Erik Brynjolfsson & Zachary Kroff & Emin Dinlersoz & Lucia Foster & Nikolas Zolas, 2024. "AI adoption in America: Who, what, and where," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 33(2), pages 375-415, March.
    17. Clément Le Ludec & Maxime Cornet & Antonio Casilli, 2023. "The problem with annotation. Human labour and outsourcing between France and Madagascar," Post-Print hal-04174945, HAL.
    18. Oschinski, Matthias, 2023. "Assessing the Impact of Artificial Intelligence on Germany's Labor Market: Insights from a ChatGPT Analysis," MPRA Paper 118300, University Library of Munich, Germany.
    19. David H. Kreitmeir & Paul A. Raschky, 2023. "The Unintended Consequences of Censoring Digital Technology - Evidence from Italy's ChatGPT Ban," SoDa Laboratories Working Paper Series 2023-01, Monash University, SoDa Laboratories.
    20. Dario Guarascio & Jelena Reljic & Roman Stollinger, 2023. "Artificial Intelligence and Employment: A Look into the Crystal Ball," LEM Papers Series 2023/34, Laboratory of Economics and Management (LEM), Sant'Anna School of Advanced Studies, Pisa, Italy.

    More about this item

    Keywords

    Large Language Models; Categorical Variables; Labor Market Mismatch;

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • J16 - Labor and Demographic Economics - - Demographic Economics - - - Economics of Gender; Non-labor Discrimination
    • J24 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Human Capital; Skills; Occupational Choice; Labor Productivity
    • J31 - Labor and Demographic Economics - - Wages, Compensation, and Labor Costs - - - Wage Level and Structure; Wage Differentials
