IDEAS home Printed from https://ideas.repec.org/p/pdn/dispap/132.html

Can Verbal Performance Appraisals and Machine Learning Models Improve the Accuracy of Performance Evaluations?

Authors
  • Jana Kim Gutt

    (Paderborn University)

  • Kirsten Thommes

    (Paderborn University)

  • Miro Mehic

    (Paderborn University)

Abstract

Performance appraisals are the subject of ongoing debate, and most discussions share one common denominator: they point to the appraisals' lack of accuracy. In theory, a performance appraisal should reflect an employee’s performance over a certain period of time. However, recent research shows that appraisals fall short of this goal. Although many studies acknowledge the benefits of performance comments over ratings on a scale, research has paid little attention to the potential of performance comments to achieve higher accuracy in performance evaluations. To address this issue, we conducted a laboratory experiment and collected objective performance data as well as numerical and verbal performance appraisals. In particular, we compiled numerical ratings, written comments, and spoken comments on performance from independent evaluators. To make the numbers (assigned ratings) and the comments comparable, we applied a Random Forest algorithm to convert the comments into numerical ratings (algorithmic ratings). By analyzing each rating (assigned and algorithmic) in relation to the performance, we find evidence that spoken comments reflect performance differences within a team most accurately. Our results offer important insights into how performance appraisals may be approached to reflect objective performance more accurately.
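
The abstract does not describe how the Random Forest was set up, so the following is only a minimal sketch of the general idea of turning comments into numerical ratings. It is not the authors' pipeline. It assumes bag-of-words features and a toy forest of one-split regression trees (stumps) built with the Python standard library; an actual analysis would more likely rely on an established library such as scikit-learn. The comments, ratings, and all function names (`tokenize`, `fit_stump`, `fit_forest`, `predict`) are invented for illustration.

```python
import random
from statistics import mean

def tokenize(text):
    """Represent a comment as the set of lowercase words it contains."""
    return set(text.lower().split())

def fit_stump(rows, targets, words):
    """Fit a one-split regression tree: pick the candidate word whose
    presence/absence split gives the lowest squared error."""
    best = None
    for word in words:
        present = [t for r, t in zip(rows, targets) if word in r]
        absent = [t for r, t in zip(rows, targets) if word not in r]
        if not present or not absent:
            continue  # this word does not split the sample
        m1, m0 = mean(present), mean(absent)
        sse = (sum((t - m1) ** 2 for t in present)
               + sum((t - m0) ** 2 for t in absent))
        if best is None or sse < best[0]:
            best = (sse, word, m1, m0)
    if best is None:  # no usable split: fall back to the sample mean
        m = mean(targets)
        return (None, m, m)
    return best[1:]

def fit_forest(comments, ratings, n_trees=50, seed=0):
    """Train stumps on bootstrap samples with random word subsets."""
    rng = random.Random(seed)
    rows = [tokenize(c) for c in comments]
    vocab = sorted(set().union(*rows))
    k = max(1, int(len(vocab) ** 0.5))  # words considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        words = rng.sample(vocab, k)                    # random feature subset
        forest.append(fit_stump([rows[i] for i in idx],
                                [ratings[i] for i in idx], words))
    return forest

def predict(forest, comment):
    """Algorithmic rating: the average prediction over all trees."""
    tokens = tokenize(comment)
    return mean(m1 if word in tokens else m0 for word, m1, m0 in forest)

# Hypothetical training data: comments paired with ratings on a 1-5 scale.
comments = [
    "very fast and accurate work",
    "slow and many errors",
    "solid work overall",
    "excellent fast results",
    "poor slow output with errors",
    "accurate but slow",
]
ratings = [5, 1, 3, 5, 1, 3]

forest = fit_forest(comments, ratings, n_trees=25, seed=1)
algorithmic_rating = predict(forest, "fast and accurate")
```

Because every tree predicts a mean of observed ratings and the forest averages those means, the algorithmic rating always stays within the range of the ratings used for training, which keeps it on the same scale as the assigned ratings.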

Suggested Citation

  • Jana Kim Gutt & Kirsten Thommes & Miro Mehic, 2025. "Can Verbal Performance Appraisals and Machine Learning Models Improve the Accuracy of Performance Evaluations?," Working Papers Dissertations 132, Paderborn University, Faculty of Business Administration and Economics.
  • Handle: RePEc:pdn:dispap:132

    Download full text from publisher

    File URL: http://groups.uni-paderborn.de/wp-wiwi/RePEc/pdf/dispap/DP132.pdf
    Download Restriction: no

    More about this item

    Keywords

    performance appraisal; rating accuracy; rating format; performance appraisal comment; rating scale;

    JEL classification:

    • J24 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Human Capital; Skills; Occupational Choice; Labor Productivity
    • M51 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Personnel Economics - - - Firm Employment Decisions; Promotions
    • D91 - Microeconomics - - Micro-Based Behavioral Economics - - - Role and Effects of Psychological, Emotional, Social, and Cognitive Factors on Decision Making
