Addressing Systematic Non-response Bias with Supervised Fine-Tuning of Large Language Models: A Case Study on German Voting Behaviour

Author

Listed:
  • Holtdirk, Tobias
  • Assenmacher, Dennis
  • Bleier, Arnim
  • Wagner, Claudia

Abstract

A major challenge for survey researchers is dealing with missing data, which restricts the scope of analysis and the reliability of the inferences that can be drawn. Recently, researchers have started investigating the potential of Large Language Models (LLMs) to role-play a pre-defined set of "characters" and simulate their survey responses with little or no additional training data or cost. Previous research has mostly focused on zero-shot LLM predictions. However, other survey responses are often at least partially available. This work investigates the viability and robustness of supervised fine-tuning on these responses to simulate systematic and random item-level non-responses in the context of German voting behaviour. Our results show that when systematic item non-responses are present, fine-tuned LLMs outperform traditional classification approaches on survey data. Fine-tuned LLMs also seem to be more robust to changes in the set of features that the model can use to make predictions. Finally, we see that fine-tuned LLMs match the performance of traditional classification methods when survey responses are missing completely at random.

Suggested Citation

  • Holtdirk, Tobias & Assenmacher, Dennis & Bleier, Arnim & Wagner, Claudia, 2025. "Addressing Systematic Non-response Bias with Supervised Fine-Tuning of Large Language Models: A Case Study on German Voting Behaviour," OSF Preprints udz28_v2, Center for Open Science.
  • Handle: RePEc:osf:osfxxx:udz28_v2
    DOI: 10.31219/osf.io/udz28_v2

    Download full text from publisher

    File URL: https://osf.io/download/67bd9482dace37a8115566c4/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. von der Heyde, Leah & Haensch, Anna-Carolina & Wenz, Alexander, 2023. "Assessing Bias in LLM-Generated Synthetic Datasets: The Case of German Voter Behavior," SocArXiv 97r8s_v1, Center for Open Science.
    2. Gilles Grolleau & Murat C. Mungan & Naoufel Mzoughi, 2024. "Punishment menus and their deterrent effects: an exploratory analysis," European Journal of Law and Economics, Springer, vol. 58(1), pages 1-19, August.
    3. Delis, Manthos & Galariotis, Emilios & Monne, Jerome, 2021. "Financial vulnerability and seeking expert advice: Evidence from a survey experiment," MPRA Paper 107095, University Library of Munich, Germany.
    4. Zhen Wang & Ruiqi Song & Chen Shen & Shiya Yin & Zhao Song & Balaraju Battu & Lei Shi & Danyang Jia & Talal Rahwan & Shuyue Hu, 2024. "Large Language Models Overcome the Machine Penalty When Acting Fairly but Not When Acting Selfishly or Altruistically," Papers 2410.03724, arXiv.org, revised Oct 2024.
    5. Katherine Farrow & Gilles Grolleau & Lisette Ibanez, 2022. "Does misery love company? An experimental investigation [How much do we care about absolute versus relative income and consumption?]," Oxford Economic Papers, Oxford University Press, vol. 74(2), pages 523-540.
    6. Adolfo Carballo‐Penela & Emilio Ruzo‐Sanmartín & Carlos M. P. Sousa, 2023. "Does business commitment to sustainability increase job seekers' perceptions of organisational attractiveness? The role of organisational prestige and cultural masculinity," Business Strategy and the Environment, Wiley Blackwell, vol. 32(8), pages 5521-5535, December.
    7. Kuehnhanss, Colin R. & Heyndels, Bruno, 2018. "All’s fair in taxation: A framing experiment with local politicians," Journal of Economic Psychology, Elsevier, vol. 65(C), pages 26-40.
    8. Stone, Daniel & Sood, Gaurav & Garz, Marcel & Wallace, Justin, 2018. "The supply of media slant across outlets and demand for slant within-outlets: Evidence from US presidential campaign news," SocArXiv fy2we_v1, Center for Open Science.
    9. Dib-Slamani, Hind & Grolleau, Gilles & Mzoughi, Naoufel, 2022. "Robbing a robber is not robbing," The Quarterly Review of Economics and Finance, Elsevier, vol. 85(C), pages 1-7.
    10. Karl D. Jackson & Giovanna Maria Dora Dore, 2021. "In Sizing Civil Society, Wording and Format Matter," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 155(3), pages 983-994, June.
    11. Soojong Kim, 2019. "Directionality of information flow and echoes without chambers," PLOS ONE, Public Library of Science, vol. 14(5), pages 1-22, May.
    12. Schwaiger, Rene & Hueber, Laura, 2021. "Do MTurkers exhibit myopic loss aversion?," Economics Letters, Elsevier, vol. 209(C).
    13. Voelkel, Jan G. & Stagnaro, Michael & Chu, James & Pink, Sophia Lerner & Mernyk, Joseph S. & Redekopp, Chrystal & Ghezae, Isaias & Cashman, Matthew & Adjodah, Dhaval & Allen, Levi, 2024. "Megastudy testing 25 treatments to reduce antidemocratic attitudes and partisan animosity," OSF Preprints y79u5_v1, Center for Open Science.
    14. César Merino-Soto & Manuel Fernández-Arata & Jaime Fuentes-Balderrama & Guillermo M. Chans & Filiberto Toledano-Toledano, 2022. "Research Perceived Competency Scale: A New Psychometric Adaptation for University Students’ Research Learning," Sustainability, MDPI, vol. 14(19), pages 1-17, September.
    15. Wieser, Luisa & Abraham, Martin & Schnabel, Claus & Niessen, Cornelia & Wolff, Mauren, 2023. "When are employers interested in electronic performance monitoring? Results from a factorial survey experiment," Discussion Papers 127, Friedrich-Alexander University Erlangen-Nuremberg, Chair of Labour and Regional Economics.
    16. Grolleau, Gilles & Mungan, Murat C. & Mzoughi, Naoufel, 2022. "Seemingly irrelevant information? The impact of legal team size on third party perceptions," International Review of Law and Economics, Elsevier, vol. 71(C).
    17. Petrik Runst, 2018. "Does Immigration Affect Demand for Redistribution? – An Experimental Design," German Economic Review, Verein für Socialpolitik, vol. 19(4), pages 383-400, November.
    18. Barbara Caci & Maurizio Cardaci & Silvana Miceli, 2019. "Development and Maintenance of Self-Disclosure on Facebook: The Role of Personality Traits," SAGE Open, vol. 9(2), pages 21582440198, June.
    19. Lala Muradova & Ross James Gildea, 2021. "Oil wealth and US public support for war," Conflict Management and Peace Science, Peace Science Society (International), vol. 38(1), pages 3-19, January.
    20. Logan S. Casey & Jesse Chandler & Adam Seth Levine & Andrew Proctor & Dara Z. Strolovitch, 2017. "Intertemporal Differences Among MTurk Workers: Time-Based Sample Variations and Implications for Online Data Collection," SAGE Open, vol. 7(2), pages 21582440177, June.
