Fluent but Not Factual: A Comparative Analysis of ChatGPT and Other AI Chatbots’ Proficiency and Originality in Scientific Writing for Humanities

Author

Listed:

Edisa Lozić
(Research Centre of the Slovenian Academy of Sciences and Arts, 1000 Ljubljana, Slovenia)
Benjamin Štular
(Research Centre of the Slovenian Academy of Sciences and Arts, 1000 Ljubljana, Slovenia)

Abstract

Historically, mastery of writing was deemed essential to human progress. However, recent advances in generative AI have marked an inflection point in this narrative, including for scientific writing. This article provides a comprehensive analysis of the capabilities and limitations of six AI chatbots in scholarly writing in the humanities and archaeology. The methodology was based on tagging AI-generated content for quantitative accuracy and qualitative precision by human experts. Quantitative accuracy assessed the factual correctness in a manner similar to grading students, while qualitative precision gauged the scientific contribution similar to reviewing a scientific article. In the quantitative test, ChatGPT-4 scored near the passing grade (−5) whereas ChatGPT-3.5 (−18), Bing (−21) and Bard (−31) were not far behind. Claude 2 (−75) and Aria (−80) scored much lower. In the qualitative test, all AI chatbots, but especially ChatGPT-4, demonstrated proficiency in recombining existing knowledge, but all failed to generate original scientific content. As a side note, our results suggest that with ChatGPT-4, the size of large language models has reached a plateau. Furthermore, this paper underscores the intricate and recursive nature of human research. This process of transforming raw data into refined knowledge is computationally irreducible, highlighting the challenges AI chatbots face in emulating human originality in scientific writing. Our results apply to the state of affairs in the third quarter of 2023. In conclusion, while large language models have revolutionised content generation, their ability to produce original scientific contributions in the humanities remains limited. We expect this to change in the near future as current large language model-based AI chatbots evolve into large language model-powered software.

Suggested Citation

Edisa Lozić & Benjamin Štular, 2023. "Fluent but Not Factual: A Comparative Analysis of ChatGPT and Other AI Chatbots’ Proficiency and Originality in Scientific Writing for Humanities," Future Internet, MDPI, vol. 15(10), pages 1-26, October.

Handle: RePEc:gam:jftint:v:15:y:2023:i:10:p:336-:d:1259029

References listed on IDEAS

Cristòfol Rovira & Lluís Codina & Carlos Lopezosa, 2021. "Language Bias in the Google Scholar Ranking Algorithm," Future Internet, MDPI, vol. 13(2), pages 1-17, January.
Holly Else, 2023. "Abstracts written by ChatGPT fool scientists," Nature, Nature, vol. 613(7944), pages 423-423, January.

Citations

Cited by:

Christopher J. Lynch & Erik J. Jensen & Virginia Zamponi & Kevin O’Brien & Erika Frydenlund & Ross Gore, 2023. "A Structured Narrative Prompt for Prompting Narratives from Large Language Models: Sentiment Assessment of ChatGPT-Generated Narratives and Real Tweets," Future Internet, MDPI, vol. 15(12), pages 1-36, November.
Ketmanto Wangsa & Shakir Karim & Ergun Gide & Mahmoud Elkhodr, 2024. "A Systematic Review and Comprehensive Analysis of Pioneering AI Chatbot Models from Education to Healthcare: ChatGPT, Bard, Llama, Ernie and Grok," Future Internet, MDPI, vol. 16(7), pages 1-23, June.
Sakhi Aggrawal & Alejandra J. Magana, 2024. "Teamwork Conflict Management Training and Conflict Resolution Practice via Large Language Models," Future Internet, MDPI, vol. 16(5), pages 1-25, May.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Arpan Kumar Kar & P. S. Varsha & Shivakami Rajan, 2023. "Unravelling the Impact of Generative Artificial Intelligence (GAI) in Industrial Applications: A Review of Scientific and Grey Literature," Global Journal of Flexible Systems Management, Springer;Global Institute of Flexible Systems Management, vol. 24(4), pages 659-689, December.
Wang Xian & Chen Guomin & Varsha Arya & Kwok Tai Chui, 2024. "Examining the Influence of AI Chatbots on Semantic Web-Based Global Information Management in Various Industries," International Journal on Semantic Web and Information Systems (IJSWIS), IGI Global, vol. 20(1), pages 1-14, January.
Alin ZAMFIROIU & Denisa VASILE & Daniel SAVU, 2023. "ChatGPT – A Systematic Review of Published Research Papers," Informatica Economica, Academy of Economic Studies - Bucharest, Romania, vol. 27(1), pages 5-16.
Amina Badreddine & Hadjira Larbi Cherif, 2023. "ChatGPT in Academic Research: Demonstrating Limitations through Real Practical Examples," Post-Print hal-04379581, HAL.
Howell, Bronwyn E. & Potgieter, Petrus H., 2023. "AI-generated lemons: a sour outlook for content producers?," 32nd European Regional ITS Conference, Madrid 2023: Realising the digital decade in the European Union – Easier said than done? 277971, International Telecommunications Society (ITS).
Borker, Girija, 2024. "Understanding the constraints to women’s use of urban public transport in developing countries," World Development, Elsevier, vol. 180(C).
Konstantinos I. Roumeliotis & Nikolaos D. Tselikas & Dimitrios K. Nasiopoulos, 2022. "Airlines’ Sustainability Study Based on Search Engine Optimization Techniques and Technologies," Sustainability, MDPI, vol. 14(18), pages 1-23, September.
Peres, Renana & Schreier, Martin & Schweidel, David & Sorescu, Alina, 2023. "On ChatGPT and beyond: How generative artificial intelligence may affect research, teaching, and practice," International Journal of Research in Marketing, Elsevier, vol. 40(2), pages 269-275.
Toni GIBEA & Radu USZKAI & Mihail Valentin CERNEA, 2023. "The Ethical Risks Posed By New Technologies In Research," Proceedings of the INTERNATIONAL MANAGEMENT CONFERENCE, Faculty of Management, Academy of Economic Studies, Bucharest, Romania, vol. 17(1), pages 757-765, November.
Chiarello, Filippo & Giordano, Vito & Spada, Irene & Barandoni, Simone & Fantoni, Gualtiero, 2024. "Future applications of generative large language models: A data-driven case study on ChatGPT," Technovation, Elsevier, vol. 133(C).
Enrique Orduña-Malea & Cristina I. Font-Julián & Jorge Serrano-Cobos, 2024. "Open access publications drive few visits from Google Search results to institutional repositories," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(11), pages 7131-7152, November.
Giordano, Vito & Spada, Irene & Chiarello, Filippo & Fantoni, Gualtiero, 2024. "The impact of ChatGPT on human skills: A quantitative study on twitter data," Technological Forecasting and Social Change, Elsevier, vol. 203(C).
Lian, Ying & Tang, Huiting & Xiang, Mengting & Dong, Xuefan, 2024. "Public attitudes and sentiments toward ChatGPT in China: A text mining analysis based on social media," Technology in Society, Elsevier, vol. 76(C).

More about this item

Keywords

generative AI; large language model (LLM); ChatGPT; Bard; Bing; scientific writing; digital humanities;
