Abstract
This paper explores the use of transformer-based models for bug detection in source code, aiming to better understand the capacity of these models to learn complex patterns and relationships within code. Traditional static analysis tools are severely limited in their ability to detect semantic errors, so numerous defects pass through to the execution stage. This research is a step towards enhancing static code analysis with neural networks.

The experiments were designed as binary classification tasks for detecting buggy code snippets, each targeting a specific defect type: NameError, TypeError, IndexError, AttributeError, ValueError, EOFError, SyntaxError, or ModuleNotFoundError. Using the RunBugRun dataset, which is built on code execution results, the models (BERT, CodeBERT, GPT-2, and CodeT5) were fine-tuned and compared under identical conditions and hyperparameters. Performance was evaluated using F1-Score, Precision, and Recall.

The results indicate that transformer-based models, especially CodeT5 and CodeBERT, are effective at identifying a range of defects, demonstrating their ability to learn complex code patterns. Performance varied by defect type, however, with some defects, such as IndexError and TypeError, proving harder to detect. The outcomes underscore the importance of high-quality, diverse training data and highlight the potential of transformer-based models for more accurate early defect detection.

Future research should further explore advanced transformer architectures for detecting complicated defects and investigate integrating additional contextual information into the detection process. This study highlights the potential of modern machine learning architectures to advance software engineering practice, leading to more efficient and reliable software development.
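The abstract evaluates each binary buggy/clean classifier with Precision, Recall, and F1-Score. A minimal sketch of how these metrics are computed for such a task is shown below; the label vectors are illustrative placeholders, not data from the paper.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute Precision, Recall, and F1 for binary labels (1 = buggy, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical labels for eight code snippets (not from the RunBugRun experiments):
y_true = [1, 1, 1, 0, 0, 1, 0, 1]  # ground truth from execution results
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]  # model predictions
print(precision_recall_f1(y_true, y_pred))  # → (0.8, 0.8, 0.8)
```

Because F1 is the harmonic mean of Precision and Recall, it penalizes a classifier that detects many defects but raises many false alarms as heavily as one that is precise but misses defects, which is why the paper reports all three.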
Suggested Citation
Illia Vokhranov & Bogdan Bulakh, 2024.
"Transformer-based models application for bug detection in source code,"
Technology audit and production reserves, PC TECHNOLOGY CENTER, vol. 5(2(79)), pages 6-15, August.
Handle:
RePEc:baq:taprar:v:5:y:2024:i:2:p:6-15
DOI: 10.15587/2706-5448.2024.310822