
Deep Learning-Based Software Defect Prediction via Semantic Key Features of Source Code—Systematic Survey

Author

Listed:
  • Ahmed Abdu

    (School of Software, Northwestern Polytechnical University, Xi’an 710072, China)

  • Zhengjun Zhai

    (School of Software, Northwestern Polytechnical University, Xi’an 710072, China
    School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China)

  • Redhwan Algabri

    (School of Mechanical Engineering, Sungkyunkwan University, Suwon 16419, Korea)

  • Hakim A. Abdo

    (Department of Computer Science, Hodeidah University, Al-Hudaydah P.O. Box 3114, Yemen)

  • Kotiba Hamad

    (School of Advanced Materials Science & Engineering, Sungkyunkwan University, Suwon 16419, Korea)

  • Mugahed A. Al-antari

    (Department of Artificial Intelligence, College of Software & Convergence Technology, Daeyang AI Center, Sejong University, Seoul 05006, Korea)

Abstract

Software defect prediction (SDP) can improve software reliability by identifying likely defects in source code. However, building defect prediction models remains difficult, as recent work has shown. Many techniques have been proposed over time to predict source code defects, but most previous studies focus on conventional feature extraction and modeling. Such traditional methods often fail to capture the contextual information in source code files, which is necessary for building reliable deep learning prediction models. As an alternative, semantic feature strategies for defect prediction have recently emerged. These strategies can automatically extract contextual information from source code files and use it directly to predict suspicious defects. This study presents a systematic survey of recent software defect prediction techniques based on key features of the source code. It critically reviews the most recent studies by analyzing their source-code-based semantic feature methods, describes the domain’s critical problems and challenges, and discusses recent and current progress in the domain. Such a survey could help research communities identify current challenges and future research directions. An in-depth literature review of 283 articles on software defect prediction and related work was performed, of which 90 are referenced.

Suggested Citation

  • Ahmed Abdu & Zhengjun Zhai & Redhwan Algabri & Hakim A. Abdo & Kotiba Hamad & Mugahed A. Al-antari, 2022. "Deep Learning-Based Software Defect Prediction via Semantic Key Features of Source Code—Systematic Survey," Mathematics, MDPI, vol. 10(17), pages 1-26, August.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:17:p:3120-:d:902398

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/17/3120/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/17/3120/
    Download Restriction: no

    References listed on IDEAS

    1. Kamran Shaukat & Suhuai Luo & Vijay Varadharajan & Ibrahim A. Hameed & Shan Chen & Dongxi Liu & Jiaming Li, 2020. "Performance Comparison and Current Challenges of Using Machine Learning Techniques in Cybersecurity," Energies, MDPI, vol. 13(10), pages 1-27, May.
    2. Shi Meilong & Peng He & Haitao Xiao & Huixin Li & Cheng Zeng, 2020. "An Approach to Semantic and Structural Features Learning for Software Defect Prediction," Mathematical Problems in Engineering, Hindawi, vol. 2020, pages 1-13, April.
    3. Elena N. Akimova & Alexander Yu. Bersenev & Artem A. Deikov & Konstantin S. Kobylkin & Anton V. Konygin & Ilya P. Mezentsev & Vladimir E. Misilov, 2021. "A Survey on Software Defect Prediction Using Deep Learning," Mathematics, MDPI, vol. 9(11), pages 1-14, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fatima Rafiq & Mazhar Javed Awan & Awais Yasin & Haitham Nobanee & Azlan Mohd Zain & Saeed Ali Bahaj, 2022. "Privacy Prevention of Big Data Applications: A Systematic Literature Review," SAGE Open, vol. 12(2), pages 21582440221, May.
    2. Chetna Monga & Deepali Gupta & Devendra Prasad & Sapna Juneja & Ghulam Muhammad & Zulfiqar Ali, 2022. "Sustainable Network by Enhancing Attribute-Based Selection Mechanism Using Lagrange Interpolation," Sustainability, MDPI, vol. 14(10), pages 1-15, May.
    3. Hail Jung & Jinsu Jeon & Dahui Choi & Jung-Ywn Park, 2021. "Application of Machine Learning Techniques in Injection Molding Quality Prediction: Implications on Sustainable Manufacturing Industry," Sustainability, MDPI, vol. 13(8), pages 1-16, April.
    4. Frank Cremer & Barry Sheehan & Michael Fortmann & Arash N. Kia & Martin Mullins & Finbarr Murphy & Stefan Materne, 2022. "Cyber risk and cybersecurity: a systematic review of data availability," The Geneva Papers on Risk and Insurance - Issues and Practice, Palgrave Macmillan;The Geneva Association, vol. 47(3), pages 698-736, July.
    5. Wojciech Szczepanik & Marcin Niemiec, 2022. "Heuristic Intrusion Detection Based on Traffic Flow Statistical Analysis," Energies, MDPI, vol. 15(11), pages 1-19, May.
    6. Feng Wu & Wanqiang Xu & Chaoran Lin & Yanwei Zhang, 2022. "Knowledge Trajectories on Public Crisis Management Research from Massive Literature Text Using Topic-Clustered Evolution Extraction," Mathematics, MDPI, vol. 10(12), pages 1-18, June.
    7. Pengyi Liao & Jun Yan & Jean Michel Sellier & Yongxuan Zhang, 2022. "TADA: A Transferable Domain-Adversarial Training for Smart Grid Intrusion Detection Based on Ensemble Divergence Metrics and Spatiotemporal Features," Energies, MDPI, vol. 15(23), pages 1-18, November.
    8. Nida Nasir & Afreen Kansal & Omar Alshaltone & Feras Barneih & Abdallah Shanableh & Mohammad Al-Shabi & Ahmed Al Shammaa, 2023. "Deep learning detection of types of water-bodies using optical variables and ensembling," LSE Research Online Documents on Economics 118724, London School of Economics and Political Science, LSE Library.
    9. Yuan Wang & Liping Yang & Jun Wu & Zisheng Song & Li Shi, 2022. "Mining Campus Big Data: Prediction of Career Choice Using Interpretable Machine Learning Method," Mathematics, MDPI, vol. 10(8), pages 1-18, April.
    10. Aimen Khalid & Gran Badshah & Nasir Ayub & Muhammad Shiraz & Mohamed Ghouse, 2023. "Software Defect Prediction Analysis Using Machine Learning Techniques," Sustainability, MDPI, vol. 15(6), pages 1-17, March.
