
Fraud Detection Using Neural Networks: A Case Study of Income Tax

Author

Listed:
  • Belle Fille Murorunkwere

    (African Center of Excellence in Data Science, University of Rwanda, KK 737 Street, Gikondo, Kigali P.O. Box 4285, Rwanda)

  • Origene Tuyishimire

    (African Institute for Mathematical Sciences, KN 3 Street, Remera, Kigali P.O. Box 7150, Rwanda)

  • Dominique Haughton

    (Department of Mathematical Sciences and Global Studies, Bentley University, Waltham, MA 02452-4705, USA
    Department of Mathematical Sciences and Global Studies, Université Paris 1 (SAMM), 75634 Paris, France
    Department of Mathematical Sciences and Global Studies, Université Toulouse 1 (TSE-R), 31042 Toulouse, France)

  • Joseph Nzabanita

    (Department of Mathematics, College of Science and Technology, University of Rwanda, KN 67 Street, Nyarugenge, Kigali P.O. Box 3900, Rwanda)

Abstract

Detecting tax fraud is a top priority for practically all tax agencies, which seek to maximize revenue and maintain a high level of compliance. Data mining, machine learning, and other approaches such as traditional random auditing have been used in many studies to deal with tax fraud. The goal of this study is to use Artificial Neural Networks to identify factors associated with tax fraud in income tax data. The results show that Artificial Neural Networks perform well in identifying tax fraud, with an accuracy of 92%, a precision of 85%, a recall score of 99%, and an AUC-ROC of 95%. Whether a business is cross-border or domestic, the period of the business, and whether it is a small or a corporate business are among the factors identified by the model as more relevant to income tax fraud detection. These findings are consistent, in terms of the features related to tax fraud, with closely related previous work that covered all tax types together using different machine learning models. To the best of our knowledge, this study is the first to use Artificial Neural Networks to detect income tax fraud in Rwanda by comparing different parameters such as layers, batch size, and epochs and selecting the configuration that gives the best accuracy. In this study, a simple model with no hidden layers and a softsign activation function performs best. The evidence from this study will help auditors understand the factors that contribute to income tax fraud, which will reduce audit time and cost and help recover money forgone to income tax fraud.
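
The kind of model the abstract describes (no hidden layers, softsign activation) and the reported accuracy, precision, and recall can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy data, the labels in {-1, +1}, the squared-error loss, the learning rate, and every function name below are assumptions chosen only to keep the example self-contained.

```python
import random

def softsign(z):
    # Softsign activation: maps any real z into the open interval (-1, 1).
    return z / (1.0 + abs(z))

def softsign_grad(z):
    # Derivative of softsign with respect to z.
    return 1.0 / (1.0 + abs(z)) ** 2

def linear(weights, bias, x):
    # Weighted sum of the inputs; with no hidden layers this is the whole network.
    return bias + sum(w * xi for w, xi in zip(weights, x))

def train(X, y, epochs=200, lr=0.1, seed=0):
    # Stochastic gradient descent on squared error (an assumption, not the
    # paper's stated loss). Labels are -1 (no fraud) and +1 (fraud).
    rng = random.Random(seed)
    weights = [0.0] * len(X[0])
    bias = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = linear(weights, bias, X[i])
            g = (softsign(z) - y[i]) * softsign_grad(z)
            weights = [w - lr * g * xi for w, xi in zip(weights, X[i])]
            bias -= lr * g
    return weights, bias

def predict_label(weights, bias, x):
    # Threshold the softsign output at 0 to obtain a class label.
    return 1 if softsign(linear(weights, bias, x)) > 0 else -1

def classification_metrics(y_true, y_pred):
    # Accuracy, precision, and recall, treating +1 as the positive (fraud) class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical toy data: two numeric features per taxpayer, invented for illustration.
X_toy = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.8, 0.9], [0.7, 0.7]]
y_toy = [-1, -1, -1, 1, 1, 1]

weights, bias = train(X_toy, y_toy)
predictions = [predict_label(weights, bias, x) for x in X_toy]
print(classification_metrics(y_toy, predictions))
```

In this setting a high recall with a lower precision, as in the reported 99% and 85%, would mean that nearly all fraudulent cases are flagged while some flagged cases turn out to be compliant.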

Suggested Citation

  • Belle Fille Murorunkwere & Origene Tuyishimire & Dominique Haughton & Joseph Nzabanita, 2022. "Fraud Detection Using Neural Networks: A Case Study of Income Tax," Future Internet, MDPI, vol. 14(6), pages 1-14, May.
  • Handle: RePEc:gam:jftint:v:14:y:2022:i:6:p:168-:d:828622

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/14/6/168/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/14/6/168/
    Download Restriction: no

    References listed on IDEAS

    1. César Pérez López & María Jesús Delgado Rodríguez & Sonia de Lucas Santos, 2019. "Tax Fraud Detection through Neural Networks: An Application Using a Sample of Personal Income Taxpayers," Future Internet, MDPI, vol. 11(4), pages 1-13, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kudzanai Charity Muchuchuti, 2024. "An Ensemble Machine Learning Model to Detect Tax Fraud: Conceptual Framework," International Journal of Research and Innovation in Social Science, International Journal of Research and Innovation in Social Science (IJRISS), vol. 8(6), pages 2276-2282, June.
    2. Camino González Vasco & María Jesús Delgado Rodríguez & Sonia de Lucas Santos, 2021. "Segmentation of Potential Fraud Taxpayers and Characterization in Personal Income Tax Using Data Mining Techniques," Hacienda Pública Española / Review of Public Economics, IEF, vol. 239(4), pages 127-157, November.
    3. Carmen De-Pablos-Heredero, 2019. "Future Intelligent Systems and Networks," Future Internet, MDPI, vol. 11(6), pages 1-2, June.
    4. César Pérez López & María Jesús Delgado Rodríguez & Sonia de Lucas Santos, 2023. "Modelización de los factores que afectan al fraude fiscal con técnicas de minería de datos: aplicación al Impuesto de la Renta en España," Hacienda Pública Española / Review of Public Economics, IEF, vol. 246(3), pages 137-164, September.
