Printed from https://ideas.repec.org/a/rsk/journ1/7960421.html

A method of classifying imbalanced credit data based on the AC-CTGAN hybrid sampling algorithm

Authors

  • Tinggui Chen
  • Hailian Gu
  • Zhiyu Yang
  • Jianjun Yang
  • Bing Wang

Abstract

The rapid growth of consumer credit services has heightened financial institutions’ need for stronger risk management as they strive to satisfy individuals’ varied consumption preferences. Identifying personal credit risk is crucial in financial risk management, so financial institutions need a systematic and effective credit risk identification framework to reduce the likelihood of credit defaults. To address the class imbalance of credit data, this paper works at the data level and proposes an adaptive cluster mixed sampling method based on conditional tabular generative adversarial networks (AC-CTGAN). The method first uses the edited nearest neighbors (ENN) algorithm for preliminary denoising of the original credit data, then employs an improved K-means algorithm to partition the minority samples into multiple subclusters. The local density of each subcluster is calculated, and the oversampling weight of each subcluster is determined adaptively from that density. Finally, minority samples are generated via the CTGAN, and the decision boundary is clarified via the Tomek links algorithm. Comparative experimental results show that the minority class samples generated by the AC-CTGAN algorithm realistically reflect the distribution of the original data, reduce class overlap, limit the introduction of new noisy data and increase sample diversity. The potential within-class imbalance of credit data is also somewhat alleviated. Risk-identification models trained on credit data processed by the AC-CTGAN algorithm generalize better than models trained on data processed by the synthetic minority oversampling technique (SMOTE), SMOTE variants or the original CTGAN.
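
The adaptive weighting step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's density formula or say whether denser or sparser subclusters receive more weight, so everything specific here is an assumption made for illustration: the density proxy (points divided by mean pairwise distance), the inverse-density weighting and the function names `local_density`, `oversampling_weights` and `allocate_samples`. The ENN, K-means, CTGAN and Tomek links stages are not shown; the toy subclusters stand in for the output of the clustering stage.

```python
import math
from itertools import combinations


def local_density(cluster):
    """Density proxy (assumed, not the paper's formula):
    number of points divided by the mean pairwise distance."""
    if len(cluster) < 2:
        return 1.0  # arbitrary fallback for a singleton subcluster
    dists = [math.dist(a, b) for a, b in combinations(cluster, 2)]
    mean_dist = sum(dists) / len(dists)
    return len(cluster) / max(mean_dist, 1e-12)  # guard against duplicates


def oversampling_weights(clusters):
    """Weights proportional to inverse density (assumed direction:
    sparser subclusters get more synthetic samples), summing to 1."""
    inverse = [1.0 / local_density(c) for c in clusters]
    total = sum(inverse)
    return [v / total for v in inverse]


def allocate_samples(clusters, n_new):
    """Split n_new synthetic samples across non-empty subclusters,
    using largest-remainder rounding so the counts sum to n_new."""
    raw = [w * n_new for w in oversampling_weights(clusters)]
    counts = [int(r) for r in raw]
    # Hand out the samples lost to truncation, largest fraction first.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[: n_new - sum(counts)]:
        counts[i] += 1
    return counts


# Toy minority-class subclusters: one tight, one spread out.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(5.0, 5.0), (7.0, 5.0), (5.0, 7.0), (7.0, 7.0)]
counts = allocate_samples([dense, sparse], n_new=100)
print(counts)
```

Under these assumptions, each count would be the number of rows requested from a generator fitted on the corresponding subcluster. If the paper instead weights denser subclusters more heavily, only `oversampling_weights` would need to change.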

Suggested Citation

Handle: RePEc:rsk:journ1:7960421

Download full text from publisher

File URL: https://www.risk.net/system/files/digital_asset/2024-11/jcr_Wang_online_early.pdf
Download Restriction: no

For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Thomas Paine. General contact details of provider: https://www.risk.net/journal-of-credit-risk.
