Printed from https://ideas.repec.org/a/gam/jmathe/v13y2025i3p392-d1576575.html

FldtMatch: Improving Unbalanced Data Classification via Deep Semi-Supervised Learning with Self-Adaptive Dynamic Threshold

Author

Listed:
  • Xin Wu

    (Department of Artificial Intelligence and Data Science, Guangzhou Xinhua University, 248 Yanjiangxi Road, Machong Town, Dongguan 523133, China)

  • Jingjing Xu

    (Department of Artificial Intelligence and Data Science, Guangzhou Xinhua University, 248 Yanjiangxi Road, Machong Town, Dongguan 523133, China)

  • Kuan Li

    (School of Computer Science and Technology, Dongguan University of Technology, Dongguan 523808, China)

  • Jianping Yin

    (School of Computer Science and Technology, Dongguan University of Technology, Dongguan 523808, China)

  • Jian Xiong

    (Department of Artificial Intelligence and Data Science, Guangzhou Xinhua University, 248 Yanjiangxi Road, Machong Town, Dongguan 523133, China)

Abstract

Among the many approaches to deep semi-supervised learning (DSSL), holistic methods combine ideas from other approaches, such as consistency regularization and pseudo-labeling, with great success. These methods typically introduce a threshold to make use of unlabeled data: if the model's highest predicted value for an unlabeled sample exceeds the threshold, the associated class is assigned as that sample's pseudo-label. However, current methods use fixed or dynamic thresholds that disregard the varying learning difficulty across categories in unbalanced datasets. To overcome this issue, we first design Cumulative Effective Labeling (CEL) to reflect the learning difficulty of a particular class. Unlike previous methods, CEL uses both effective pseudo-labels and ground truth, which together influence the model's capacity to acquire category knowledge. Building on CEL, we propose a simple but effective way to compute the threshold, the Self-adaptive Dynamic Threshold (SDT). SDT requires only a single hyperparameter to adapt to various scenarios, eliminating the need for a separate threshold modification approach for each case. It uses a mapping function to address the way differing learning difficulty across categories in an unbalanced image dataset adversely affects dynamic thresholding. Finally, we propose FldtMatch, a deep semi-supervised method built on SDT. Through theoretical analysis and extensive experiments, we show that FldtMatch can overcome the negative impact of unbalanced data. Regardless of the choice of backbone network, our method achieves the best results on multiple datasets. The maximum improvement in macro F1-score is about 5.6% on DFUC2021 and 2.2% on ISIC2018.
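The thresholding step described in the abstract can be illustrated with a minimal Python sketch. It shows only the generic rule (assign a pseudo-label when the top predicted value exceeds a threshold) and contrasts one fixed threshold with per-class thresholds. The function name `pseudo_labels`, the sample probabilities, and the threshold values are all hypothetical and hand-picked for illustration. The abstract does not give the CEL or SDT formulas, so this is not the authors' method.

```python
from typing import List, Optional, Sequence


def pseudo_labels(
    probs: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> List[Optional[int]]:
    """Return a pseudo-label per sample, or None if the sample is rejected.

    A sample gets the index of its top-scoring class only when that score
    exceeds the threshold of that class (hypothetical helper, not FldtMatch).
    """
    labels: List[Optional[int]] = []
    for p in probs:
        top = max(range(len(p)), key=lambda k: p[k])  # index of the largest score
        labels.append(top if p[top] > thresholds[top] else None)
    return labels


# Illustrative softmax outputs for three unlabeled samples over three classes.
probs = [
    [0.97, 0.02, 0.01],
    [0.30, 0.62, 0.08],
    [0.40, 0.35, 0.25],
]

# One fixed threshold shared by all classes (assumed value).
fixed = pseudo_labels(probs, [0.95, 0.95, 0.95])

# Hand-picked per-class thresholds: lower for classes 1 and 2, standing in
# for classes that are harder to learn (assumed values, not computed by SDT).
per_class = pseudo_labels(probs, [0.95, 0.60, 0.60])

print(fixed)      # [0, None, None]
print(per_class)  # [0, 1, None]
```

With the shared threshold, the second sample is discarded even though class 1 is its clear top prediction. A lower threshold for that class admits it, which is the kind of per-category adjustment the abstract motivates.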

Suggested Citation

  • Xin Wu & Jingjing Xu & Kuan Li & Jianping Yin & Jian Xiong, 2025. "FldtMatch: Improving Unbalanced Data Classification via Deep Semi-Supervised Learning with Self-Adaptive Dynamic Threshold," Mathematics, MDPI, vol. 13(3), pages 1-21, January.
  • Handle: RePEc:gam:jmathe:v:13:y:2025:i:3:p:392-:d:1576575

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/13/3/392/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/13/3/392/
    Download Restriction: no