Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i17p2634-d1463462.html

A Semi-Supervised Active Learning Method for Structured Data Enhancement with Small Samples

Author

Listed:
  • Fangling Leng

    (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)

  • Fan Li

    (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)

  • Wei Lv

    (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)

  • Yubin Bao

    (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)

  • Xiaofeng Liu

    (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)

  • Tiancheng Zhang

    (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)

  • Ge Yu

    (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)

Abstract

To address the small size of structured datasets and the uneven distribution among classes in machine learning tasks, this paper proposes WAGAN, a supervised generation method for structured data, and SACS (Semi-supervised and Active-learning Cyclic Sampling), a cyclic sampling method based on semi-supervised active learning. The loss function and neural network structure are optimized, and the quantity and quality of the small sample set are enhanced. To make the generated pseudo-labels more reliable, a Semi-supervised Active learning Framework (SAF) is designed. This framework reassigns class labels to samples, which both improves the reliability of the generated samples and reduces the influence of noise and uncertainty on pseudo-label generation. To mine the diversity information of the generated samples, an uncertainty sampling strategy based on spatial overlap is designed. This strategy uses global and local sampling methods to calculate the information content of the generated samples. Experimental results show that the proposed method outperforms other data enhancement methods on three different datasets. Compared with the original data, the average F1-macro value of the classification model improves by 11.5%, 16.1%, and 19.6% relative to the compared methods.
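The abstract reports results as an average F1-macro value. As a minimal sketch of what that metric measures (the standard definition, not code from the paper; the function name `f1_macro` and the toy labels are illustrative assumptions), F1-macro is the unweighted mean of per-class F1 scores, so a minority class counts as much as a majority class:

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard macro-averaged F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced toy example: four samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1]
score = f1_macro(y_true, y_pred)  # per-class F1: 8/9 for class 0, 2/3 for class 1
```

On this toy input the accuracy is 5/6, while F1-macro is 7/9, because the missed minority-class sample lowers the class 1 score and both classes are weighted equally. That sensitivity is why the metric suits the imbalanced-class setting the abstract describes.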

Suggested Citation

  • Fangling Leng & Fan Li & Wei Lv & Yubin Bao & Xiaofeng Liu & Tiancheng Zhang & Ge Yu, 2024. "A Semi-Supervised Active Learning Method for Structured Data Enhancement with Small Samples," Mathematics, MDPI, vol. 12(17), pages 1-22, August.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:17:p:2634-:d:1463462

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/17/2634/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/17/2634/
    Download Restriction: no
