
Adaptive Bi-Encoder Model Selection and Ensemble for Text Classification

Author

Listed:
  • Youngki Park

    (Department of Computer Education, Chuncheon National University of Education, Chuncheon 24328, Republic of Korea)

  • Youhyun Shin

    (Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea)

Abstract

Can bi-encoders, without additional fine-tuning, achieve performance comparable to fine-tuned BERT models in classification tasks? To answer this question, we present a simple yet effective approach to text classification using bi-encoders without the need for fine-tuning. Our main observation is that state-of-the-art bi-encoders exhibit varying performance across different datasets. Our proposed approaches therefore prepare multiple bi-encoders and, when a new dataset is provided, select and ensemble the most appropriate ones for that dataset. Experimental results show that, for text classification tasks on subsets of the AG News, SMS Spam Collection, Stanford Sentiment Treebank v2, and TREC Question Classification datasets, the proposed approaches achieve performance comparable to fine-tuned BERT-Base, DistilBERT-Base, ALBERT-Base, and RoBERTa-Base. For instance, using the well-known bi-encoder model all-MiniLM-L12-v2 without additional optimization yielded an average accuracy of 77.84%; this improved to 89.49% with the proposed adaptive selection and ensemble techniques, and further increased to 91.96% when combined with the RoBERTa-Base model. We believe this approach will be particularly useful in fields such as K-12 AI programming education, where pre-trained models are applied to small datasets without fine-tuning.
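The sketch below illustrates the general idea described in the abstract, not the authors' released implementation: several pre-trained Sentence-Transformers bi-encoders classify texts by cosine similarity to label descriptions, the best-performing encoders are chosen on a small labelled validation split, and their similarity scores are averaged. The model names are real checkpoints; the selection criterion, label-description labelling scheme, and helper functions (similarity_scores, select_and_ensemble) are illustrative assumptions.

    # Minimal sketch: zero-fine-tuning classification with bi-encoders,
    # plus adaptive model selection and ensembling (assumed design, see above).
    import numpy as np
    from sentence_transformers import SentenceTransformer

    CANDIDATE_MODELS = [
        "all-MiniLM-L12-v2",
        "all-mpnet-base-v2",
        "paraphrase-MiniLM-L6-v2",
    ]

    def similarity_scores(model, texts, label_descriptions):
        """Cosine similarity between each text and each label description."""
        text_emb = model.encode(texts, normalize_embeddings=True)
        label_emb = model.encode(label_descriptions, normalize_embeddings=True)
        return text_emb @ label_emb.T          # shape: (num_texts, num_labels)

    def accuracy(scores, gold):
        return float(np.mean(scores.argmax(axis=1) == np.asarray(gold)))

    def select_and_ensemble(val_texts, val_labels, test_texts,
                            label_descriptions, top_k=2):
        """Rank candidate bi-encoders on the validation split, keep the top_k,
        and average their similarity scores on the test texts."""
        ranked = []
        for name in CANDIDATE_MODELS:
            model = SentenceTransformer(name)
            val_scores = similarity_scores(model, val_texts, label_descriptions)
            ranked.append((accuracy(val_scores, val_labels), name, model))
        ranked.sort(reverse=True, key=lambda t: t[0])

        ensemble = np.zeros((len(test_texts), len(label_descriptions)))
        for _, _, model in ranked[:top_k]:
            ensemble += similarity_scores(model, test_texts, label_descriptions)
        return ensemble.argmax(axis=1)         # predicted label index per text

Under these assumptions, no gradient updates are performed at any point: adapting to a new dataset only requires encoding a small validation set with each candidate bi-encoder, which is why the approach suits settings such as K-12 AI education where fine-tuning is impractical.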

Suggested Citation

  • Youngki Park & Youhyun Shin, 2024. "Adaptive Bi-Encoder Model Selection and Ensemble for Text Classification," Mathematics, MDPI, vol. 12(19), pages 1-14, October.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:19:p:3090-:d:1491272

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/19/3090/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/19/3090/
    Download Restriction: no
