IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i12p2733-d1172802.html

Multitask Fine Tuning on Pretrained Language Model for Retrieval-Based Question Answering in Automotive Domain

Author

Listed:
  • Zhiyi Luo

    (School of Computer Science and Technology and the Key Laboratory of Intelligent Textile and Flexible Interconnection of Zhejiang Province, Zhejiang Sci-Tech University, Hangzhou 310018, China)

  • Sirui Yan

    (School of Computer Science and Technology and the Key Laboratory of Intelligent Textile and Flexible Interconnection of Zhejiang Province, Zhejiang Sci-Tech University, Hangzhou 310018, China)

  • Shuyun Luo

    (School of Computer Science and Technology and the Key Laboratory of Intelligent Textile and Flexible Interconnection of Zhejiang Province, Zhejiang Sci-Tech University, Hangzhou 310018, China)

Abstract

Retrieval-based question answering in the automotive domain requires a model to comprehend and articulate relevant domain knowledge, accurately understand user intent, and effectively match the required information. Typically, these systems employ an encoder–retriever architecture. However, existing encoders, which rely on pretrained language models, suffer from limited specialization, insufficient awareness of domain knowledge, and biases in user intent understanding. To overcome these limitations, this paper constructs a Chinese corpus specifically tailored for the automotive domain, comprising question–answer pairs, document collections, and multitask annotated data. Subsequently, a pretraining–multitask fine-tuning framework based on masked language models is introduced to integrate domain knowledge as well as enhance semantic representations, thereby yielding benefits for downstream applications. To evaluate system performance, an evaluation dataset is created using ChatGPT, and a novel retrieval task evaluation metric called mean linear window rank (MLWR) is proposed. Experimental results demonstrate that the proposed system (based on BERT-base) achieves accuracies of 77.5% and 84.75% for Hit@1 and Hit@3, respectively, in the automotive domain retrieval-based question-answering task. Additionally, the MLWR reaches 87.71%. Compared to a system utilizing a general encoder, the proposed multitask fine-tuning strategy shows improvements of 12.5%, 12.5%, and 28.16% for Hit@1, Hit@3, and MLWR, respectively. Furthermore, when compared to the best single-task fine-tuning strategy, the enhancements amount to 0.5%, 1.25%, and 0.95% for Hit@1, Hit@3, and MLWR, respectively.
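The Hit@k figures quoted above follow the standard definition used in retrieval evaluation: the fraction of queries for which the gold document appears among the top-k retrieved results. A minimal sketch of that computation is shown below (the document identifiers and rankings are illustrative; the paper's own MLWR metric is not reproduced here, since its definition is not given on this page):

```python
def hit_at_k(ranked_ids, gold_id, k):
    """Return 1 if the gold document appears in the top-k results, else 0."""
    return 1 if gold_id in ranked_ids[:k] else 0

def mean_hit_at_k(all_rankings, gold_ids, k):
    """Average Hit@k over a set of queries (reported as a percentage in the paper)."""
    hits = [hit_at_k(r, g, k) for r, g in zip(all_rankings, gold_ids)]
    return sum(hits) / len(hits)

# Toy example: two queries whose gold documents are "d1" and "d7".
rankings = [["d1", "d3", "d5"], ["d2", "d4", "d7"]]
golds = ["d1", "d7"]
print(mean_hit_at_k(rankings, golds, 1))  # only the first query hits at rank 1 -> 0.5
print(mean_hit_at_k(rankings, golds, 3))  # both gold documents are within the top 3 -> 1.0
```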

Suggested Citation

  • Zhiyi Luo & Sirui Yan & Shuyun Luo, 2023. "Multitask Fine Tuning on Pretrained Language Model for Retrieval-Based Question Answering in Automotive Domain," Mathematics, MDPI, vol. 11(12), pages 1-13, June.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:12:p:2733-:d:1172802
    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/12/2733/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/12/2733/
    Download Restriction: no
    ---><---

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.


    Cited by:

    1. Chong Lan & Yongsheng Wang & Chengze Wang & Shirong Song & Zheng Gong, 2023. "Application of ChatGPT-Based Digital Human in Animation Creation," Future Internet, MDPI, vol. 15(9), pages 1-18, September.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jmathe:v:11:y:2023:i:12:p:2733-:d:1172802. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help add them by using this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.