Authors Listed:
- Weiwei Yuan
(College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China)
- Wanxia Yang
(College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China)
- Liang He
(Department of Electronic Engineering, Tsinghua University, Beijing 100084, China;
Xinjiang Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi 830017, China;
College of Computer Science and Technology, Xinjiang University, Urumqi 830017, China)
- Tingwei Zhang
(College of Plant Protection, Gansu Agricultural University, Lanzhou 730070, China)
- Yan Hao
(College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China)
- Jing Lu
(College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China)
- Wenbo Yan
(College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China)
Abstract
The extraction of entities and relationships is a crucial task in natural language processing (NLP). However, existing models for this task often rely heavily on large amounts of labeled data, which is costly in time and labor and hinders the development of downstream tasks. To improve the ability to learn from small samples, this paper proposes an entity and relationship extraction method based on the Universal Information Extraction (UIE) model. The core of the approach is a specialized prompt template and schema for cotton pests and diseases, used as one of the main inputs to the UIE; fine-tuning guided by this input enables the model to subdivide the entities and relationships in the corpus. As a result, the UIE-base model achieves an accuracy of 86.5% with only 40 labeled training samples, which addresses the problem that existing models require a large amount of manually labeled training data for knowledge extraction. To verify the generalization ability of the proposed model, it is compared experimentally with four classical models, including the Bert-BiLSTM-CRF. The results show that its F1 value is 1.4% higher than that of the Bert-BiLSTM-CRF model on the self-built cotton data set and 2.5% higher on the public data set. Further experiments verify that the UIE-base model has the best small-sample learning performance when the number of samples is 40. This paper provides an effective method for small-sample knowledge extraction.
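The abstract compares models by F1 value. As a minimal sketch of how such a score is commonly computed for relation extraction, the Python below treats the gold and predicted outputs as sets of (head entity, relation, tail entity) triples and counts exact matches. The function name, the triple format, and the example triples are illustrative assumptions; they are not taken from the paper, whose exact matching criteria are not stated in this record.

```python
from typing import Iterable, Tuple

# Assumed representation: (head entity, relation, tail entity)
Triple = Tuple[str, str, str]


def precision_recall_f1(
    gold: Iterable[Triple], predicted: Iterable[Triple]
) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over sets of triples."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example triples, not drawn from the paper's data sets.
gold = [
    ("cotton aphid", "damages", "cotton leaf"),
    ("verticillium wilt", "symptom", "leaf yellowing"),
    ("cotton bollworm", "controlled_by", "Bt cotton"),
]
predicted = [
    ("cotton aphid", "damages", "cotton leaf"),
    ("verticillium wilt", "symptom", "leaf yellowing"),
    ("cotton bollworm", "damages", "cotton leaf"),
    ("cotton aphid", "controlled_by", "ladybird"),
]

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here two of the four predicted triples match the three gold triples, so precision is 2/4, recall is 2/3, and F1 is their harmonic mean. Exact matching is the strictest choice; partial-match variants would give different numbers.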
Suggested Citation
Weiwei Yuan & Wanxia Yang & Liang He & Tingwei Zhang & Yan Hao & Jing Lu & Wenbo Yan, 2024.
"Research on Entity and Relationship Extraction with Small Training Samples for Cotton Pests and Diseases,"
Agriculture, MDPI, vol. 14(3), pages 1-16, March.
Handle:
RePEc:gam:jagris:v:14:y:2024:i:3:p:457-:d:1354990