Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i22p3489-d1516608.html

Universal Knowledge Graph Embedding Framework Based on High-Quality Negative Sampling and Weighting

Authors
  • Pengfei Zhang

    (Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)

  • Huang Peng

    (Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)

  • Yang Fang

    (College of Information and Communication, National University of Defense Technology, Changsha 410073, China)

  • Zongqiang Yang

    (Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)

  • Yanli Hu

    (Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)

  • Zhen Tan

    (Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)

  • Weidong Xiao

    (Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China)

Abstract

The traditional model training approach based on negative sampling randomly samples a portion of negative samples for training, which can easily overlook important negative samples and adversely affect the training of knowledge graph embedding models. Some researchers have explored non-sampling model training frameworks that use all unobserved triples as negative samples to improve model training performance. However, both training methods inevitably introduce false negative samples and easy-to-separate negative samples that are far from the model’s decision boundary, and they do not consider the adverse effects of long-tail entities and relations during training, thus limiting the improvement of model training performance. To address this issue, we propose a universal knowledge graph embedding framework based on high-quality negative sampling and weighting, called HNSW-KGE. First, we conduct pre-training based on the NS-KGE non-sampling training framework to quickly obtain an initial set of relatively high-quality embedding vector representations for all entities and relations. Second, we design a candidate negative sample set construction strategy that samples a certain number of negative samples that are neither false negatives nor easy-to-separate negatives for all positive triples, based on the embedding vectors obtained from pre-training. This ensures the provision of high-quality negative samples for model training. Finally, we apply weighting to the loss function based on the frequency of the entities and relations appearing in the triples to mitigate the adverse effects of long-tail entities and relations on model training. 
Experiments on the benchmark datasets FB15K237 and WN18RR with several knowledge graph embedding models show that HNSW-KGE achieves better training performance and is versatile, making it applicable to various types of knowledge graph embedding models.
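The abstract does not give the selection rule or the weighting formula, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes a TransE-style score computed from pre-trained embeddings, tail corruption only, a fixed "skip the most plausible candidates, keep the next few" band as the filter, and an inverse-log-frequency weight that up-weights rare entities and relations. The function names, the parameters skip_top and keep, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def transe_score(h, r, t):
    """Assumed scoring function: negative L2 distance ||h + r - t||.
    A higher score means the triple looks more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def select_negatives(triple, ent_emb, rel_emb, known, skip_top=1, keep=2):
    """Choose tail-corrupted negatives from the middle of the score ranking.

    Candidates already observed in the graph are excluded. The `skip_top`
    most plausible candidates are dropped as suspected false negatives,
    the next `keep` are returned as hard negatives, and everything ranked
    lower is treated as easy to separate and ignored.
    """
    h, r, t = triple
    ranked = sorted(
        (
            (transe_score(ent_emb[h], rel_emb[r], ent_emb[e]), e)
            for e in ent_emb
            if e != t and (h, r, e) not in known
        ),
        reverse=True,  # most plausible corruption first
    )
    return [(h, r, e) for _, e in ranked[skip_top:skip_top + keep]]


def triple_weights(triples):
    """Assumed weighting: inverse log of the summed head, relation and tail
    frequencies, rescaled so the weights average to 1."""
    ent_freq, rel_freq = Counter(), Counter()
    for h, r, t in triples:
        ent_freq[h] += 1
        ent_freq[t] += 1
        rel_freq[r] += 1
    raw = [
        1.0 / math.log(1 + ent_freq[h] + rel_freq[r] + ent_freq[t])
        for h, r, t in triples
    ]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


# Toy "pre-trained" embeddings (hypothetical values, 2 dimensions).
ent_emb = {
    "a": [-2.0, 0.0],
    "b": [1.0, 0.0],
    "c": [1.1, 0.0],
    "d": [1.5, 0.0],
    "e": [3.0, 0.0],
    "f": [9.0, 0.0],
}
rel_emb = {"r": [3.0, 0.0]}
known = {("a", "r", "b")}

negatives = select_negatives(("a", "r", "b"), ent_emb, rel_emb, known)
weights = triple_weights([("a", "r", "b"), ("a", "r", "c"), ("d", "s", "e")])
print(negatives)
print(weights)
```

In this toy setup the candidate closest to the positive tail ("c") is skipped as a suspected false negative and the far-away candidate ("f") is never selected. The third triple, whose entities and relation each occur only once, receives the largest weight. Whether the published framework raises or lowers the weight of long-tail items, and how it sets the filtering thresholds, has to be checked against the full text.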

Suggested Citation

  • Pengfei Zhang & Huang Peng & Yang Fang & Zongqiang Yang & Yanli Hu & Zhen Tan & Weidong Xiao, 2024. "Universal Knowledge Graph Embedding Framework Based on High-Quality Negative Sampling and Weighting," Mathematics, MDPI, vol. 12(22), pages 1-22, November.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:22:p:3489-:d:1516608

Download full text from publisher

  • File URL: https://www.mdpi.com/2227-7390/12/22/3489/pdf
    Download Restriction: no

  • File URL: https://www.mdpi.com/2227-7390/12/22/3489/
    Download Restriction: no