Authors
- Yongwen Liu
(College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China)
- Dongqing Liu
(National Engineering Laboratory for Internet Medical Systems and Applications, Zhengzhou University, Zhengzhou 450052, China)
- Shaolin Zhu
(College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China)
Abstract
Current multimodal neural machine translation (MNMT) approaches primarily focus on ensuring consistency between visual annotations and the source language, often overlooking the broader aspect of multimodal coherence, including target–visual and bilingual–visual alignment. In this paper, we propose a novel approach that effectively leverages target–visual consistency (TVC) and bilingual–visual consistency (BiVC) to improve MNMT performance. Our method leverages visual annotations depicting concepts across bilingual parallel sentences to enhance multimodal coherence in translation. We exploit target–visual harmony by extracting contextual cues from visual annotations during auto-regressive decoding, incorporating vital future context to improve target sentence representation. Additionally, we introduce a consistency loss promoting semantic congruence between bilingual sentence pairs and their visual annotations, fostering a tighter integration of textual and visual modalities. Extensive experiments on diverse multimodal translation datasets empirically demonstrate our approach’s effectiveness. This visually aware, data-driven framework opens exciting opportunities for intelligent learning, adaptive control, and robust distributed optimization of multi-agent systems in uncertain, complex environments. By seamlessly fusing multimodal data and machine learning, our method paves the way for novel control paradigms capable of effectively handling the dynamics and constraints of real-world multi-agent applications.
Suggested Citation
Yongwen Liu & Dongqing Liu & Shaolin Zhu, 2024.
"Bilingual–Visual Consistency for Multimodal Neural Machine Translation,"
Mathematics, MDPI, vol. 12(15), pages 1-18, July.
Handle:
RePEc:gam:jmathe:v:12:y:2024:i:15:p:2361-:d:1445040