Author
Listed:
- Sio-Kei Im
(Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence, Macao Polytechnic University, Macau, China)
- Ka-Hou Chan
(Faculty of Applied Sciences, Macao Polytechnic University, Macau, China
Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence, Macao Polytechnic University, Macau, China)
Abstract
The attention mechanism performs well for the Neural Machine Translation (NMT) task, but heavily depends on the context vectors generated by the attention network to predict target words. This reliance raises the issue of long-term dependencies. Indeed, it is very common to combine predicates with postpositions in sentences, and the same predicate may have different meanings when combined with different postpositions. This usually poses an additional challenge to the NMT study. In this work, we observe that the embedding vectors of different target tokens can be classified by part-of-speech, thus we analyze the Natural Language Processing (NLP) related Content-Adaptive Recurrent Unit (CARU) unit and apply it to our attention model (CAAtt) and embedding layer (CAEmbed). By encoding the source sentence with the current decoded feature through the CARU, CAAtt is capable of achieving translation content-adaptive representations, which attention weights are contributed and enhanced by our proposed L 1 exp N x normalization. Furthermore, CAEmbed aims to alleviate long-term dependencies in the target language through partial recurrent design, performing the feature extraction in a local perspective. Experiments on the WMT14, WMT17, and Multi30k translation tasks show that the proposed model achieves improvements in BLEU scores and enhancement of convergence over the attention-based plain NMT model. We also investigate the attention weights generated by the proposed approaches, which indicate that refinement over the different combinations of adposition can lead to different interpretations. Specifically, this work provides local attention to some specific phrases translated in our experiment. The results demonstrate that our approach is effective in improving performance and achieving a more reasonable attention distribution compared to the state-of-the-art models.
Suggested Citation
Sio-Kei Im & Ka-Hou Chan, 2024.
"Neural Machine Translation with CARU-Embedding Layer and CARU-Gated Attention Layer,"
Mathematics, MDPI, vol. 12(7), pages 1-19, March.
Handle:
RePEc:gam:jmathe:v:12:y:2024:i:7:p:997-:d:1364975
Download full text from publisher
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jmathe:v:12:y:2024:i:7:p:997-:d:1364975. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help adding them by using this form .
If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .
Please note that corrections may take a couple of weeks to filter through
the various RePEc services.