Author
Listed:
- Weiyu Zhong
(School of Electronics and Information Engineering, South China Normal University, Foshan 528225, China
These authors contributed equally to this work.)
- Zhengxuan Zhang
(School of Electronics and Information Engineering, South China Normal University, Foshan 528225, China
These authors contributed equally to this work.)
- Qiaofeng Wu
(School of Electronics and Information Engineering, South China Normal University, Foshan 528225, China)
- Yun Xue
(School of Electronics and Information Engineering, South China Normal University, Foshan 528225, China)
- Qianhua Cai
(School of Electronics and Information Engineering, South China Normal University, Foshan 528225, China)
Abstract
Sarcasm is a form of language in which the literal meaning diverges from the implied intention. Detecting sarcasm from text alone is difficult when the context is unclear, which is why multimodal information is introduced to aid detection. However, current approaches focus only on modeling text–image incongruity at the token level and treat that incongruity as the key to detection, ignoring the significance of the overall multimodal features and textual semantics during processing. Moreover, semantic information from other samples with a similar manner of expression facilitates sarcasm detection. In this work, a semantic enhancement framework is proposed to address image–text congruity by modeling textual and visual information at the multi-scale and multi-span token level. Textual semantics prove particularly effective in multimodal sarcasm detection. To bridge the cross-modal semantic gap, semantic enhancement is performed using a multiple contrastive learning strategy. Experiments were conducted on a benchmark dataset. Our model outperforms the latest baseline by 1.87% in F1-score and 1% in accuracy.
Suggested Citation
Weiyu Zhong & Zhengxuan Zhang & Qiaofeng Wu & Yun Xue & Qianhua Cai, 2024.
"A Semantic Enhancement Framework for Multimodal Sarcasm Detection,"
Mathematics, MDPI, vol. 12(2), pages 1-13, January.
Handle:
RePEc:gam:jmathe:v:12:y:2024:i:2:p:317-:d:1321771