Paper Title
CLMLF: A Contrastive Learning and Multi-Layer Fusion Method for Multimodal Sentiment Detection
Paper Authors
Paper Abstract
Compared with unimodal data, multimodal data can provide more features to help the model analyze the sentiment of data. Previous research works rarely consider token-level feature fusion, and few works explore learning the common features related to sentiment in multimodal data to help the model fuse multimodal features. In this paper, we propose a Contrastive Learning and Multi-Layer Fusion (CLMLF) method for multimodal sentiment detection. Specifically, we first encode text and image to obtain hidden representations, and then use a multi-layer fusion module to align and fuse the token-level features of text and image. In addition to the sentiment analysis task, we also design two contrastive learning tasks, label-based contrastive learning and data-based contrastive learning, which help the model learn common features related to sentiment in multimodal data. Extensive experiments conducted on three publicly available multimodal datasets demonstrate the effectiveness of our approach for multimodal sentiment detection compared with existing methods. The code is available at https://github.com/Link-Li/CLMLF
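The abstract names two technical components: a multi-layer fusion module over token-level text and image features, and two auxiliary contrastive objectives. Below is a minimal PyTorch sketch of how these pieces could fit together. It is not the authors' implementation (see the linked repository for that): the dimensions, the use of a standard Transformer encoder for fusion, the mean pooling, and the concrete forms of the two losses (a supervised contrastive term for "label-based", an InfoNCE term over two views for "data-based") are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionSketch(nn.Module):
    """Token-level fusion sketch: project image features into the text
    embedding space, concatenate both token sequences, and run stacked
    self-attention so text and image tokens can attend to each other."""
    def __init__(self, text_dim=768, img_dim=2048, d_model=768,
                 n_layers=3, n_classes=3):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, d_model)  # align image feature dims
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, text_tokens, img_regions):
        # text_tokens: (B, Lt, text_dim), e.g. BERT hidden states
        # img_regions: (B, Li, img_dim), e.g. CNN grid / ViT patch features
        joint = torch.cat([text_tokens, self.img_proj(img_regions)], dim=1)
        fused = self.fusion(joint)
        pooled = fused.mean(dim=1)  # simple mean pooling (an assumed choice)
        return self.classifier(pooled), pooled

def label_based_contrastive(z, labels, tau=0.1):
    """Supervised contrastive term: pull together fused representations
    that share a sentiment label, push apart the rest."""
    z = F.normalize(z, dim=1)
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = (z @ z.t() / tau).masked_fill(self_mask, float('-inf'))
    pos = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    per_anchor = log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
    return -per_anchor[pos.any(1)].mean()  # skip anchors with no positive

def data_based_contrastive(z1, z2, tau=0.1):
    """InfoNCE between two views of the same examples: matching pairs on
    the diagonal are the positives, all other pairs are negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

if __name__ == "__main__":
    model = FusionSketch()
    text = torch.randn(4, 32, 768)   # toy stand-in for text encoder output
    img = torch.randn(4, 49, 2048)   # toy stand-in for 7x7 image grid features
    labels = torch.tensor([0, 1, 1, 2])
    logits, z = model(text, img)
    loss = (F.cross_entropy(logits, labels)
            + label_based_contrastive(z, labels)
            + data_based_contrastive(z, z))  # real use: two augmented views
    loss.backward()
    print(float(loss))
```

In this reading, the two contrastive terms act as regularizers on the pooled fused representation alongside the main cross-entropy objective; in practice the data-based term would compare representations of two augmented views of each text-image pair rather than the same tensor twice.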