Multichannel Multimodal Emotion Analysis of Cross-Modal Feedback Interactions Based on Knowledge Graph

IF 2.6 | Zone 4, Computer Science | JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Neural Processing Letters | Pub Date: 2024-05-29 | DOI: 10.1007/s11063-024-11641-w

Shaohua Dong, Xiaochao Fan, Xinchun Ma

Citations: 0

Abstract

Multimodal sentiment analysis is a subtask of sentiment analysis that currently attracts considerable attention. Previous work in multimodal sentiment analysis has focused on the representation and fusion of modalities, capturing the underlying semantic relationships between modalities by considering contextual information. While this approach is feasible for comments with simple context, more complex comments require the integration of external knowledge to obtain more accurate sentiment information. However, incorporating external knowledge into sentiment analysis to enhance information complementarity has not been thoroughly investigated. To address this, we propose a multichannel cross-modal feedback interaction model that incorporates a knowledge graph into multimodal sentiment analysis. The proposed model consists of two main components: a cross-modal feedback recurrent interaction module and an external knowledge module for capturing latent information. The cross-modal interaction module employs a self-feedback mechanism during network training: it extracts feature representations of each modality and uses these representations to mask the sensory inputs, so that the model performs feedback-based feature masking. The external knowledge module captures latent semantic representations in the textual data through knowledge graph embedding. Finally, a global feature fusion module integrates the multichannel multimodal information. On two publicly available datasets, our method performs well in terms of accuracy and F1 score compared with state-of-the-art models and several baselines.
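The abstract gives no equations, so the following is only a toy sketch of the three ideas it names: feedback-based feature masking, a knowledge-graph embedding lookup, and global fusion. Everything concrete here is an assumption made for illustration and is not taken from the paper: the sigmoid gate, the single-layer tanh encoders, averaging entity embeddings, fusing by concatenation, and all function names, weights and dimensions.

```python
import math
import random


def random_matrix(rows, cols, rng):
    """Small random weight matrix (a stand-in for trained parameters)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(W, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def encode(W, x):
    """Toy encoder: one linear layer followed by tanh (an assumption)."""
    return [math.tanh(z) for z in matvec(W, x)]


def cross_modal_feedback(x_target, x_source, W_target, W_source, W_mask, steps=2):
    """Mask the target modality's input with a gate computed from features.

    Each step encodes the currently masked target input, adds the source
    modality's representation as feedback, and maps the sum to a gate in
    (0, 1) per input feature. The gate rescales the ORIGINAL target input.
    """
    h_source = encode(W_source, x_source)
    masked = list(x_target)
    for _ in range(steps):
        h_target = encode(W_target, masked)
        feedback = [a + b for a, b in zip(h_target, h_source)]
        gate = [1.0 / (1.0 + math.exp(-z)) for z in matvec(W_mask, feedback)]
        masked = [g * x for g, x in zip(gate, x_target)]
    return encode(W_target, masked), masked


def knowledge_vector(tokens, entity_emb, dim):
    """Average the embeddings of tokens that match a knowledge-graph entity.

    `entity_emb` is a hypothetical lookup table of pre-trained entity
    embeddings; tokens without an entry are ignored.
    """
    hits = [entity_emb[t] for t in tokens if t in entity_emb]
    if not hits:
        return [0.0] * dim
    return [sum(vec[i] for vec in hits) / len(hits) for i in range(dim)]


def fuse(*vectors):
    """Global fusion reduced to plain concatenation (an assumption)."""
    return [value for vec in vectors for value in vec]


# Usage with made-up text and audio feature vectors.
rng = random.Random(0)
text_x, audio_x = [0.5, -1.0, 2.0, 0.1], [1.0, 0.3, -0.7]
W_text = random_matrix(3, 4, rng)
W_audio = random_matrix(3, 3, rng)
W_mask = random_matrix(4, 3, rng)
h_text, masked_text = cross_modal_feedback(text_x, audio_x, W_text, W_audio, W_mask)
kg = knowledge_vector(
    ["the", "movie", "dragged"],
    {"movie": [0.2, 0.4], "dragged": [-0.6, 0.0]},
    dim=2,
)
fused = fuse(h_text, encode(W_audio, audio_x), kg)
print(len(fused))
```

Because the gate lies strictly between 0 and 1, masking can only attenuate input features, never amplify them. A trained model would learn the weight matrices and would most likely use a richer fusion step than concatenation.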

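Source journal: Neural Processing Letters (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 4.90

Self-citation rate: 12.90%

Articles published: 392

Review time: 2.8 months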

About the journal: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters