SSLMM: Semi-Supervised Learning with Missing Modalities for Multimodal Sentiment Analysis

Information Fusion, Volume 120, Article 103058 · IF 15.5 · JCR Q1 (Computer Science, Artificial Intelligence) · CAS Region 1 (Computer Science) · Pub Date: 2025-03-07 · DOI: 10.1016/j.inffus.2025.103058
Yiyu Wang, Haifang Jian, Jian Zhuang, Huimin Guo, Yan Leng
{"title":"SSLMM: Semi-Supervised Learning with Missing Modalities for Multimodal Sentiment Analysis","authors":"Yiyu Wang ,&nbsp;Haifang Jian ,&nbsp;Jian Zhuang ,&nbsp;Huimin Guo ,&nbsp;Yan Leng","doi":"10.1016/j.inffus.2025.103058","DOIUrl":null,"url":null,"abstract":"<div><div>Multimodal Sentiment Analysis (MSA) integrates information from text, audio, and visuals to understand human emotions, but real-world applications face two challenges: (1) expensive annotation costs reduce the effectiveness of fully supervised methods, and (2) missing modality severely impact model robustness. While there are studies addressing these issues separately, few focus on solving both within a single framework. In real-world scenarios, these challenges often occur together, necessitating an algorithm that can handle both. To address this, we propose a Semi-Supervised Learning with Missing Modalities (SSLMM) framework. SSLMM combines self-supervised learning, alternating interaction information, semi-supervised learning, and modality reconstruction to tackle label scarcity and modality missing simultaneously. Firstly, SSLMM captures latent structural information through self-supervised pre-training. It then fine-tunes the model using semi-supervised learning and modality reconstruction to reduce dependence on labeled data and improve robustness to modality missing. The framework uses a graph-based architecture with an iterative message propagation mechanism to alternately propagate intra-modal and inter-modal messages, capturing emotional associations within and across modalities. Experiments on CMU-MOSI, CMU-MOSEI, and CH-SIMS demonstrate that under the condition where the proportion of labeled samples and the missing modality rate are both 0.5, SSLMM achieves binary classification (negative vs. positive) accuracies of 80.2%, 81.7%, and 77.1%, respectively, surpassing existing methods.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"120 ","pages":"Article 103058"},"PeriodicalIF":15.5000,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525001319","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Multimodal Sentiment Analysis (MSA) integrates information from text, audio, and visual signals to understand human emotions, but real-world applications face two challenges: (1) expensive annotation costs reduce the effectiveness of fully supervised methods, and (2) missing modalities severely impact model robustness. While prior studies address these issues separately, few solve both within a single framework. In real-world scenarios, these challenges often occur together, necessitating an algorithm that handles both. To address this, we propose a Semi-Supervised Learning with Missing Modalities (SSLMM) framework. SSLMM combines self-supervised learning, alternating interaction information, semi-supervised learning, and modality reconstruction to tackle label scarcity and missing modalities simultaneously. First, SSLMM captures latent structural information through self-supervised pre-training. It then fine-tunes the model with semi-supervised learning and modality reconstruction to reduce dependence on labeled data and improve robustness to missing modalities. The framework uses a graph-based architecture with an iterative message-propagation mechanism that alternately propagates intra-modal and inter-modal messages, capturing emotional associations within and across modalities. Experiments on CMU-MOSI, CMU-MOSEI, and CH-SIMS show that when the proportion of labeled samples and the missing-modality rate are both 0.5, SSLMM achieves binary classification (negative vs. positive) accuracies of 80.2%, 81.7%, and 77.1%, respectively, surpassing existing methods.
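The mechanism sketched in the abstract, alternating intra-modal and inter-modal message propagation over per-modality features, can be made concrete with a short sketch. The module below is a hypothetical reconstruction based only on the abstract: the class name `AlternatingPropagation`, the use of attention as the message function, the feature dimension, and the iteration count are all illustrative assumptions, not the authors' implementation.

```python
# A minimal, hypothetical sketch of the alternating intra-/inter-modal
# message propagation described in the abstract. Names, dimensions, the
# choice of attention as the message function, and the iteration count
# are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn


class AlternatingPropagation(nn.Module):
    """Alternates intra-modal and inter-modal message passing over
    per-modality sequence features (text, audio, visual)."""

    def __init__(self, dim: int = 128, n_iters: int = 3):
        super().__init__()
        self.n_iters = n_iters
        # Intra-modal mixer per modality: self-attention within one modality.
        self.intra = nn.ModuleDict({
            m: nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            for m in ("text", "audio", "visual")
        })
        # Inter-modal mixer: a modality's nodes attend to the other modalities.
        self.inter = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, feats: dict) -> dict:
        # feats maps modality name -> (batch, seq_len, dim); a missing
        # modality is simply absent from the dict.
        for _ in range(self.n_iters):
            # Intra-modal step: refine each modality with its own context.
            feats = {m: x + self.intra[m](x, x, x, need_weights=False)[0]
                     for m, x in feats.items()}
            # Inter-modal step: attend to the concatenation of the others.
            updated = {}
            for m, x in feats.items():
                others = [v for k, v in feats.items() if k != m]
                if others:  # nothing to fuse if only one modality survives
                    ctx = torch.cat(others, dim=1)
                    x = x + self.inter(x, ctx, ctx, need_weights=False)[0]
                updated[m] = x
            feats = updated
        return feats


# Usage with a missing visual modality:
if __name__ == "__main__":
    x = {"text": torch.randn(2, 10, 128), "audio": torch.randn(2, 20, 128)}
    fused = AlternatingPropagation()(x)
    print({m: tuple(t.shape) for m, t in fused.items()})
```

Because messages flow only between the modalities actually present, the same module accepts both full and partial inputs, which is consistent with the abstract's claim of robustness to missing modalities; a reconstruction head could then be attached to impute an absent modality's features.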
Source Journal
Information Fusion (Engineering & Technology - Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published per year: 161
Review time: 7.9 months
About the Journal: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.
Latest Articles in This Journal
PCFNet: Period–channel fusion network for multivariate time series forecasting — towards multi-period dependency modeling
Learning Spatio-Temporal Affine Representation Subspace for Video-based Person Re-Identification
From Unimodal to Flexible: A Survey of Generalized Biometric Systems
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
Consensus Learning Framework Boosting Co-clustering