PSPN: Pseudo-Siamese Pyramid Network for multimodal emotion analysis

IF 3.1 | Zone 3 (Engineering and Technology) | JCR Q2, Neurosciences | Cognitive Neurodynamics | Pub Date: 2024-05-28 | DOI: 10.1007/s11571-024-10123-y

Yanyan Yin, Wanzeng Kong, Jiajia Tang, Jinghao Li, Fabio Babiloni


Citations: 0

Abstract

Emotion recognition plays an important role in human life and healthcare. EEG has been extensively studied as an objective indicator of intense emotions. However, existing methods do not sufficiently analyze shallow and deep EEG features. In addition, human emotions are complex and variable, which makes them difficult to represent comprehensively with a single-modal signal. Eye-related signals, which are associated with gaze tracking and eye movement detection, provide various forms of supplementary information for multimodal emotion analysis. We therefore propose a Pseudo-Siamese Pyramid Network (PSPN) for multimodal emotion analysis. The PSPN model employs a Depthwise Separable Convolutional Pyramid (DSCP) to extract and integrate intrinsic emotional features at various levels and scales from EEG signals. In parallel, a fully connected subnetwork extracts external emotional features from eye-related signals. Finally, a Pseudo-Siamese network that integrates a flexible cross-modal dual-branch subnetwork jointly uses the EEG emotional features and the eye-related behavioral features, achieving consistency and complementarity in multimodal emotion recognition. For evaluation, we conducted experiments on the public DEAP and SEED-IV datasets. The experimental results demonstrate that multimodal fusion significantly improves emotion recognition accuracy compared with single-modal approaches. Our PSPN model achieved best accuracies of 96.02% and 96.45% on the valence and arousal dimensions of the DEAP dataset, respectively, and 77.81% on the SEED-IV dataset. Our code is available at https://github.com/Yinyanyan003/PSPN.git.
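The abstract names depthwise separable convolution as the building block of the DSCP but gives no layer details. As a rough illustration of that general technique only (not the authors' implementation), the sketch below factorizes a 1-D convolution into a per-channel (depthwise) step followed by a 1x1 (pointwise) channel-mixing step, and compares parameter counts with a standard convolution. All function names, shapes, kernel sizes, and weights here are hypothetical choices made for the example.

```python
# Hypothetical sketch of a 1-D depthwise separable convolution in plain Python.
# Shapes, kernel size, and weights are illustrative assumptions, not PSPN's.

def depthwise_conv1d(x, dw_kernels):
    """x: list of C channels, each a list of T samples.
    dw_kernels: one length-k kernel per channel (no cross-channel mixing).
    Uses the cross-correlation convention common in deep learning and
    returns C channels of length T - k + 1 (no padding)."""
    out = []
    for channel, kernel in zip(x, dw_kernels):
        k = len(kernel)
        out.append([
            sum(channel[t + i] * kernel[i] for i in range(k))
            for t in range(len(channel) - k + 1)
        ])
    return out


def pointwise_conv1d(x, pw_weights):
    """pw_weights: C_out rows of C_in weights, i.e. a 1x1 convolution
    that mixes channels independently at every time step."""
    steps = len(x[0])
    return [
        [sum(w * x[c][t] for c, w in enumerate(row)) for t in range(steps)]
        for row in pw_weights
    ]


def depthwise_separable_conv1d(x, dw_kernels, pw_weights):
    """Depthwise step followed by pointwise step."""
    return pointwise_conv1d(depthwise_conv1d(x, dw_kernels), pw_weights)


def param_counts(c_in, c_out, k):
    """Weights (biases excluded) for a standard vs. a separable 1-D conv."""
    standard = c_in * c_out * k
    separable = c_in * k + c_in * c_out
    return standard, separable


# Toy input: 2 channels, 4 time steps, kernel size 2, 3 output channels.
x = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
dw = [[1.0, 1.0], [1.0, -1.0]]
pw = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
y = depthwise_separable_conv1d(x, dw, pw)
print(y)
print(param_counts(32, 64, 3))
```

For 32 input channels, 64 output channels, and kernel size 3 (arbitrary values for this example), the separable form needs 2,144 weights against 6,144 for a standard convolution, which is the usual motivation for stacking such layers at several scales.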



Source journal

Cognitive Neurodynamics (Medicine: Neurosciences)

CiteScore: 6.90
Self-citation rate: 18.90%
Articles published: 140
Review time: 12 months

Journal description: Cognitive Neurodynamics provides a unique forum of communication and cooperation for scientists and engineers working in the field of cognitive neurodynamics, intelligent science and applications, bridging the gap between theory and application, without any preference for pure theoretical, experimental or computational models. The emphasis is on publishing original models of cognitive neurodynamics, novel computational theories and experimental results. In particular, intelligent science inspired by cognitive neuroscience and neurodynamics is also very welcome. The scope of Cognitive Neurodynamics covers cognitive neuroscience, neural computation based on dynamics, computer science, intelligent science, as well as their interdisciplinary applications in the natural and engineering sciences. Papers that are appropriate for non-specialist readers are encouraged.

1. There is no page limit for manuscripts submitted to Cognitive Neurodynamics. Research papers should clearly represent an important advance of especially broad interest to researchers and technologists in neuroscience, biophysics, BCI, neural computer and intelligent robotics.
2. Cognitive Neurodynamics also welcomes brief communications: short papers reporting results that are of genuinely broad interest but that, for one reason or another, do not make a sufficiently complete story to justify a full article publication. Brief Communications should consist of approximately four manuscript pages.
3. Cognitive Neurodynamics publishes review articles in which a specific field is reviewed through an exhaustive literature survey. There are no restrictions on the number of pages. Review articles are usually invited, but submitted reviews will also be considered.