{"title":"ER-MRL:基于多模态表示学习的情绪识别","authors":"Xiaoding Guo, Yadi Wang, Zhijun Miao, Xiaojin Yang, Jinkai Guo, Xianhong Hou, Feifei Zao","doi":"10.1109/ICIST55546.2022.9926848","DOIUrl":null,"url":null,"abstract":"In recent years, emotion recognition technology has been widely used in emotion change perception and mental illness diagnosis. Previous methods are mainly based on single-task learning strategies, which are unable to fuse multimodal features and remove redundant information. This paper proposes an emotion recognition model ER-MRL, which is based on multimodal representation learning. ER-MRL vectorizes the multimodal emotion data through encoders based on neural networks. The gate mechanism is used for multimodal feature selection. On this basis, ER-MRL calculates the modality specific and modality invariant representation for each emotion category. The Transformer model and multihead self-attention layer are applied to multimodal feature fusion. ER-MRL figures out the prediction result through the tower layer based on fully connected neural networks. Experimental results on the CMU-MOSI dataset show that ER-MRL has better performance on emotion recognition than previous methods.","PeriodicalId":211213,"journal":{"name":"2022 12th International Conference on Information Science and Technology (ICIST)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"ER-MRL: Emotion Recognition based on Multimodal Representation Learning\",\"authors\":\"Xiaoding Guo, Yadi Wang, Zhijun Miao, Xiaojin Yang, Jinkai Guo, Xianhong Hou, Feifei Zao\",\"doi\":\"10.1109/ICIST55546.2022.9926848\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, emotion recognition technology has been widely used in emotion change perception and mental illness diagnosis. Previous methods are mainly based on single-task learning strategies, which are unable to fuse multimodal features and remove redundant information. This paper proposes an emotion recognition model ER-MRL, which is based on multimodal representation learning. ER-MRL vectorizes the multimodal emotion data through encoders based on neural networks. The gate mechanism is used for multimodal feature selection. On this basis, ER-MRL calculates the modality specific and modality invariant representation for each emotion category. The Transformer model and multihead self-attention layer are applied to multimodal feature fusion. ER-MRL figures out the prediction result through the tower layer based on fully connected neural networks. 
Experimental results on the CMU-MOSI dataset show that ER-MRL has better performance on emotion recognition than previous methods.\",\"PeriodicalId\":211213,\"journal\":{\"name\":\"2022 12th International Conference on Information Science and Technology (ICIST)\",\"volume\":\"43 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 12th International Conference on Information Science and Technology (ICIST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICIST55546.2022.9926848\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 12th International Conference on Information Science and Technology (ICIST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICIST55546.2022.9926848","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: In recent years, emotion recognition technology has been widely used in perceiving emotional change and diagnosing mental illness. Previous methods are mainly based on single-task learning strategies, which can neither fuse multimodal features nor remove redundant information. This paper proposes ER-MRL, an emotion recognition model based on multimodal representation learning. ER-MRL vectorizes multimodal emotion data with neural-network encoders and applies a gate mechanism for multimodal feature selection. On this basis, it computes a modality-specific and a modality-invariant representation for each emotion category. A Transformer model with multi-head self-attention layers fuses the multimodal features, and a tower layer built from fully connected neural networks produces the prediction. Experimental results on the CMU-MOSI dataset show that ER-MRL outperforms previous methods on emotion recognition.
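The pipeline the abstract describes (per-modality encoders, gated feature selection, modality-specific and modality-invariant projections, Transformer-based fusion, fully connected tower) can be illustrated with a minimal PyTorch sketch. Everything below is a hedged reconstruction from the abstract alone, not the authors' implementation: the class name ERMRLSketch, the input dimensions (300-d text, 74-d audio, 35-d visual features, typical for CMU-MOSI), the sigmoid-gate form, and the mean pooling are all assumptions.

```python
# A minimal sketch of the ER-MRL pipeline as described in the abstract.
# Module names, feature sizes, gating form, and pooling are illustrative
# assumptions, not the authors' code.
import torch
import torch.nn as nn

class ERMRLSketch(nn.Module):
    def __init__(self, modal_dims=(300, 74, 35), hidden=128, num_classes=2):
        super().__init__()
        # Per-modality encoders vectorize the multimodal emotion data.
        self.encoders = nn.ModuleList([nn.Linear(d, hidden) for d in modal_dims])
        # Gate mechanism: a learned sigmoid gate per modality selects features.
        self.gates = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, hidden), nn.Sigmoid())
            for _ in modal_dims
        ])
        # Modality-specific projections (one per modality) and a shared,
        # modality-invariant projection.
        self.specific = nn.ModuleList([nn.Linear(hidden, hidden) for _ in modal_dims])
        self.invariant = nn.Linear(hidden, hidden)
        # Transformer encoder with multi-head self-attention for fusion.
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        # Tower layer: fully connected networks producing the prediction.
        self.tower = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, num_classes)
        )

    def forward(self, inputs):
        # inputs: one (batch, dim) tensor per modality.
        tokens = []
        for x, enc, gate, spec in zip(inputs, self.encoders, self.gates, self.specific):
            h = enc(x)
            h = gate(h) * h                    # gated feature selection
            tokens.append(spec(h))             # modality-specific view
            tokens.append(self.invariant(h))   # modality-invariant view
        seq = torch.stack(tokens, dim=1)       # (batch, 2 * num_modalities, hidden)
        fused = self.fusion(seq).mean(dim=1)   # self-attention fusion + pooling
        return self.tower(fused)

model = ERMRLSketch()
logits = model([torch.randn(8, 300), torch.randn(8, 74), torch.randn(8, 35)])
```

Treating each modality's specific and invariant views as separate tokens lets the multi-head self-attention weigh cross-modal interactions directly; the paper may instead concatenate or align these representations, which the abstract does not specify.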