RF-GCN: Residual fused-graph convolutional network using multimodalities for facial emotion recognition

IF 2.5 | Zone 4, Computer Science | Q3 TELECOMMUNICATIONS | Transactions on Emerging Telecommunications Technologies | Pub Date: 2024-09-07 | DOI: 10.1002/ett.5031

D. Vishnu Sakthi, P. Ezhumalai

Transactions on Emerging Telecommunications Technologies, volume 35, issue 9

Citations: 0

Abstract



Background

Recognizing an individual's emotional state is difficult, and the field is developing rapidly because of broad interest in emotion recognition. Many technologies have been developed to identify emotion from facial expressions, vocal expressions, physiological signals, and body expressions. Among these, facial emotion is especially expressive and lends itself to recognition using multiple modalities. Understanding facial emotions has applications in mental well-being, decision-making, and even social change, because emotions play a crucial role in our lives. Recognition is complicated by the high dimensionality of the data and by non-linear interactions across modalities. Moreover, people express emotion in different ways, so identifying the relevant features remains challenging; deep learning models are used to overcome these limitations.

Methods

This work addresses facial emotion recognition with a deep learning model, the proposed Residual Fused-Graph Convolutional Network (RF-GCN). The multimodal input consists of video and an electroencephalogram (EEG) signal. A Non-Local Means (NLM) filter pre-processes the input video frames. Features are extracted from both the pre-processed video frames and the input EEG signals, and feature selection is then carried out using chi-square. Finally, RF-GCN, which combines a Deep Residual Network (DRN) with a Graph Convolutional Network (GCN), recognizes the facial emotion and determines its type.
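The chi-square selection step can be illustrated with a minimal sketch. The abstract does not say how the statistic is computed or how many features are kept, so everything here is an assumption: the count-style chi-square formulation for non-negative features, the hypothetical helper names `chi_square_scores` and `select_top_k`, and the toy feature rows and emotion labels.

```python
from collections import defaultdict


def chi_square_scores(samples, labels):
    """Score each non-negative feature by its chi-square statistic
    against the class label (higher = more class-dependent)."""
    n_samples = len(samples)
    n_features = len(samples[0])
    class_counts = defaultdict(int)
    # observed[label][j] = sum of feature j over samples of that class
    observed = defaultdict(lambda: [0.0] * n_features)
    for row, label in zip(samples, labels):
        class_counts[label] += 1
        for j, value in enumerate(row):
            observed[label][j] += value
    totals = [sum(row[j] for row in samples) for j in range(n_features)]

    scores = []
    for j in range(n_features):
        score = 0.0
        for label, count in class_counts.items():
            # Expected mass if feature j were independent of the class.
            expected = totals[j] * count / n_samples
            if expected > 0:
                score += (observed[label][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(samples, labels, k):
    """Return the indices of the k best-scoring features and the reduced rows."""
    scores = chi_square_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in samples]


# Toy data (hypothetical): three features per sample, two emotion classes.
features = [[5, 1, 0], [4, 1, 0], [0, 1, 6], [1, 1, 5]]
emotions = ["happy", "happy", "sad", "sad"]
kept, reduced = select_top_k(features, emotions, k=2)
```

In this toy example the middle feature has the same value in every sample, so it carries no class information and is the one dropped. In the pipeline described above, the rows would instead hold the features extracted from the video frames and EEG signals.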

Results

RF-GCN is evaluated using accuracy, recall, and precision, achieving 91.6%, 96.5%, and 94.7%, respectively.
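For reference, the three reported metrics can be computed from predicted and true labels as in the sketch below. The abstract does not state how recall and precision are averaged across emotion classes, so macro-averaging is an assumption here, and the function name `evaluate` and the toy labels are hypothetical.

```python
def evaluate(y_true, y_pred):
    """Return accuracy plus macro-averaged recall and precision."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    recalls, precisions = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fn = sum(t == c and p != c for t, p in pairs)  # missed instances of c
        fp = sum(t != c and p == c for t, p in pairs)  # wrongly predicted as c
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)

    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / len(classes),
        "precision": sum(precisions) / len(classes),
    }


# Toy labels (hypothetical), not data from the paper.
truth = ["happy", "happy", "sad", "sad", "neutral", "neutral"]
preds = ["happy", "sad", "sad", "sad", "neutral", "happy"]
metrics = evaluate(truth, preds)
```

With macro-averaging, every emotion class counts equally regardless of how many samples it has; a sample-weighted average would give different numbers on an imbalanced dataset.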

Conclusions

RF-GCN captures the nuanced relationships between different emotional states and improves recognition accuracy. The model is trained and evaluated on the dataset and reflects real-world conditions.


Source journal

Transactions on Emerging Telecommunications Technologies (TELECOMMUNICATIONS)

CiteScore: 8.90

Self-citation rate: 13.90%

Articles published: 249

Journal description: Transactions on Emerging Telecommunications Technologies (ETT), formerly known as European Transactions on Telecommunications (ETT), has the following aims:
- to attract cutting-edge publications from leading researchers and research groups around the world
- to become a highly cited source of timely research findings in emerging fields of telecommunications
- to limit revision and publication cycles to a few months and thus significantly increase attractiveness to publish
- to become the leading journal for publishing the latest developments in telecommunications