PH-CBAM: A Parallel Hybrid CBAM Network with Multi-Feature Extraction for Facial Expression Recognition

Liefa Liao, Shouluan Wu, Chao Song, Jianglong Fu
Electronics, published 2024-08-09. DOI: 10.3390/electronics13163149 (https://doi.org/10.3390/electronics13163149)

Abstract

Convolutional neural networks have made significant progress in human Facial Expression Recognition (FER). However, they still struggle to focus on and extract facial features effectively. Recent research has turned to attention mechanisms to address this issue, but it focuses primarily on local feature details rather than overall facial features. Building upon the classical Convolutional Block Attention Module (CBAM), this paper introduces a novel Parallel Hybrid Attention Model, termed PH-CBAM. The model employs split-channel attention to enhance the extraction of key features while keeping the parameter count minimal, which lets the network emphasize relevant details during expression classification. Heatmap analysis demonstrates that PH-CBAM effectively highlights key facial information. By employing a multimodal extraction approach in the initial image feature extraction phase, the network captures various facial features. The algorithm integrates a residual network and the MISH activation function to create a multi-feature extraction network, addressing issues such as vanishing gradients and the zero gradient for negative inputs in residual transmission. This improves the retention of valuable information and facilitates information flow between key image details and target images. Evaluation on the benchmark datasets FER2013, CK+, and Bigfer2013 yielded accuracies of 68.82%, 97.13%, and 72.31%, respectively. Comparison with mainstream network models on the FER2013 and CK+ datasets demonstrates the efficiency of the PH-CBAM model, which achieves accuracy comparable to current advanced models and is therefore effective for emotion detection.
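
The abstract credits the MISH activation and residual connections with avoiding a zero gradient for negative inputs, but it does not give the formulas. The sketch below is only an illustration under stated assumptions: it uses the commonly cited definition mish(x) = x * tanh(softplus(x)), compares its slope at a negative input with that of ReLU, and wraps it in a toy scalar residual unit. The helper names (`numerical_grad`, `residual_unit`) and the scalar `weight` are hypothetical and are not taken from the paper; this is not the authors' network.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    # For positive x use x + log(1 + exp(-x)) so exp() never overflows.
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


def mish(x: float) -> float:
    """Mish activation (standard definition): x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    """ReLU, shown only for comparison."""
    return max(0.0, x)


def numerical_grad(f, x: float, eps: float = 1e-5) -> float:
    """Central-difference estimate of df/dx (hypothetical helper, illustration only)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def residual_unit(x: float, weight: float) -> float:
    """Toy scalar residual unit: identity shortcut plus a Mish-activated branch.

    `weight` is an assumed scalar stand-in for a learned layer.
    """
    return x + mish(weight * x)


if __name__ == "__main__":
    x = -1.0
    # ReLU is flat for negative inputs, so its slope there is zero.
    print("ReLU slope at -1:", numerical_grad(relu, x))
    # Mish stays smooth and keeps a small non-zero slope for negative inputs.
    print("Mish slope at -1:", numerical_grad(mish, x))
    # The identity shortcut contributes a slope of 1 on top of the branch.
    print("Residual unit slope at -1:",
          numerical_grad(lambda v: residual_unit(v, 0.5), x))
```

In this toy setting the identity shortcut keeps the slope of the residual unit near 1 even where the activated branch contributes little, which is the usual argument for why residual connections ease gradient flow.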