Hierarchical Pyramid Diverse Attention Networks for Face Recognition

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8323-8332. Pub Date: 2020-06-01. DOI: 10.1109/cvpr42600.2020.00835

Qiangchang Wang, Tianyi Wu, He Zheng, G. Guo

Citations: 51

Abstract

Deep learning has achieved great success in face recognition (FR); however, few existing models take hierarchical multi-scale local features into consideration. In this work, we propose a hierarchical pyramid diverse attention (HPDA) network. First, local patches play important roles in FR when the global face appearance changes dramatically. Some recent works apply attention modules to locate local patches automatically without relying on face landmarks. Unfortunately, without considering diversity, some learned attentions tend to respond redundantly around similar local patches while neglecting other potentially discriminative facial parts. Meanwhile, local patches may appear at different scales due to pose variations or large expression changes. To alleviate these challenges, we propose pyramid diverse attention (PDA) to learn multi-scale diverse local representations automatically and adaptively. More specifically, a pyramid attention is developed to capture multi-scale features, and a diverse learning scheme is developed to encourage models to focus on different local patches and generate diverse local features. Second, almost all existing models extract features only from the last convolutional layer, which lacks the local details and small-scale face parts captured in lower layers. Instead of simple concatenation or addition, we propose to use hierarchical bilinear pooling (HBP) to fuse information from multiple layers effectively. Thus, the HPDA is developed by integrating the PDA into the HBP. Experimental results on several datasets show the effectiveness of the HPDA compared to state-of-the-art methods.
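The abstract names two ideas without giving formulas: a diversity objective that discourages attention maps from responding to the same patches, and bilinear pooling that fuses features from different layers. The following is a minimal illustrative sketch of both, not the authors' formulation. It assumes the diversity term is the mean pairwise cosine similarity between flattened attention maps, and it uses plain (unfactorized) bilinear pooling between two layers as a simplified stand-in for HBP. The function names `diversity_penalty` and `bilinear_pool` and all input values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flattened attention maps."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def diversity_penalty(attention_maps):
    """Mean pairwise cosine similarity between attention maps.

    Assumed (hypothetical) loss term: close to 1 when the maps attend
    to the same locations, 0 when they do not overlap at all.
    """
    n = len(attention_maps)
    if n < 2:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += cosine_similarity(attention_maps[i], attention_maps[j])
            pairs += 1
    return total / pairs


def bilinear_pool(feat_a, feat_b):
    """Plain bilinear pooling of two feature maps on the same spatial grid.

    Each input is a list of spatial positions, and each position is a list
    of channel values. The result is the outer product of the two channel
    vectors, averaged over positions and flattened to C_a * C_b values.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature maps must share the same spatial positions")
    c_a, c_b = len(feat_a[0]), len(feat_b[0])
    pooled = [0.0] * (c_a * c_b)
    for vec_a, vec_b in zip(feat_a, feat_b):
        for i, x in enumerate(vec_a):
            for j, y in enumerate(vec_b):
                pooled[i * c_b + j] += x * y
    return [v / len(feat_a) for v in pooled]


# Two attention maps over four locations (made-up values).
redundant = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
diverse = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
print(diversity_penalty(redundant))  # 1.0
print(diversity_penalty(diverse))    # 0.0

# Features from a lower layer (2 channels) and a higher layer (3 channels)
# at the same two spatial positions (made-up values).
low = [[1.0, 2.0], [3.0, 4.0]]
high = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
print(bilinear_pool(low, high))  # [0.5, 1.5, 2.0, 1.0, 2.0, 3.0]
```

Under these assumptions, minimizing the penalty pushes attention maps toward different facial regions, and the pooled vector captures pairwise channel interactions across layers, which simple concatenation or addition does not.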
