RGB-D Face Recognition With Identity-Style Disentanglement and Depth Augmentation

IEEE Transactions on Biometrics, Behavior, and Identity Science (IF 5.0) | Pub Date: 2023-01-09 | DOI: 10.1109/TBIOM.2022.3233769

Meng-Tzu Chiu; Hsun-Ying Cheng; Chien-Yi Wang; Shang-Hong Lai

Volume 5, Issue 3, pp. 334-347. Full text: https://ieeexplore.ieee.org/document/10011574/


Abstract

Deep learning approaches achieve highly accurate face recognition by training models on very large face image datasets. Large public 3D face datasets, unlike their 2D counterparts, are scarce. Existing public 3D face datasets usually contain few subjects, which leads to over-fitting. This paper proposes two CNN models to improve RGB-D face recognition. The first is a segmentation-aware depth estimation network, called DepthNet, which estimates depth maps from RGB face images by exploiting semantic segmentation for more accurate face region localization. The second is a novel segmentation-guided RGB-D face recognition model that contains an RGB recognition branch, a depth map recognition branch, and an auxiliary segmentation mask branch. In this multi-modality face recognition model, a feature disentanglement scheme factorizes the feature representation into identity-related and style-related components. DepthNet is applied to augment a large 2D face image dataset into a large RGB-D face dataset, which is then used to train the RGB-D face recognition model. Experimental results show that DepthNet produces more reliable depth maps from face images when it uses the segmentation mask. The multi-modality face recognition model fully exploits the depth map and outperforms state-of-the-art methods on several public 3D face datasets with challenging variations.
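The depth augmentation step described in the abstract can be pictured as a simple data flow: a depth estimator maps each RGB face image to a depth map, and the two are paired into an RGB-D sample. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The names `placeholder_depth_estimator`, `augment_to_rgbd`, and `augment_dataset` are hypothetical, images are represented as nested Python lists for self-containment, and the placeholder estimator (mean pixel intensity) merely stands in for a trained network such as DepthNet.

```python
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[float, float, float]                 # (R, G, B)
RGBImage = List[List[Pixel]]                       # H x W grid of RGB pixels
DepthMap = List[List[float]]                       # H x W grid of depth values
RGBDPixel = Tuple[float, float, float, float]      # (R, G, B, D)
RGBDImage = List[List[RGBDPixel]]


def placeholder_depth_estimator(image: RGBImage) -> DepthMap:
    """Hypothetical stand-in for a trained depth network.

    It returns the mean channel intensity per pixel, which is NOT a real
    depth estimate; it only lets the pipeline below run end to end.
    """
    return [[sum(px) / 3.0 for px in row] for row in image]


def augment_to_rgbd(
    image: RGBImage,
    estimate_depth: Callable[[RGBImage], DepthMap],
) -> RGBDImage:
    """Attach an estimated depth channel to every pixel of an RGB image."""
    depth = estimate_depth(image)
    # The depth map must have the same height and width as the image.
    if len(depth) != len(image) or any(
        len(d_row) != len(row) for d_row, row in zip(depth, image)
    ):
        raise ValueError("depth map shape does not match image shape")
    return [
        [(*px, d) for px, d in zip(row, d_row)]
        for row, d_row in zip(image, depth)
    ]


def augment_dataset(
    images: Sequence[RGBImage],
    labels: Sequence[int],
    estimate_depth: Callable[[RGBImage], DepthMap],
) -> List[Tuple[RGBDImage, int]]:
    """Turn a labeled 2D face dataset into labeled RGB-D samples."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    return [
        (augment_to_rgbd(img, estimate_depth), label)
        for img, label in zip(images, labels)
    ]


if __name__ == "__main__":
    tiny_image: RGBImage = [[(0.0, 0.3, 0.6), (1.0, 1.0, 1.0)]]
    samples = augment_dataset([tiny_image], [7], placeholder_depth_estimator)
    rgbd, identity = samples[0]
    print(len(rgbd), len(rgbd[0]), len(rgbd[0][0]), identity)
```

Passing the estimator in as a callable keeps the augmentation logic independent of any particular depth model, so a real network could be swapped in without changing the rest of the pipeline.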




Source journal

IEEE Transactions on Biometrics, Behavior, and Identity Science

CiteScore

10.90

Self-citation rate

0.00%

Number of articles published