
Latest Publications in IET Biometrics

Research on TCN Model Based on SSARF Feature Selection in the Field of Human Behavior Recognition
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-30 | DOI: 10.1049/2024/4982277
Wei Zhang, Guibo Yu, Shijie Deng

Human behavior recognition is the process of automatically identifying and analyzing multiple human behaviors using modern technology. Previous studies show that redundant features not only slow down model training and increase structural complexity but also degrade the overall performance of the model. To overcome this problem, this paper investigates a temporal convolutional network (TCN) model based on improved sparrow search algorithm random forest (SSARF) feature selection to accurately identify human behaviors from wearable-device data. The model builds on a TCN classifier and combines a random forest with the sparrow optimization algorithm to reduce the dimensionality of the original features, removing poorly correlated and unimportant features while retaining effective features with a meaningful contribution rate to generate the optimal feature subset. To verify the reliability of the method, the model was evaluated on two public datasets, UCI Human Activity Recognition and WISDM, achieving recognition accuracies of 98.54% and 97.83%, respectively, improvements of 0.47% and 1.04% over the models before feature selection, while the number of features was reduced by 84.31% and 32.50% relative to the original feature sets. In addition, we compared the TCN classifier with other deep learning models on evaluation metrics including F1 score, recall, precision, and accuracy; the TCN model outperformed the other control models on all four metrics. It also outperforms existing recognition methods in accuracy and other respects, indicating practical application value.
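
As a rough illustration of the SSARF pipeline, the sketch below (Python, with hypothetical sizes and a fixed contribution threshold standing in for the paper's sparrow-search tuning) ranks features by random forest importance, keeps the smallest subset reaching the threshold, and passes the reduced features through a minimal dilated causal convolution block of the kind used in TCNs:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

def rf_select(X, y, contribution=0.95, n_estimators=200, seed=0):
    """Return indices of features covering `contribution` of total importance."""
    rf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    rf.fit(X, y)
    order = np.argsort(rf.feature_importances_)[::-1]   # most important first
    cum = np.cumsum(rf.feature_importances_[order])
    k = int(np.searchsorted(cum, contribution)) + 1
    return order[:k]

class TCNBlock(nn.Module):
    """One residual block of dilated causal 1-D convolutions, as used in TCNs."""
    def __init__(self, ch, kernel=3, dilation=1):
        super().__init__()
        self.pad = (kernel - 1) * dilation              # left-pad to stay causal
        self.conv1 = nn.Conv1d(ch, ch, kernel, dilation=dilation)
        self.conv2 = nn.Conv1d(ch, ch, kernel, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                               # x: (batch, ch, time)
        y = self.relu(self.conv1(nn.functional.pad(x, (self.pad, 0))))
        y = self.relu(self.conv2(nn.functional.pad(y, (self.pad, 0))))
        return self.relu(x + y)                         # residual connection

# Toy usage: 500 windows, 60 handcrafted features, 6 activity classes.
X, y = np.random.randn(500, 60), np.random.randint(0, 6, 500)
keep = rf_select(X, y)
x_t = torch.tensor(X[:, keep], dtype=torch.float32).unsqueeze(-1)
print(f"kept {len(keep)}/60 features", TCNBlock(len(keep))(x_t).shape)
```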

Citations: 0
A Finger Vein Recognition Algorithm Based on the Histogram of Variable Curvature Directional Binary Statistics
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-25 | DOI: 10.1049/2024/7408331
Min Li, Xue Jiang, Honghao Zhu, Fei Liu, Huabin Wang, Liang Tao, Shijun Liu

Structural features are capable of effectively capturing the overall texture variations in images. However, in locally prominent areas with visible veins, other characteristics such as directionality, convexity–concavity, and curvature also play a crucial role in recognition, and their impact cannot be overlooked. This paper introduces a novel approach, the histogram of variable curvature directional binary statistics (HVCDBS), which combines the structural and directional features of images. The proposed method is designed to extract discriminative multifeature information for vein recognition. First, a multidirection and multicurvature Gabor filter is introduced for convolution with vein images, yielding directional and convexity–concavity information at each pixel, along with curvature information for the corresponding curve. Together with the original image feature information, these four kinds of information are fused and encoded to construct a variable curvature binary pattern (VCBP) with multiple features. Second, the feature map containing multifeature information is processed blockwise to build variable curvature binary statistical features. Finally, competitive Gabor directional binary statistical features are combined, and a matching score-level fusion scheme based on maximizing the interclass distance and minimizing the intraclass distance is employed to determine the optimal weights. This process fuses the two feature maps into a one-dimensional feature vector, achieving an effective representation of vein images. Extensive experiments on four widely used vein databases indicate that, compared with extracting structural features alone, the proposed algorithm achieves higher recognition rates and lower equal error rates.
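
The directional-coding core of such a method can be sketched as follows (a simplification, not the authors' HVCDBS: curvature and convexity–concavity coding are omitted, and all filter parameters are illustrative): a multi-orientation Gabor bank is convolved with the image, each pixel is coded by its dominant filter, and blockwise histograms of the codes form the descriptor:

```python
import cv2
import numpy as np

def gabor_direction_histograms(img, n_theta=8, sigmas=(2.0, 4.0), block=16):
    img = img.astype(np.float32)
    responses = []
    for sigma in sigmas:                               # multiple scales
        for t in range(n_theta):                       # multiple orientations
            theta = np.pi * t / n_theta
            k = cv2.getGaborKernel((21, 21), sigma, theta,
                                   lambd=10.0, gamma=0.5, psi=0)
            responses.append(cv2.filter2D(img, cv2.CV_32F, k))
    resp = np.stack(responses)                         # (n_filters, H, W)
    codes = resp.argmax(axis=0)                        # dominant filter per pixel
    feats = []
    for i in range(0, img.shape[0] - block + 1, block):
        for j in range(0, img.shape[1] - block + 1, block):
            patch = codes[i:i + block, j:j + block]
            hist, _ = np.histogram(patch, bins=len(responses),
                                   range=(0, len(responses)))
            feats.append(hist / patch.size)            # normalised block histogram
    return np.concatenate(feats)

vein = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # stand-in image
print(gabor_direction_histograms(vein).shape)
```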

Citations: 0
A Survey on Automatic Face Recognition Using Side-View Face Images
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-08 | DOI: 10.1049/2024/7886911
Pinar Santemiz, Luuk J. Spreeuwers, Raymond N. J. Veldhuis

Face recognition from side-view positions poses a considerable challenge in automatic face recognition tasks. Pose variation up to the side view changes both appearance and visibility, since only one eye is visible in side-view poses. Traditionally overlooked, side-view poses have been brought to the forefront of research attention by recent advancements in deep learning. This survey comprehensively investigates methods addressing pose variations up to side view and categorizes research efforts into feature-based, image-based, and set-based pose handling. Unlike existing surveys addressing pose variations, our emphasis is specifically on extreme poses. We report numerous promising innovations in each category and discuss the utilization of and challenges associated with side-view images. Furthermore, we introduce current datasets and benchmarks, conduct performance evaluations across diverse methods, and explore their unique constraints. Notably, while feature-based methods currently represent the state of the art, our observations suggest that cross-dataset evaluations, attempted by only a few researchers, produce worse results. Consequently, the challenge of matching arbitrary poses in uncontrolled settings persists.

Citations: 0
An Interpretable Siamese Attention Res-CNN for Fingerprint Spoofing Detection
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-16 | DOI: 10.1049/2024/6630173
Chengsheng Yuan, Zhenyu Xu, Xinting Li, Zhili Zhou, Junhao Huang, Ping Guo

In recent years, fingerprint authentication has gained widespread adoption in diverse identification systems, including smartphones, wearable devices, and attendance machines. Nonetheless, these systems are vulnerable to spoofing attacks from suspicious fingerprints, posing significant risks to privacy. Consequently, a fingerprint presentation attack detection (PAD) strategy is proposed to ensure the security of these systems. Most previous work concentrated on building deep learning frameworks that improve PAD performance by augmenting fingerprint samples, and little attention has been paid to the fundamental difference between live and fake fingerprints as a way to optimize feature extractors. This paper proposes a new fingerprint liveness detection method based on a Siamese attention residual convolutional neural network (Res-CNN) that offers an interpretable perspective on this challenge. To leverage the variance in ridge continuity features (RCFs) between live and fake fingerprints, a Gabor filter is utilized to enhance the texture details of the fingerprint ridges, followed by the construction of an attention Res-CNN model to extract RCFs from live and fake fingerprints. The model mitigates the performance deterioration caused by vanishing gradients. Furthermore, to highlight the difference in RCFs, a Siamese attention residual network is devised, and a ridge continuity amplification loss function is designed to optimize the training process. Ultimately, the RCF parameters are transferred to the model, and transfer learning is utilized to aid its acquisition, thereby assuring the model's interpretability. Experimental outcomes on three publicly accessible fingerprint datasets demonstrate the superiority of the proposed method, which exhibits remarkable performance in both true detection rate and average classification error rate. Moreover, our method exhibits remarkable capabilities in PAD tasks, including cross-material and cross-sensor experiments. Additionally, we leverage Gradient-weighted Class Activation Mapping to generate a heatmap that visualizes the interpretability of our model, offering compelling visual validation.
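
A minimal sketch of the Siamese ingredient is given below (hypothetical layer sizes, with a generic contrastive loss standing in for the paper's ridge continuity amplification loss): a shared residual encoder embeds two fingerprint images, and the loss pulls same-liveness pairs together while pushing live/fake pairs apart:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):
    """Small residual CNN; the same instance embeds both inputs of a pair."""
    def __init__(self, dim=64):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.res = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 16, 3, padding=1))
        self.head = nn.Linear(16, dim)

    def forward(self, x):
        h = self.stem(x)
        h = F.relu(h + self.res(h))              # residual block
        h = h.mean(dim=(2, 3))                   # global average pooling
        return F.normalize(self.head(h), dim=1)  # unit-norm embedding

def contrastive_loss(z1, z2, same, margin=1.0):
    """same=1 for pairs with equal liveness label, 0 otherwise."""
    d = (z1 - z2).pow(2).sum(1).sqrt()
    return (same * d.pow(2) +
            (1 - same) * F.relu(margin - d).pow(2)).mean()

enc = SharedEncoder()
a, b = torch.randn(8, 1, 96, 96), torch.randn(8, 1, 96, 96)
same = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(enc(a), enc(b), same)    # weights shared by reuse
loss.backward()
```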

Citations: 0
FSErasing: Improving Face Recognition with Data Augmentation Using Face Parsing
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-06-12 | DOI: 10.1049/2024/6663315
Hiroya Kawai, Koichi Ito, Hwann-Tzong Chen, Takafumi Aoki

We propose original semantic labels for detailed face parsing to improve the accuracy of face recognition by focusing on parts of a face. The part labels used in conventional face parsing are defined based on biological features, and thus one label is given to a large region, such as skin. Our semantic labels are defined by separating large-area parts based on the structure of the face and by distinguishing the left and right sides of every part, so as to account for head pose changes, occlusion, and other factors. Utilizing the capability of assigning detailed part labels to face images, we propose a novel data augmentation method based on detailed face parsing called Face Semantic Erasing (FSErasing) to improve the performance of face recognition. FSErasing randomly masks a part of the face image based on the detailed part labels, allowing erasing-type data augmentation that considers the characteristics of the face. Through experiments using public face image datasets, we demonstrate that FSErasing is effective for improving the performance of face recognition and face attribute estimation. In face recognition, adding FSErasing when training ResNet-34 with Softmax using CelebA improves the average accuracy by 0.354 points and the average equal error rate (EER) by 0.312 points; with ArcFace, the average accuracy and EER improve by 0.752 and 0.802 points, respectively. ResNet-50 with Softmax using CASIA-WebFace improves the average accuracy by 0.442 points and the average EER by 0.452 points; with ArcFace, the average accuracy and EER improve by 0.228 and 0.500 points, respectively. In face attribute estimation, adding FSErasing as a data augmentation method when training with CelebA improves the estimation accuracy by 0.54 points. We also apply our detailed face parsing model to visualize face recognition models and demonstrate that it offers higher explainability than general visualization methods.
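
The erasing operation itself is simple to sketch (a minimal version under the assumption that a face parser supplies an integer part-label map; the paper's detailed left/right label set comes from its own parsing model):

```python
import numpy as np

def fs_erasing(image, part_map, fill=0, rng=None):
    """image: (H, W, 3) uint8; part_map: (H, W) int labels, 0 = background."""
    rng = rng or np.random.default_rng()
    parts = np.unique(part_map)
    parts = parts[parts != 0]                 # never erase the background
    target = rng.choice(parts)                # one random semantic part
    out = image.copy()
    out[part_map == target] = fill            # mask every pixel of that part
    return out, int(target)

face = np.random.randint(0, 256, (112, 112, 3), dtype=np.uint8)
labels = np.random.randint(0, 12, (112, 112))          # stand-in parsing map
augmented, erased = fs_erasing(face, labels)
print("erased part id:", erased)
```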

Citations: 0
Exploring Static–Dynamic ID Matching and Temporal Static ID Inconsistency for Generalizable Deepfake Detection
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-06-09 | DOI: 10.1049/2024/2280143
Huimin She, Yongjian Hu, Beibei Liu, Chang-Tsun Li

Identity-based Deepfake detection methods have the potential to improve the generalization, robustness, and interpretability of the model. However, current identity-based methods either require a reference or can only be used to detect face replacement but not face reenactment. In this paper, we propose a novel Deepfake video detection approach based on identity anomalies. We observe two types of identity anomalies: the inconsistency between clip-level static ID (facial appearance) and clip-level dynamic ID (facial behavior), and the temporal inconsistency of image-level static IDs. Since these two types of anomalies can be detected through self-consistency and do not depend on the manipulation type, our method is reference-free and manipulation-independent. Specifically, our detection network consists of two branches: a static–dynamic ID discrepancy detection branch for the inconsistency between dynamic and static ID, and a temporal static ID anomaly detection branch for the temporal anomaly of static ID. We combine the outputs of the two branches by weighted averaging to obtain the final detection result. We also design two loss functions, the static–dynamic ID matching loss and the dynamic ID constraint loss, to enhance the representation and discriminability of dynamic ID. We conduct experiments on four benchmark datasets and compare our method with the state-of-the-art methods. Results show that our method can detect not only face replacement but also face reenactment, and achieves better detection performance than the state-of-the-art methods on unknown datasets. It also has superior robustness against compression. Identity-based features provide a good explanation of the detection results.
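
The two-branch scoring can be illustrated roughly as follows (all names and the cosine-distance formulation are assumptions, not the authors' exact branches): one score measures clip-level static–dynamic ID disagreement, the other measures temporal drift of frame-level static IDs, and a weighted average fuses them:

```python
import numpy as np

def cos_dist(a, b):
    return 1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def deepfake_score(static_ids, dynamic_id, w=0.5):
    """static_ids: (T, D) per-frame appearance embeddings;
    dynamic_id: (D,) clip-level behaviour embedding."""
    clip_static = static_ids.mean(axis=0)
    s1 = cos_dist(clip_static, dynamic_id)          # static-dynamic mismatch
    # temporal inconsistency: mean distance of frames to the clip average
    s2 = np.mean([cos_dist(f, clip_static) for f in static_ids])
    return w * s1 + (1 - w) * s2                    # higher => more suspicious

frames = np.random.randn(32, 128)                   # stand-in embeddings
behaviour = np.random.randn(128)
print(f"fused anomaly score: {deepfake_score(frames, behaviour):.3f}")
```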

Citations: 0
Emotion Recognition Based on Handwriting Using Generative Adversarial Networks and Deep Learning
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-05-27 | DOI: 10.1049/2024/5351588
Hengnian Qi, Gang Zeng, Keke Jia, Chu Zhang, Xiaoping Wu, Mengxia Li, Qing Lang, Lingxuan Wang

The quality of people's lives is closely related to their emotional state. Positive emotions can boost confidence and help overcome difficulties, while negative emotions can harm both physical and mental health. Research has shown that people's handwriting is associated with their emotions. In this study, audio-visual media were used to induce emotions, and a dot-matrix digital pen was used to collect neutral text data written by participants in three emotional states: calm, happy, and sad. To address the challenge of limited samples, a novel conditional tabular generative adversarial network (CTAB-GAN) was used to increase the number of task samples, improving the recognition accuracy on task samples by 4.18%. TabNet (a neural network designed for tabular data) with SimAM (a simple, parameter-free attention module) was employed and compared with the original TabNet and traditional machine learning models; incorporating the SimAM attention mechanism led to a 1.35% improvement in classification accuracy. Experimental results revealed significant differences between negative (sad) and nonnegative (calm and happy) emotions, with a recognition accuracy of 80.67%. Overall, this study demonstrated the feasibility of emotion recognition based on handwriting with the assistance of CTAB-GAN and SimAM-TabNet. It provides guidance for further research on emotion recognition and other handwriting-based applications.
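
SimAM itself is compact enough to show in full; the sketch below follows the commonly published parameter-free formulation, in which each activation is reweighted by a sigmoid of an energy term derived from its deviation from the channel mean:

```python
import torch

def simam(x, e_lambda=1e-4):
    """x: (batch, channels, H, W) feature map; e_lambda is a small stabiliser."""
    b, c, h, w = x.shape
    n = h * w - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)     # squared deviation
    v = d.sum(dim=(2, 3), keepdim=True) / n               # per-channel variance
    e_inv = d / (4 * (v + e_lambda)) + 0.5                # inverse energy term
    return x * torch.sigmoid(e_inv)                       # reweighted features

feats = torch.randn(4, 32, 16, 16)
print(simam(feats).shape)                                 # torch.Size([4, 32, 16, 16])
```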

Citations: 0
A Comparative Study of Cross-Device Finger Vein Recognition Using Classical and Deep Learning Approaches
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-03-25 | DOI: 10.1049/2024/3236602
Tuğçe Arıcan, Raymond Veldhuis, Luuk Spreeuwers, Loïc Bergeron, Christoph Busch, Ehsaneddin Jalilian, Christof Kauba, Simon Kirchgasser, Sébastien Marcel, Bernhard Prommegger, Kiran Raja, Raghavendra Ramachandra, Andreas Uhl

Finger vein recognition is gaining popularity in the field of biometrics, yet the inter-operability of finger vein patterns has received limited attention. This study aims to fill this gap by introducing a cross-device finger vein dataset and evaluating the performance of finger vein recognition across devices using a classical method, a convolutional neural network, and our proposed patch-based convolutional auto-encoder (CAE). The findings emphasise the importance of standardisation in finger vein recognition, similar to that of fingerprints or irises, which is crucial for achieving inter-operability. Despite the inherent challenges of cross-device recognition, the proposed CAE architecture demonstrates promising results in finger vein recognition, particularly in cross-device comparisons.
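
A patch-based CAE of this kind can be sketched as follows (hypothetical sizes, not the authors' exact architecture): the encoder compresses a vein patch to a latent code usable as a comparison feature, and the decoder reconstructs the patch so the network can be trained with a reconstruction loss:

```python
import torch
import torch.nn as nn

class PatchCAE(nn.Module):
    def __init__(self, latent=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(), nn.Linear(32 * 8 * 8, latent))
        self.dec = nn.Sequential(
            nn.Linear(latent, 32 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (32, 8, 8)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        z = self.enc(x)                       # latent code = vein descriptor
        return self.dec(z), z

patches = torch.rand(8, 1, 32, 32)            # 32x32 patches from a vein image
recon, codes = PatchCAE()(patches)
loss = nn.functional.mse_loss(recon, patches) # reconstruction objective
print(recon.shape, codes.shape, float(loss))
```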

Citations: 0
Learning Deep Embedding with Acoustic and Phoneme Features for Speaker Recognition in FM Broadcasting
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-03-22 | DOI: 10.1049/2024/6694481
Xiao Li, Xiao Chen, Rui Fu, Xiao Hu, Mintong Chen, Kun Niu

Text-independent speaker verification (TI-SV) is a crucial task in speaker recognition, as it involves verifying an individual's claimed identity from speech of arbitrary content without any human intervention. The goal of TI-SV is to design a discriminative network that learns deep speaker embeddings capturing speaker idiosyncrasy. In this paper, we propose a deep speaker embedding learning approach based on a hybrid deep neural network (DNN) for TI-SV in FM broadcasting. Not only are acoustic features utilized, but phoneme features are also introduced as prior knowledge to jointly learn deep speaker embeddings. The hybrid DNN consists of a convolutional neural network architecture for generating acoustic features and a multilayer perceptron architecture for sequentially extracting phoneme features, which represent significant pronunciation attributes. The extracted acoustic and phoneme features are concatenated to form deep embedding descriptors of speaker identity. The hybrid DNN demonstrates not only the complementarity between acoustic and phoneme features but also the temporality of phoneme features in a sequence. Our experiments show that the hybrid DNN outperforms existing methods and delivers remarkable performance in FM broadcasting TI-SV.
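
The two-stream fusion can be sketched as follows (hypothetical dimensions; the paper's exact architectures are not reproduced): a small CNN embeds an acoustic spectrogram, an MLP embeds time-pooled phoneme posteriors, and the two vectors are concatenated into one speaker embedding:

```python
import torch
import torch.nn as nn

class HybridSpeakerNet(nn.Module):
    def __init__(self, n_phonemes=40, dim=128):
        super().__init__()
        self.cnn = nn.Sequential(                       # acoustic branch
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, dim))
        self.mlp = nn.Sequential(                       # phoneme branch
            nn.Linear(n_phonemes, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, spec, phon):
        """spec: (B, 1, mel, T); phon: (B, T, n_phonemes) posteriors."""
        a = self.cnn(spec)
        p = self.mlp(phon.mean(dim=1))     # average posteriors over time
        return torch.cat([a, p], dim=1)    # joint speaker embedding

net = HybridSpeakerNet()
emb = net(torch.randn(2, 1, 80, 200),
          torch.softmax(torch.randn(2, 200, 40), -1))
print(emb.shape)                           # torch.Size([2, 256])
```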

Citations: 0
On the Potential of Algorithm Fusion for Demographic Bias Mitigation in Face Recognition
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-02-23 | DOI: 10.1049/2024/1808587
Jascha Kolberg, Yannik Schäfer, Christian Rathgeb, Christoph Busch

With the rise of deep neural networks, the performance of biometric systems has increased tremendously. Biometric systems for face recognition are now used in everyday life, e.g., border control, crime prevention, or personal device access control. Although the accuracy of face recognition systems is generally high, they are not without flaws. Many biometric systems have been found to exhibit demographic bias, resulting in different demographic groups not being recognized with the same accuracy. This is especially true for facial recognition due to demographic factors such as gender and skin color. While many previous works have reported demographic bias, this work aims to reduce it for biometric face recognition applications. To this end, 12 face recognition systems are benchmarked regarding biometric recognition performance as well as demographic differentials, i.e., fairness. Subsequently, multiple fusion techniques are applied with the goal of improving fairness compared to single systems. The experimental results show that it is possible to improve fairness with respect to single demographic attributes, e.g., skin color or gender, while improving fairness for demographic subgroups turns out to be more challenging.
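
Score-level fusion, one family of techniques such a benchmark would cover, can be sketched as follows (weights and normalisation are illustrative assumptions): per-system comparison scores are min-max normalised on a calibration set and then combined by a weighted sum before thresholding:

```python
import numpy as np

def minmax_fit(cal_scores):
    """Fit min-max normalisation per system on calibration scores."""
    lo, hi = cal_scores.min(axis=0), cal_scores.max(axis=0)
    return lambda s: (s - lo) / (hi - lo + 1e-12)

def fuse(scores, weights):
    """scores: (n_pairs, n_systems) normalised scores; weights sum to 1."""
    return scores @ np.asarray(weights)

rng = np.random.default_rng(0)
cal = rng.normal(size=(1000, 3))            # stand-in calibration scores
norm = minmax_fit(cal)
probe = norm(rng.normal(size=(5, 3)))       # scores from 3 systems
print(fuse(probe, [0.5, 0.3, 0.2]))         # fused comparison scores
```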

Citations: 0