
Latest Publications in IET Biometrics

Research on TCN Model Based on SSARF Feature Selection in the Field of Human Behavior Recognition
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-30 | DOI: 10.1049/2024/4982277
Wei Zhang, Guibo Yu, Shijie Deng

Human behavior recognition is the process of automatically identifying and analyzing multiple human behaviors using modern technology. Previous studies show that redundant features not only slow down model training and increase structural complexity but also degrade the overall performance of the model. To overcome this problem, this paper investigates a temporal convolutional network (TCN) model based on improved sparrow search algorithm random forest (SSARF) feature selection to accurately identify human behaviors from wearable-device data. The model builds on a TCN classifier and combines a random forest with the sparrow optimization algorithm to reduce the dimensionality of the original features, removing poorly correlated and unimportant features while retaining effective features with a meaningful contribution rate to generate the optimal feature subset. To verify the reliability of the method, the model was evaluated on two public datasets, UCI Human Activity Recognition and WISDM, achieving recognition accuracies of 98.54% and 97.83%, respectively, improvements of 0.47% and 1.04% over the models before feature selection, while the number of features was reduced by 84.31% and 32.50% relative to the original feature sets. In addition, we compared the TCN classifier with other deep learning models on evaluation metrics including F1 score, recall, precision, and accuracy; the TCN model outperformed the other control models on all four metrics. It also outperforms existing recognition methods in accuracy and other respects, indicating practical application value.
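
As a rough illustration of the SSARF pipeline, the sketch below (Python, with hypothetical sizes and a fixed contribution threshold standing in for the paper's sparrow-search tuning) ranks features by random forest importance, keeps the smallest subset reaching the threshold, and passes the reduced features through a minimal dilated causal convolution block of the kind used in TCNs:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

def rf_select(X, y, contribution=0.95, n_estimators=200, seed=0):
    """Return indices of features covering `contribution` of total importance."""
    rf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    rf.fit(X, y)
    order = np.argsort(rf.feature_importances_)[::-1]   # most important first
    cum = np.cumsum(rf.feature_importances_[order])
    k = int(np.searchsorted(cum, contribution)) + 1
    return order[:k]

class TCNBlock(nn.Module):
    """One residual block of dilated causal 1-D convolutions, as used in TCNs."""
    def __init__(self, ch, kernel=3, dilation=1):
        super().__init__()
        self.pad = (kernel - 1) * dilation              # left-pad to stay causal
        self.conv1 = nn.Conv1d(ch, ch, kernel, dilation=dilation)
        self.conv2 = nn.Conv1d(ch, ch, kernel, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                               # x: (batch, ch, time)
        y = self.relu(self.conv1(nn.functional.pad(x, (self.pad, 0))))
        y = self.relu(self.conv2(nn.functional.pad(y, (self.pad, 0))))
        return self.relu(x + y)                         # residual connection

# Toy usage: 500 windows, 60 handcrafted features, 6 activity classes.
X, y = np.random.randn(500, 60), np.random.randint(0, 6, 500)
keep = rf_select(X, y)
x_t = torch.tensor(X[:, keep], dtype=torch.float32).unsqueeze(-1)
print(f"kept {len(keep)}/60 features", TCNBlock(len(keep))(x_t).shape)
```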

Citations: 0
A Finger Vein Recognition Algorithm Based on the Histogram of Variable Curvature Directional Binary Statistics
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-25 | DOI: 10.1049/2024/7408331
Min Li, Xue Jiang, Honghao Zhu, Fei Liu, Huabin Wang, Liang Tao, Shijun Liu

Structural features are capable of effectively capturing the overall texture variations in images. However, in locally prominent areas with visible veins, other characteristics such as directionality, convexity–concavity, and curvature also play a crucial role in recognition, and their impact cannot be overlooked. This paper introduces a novel approach, the histogram of variable curvature directional binary statistics (HVCDBS), which combines the structural and directional features of images. The proposed method is designed to extract discriminative multifeature information for vein recognition. First, a multidirection and multicurvature Gabor filter is introduced for convolution with vein images, yielding directional and convexity–concavity information at each pixel, along with curvature information for the corresponding curve. Together with the original image feature information, these four kinds of information are fused and encoded to construct a variable curvature binary pattern (VCBP) with multiple features. Second, the feature map containing multifeature information is processed blockwise to build variable curvature binary statistical features. Finally, competitive Gabor directional binary statistical features are combined, and a matching score-level fusion scheme based on maximizing the interclass distance and minimizing the intraclass distance is employed to determine the optimal weights. This process fuses the two feature maps into a one-dimensional feature vector, achieving an effective representation of vein images. Extensive experiments on four widely used vein databases indicate that, compared with extracting structural features alone, the proposed algorithm achieves higher recognition rates and lower equal error rates.
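
The directional-coding core of such a method can be sketched as follows (a simplification, not the authors' HVCDBS: curvature and convexity–concavity coding are omitted, and all filter parameters are illustrative): a multi-orientation Gabor bank is convolved with the image, each pixel is coded by its dominant filter, and blockwise histograms of the codes form the descriptor:

```python
import cv2
import numpy as np

def gabor_direction_histograms(img, n_theta=8, sigmas=(2.0, 4.0), block=16):
    img = img.astype(np.float32)
    responses = []
    for sigma in sigmas:                               # multiple scales
        for t in range(n_theta):                       # multiple orientations
            theta = np.pi * t / n_theta
            k = cv2.getGaborKernel((21, 21), sigma, theta,
                                   lambd=10.0, gamma=0.5, psi=0)
            responses.append(cv2.filter2D(img, cv2.CV_32F, k))
    resp = np.stack(responses)                         # (n_filters, H, W)
    codes = resp.argmax(axis=0)                        # dominant filter per pixel
    feats = []
    for i in range(0, img.shape[0] - block + 1, block):
        for j in range(0, img.shape[1] - block + 1, block):
            patch = codes[i:i + block, j:j + block]
            hist, _ = np.histogram(patch, bins=len(responses),
                                   range=(0, len(responses)))
            feats.append(hist / patch.size)            # normalised block histogram
    return np.concatenate(feats)

vein = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # stand-in image
print(gabor_direction_histograms(vein).shape)
```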

Citations: 0
A Survey on Automatic Face Recognition Using Side-View Face Images
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-08 | DOI: 10.1049/2024/7886911
Pinar Santemiz, Luuk J. Spreeuwers, Raymond N. J. Veldhuis

Face recognition from side-view positions poses a considerable challenge in automatic face recognition tasks. Pose variation up to the side view changes both appearance and visibility, since only one eye is visible in side-view poses. Traditionally overlooked, side-view poses have been brought to the forefront of research attention by recent advancements in deep learning. This survey comprehensively investigates methods addressing pose variations up to side view and categorizes research efforts into feature-based, image-based, and set-based pose handling. Unlike existing surveys addressing pose variations, our emphasis is specifically on extreme poses. We report numerous promising innovations in each category and discuss the utilization of and challenges associated with side-view images. Furthermore, we introduce current datasets and benchmarks, conduct performance evaluations across diverse methods, and explore their unique constraints. Notably, while feature-based methods currently represent the state of the art, our observations suggest that cross-dataset evaluations, attempted by only a few researchers, produce worse results. Consequently, the challenge of matching arbitrary poses in uncontrolled settings persists.

Citations: 0
An Interpretable Siamese Attention Res-CNN for Fingerprint Spoofing Detection
IF 1.8 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-16 | DOI: 10.1049/2024/6630173
Chengsheng Yuan, Zhenyu Xu, Xinting Li, Zhili Zhou, Junhao Huang, Ping Guo

In recent years, fingerprint authentication has gained widespread adoption in diverse identification systems, including smartphones, wearable devices, and attendance machines. Nonetheless, these systems are vulnerable to spoofing attacks from suspicious fingerprints, posing significant risks to privacy. Consequently, a fingerprint presentation attack detection (PAD) strategy is proposed to ensure the security of these systems. Most previous work concentrated on building deep learning frameworks that improve PAD performance by augmenting fingerprint samples, and little attention has been paid to the fundamental difference between live and fake fingerprints as a way to optimize feature extractors. This paper proposes a new fingerprint liveness detection method based on a Siamese attention residual convolutional neural network (Res-CNN) that offers an interpretable perspective on this challenge. To leverage the variance in ridge continuity features (RCFs) between live and fake fingerprints, a Gabor filter is utilized to enhance the texture details of the fingerprint ridges, followed by the construction of an attention Res-CNN model to extract RCFs from live and fake fingerprints. The model mitigates the performance deterioration caused by vanishing gradients. Furthermore, to highlight the difference in RCFs, a Siamese attention residual network is devised, and a ridge continuity amplification loss function is designed to optimize the training process. Ultimately, the RCF parameters are transferred to the model, and transfer learning is utilized to aid its acquisition, thereby assuring the model's interpretability. Experimental outcomes on three publicly accessible fingerprint datasets demonstrate the superiority of the proposed method, which exhibits remarkable performance in both true detection rate and average classification error rate. Moreover, our method exhibits remarkable capabilities in PAD tasks, including cross-material and cross-sensor experiments. Additionally, we leverage Gradient-weighted Class Activation Mapping to generate a heatmap that visualizes the interpretability of our model, offering compelling visual validation.
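
A minimal sketch of the Siamese ingredient is given below (hypothetical layer sizes, with a generic contrastive loss standing in for the paper's ridge continuity amplification loss): a shared residual encoder embeds two fingerprint images, and the loss pulls same-liveness pairs together while pushing live/fake pairs apart:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):
    """Small residual CNN; the same instance embeds both inputs of a pair."""
    def __init__(self, dim=64):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.res = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 16, 3, padding=1))
        self.head = nn.Linear(16, dim)

    def forward(self, x):
        h = self.stem(x)
        h = F.relu(h + self.res(h))              # residual block
        h = h.mean(dim=(2, 3))                   # global average pooling
        return F.normalize(self.head(h), dim=1)  # unit-norm embedding

def contrastive_loss(z1, z2, same, margin=1.0):
    """same=1 for pairs with equal liveness label, 0 otherwise."""
    d = (z1 - z2).pow(2).sum(1).sqrt()
    return (same * d.pow(2) +
            (1 - same) * F.relu(margin - d).pow(2)).mean()

enc = SharedEncoder()
a, b = torch.randn(8, 1, 96, 96), torch.randn(8, 1, 96, 96)
same = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(enc(a), enc(b), same)    # weights shared by reuse
loss.backward()
```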

Citations: 0
FSErasing: Improving Face Recognition with Data Augmentation Using Face Parsing
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-06-12 | DOI: 10.1049/2024/6663315
Hiroya Kawai, Koichi Ito, Hwann-Tzong Chen, Takafumi Aoki

We propose original semantic labels for detailed face parsing to improve the accuracy of face recognition by focusing on parts of a face. The part labels used in conventional face parsing are defined based on biological features, and thus one label is given to a large region, such as skin. Our semantic labels are defined by separating large-area parts based on the structure of the face and by distinguishing the left and right sides of every part, so as to account for head pose changes, occlusion, and other factors. Utilizing the capability of assigning detailed part labels to face images, we propose a novel data augmentation method based on detailed face parsing called Face Semantic Erasing (FSErasing) to improve the performance of face recognition. FSErasing randomly masks a part of the face image based on the detailed part labels, allowing erasing-type data augmentation that considers the characteristics of the face. Through experiments using public face image datasets, we demonstrate that FSErasing is effective for improving the performance of face recognition and face attribute estimation. In face recognition, adding FSErasing when training ResNet-34 with Softmax using CelebA improves the average accuracy by 0.354 points and the average equal error rate (EER) by 0.312 points; with ArcFace, the average accuracy and EER improve by 0.752 and 0.802 points, respectively. ResNet-50 with Softmax using CASIA-WebFace improves the average accuracy by 0.442 points and the average EER by 0.452 points; with ArcFace, the average accuracy and EER improve by 0.228 and 0.500 points, respectively. In face attribute estimation, adding FSErasing as a data augmentation method when training with CelebA improves the estimation accuracy by 0.54 points. We also apply our detailed face parsing model to visualize face recognition models and demonstrate that it offers higher explainability than general visualization methods.
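
The erasing operation itself is simple to sketch (a minimal version under the assumption that a face parser supplies an integer part-label map; the paper's detailed left/right label set comes from its own parsing model):

```python
import numpy as np

def fs_erasing(image, part_map, fill=0, rng=None):
    """image: (H, W, 3) uint8; part_map: (H, W) int labels, 0 = background."""
    rng = rng or np.random.default_rng()
    parts = np.unique(part_map)
    parts = parts[parts != 0]                 # never erase the background
    target = rng.choice(parts)                # one random semantic part
    out = image.copy()
    out[part_map == target] = fill            # mask every pixel of that part
    return out, int(target)

face = np.random.randint(0, 256, (112, 112, 3), dtype=np.uint8)
labels = np.random.randint(0, 12, (112, 112))          # stand-in parsing map
augmented, erased = fs_erasing(face, labels)
print("erased part id:", erased)
```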

Citations: 0
Exploring Static–Dynamic ID Matching and Temporal Static ID Inconsistency for Generalizable Deepfake Detection
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-06-09 | DOI: 10.1049/2024/2280143
Huimin She, Yongjian Hu, Beibei Liu, Chang-Tsun Li

Identity-based Deepfake detection methods have the potential to improve the generalization, robustness, and interpretability of the model. However, current identity-based methods either require a reference or can only be used to detect face replacement but not face reenactment. In this paper, we propose a novel Deepfake video detection approach based on identity anomalies. We observe two types of identity anomalies: the inconsistency between clip-level static ID (facial appearance) and clip-level dynamic ID (facial behavior), and the temporal inconsistency of image-level static IDs. Since these two types of anomalies can be detected through self-consistency and do not depend on the manipulation type, our method is reference-free and manipulation-independent. Specifically, our detection network consists of two branches: a static–dynamic ID discrepancy detection branch for the inconsistency between dynamic and static ID, and a temporal static ID anomaly detection branch for the temporal anomaly of static ID. We combine the outputs of the two branches by weighted averaging to obtain the final detection result. We also design two loss functions, the static–dynamic ID matching loss and the dynamic ID constraint loss, to enhance the representation and discriminability of dynamic ID. We conduct experiments on four benchmark datasets and compare our method with the state-of-the-art methods. Results show that our method can detect not only face replacement but also face reenactment, and achieves better detection performance than the state-of-the-art methods on unknown datasets. It also has superior robustness against compression. Identity-based features provide a good explanation of the detection results.
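
The two-branch scoring can be illustrated roughly as follows (all names and the cosine-distance formulation are assumptions, not the authors' exact branches): one score measures clip-level static–dynamic ID disagreement, the other measures temporal drift of frame-level static IDs, and a weighted average fuses them:

```python
import numpy as np

def cos_dist(a, b):
    return 1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def deepfake_score(static_ids, dynamic_id, w=0.5):
    """static_ids: (T, D) per-frame appearance embeddings;
    dynamic_id: (D,) clip-level behaviour embedding."""
    clip_static = static_ids.mean(axis=0)
    s1 = cos_dist(clip_static, dynamic_id)          # static-dynamic mismatch
    # temporal inconsistency: mean distance of frames to the clip average
    s2 = np.mean([cos_dist(f, clip_static) for f in static_ids])
    return w * s1 + (1 - w) * s2                    # higher => more suspicious

frames = np.random.randn(32, 128)                   # stand-in embeddings
behaviour = np.random.randn(128)
print(f"fused anomaly score: {deepfake_score(frames, behaviour):.3f}")
```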

Citations: 0
Emotion Recognition Based on Handwriting Using Generative Adversarial Networks and Deep Learning
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-05-27 | DOI: 10.1049/2024/5351588
Hengnian Qi, Gang Zeng, Keke Jia, Chu Zhang, Xiaoping Wu, Mengxia Li, Qing Lang, Lingxuan Wang

The quality of people's lives is closely related to their emotional state. Positive emotions can boost confidence and help overcome difficulties, while negative emotions can harm both physical and mental health. Research has shown that people's handwriting is associated with their emotions. In this study, audio-visual media were used to induce emotions, and a dot-matrix digital pen was used to collect neutral text data written by participants in three emotional states: calm, happy, and sad. To address the challenge of limited samples, a novel conditional tabular generative adversarial network (CTAB-GAN) was used to increase the number of task samples, improving the recognition accuracy on task samples by 4.18%. TabNet (a neural network designed for tabular data) with SimAM (a simple, parameter-free attention module) was employed and compared with the original TabNet and traditional machine learning models; incorporating the SimAM attention mechanism led to a 1.35% improvement in classification accuracy. Experimental results revealed significant differences between negative (sad) and nonnegative (calm and happy) emotions, with a recognition accuracy of 80.67%. Overall, this study demonstrated the feasibility of emotion recognition based on handwriting with the assistance of CTAB-GAN and SimAM-TabNet. It provides guidance for further research on emotion recognition and other handwriting-based applications.
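
SimAM itself is compact enough to show in full; the sketch below follows the commonly published parameter-free formulation, in which each activation is reweighted by a sigmoid of an energy term derived from its deviation from the channel mean:

```python
import torch

def simam(x, e_lambda=1e-4):
    """x: (batch, channels, H, W) feature map; e_lambda is a small stabiliser."""
    b, c, h, w = x.shape
    n = h * w - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)     # squared deviation
    v = d.sum(dim=(2, 3), keepdim=True) / n               # per-channel variance
    e_inv = d / (4 * (v + e_lambda)) + 0.5                # inverse energy term
    return x * torch.sigmoid(e_inv)                       # reweighted features

feats = torch.randn(4, 32, 16, 16)
print(simam(feats).shape)                                 # torch.Size([4, 32, 16, 16])
```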

Citations: 0
A Comparative Study of Cross-Device Finger Vein Recognition Using Classical and Deep Learning Approaches
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-03-25 | DOI: 10.1049/2024/3236602
Tuğçe Arıcan, Raymond Veldhuis, Luuk Spreeuwers, Loïc Bergeron, Christoph Busch, Ehsaneddin Jalilian, Christof Kauba, Simon Kirchgasser, Sébastien Marcel, Bernhard Prommegger, Kiran Raja, Raghavendra Ramachandra, Andreas Uhl

Finger vein recognition is gaining popularity in the field of biometrics, yet the inter-operability of finger vein patterns has received limited attention. This study aims to fill this gap by introducing a cross-device finger vein dataset and evaluating the performance of finger vein recognition across devices using a classical method, a convolutional neural network, and our proposed patch-based convolutional auto-encoder (CAE). The findings emphasise the importance of standardisation in finger vein recognition, similar to that of fingerprints or irises, which is crucial for achieving inter-operability. Despite the inherent challenges of cross-device recognition, the proposed CAE architecture demonstrates promising results in finger vein recognition, particularly in cross-device comparisons.
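
A patch-based CAE of this kind can be sketched as follows (hypothetical sizes, not the authors' exact architecture): the encoder compresses a vein patch to a latent code usable as a comparison feature, and the decoder reconstructs the patch so the network can be trained with a reconstruction loss:

```python
import torch
import torch.nn as nn

class PatchCAE(nn.Module):
    def __init__(self, latent=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(), nn.Linear(32 * 8 * 8, latent))
        self.dec = nn.Sequential(
            nn.Linear(latent, 32 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (32, 8, 8)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        z = self.enc(x)                       # latent code = vein descriptor
        return self.dec(z), z

patches = torch.rand(8, 1, 32, 32)            # 32x32 patches from a vein image
recon, codes = PatchCAE()(patches)
loss = nn.functional.mse_loss(recon, patches) # reconstruction objective
print(recon.shape, codes.shape, float(loss))
```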

Citations: 0
Learning Deep Embedding with Acoustic and Phoneme Features for Speaker Recognition in FM Broadcasting
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-03-22 | DOI: 10.1049/2024/6694481
Xiao Li, Xiao Chen, Rui Fu, Xiao Hu, Mintong Chen, Kun Niu

Text-independent speaker verification (TI-SV) is a crucial task in speaker recognition, as it involves verifying an individual's claimed identity from speech of arbitrary content without any human intervention. The goal of TI-SV is to design a discriminative network that learns deep speaker embeddings capturing speaker idiosyncrasy. In this paper, we propose a deep speaker embedding learning approach based on a hybrid deep neural network (DNN) for TI-SV in FM broadcasting. Not only are acoustic features utilized, but phoneme features are also introduced as prior knowledge to jointly learn deep speaker embeddings. The hybrid DNN consists of a convolutional neural network architecture for generating acoustic features and a multilayer perceptron architecture for sequentially extracting phoneme features, which represent significant pronunciation attributes. The extracted acoustic and phoneme features are concatenated to form deep embedding descriptors of speaker identity. The hybrid DNN demonstrates not only the complementarity between acoustic and phoneme features but also the temporality of phoneme features in a sequence. Our experiments show that the hybrid DNN outperforms existing methods and delivers remarkable performance in FM broadcasting TI-SV.
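
The two-stream fusion can be sketched as follows (hypothetical dimensions; the paper's exact architectures are not reproduced): a small CNN embeds an acoustic spectrogram, an MLP embeds time-pooled phoneme posteriors, and the two vectors are concatenated into one speaker embedding:

```python
import torch
import torch.nn as nn

class HybridSpeakerNet(nn.Module):
    def __init__(self, n_phonemes=40, dim=128):
        super().__init__()
        self.cnn = nn.Sequential(                       # acoustic branch
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, dim))
        self.mlp = nn.Sequential(                       # phoneme branch
            nn.Linear(n_phonemes, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, spec, phon):
        """spec: (B, 1, mel, T); phon: (B, T, n_phonemes) posteriors."""
        a = self.cnn(spec)
        p = self.mlp(phon.mean(dim=1))     # average posteriors over time
        return torch.cat([a, p], dim=1)    # joint speaker embedding

net = HybridSpeakerNet()
emb = net(torch.randn(2, 1, 80, 200),
          torch.softmax(torch.randn(2, 200, 40), -1))
print(emb.shape)                           # torch.Size([2, 256])
```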

Citations: 0
On the Potential of Algorithm Fusion for Demographic Bias Mitigation in Face Recognition
IF 2 | CAS Tier 4 (Computer Science) | Q2 Computer Science | Pub Date: 2024-02-23 | DOI: 10.1049/2024/1808587
Jascha Kolberg, Yannik Schäfer, Christian Rathgeb, Christoph Busch

With the rise of deep neural networks, the performance of biometric systems has increased tremendously. Biometric systems for face recognition are now used in everyday life, e.g., border control, crime prevention, or personal device access control. Although the accuracy of face recognition systems is generally high, they are not without flaws. Many biometric systems have been found to exhibit demographic bias, resulting in different demographic groups not being recognized with the same accuracy. This is especially true for facial recognition due to demographic factors such as gender and skin color. While many previous works have reported demographic bias, this work aims to reduce it for biometric face recognition applications. To this end, 12 face recognition systems are benchmarked regarding biometric recognition performance as well as demographic differentials, i.e., fairness. Subsequently, multiple fusion techniques are applied with the goal of improving fairness compared to single systems. The experimental results show that it is possible to improve fairness with respect to single demographic attributes, e.g., skin color or gender, while improving fairness for demographic subgroups turns out to be more challenging.
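
Score-level fusion, one family of techniques such a benchmark would cover, can be sketched as follows (weights and normalisation are illustrative assumptions): per-system comparison scores are min-max normalised on a calibration set and then combined by a weighted sum before thresholding:

```python
import numpy as np

def minmax_fit(cal_scores):
    """Fit min-max normalisation per system on calibration scores."""
    lo, hi = cal_scores.min(axis=0), cal_scores.max(axis=0)
    return lambda s: (s - lo) / (hi - lo + 1e-12)

def fuse(scores, weights):
    """scores: (n_pairs, n_systems) normalised scores; weights sum to 1."""
    return scores @ np.asarray(weights)

rng = np.random.default_rng(0)
cal = rng.normal(size=(1000, 3))            # stand-in calibration scores
norm = minmax_fit(cal)
probe = norm(rng.normal(size=(5, 3)))       # scores from 3 systems
print(fuse(probe, [0.5, 0.3, 0.2]))         # fused comparison scores
```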

Citations: 0