
IEEE Transactions on Biometrics, Behavior, and Identity Science: Latest Publications

IEEE Transactions on Biometrics, Behavior, and Identity Science Information for Authors
IF 5 Pub Date: 2026-01-26 DOI: 10.1109/TBIOM.2026.3652264
{"title":"IEEE Transactions on Biometrics, Behavior, and Identity Science Information for Authors","authors":"","doi":"10.1109/TBIOM.2026.3652264","DOIUrl":"https://doi.org/10.1109/TBIOM.2026.3652264","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"8 1","pages":"C3-C3"},"PeriodicalIF":5.0,"publicationDate":"2026-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11364043","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146045359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
IEEE Transactions on Biometrics, Behavior, and Identity Science Publication Information
IF 5 Pub Date: 2026-01-26 DOI: 10.1109/TBIOM.2026.3652243
{"title":"IEEE Transactions on Biometrics, Behavior, and Identity Science Publication Information","authors":"","doi":"10.1109/TBIOM.2026.3652243","DOIUrl":"https://doi.org/10.1109/TBIOM.2026.3652243","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"8 1","pages":"C2-C2"},"PeriodicalIF":5.0,"publicationDate":"2026-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11364035","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146045356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
PrIdentity: Generalizable Privacy-Preserving Adversarial Perturbations for Anonymizing Facial Identity
IF 5 Pub Date: 2025-10-27 DOI: 10.1109/TBIOM.2025.3625986
Saheb Chhabra; Kartik Thakral; Richa Singh; Mayank Vatsa
With the rapid proliferation of face recognition systems, the risk of privacy leakage from facial images has become a pressing concern. Applications such as Find Face and Social Mapper can readily expose an individual’s identity without consent. Existing anonymization approaches partially address the problem: synthesis and fusion-based methods suppress identity but often distort facial attributes, reducing data utility, while adversarial perturbation methods improve generalizability across recognition models but rely on fixed $\mathcal{L}_{1}$ or $\mathcal{L}_{2}$ norms, leading to underfitting or overfitting. As a result, no single method jointly satisfies the three essential properties of effective anonymization: privacy, data utility, and generalizability. To address these limitations, we present a novel Privacy-Preserving Identity Anonymization (PrIdentity) algorithm that anonymizes the identity of a given image while preserving privacy. Our approach learns adversarial perturbations through an $\mathcal{L}_{p}$ norm-based regularization technique, maintaining a balance between privacy and data utility. Furthermore, we ensure the anonymized images generalize effectively across different unseen face recognition models. To the best of our knowledge, this is the first work to introduce a learnable $p$ parameter in the $\mathcal{L}_{p}$ norm for privacy preservation. We evaluate PrIdentity on the LFW, CelebA, and CelebA-HQ datasets across multiple face recognition architectures, complemented by a user study on both original and anonymized images. The results demonstrate that our algorithm effectively conceals identities while preserving visual appearance, achieving state-of-the-art performance in identity anonymization. We also carry out bounding box distance prediction experiments to validate data utility, attaining a state-of-the-art Euclidean distance of 2.65, which is 1.17 lower than the second-best method.
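The central mechanism described above, an $\mathcal{L}_{p}$ perturbation penalty whose exponent $p$ is itself learned, can be sketched as follows. This is a minimal illustration only, assuming PyTorch, a softplus re-parameterization to keep $p > 1$, and hypothetical placeholder names (`face_model`, `identity_loss`); it is not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnablePNorm(nn.Module):
    """L_p penalty on a perturbation with a learnable exponent p (sketch).

    p is re-parameterized as p = 1 + softplus(rho) so it stays above 1 while
    gradients can still adjust it during training. init_p must be > 1.
    """
    def __init__(self, init_p: float = 2.0):
        super().__init__()
        # inverse softplus so that 1 + softplus(rho) == init_p at initialization
        rho0 = torch.log(torch.expm1(torch.tensor(init_p - 1.0)))
        self.rho = nn.Parameter(rho0)

    def forward(self, delta: torch.Tensor) -> torch.Tensor:
        # delta: (B, C, H, W) additive perturbation on a batch of face images
        p = 1.0 + F.softplus(self.rho)                      # effective exponent, > 1
        flat = delta.flatten(1).abs().clamp_min(1e-12)      # avoid 0**p gradient issues
        return flat.pow(p).sum(dim=1).pow(1.0 / p).mean()   # mean L_p norm per image

# usage sketch (face_model, identity_loss, and lam are placeholders, not from the paper):
# delta = nn.Parameter(torch.zeros_like(images))
# reg = LearnablePNorm(init_p=2.0)
# loss = identity_loss(face_model(images + delta), face_model(images)) + lam * reg(delta)
```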
Citations: 0
2025 Index IEEE Transactions on Biometrics, Behavior, and Identity Science
IF 5 Pub Date: 2025-10-10 DOI: 10.1109/TBIOM.2025.3618315
{"title":"2025 Index IEEE Transactions on Biometrics, Behavior, and Identity Science","authors":"","doi":"10.1109/TBIOM.2025.3618315","DOIUrl":"https://doi.org/10.1109/TBIOM.2025.3618315","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 4","pages":"953-970"},"PeriodicalIF":5.0,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11199364","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145255922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Person Recognition in Aerial Surveillance: A Decade Survey
IF 5 Pub Date: 2025-10-03 DOI: 10.1109/TBIOM.2025.3617011
K. Nguyen; Feng Liu; C. Fookes; S. Sridharan; Xiaoming Liu; Arun Ross
The rapid emergence of airborne platforms and imaging sensors is enabling new forms of aerial surveillance due to their unprecedented advantages in scale, mobility, deployment, and covert observation capabilities. This article provides a comprehensive overview of 150+ papers from the last 10 years on human-centric aerial surveillance tasks from a computer vision and machine learning perspective. It aims to provide readers with an in-depth systematic review and technical analysis of the current state of aerial surveillance tasks using drones, UAVs, and other airborne platforms. The objects of interest are humans, who are to be detected, identified, and re-identified. More specifically, for each of these tasks, we first identify unique challenges in performing them in an aerial setting compared to the popular ground-based setting, and subsequently compile and analyze aerial datasets publicly available for each task. Most importantly, we delve deep into the approaches in the aerial surveillance literature with a focus on investigating how they presently address aerial challenges and techniques for improvement. We conclude the paper by discussing the gaps and open research questions to inform future research avenues.
Citations: 0
Gradient-Reversed Domain-Generalizable Multi-Modal Cross-Attention Network for Robust Face Anti-Spoofing
IF 5 Pub Date: 2025-10-01 DOI: 10.1109/TBIOM.2025.3616651
Koyya Deepthi Krishna Yadav; Ilaiah Kavati; Ramalingaswamy Cheruku
Face recognition systems are increasingly vulnerable to presentation attacks, wherein adversaries employ artifacts such as printed photographs, replayed videos, or 3D masks to impersonate genuine users. Despite significant progress in face anti-spoofing, most existing methods exhibit poor generalization when confronted with unseen attack types or domain shifts such as changes in sensors, environments, or acquisition protocols, thereby limiting their robustness in real-world applications. We propose the Gradient-Reversed Domain-Generalizable Multi-Modal Cross-Attention Network (GR-DXNet), which enhances resilience against previously unseen attacks and enables seamless adaptation across diverse domains. GR-DXNet employs dual-modality learning by fusing RGB frames, which capture fine-grained texture, with depth maps that reveal 3D structural cues. Temporal Convolutional Networks (TCNs) are integrated to model motion-based inconsistencies, improving the detection of dynamic emerging spoof patterns. To enhance cross-modal representation, a query-key-value-based cross-attention mechanism is introduced, enabling effective alignment and fusion of RGB and depth features. Furthermore, a post-fusion Gradient Reversal Layer (GRL) adversarially aligns cross-modal embeddings to suppress domain-specific bias without handcrafted augmentations or complex disentanglement, encouraging the model to learn domain-invariant features and strengthening generalization to unseen domains. Extensive evaluations on benchmark datasets under both intra- and cross-dataset protocols demonstrate the effectiveness of GR-DXNet, offering a reliable solution for real-world face spoof detection.
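Two of the building blocks named in this abstract, the gradient reversal layer and query-key-value cross-attention between RGB and depth tokens, are standard components; a minimal PyTorch sketch is given below. The token shapes, head count, and residual fusion are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lambda in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd: float):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class RGBDepthCrossAttention(nn.Module):
    """RGB tokens attend to depth tokens via query-key-value cross-attention (sketch)."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, rgb_tokens: torch.Tensor, depth_tokens: torch.Tensor) -> torch.Tensor:
        # rgb_tokens, depth_tokens: (B, N, dim) sequences of spatial features
        fused, _ = self.attn(query=rgb_tokens, key=depth_tokens, value=depth_tokens)
        return fused + rgb_tokens  # residual fusion of the two modalities

# usage sketch: a domain classifier sees gradient-reversed fused embeddings,
# pushing the encoder toward domain-invariant features.
# domain_logits = domain_head(GradReverse.apply(fused_embedding, 1.0))
```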
Citations: 0
Voice2Visage: Deciphering Faces From Voices
IF 5 Pub Date: 2025-09-30 DOI: 10.1109/TBIOM.2025.3615961
Wuyang Chen; Kele Xu; Yanjie Sun; Yong Dou; Huaimin Wang
The human voice carries valuable cues about an individual’s identity and emotions. A more intriguing question emerges: can one’s facial appearance be deduced from their voice alone? Existing efforts have primarily focused on exploring the relationship between natural audio and visual data, with limited attention given to the specific biometric domain of speaker-voice and face correlation. This study seeks to model the facial-related information embedded within the voice and ultimately predict an unknown person’s appearance solely based on their unheard voice. This task presents several challenges: firstly, while natural sounds exhibit significant variability, human voices often share similar frequencies, complicating the establishment of mappings between them. Secondly, generating faces from a voice is an ill-posed problem, as details such as makeup and head pose cannot be inferred from voice alone. In this article, we introduce a novel framework named Voice2Visage, designed to tackle this task by leveraging self-supervised cross-modal and intra-modal learning to predict faces corresponding to an input voice. To ensure the feasibility of our method, we optimize existing algorithms for automated dataset collection. Additionally, we systematically design experiments to test the usability and stability of commonly used quantitative metrics in the field of facial identity comparison. The results validate the close semantic association between the generated face and the reference one, showcasing its reliability. Our work provides a fresh perspective on exploring the depth of physiological characteristics concealed within human voices and the intricate interplay between appearance and voice. Our code is available at https://github.com/colaudiolab/Voice2Visage.
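The self-supervised cross-modal learning mentioned above is commonly realized with a paired contrastive objective; the sketch below shows one plausible instantiation, a symmetric InfoNCE loss over matched voice/face embeddings. The temperature and normalization choices are assumptions, and this is not the authors' code (their implementation is at the linked repository).

```python
import torch
import torch.nn.functional as F

def cross_modal_infonce(voice_emb: torch.Tensor,
                        face_emb: torch.Tensor,
                        temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of paired voice/face embeddings (sketch).

    Row i of each tensor is assumed to come from the same identity, so the
    diagonal of the similarity matrix holds the positive pairs.
    """
    v = F.normalize(voice_emb, dim=-1)
    f = F.normalize(face_emb, dim=-1)
    logits = v @ f.t() / temperature                  # (B, B) cosine similarities
    targets = torch.arange(v.size(0), device=v.device)
    loss_v2f = F.cross_entropy(logits, targets)       # voice -> matching face
    loss_f2v = F.cross_entropy(logits.t(), targets)   # face -> matching voice
    return 0.5 * (loss_v2f + loss_f2v)
```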
Citations: 0
Structural Consistency for Face Forgery Detection via Frequency Domain Enhancement and Self-Predictive Learning
IF 5 Pub Date: 2025-09-26 DOI: 10.1109/TBIOM.2025.3614578
Lifang Zhou; Miaomiao Chen; Hangsheng Ruan
Existing face forgery detection methods often overfit to known forgery patterns, resulting in limited generalization to unseen manipulations. To address this issue, we propose a Structural Consistency method for Face Forgery Detection via Frequency Domain Enhancement and Self-predictive Learning, which embeds frequency-domain features into the spatial representation and leverages the structural consistency inherent in genuine facial contexts to provide more discriminative cues for forgery identification. Specifically, we design a data augmentation module to extract the frequency information through the Discrete Cosine Transform (DCT) and enhance it using a Frequency Domain Enhancement Module (FEM) to capture subtle forgery artifacts. Furthermore, we design a Self-Prediction Learning Module (SPLM) that reconstructs the occluded central region of a face by exploiting the structural consistency of real facial features. To further guide the learning process, we define a self-predictive reconstruction loss that minimizes the prediction error in the occluded region and helps reinforce structural consistency. Moreover, we propose a Reconstruction Difference Guidance (RDG) module, which explicitly emphasizes potential forgery regions by computing pixel-wise discrepancies between the reconstructed image and the original input. This process produces an attention map that guides the classifier to focus on semantically inconsistent or anomalous regions. Experimental results demonstrate that our method achieves superior generalization and robustness across diverse datasets.
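The Reconstruction Difference Guidance step, turning pixel-wise discrepancies between the reconstructed and original image into an attention map over potential forgery regions, can be sketched as below. The smoothing kernel and per-image min-max normalization are assumptions for illustration; the paper's module may differ.

```python
import torch
import torch.nn.functional as F

def reconstruction_difference_map(original: torch.Tensor,
                                  reconstructed: torch.Tensor,
                                  blur_kernel: int = 5) -> torch.Tensor:
    """Pixel-wise |reconstructed - original| turned into a [0, 1] attention map (sketch)."""
    diff = (reconstructed - original).abs().mean(dim=1, keepdim=True)   # (B, 1, H, W)
    # light spatial smoothing so isolated pixel noise does not dominate the map
    diff = F.avg_pool2d(diff, blur_kernel, stride=1, padding=blur_kernel // 2)
    flat = diff.flatten(1)
    lo = flat.min(dim=1, keepdim=True).values
    hi = flat.max(dim=1, keepdim=True).values
    attn = (flat - lo) / (hi - lo + 1e-8)             # per-image min-max normalization
    return attn.view_as(diff)

# usage sketch: re-weight backbone features toward likely-forged regions
# attn_small = F.interpolate(attn_map, size=features.shape[-2:], mode="bilinear")
# guided = features * attn_small
```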
Citations: 0
IEEE Transactions on Biometrics, Behavior, and Identity Science Information for Authors
IF 5 Pub Date: 2025-09-25 DOI: 10.1109/TBIOM.2025.3607046
{"title":"IEEE Transactions on Biometrics, Behavior, and Identity Science Information for Authors","authors":"","doi":"10.1109/TBIOM.2025.3607046","DOIUrl":"https://doi.org/10.1109/TBIOM.2025.3607046","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 4","pages":"C3-C3"},"PeriodicalIF":5.0,"publicationDate":"2025-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11180156","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145134932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
GaitMspT: A Novel Multi-Scale and Multi-Perspective Temporal Learning Network for Gait Recognition in the Wild
IF 5 Pub Date: 2025-09-23 DOI: 10.1109/TBIOM.2025.3613586
Hanlin Li; Wanquan Liu; Chenqiang Gao; Ping Wang; Huafeng Wang
Gait recognition, a promising biometric technique, faces significant challenges in unconstrained in-the-wild scenarios. While spatial modeling has progressed, existing state-of-the-art methods fundamentally struggle with temporal variations due to their reliance on strategies developed for constrained environments, limiting their effectiveness in diverse real-world conditions. To overcome this critical bottleneck, we propose GaitMspT, a novel Multi-scale and Multi-perspective Temporal Learning Network engineered for robust unconstrained gait recognition. GaitMspT introduces two key modules: a Multi-scale Temporal Extraction (MsTE) module that captures diverse temporal features across three distinct scales, effectively mitigating issues like gait contour occlusion; and a Multi-perspective Spatial-Temporal Extraction (MpSTE) module that extracts nuanced horizontal and vertical gait variations, emphasizing salient components. Their synergistic integration endows our network with significantly enhanced temporal modeling capabilities. Extensive experiments on four prominent in-the-wild gait datasets (Gait3D, GREW, CCPG, and SUSTech1K) unequivocally demonstrate that GaitMspT substantially outperforms existing state-of-the-art methods, achieving superior recognition accuracy while maintaining an excellent balance between performance and computational complexity.
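The multi-scale temporal extraction idea, running temporal convolutions with different receptive fields over a per-frame gait feature sequence and fusing the results, is sketched below. The three kernel sizes, the 1x1 fusion convolution, and the residual connection are illustrative assumptions, not the paper's exact MsTE module.

```python
import torch
import torch.nn as nn

class MultiScaleTemporal(nn.Module):
    """Parallel 1D temporal convolutions at three scales over (B, C, T) features (sketch)."""
    def __init__(self, channels: int, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv1d(channels, channels, k, padding=k // 2) for k in kernels]
        )
        self.fuse = nn.Conv1d(channels * len(kernels), channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, T) sequence of per-frame gait features
        multi = torch.cat([self.act(branch(x)) for branch in self.branches], dim=1)
        return self.fuse(multi) + x  # residual connection preserves the input scale
```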
Citations: 0