Domain-Adaptive Energy-Based Models for Generalizable Face Anti-Spoofing

IEEE Transactions on Multimedia (IF 8.4; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS) Pub Date: 2024-03-30 DOI: 10.1109/TMM.2024.3407697

Dan Zhang;Zhekai Du;Jingjing Li;Lei Zhu;Heng Tao Shen

IEEE Transactions on Multimedia, vol. 26, pp. 10474-10488. https://ieeexplore.ieee.org/document/10542418/

Citations: 0

Abstract


Domain-Adaptive Energy-Based Models for Generalizable Face Anti-Spoofing

Face anti-spoofing (FAS) plays a crucial role in securing face recognition systems against presentation attacks. However, existing FAS methods often struggle to generalize to unseen attacks and domains. Existing generalizable FAS studies generally apply domain generalization (DG) techniques to exploit intermediate features that support generalization, while neglecting the task-specific nature of FAS. In this paper, we argue that FAS is an imbalanced classification problem, which makes it ill-suited to a standard discriminative classifier. Instead, we propose a novel approach that models the problem from a generative perspective using an energy-based model (EBM). The EBM captures the distribution of genuine faces and detects spoofing attempts as deviations from this distribution. We train the EBM using a discriminative objective and an energy regularization term to shape the learned distribution and improve generalization. To enhance robustness to unseen domains, we introduce an energy-based domain augmentation technique that explores the latent space around the source distribution guided by the EBM. We further use a meta-learning framework and a gradient-based variant to exploit the augmented data for domain generalization. We also consider a practical setting where samples are collected together under different environments without distinct domain labels, and show that our method can naturally handle this challenging setting by training with cluster labels. Extensive experiments on four FAS datasets show that our method outperforms state-of-the-art approaches in both intra- and cross-dataset settings.
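
The abstract describes two ideas that a small example can make concrete: scoring a face by how far it deviates from the genuine-face distribution, and augmenting data by moving through the latent space under the guidance of the energy function. The toy sketch below illustrates both. It is not the authors' implementation. The quadratic energy, the function names (`fit_energy`, `is_spoof`, `augment`), the synthetic 8-dimensional features, the 95% threshold, and the step sizes are all assumptions chosen for illustration. The paper trains its EBM with a discriminative objective and an energy regularization term, which this sketch does not reproduce.

```python
import numpy as np


def fit_energy(genuine_feats):
    """Fit a toy quadratic energy to genuine-face features (assumption:
    a per-dimension Gaussian stands in for the learned EBM)."""
    mu = genuine_feats.mean(axis=0)
    sigma = genuine_feats.std(axis=0) + 1e-8  # avoid division by zero
    return mu, sigma


def energy(z, mu, sigma):
    """E(z) = 0.5 * ||(z - mu) / sigma||^2; low energy = close to genuine data."""
    return 0.5 * np.sum(((z - mu) / sigma) ** 2, axis=-1)


def energy_grad(z, mu, sigma):
    """Gradient of the toy energy with respect to z."""
    return (z - mu) / sigma ** 2


def is_spoof(z, mu, sigma, threshold):
    """Flag samples whose energy exceeds the threshold as spoof attempts."""
    return energy(z, mu, sigma) > threshold


def augment(z, mu, sigma, step=0.1, n_steps=5):
    """Hypothetical energy-guided augmentation: take a few gradient-ascent
    steps on the energy so features drift away from the source distribution."""
    z_aug = np.array(z, dtype=float)
    for _ in range(n_steps):
        z_aug = z_aug + step * energy_grad(z_aug, mu, sigma)
    return z_aug


# Synthetic latent features (illustrative only, not real FAS data).
rng = np.random.default_rng(0)
genuine = rng.normal(0.0, 1.0, size=(500, 8))
spoof = rng.normal(3.0, 1.0, size=(500, 8))

mu, sigma = fit_energy(genuine)
# Threshold chosen so that about 5% of genuine samples are rejected.
threshold = np.quantile(energy(genuine, mu, sigma), 0.95)

genuine_flag_rate = is_spoof(genuine, mu, sigma, threshold).mean()
spoof_flag_rate = is_spoof(spoof, mu, sigma, threshold).mean()
augmented = augment(genuine, mu, sigma)
```

In this toy setting the spoof features sit far from the genuine mean, so they receive much higher energy and are flagged far more often than genuine ones. The augmented features have strictly higher energy than their sources, which is one simple way to picture "exploring the latent space around the source distribution".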


Source journal

IEEE Transactions on Multimedia (Engineering and Technology: Telecommunications)

CiteScore: 11.70
Self-citation rate: 11.00%
Articles published: 576
Review time: 5.5 months

About the journal: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.