Multiple Contexts and Frequencies Aggregation Network for Deepfake Detection

Zifeng Li, Wenzhong Tang, Shijun Gao, Shuai Wang, Yanxiang Wang
arXiv - CS - Multimedia · Published 2024-08-03 · DOI: arxiv-2408.01668 · Cited by: 0

Abstract

Deepfake detection faces increasing challenges as the rapid growth of generative models produces massive and diverse Deepfake technologies. Recent advances rely on introducing heuristic features from spatial or frequency domains rather than modeling general forgery features within backbones. To address this issue, we turn to backbone design with two intuitive priors from spatial and frequency detectors, i.e., learning robust spatial attributes and frequency distributions that are discriminative for real and fake samples. To this end, we propose an efficient network for face forgery detection named MkfaNet, which consists of two core modules. For spatial contexts, we design a Multi-Kernel Aggregator that adaptively selects organ features extracted by multiple convolutions to model subtle facial differences between real and fake faces. For the frequency components, we propose a Multi-Frequency Aggregator that processes different frequency bands by adaptively reweighting high-frequency and low-frequency features. Comprehensive experiments on seven popular deepfake detection benchmarks demonstrate that our proposed MkfaNet variants achieve superior performance in both within-domain and cross-domain evaluations with impressive parameter efficiency.
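The abstract describes the two modules only at a high level. The ideas behind them can be illustrated with a minimal numpy sketch: splitting a feature map into low- and high-frequency bands via an FFT mask and recombining them with learnable weights (the gist of a Multi-Frequency Aggregator), and softmax-gated fusion over several branch outputs, e.g. from convolutions of different kernel sizes (the gist of a Multi-Kernel Aggregator). The cutoff radius, the global-mean gating statistic, and the scalar weights here are illustrative assumptions, not details from the paper.

```python
import numpy as np

def frequency_split(x, cutoff=0.25):
    """Split a 2D feature map into low- and high-frequency parts
    using a circular low-pass mask in the Fourier domain."""
    h, w = x.shape
    F = np.fft.fftshift(np.fft.fft2(x))
    yy, xx = np.ogrid[:h, :w]
    cy, cx = h // 2, w // 2
    radius = cutoff * min(h, w)
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
    high = x - low  # residual holds the high-frequency content
    return low, high

def reweight(low, high, w_low, w_high):
    """Recombine frequency bands; adaptive per-sample weights are
    reduced to fixed scalars for illustration."""
    return w_low * low + w_high * high

def select_branches(branches):
    """Softmax-gated fusion over multiple branch outputs (e.g. feature
    maps from different kernel sizes); gating by the global average
    response of each branch is an assumption."""
    stats = np.array([b.mean() for b in branches])
    w = np.exp(stats - stats.max())
    w /= w.sum()
    return sum(wi * b for wi, b in zip(w, branches))
```

By construction the two bands sum back to the input, so `reweight` with unit weights is the identity; emphasizing the high band (e.g. `w_high > 1`) amplifies the fine-grained artifacts that frequency-based forgery detectors typically rely on.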