Multiple Contexts and Frequencies Aggregation Network for Deepfake Detection

Zifeng Li, Wenzhong Tang, Shijun Gao, Shuai Wang, Yanxiang Wang

arXiv:2408.01668 (arXiv - CS - Multimedia), published 2024-08-03
Abstract
Deepfake detection faces increasing challenges owing to the rapid growth of generative models, which keep producing massive and diverse Deepfake technologies. Recent advances rely on introducing heuristic features from the spatial or frequency domain rather than modeling general forgery features within the backbone. To address this issue, we turn to backbone design with two intuitive priors drawn from spatial and frequency detectors, i.e., learning robust spatial attributes and frequency distributions that are discriminative for real and fake samples. To this end, we propose an efficient network for face forgery detection named MkfaNet, which consists of two core modules. For spatial contexts, we design a Multi-Kernel Aggregator that adaptively selects facial-organ features extracted by multiple convolutions to model subtle differences between real and fake faces. For frequency components, we propose a Multi-Frequency Aggregator that processes different frequency bands by adaptively reweighting high-frequency and low-frequency features. Comprehensive experiments on seven popular deepfake detection benchmarks demonstrate that our proposed MkfaNet variants achieve superior performance in both within-domain and cross-domain evaluations with impressive parameter efficiency.
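The two aggregation ideas in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: box filters of different sizes stand in for the learned multi-kernel convolutions, a softmax over pooled responses stands in for the learned selection gate, and scalar gains stand in for the learned frequency reweighting. All function names are hypothetical.

```python
import numpy as np


def box_blur(x, k):
    """Same-size box filter over a 2-D feature map (stand-in for a learned conv)."""
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad)), mode="edge")
    out = np.zeros_like(x)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def multi_kernel_aggregate(x, kernel_sizes=(3, 5, 7)):
    """Sketch of a Multi-Kernel Aggregator: run several kernel sizes over x,
    then fuse responses with adaptive weights from a softmax over
    global-average-pooled responses (a stand-in for a learned gate)."""
    responses = [box_blur(x, k) for k in kernel_sizes]
    pooled = np.array([r.mean() for r in responses])
    weights = softmax(pooled)  # weights sum to 1 across kernel branches
    return sum(w * r for w, r in zip(weights, responses))


def multi_frequency_aggregate(x, g_low=1.0, g_high=1.0):
    """Sketch of a Multi-Frequency Aggregator: split x into a low-frequency
    band (blur) and a high-frequency residual, then reweight the bands.
    With g_low == g_high == 1 the input is reconstructed exactly."""
    low = box_blur(x, 5)   # low-frequency component
    high = x - low         # high-frequency residual
    return g_low * low + g_high * high
```

For example, boosting `g_high` above 1 emphasizes the high-frequency residual where many forgery artifacts concentrate, which is the intuition the abstract describes; in the paper these weights are learned adaptively rather than fixed.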