Multi-definition Deepfake detection via semantics reduction and cross-domain training
Cairong Zhao, Chutian Wang, Zifan Song, Guosheng Hu, Liang Wang, Duoqian Miao
Pattern Recognition, Volume 163, Article 111469. Published 2025-02-21.
DOI: 10.1016/j.patcog.2025.111469
https://www.sciencedirect.com/science/article/pii/S0031320325001293
Citation count: 0
Abstract
The recent development of Deepfake videos directly threatens information security and personal privacy. Although previous work has made considerable progress on Deepfake detection, we empirically find that existing approaches do not perform well on low-definition (LD) videos or in cross-definition settings that mix high-definition (HD) and LD videos. To address this problem, we follow two motivations: (1) high-level semantics reduction and (2) cross-domain training. For (1), we propose Facial Structure Destruction and an Adversarial Jigsaw Loss, which discourage the model from learning high-level semantics so that it focuses on low-level discriminative information. For (2), we propose an adversarial domain generalization method and a spatial attention distillation scheme that uses information from HD videos to guide learning on LD videos. We conduct extensive experiments on the public datasets FaceForensics++ and Celeb-DF v2. The results demonstrate the effectiveness of our method, which achieves performance competitive with state-of-the-art methods. Surprisingly, we also find empirically that our method is very effective on the Face Anti-Spoofing (FAS) task, as verified on the OULU-NPU dataset.
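The abstract does not say how Facial Structure Destruction is implemented. As an illustration only, one common way to suppress high-level facial layout while keeping local texture is to cut an image into a grid of patches and permute them, jigsaw style. The sketch below assumes that reading; the function name `shuffle_patches`, the `grid` and `seed` parameters, and the list-of-rows image format are all hypothetical and are not taken from the paper.

```python
import random


def shuffle_patches(image, grid, seed=None):
    """Permute the grid x grid patches of a 2D image (a list of pixel rows).

    Hypothetical illustration, not the paper's method: the global layout is
    scrambled while the pixels inside each patch stay untouched.
    Returns (shuffled_image, perm), where output patch k is input patch perm[k].
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image height and width must be divisible by grid")
    patch_h, patch_w = height // grid, width // grid

    # Draw a random ordering of the patch indices 0 .. grid*grid - 1.
    perm = list(range(grid * grid))
    random.Random(seed).shuffle(perm)

    out = [[None] * width for _ in range(height)]
    for dst, src in enumerate(perm):
        dst_row, dst_col = divmod(dst, grid)
        src_row, src_col = divmod(src, grid)
        # Copy one whole patch from its source cell to its destination cell.
        for a in range(patch_h):
            for b in range(patch_w):
                out[dst_row * patch_h + a][dst_col * patch_w + b] = (
                    image[src_row * patch_h + a][src_col * patch_w + b]
                )
    return out, perm


if __name__ == "__main__":
    # A 4x4 toy "image" whose pixel values are just their flat index.
    toy = [[r * 4 + c for c in range(4)] for r in range(4)]
    shuffled, perm = shuffle_patches(toy, grid=2, seed=0)
    for row in shuffled:
        print(row)
    print("permutation:", perm)
```

Under this assumption, a detector trained on shuffled inputs cannot rely on the overall arrangement of facial parts and has to use cues that survive inside each patch. The returned permutation could also serve as the target for a jigsaw-style auxiliary objective, although the exact form of the Adversarial Jigsaw Loss is not given in the abstract.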
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.