Multi-domain awareness for compressed deepfake videos detection over social networks guided by common mechanisms between artifacts

IF 4.3 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Computer Vision and Image Understanding Pub Date : 2024-07-10 DOI:10.1016/j.cviu.2024.104072
{"title":"Multi-domain awareness for compressed deepfake videos detection over social networks guided by common mechanisms between artifacts","authors":"","doi":"10.1016/j.cviu.2024.104072","DOIUrl":null,"url":null,"abstract":"<div><p>The viral spread of massive deepfake videos over social networks has caused serious security problems. Despite the remarkable advancements achieved by existing deepfake detection algorithms, deepfake videos over social networks are inevitably influenced by compression factors. This causes deepfake detection performance to be limited by the following challenging issues: (a) interfering with compression artifacts, (b) loss of feature information, and (c) aliasing of feature distributions. In this paper, we analyze the common mechanism between compression artifacts and deepfake artifacts, revealing the structural similarity between them and providing a reliable theoretical basis for enhancing the robustness of deepfake detection models against compression. Firstly, based on the common mechanism between artifacts, we design a frequency domain adaptive notch filter to eliminate the interference of compression artifacts on specific frequency bands. Secondly, to reduce the sensitivity of deepfake detection models to unknown noise, we propose a spatial residual denoising strategy. Thirdly, to exploit the intrinsic correlation between feature vectors in the frequency domain branch and the spatial domain branch, we enhance deepfake features using an attention-based feature fusion method. Finally, we adopt a multi-task decision approach to enhance the discriminative power of the latent space representation of deepfakes, achieving deepfake detection with robustness against compression. Extensive experiments show that compared with the baseline methods, the detection performance of the proposed algorithm on compressed deepfake videos has been significantly improved. In particular, our model is resistant to various types of noise disturbances and can be easily combined with baseline detection models to improve their robustness.</p></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Vision and Image Understanding","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S107731422400153X","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

The viral spread of massive deepfake videos over social networks has caused serious security problems. Despite the remarkable advancements achieved by existing deepfake detection algorithms, deepfake videos over social networks are inevitably influenced by compression factors. This causes deepfake detection performance to be limited by the following challenging issues: (a) interfering with compression artifacts, (b) loss of feature information, and (c) aliasing of feature distributions. In this paper, we analyze the common mechanism between compression artifacts and deepfake artifacts, revealing the structural similarity between them and providing a reliable theoretical basis for enhancing the robustness of deepfake detection models against compression. Firstly, based on the common mechanism between artifacts, we design a frequency domain adaptive notch filter to eliminate the interference of compression artifacts on specific frequency bands. Secondly, to reduce the sensitivity of deepfake detection models to unknown noise, we propose a spatial residual denoising strategy. Thirdly, to exploit the intrinsic correlation between feature vectors in the frequency domain branch and the spatial domain branch, we enhance deepfake features using an attention-based feature fusion method. Finally, we adopt a multi-task decision approach to enhance the discriminative power of the latent space representation of deepfakes, achieving deepfake detection with robustness against compression. Extensive experiments show that compared with the baseline methods, the detection performance of the proposed algorithm on compressed deepfake videos has been significantly improved. In particular, our model is resistant to various types of noise disturbances and can be easily combined with baseline detection models to improve their robustness.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
以人工制品之间的共同机制为指导,利用多域感知技术检测社交网络上的压缩深度伪造视频
社交网络上大量深度伪造视频的病毒式传播引发了严重的安全问题。尽管现有的深度伪造检测算法取得了显著进步,但社交网络上的深度伪造视频不可避免地受到压缩因素的影响。这导致深度伪造检测性能受到以下挑战性问题的限制:(a)压缩伪影的干扰,(b)特征信息的丢失,以及(c)特征分布的混叠。本文分析了压缩伪影与深度伪影的共同机理,揭示了二者在结构上的相似性,为增强深度伪影检测模型对抗压缩的鲁棒性提供了可靠的理论依据。首先,基于伪影之间的共同机制,我们设计了一种频域自适应陷波滤波器来消除压缩伪影对特定频段的干扰。其次,为了降低深度伪造检测模型对未知噪声的敏感性,我们提出了一种空间残差去噪策略。第三,为了利用频域分支和空间域分支中特征向量之间的内在相关性,我们采用了一种基于注意力的特征融合方法来增强深度伪造特征。最后,我们采用了一种多任务决策方法来增强潜空间表征对深度伪造的判别能力,从而实现了具有抗压缩鲁棒性的深度伪造检测。大量实验表明,与基线方法相比,所提算法在压缩 deepfake 视频上的检测性能有了显著提高。特别是,我们的模型能抵御各种类型的噪声干扰,并能轻松地与基线检测模型相结合,以提高其鲁棒性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Computer Vision and Image Understanding
Computer Vision and Image Understanding 工程技术-工程:电子与电气
CiteScore
7.80
自引率
4.40%
发文量
112
审稿时长
79 days
期刊介绍: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research Areas Include: • Theory • Early vision • Data structures and representations • Shape • Range • Motion • Matching and recognition • Architecture and languages • Vision systems
期刊最新文献
Deformable surface reconstruction via Riemannian metric preservation Estimating optical flow: A comprehensive review of the state of the art A lightweight convolutional neural network-based feature extractor for visible images LightSOD: Towards lightweight and efficient network for salient object detection Triple-Stream Commonsense Circulation Transformer Network for Image Captioning
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1