Advanced deepfake detection with enhanced Resnet-18 and multilayer CNN max pooling
Muhammad Fahad, Tao Zhang, Yasir Iqbal, Azaz Ikram, Fazeela Siddiqui, Bin Younas Abdullah, Malik Muhammad Nauman, Xin Zhao, Yanzhang Geng
The Visual Computer, journal article, published 2024-09-18. DOI: https://doi.org/10.1007/s00371-024-03613-x
Citation count: 0
Abstract
Artificial intelligence has transformed technology, and generative adversarial networks (GANs) can now generate fake samples and deepfake videos. Because these tools allow anyone to produce propaganda, they can cause panic and instability. A robust system for distinguishing authentic from counterfeit information is therefore crucial in the current social media era. This study presents an automated approach for classifying deepfake videos using machine learning and deep learning techniques. The processed videos are classified using a deep learning-based enhanced Resnet-18 with convolutional neural network (CNN) multilayer max pooling. The research contributes to the study of precise detection techniques for deepfake technology, which is becoming a serious problem for digital media. The proposed enhanced Resnet-18 CNN method applies deep learning algorithms to GAN-based and other artificial intelligence-generated videos to distinguish genuine videos from fake ones. In this research, we fuse the sub-datasets (faceswap, face2face, deepfakes, neural textures) of FaceForensics with CelebDF, DeeperForensics, DeepFake detection, and our own private dataset into one combined dataset of 11,404 videos. The training dataset covers a diverse range of videos and sentiments, which demonstrates the model's capability. The model is designed to identify videos in which faces have been switched as fake and videos without switches as real. The work advances digital forensics by providing an effective response to deepfakes. The proposed model outperformed conventional methods in classifying video frames, with an accuracy of 99.99%, F-score of 99.98%, recall of 100%, and precision of 99.99%, as confirmed through a comparative analysis. 
The source code of this study is publicly available at https://doi.org/10.5281/zenodo.12538330.
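
The accuracy, precision, recall, and F-score reported in the abstract are standard binary-classification metrics. The sketch below shows how they are typically derived from frame-level confusion counts, treating "fake" as the positive class. It is not taken from the authors' code: the `ConfusionCounts` class, the function names, and the example counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Frame-level counts for a binary real/fake classifier (fake = positive).

    Hypothetical container for illustration; not from the paper's code.
    """
    tp: int  # fake frames predicted as fake
    fp: int  # real frames predicted as fake
    fn: int  # fake frames predicted as real
    tn: int  # real frames predicted as real


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all frames classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    """Fraction of frames flagged as fake that really are fake."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    """Fraction of truly fake frames that were flagged as fake."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f_score(p: float, r: float) -> float:
    """F1 score: harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts (NOT results from the paper): one real frame is
# misclassified as fake and no fake frame is missed.
counts = ConfusionCounts(tp=5000, fp=1, fn=0, tn=4999)
p, r = precision(counts), recall(counts)
print(f"accuracy={accuracy(counts):.4%} precision={p:.4%} "
      f"recall={r:.4%} f1={f_score(p, r):.4%}")
```

As a consistency check, `f_score(0.9999, 1.0)` evaluates to roughly 0.99995, slightly above the stated F-score of 99.98%. The abstract does not say how the metrics were averaged or rounded, so the difference may simply reflect per-class averaging or rounding; that reading is an assumption.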