{"title":"基于新型解释混合模型的Deepfake图像分类","authors":"Sudarshana Kerenalli, Vamsidhar Yendapalli, Mylarareddy Chinnaiah","doi":"10.21512/commit.v17i2.8761","DOIUrl":null,"url":null,"abstract":"In court, criminal investigations and identity management tools, like check-in and payment logins, face videos, and photos, are used as evidence more frequently. Although deeply falsified information may be found using deep learning classifiers, block-box decisionmaking makes forensic investigation in criminal trials more challenging. Therefore, the research suggests a three-step classification technique to classify the deceptive deepfake image content. The research examines the visual assessments of an EfficientNet and Shifted Window Transformer (SWinT) hybrid model based on Convolutional Neural Network (CNN) and Transformer architectures. The classifier generality is improved in the first stage using a different augmentation. Then, the hybrid model is developed in the second step by combining the EfficientNet and Shifted Window Transformer architectures. Next, the GradCAM approach for assessing human understanding demonstrates deepfake visual interpretation. In 14,204 images for the validation set, there are 7,096 fake photos and 7,108 real images. In contrast to focusing only on a few discrete face parts, the research shows that the entire deepfake image should be investigated. On a custom dataset of real, Generative Adversarial Networks (GAN)-generated, and human-altered web photos, the proposed method achieves an accuracy of 98.45%, a recall of 99.12%, and a loss of 0.11125. The proposed method successfully distinguishes between real and manipulated images. Moreover, the presented approach can assist investigators in clarifying the composition of the artificially produced material.","PeriodicalId":31276,"journal":{"name":"CommIT Journal","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Classification of Deepfake Images Using a Novel Explanatory Hybrid Model\",\"authors\":\"Sudarshana Kerenalli, Vamsidhar Yendapalli, Mylarareddy Chinnaiah\",\"doi\":\"10.21512/commit.v17i2.8761\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In court, criminal investigations and identity management tools, like check-in and payment logins, face videos, and photos, are used as evidence more frequently. Although deeply falsified information may be found using deep learning classifiers, block-box decisionmaking makes forensic investigation in criminal trials more challenging. Therefore, the research suggests a three-step classification technique to classify the deceptive deepfake image content. The research examines the visual assessments of an EfficientNet and Shifted Window Transformer (SWinT) hybrid model based on Convolutional Neural Network (CNN) and Transformer architectures. The classifier generality is improved in the first stage using a different augmentation. Then, the hybrid model is developed in the second step by combining the EfficientNet and Shifted Window Transformer architectures. Next, the GradCAM approach for assessing human understanding demonstrates deepfake visual interpretation. In 14,204 images for the validation set, there are 7,096 fake photos and 7,108 real images. In contrast to focusing only on a few discrete face parts, the research shows that the entire deepfake image should be investigated. 
On a custom dataset of real, Generative Adversarial Networks (GAN)-generated, and human-altered web photos, the proposed method achieves an accuracy of 98.45%, a recall of 99.12%, and a loss of 0.11125. The proposed method successfully distinguishes between real and manipulated images. Moreover, the presented approach can assist investigators in clarifying the composition of the artificially produced material.\",\"PeriodicalId\":31276,\"journal\":{\"name\":\"CommIT Journal\",\"volume\":\"29 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-09-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"CommIT Journal\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.21512/commit.v17i2.8761\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"CommIT Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21512/commit.v17i2.8761","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
Classification of Deepfake Images Using a Novel Explanatory Hybrid Model
Face videos and photos are increasingly used as evidence in court, in criminal investigations, and in identity management tools such as check-in and payment logins. Although deepfake content can be detected by deep learning classifiers, their black-box decision-making makes forensic investigation in criminal trials more challenging. Therefore, the research proposes a three-step technique to classify deceptive deepfake image content. The research examines the visual assessments of an EfficientNet and Shifted Window Transformer (SWinT) hybrid model based on Convolutional Neural Network (CNN) and Transformer architectures. In the first stage, classifier generality is improved using different augmentations. In the second step, the hybrid model is developed by combining the EfficientNet and Shifted Window Transformer architectures. Next, the GradCAM approach is applied to provide human-understandable visual interpretations of the deepfake decisions. The validation set of 14,204 images contains 7,096 fake photos and 7,108 real images. In contrast to focusing on only a few discrete face parts, the research shows that the entire deepfake image should be examined. On a custom dataset of real, Generative Adversarial Networks (GAN)-generated, and human-altered web photos, the proposed method achieves an accuracy of 98.45%, a recall of 99.12%, and a loss of 0.11125. The proposed method successfully distinguishes between real and manipulated images. Moreover, it can assist investigators in clarifying the composition of artificially produced material.
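To make the hybrid architecture concrete, below is a minimal sketch of how an EfficientNet backbone and a Swin Transformer backbone could be fused into a single binary (real/fake) classifier, assuming PyTorch and the timm model zoo. The specific backbone variants (efficientnet_b0, swin_tiny_patch4_window7_224), the concatenation-based fusion, and the classifier head are illustrative assumptions, not the authors' exact configuration; GradCAM heatmaps for interpretation would then typically be computed on the CNN branch's final convolutional layer.

```python
# A minimal sketch (not the paper's released code) of a hybrid
# EfficientNet + Shifted Window Transformer (SWinT) classifier.
# Assumes PyTorch and the `timm` library; backbone names and the
# feature-fusion head are illustrative choices.
import torch
import torch.nn as nn
import timm


class HybridEffNetSwin(nn.Module):
    """Concatenate pooled CNN and Transformer features, then classify."""

    def __init__(self, num_classes: int = 2, pretrained: bool = False):
        super().__init__()
        # num_classes=0 makes timm return pooled feature vectors
        # instead of classification logits.
        self.cnn = timm.create_model(
            "efficientnet_b0", pretrained=pretrained, num_classes=0)
        self.vit = timm.create_model(
            "swin_tiny_patch4_window7_224", pretrained=pretrained, num_classes=0)
        fused_dim = self.cnn.num_features + self.vit.num_features
        self.head = nn.Sequential(
            nn.LayerNorm(fused_dim),
            nn.Dropout(0.2),
            nn.Linear(fused_dim, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Both backbones here expect 224x224 RGB input.
        feats = torch.cat([self.cnn(x), self.vit(x)], dim=1)
        return self.head(feats)


if __name__ == "__main__":
    model = HybridEffNetSwin()
    logits = model(torch.randn(2, 3, 224, 224))  # [batch, 2] real/fake logits
    print(logits.shape)
```

Concatenating the two pooled feature vectors is the simplest fusion strategy; it lets the linear head weigh local texture cues from the CNN against the global, window-attention features from the Transformer, which matches the abstract's point that the entire image, not just isolated face parts, should inform the decision.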