Anqing Zhang, Honglong Chen, Xiaomeng Wang, Junjian Li, Yudong Gao, Xingang Wang
Journal: Information Sciences, Volume 690, Article 121562 (COMPUTER SCIENCE, INFORMATION SYSTEMS; Impact Factor 8.1)
DOI: 10.1016/j.ins.2024.121562
Published: 2024-10-17 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0020025524014762
Defending against backdoor attack on deep neural networks based on multi-scale inactivation
Deep neural networks (DNNs) achieve excellent performance in various applications, especially image classification tasks. However, DNNs also face the threat of backdoor attacks, which embed a hidden backdoor into a model: the infected model classifies benign images correctly, but misclassifies images carrying the backdoor trigger as the attacker's target label. To obtain a clean model from a backdoored dataset, we propose a Kalman filtering based multi-scale inactivation scheme, which can effectively remove poison data from a poisoned dataset. Every sample in the suspicious training dataset is judged by multi-scale inactivation, producing a series of judgment results, which are then fused using Kalman filtering to determine whether the sample is poisoned. To further improve performance, a scheme based on trigger localization and target-label determination is proposed. Extensive experiments demonstrate the effectiveness of the proposed method. The results show that the proposed methods remove poison samples effectively, achieving a recall rate above 99%, while the attack success rate of the retrained clean model falls below 1%.
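The abstract's core fusion step can be illustrated with a minimal sketch: each training sample receives a series of per-scale judgment scores from multi-scale inactivation, and a one-dimensional Kalman filter fuses them into a single estimate that is thresholded to flag poison samples. The function names, noise variances, and decision threshold below are illustrative assumptions, not the paper's actual parameters:

```python
def kalman_fuse(scores, process_var=1e-3, meas_var=0.25):
    """Fuse a sequence of per-scale judgment scores into one estimate
    using a 1-D Kalman filter (hypothetical noise parameters)."""
    est, var = scores[0], 1.0          # initialize with the first measurement
    for z in scores[1:]:
        var += process_var             # predict: uncertainty grows between steps
        k = var / (var + meas_var)     # Kalman gain
        est += k * (z - est)           # update estimate toward measurement z
        var *= (1.0 - k)               # shrink uncertainty after the update
    return est

def is_poison(scores, threshold=0.5):
    """Flag a sample as poisoned if the fused score exceeds the threshold
    (threshold value is an assumption for illustration)."""
    return kalman_fuse(scores) > threshold
```

For example, a sample whose inactivation judgments are consistently high (e.g. `[0.9, 0.8, 0.95]`) would be flagged, while consistently low scores would not; the filter's value over simple averaging is that it down-weights noisy individual judgments according to the assumed measurement variance.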
Journal description:
Information Sciences (Informatics and Computer Science Intelligent Systems Applications) is an international journal that publishes original and creative research findings in the field of information sciences, along with a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.