{"title":"基于条件生成模型反演的对抗鲁棒分类","authors":"Mitra Alirezaei, T. Tasdizen","doi":"10.1109/ICMLC56445.2022.9941288","DOIUrl":null,"url":null,"abstract":"Most adversarial attack defense methods rely on obfuscating gradients. These methods are easily circumvented by attacks which either do not use the gradient or by attacks which approximate and use the corrected gradient Defenses that do not obfuscate gradients such as adversarial training exist, but these approaches generally make assumptions about the attack such as its magnitude. We propose a classification model that does not obfuscate gradients and is robust by construction against black-box attacks without assuming prior knowledge about the attack. Our method casts classification as an optimization problem where we \"invert\" a conditional generator trained on unperturbed, natural images to find the class that generates the closest sample to the query image. We hypothesize that a potential source of brittleness against adversarial attacks is the high-to-low-dimensional nature of feed-forward classifiers. On the other hand, a generative model is typically a low-to-high-dimensional mapping. Since the range of images that can be generated by the model for a given class is limited to its learned manifold, the \"inversion\" process cannot generate images that are arbitrarily close to adversarial examples leading to a robust model by construction. While the method is related to Defense-GAN, the use of a conditional generative model and inversion in our model instead of the feed-forward classifier is a critical difference. Unlike Defense-GAN, we show that our method does not obfuscate gradients. We demonstrate that our model is extremely robust against black-box attacks and does not depend on previous knowledge about the attack strength.","PeriodicalId":117829,"journal":{"name":"2022 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adversarial Robust Classification by Conditional Generative Model Inversion\",\"authors\":\"Mitra Alirezaei, T. Tasdizen\",\"doi\":\"10.1109/ICMLC56445.2022.9941288\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most adversarial attack defense methods rely on obfuscating gradients. These methods are easily circumvented by attacks which either do not use the gradient or by attacks which approximate and use the corrected gradient Defenses that do not obfuscate gradients such as adversarial training exist, but these approaches generally make assumptions about the attack such as its magnitude. We propose a classification model that does not obfuscate gradients and is robust by construction against black-box attacks without assuming prior knowledge about the attack. Our method casts classification as an optimization problem where we \\\"invert\\\" a conditional generator trained on unperturbed, natural images to find the class that generates the closest sample to the query image. We hypothesize that a potential source of brittleness against adversarial attacks is the high-to-low-dimensional nature of feed-forward classifiers. On the other hand, a generative model is typically a low-to-high-dimensional mapping. 
Since the range of images that can be generated by the model for a given class is limited to its learned manifold, the \\\"inversion\\\" process cannot generate images that are arbitrarily close to adversarial examples leading to a robust model by construction. While the method is related to Defense-GAN, the use of a conditional generative model and inversion in our model instead of the feed-forward classifier is a critical difference. Unlike Defense-GAN, we show that our method does not obfuscate gradients. We demonstrate that our model is extremely robust against black-box attacks and does not depend on previous knowledge about the attack strength.\",\"PeriodicalId\":117829,\"journal\":{\"name\":\"2022 International Conference on Machine Learning and Cybernetics (ICMLC)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 International Conference on Machine Learning and Cybernetics (ICMLC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLC56445.2022.9941288\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Machine Learning and Cybernetics (ICMLC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLC56445.2022.9941288","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Most adversarial attack defense methods rely on obfuscating gradients. These methods are easily circumvented by attacks that either do not use the gradient or that approximate and use the corrected gradient. Defenses that do not obfuscate gradients, such as adversarial training, exist, but these approaches generally make assumptions about the attack, such as its magnitude. We propose a classification model that does not obfuscate gradients and is robust by construction against black-box attacks, without assuming prior knowledge about the attack. Our method casts classification as an optimization problem in which we "invert" a conditional generator trained on unperturbed, natural images to find the class that generates the sample closest to the query image. We hypothesize that a potential source of brittleness against adversarial attacks is the high-to-low-dimensional nature of feed-forward classifiers. A generative model, in contrast, is typically a low-to-high-dimensional mapping. Since the range of images the model can generate for a given class is limited to its learned manifold, the "inversion" process cannot produce images arbitrarily close to adversarial examples, yielding a model that is robust by construction. While the method is related to Defense-GAN, the use of a conditional generative model and inversion in place of a feed-forward classifier is a critical difference. Unlike Defense-GAN, we show that our method does not obfuscate gradients. We demonstrate that our model is highly robust against black-box attacks and does not depend on prior knowledge of the attack strength.
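To make the inversion-based classification concrete, the sketch below shows one way such a procedure could be implemented; it is not the authors' released code. It assumes a pretrained conditional generator with the hypothetical signature `generator(z, label) -> image`, a latent dimension of 128, an MSE image distance, and Adam for the latent optimization; in practice, several random restarts of `z` per class would typically be used to avoid poor local minima.

```python
# Minimal sketch (assumptions noted above, not the paper's implementation) of
# classification by conditional-generator inversion: for each candidate class y,
# optimize a latent code z so that G(z, y) reconstructs the query image, then
# predict the class whose reconstruction is closest.
import torch

def classify_by_inversion(generator, x, num_classes, latent_dim=128,
                          steps=200, lr=0.05, device="cpu"):
    """Return the predicted class for a single query image x of shape (C, H, W)."""
    x = x.to(device).unsqueeze(0)                    # add a batch dimension
    best_class, best_loss = None, float("inf")
    for y in range(num_classes):
        z = torch.randn(1, latent_dim, device=device, requires_grad=True)
        label = torch.tensor([y], device=device)
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):                       # invert the generator for class y
            opt.zero_grad()
            recon = generator(z, label)              # assumed signature: G(z, y) -> image
            loss = torch.mean((recon - x) ** 2)      # distance to the query image
            loss.backward()
            opt.step()
        with torch.no_grad():                        # final reconstruction error for class y
            final_loss = torch.mean((generator(z, label) - x) ** 2).item()
        if final_loss < best_loss:                   # keep the class with the closest sample
            best_loss, best_class = final_loss, y
    return best_class
```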