{"title":"从班级翻转分布的角度改进对抗性训练","authors":"Dawei Zhou;Nannan Wang;Tongliang Liu;Xinbo Gao","doi":"10.1109/TPAMI.2025.3540200","DOIUrl":null,"url":null,"abstract":"Adversarial training has been proposed and widely recognized as a very effective method to defend against adversarial noise. However, the label flipping pattern on different classes still need deeper exploration to identify potential problems and assist in further enhancing robustness. In this work, we model the class-flipping distribution via statistical investigations and find this distribution reveals two shortcomings: the highly misleading category is present in the model's predictions for data in each class, and the trend in class flipping are significantly different across classes. Based on these observations, we propose a <italic>Class-Flipping-aware Adversarial Training</i> (CFAT) method. On the one hand, we obtain the most misleading categories for the data in each class by counting the samples flipped to different wrong categories, and utilize them as the target to construct corresponding targeted adversarial samples, respectively. On the other hand, we take the proportions of samples flipped to the most misleading category as factors to scale the perturbation budgets of adversarial training samples for the data with corresponding classes. 
Experimental results on datasets with different class number validate the effectiveness of the proposed method.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 6","pages":"4330-4342"},"PeriodicalIF":18.6000,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Improving Adversarial Training From the Perspective of Class-Flipping Distribution\",\"authors\":\"Dawei Zhou;Nannan Wang;Tongliang Liu;Xinbo Gao\",\"doi\":\"10.1109/TPAMI.2025.3540200\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Adversarial training has been proposed and widely recognized as a very effective method to defend against adversarial noise. However, the label flipping pattern on different classes still need deeper exploration to identify potential problems and assist in further enhancing robustness. In this work, we model the class-flipping distribution via statistical investigations and find this distribution reveals two shortcomings: the highly misleading category is present in the model's predictions for data in each class, and the trend in class flipping are significantly different across classes. Based on these observations, we propose a <italic>Class-Flipping-aware Adversarial Training</i> (CFAT) method. On the one hand, we obtain the most misleading categories for the data in each class by counting the samples flipped to different wrong categories, and utilize them as the target to construct corresponding targeted adversarial samples, respectively. On the other hand, we take the proportions of samples flipped to the most misleading category as factors to scale the perturbation budgets of adversarial training samples for the data with corresponding classes. 
Experimental results on datasets with different class number validate the effectiveness of the proposed method.\",\"PeriodicalId\":94034,\"journal\":{\"name\":\"IEEE transactions on pattern analysis and machine intelligence\",\"volume\":\"47 6\",\"pages\":\"4330-4342\"},\"PeriodicalIF\":18.6000,\"publicationDate\":\"2025-02-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on pattern analysis and machine intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10878818/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10878818/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Improving Adversarial Training From the Perspective of Class-Flipping Distribution
Adversarial training has been proposed and widely recognized as a very effective method to defend against adversarial noise. However, the label-flipping patterns across different classes still need deeper exploration to identify potential problems and further enhance robustness. In this work, we model the class-flipping distribution via statistical investigations and find that this distribution reveals two shortcomings: a highly misleading category is present in the model's predictions for the data in each class, and the trends in class flipping differ significantly across classes. Based on these observations, we propose a Class-Flipping-aware Adversarial Training (CFAT) method. On the one hand, we obtain the most misleading category for the data in each class by counting the samples flipped to different wrong categories, and use it as the target to construct corresponding targeted adversarial examples. On the other hand, we take the proportion of samples flipped to the most misleading category as a factor to scale the perturbation budget of adversarial training samples for data in the corresponding class. Experimental results on datasets with different numbers of classes validate the effectiveness of the proposed method.
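The statistical step the abstract describes — counting, for each true class, which wrong class adversarial predictions flip to, then deriving a per-class target and a per-class budget factor — can be sketched as follows. This is a minimal illustration assuming the simplest reading of the abstract; the function names (`class_flip_stats`, `scaled_budgets`) and the linear budget-scaling rule are hypothetical, not taken from the paper.

```python
import numpy as np

def class_flip_stats(true_labels, adv_preds, num_classes):
    """For each true class c, count how often adversarially perturbed
    samples of class c are predicted as each wrong class, and return:
      - the most misleading target class per true class, and
      - the proportion of flipped samples going to that target."""
    flip_counts = np.zeros((num_classes, num_classes), dtype=int)
    for y, p in zip(true_labels, adv_preds):
        if p != y:                      # only misclassified samples count
            flip_counts[y, p] += 1
    most_misleading = flip_counts.argmax(axis=1)
    totals = flip_counts.sum(axis=1)
    # Share of a class's flips that concentrate on its top wrong target;
    # classes with no flips get a ratio of 0.
    flip_ratio = flip_counts.max(axis=1) / np.maximum(totals, 1)
    return most_misleading, flip_ratio

def scaled_budgets(base_eps, flip_ratio):
    """Hypothetical scaling rule: a class whose flips concentrate on a
    single misleading target gets a larger perturbation budget."""
    return base_eps * (1.0 + flip_ratio)
```

The `most_misleading` array would then serve as the per-class target label when crafting targeted adversarial examples, while `scaled_budgets` yields a class-dependent epsilon; the actual attack loop (e.g. targeted PGD) is omitted here.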