{"title":"精确关键帧对抗深度强化学习","authors":"Chunlong Fan, Yingyu Hao","doi":"10.1109/ICWOC55996.2022.9809899","DOIUrl":null,"url":null,"abstract":"Due to the extensive development of deep neural networks, such as strategy based neural networks, they are easy to be deceived and fooled, resulting in model failure or wrong decision. Because DRL has made great achievements in various complex tasks, it is essential to design effective attacks to build a robust DRL algorithm. So far, most of them are to separate the model from the environment and select effective disturbances through several input and output attempts to achieve the purpose of attack. Therefore, this paper proposes a way to predict the future critical state time and attack by observing each state of the environment without constantly observing the input and output of the model. It is verified in Atari game, which can effectively reduce the acquisition of cumulative rewards on the premise of high efficiency and concealment. This method is suitable for most application scenarios, and ensures the characteristics of efficient and covert attack.","PeriodicalId":402416,"journal":{"name":"2022 10th International Conference on Intelligent Computing and Wireless Optical Communications (ICWOC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Precise Key Frames Adversarial Attack against Deep Reinforcement Learning\",\"authors\":\"Chunlong Fan, Yingyu Hao\",\"doi\":\"10.1109/ICWOC55996.2022.9809899\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the extensive development of deep neural networks, such as strategy based neural networks, they are easy to be deceived and fooled, resulting in model failure or wrong decision. Because DRL has made great achievements in various complex tasks, it is essential to design effective attacks to build a robust DRL algorithm. So far, most of them are to separate the model from the environment and select effective disturbances through several input and output attempts to achieve the purpose of attack. Therefore, this paper proposes a way to predict the future critical state time and attack by observing each state of the environment without constantly observing the input and output of the model. It is verified in Atari game, which can effectively reduce the acquisition of cumulative rewards on the premise of high efficiency and concealment. 
This method is suitable for most application scenarios, and ensures the characteristics of efficient and covert attack.\",\"PeriodicalId\":402416,\"journal\":{\"name\":\"2022 10th International Conference on Intelligent Computing and Wireless Optical Communications (ICWOC)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 10th International Conference on Intelligent Computing and Wireless Optical Communications (ICWOC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICWOC55996.2022.9809899\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 10th International Conference on Intelligent Computing and Wireless Optical Communications (ICWOC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICWOC55996.2022.9809899","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Precise Key Frames Adversarial Attack against Deep Reinforcement Learning
Abstract: Deep neural networks, including the policy networks used in deep reinforcement learning (DRL), are easily deceived by adversarial inputs, leading to model failure or wrong decisions. Because DRL has achieved strong results on many complex tasks, designing effective attacks is an essential step toward building robust DRL algorithms. Most existing attacks separate the model from its environment and search for effective perturbations through repeated input-output queries against the model. This paper instead proposes a method that predicts when future critical states will occur by observing the states of the environment, and attacks only at those moments, without continuously probing the model's inputs and outputs. Experiments on Atari games show that the attack effectively reduces the agent's cumulative reward while remaining efficient and covert. The method is applicable to most scenarios and preserves these efficiency and stealth properties.
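To make the idea of attacking only at predicted key frames concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a PyTorch policy network over normalized image observations and uses a common heuristic from timed attacks on DRL: flag a frame as critical when the gap between the policy's most and least preferred actions exceeds a threshold, then apply an FGSM-style perturbation only at those frames. The names policy, threshold, and epsilon are illustrative assumptions.

import torch
import torch.nn.functional as F

def is_key_frame(policy, state, threshold=0.5):
    # Heuristic key-frame test (assumes a batch of one observation):
    # a large gap between the most and least preferred action probabilities
    # suggests the decision at this frame matters to the agent.
    with torch.no_grad():
        probs = F.softmax(policy(state), dim=-1)
    gap = probs.max(dim=-1).values - probs.min(dim=-1).values
    return bool(gap.item() > threshold)

def fgsm_perturb(policy, state, epsilon=0.01):
    # FGSM-style perturbation that pushes the policy away from the action
    # it would otherwise take; epsilon keeps the change small and covert.
    state = state.clone().detach().requires_grad_(True)
    logits = policy(state)
    target = logits.argmax(dim=-1)            # the agent's intended action
    loss = F.cross_entropy(logits, target)
    loss.backward()
    # Ascend the loss on the intended action so a different action becomes likely.
    adv_state = state + epsilon * state.grad.sign()
    return adv_state.clamp(0.0, 1.0).detach() # assumes observations in [0, 1]

# Usage sketch: perturb only the predicted key frames, leaving all other frames clean.
# for state in episode_states:
#     obs = fgsm_perturb(policy, state) if is_key_frame(policy, state) else state
#     action = agent.act(obs)

Restricting the perturbation to key frames is what keeps such an attack both efficient (few frames are modified) and covert (most observations remain unchanged), which is the property the paper emphasizes.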