Security defense strategy algorithm for Internet of Things based on deep reinforcement learning
Authors: Xuecai Feng, Jikai Han, Rui Zhang, Shuo Xu, Hui Xia
Journal: High-Confidence Computing, vol. 4(1), Article 100167 (JCR Q2, Computer Science, Information Systems)
Published: 2023-10-12
DOI: 10.1016/j.hcc.2023.100167
URL: https://www.sciencedirect.com/science/article/pii/S266729522300065X
Important private data in the Internet of Things (IoT) currently face an extremely high risk of leakage. Attackers persistently target terminal devices to obtain critical private data. Although deep reinforcement learning defense strategies have made significant progress in recent years, most defense methods still suffer from inefficient allocation of defense resources and insufficient coordination among defenders. To address these problems, this paper constructs a novel adversarial security scenario and proposes a security game model that integrates defense resource allocation with patrol inspection. For this game model, the paper designs a deep reinforcement learning algorithm named SDSA to compute the security defense strategy. SDSA computes the resource allocation and patrolling strategy best suited to the defender by searching the policy over a multi-dimensional discrete action space, and it enables multiple defense agents to cooperate efficiently by training a multi-agent Dueling Double Deep Q-Network (D3QN) with prioritized experience replay. Finally, experimental results show that, compared with the MADDPG and OptGradFP methods, the security defense strategy learned by SDSA provides defenders with feasible and effective protection against attacks.
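The abstract names three standard building blocks: a dueling Q-value head, a Double DQN target, and prioritized experience replay. The following is a minimal sketch of those generic techniques for a single transition, not the authors' SDSA implementation. All function names, parameter names, and default values (`gamma`, `alpha`, `eps`) are assumptions chosen for illustration.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        # Terminal transition: no bootstrapped value is added.
        return reward
    best_action = max(range(len(q_next_online)), key=q_next_online.__getitem__)
    return reward + gamma * q_next_target[best_action]


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay: P(i) is proportional to (|delta_i| + eps) ** alpha."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

In a multi-agent setting, each defense agent would typically hold its own Q-network of this form. How SDSA shares experience or coordinates agents is not specified in the abstract, so that part is left out of the sketch.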