{"title":"PNPSC网络玩家策略的深度强化学习技术","authors":"E. M. Bearss, Mikel D. Petty","doi":"10.1145/3564746.3587011","DOIUrl":null,"url":null,"abstract":"Petri Nets with Players, Strategies, and Costs (PNPSC) is an extension of Petri nets specifically designed to model cyberattacks. The PNPSC formalism includes a representation of the strategies for the competing \"players,\" i.e., the attacker and defender. Developing well-performing strategies for players in PNPSC nets is challenging for both game tree and reinforcement learning algorithms. This paper presents a method of modeling the PNPSC net player strategies as a game tree and using a combination of Monte Carlo Tree Search (MCTS) and deep reinforcement learning to effectively improve the players' strategies. The performance of this combination method is compared with standard action selection with the deep Q-learning algorithm.","PeriodicalId":322431,"journal":{"name":"Proceedings of the 2023 ACM Southeast Conference","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Deep Reinforcement Learning Technique for PNPSC Net Player Strategies\",\"authors\":\"E. M. Bearss, Mikel D. Petty\",\"doi\":\"10.1145/3564746.3587011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Petri Nets with Players, Strategies, and Costs (PNPSC) is an extension of Petri nets specifically designed to model cyberattacks. The PNPSC formalism includes a representation of the strategies for the competing \\\"players,\\\" i.e., the attacker and defender. Developing well-performing strategies for players in PNPSC nets is challenging for both game tree and reinforcement learning algorithms. 
This paper presents a method of modeling the PNPSC net player strategies as a game tree and using a combination of Monte Carlo Tree Search (MCTS) and deep reinforcement learning to effectively improve the players' strategies. The performance of this combination method is compared with standard action selection with the deep Q-learning algorithm.\",\"PeriodicalId\":322431,\"journal\":{\"name\":\"Proceedings of the 2023 ACM Southeast Conference\",\"volume\":\"37 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2023 ACM Southeast Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3564746.3587011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 ACM Southeast Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3564746.3587011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Deep Reinforcement Learning Technique for PNPSC Net Player Strategies
Petri Nets with Players, Strategies, and Costs (PNPSC) is an extension of Petri nets specifically designed to model cyberattacks. The PNPSC formalism includes a representation of the strategies for the competing "players," i.e., the attacker and defender. Developing well-performing strategies for players in PNPSC nets is challenging for both game tree and reinforcement learning algorithms. This paper presents a method of modeling the PNPSC net player strategies as a game tree and using a combination of Monte Carlo Tree Search (MCTS) and deep reinforcement learning to effectively improve the players' strategies. The performance of this combined method is compared with that of standard action selection under the deep Q-learning algorithm.
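To make the MCTS side of the combination concrete, the sketch below shows a minimal UCT-style Monte Carlo Tree Search for a generic two-player, zero-sum game. This is an illustration only, not the paper's implementation: the PNPSC net's actual states, actions, and cost structure are not reproduced here, and the stand-in game (Nim: take 1 or 2 tokens, last token wins) is a hypothetical placeholder for the attacker/defender game tree. In the paper's approach, the rollout/evaluation step would be informed by a deep network rather than the purely random playout used here.

```python
import math
import random

class Node:
    """One node of the game tree: a token count and the player to move."""
    def __init__(self, tokens, player, parent=None, move=None):
        self.tokens = tokens          # tokens remaining in the pile
        self.player = player          # 0 or 1 (attacker/defender by analogy)
        self.parent = parent
        self.move = move              # move that led from parent to this node
        self.children = []
        self.visits = 0
        self.wins = 0.0               # wins from the parent player's perspective

    def untried_moves(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2) if m <= self.tokens and m not in tried]

def uct_select(node, c=1.4):
    # UCB1: exploit average win rate, explore rarely visited children.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
                              + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(tokens, player):
    """Random playout to the end of the game; returns the winning player."""
    while True:
        take = random.choice([m for m in (1, 2) if m <= tokens])
        tokens -= take
        if tokens == 0:
            return player             # this player took the last token
        player = 1 - player

def mcts_best_move(tokens, player, iterations=3000):
    root = Node(tokens, player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried_moves() and node.children:
            node = uct_select(node)
        # 2. Expansion: add one untried child, if any remain.
        moves = node.untried_moves()
        if moves:
            m = random.choice(moves)
            child = Node(node.tokens - m, 1 - node.player, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout (a learned value net could replace this).
        if node.tokens == 0:
            winner = 1 - node.player  # the previous mover took the last token
        else:
            winner = rollout(node.tokens, node.player)
        # 4. Backpropagation: credit wins to the player who chose each move.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Recommend the most-visited move, the standard robust-child rule.
    return max(root.children, key=lambda ch: ch.visits).move
```

In this toy game the losing positions are the multiples of 3, so from 4 tokens the search converges on taking 1 (leaving 3), and from 5 on taking 2. The paper's contribution lies in coupling this kind of tree search with deep Q-learning over PNPSC net states, which the placeholder rollout above only gestures at.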