MARNet:针对协同多智能体强化学习的后门攻击

IF 7 2区计算机科学 Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE IEEE Transactions on Dependable and Secure Computing Pub Date : 2023-09-01 DOI:10.1109/TDSC.2022.3207429

Yanjiao Chen, Zhicong Zheng, Xueluan Gong

{"title":"MARNet:针对协同多智能体强化学习的后门攻击","authors":"Yanjiao Chen, Zhicong Zheng, Xueluan Gong","doi":"10.1109/TDSC.2022.3207429","DOIUrl":null,"url":null,"abstract":"Recent works have revealed that backdoor attacks against Deep Reinforcement Learning (DRL) could lead to abnormal action selections of the agent, which may result in failure or even catastrophe in crucial decision processes. However, existing attacks only consider single-agent reinforcement learning (RL) systems, in which the only agent can observe the global state and have full control of the decision process. In this article, we explore a new backdoor attack paradigm in cooperative multi-agent reinforcement learning (CMARL) scenarios, where a group of agents coordinate with each other to achieve a common goal, while each agent can only observe the local state. In the proposed MARNet attack framework, we carefully design a pipeline of trigger design, action poisoning, and reward hacking modules to accommodate the cooperative multi-agent settings. In particular, as only a subset of agents can observe the triggers in their local observations, we maneuver their actions to the worst actions suggested by an expert policy model. Since the global reward in CMARL is aggregated by individual rewards from all agents, we propose to modify the reward in a way that boosts the bad actions of poisoned agents (agents who observe the triggers) but mitigates the influence on non-poisoned agents. We conduct extensive experiments on three classical CMARL algorithms VDN, COMA, and QMIX, in two popular CMARL games Predator Prey and SMAC. The results show that the baselines extended from single-agent DRL backdoor attacks seldom work in CMARL problems while MARNet performs well by reducing the utility under attack by nearly 100%. We apply fine-tuning as a potential defense against MARNet and demonstrate that fine-tuning cannot entirely eliminate the effect of the attack.","PeriodicalId":13047,"journal":{"name":"IEEE Transactions on Dependable and Secure Computing","volume":"20 1","pages":"4188-4198"},"PeriodicalIF":7.0000,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"MARNet: Backdoor Attacks Against Cooperative Multi-Agent Reinforcement Learning\",\"authors\":\"Yanjiao Chen, Zhicong Zheng, Xueluan Gong\",\"doi\":\"10.1109/TDSC.2022.3207429\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent works have revealed that backdoor attacks against Deep Reinforcement Learning (DRL) could lead to abnormal action selections of the agent, which may result in failure or even catastrophe in crucial decision processes. However, existing attacks only consider single-agent reinforcement learning (RL) systems, in which the only agent can observe the global state and have full control of the decision process. In this article, we explore a new backdoor attack paradigm in cooperative multi-agent reinforcement learning (CMARL) scenarios, where a group of agents coordinate with each other to achieve a common goal, while each agent can only observe the local state. In the proposed MARNet attack framework, we carefully design a pipeline of trigger design, action poisoning, and reward hacking modules to accommodate the cooperative multi-agent settings. In particular, as only a subset of agents can observe the triggers in their local observations, we maneuver their actions to the worst actions suggested by an expert policy model. Since the global reward in CMARL is aggregated by individual rewards from all agents, we propose to modify the reward in a way that boosts the bad actions of poisoned agents (agents who observe the triggers) but mitigates the influence on non-poisoned agents. We conduct extensive experiments on three classical CMARL algorithms VDN, COMA, and QMIX, in two popular CMARL games Predator Prey and SMAC. The results show that the baselines extended from single-agent DRL backdoor attacks seldom work in CMARL problems while MARNet performs well by reducing the utility under attack by nearly 100%. We apply fine-tuning as a potential defense against MARNet and demonstrate that fine-tuning cannot entirely eliminate the effect of the attack.\",\"PeriodicalId\":13047,\"journal\":{\"name\":\"IEEE Transactions on Dependable and Secure Computing\",\"volume\":\"20 1\",\"pages\":\"4188-4198\"},\"PeriodicalIF\":7.0000,\"publicationDate\":\"2023-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Dependable and Secure Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/TDSC.2022.3207429\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Dependable and Secure Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TDSC.2022.3207429","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 5

摘要

最近的研究表明，针对深度强化学习(DRL)的后门攻击可能导致智能体的异常行为选择，这可能导致关键决策过程的失败甚至灾难。然而，现有的攻击只考虑单智能体强化学习(RL)系统，其中唯一的智能体可以观察全局状态并完全控制决策过程。在本文中，我们探索了一种新的多智能体协作强化学习(CMARL)场景中的后门攻击范式，其中一组智能体相互协调以实现共同目标，而每个智能体只能观察局部状态。在提出的MARNet攻击框架中，我们精心设计了触发设计、动作投毒和奖励黑客模块的管道，以适应多智能体的协作设置。特别是，由于只有一小部分代理可以在其局部观察中观察到触发器，因此我们将其行为调整为专家策略模型建议的最坏行为。由于CMARL中的全局奖励是由所有代理的个体奖励汇总而成的，我们建议修改奖励，以促进中毒代理(观察触发器的代理)的不良行为，但减轻对非中毒代理的影响。在两款CMARL热门游戏《Predator Prey》和《SMAC》中，我们对CMARL的三种经典算法VDN、COMA和QMIX进行了广泛的实验。结果表明，从单代理DRL后门攻击扩展的基线很少适用于CMARL问题，而MARNet通过将攻击下的效用降低近100%而表现良好。我们应用微调作为对MARNet的潜在防御，并证明微调不能完全消除攻击的影响。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

MARNet: Backdoor Attacks Against Cooperative Multi-Agent Reinforcement Learning

Recent works have revealed that backdoor attacks against Deep Reinforcement Learning (DRL) could lead to abnormal action selections of the agent, which may result in failure or even catastrophe in crucial decision processes. However, existing attacks only consider single-agent reinforcement learning (RL) systems, in which the only agent can observe the global state and have full control of the decision process. In this article, we explore a new backdoor attack paradigm in cooperative multi-agent reinforcement learning (CMARL) scenarios, where a group of agents coordinate with each other to achieve a common goal, while each agent can only observe the local state. In the proposed MARNet attack framework, we carefully design a pipeline of trigger design, action poisoning, and reward hacking modules to accommodate the cooperative multi-agent settings. In particular, as only a subset of agents can observe the triggers in their local observations, we maneuver their actions to the worst actions suggested by an expert policy model. Since the global reward in CMARL is aggregated by individual rewards from all agents, we propose to modify the reward in a way that boosts the bad actions of poisoned agents (agents who observe the triggers) but mitigates the influence on non-poisoned agents. We conduct extensive experiments on three classical CMARL algorithms VDN, COMA, and QMIX, in two popular CMARL games Predator Prey and SMAC. The results show that the baselines extended from single-agent DRL backdoor attacks seldom work in CMARL problems while MARNet performs well by reducing the utility under attack by nearly 100%. We apply fine-tuning as a potential defense against MARNet and demonstrate that fine-tuning cannot entirely eliminate the effect of the attack.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Dependable and Secure Computing 工程技术-计算机：软件工程

CiteScore

11.20

自引率

5.50%

发文量

354

审稿时长

9 months

期刊介绍： The "IEEE Transactions on Dependable and Secure Computing (TDSC)" is a prestigious journal that publishes high-quality, peer-reviewed research in the field of computer science, specifically targeting the development of dependable and secure computing systems and networks. This journal is dedicated to exploring the fundamental principles, methodologies, and mechanisms that enable the design, modeling, and evaluation of systems that meet the required levels of reliability, security, and performance. The scope of TDSC includes research on measurement, modeling, and simulation techniques that contribute to the understanding and improvement of system performance under various constraints. It also covers the foundations necessary for the joint evaluation, verification, and design of systems that balance performance, security, and dependability. By publishing archival research results, TDSC aims to provide a valuable resource for researchers, engineers, and practitioners working in the areas of cybersecurity, fault tolerance, and system reliability. The journal's focus on cutting-edge research ensures that it remains at the forefront of advancements in the field, promoting the development of technologies that are critical for the functioning of modern, complex systems.