A Spatiotemporal Stealthy Backdoor Attack against Cooperative Multi-Agent Deep Reinforcement Learning

Yinbo Yu, Saihao Yan, Jiajia Liu
{"title":"针对合作多代理深度强化学习的时空隐形后门攻击","authors":"Yinbo Yu, Saihao Yan, Jiajia Liu","doi":"arxiv-2409.07775","DOIUrl":null,"url":null,"abstract":"Recent studies have shown that cooperative multi-agent deep reinforcement\nlearning (c-MADRL) is under the threat of backdoor attacks. Once a backdoor\ntrigger is observed, it will perform abnormal actions leading to failures or\nmalicious goals. However, existing proposed backdoors suffer from several\nissues, e.g., fixed visual trigger patterns lack stealthiness, the backdoor is\ntrained or activated by an additional network, or all agents are backdoored. To\nthis end, in this paper, we propose a novel backdoor attack against c-MADRL,\nwhich attacks the entire multi-agent team by embedding the backdoor only in a\nsingle agent. Firstly, we introduce adversary spatiotemporal behavior patterns\nas the backdoor trigger rather than manual-injected fixed visual patterns or\ninstant status and control the attack duration. This method can guarantee the\nstealthiness and practicality of injected backdoors. Secondly, we hack the\noriginal reward function of the backdoored agent via reward reverse and\nunilateral guidance during training to ensure its adverse influence on the\nentire team. We evaluate our backdoor attacks on two classic c-MADRL algorithms\nVDN and QMIX, in a popular c-MADRL environment SMAC. The experimental results\ndemonstrate that our backdoor attacks are able to reach a high attack success\nrate (91.6\\%) while maintaining a low clean performance variance rate (3.7\\%).","PeriodicalId":501479,"journal":{"name":"arXiv - CS - Artificial Intelligence","volume":"25 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Spatiotemporal Stealthy Backdoor Attack against Cooperative Multi-Agent Deep Reinforcement Learning\",\"authors\":\"Yinbo Yu, Saihao Yan, Jiajia Liu\",\"doi\":\"arxiv-2409.07775\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent studies have shown that cooperative multi-agent deep reinforcement\\nlearning (c-MADRL) is under the threat of backdoor attacks. Once a backdoor\\ntrigger is observed, it will perform abnormal actions leading to failures or\\nmalicious goals. However, existing proposed backdoors suffer from several\\nissues, e.g., fixed visual trigger patterns lack stealthiness, the backdoor is\\ntrained or activated by an additional network, or all agents are backdoored. To\\nthis end, in this paper, we propose a novel backdoor attack against c-MADRL,\\nwhich attacks the entire multi-agent team by embedding the backdoor only in a\\nsingle agent. Firstly, we introduce adversary spatiotemporal behavior patterns\\nas the backdoor trigger rather than manual-injected fixed visual patterns or\\ninstant status and control the attack duration. This method can guarantee the\\nstealthiness and practicality of injected backdoors. Secondly, we hack the\\noriginal reward function of the backdoored agent via reward reverse and\\nunilateral guidance during training to ensure its adverse influence on the\\nentire team. We evaluate our backdoor attacks on two classic c-MADRL algorithms\\nVDN and QMIX, in a popular c-MADRL environment SMAC. 
The experimental results\\ndemonstrate that our backdoor attacks are able to reach a high attack success\\nrate (91.6\\\\%) while maintaining a low clean performance variance rate (3.7\\\\%).\",\"PeriodicalId\":501479,\"journal\":{\"name\":\"arXiv - CS - Artificial Intelligence\",\"volume\":\"25 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07775\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07775","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Recent studies have shown that cooperative multi-agent deep reinforcement learning (c-MADRL) is under threat from backdoor attacks. Once a backdoor trigger is observed, the backdoored agent performs abnormal actions that lead to task failures or malicious goals. However, existing backdoors suffer from several issues: fixed visual trigger patterns lack stealthiness, the backdoor must be trained or activated by an additional network, or all agents are backdoored. To this end, this paper proposes a novel backdoor attack against c-MADRL that compromises the entire multi-agent team by embedding the backdoor in only a single agent. First, we use adversary spatiotemporal behavior patterns as the backdoor trigger, rather than manually injected fixed visual patterns or instantaneous state, and we control the attack duration. This design guarantees the stealthiness and practicality of the injected backdoor. Second, during training we hack the backdoored agent's original reward function via reward reversal and unilateral guidance to ensure its adverse influence on the entire team. We evaluate our backdoor attacks on two classic c-MADRL algorithms, VDN and QMIX, in the popular SMAC environment. Experimental results demonstrate that our backdoor attacks achieve a high attack success rate (91.6%) while maintaining a low clean-performance variance rate (3.7%).
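The abstract describes the attack only at a high level. The sketch below illustrates one plausible reading of the reward-reversal component: while a spatiotemporal trigger (a short window of adversary behavior) is active, the single backdoored agent's reward is negated during training so that gradient updates reinforce team-harming actions, while clean episodes are left untouched. This is a minimal sketch under stated assumptions: the trigger predicate `matches_trigger`, the observation keys, and the `WINDOW`/`ATTACK_STEPS` constants are hypothetical, and the paper's unilateral-guidance term is omitted.

```python
from collections import deque

# Hypothetical sketch of the reward-hacking idea from the abstract:
# reward reversal for the backdoored agent while a spatiotemporal
# trigger is active. Names and constants are illustrative assumptions,
# not the authors' actual implementation.

WINDOW = 5          # assumed length of the spatiotemporal trigger pattern
ATTACK_STEPS = 20   # assumed attack duration once the trigger has fired


def matches_trigger(obs_window):
    """Placeholder detector: fires when the recent observation window
    matches the adversary's behavior pattern (here, a toy predicate)."""
    return (len(obs_window) == WINDOW
            and all(o.get("enemy_retreating") for o in obs_window))


def poisoned_reward(clean_reward, trigger_active, steps_since_trigger):
    """Reverse the backdoored agent's reward while the attack is active;
    leave it untouched otherwise so clean performance barely changes."""
    if trigger_active or 0 <= steps_since_trigger < ATTACK_STEPS:
        return -clean_reward   # reward reversal
    return clean_reward


# Toy training-time usage with a rolling observation window.
window = deque(maxlen=WINDOW)
steps_since = -1   # -1 means the trigger has not fired yet
for t in range(8):
    obs, r = {"enemy_retreating": True}, 1.0   # stand-in for env.step()
    window.append(obs)
    if matches_trigger(window):
        steps_since = 0
    elif steps_since >= 0:
        steps_since += 1
    print(t, poisoned_reward(r, matches_trigger(window), steps_since))
```

Note that the poisoned reward equals the clean reward whenever the trigger window is inactive; this is plausibly what keeps the clean-performance variance low (the 3.7% reported above) while still reaching a high attack success rate once the trigger fires.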