Eavesdropping Game Based on Multi-Agent Deep Reinforcement Learning

Delin Guo, Lan Tang, Luxi Yang, Ying-Chang Liang
{"title":"基于多智能体深度强化学习的窃听博弈","authors":"Delin Guo, Lan Tang, Luxi Yang, Ying-Chang Liang","doi":"10.1109/spawc51304.2022.9833927","DOIUrl":null,"url":null,"abstract":"This paper considers an adversarial scenario between a legitimate eavesdropper and a suspicious communication pair. All three nodes are equipped with multiple antennas. The eavesdropper, which operates in a full-duplex model, aims to wiretap the dubious communication pair via proactive jamming. On the other hand, the suspicious transmitter, which can send artificial noise (AN) to disturb the wiretap channel, aims to guarantee secrecy. More specifically, the eavesdropper adjusts jamming power to enhance the wiretap rate, while the suspicious transmitter jointly adapts the transmit power and noise power against the eavesdropping. Considering the partial observation and complicated interactions between the eavesdropper and the suspicious pair in unknown system dynamics, we model the problem as an imperfect-information stochastic game. To approach the Nash equilibrium solution of the eavesdropping game, we develop a multi-agent reinforcement learning (MARL) algorithm, termed neural fictitious self-play with soft actor-critic (NFSP-SAC), by combining the fictitious self-play (FSP) with a deep reinforcement learning algorithm, SAC. The introduction of SAC enables FSP to handle the problems with continuous and high dimension observation and action space. The simulation results demonstrate that the power allocation policies learned by our method empirically converge to a Nash equilibrium, while the compared reinforcement learning algorithms suffer from severe fluctuations during the learning process.","PeriodicalId":423807,"journal":{"name":"2022 IEEE 23rd International Workshop on Signal Processing Advances in Wireless Communication (SPAWC)","volume":"133 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Eavesdropping Game Based on Multi-Agent Deep Reinforcement Learning\",\"authors\":\"Delin Guo, Lan Tang, Luxi Yang, Ying-Chang Liang\",\"doi\":\"10.1109/spawc51304.2022.9833927\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper considers an adversarial scenario between a legitimate eavesdropper and a suspicious communication pair. All three nodes are equipped with multiple antennas. The eavesdropper, which operates in a full-duplex model, aims to wiretap the dubious communication pair via proactive jamming. On the other hand, the suspicious transmitter, which can send artificial noise (AN) to disturb the wiretap channel, aims to guarantee secrecy. More specifically, the eavesdropper adjusts jamming power to enhance the wiretap rate, while the suspicious transmitter jointly adapts the transmit power and noise power against the eavesdropping. Considering the partial observation and complicated interactions between the eavesdropper and the suspicious pair in unknown system dynamics, we model the problem as an imperfect-information stochastic game. To approach the Nash equilibrium solution of the eavesdropping game, we develop a multi-agent reinforcement learning (MARL) algorithm, termed neural fictitious self-play with soft actor-critic (NFSP-SAC), by combining the fictitious self-play (FSP) with a deep reinforcement learning algorithm, SAC. The introduction of SAC enables FSP to handle the problems with continuous and high dimension observation and action space. 
The simulation results demonstrate that the power allocation policies learned by our method empirically converge to a Nash equilibrium, while the compared reinforcement learning algorithms suffer from severe fluctuations during the learning process.\",\"PeriodicalId\":423807,\"journal\":{\"name\":\"2022 IEEE 23rd International Workshop on Signal Processing Advances in Wireless Communication (SPAWC)\",\"volume\":\"133 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE 23rd International Workshop on Signal Processing Advances in Wireless Communication (SPAWC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/spawc51304.2022.9833927\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 23rd International Workshop on Signal Processing Advances in Wireless Communication (SPAWC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/spawc51304.2022.9833927","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

This paper considers an adversarial scenario between a legitimate eavesdropper and a suspicious communication pair. All three nodes are equipped with multiple antennas. The eavesdropper, which operates in full-duplex mode, aims to wiretap the suspicious communication pair via proactive jamming. On the other hand, the suspicious transmitter, which can send artificial noise (AN) to disturb the wiretap channel, aims to guarantee secrecy. More specifically, the eavesdropper adjusts its jamming power to enhance the wiretap rate, while the suspicious transmitter jointly adapts its transmit power and noise power against the eavesdropping. Considering the partial observation and complicated interactions between the eavesdropper and the suspicious pair under unknown system dynamics, we model the problem as an imperfect-information stochastic game. To approach the Nash equilibrium solution of the eavesdropping game, we develop a multi-agent reinforcement learning (MARL) algorithm, termed neural fictitious self-play with soft actor-critic (NFSP-SAC), by combining fictitious self-play (FSP) with a deep reinforcement learning algorithm, SAC. The introduction of SAC enables FSP to handle problems with continuous, high-dimensional observation and action spaces. The simulation results demonstrate that the power allocation policies learned by our method empirically converge to a Nash equilibrium, while the compared reinforcement learning algorithms suffer from severe fluctuations during the learning process.
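To make the interaction concrete, below is a minimal single-antenna Python sketch of the kind of per-step payoff such a power-control game could use. The paper's model is multi-antenna and its exact rate expressions are not reproduced here; the channel gains, residual self-interference factor, and reward definitions in this sketch are illustrative assumptions only.

```python
import numpy as np

# Minimal scalar-channel sketch of one step of the eavesdropping game.
# All gains, the self-interference factor, and the reward definitions
# are assumptions for illustration, not the paper's model.

def step(p_tx, p_an, p_jam,
         g_tr=1.0,    # transmitter -> suspicious receiver gain (assumed)
         g_te=0.8,    # transmitter -> eavesdropper gain (assumed)
         g_er=0.5,    # eavesdropper jamming -> suspicious receiver gain (assumed)
         g_self=0.1,  # residual self-interference at the full-duplex eavesdropper (assumed)
         noise=1e-2):
    # Rate of the suspicious link, degraded by the eavesdropper's proactive jamming.
    sinr_rx = g_tr * p_tx / (g_er * p_jam + noise)
    rate_rx = np.log2(1.0 + sinr_rx)

    # Wiretap rate at the eavesdropper, degraded by the transmitter's
    # artificial noise (AN) and by residual self-interference from jamming.
    sinr_ev = g_te * p_tx / (g_te * p_an + g_self * p_jam + noise)
    rate_ev = np.log2(1.0 + sinr_ev)

    # Illustrative adversarial payoffs: the eavesdropper wants a high
    # wiretap rate; the suspicious pair wants throughput with secrecy.
    reward_eavesdropper = rate_ev
    reward_transmitter = rate_rx - rate_ev
    return reward_transmitter, reward_eavesdropper

# Example: the transmitter splits 10 W between data and AN; the eavesdropper jams with 5 W.
print(step(p_tx=7.0, p_an=3.0, p_jam=5.0))
```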
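The NFSP-SAC idea can also be summarized with a short skeleton: each agent keeps an SAC best-response policy and a supervised average policy, and mixes them through an anticipatory parameter, as in standard NFSP. The class name, method names, and buffer sizes below are hypothetical and do not reproduce the authors' implementation.

```python
import random
from collections import deque

class NFSPAgentSketch:
    """Illustrative skeleton of NFSP action selection with an SAC
    best-response policy. Names and details are hypothetical."""

    def __init__(self, sac_policy, avg_policy, eta=0.1):
        self.sac_policy = sac_policy      # best-response policy trained by SAC (RL)
        self.avg_policy = avg_policy      # average policy trained by supervised learning
        self.eta = eta                    # anticipatory parameter
        self.rl_buffer = deque(maxlen=100_000)    # transitions for SAC updates
        self.sl_buffer = deque(maxlen=1_000_000)  # (obs, action) pairs for the average policy

    def act(self, obs):
        if random.random() < self.eta:
            # Best-response mode: continuous power action sampled from SAC,
            # also recorded so the average policy can imitate it over time.
            action = self.sac_policy.sample(obs)
            self.sl_buffer.append((obs, action))
        else:
            # Average-policy mode: play the (approximate) time-average strategy,
            # which is the component that converges toward a Nash equilibrium in NFSP.
            action = self.avg_policy.sample(obs)
        return action

    def observe(self, obs, action, reward, next_obs, done):
        # Every transition feeds the SAC best-response learner.
        self.rl_buffer.append((obs, action, reward, next_obs, done))
```

Playing mostly from the slowly changing average policy, rather than the greedy best response, is what allows the learned power-allocation strategies to settle near an equilibrium instead of oscillating.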