BrainQN: Enhancing the Robustness of Deep Reinforcement Learning with Spiking Neural Networks

Advanced intelligent systems (Weinheim an der Bergstrasse, Germany) | IF 6.8 | Q1, AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-08-04 | DOI: 10.1002/aisy.202400075

Shuo Feng, Jian Cao, Zehong Ou, Guang Chen, Yi Zhong, Zilin Wang, Juntong Yan, Jue Chen, Bingsen Wang, Chenglong Zou, Zebang Feng, Yuan Wang


Citations: 0

Abstract

As the third generation of neural networks, succeeding artificial neural networks (ANNs), spiking neural networks (SNNs) offer high robustness and low energy consumption. Taking inspiration from biological systems, this work introduces SNNs to address the low robustness and high power consumption of deep reinforcement learning (DRL). The Brain Q-network (BrainQN) is proposed, which replaces the neurons in the classic deep Q-network (DQN) algorithm with spiking neurons. BrainQN is trained with two methods: surrogate gradient learning (SGL) and ANN-to-SNN conversion. In robustness tests with input noise, BrainQN outperforms DQN, achieving 82.14% higher rewards under low noise and 71.74% higher rewards under high noise. These findings highlight BrainQN's robustness in noisy environments and support its application in complex scenarios. Under high noise, SGL-trained BrainQN is more robust than BrainQN obtained by ANN-to-SNN conversion. Differences in the correlation between network outputs for noisy and original inputs, together with differences between the training algorithms, explain this phenomenon. BrainQN was successfully transferred from a simulated Pong environment to a ball-catching robot equipped with dynamic vision sensors (DVS). On the neuromorphic chip PAICORE, it shows significant advantages in latency and power consumption over the Jetson Xavier NX.
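
The abstract does not say which spiking neuron model or surrogate function BrainQN uses. As a minimal sketch only, assuming a leaky integrate-and-fire (LIF) neuron with a hard reset and a rectangular surrogate gradient (common choices in the SNN literature, not confirmed as the authors'), the following shows what "replacing a neuron with a spiking neuron" and "surrogate gradient" mean in practice. The names `lif_forward` and `surrogate_grad`, and the parameters `decay`, `threshold`, and `width`, are hypothetical and chosen for illustration.

```python
def lif_forward(currents, decay=0.5, threshold=1.0):
    """Simulate one LIF neuron over a sequence of input currents.

    Assumed model (not taken from the paper): leaky integration,
    Heaviside spike generation, hard reset to zero after a spike.
    Returns the binary spike train and the pre-reset membrane potentials.
    """
    v = 0.0
    spikes, potentials = [], []
    for current in currents:
        v = decay * v + current            # leaky integration of the input
        potentials.append(v)
        spike = 1 if v >= threshold else 0  # non-differentiable threshold
        spikes.append(spike)
        if spike:
            v = 0.0                         # hard reset after firing
    return spikes, potentials


def surrogate_grad(v, threshold=1.0, width=1.0):
    """Rectangular surrogate for d(spike)/d(v).

    The true derivative of the threshold function is zero almost
    everywhere, so training replaces it in the backward pass with a
    window of height 1/width centred on the threshold.
    """
    return 1.0 / width if abs(v - threshold) < width / 2 else 0.0


spikes, potentials = lif_forward([0.6, 0.6, 0.6, 0.0, 1.2])
print(spikes)
print([surrogate_grad(v) for v in potentials])
```

In the forward pass the neuron emits binary spikes; in the backward pass, `surrogate_grad` stands in for the derivative of the threshold so that gradients can flow through spiking layers. ANN-to-SNN conversion, by contrast, trains a conventional network first and then maps its weights onto spiking neurons, so no surrogate is needed.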


