{"title":"BrainQN:利用尖峰神经网络增强深度强化学习的鲁棒性","authors":"Shuo Feng, Jian Cao, Zehong Ou, Guang Chen, Yi Zhong, Zilin Wang, Juntong Yan, Jue Chen, Bingsen Wang, Chenglong Zou, Zebang Feng, Yuan Wang","doi":"10.1002/aisy.202400075","DOIUrl":null,"url":null,"abstract":"<p>As the third-generation network succeeding artificial neural networks (ANNs), spiking neural networks (SNNs) offer high robustness and low energy consumption. Inspired by biological systems, the limitations of low robustness and high-power consumption in deep reinforcement learning (DRL) are addressed by introducing SNNs. The Brain Q-network (BrainQN) is proposed, which replaces the neurons in the classic Deep Q-learning (DQN) algorithm with SNN neurons. BrainQN is trained using surrogate gradient learning (SGL) and ANN-to-SNN conversion methods. Robustness tests with input noise reveal BrainQN's superior performance, achieving an 82.14% increase in rewards under low noise and 71.74% under high noise compared to DQN. These findings highlight BrainQN's robustness and superior performance in noisy environments, supporting its application in complex scenarios. SGL-trained BrainQN is more robust than ANN-to-SNN conversion under high noise. The differences in network output correlations between noisy and original inputs, along with training algorithm distinctions, explain this phenomenon. BrainQN successfully transitioned from a simulated Pong environment to a ball-catching robot with dynamic vision sensors (DVS). 
On the neuromorphic chip PAICORE, it shows significant advantages in latency and power consumption compared to Jetson Xavier NX.</p>","PeriodicalId":93858,"journal":{"name":"Advanced intelligent systems (Weinheim an der Bergstrasse, Germany)","volume":null,"pages":null},"PeriodicalIF":6.8000,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aisy.202400075","citationCount":"0","resultStr":"{\"title\":\"BrainQN: Enhancing the Robustness of Deep Reinforcement Learning with Spiking Neural Networks\",\"authors\":\"Shuo Feng, Jian Cao, Zehong Ou, Guang Chen, Yi Zhong, Zilin Wang, Juntong Yan, Jue Chen, Bingsen Wang, Chenglong Zou, Zebang Feng, Yuan Wang\",\"doi\":\"10.1002/aisy.202400075\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>As the third-generation network succeeding artificial neural networks (ANNs), spiking neural networks (SNNs) offer high robustness and low energy consumption. Inspired by biological systems, the limitations of low robustness and high-power consumption in deep reinforcement learning (DRL) are addressed by introducing SNNs. The Brain Q-network (BrainQN) is proposed, which replaces the neurons in the classic Deep Q-learning (DQN) algorithm with SNN neurons. BrainQN is trained using surrogate gradient learning (SGL) and ANN-to-SNN conversion methods. Robustness tests with input noise reveal BrainQN's superior performance, achieving an 82.14% increase in rewards under low noise and 71.74% under high noise compared to DQN. These findings highlight BrainQN's robustness and superior performance in noisy environments, supporting its application in complex scenarios. SGL-trained BrainQN is more robust than ANN-to-SNN conversion under high noise. The differences in network output correlations between noisy and original inputs, along with training algorithm distinctions, explain this phenomenon. 
BrainQN successfully transitioned from a simulated Pong environment to a ball-catching robot with dynamic vision sensors (DVS). On the neuromorphic chip PAICORE, it shows significant advantages in latency and power consumption compared to Jetson Xavier NX.</p>\",\"PeriodicalId\":93858,\"journal\":{\"name\":\"Advanced intelligent systems (Weinheim an der Bergstrasse, Germany)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":6.8000,\"publicationDate\":\"2024-08-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/aisy.202400075\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Advanced intelligent systems (Weinheim an der Bergstrasse, Germany)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/aisy.202400075\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advanced intelligent systems (Weinheim an der Bergstrasse, Germany)","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/aisy.202400075","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
BrainQN: Enhancing the Robustness of Deep Reinforcement Learning with Spiking Neural Networks
As the third generation of neural networks, succeeding artificial neural networks (ANNs), spiking neural networks (SNNs) offer high robustness and low energy consumption. Inspired by biological systems, this work addresses the low robustness and high power consumption of deep reinforcement learning (DRL) by introducing SNNs. The Brain Q-network (BrainQN) is proposed, which replaces the neurons in the classic Deep Q-learning (DQN) algorithm with spiking neurons. BrainQN is trained with two methods: surrogate gradient learning (SGL) and ANN-to-SNN conversion. Robustness tests with input noise show BrainQN's superior performance: compared with DQN, it achieves an 82.14% increase in reward under low noise and a 71.74% increase under high noise. These findings highlight BrainQN's robustness and superior performance in noisy environments and support its application in complex scenarios. Under high noise, SGL-trained BrainQN is more robust than the ANN-to-SNN-converted version; this is explained by differences in the correlation between network outputs for noisy and original inputs, together with distinctions between the training algorithms. BrainQN successfully transferred from a simulated Pong environment to a ball-catching robot equipped with dynamic vision sensors (DVS). Deployed on the neuromorphic chip PAICORE, it shows significant advantages in latency and power consumption over a Jetson Xavier NX.
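To make the core idea concrete: the spiking units that replace DQN's analog activations are typically leaky integrate-and-fire (LIF) neurons, whose hard firing threshold is non-differentiable and is therefore replaced by a smooth surrogate during backpropagation (the SGL method the abstract refers to). The sketch below is a minimal, dependency-free illustration of LIF dynamics, not the paper's implementation; the leak factor and threshold are assumed values chosen for demonstration.

```python
# Minimal leaky integrate-and-fire (LIF) neuron sketch. The constants
# (beta, threshold) are illustrative assumptions, not BrainQN's parameters.

def lif_forward(inputs, beta=0.9, threshold=1.0):
    """Simulate one LIF neuron over a sequence of input currents.

    The membrane potential leaks by `beta` each step, integrates the
    input, and resets to 0 after a spike. Returns the binary spike train.
    """
    v = 0.0
    spikes = []
    for current in inputs:
        v = beta * v + current    # leaky integration
        if v >= threshold:        # hard threshold: non-differentiable,
            spikes.append(1)      # so surrogate gradient learning swaps
            v = 0.0               # in a smooth function (e.g. a sigmoid
        else:                     # slope) on the backward pass
            spikes.append(0)
    return spikes

# A constant sub-threshold drive accumulates until the neuron fires,
# then the cycle repeats: the signal is coded as a spike rate over time.
train = lif_forward([0.4] * 10)
print(sum(train), train)
```

In a spiking Q-network, Q-values are read out from spike counts or membrane potentials accumulated over a time window, which is what allows the network to run on event-driven neuromorphic hardware such as PAICORE.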