A new hybrid learning control system for robots based on spiking neural networks
Neural Networks (JCR Q1, Computer Science, Artificial Intelligence). Published 2024-08-22. DOI: 10.1016/j.neunet.2024.106656
https://www.sciencedirect.com/science/article/pii/S089360802400580X
Citation count: 0
Abstract
This paper presents a new hybrid learning and control method in which controllers tune their own parameters through reinforcement learning. In the proposed method, nonlinear controllers are treated as multi-input multi-output functions, and these functions are replaced with spiking neural networks (SNNs) trained by reinforcement learning. Dopamine-modulated spike-timing-dependent plasticity (STDP) provides the reinforcement learning rule: it adjusts the synaptic weights between the input and output neuronal groups, which in turn adjusts the controller parameters. The method is described in detail, with case studies on two nonlinear controllers, Fractional Order PID (FOPID) and Feedback Linearization. The structure and the dynamic equations for learning are presented, the proposed algorithm is tested on robots, and the results are compared with other works. To demonstrate the effectiveness of the SNN-based FOPID controller (SNNFOPID), it was tested on a two-wheel mobile robot, a double inverted pendulum, and a four-link manipulator robot, with errors of 0.01 m, 0.03 rad, and 0.03 rad, respectively. The method was also tested with Feedback Linearization (SNNFL), which gave acceptable results. The results show that the new method performs better in terms of Integral Absolute Error (IAE) and is well suited to hardware implementation because of its low energy consumption, high speed, and accuracy. The time needed to reach full and stable proficiency in controlling the various robotic systems with SNNFOPID and SNNFL, on an Asus Core i5 system within Simulink’s Simscape environment, is as follows:
– Two-link robot manipulator with SNNFOPID: 19.85656 hours
– Two-link robot manipulator with SNNFL: 0.45828 hours
– Double inverted pendulum with SNNFOPID: 3.455 hours
– Mobile robot with SNNFOPID: 3.71948 hours
– Four-link robot manipulator with SNNFOPID: 16.6789 hours.
The method can be generalized to other controllers and to other systems, such as robots.
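The abstract does not give the learning equations, so the following is only a minimal sketch of how dopamine-modulated STDP is commonly formulated: spike pairings write to a decaying eligibility trace, and the synaptic weight changes only while a dopamine (reward) signal is present. The function names, time constants, learning rate, and weight bounds are all illustrative assumptions, not values from the paper.

```python
import math


def stdp_window(dt_spike, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Pair-based STDP window (assumed form).

    dt_spike = t_post - t_pre in ms. Pre-before-post (dt_spike > 0)
    gives a positive contribution, post-before-pre a negative one.
    """
    if dt_spike > 0:
        return a_plus * math.exp(-dt_spike / tau)
    if dt_spike < 0:
        return -a_minus * math.exp(dt_spike / tau)
    return 0.0


def run_synapse(spike_pairs, dopamine, steps,
                step_ms=1.0, tau_c=200.0, lr=0.01, w0=0.5):
    """Simulate one synapse under dopamine-gated STDP.

    spike_pairs: dict mapping a time step to the spike-time difference
                 (t_post - t_pre, in ms) of a pairing seen at that step.
    dopamine:    function step -> dopamine level (hypothetical reward signal).
    Returns the final weight, clipped to [0, 1].
    """
    w = w0   # synaptic weight
    c = 0.0  # eligibility trace
    for k in range(steps):
        # The trace decays exponentially (forward Euler step).
        c -= (c / tau_c) * step_ms
        # A spike pairing adds the STDP window value to the trace.
        if k in spike_pairs:
            c += stdp_window(spike_pairs[k])
        # The weight moves only when both trace and dopamine are nonzero.
        w += lr * c * dopamine(k) * step_ms
        w = min(max(w, 0.0), 1.0)
    return w


def reward_burst(k):
    """Hypothetical reward: dopamine present for steps 50..59 only."""
    return 1.0 if 50 <= k < 60 else 0.0


def no_reward(k):
    return 0.0


# A causal pairing (pre fires 5 ms before post) at step 10.
w_rewarded = run_synapse({10: 5.0}, reward_burst, steps=100)
w_unrewarded = run_synapse({10: 5.0}, no_reward, steps=100)
# An anti-causal pairing (post fires 5 ms before pre) followed by reward.
w_anticausal = run_synapse({10: -5.0}, reward_burst, steps=100)
```

In this sketch the rewarded causal pairing strengthens the synapse, the same pairing without dopamine leaves the weight at its initial value, and a rewarded anti-causal pairing weakens it. The eligibility trace is what lets a reward that arrives tens of milliseconds after the pairing still credit that synapse.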
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.