Title: Deep deterministic policy gradient with constraints for gait optimisation of biped robots
Authors: Xingyang Liu, Haina Rong, Ferrante Neri, Peng Yue, Gexiang Zhang
Journal: Integrated Computer-Aided Engineering
Volume: 25 1
DOI: 10.3233/ica-230724
URL: https://doi.org/10.3233/ica-230724
Publication date: 2023-12-15
Publication type: Journal Article
Impact factor: 5.8
JCR: Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Region: 2 (Computer Science)
Open access: No
Platform: Semanticscholar
Citations: 0
Abstract
In this paper, we propose a novel Reinforcement Learning (RL) algorithm for robotic motion control: a constrained Deep Deterministic Policy Gradient (DDPG) deviation learning strategy that helps biped robots walk safely and accurately. Previous research on this topic highlighted two limitations: controllers could not accurately track foot placement on discrete terrains, and safety concerns were not considered. In this study, we address these challenges with a focus on the safety of the overall system. First, we tackle the inverse kinematics problem by introducing constraints into the damped least squares method. This enhancement not only addresses singularity issues but also guarantees that joint angles stay within safe ranges, thus ensuring the stability and reliability of the system. Building on this, we adopt a constrained DDPG method to correct controller deviations. In constrained DDPG, we add a constraint layer to the Actor network and use joint deviations as state inputs. Trained offline within the range of safe angles, the network serves as a deviation corrector. Lastly, we validate the proposed approach through dynamic simulations of the CRANE biped robot. Through comprehensive assessments, including singularity analysis, constraint effectiveness evaluation, and walking experiments on discrete terrains, we demonstrate the superiority and practicality of our approach in enhancing walking performance while ensuring safety. Overall, our research contributes to the advancement of biped robot locomotion by addressing gait optimisation from multiple perspectives, including singularity handling, safety constraints, and deviation learning.
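The two ideas named in the abstract, a damped (damping) least squares inverse kinematics step with joint limits and a bounded actor output, can be illustrated with a minimal sketch. This is not the authors' implementation: the planar two-link leg, the link lengths, the damping value, the clamping projection, and the tanh squashing in `bounded_correction` are all assumptions chosen for illustration, since the abstract does not specify how the constraints are realised.

```python
import math

# Hypothetical planar two-link leg; link lengths are illustrative assumptions.
def forward_kinematics(q, l1=0.3, l2=0.3):
    """Foot position (x, y) for joint angles q = (q1, q2)."""
    q1, q2 = q
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y

def jacobian(q, l1=0.3, l2=0.3):
    """2x2 Jacobian of the foot position with respect to the joint angles."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def clamp(value, low, high):
    return max(low, min(high, value))

def dls_step(q, target, limits, damping=0.05):
    """One damped least squares update, then projection onto the joint limits.

    Computes dq = J^T (J J^T + damping^2 I)^{-1} e, which stays finite even
    when J is singular (for example, a fully extended leg).
    """
    x, y = forward_kinematics(q)
    ex, ey = target[0] - x, target[1] - y
    (a, b), (c, d) = jacobian(q)
    # A = J J^T + damping^2 I is symmetric positive definite for damping > 0.
    a11 = a * a + b * b + damping ** 2
    a12 = a * c + b * d
    a22 = c * c + d * d + damping ** 2
    det = a11 * a22 - a12 * a12
    # v = A^{-1} e, using the closed-form inverse of a 2x2 matrix.
    vx = (a22 * ex - a12 * ey) / det
    vy = (-a12 * ex + a11 * ey) / det
    # dq = J^T v
    dq = (a * vx + c * vy, b * vx + d * vy)
    # Assumed constraint handling: clamp each joint to its safe range.
    return tuple(clamp(qi + dqi, lo, hi)
                 for qi, dqi, (lo, hi) in zip(q, dq, limits))

def solve_ik(q0, target, limits, iterations=200):
    """Iterate dls_step from the initial guess q0."""
    q = tuple(q0)
    for _ in range(iterations):
        q = dls_step(q, target, limits)
    return q

def bounded_correction(raw_output, max_dev):
    """Assumed constraint layer: tanh maps any real value into [-max_dev, max_dev]."""
    return max_dev * math.tanh(raw_output)
```

Calling `solve_ik((0.45, 0.85), forward_kinematics((0.5, 0.8)), ((-1.0, 1.0), (0.1, 2.0)))` returns joint angles inside the given limits whose foot position is close to the target. The clamp is the simplest possible projection; a constrained solver could instead fold the limits into the optimisation itself.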
About the journal:
Integrated Computer-Aided Engineering (ICAE) was founded in 1993 "based on the premise that interdisciplinary thinking and synergistic collaboration of disciplines can solve complex problems, open new frontiers, and lead to true innovations and breakthroughs, the cornerstone of industrial competitiveness and advancement of the society", as noted in the inaugural issue of the journal.
The focus of ICAE is the integration of leading-edge and emerging computer and information technologies for the innovative solution of engineering problems. The journal fosters interdisciplinary research and presents a unique forum for innovative computer-aided engineering. It also publishes novel industrial applications of CAE, thus helping to bring new computational paradigms from research labs and classrooms to reality. Areas covered by the journal include (but are not limited to) artificial intelligence, advanced signal processing, biologically inspired computing, cognitive modeling, concurrent engineering, database management, distributed computing, evolutionary computing, fuzzy logic, genetic algorithms, geometric modeling, intelligent and adaptive systems, internet-based technologies, knowledge discovery and engineering, machine learning, mechatronics, mobile computing, multimedia technologies, networking, neural network computing, object-oriented systems, optimization and search, parallel processing, robotics, virtual reality, and visualization techniques.