Research on Gait Switching Method Based on Speed Requirement
Weijun Tian, Kuiyue Zhou, Jian Song, Xu Li, Zhu Chen, Ziteng Sheng, Ruizhi Wang, Jiang Lei, Qian Cong
Journal of Bionic Engineering, vol. 21, no. 6, pp. 2817-2829, published 2024-09-04
DOI: 10.1007/s42235-024-00589-1
https://link.springer.com/article/10.1007/s42235-024-00589-1
Impact factor: 4.9; JCR: Q1 (Engineering, Multidisciplinary)
Citations: 0
Abstract
Real-time gait switching of a quadruped robot as its speed changes is a difficult problem in robotics research, and applying reinforcement learning (RL) to it is a novel solution. In this paper, a quadruped robot simulation platform is built on the Robot Operating System (ROS). OpenAI Gym is used as the RL framework, and the Proximal Policy Optimization (PPO) algorithm is used for gait switching. The training task is to learn gait parameters for each speed input, including gait type, gait cycle, gait offset, and gait interval. The trained gait parameters are then fed to a Model Predictive Control (MPC) controller, which calculates the joint forces/torques. These are transmitted to the joint motors of the quadruped robot to control joint rotation, realizing gait switching at different speeds. The robot can thus more realistically imitate the gait transitions of animals: walking at very low speed, trotting at medium speed, and galloping at high speed. The reward design integrates several factors that affect gait training: a velocity reward, a time reward, an energy reward, and a balance reward. Each reward is assigned its own weight, and the instantaneous reward at each training step is obtained by multiplying each reward by its weight, which ensures the reliability of the training results. Multiple groups of comparative simulation experiments are also carried out.
The results show that the priority of the balance, velocity, energy, and time rewards decreases in that order, and that the weight of each reward does not exceed 0.5. Training works best when the policy network and the value network are each a three-layer neural network with 64 neurons per layer and the discount factor is 0.99.
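
The weighted reward described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract states only that each reward is multiplied by its own weight, so combining the weighted terms by summation is an assumption here. The names `RewardWeights` and `instant_reward` are hypothetical, and the weight values are placeholders chosen only to respect the reported priority (balance > velocity > energy > time) and the 0.5 upper bound.

```python
from dataclasses import dataclass
from typing import Mapping

# Reward terms in the priority order reported in the abstract.
TERMS = ("balance", "velocity", "energy", "time")


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical values, not taken from the paper: they only follow
    # the reported ordering and stay at or below 0.5.
    balance: float = 0.4
    velocity: float = 0.3
    energy: float = 0.2
    time: float = 0.1

    def __post_init__(self) -> None:
        # Enforce the reported bound: no single weight exceeds 0.5.
        for name in TERMS:
            weight = getattr(self, name)
            if not 0.0 <= weight <= 0.5:
                raise ValueError(f"weight '{name}' must be in [0, 0.5], got {weight}")


def instant_reward(terms: Mapping[str, float],
                   weights: RewardWeights = RewardWeights()) -> float:
    """Return the per-step reward as a weighted sum of the four terms.

    Summing the weighted terms is an assumption; the abstract only says
    each reward is multiplied by its own weight.
    """
    return sum(getattr(weights, name) * terms[name] for name in TERMS)


if __name__ == "__main__":
    # Example step: good balance, moderate speed tracking, small energy penalty.
    step_terms = {"balance": 1.0, "velocity": 0.5, "energy": -0.2, "time": 0.0}
    print(instant_reward(step_terms))
```

In an RL environment, a function like this would be called once per simulation step, with each term computed from the robot state before being passed in.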
About the journal:
The Journal of Bionic Engineering (JBE) is a peer-reviewed journal that publishes original research papers and reviews that apply the knowledge learned from nature and biological systems to solve concrete engineering problems. The topics that JBE covers include but are not limited to:
Mechanisms, kinematical mechanics and control of animal locomotion, development of mobile robots with walking (running and crawling), swimming or flying abilities inspired by animal locomotion.
Structures, morphologies, composition and physical properties of natural and biomaterials; fabrication of new materials mimicking the properties and functions of natural and biomaterials.
Biomedical materials, artificial organs and tissue engineering for medical applications; rehabilitation equipment and devices.
Development of bioinspired computation methods and artificial intelligence for engineering applications.