一种用于连续软机器人的演员-评论家运动强化学习方法

IF 2.9 Q2 ROBOTICS Robotics Pub Date : 2023-10-09 DOI:10.3390/robotics12050141
Luis Pantoja-Garcia, Vicente Parra-Vega, Rodolfo Garcia-Rodriguez, Carlos Ernesto Vázquez-García
{"title":"一种用于连续软机器人的演员-评论家运动强化学习方法","authors":"Luis Pantoja-Garcia, Vicente Parra-Vega, Rodolfo Garcia-Rodriguez, Carlos Ernesto Vázquez-García","doi":"10.3390/robotics12050141","DOIUrl":null,"url":null,"abstract":"Reinforcement learning (RL) is explored for motor control of a novel pneumatic-driven soft robot modeled after continuum media with a varying density. This model complies with closed-form Lagrangian dynamics, which fulfills the fundamental structural property of passivity, among others. Then, the question arises of how to synthesize a passivity-based RL model to control the unknown continuum soft robot dynamics to exploit its input–output energy properties advantageously throughout a reward-based neural network controller. Thus, we propose a continuous-time Actor–Critic scheme for tracking tasks of the continuum 3D soft robot subject to Lipschitz disturbances. A reward-based temporal difference leads to learning with a novel discontinuous adaptive mechanism of Critic neural weights. Finally, the reward and integral of the Bellman error approximation reinforce the adaptive mechanism of Actor neural weights. Closed-loop stability is guaranteed in the sense of Lyapunov, which leads to local exponential convergence of tracking errors based on integral sliding modes. Notably, it is assumed that dynamics are unknown, yet the control is continuous and robust. A representative simulation study shows the effectiveness of our proposal for tracking tasks.","PeriodicalId":37568,"journal":{"name":"Robotics","volume":"51 1","pages":"0"},"PeriodicalIF":2.9000,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Novel Actor—Critic Motor Reinforcement Learning for Continuum Soft Robots\",\"authors\":\"Luis Pantoja-Garcia, Vicente Parra-Vega, Rodolfo Garcia-Rodriguez, Carlos Ernesto Vázquez-García\",\"doi\":\"10.3390/robotics12050141\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Reinforcement learning (RL) is explored for motor control of a novel pneumatic-driven soft robot modeled after continuum media with a varying density. This model complies with closed-form Lagrangian dynamics, which fulfills the fundamental structural property of passivity, among others. Then, the question arises of how to synthesize a passivity-based RL model to control the unknown continuum soft robot dynamics to exploit its input–output energy properties advantageously throughout a reward-based neural network controller. Thus, we propose a continuous-time Actor–Critic scheme for tracking tasks of the continuum 3D soft robot subject to Lipschitz disturbances. A reward-based temporal difference leads to learning with a novel discontinuous adaptive mechanism of Critic neural weights. Finally, the reward and integral of the Bellman error approximation reinforce the adaptive mechanism of Actor neural weights. Closed-loop stability is guaranteed in the sense of Lyapunov, which leads to local exponential convergence of tracking errors based on integral sliding modes. Notably, it is assumed that dynamics are unknown, yet the control is continuous and robust. A representative simulation study shows the effectiveness of our proposal for tracking tasks.\",\"PeriodicalId\":37568,\"journal\":{\"name\":\"Robotics\",\"volume\":\"51 1\",\"pages\":\"0\"},\"PeriodicalIF\":2.9000,\"publicationDate\":\"2023-10-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Robotics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/robotics12050141\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/robotics12050141","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
引用次数: 1

摘要

研究了一种基于变密度连续介质模型的新型气动软机器人的强化学习控制方法。该模型符合闭型拉格朗日动力学,满足无源性等基本结构性质。然后,如何综合基于被动的强化学习模型来控制未知连续体软机器人动力学,从而在基于奖励的神经网络控制器中有利地利用其输入输出能量特性。因此,我们提出了一种连续时间Actor-Critic方案,用于连续三维软机器人在Lipschitz扰动下的跟踪任务。基于奖励的时间差异导致了一种新的批评性神经权重的不连续适应机制。最后,Bellman误差逼近的奖励和积分强化了Actor神经权值的自适应机制。在Lyapunov意义下保证闭环的稳定性,使得基于积分滑模的跟踪误差的局部指数收敛。值得注意的是,它假设动力学是未知的,但控制是连续的和鲁棒的。一个有代表性的仿真研究表明了我们提出的跟踪任务的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
A Novel Actor—Critic Motor Reinforcement Learning for Continuum Soft Robots
Reinforcement learning (RL) is explored for motor control of a novel pneumatic-driven soft robot modeled after continuum media with a varying density. This model complies with closed-form Lagrangian dynamics, which fulfills the fundamental structural property of passivity, among others. Then, the question arises of how to synthesize a passivity-based RL model to control the unknown continuum soft robot dynamics to exploit its input–output energy properties advantageously throughout a reward-based neural network controller. Thus, we propose a continuous-time Actor–Critic scheme for tracking tasks of the continuum 3D soft robot subject to Lipschitz disturbances. A reward-based temporal difference leads to learning with a novel discontinuous adaptive mechanism of Critic neural weights. Finally, the reward and integral of the Bellman error approximation reinforce the adaptive mechanism of Actor neural weights. Closed-loop stability is guaranteed in the sense of Lyapunov, which leads to local exponential convergence of tracking errors based on integral sliding modes. Notably, it is assumed that dynamics are unknown, yet the control is continuous and robust. A representative simulation study shows the effectiveness of our proposal for tracking tasks.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Robotics
Robotics Mathematics-Control and Optimization
CiteScore
6.70
自引率
8.10%
发文量
114
审稿时长
11 weeks
期刊介绍: Robotics publishes original papers, technical reports, case studies, review papers and tutorials in all the aspects of robotics. Special Issues devoted to important topics in advanced robotics will be published from time to time. It particularly welcomes those emerging methodologies and techniques which bridge theoretical studies and applications and have significant potential for real-world applications. It provides a forum for information exchange between professionals, academicians and engineers who are working in the area of robotics, helping them to disseminate research findings and to learn from each other’s work. Suitable topics include, but are not limited to: -intelligent robotics, mechatronics, and biomimetics -novel and biologically-inspired robotics -modelling, identification and control of robotic systems -biomedical, rehabilitation and surgical robotics -exoskeletons, prosthetics and artificial organs -AI, neural networks and fuzzy logic in robotics -multimodality human-machine interaction -wireless sensor networks for robot navigation -multi-sensor data fusion and SLAM
期刊最新文献
Evaluation of a Voice-Enabled Autonomous Camera Control System for the da Vinci Surgical Robot NU-Biped-4.5: A Lightweight and Low-Prototyping-Cost Full-Size Bipedal Robot Probability-Based Strategy for a Football Multi-Agent Autonomous Robot System An Enhanced Multi-Sensor Simultaneous Localization and Mapping (SLAM) Framework with Coarse-to-Fine Loop Closure Detection Based on a Tightly Coupled Error State Iterative Kalman Filter Playing Checkers with an Intelligent and Collaborative Robotic System
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1