Tailor-Made Reinforcement Learning Approach With Advanced Noise Optimization for Soft Continuum Robots

Jino Jayan;Lal Priya P.S.;Hari Kumar R.
{"title":"针对软连续机器人的定制强化学习方法与高级噪声优化","authors":"Jino Jayan;Lal Priya P.S.;Hari Kumar R.","doi":"10.1109/TAI.2024.3440225","DOIUrl":null,"url":null,"abstract":"Advancements in the fusion of reinforcement learning (RL) and soft robotics are presented in this study, with a focus on refining training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin-delayed deep deterministic (TD3) policy gradient algorithm introduce the innovative dynamic harmonic noise (DHN) to enhance exploration adaptability. Additionally, a tailored adaptive task achievement reward (ATAR) is introduced to balance goal achievement, time efficiency, and trajectory smoothness, thereby improving precision in SPCR navigation. Evaluation metrics, including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER), demonstrate robust generalization capabilities. Significant improvements in average reward, success rate, and convergence speed for the proposed modified TD3 algorithm over traditional TD3 are highlighted in the comparative analysis. Specifically, a 45.17% increase in success rate and a 4.92% increase in convergence speed over TD3 are demonstrated by the proposed TD3. Beyond insights into RL and soft robotics, potential applicability of RL in diverse scenarios is underscored, laying the foundation for future breakthroughs in real-world applications.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 11","pages":"5509-5518"},"PeriodicalIF":0.0000,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Tailor-Made Reinforcement Learning Approach With Advanced Noise Optimization for Soft Continuum Robots\",\"authors\":\"Jino Jayan;Lal Priya P.S.;Hari Kumar R.\",\"doi\":\"10.1109/TAI.2024.3440225\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Advancements in the fusion of reinforcement learning (RL) and soft robotics are presented in this study, with a focus on refining training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin-delayed deep deterministic (TD3) policy gradient algorithm introduce the innovative dynamic harmonic noise (DHN) to enhance exploration adaptability. Additionally, a tailored adaptive task achievement reward (ATAR) is introduced to balance goal achievement, time efficiency, and trajectory smoothness, thereby improving precision in SPCR navigation. Evaluation metrics, including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER), demonstrate robust generalization capabilities. Significant improvements in average reward, success rate, and convergence speed for the proposed modified TD3 algorithm over traditional TD3 are highlighted in the comparative analysis. Specifically, a 45.17% increase in success rate and a 4.92% increase in convergence speed over TD3 are demonstrated by the proposed TD3. 
Beyond insights into RL and soft robotics, potential applicability of RL in diverse scenarios is underscored, laying the foundation for future breakthroughs in real-world applications.\",\"PeriodicalId\":73305,\"journal\":{\"name\":\"IEEE transactions on artificial intelligence\",\"volume\":\"5 11\",\"pages\":\"5509-5518\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on artificial intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10631661/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10631661/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Advancements in the fusion of reinforcement learning (RL) and soft robotics are presented in this study, with a focus on refining training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin-delayed deep deterministic (TD3) policy gradient algorithm introduce the innovative dynamic harmonic noise (DHN) to enhance exploration adaptability. Additionally, a tailored adaptive task achievement reward (ATAR) is introduced to balance goal achievement, time efficiency, and trajectory smoothness, thereby improving precision in SPCR navigation. Evaluation metrics, including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER), demonstrate robust generalization capabilities. Significant improvements in average reward, success rate, and convergence speed for the proposed modified TD3 algorithm over traditional TD3 are highlighted in the comparative analysis. Specifically, the modified TD3 achieves a 45.17% increase in success rate and a 4.92% increase in convergence speed over TD3. Beyond insights into RL and soft robotics, the potential applicability of RL in diverse scenarios is underscored, laying the foundation for future breakthroughs in real-world applications.
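The abstract names dynamic harmonic noise (DHN) as the exploration mechanism added to TD3 but does not define it. The sketch below illustrates one plausible reading: a sinusoidal perturbation with annealed amplitude and drifting frequency added to the deterministic policy action. Every class name and parameter here is an assumption for illustration, not the authors' implementation.

```python
import numpy as np

class DynamicHarmonicNoise:
    """Illustrative exploration noise: a sinusoid whose amplitude decays
    and whose frequency drifts over training steps. This is an assumed
    form; the paper's actual DHN definition is not given in the abstract."""

    def __init__(self, action_dim, base_amplitude=0.3, min_amplitude=0.05,
                 decay_steps=100_000, base_freq=0.01, freq_drift=1e-7, seed=0):
        self.action_dim = action_dim
        self.base_amplitude = base_amplitude
        self.min_amplitude = min_amplitude
        self.decay_steps = decay_steps
        self.base_freq = base_freq
        self.freq_drift = freq_drift
        self.step = 0
        self.rng = np.random.default_rng(seed)
        # Random phase per action dimension so dimensions are decorrelated.
        self.phase = self.rng.uniform(0, 2 * np.pi, size=action_dim)

    def sample(self):
        # Amplitude anneals linearly from base_amplitude to min_amplitude.
        frac = min(self.step / self.decay_steps, 1.0)
        amplitude = self.base_amplitude + frac * (self.min_amplitude - self.base_amplitude)
        # Frequency drifts slowly so the exploration rhythm changes over training.
        freq = self.base_freq + self.freq_drift * self.step
        noise = amplitude * np.sin(2 * np.pi * freq * self.step + self.phase)
        # A small Gaussian term keeps exploration stochastic rather than periodic.
        noise += 0.1 * amplitude * self.rng.standard_normal(self.action_dim)
        self.step += 1
        return noise

# Usage: add the noise to the deterministic policy action, then clip to bounds.
noise = DynamicHarmonicNoise(action_dim=2)
action = np.clip(np.zeros(2) + noise.sample(), -1.0, 1.0)
print(action)
```

Compared with the fixed Gaussian noise of standard TD3, a time-varying harmonic term of this kind would give exploration a structure that changes as training progresses, which is consistent with the abstract's claim of "exploration adaptability."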
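The abstract states that ATAR balances goal achievement, time efficiency, and trajectory smoothness. A minimal sketch of such a composite reward follows, assuming a weighted sum of a distance term, a per-step time penalty, and an action-change smoothness penalty, with a terminal bonus for fast completion. The weights, terms, and any adaptation schedule are assumptions, not the paper's ATAR.

```python
import numpy as np

def adaptive_task_achievement_reward(tip_pos, goal_pos, prev_action, action,
                                     step, max_steps,
                                     w_goal=1.0, w_time=0.01, w_smooth=0.1,
                                     success_radius=0.01, success_bonus=10.0):
    """Illustrative composite reward combining the three criteria the
    abstract names. All weights and terms are assumed for illustration."""
    dist = np.linalg.norm(tip_pos - goal_pos)
    r_goal = -w_goal * dist                                       # goal achievement: closer is better
    r_time = -w_time                                              # time efficiency: constant step penalty
    r_smooth = -w_smooth * np.linalg.norm(action - prev_action)   # trajectory smoothness
    reward = r_goal + r_time + r_smooth
    if dist < success_radius:
        # Terminal bonus scaled by remaining steps rewards fast completion.
        reward += success_bonus * (1.0 - step / max_steps)
    return reward

# Example call with dummy tip/goal positions and actions:
r = adaptive_task_achievement_reward(np.array([0.10, 0.02]), np.array([0.12, 0.0]),
                                     np.zeros(2), np.array([0.05, -0.03]),
                                     step=40, max_steps=200)
print(round(r, 4))
```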
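The three evaluation metrics the abstract reports (MSD, ME, MER) have natural definitions for a reach-the-goal task. The sketch below computes them over a batch of evaluation episodes, assuming final_dists holds the tip-to-goal distance at the end of each episode and episode_rewards the undiscounted return; the paper may define them differently.

```python
import numpy as np

def evaluation_metrics(final_dists, episode_rewards):
    """Plausible definitions of the reported metrics; an assumed reading,
    not necessarily the paper's exact formulas."""
    msd = float(np.mean(np.square(final_dists)))   # mean squared distance to goal
    me = float(np.mean(final_dists))               # mean error (distance to goal)
    mer = float(np.mean(episode_rewards))          # mean episodic reward
    return {"MSD": msd, "ME": me, "MER": mer}

print(evaluation_metrics(np.array([0.012, 0.008, 0.015]),
                         np.array([84.2, 91.0, 78.5])))
```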
Source journal: IEEE Transactions on Artificial Intelligence
CiteScore: 7.70
Self-citation rate: 0.00%