Tailor-Made Reinforcement Learning Approach With Advanced Noise Optimization for Soft Continuum Robots

Jino Jayan;Lal Priya P.S.;Hari Kumar R.
{"title":"针对软连续机器人的定制强化学习方法与高级噪声优化","authors":"Jino Jayan;Lal Priya P.S.;Hari Kumar R.","doi":"10.1109/TAI.2024.3440225","DOIUrl":null,"url":null,"abstract":"Advancements in the fusion of reinforcement learning (RL) and soft robotics are presented in this study, with a focus on refining training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin-delayed deep deterministic (TD3) policy gradient algorithm introduce the innovative dynamic harmonic noise (DHN) to enhance exploration adaptability. Additionally, a tailored adaptive task achievement reward (ATAR) is introduced to balance goal achievement, time efficiency, and trajectory smoothness, thereby improving precision in SPCR navigation. Evaluation metrics, including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER), demonstrate robust generalization capabilities. Significant improvements in average reward, success rate, and convergence speed for the proposed modified TD3 algorithm over traditional TD3 are highlighted in the comparative analysis. Specifically, a 45.17% increase in success rate and a 4.92% increase in convergence speed over TD3 are demonstrated by the proposed TD3. Beyond insights into RL and soft robotics, potential applicability of RL in diverse scenarios is underscored, laying the foundation for future breakthroughs in real-world applications.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 11","pages":"5509-5518"},"PeriodicalIF":0.0000,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Tailor-Made Reinforcement Learning Approach With Advanced Noise Optimization for Soft Continuum Robots\",\"authors\":\"Jino Jayan;Lal Priya P.S.;Hari Kumar R.\",\"doi\":\"10.1109/TAI.2024.3440225\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Advancements in the fusion of reinforcement learning (RL) and soft robotics are presented in this study, with a focus on refining training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin-delayed deep deterministic (TD3) policy gradient algorithm introduce the innovative dynamic harmonic noise (DHN) to enhance exploration adaptability. Additionally, a tailored adaptive task achievement reward (ATAR) is introduced to balance goal achievement, time efficiency, and trajectory smoothness, thereby improving precision in SPCR navigation. Evaluation metrics, including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER), demonstrate robust generalization capabilities. Significant improvements in average reward, success rate, and convergence speed for the proposed modified TD3 algorithm over traditional TD3 are highlighted in the comparative analysis. Specifically, a 45.17% increase in success rate and a 4.92% increase in convergence speed over TD3 are demonstrated by the proposed TD3. 
Beyond insights into RL and soft robotics, potential applicability of RL in diverse scenarios is underscored, laying the foundation for future breakthroughs in real-world applications.\",\"PeriodicalId\":73305,\"journal\":{\"name\":\"IEEE transactions on artificial intelligence\",\"volume\":\"5 11\",\"pages\":\"5509-5518\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on artificial intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10631661/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10631661/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Advancements in the fusion of reinforcement learning (RL) and soft robotics are presented in this study, with a focus on refining training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin-delayed deep deterministic (TD3) policy gradient algorithm introduce the innovative dynamic harmonic noise (DHN) to enhance exploration adaptability. Additionally, a tailored adaptive task achievement reward (ATAR) is introduced to balance goal achievement, time efficiency, and trajectory smoothness, thereby improving precision in SPCR navigation. Evaluation metrics, including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER), demonstrate robust generalization capabilities. Significant improvements in average reward, success rate, and convergence speed for the proposed modified TD3 algorithm over traditional TD3 are highlighted in the comparative analysis. Specifically, the modified TD3 achieves a 45.17% increase in success rate and a 4.92% increase in convergence speed over TD3. Beyond insights into RL and soft robotics, the potential applicability of RL in diverse scenarios is underscored, laying the foundation for future breakthroughs in real-world applications.
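The abstract names dynamic harmonic noise (DHN) as the exploration mechanism added to TD3 but does not define it. The sketch below illustrates one plausible reading: a sinusoidal perturbation with annealed amplitude and drifting frequency added to the deterministic policy action. Every class name and parameter here is an assumption for illustration, not the authors' implementation.

```python
import numpy as np

class DynamicHarmonicNoise:
    """Illustrative exploration noise: a sinusoid whose amplitude decays
    and whose frequency drifts over training steps. This is an assumed
    form; the paper's actual DHN definition is not given in the abstract."""

    def __init__(self, action_dim, base_amplitude=0.3, min_amplitude=0.05,
                 decay_steps=100_000, base_freq=0.01, freq_drift=1e-7, seed=0):
        self.action_dim = action_dim
        self.base_amplitude = base_amplitude
        self.min_amplitude = min_amplitude
        self.decay_steps = decay_steps
        self.base_freq = base_freq
        self.freq_drift = freq_drift
        self.step = 0
        self.rng = np.random.default_rng(seed)
        # Random phase per action dimension so dimensions are decorrelated.
        self.phase = self.rng.uniform(0, 2 * np.pi, size=action_dim)

    def sample(self):
        # Amplitude anneals linearly from base_amplitude to min_amplitude.
        frac = min(self.step / self.decay_steps, 1.0)
        amplitude = self.base_amplitude + frac * (self.min_amplitude - self.base_amplitude)
        # Frequency drifts slowly so the exploration rhythm changes over training.
        freq = self.base_freq + self.freq_drift * self.step
        noise = amplitude * np.sin(2 * np.pi * freq * self.step + self.phase)
        # A small Gaussian term keeps exploration stochastic rather than periodic.
        noise += 0.1 * amplitude * self.rng.standard_normal(self.action_dim)
        self.step += 1
        return noise

# Usage: add the noise to the deterministic policy action, then clip to bounds.
noise = DynamicHarmonicNoise(action_dim=2)
action = np.clip(np.zeros(2) + noise.sample(), -1.0, 1.0)
print(action)
```

Compared with the fixed Gaussian noise of standard TD3, a time-varying harmonic term of this kind would give exploration a structure that changes as training progresses, which is consistent with the abstract's claim of "exploration adaptability."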
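The abstract states that ATAR balances goal achievement, time efficiency, and trajectory smoothness. A minimal sketch of such a composite reward follows, assuming a weighted sum of a distance term, a per-step time penalty, and an action-change smoothness penalty, with a terminal bonus for fast completion. The weights, terms, and any adaptation schedule are assumptions, not the paper's ATAR.

```python
import numpy as np

def adaptive_task_achievement_reward(tip_pos, goal_pos, prev_action, action,
                                     step, max_steps,
                                     w_goal=1.0, w_time=0.01, w_smooth=0.1,
                                     success_radius=0.01, success_bonus=10.0):
    """Illustrative composite reward combining the three criteria the
    abstract names. All weights and terms are assumed for illustration."""
    dist = np.linalg.norm(tip_pos - goal_pos)
    r_goal = -w_goal * dist                                       # goal achievement: closer is better
    r_time = -w_time                                              # time efficiency: constant step penalty
    r_smooth = -w_smooth * np.linalg.norm(action - prev_action)   # trajectory smoothness
    reward = r_goal + r_time + r_smooth
    if dist < success_radius:
        # Terminal bonus scaled by remaining steps rewards fast completion.
        reward += success_bonus * (1.0 - step / max_steps)
    return reward

# Example call with dummy tip/goal positions and actions:
r = adaptive_task_achievement_reward(np.array([0.10, 0.02]), np.array([0.12, 0.0]),
                                     np.zeros(2), np.array([0.05, -0.03]),
                                     step=40, max_steps=200)
print(round(r, 4))
```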
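The three evaluation metrics the abstract reports (MSD, ME, MER) have natural definitions for a reach-the-goal task. The sketch below computes them over a batch of evaluation episodes, assuming final_dists holds the tip-to-goal distance at the end of each episode and episode_rewards the undiscounted return; the paper may define them differently.

```python
import numpy as np

def evaluation_metrics(final_dists, episode_rewards):
    """Plausible definitions of the reported metrics; an assumed reading,
    not necessarily the paper's exact formulas."""
    msd = float(np.mean(np.square(final_dists)))   # mean squared distance to goal
    me = float(np.mean(final_dists))               # mean error (distance to goal)
    mer = float(np.mean(episode_rewards))          # mean episodic reward
    return {"MSD": msd, "ME": me, "MER": mer}

print(evaluation_metrics(np.array([0.012, 0.008, 0.015]),
                         np.array([84.2, 91.0, 78.5])))
```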
Source journal: IEEE Transactions on Artificial Intelligence
CiteScore: 7.70
Self-citation rate: 0.00%