{"title":"Tailor-Made Reinforcement Learning Approach With Advanced Noise Optimization for Soft Continuum Robots","authors":"Jino Jayan;Lal Priya P.S.;Hari Kumar R.","doi":"10.1109/TAI.2024.3440225","DOIUrl":null,"url":null,"abstract":"Advancements in the fusion of reinforcement learning (RL) and soft robotics are presented in this study, with a focus on refining training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin-delayed deep deterministic (TD3) policy gradient algorithm introduce the innovative dynamic harmonic noise (DHN) to enhance exploration adaptability. Additionally, a tailored adaptive task achievement reward (ATAR) is introduced to balance goal achievement, time efficiency, and trajectory smoothness, thereby improving precision in SPCR navigation. Evaluation metrics, including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER), demonstrate robust generalization capabilities. Significant improvements in average reward, success rate, and convergence speed for the proposed modified TD3 algorithm over traditional TD3 are highlighted in the comparative analysis. Specifically, a 45.17% increase in success rate and a 4.92% increase in convergence speed over TD3 are demonstrated by the proposed TD3. Beyond insights into RL and soft robotics, potential applicability of RL in diverse scenarios is underscored, laying the foundation for future breakthroughs in real-world applications.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 11","pages":"5509-5518"},"PeriodicalIF":0.0000,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10631661/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
This study presents advances in the fusion of reinforcement learning (RL) and soft robotics, focusing on refined training methodologies for soft planar continuum robots (SPCRs). The proposed modifications to the twin delayed deep deterministic policy gradient (TD3) algorithm introduce dynamic harmonic noise (DHN) to enhance exploration adaptability. In addition, a tailored adaptive task achievement reward (ATAR) balances goal achievement, time efficiency, and trajectory smoothness, thereby improving the precision of SPCR navigation. Evaluation on metrics including mean squared distance (MSD), mean error (ME), and mean episodic reward (MER) demonstrates robust generalization. A comparative analysis highlights significant improvements in average reward, success rate, and convergence speed for the modified TD3 over the traditional TD3; specifically, the modified TD3 achieves a 45.17% increase in success rate and a 4.92% increase in convergence speed. Beyond its insights into RL and soft robotics, the study underscores the potential applicability of RL in diverse scenarios, laying the foundation for future breakthroughs in real-world applications.
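The abstract names dynamic harmonic noise (DHN) as an exploration mechanism but gives no formulation. As a rough illustration only, the sketch below shows one plausible shape such a noise process could take: a sum of sinusoidal harmonics with random per-dimension phases and an annealed amplitude, added to the deterministic TD3 action. The class name `DynamicHarmonicNoise` and every parameter here are assumptions for illustration, not the paper's method.

```python
import numpy as np

class DynamicHarmonicNoise:
    """Illustrative exploration noise: a sum of harmonics whose amplitude
    decays over training, shifting from exploration toward exploitation.
    All parameters are hypothetical, not taken from the paper."""

    def __init__(self, action_dim, n_harmonics=3, base_freq=0.05,
                 init_scale=0.3, decay=0.999):
        self.n_harmonics = n_harmonics
        self.base_freq = base_freq
        self.scale = init_scale
        self.decay = decay
        self.t = 0
        # Random phase per action dimension and harmonic so channels decorrelate.
        self.phases = np.random.uniform(0.0, 2.0 * np.pi,
                                        (action_dim, n_harmonics))

    def sample(self):
        k = np.arange(1, self.n_harmonics + 1)                 # harmonic indices 1..N
        angles = 2.0 * np.pi * self.base_freq * k * self.t + self.phases
        noise = self.scale * (np.sin(angles) / k).sum(axis=1)  # amplitude falls off with k
        self.t += 1
        self.scale *= self.decay                               # anneal exploration over time
        return noise

# Usage in a TD3-style action-selection step (policy is any deterministic actor):
# noise = DynamicHarmonicNoise(action_dim=2)
# action = np.clip(policy(state) + noise.sample(), -1.0, 1.0)
```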
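Similarly, the abstract states that ATAR balances goal achievement, time efficiency, and trajectory smoothness, without giving the reward equation. A minimal sketch of such a weighted composite reward follows; the function name, the weights `w_goal`/`w_time`/`w_smooth`, the `success_radius`, and the terminal bonus are all assumed for illustration and are not the paper's ATAR formulation.

```python
import numpy as np

def adaptive_task_achievement_reward(tip_pos, goal_pos, action, prev_action,
                                     w_goal=1.0, w_time=0.01, w_smooth=0.1,
                                     success_radius=0.01):
    """Hypothetical reward combining the three objectives the abstract names.
    Weights and terms are assumptions, not the ATAR from the paper."""
    dist = np.linalg.norm(tip_pos - goal_pos)
    r_goal = -w_goal * dist                                       # closer to goal is better
    r_time = -w_time                                              # constant per-step penalty
    r_smooth = -w_smooth * np.linalg.norm(action - prev_action)   # penalize jerky actions
    bonus = 10.0 if dist < success_radius else 0.0                # one-time success bonus
    return r_goal + r_time + r_smooth + bonus
```

The design choice illustrated here is the usual one for multi-objective shaping: a per-step distance term drives goal achievement, a constant step cost rewards time efficiency, and an action-difference penalty encourages smooth trajectories.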