Utilizing deep reinforcement learning for tactile-based autonomous capture of non-cooperative objects in space

Journal: Aerospace Systems (Q3, Earth and Planetary Sciences). Pub Date: 2023-11-24. DOI: 10.1007/s42401-023-00254-1

Bahador Beigomi, Zheng H. Zhu


Citations: 0

Abstract

This research develops a deep reinforcement learning approach to the challenging task of robotic gripping with tactile sensor feedback. Deep reinforcement learning removes the need to design features manually, which simplifies the problem and lets the robot acquire gripping strategies through trial-and-error learning. Our technique uses an off-policy reinforcement learning model that combines the deep deterministic policy gradient structure with twin delayed attributes to grip floating objects as precisely as possible. We formulated a comprehensive reward function that gives the agent precise, informative feedback while it learns the gripping task. The model was trained solely in simulation using the PyBullet framework and required neither demonstrations nor prior knowledge of the task. As a case study, we examined a gripping task with a 3-finger Robotiq gripper, in which the gripper had to approach a floating object, pursue it, and eventually grip it. Training in simulation allowed us to experiment with various scenarios and conditions, enabling the agent to develop a resilient and adaptable grip policy.
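The combination of a deep deterministic policy gradient structure with "twin delayed" attributes reads like the twin delayed DDPG (TD3) family of off-policy algorithms. The sketch below is not the authors' implementation. It only illustrates, under that assumption, the three ingredients usually associated with TD3: target policy smoothing, the minimum over two target critics, and delayed actor updates. All function names, parameter names, and default values (`gamma`, `noise_std`, `noise_clip`, `policy_delay`) are hypothetical choices made for illustration and are not taken from the paper.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_state, actor_target, critic1_target,
               critic2_target, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=random):
    """Return a TD3-style bootstrap target for a single transition.

    actor_target maps a state to a list of action components; each
    critic maps (state, action) to a scalar value. These callables are
    assumed stand-ins for target networks.
    """
    # Target policy smoothing: perturb the target action with clipped noise,
    # then keep the result inside the valid action range.
    next_action = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -max_action, max_action)
        for a in actor_target(next_state)
    ]
    # Clipped double-Q: use the smaller of the two target critic estimates.
    q_min = min(critic1_target(next_state, next_action),
                critic2_target(next_state, next_action))
    # Do not bootstrap past a terminal state.
    return reward + gamma * (0.0 if done else 1.0) * q_min


def should_update_actor(step, policy_delay=2):
    """Delayed updates: refresh the actor once every policy_delay critic steps."""
    return step % policy_delay == 0
```

In a training loop, a target like this would be regressed against by both critics at every step, while the actor and the target networks would be updated only on steps where `should_update_actor` returns `True`.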




Source journal

Aerospace Systems Social Sciences-Social Sciences (miscellaneous)

CiteScore: 1.80

Self-citation rate: 0.00%


About the journal: Aerospace Systems provides an international, peer-reviewed forum focused on system-level research and development in aeronautics and astronautics. The journal emphasizes the unique role and increasing importance of informatics in aerospace. It fills a gap in current publishing coverage, from outer space vehicles to atmospheric vehicles, by highlighting interdisciplinary science, technology and engineering. Potential topics include, but are not limited to: trans-space vehicle systems design and integration; air vehicle systems; space vehicle systems; near-space vehicle systems; aerospace robotics and unmanned systems; communication, navigation and surveillance; aerodynamics and aircraft design; dynamics and control; aerospace propulsion; avionics systems; opto-electronic systems; air traffic management; Earth observation; deep space exploration; bionic micro-aircraft/spacecraft; intelligent sensing and information fusion.