Learning vision-based robotic manipulation tasks sequentially in offline reinforcement learning settings

IF 1.9 | Zone 4, Computer Science | Q3 Robotics | Robotica | Pub Date: 2024-05-02 | DOI: 10.1017/s0263574724000389
Sudhir Pratap Yadav, Rajendra Nagar, Suril V. Shah
Citations: 0

Abstract

With the rise of deep reinforcement learning (RL) methods, many complex robotic manipulation tasks are being solved. However, harnessing the full power of deep learning requires large datasets. Online RL does not fit readily into this paradigm because agent-environment interaction is costly and time-consuming. Therefore, many offline RL algorithms have recently been proposed to learn robotic tasks. However, nearly all such methods focus on single-task or multitask learning, which requires retraining whenever a new task must be learned. Continuously learning tasks without forgetting previous knowledge, combined with the power of offline deep RL, would allow us to scale the number of tasks by adding them one after another. This paper investigates the effectiveness of regularisation-based methods such as synaptic intelligence for sequentially learning image-based robotic manipulation tasks in an offline RL setup. We evaluate the performance of this combined framework against the common challenges of sequential learning: catastrophic forgetting and forward knowledge transfer. We performed experiments with different task combinations to analyse the effect of task ordering. We also investigated the effect of the number of object configurations and the density of robot trajectories. We found that learning tasks sequentially helps in the retention of knowledge from previous tasks, thereby reducing the time required to learn a new task. Regularisation-based approaches for continuous learning, like the synaptic intelligence method, help mitigate catastrophic forgetting but have shown only limited transfer of knowledge from previous tasks.
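
The abstract names synaptic intelligence as its regularisation method but gives no implementation details. As a rough illustration only, the sketch below applies the commonly described form of synaptic intelligence (a per-parameter importance built from a running sum of gradient times parameter update, followed by a quadratic penalty on later tasks) to two toy quadratic losses. It is not the authors' offline-RL pipeline: the function `train_task`, the toy gradients `grad_a` and `grad_b`, the hyperparameters, and the clipping of importance at zero are all assumptions made for this example.

```python
import numpy as np

def train_task(theta, grad_fn, importance, theta_ref,
               c=1.0, lr=0.05, steps=200, xi=1e-3):
    """Gradient descent on one task with a synaptic-intelligence-style penalty.

    All names and defaults here are hypothetical, chosen for illustration.
    importance: per-parameter importance accumulated from earlier tasks.
    theta_ref:  parameter values at the end of the previous task.
    """
    theta = theta.copy()
    theta_start = theta.copy()
    path = np.zeros_like(theta)  # running sum of -(task gradient * update)
    for _ in range(steps):
        g_task = grad_fn(theta)
        # Quadratic penalty pulls important parameters back towards theta_ref.
        g_total = g_task + 2.0 * c * importance * (theta - theta_ref)
        delta = -lr * g_total
        path += -g_task * delta
        theta += delta
    change = theta - theta_start
    # Normalise by total displacement; clipping at zero is a simplifying choice.
    new_importance = importance + np.maximum(path, 0.0) / (change ** 2 + xi)
    return theta, new_importance

# Toy task A only cares about theta[0]; toy task B wants different values.
grad_a = lambda th: np.array([th[0] - 1.0, 0.0])
grad_b = lambda th: np.array([th[0] + 1.0, th[1] - 2.0])

theta_a, imp = train_task(np.zeros(2), grad_a, np.zeros(2), np.zeros(2))
theta_si, _ = train_task(theta_a, grad_b, imp, theta_a, c=10.0)
theta_plain, _ = train_task(theta_a, grad_b, imp, theta_a, c=0.0)

print("after task A:          ", theta_a)
print("after task B with SI:  ", theta_si)
print("after task B, no SI:   ", theta_plain)
```

In this toy setting the penalised run keeps `theta[0]` near the value task A preferred, while the unpenalised run overwrites it; `theta[1]`, which task A never used, is free to move in both runs. That mirrors the trade-off the abstract describes: forgetting is reduced, but the penalty by itself does nothing to transfer knowledge forward.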
Source journal
Robotica (Engineering & Technology: Robotics)
CiteScore: 4.50
Self-citation rate: 22.20%
Articles published: 181
Review time: 9.9 months
About the journal: Robotica is a forum for the multidisciplinary subject of robotics and encourages developments, applications and research in this important field of automation and robotics with regard to industry, health, education and economic and social aspects of relevance. Coverage includes activities in hostile environments, applications in the service and manufacturing industries, biological robotics, dynamics and kinematics involved in robot design and uses, on-line robots, robot task planning, rehabilitation robotics, sensory perception, software in the widest sense, particularly in respect of programming languages and links with CAD/CAM systems, telerobotics and various other areas. In addition, interest is focused on various Artificial Intelligence topics of theoretical and practical interest.
Latest articles from the journal
3D dynamics and control of a snake robot in uncertain underwater environment
An application of natural matrices to the dynamic balance problem of planar parallel manipulators
Control of stance-leg motion and zero-moment point for achieving perfect upright stationary state of rimless wheel type walker with parallel linkage legs
Trajectory tracking control of a mobile robot using fuzzy logic controller with optimal parameters
High accuracy hybrid kinematic modeling for serial robotic manipulators