Efficient Reinforcement Learning Method for Multi-Phase Robot Manipulation Skill Acquisition via Human Knowledge, Model-Based, and Model-Free Methods

IF 6.4 · Zone 2 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · IEEE Transactions on Automation Science and Engineering · Pub Date: 2024-09-04 · DOI: 10.1109/TASE.2024.3451296
Xing Liu;Zihao Liu;Gaozhao Wang;Zhengxiong Liu;Panfeng Huang
IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 6643-6652. https://ieeexplore.ieee.org/document/10665750/
Citations: 0

Abstract

This paper presents an efficient reinforcement learning paradigm that combines human knowledge with model-based and model-free methods for optimal robot manipulation control in complex multi-phase tasks, e.g., tight-fit peg-in-hole insertion and nut-and-bolt assembly. First, human demonstrations are used to collect data from successful manipulations, and a manipulation phase estimation method that integrates human knowledge produces the higher-level plan of the multi-phase task. Typical robot manipulation tasks can usually be decomposed into three types of phases: free motion, discontinuous contact, and continuous contact. For free-motion phases, a motion planning method generates a smooth trajectory. For phases with discontinuous contact in the axes of interest during the pre-manipulation process, a rule-based model-free method, Policy Gradients with Human-Guided Parameter-based Exploration (PGHGPE), is used. For manipulation phases with continuous contact, a model-based method is used because of its higher sample efficiency. Finally, simulation and experimental studies verify the effectiveness of the presented algorithm.

Note to Practitioners: An important premise for future robot assistants is that robots should have some ability to learn complex manipulation skills. Complex manipulation tasks can be decomposed into multiple stages, and hierarchical reinforcement learning (HRL) is a suitable method for this kind of problem. However, HRL faces the challenge of low computational efficiency. To this end, an efficient manipulation skill learning method for complex manipulation tasks is presented that combines human knowledge with model-based and model-free reinforcement learning, improving the efficiency of the skill learning process to a practical level.
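The three-phase decomposition described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's phase estimation method, so everything below is an assumption made for illustration: the force-threshold rule, the sliding window, and the names `label_phases`, `segment`, and `METHOD_FOR_PHASE` are hypothetical. Only the three phase types and the method assigned to each come from the abstract.

```python
from enum import Enum
from typing import List, Tuple


class Phase(Enum):
    FREE_MOTION = "free motion"
    DISCONTINUOUS_CONTACT = "discontinuous contact"
    CONTINUOUS_CONTACT = "continuous contact"


# Method the abstract assigns to each phase type.
METHOD_FOR_PHASE = {
    Phase.FREE_MOTION: "motion planning",
    Phase.DISCONTINUOUS_CONTACT: "model-free RL (PGHGPE)",
    Phase.CONTINUOUS_CONTACT: "model-based RL",
}


def label_phases(forces: List[float], threshold: float = 1.0,
                 window: int = 5) -> List[Phase]:
    """Label each sample of a demonstrated contact-force trace.

    Hypothetical rule (not from the paper): within a centered window,
    no contact samples -> free motion; all samples in contact ->
    continuous contact; a mix -> discontinuous contact.
    """
    in_contact = [abs(f) > threshold for f in forces]
    half = window // 2
    labels = []
    for i in range(len(forces)):
        lo, hi = max(0, i - half), min(len(forces), i + half + 1)
        hits = sum(in_contact[lo:hi])
        if hits == 0:
            labels.append(Phase.FREE_MOTION)
        elif hits == hi - lo:
            labels.append(Phase.CONTINUOUS_CONTACT)
        else:
            labels.append(Phase.DISCONTINUOUS_CONTACT)
    return labels


def segment(labels: List[Phase]) -> List[Tuple[Phase, int, int]]:
    """Collapse per-sample labels into (phase, start, end_exclusive) runs."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


if __name__ == "__main__":
    # Synthetic trace: no contact, intermittent contact, sustained contact.
    trace = [0, 0, 0, 0, 3, 0, 3, 0, 3, 3, 3, 3]
    for phase, start, end in segment(label_phases(trace, window=3)):
        print(f"samples {start}..{end - 1}: {phase.value} -> {METHOD_FOR_PHASE[phase]}")
```

Each resulting segment would then be handed to the learner or planner listed for its phase; in a real system the labelling would rely on the demonstration data and human knowledge the abstract mentions rather than on a fixed threshold.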
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering & Technology: Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
About the journal: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.