Digital Twin-Driven Reinforcement Learning for Obstacle Avoidance in Robot Manipulators: A Self-Improving Online Training Framework

Yuzhu Sun, Mien Van, Stephen McIlvanna, Nguyen Minh Nhat, Kabirat Olayemi, Jack Close, Seán McLoone
{"title":"Digital Twin-Driven Reinforcement Learning for Obstacle Avoidance in Robot Manipulators: A Self-Improving Online Training Framework","authors":"Yuzhu Sun, Mien Van, Stephen McIlvanna, Nguyen Minh Nhat, Kabirat Olayemi, Jack Close, Seán McLoone","doi":"arxiv-2403.13090","DOIUrl":null,"url":null,"abstract":"The evolution and growing automation of collaborative robots introduce more\ncomplexity and unpredictability to systems, highlighting the crucial need for\nrobot's adaptability and flexibility to address the increasing complexities of\ntheir environment. In typical industrial production scenarios, robots are often\nrequired to be re-programmed when facing a more demanding task or even a few\nchanges in workspace conditions. To increase productivity, efficiency and\nreduce human effort in the design process, this paper explores the potential of\nusing digital twin combined with Reinforcement Learning (RL) to enable robots\nto generate self-improving collision-free trajectories in real time. The\ndigital twin, acting as a virtual counterpart of the physical system, serves as\na 'forward run' for monitoring, controlling, and optimizing the physical system\nin a safe and cost-effective manner. The physical system sends data to\nsynchronize the digital system through the video feeds from cameras, which\nallows the virtual robot to update its observation and policy based on real\nscenarios. The bidirectional communication between digital and physical systems\nprovides a promising platform for hardware-in-the-loop RL training through\ntrial and error until the robot successfully adapts to its new environment. The\nproposed online training framework is demonstrated on the Unfactory Xarm5\ncollaborative robot, where the robot end-effector aims to reach the target\nposition while avoiding obstacles. 
The experiment suggest that proposed\nframework is capable of performing policy online training, and that there\nremains significant room for improvement.","PeriodicalId":501062,"journal":{"name":"arXiv - CS - Systems and Control","volume":"26 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Systems and Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2403.13090","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The evolution and growing automation of collaborative robots introduce more complexity and unpredictability to systems, highlighting the crucial need for adaptability and flexibility so that robots can cope with the increasing complexity of their environment. In typical industrial production scenarios, robots often have to be re-programmed when facing a more demanding task or even a few changes in workspace conditions. To increase productivity and efficiency, and to reduce human effort in the design process, this paper explores the potential of combining a digital twin with Reinforcement Learning (RL) to enable robots to generate self-improving collision-free trajectories in real time. The digital twin, acting as a virtual counterpart of the physical system, serves as a 'forward run' for monitoring, controlling, and optimizing the physical system in a safe and cost-effective manner. The physical system sends data to synchronize the digital system through video feeds from cameras, which allows the virtual robot to update its observation and policy based on real scenarios. The bidirectional communication between the digital and physical systems provides a promising platform for hardware-in-the-loop RL training through trial and error until the robot successfully adapts to its new environment. The proposed online training framework is demonstrated on the Unfactory Xarm5 collaborative robot, where the robot end-effector aims to reach the target position while avoiding obstacles. The experiments suggest that the proposed framework is capable of performing online policy training, and that there remains significant room for improvement.
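The training cycle the abstract describes — perception synchronizes the twin, the twin evaluates actions, and rewards drive online policy improvement — can be sketched as follows. This is a minimal illustrative sketch only: all names (`DigitalTwin`, `training_step`, `greedy_policy`), the 0.2 m safety margin, and the reward weights are assumptions of this sketch, not the paper's actual implementation or API.

```python
import numpy as np

class DigitalTwin:
    """Virtual replica of the manipulator workspace, synchronized from camera feeds."""
    def __init__(self, target, obstacles):
        self.target = np.asarray(target, dtype=float)
        self.obstacles = [np.asarray(o, dtype=float) for o in obstacles]
        self.ee_pos = np.zeros(3)  # end-effector position mirrored in the twin

    def sync(self, observed_ee_pos):
        # Physical -> digital: update the twin state from perception.
        self.ee_pos = np.asarray(observed_ee_pos, dtype=float)

    def reward(self):
        # Dense shaping: approach the target, penalize obstacle proximity.
        dist_to_target = np.linalg.norm(self.ee_pos - self.target)
        obstacle_penalty = sum(
            max(0.0, 0.2 - np.linalg.norm(self.ee_pos - o))  # assumed 0.2 m margin
            for o in self.obstacles
        )
        return -dist_to_target - 10.0 * obstacle_penalty

def greedy_policy(ee_pos, target):
    # Trivial stand-in policy: unit step straight toward the target.
    d = target - ee_pos
    n = np.linalg.norm(d)
    return d / n if n > 1e-9 else np.zeros(3)

def training_step(twin, policy, observed_ee_pos, step_size=0.05):
    """One hardware-in-the-loop iteration: sync, act, command, score."""
    twin.sync(observed_ee_pos)                   # physical -> digital
    action = policy(twin.ee_pos, twin.target)    # twin picks an action
    next_pos = twin.ee_pos + step_size * action  # digital -> physical command
    twin.sync(next_pos)                          # real robot would report back
    return next_pos, twin.reward()
```

In the paper's framework an RL algorithm would replace `greedy_policy` and update it online from the collected rewards; iterating `training_step` is the trial-and-error loop that continues until the policy adapts to the new environment.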