Adversarial generative learning and timed path optimization for real-time visual image prediction to guide robot arm movements

IF 2.9 | Zone 4, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Journal of Real-Time Image Processing | Pub Date: 2024-08-15 | DOI: 10.1007/s11554-024-01526-5
Xin Li, Changhai Ru, Haonan Sun
Citation count: 0

Abstract

Real-time visual image prediction is crucial for directing robotic arm movements and is an important technique in artificial intelligence and robotics. The primary technical challenges are the robot’s inaccurate perception and understanding of its environment, coupled with imprecise control of its movements. This study proposes ForGAN-MCTS, a generative adversarial network-based action sequence prediction algorithm aimed at refining visually guided rearrangement planning for movable objects. First, the algorithm introduces a scalable and robust strategy for rearrangement planning built on Monte Carlo Tree Search. Second, to enable the robot to execute grasping maneuvers successfully, the algorithm proposes a generative adversarial network-based real-time prediction method that employs a network trained solely on synthetic data to robustly estimate multi-object workspace states from a single uncalibrated RGB camera. The efficacy of the proposed algorithm is corroborated through extensive experiments on a UR-5 robotic arm. The experimental results demonstrate that the algorithm surpasses existing methods in planning efficacy and processing speed. Additionally, the algorithm is robust to camera motion and can effectively mitigate the effects of external perturbations.
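
To make the planning idea in the abstract more concrete, the following is a minimal sketch of Monte Carlo Tree Search applied to a toy rearrangement task. It is not the authors' ForGAN-MCTS implementation, which the abstract does not detail. The one-dimensional slot layout, the `EMPTY` marker, the reward (fraction of objects already in their goal slot), and every function name here are assumptions made purely for illustration; the vision component is omitted and the workspace state is assumed to be known.

```python
import math
import random

EMPTY = -1  # hypothetical marker for a free slot


def legal_moves(state):
    """All (src, dst) pairs: pick the object in slot src, place it in free slot dst."""
    occupied = [i for i, obj in enumerate(state) if obj != EMPTY]
    free = [i for i, obj in enumerate(state) if obj == EMPTY]
    return [(src, dst) for src in occupied for dst in free]


def apply_move(state, move):
    """Return the state after moving one object; the input tuple is left unchanged."""
    src, dst = move
    nxt = list(state)
    nxt[dst], nxt[src] = nxt[src], EMPTY
    return tuple(nxt)


def score(state, goal):
    """Fraction of objects sitting in their goal slot (1.0 means solved)."""
    total = sum(1 for g in goal if g != EMPTY)
    placed = sum(1 for s, g in zip(state, goal) if g != EMPTY and s == g)
    return placed / max(total, 1)


class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.untried = legal_moves(state)
        self.visits = 0
        self.value = 0.0


def ucb_child(node, c=1.4):
    """Pick the child maximizing the UCB1 score (mean reward plus exploration bonus)."""
    return max(
        node.children,
        key=lambda ch: ch.value / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def plan_rearrangement(start, goal, iterations=2000, rollout_depth=6, seed=0):
    """Return a list of (src, dst) moves reaching goal, or None if none was found."""
    if start == goal:
        return []
    rng = random.Random(seed)
    root = Node(start)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and has children.
        while not node.untried and node.children:
            node = ucb_child(node)
        # Expansion: try one new move; stop as soon as a goal state appears.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(apply_move(node.state, move), node, move)
            node.children.append(child)
            node = child
            if node.state == goal:
                path = []
                while node.parent is not None:
                    path.append(node.move)
                    node = node.parent
                return path[::-1]
        # Rollout: random moves up to a fixed depth, then score the final state.
        state = node.state
        for _ in range(rollout_depth):
            moves = legal_moves(state)
            if state == goal or not moves:
                break
            state = apply_move(state, rng.choice(moves))
        reward = score(state, goal)
        # Backpropagation: update statistics along the path back to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return None
```

For example, `plan_rearrangement((0, 1, EMPTY), (0, EMPTY, 1))` returns the single move `[(1, 2)]`, since object 1 only needs to shift from slot 1 to slot 2. Returning as soon as expansion reaches a goal state keeps the sketch short; a real planner would also have to handle continuous poses, collisions, and perception noise.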

Source journal
Journal of Real-Time Image Processing
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 6.80
Self-citation rate: 6.70%
Articles published: 68
Review time: 6 months
Journal description: Due to rapid advancements in integrated circuit technology, the rich theoretical results that have been developed by the image and video processing research community are now being increasingly applied in practical systems to solve real-world image and video processing problems. Such systems involve constraints placed not only on their size, cost, and power consumption, but also on the timeliness of the image data processed. Examples of such systems are mobile phones, digital still/video/cell-phone cameras, portable media players, personal digital assistants, high-definition television, video surveillance systems, industrial visual inspection systems, medical imaging devices, vision-guided autonomous robots, spectral imaging systems, and many other real-time embedded systems. In these real-time systems, strict timing requirements demand that results are available within a certain interval of time as imposed by the application. It is often the case that an image processing algorithm is developed and proven theoretically sound, presumably with a specific application in mind, but its practical applications and the detailed steps, methodology, and trade-off analysis required to achieve its real-time performance are not fully explored, leaving these critical and usually non-trivial issues for those wishing to employ the algorithm in a real-time system. The Journal of Real-Time Image Processing is intended to bridge the gap between the theory and practice of image processing, serving the greater community of researchers, practicing engineers, and industrial professionals who deal with designing, implementing or utilizing image processing systems which must satisfy real-time design constraints.