Xungao Zhong;Tao Gong;Junzhi Yu;Chengxian Zhou;Xunyu Zhong;Qiang Liu
Title: RIGNet: Robot Intention Grasp for Dense Stacked Targets With Multi-Task Siamese Schema Through RoIs Learning
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 10354-10367
Publication date: 2024-12-31
Publication type: Journal Article
DOI: 10.1109/TASE.2024.3522625
URL: https://ieeexplore.ieee.org/document/10818974/
Impact factor: 6.4 (JCR Q1, Automation & Control Systems)
Category: Computer Science (region 2)
Open access: No
Citation count: 0
Abstract
Autonomous grasping is a critical topic in robotic embodied intelligence. However, it remains challenging for robots to grasp an intended target, particularly in cluttered and densely stacked environments. This paper presents a novel solution by proposing a Robot Intention Grasp Network (RIGNet) with a multi-task siamese schema, which is based on an improved Region Proposal Network (RPN) and a Region of Interest (RoI) learning approach. More specifically, the RPN robustly outputs the RoIs to describe the candidates in stacked scenes, while the multi-task siamese network, consisting of category comprehension and grasp perception modules, serves to detect the intended object and generate the optimal grasp configuration. To improve the reasoning precision for grasp posture, a dynamic-oriented anchor matching strategy is further proposed to adapt to the grasp perception module. The proposed RIGNet is comparable to the state-of-the-art algorithms on single-object tasks and has better performance on stacked multi-object grasp detection. This study can provide new insights for autonomous robot reasoning in real time, facilitating an understanding of both the rationale and methodology for grasping an intended target in highly stacked environments.
Note to Practitioners: This work is motivated by robotic embodied intelligence techniques, which play an irreplaceable role in autonomous robot grasping manipulation. Using this technology, the robot is capable of performing grasp tasks for humans. Previous grasp detection methods focus on optimizing the grasp posture to maximize the success of robot grasping manipulation, but ignore comprehension of the candidate objects' attributes. Hence, this paper proposes a novel robot intention grasp network (RIGNet) that lets the robot perceive how to grasp and comprehend why it grasps. The improved RPN module robustly predicts the RoIs to locate candidates among overlapping and occluded objects, and the multi-task siamese module then detects the object category corresponding to each grasp configuration. More importantly, the proposed technology does not lose efficiency in single-object grasping, and even outperforms the state-of-the-art algorithms in multi-object stacked grasping. The RIGNet schema is suited to settings in which a human intends a real robot to execute desired grasp tasks.
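The abstract describes a two-branch output per RoI: a category for the candidate object and a grasp configuration for it. As a rough illustration of how such outputs could be combined to pick a grasp for an intended target, the sketch below assumes each RoI already carries per-category scores and a list of scored oriented grasp rectangles. The names `Grasp`, `RoI`, `select_intended_grasp`, and the `min_class_score` threshold are hypothetical and not taken from the paper. The selection rule is a simple stand-in, not the authors' network or their anchor matching strategy.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Grasp:
    """Oriented grasp rectangle (hypothetical layout): centre, size, angle in radians."""
    x: float
    y: float
    w: float
    h: float
    theta: float
    score: float


@dataclass
class RoI:
    """One region proposal with assumed outputs of the two branches."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    class_scores: Dict[str, float]          # assumed category-branch output
    grasps: List[Grasp]                     # assumed grasp-branch output


def select_intended_grasp(
    rois: List[RoI], intent: str, min_class_score: float = 0.5
) -> Optional[Grasp]:
    """Return the best grasp on the RoI that most confidently matches `intent`.

    Returns None when no RoI reaches `min_class_score` for the intended
    category or when the matching RoIs carry no grasp candidates.
    """
    best_roi: Optional[RoI] = None
    best_cls = min_class_score
    for roi in rois:
        cls_score = roi.class_scores.get(intent, 0.0)
        # Keep the RoI with the highest score for the intended category.
        if cls_score >= best_cls and roi.grasps:
            best_roi, best_cls = roi, cls_score
    if best_roi is None:
        return None
    # Within the chosen RoI, take the highest-scoring grasp candidate.
    return max(best_roi.grasps, key=lambda g: g.score)
```

Tying the grasp to the RoI chosen by category, rather than taking the globally best grasp, is what keeps the robot from grasping a neighbouring object in a stacked scene under this simplified rule.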
About the journal:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.