This letter presents a bilateral shared-control teleoperation system that establishes a human–machine–environment cooperative control framework to address performance degradation caused by communication delays and cluttered environments. Specifically, we develop a leader-side robot kinematic model with environment-induced artificial potential-field constraints to dynamically fuse leader intent and follower-side environmental information, enabling delay compensation of leader commands. Based on the compensated commands, a bilateral teleoperation controller with a local prediction-error compensation term is designed, and an auxiliary potential-field-based shared-control term is incorporated via weighting coefficients to enhance the safety and stability of the wheeled mobile robot (WMR) in complex scenarios. Experimental validation in both simple and challenging environments demonstrates that the proposed method significantly reduces the task failure rate and collision risk while improving control efficiency and user experience. Operators’ subjective assessments likewise show significantly lower perceived workload and higher satisfaction.
{"title":"A Shared-Control Teleoperation System Based on Potential-Field-Constraint Prediction","authors":"Pengpeng Li;Weihua Li;Zhenwei Lian;Hongjun Xing;Bindi You;Jianfeng Wang;Liang Ding","doi":"10.1109/LRA.2025.3643275","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643275","url":null,"abstract":"This letter presents a bilateral shared-control teleoperation system by establishing a human machine environment cooperative control framework to address performance degradation caused by communication delays and cluttered environments. Specifically, we develop a leader-side robot kinematic model with environment-induced artificial potential-field constraints to dynamically fuse leader intent and follower-side environmental information, enabling delay compensation of leader commands. Based on the compensated commands, a bilateral teleoperation controller with local prediction-error compensation term is designed, and an auxiliary potential-field-based shared-control term is incorporated via weighting coefficients to enhance the safety and stability of the wheeled mobile robot (WMR) in complex scenarios. Experimental validation in both simple and challenging environments demonstrates that the proposed method significantly reduces the task failure rate and collision risk, while improving control efficiency and user experience. And operators’ subjective assessments also show significant improvements in lower perceived workload and satisfaction.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1634-1641"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643298
Ziken Huang;Xinze Niu;Bowen Chai;Renbiao Jin;Danping Zou
High-speed aerial grasping presents significant challenges due to the high demands on precise, responsive flight control and coordinated gripper manipulation. In this work, we propose Swooper, a deep reinforcement learning (DRL)-based approach that achieves both precise flight control and active gripper control using a single lightweight neural network policy. Training such a policy directly via DRL is nontrivial due to the complexity of coordinating flight and grasping. To address this, we adopt a two-stage learning strategy: we first pre-train a flight control policy, and then fine-tune it to acquire grasping skills. With carefully designed reward functions and a dedicated training framework, the entire training process completes in under 60 minutes on a standard desktop with an Nvidia RTX 3060 GPU. To validate the trained policy in the real world, we develop a lightweight quadrotor grasping platform equipped with a simple off-the-shelf gripper and deploy the policy in a zero-shot manner on the onboard Raspberry Pi 4B computer, where each inference takes only about 1.0 ms. In 25 real-world trials, our policy achieves an 84% grasp success rate and grasping speeds of up to 1.5 m/s without any fine-tuning. This matches the robustness and agility of state-of-the-art classical systems with sophisticated grippers, highlighting the capability of DRL for learning a robust control policy that seamlessly integrates high-speed flight and grasping.
{"title":"Swooper: Learning High-Speed Aerial Grasping With a Simple Gripper","authors":"Ziken Huang;Xinze Niu;Bowen Chai;Renbiao Jin;Danping Zou","doi":"10.1109/LRA.2025.3643298","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643298","url":null,"abstract":"High-speed aerial grasping presents significant challenges due to the high demands on precise, responsive flight control and coordinated gripper manipulation. In this work, we propose <italic>Swooper</i>, a deep reinforcement learning (DRL) based approach that achieves both precise flight control and active gripper control using a single lightweight neural network policy. Training such a policy directly via DRL is nontrivial due to the complexity of coordinating flight and grasping. To address this, we adopt a two-stage learning strategy: we first pre-train a flight control policy, and then fine-tune it to acquire grasping skills. With the carefully designed reward functions and training framework, the entire training process completes in under 60 minutes on a standard desktop with an Nvidia RTX 3060 GPU. To validate the trained policy in the real world, we develop a lightweight quadrotor grasping platform equipped with a simple off-the-shelf gripper, and deploy the policy in a zero-shot manner on the onboard Raspberry Pi 4B computer, where each inference takes only about 1.0 ms. In 25 real-world trials, our policy achieves an 84% grasp success rate and grasping speeds of up to 1.5 m/s without any fine-tuning. This matches the robustness and agility of state-of-the-art classical systems with sophisticated grippers, highlighting the capability of DRL for learning a robust control policy that seamlessly integrates high-speed flight and grasping.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"2298-2305"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643333
Dunfa Long;Shaoan Chen;Shuai Ao;Zhiqiang Zhang;Chengzhi Hu;Chaoyang Shi
This work introduces a novel compact 7-degree-of-freedom (7-DOF) microsurgical robot with position-orientation decoupling capability for microvascular anastomosis. The proposed system employs a modular architecture combining a proximal displacement platform for 3D small-stroke translation and a distal compact remote center of motion (RCM) mechanism for wide-range orientation adjustment. This design meets the workspace requirements of microvascular anastomosis, which demands extensive orientation adjustments with minimal positional movement, while reducing the system footprint. A parasitic-motion reverse self-compensation method has been developed for the motorized surgical instruments, effectively reducing operational resistance to improve precision. Theoretical analysis has been performed on both the RCM mechanism and the motorized surgical instruments, and kinematics-based parameter optimization and data-driven calibration have been conducted to further enhance performance. A prototype has been constructed, and its experimental validation demonstrated that the system achieved repeatability of 11.24 ± 2.31 μm (XY) and 12.46 ± 4.48 μm (YZ), and absolute positioning accuracy of 29.80 ± 12.27 μm (XY) and 37.02 ± 19.47 μm (YZ), meeting super-microsurgical requirements. Experiments including needle-threading and stamen-peeling tasks demonstrate the robot's superior dexterity and manipulation capabilities.
{"title":"Development of a Novel 7-DOF Position-Orientation Decoupled Microsurgical Robot with Motorized Instruments for Microvascular Anastomosis","authors":"Dunfa Long;Shaoan Chen;Shuai Ao;Zhiqiang Zhang;Chengzhi Hu;Chaoyang Shi","doi":"10.1109/LRA.2025.3643333","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643333","url":null,"abstract":"This work introduces a novel compact 7-degree-of-freedom (7-DOF) microsurgical robot with position-orientation decoupling capacity for microvascular anastomosis. The proposed system employs a modular architecture combining a proximal displacement platform for 3D small-stroke translation and a distal compact remote center of motion (RCM) mechanism for wide-range orientation adjustment. This design meets the workspace requirements for microvascular anastomosis, requiring extensive orientation adjustments with minimal positional movement and reducing the system footprint. The parasitic motion reverse self-compensation method has been developed for motorized surgical instruments, effectively reducing operational resistance to improve precision. Theoretical analysis has been performed on both the RCM mechanism and motorized surgical instruments, and kinematics-based parameter optimization and data-driven calibration have been conducted to enhance superior performance. A prototype has been constructed, and its experimental validation demonstrated that the system achieved repeatability of 11.24 ± 2.31 μm (XY) and 12.46 ± 4.48 μm (YZ), and absolute positioning accuracy of 29.80 ± 12.27 μm (XY) and 37.02 ± 19.47 μm (YZ), meeting super-microsurgical requirements. Experiments that include needle-threading and stamen peeling tasks demonstrate the robot's superior dexterity and manipulation capabilities.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1866-1873"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145830774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643304
Heteng Zhang;Yunjie Jia;Zihao Sun;Yong Song;Bao Pang;Xianfeng Yuan;Rui Song;Simon X. Yang
Multi-robot systems have demonstrated significant potential in accomplishing complex tasks such as cooperative pursuit and search-and-rescue operations. Heterogeneous robot teams, whose members have diverse capabilities and characteristics, show superior adaptability compared with homogeneous teams. However, in practical applications, global information is typically inaccessible, and composite teams must contend with partial observability and coordination difficulties. To address these issues in heterogeneous multi-robot systems, we propose a novel Intention-Guided reinforcement learning approach with a Dirichlet Energy constraint (IGDE). Specifically, an intention-guided module is designed to derive long-horizon strategies based solely on local observations, enabling foresighted decision-making. In addition, a Dirichlet energy constraint is incorporated into the communication process to enhance the diversity of environmental cognition among different classes of robots. Heterogeneous robots perform class-aware actions driven by distinct cognitive representations, thereby enhancing cooperative efficiency. Notably, our approach alleviates the need for prior knowledge and explicit heterogeneity modeling. Extensive comparative experiments and ablation studies verify the effectiveness of the proposed framework, and real-world deployment demonstrates its practicality.
{"title":"An Intention-Guided Reinforcement Learning Approach With Dirichlet Energy Constraint for Heterogeneous Multi-Robot Cooperation","authors":"Heteng Zhang;Yunjie Jia;Zihao Sun;Yong Song;Bao Pang;Xianfeng Yuan;Rui Song;Simon X. Yang","doi":"10.1109/LRA.2025.3643304","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643304","url":null,"abstract":"Multi-robot systems have demonstrated significant potential in accomplishing complex tasks, such as cooperative pursuit, search-and-rescue operations. The emergence of heterogeneous robots with diverse capabilities and characteristics shows superior adaptability compared with homogeneous teams. However, in practical applications, global information is typically inaccessible, and composite teams must contend with partial observability and coordination difficulties. To address the issue in heterogeneous multi-robot systems, we propose a novel <italic>I</i>ntention-<italic>G</i>uided reinforcement learning approach with <italic>D</i>irichlet <italic>E</i>nergy constraint (IGDE). Specifically, an intention-guided module is designed to derive long-horizon strategies based solely on local observations, enabling foresighted decision-making. In addition, a Dirichlet energy constraint is incorporated into the communication process to enhance the diversity of environmental cognition among different classes of robots. Heterogeneous robots perform class-aware actions driven by distinct cognitive representations, thereby enhancing cooperative efficiency. Notably, our approach alleviates the need of prior knowledge and heterogeneity modeling. Extensive comparative experiments and ablation studies verify the effectiveness of the proposed framework. Additionally, real-world deployment is conducted to demonstrate the practicality.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1450-1457"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643332
Dane Brouwer;Joshua Citron;Heather Nolte;Jeannette Bohg;Mark Cutkosky
Dense collections of movable objects are common in everyday spaces—from cabinets in a home to shelves in a warehouse. Safely retracting objects from such collections is difficult for robots, yet people do it frequently, leveraging learned experience in tandem with vision and non-prehensile tactile sensing on the sides and backs of their hands and arms. We investigate the role of contact force sensing for training robots to gently reach into constrained clutter and extract objects. The available sensing modalities are 1) “eye-in-hand” vision, 2) proprioception, 3) non-prehensile triaxial tactile sensing, 4) contact wrenches estimated from joint torques, and 5) a measure of object acquisition obtained by monitoring the vacuum line of a suction cup. We use imitation learning to train policies from a set of demonstrations on randomly generated scenes, then conduct an ablation study of wrench and tactile information. We evaluate each policy’s performance across 40 unseen environment configurations. Policies employing any force sensing show fewer excessive force failures, an increased overall success rate, and faster completion times. The best performance is achieved using both tactile and wrench information, producing an 80% improvement above the baseline without force information.
{"title":"Gentle Object Retraction in Dense Clutter Using Multimodal Force Sensing and Imitation Learning","authors":"Dane Brouwer;Joshua Citron;Heather Nolte;Jeannette Bohg;Mark Cutkosky","doi":"10.1109/LRA.2025.3643332","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643332","url":null,"abstract":"Dense collections of movable objects are common in everyday spaces—from cabinets in a home to shelves in a warehouse. Safely retracting objects from such collections is difficult for robots, yet people do it frequently, leveraging learned experience in tandem with vision and non-prehensile tactile sensing on the sides and backs of their hands and arms. We investigate the role of contact force sensing for training robots to gently reach into constrained clutter and extract objects. The available sensing modalities are 1) “eye-in-hand” vision, 2) proprioception, 3) non-prehensile triaxial tactile sensing, 4) contact wrenches estimated from joint torques, and 5) a measure of object acquisition obtained by monitoring the vacuum line of a suction cup. We use imitation learning to train policies from a set of demonstrations on randomly generated scenes, then conduct an ablation study of wrench and tactile information. We evaluate each policy’s performance across 40 unseen environment configurations. Policies employing any force sensing show fewer excessive force failures, an increased overall success rate, and faster completion times. The best performance is achieved using both tactile and wrench information, producing an 80% improvement above the baseline without force information.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1578-1585"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643289
Jie Xu;Xuanxuan Zhang;Yongxin Ma;Yixuan Li;Linji Wang;Xinhang Xu;Shenghai Yuan;Lihua Xie
Visual-inertial odometry (VIO) can estimate robot poses at high frequencies but suffers from accumulated drift over time. Incorporating point cloud maps offers a promising solution, yet existing registration methods between vision and point clouds are limited by heterogeneous feature alignment, leaving much information underutilized and resulting in reduced accuracy, poor robustness, and high computational cost. To address these challenges, this paper proposes a visual-inertial localization system based on colored point cloud maps, consisting of two main components: color map construction and visual-inertial tracking. A gradient-based map sparsification strategy is employed during map construction to preserve salient features while reducing storage and computation. For localization, we propose an image-pyramid-based visual photometric iterated error-state Kalman filter (IESKF), which fuses IMU and photometric observations to estimate precise poses. Gradient-rich feature points are projected onto image pyramids at multiple resolutions to perform iterative updates, effectively avoiding local minima and improving accuracy. Experimental results show that our method achieves stable and accurate localization bounded by map precision, and demonstrates higher efficiency and robustness than existing map-based approaches.
{"title":"ColorMap-VIO: A Drift-Free Visual-Inertial Odometry in a Prior Colored Point Cloud Map","authors":"Jie Xu;Xuanxuan Zhang;Yongxin Ma;Yixuan Li;Linji Wang;Xinhang Xu;Shenghai Yuan;Lihua Xie","doi":"10.1109/LRA.2025.3643289","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643289","url":null,"abstract":"Visual-inertial odometry (VIO) can estimate robot poses at high frequencies but suffers from accumulated drift over time. Incorporating point cloud maps offers a promising solution, yet existing registration methods between vision and point clouds are limited by heterogeneous feature alignment, leaving much information underutilized and resulting in reduced accuracy, poor robustness, and high computational cost. To address these challenges, this paper proposes a visual-inertial localization system based on color point cloud maps, consisting of two main components: color map construction and visual-inertial tracking. A gradient-based map sparsification strategy is employed during map construction to preserve salient features while reducing storage and computation. For localization, we propose an image pyramid-based visual photometric IESKF, which fuses IMU and photometric observations to estimate precise poses. Gradient-rich feature points are projected onto image pyramids across multiple resolutions to perform iterative updates, effectively avoiding local minima and improving accuracy. Experimental results show that our method achieves stable and accurate localization bounded by map precision, and demonstrates higher efficiency and robustness than existing map-based approaches.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1570-1577"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643269
Alan Li;Angela P. Schoellig
6D object pose estimation is a fundamental component of robotics, enabling efficient interaction with the environment. In industrial bin-picking tasks, this problem becomes especially challenging due to difficult object poses, complex occlusions, and inter-object ambiguities. In this work, we propose a novel self-supervised method that automatically collects, labels, and fine-tunes on real images using an eye-in-hand camera setup. We leverage the mobile camera to first obtain reliable ground-truth estimates through multi-view pose estimation, allowing us to subsequently reposition the camera to capture and label real ‘hard case’ samples from the estimated scene. This process closes the sim-to-real gap through large quantities of targeted real training data: differences in model performance between real and synthetically reconstructed scenes inform the mobile camera of specific poses or areas to capture. We surpass state-of-the-art performance on a challenging bin-picking benchmark: five out of seven objects exceed a 95% correct detection rate, compared with only one out of seven for previous methods.
{"title":"Self-Supervised Learning for Object Pose Estimation Through Active Real Sample Capture","authors":"Alan Li;Angela P. Schoellig","doi":"10.1109/LRA.2025.3643269","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643269","url":null,"abstract":"6D Object pose estimation is a fundamental component in robotics enabling efficient interaction with the environment. In industrial bin-picking tasks, this problem becomes especially challenging due to difficult object poses, complex occlusions, and inter-object ambiguities. In this work, we propose a novel self-supervised method that automatically collects, labels, and fine-tunes on real images using an eye-in-hand camera setup. We leverage the mobile camera to first obtain reliable ground-truth estimates through multi-view pose estimation, allowing us to subsequently reposition the camera to capture and label real ‘hard case’ samples from the estimated scene. This process enables closure of the sim-to-real gap through large quantities of targeted real training data, generated by comparing differences in model performance between real and synthetically reconstructed scenes and informing the mobile camera on specific poses or areas for data capture. We surpass state-of-the-art performance on a challenging bin-picking benchmark: five out of seven objects surpass a 95% correct detection rate, compared to only one out of seven for previous methods.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1954-1961"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145830867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643277
Ruochen Li;Junkai Jiang;Jiongqi Wang;Shaobing Xu;Jianqiang Wang
Multi-depot multi-agent collaborative coverage is a representative problem in swarm intelligence, with broad applications in real-world scenarios. In this problem, multiple agents start at different depots, which differs from the traditional problem setting, and are required to collaboratively cover a given region represented by a structured road network. The objective is to minimize the longest individual route among all agents. This problem is closely related to the $k$-Chinese Postman Problem ($k$-CPP), but is more complex because agents start from distinct depots. This letter proposes a novel centralized algorithm, Cycle Clustering (CC), to solve the problem efficiently. The proposed method first transforms the original graph into an Eulerian graph, then partitions the Eulerian graph into multiple small cycles, which are subsequently clustered and assigned to different agents. This design significantly reduces unnecessary route overhead. The algorithm’s time complexity and completeness are analyzed theoretically. Experimental results demonstrate that the proposed algorithm outperforms existing methods, reducing the average gap to the theoretical lower bound from 29.10% to 10.29%.
{"title":"Cycle Clustering: An Algorithm for Multi-Depot Multi-Agent Collaborative Coverage in Structured Road Network","authors":"Ruochen Li;Junkai Jiang;Jiongqi Wang;Shaobing Xu;Jianqiang Wang","doi":"10.1109/LRA.2025.3643277","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643277","url":null,"abstract":"Multi-depot multi-agent collaborative coverage is a representative problem in swarm intelligence, with broad applications in real-world scenarios. In this problem, multiple agents are initially located at different depots, which differs from the traditional problem setting, and are required to collaboratively cover a given region represented by a structured road network. The objective is to minimize the longest individual route among all agents. This problem is closely related to the <inline-formula><tex-math>$k$</tex-math></inline-formula>-Chinese Postman Problem (<inline-formula><tex-math>$k$</tex-math></inline-formula>-CPP), but is more complex due to the different depot constraint. This letter proposes a novel centralized algorithm, Cycle Clustering (CC), to solve this problem efficiently. The proposed method first transforms the original graph into an Eulerian graph, and then partitions the Eulerian graph into multiple small cycles, which are subsequently clustered and assigned to different agents. This design significantly reduces unnecessary route overhead. The algorithm’s time complexity and completeness are theoretically analyzed. Experimental results demonstrate that the proposed algorithm performs better than existing methods, reducing the average gap to the theoretical lower bound from 29.10% to 10.29%.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1554-1561"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643271
Sidharth Talia;Oren Salzman;Siddhartha Srinivasa
We address the problem of efficiently organizing search over very large trees, which arises in many applications ranging from autonomous driving to aerial vehicles. Here, we are motivated by off-road autonomy, where real-time planning is essential. Classical approaches use graphs of motion primitives and exploit dominance to mitigate the curse of dimensionality and prune expansions efficiently. However, for complex dynamics, repeatedly solving two-point boundary-value problems makes graph construction too slow for fast kinodynamic planning. Hybrid A* (HA*) addressed this challenge by searching over a tree of motion primitives and introducing approximate pruning using a grid-based dominance check. However, choosing the grid resolution is difficult: too coarse risks failure, while too fine leads to excessive expansions and slow planning. We propose Incremental Generalized Hybrid A* (IGHA*), an anytime tree-search framework that dynamically organizes vertex expansions without rigid pruning. IGHA* provably matches or outperforms HA*. For both on-road kinematic and off-road kinodynamic planning queries for a car-like robot, variants of IGHA* use $6\times$ fewer expansions to the best solution compared to an optimized version of HA* (HA*M, an internal baseline). In simulated off-road experiments in a high-fidelity simulator, IGHA* outperforms HA*M when both are used in the loop with a model predictive controller. We demonstrate real-time performance both in simulation and on a small-scale off-road vehicle, enabling fast, robust planning under complex dynamics.
{"title":"Incremental Generalized Hybrid A*","authors":"Sidharth Talia;Oren Salzman;Siddhartha Srinivasa","doi":"10.1109/LRA.2025.3643271","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643271","url":null,"abstract":"We address the problem of efficiently organizing search over very large trees, which arises in many applications ranging from autonomous driving to aerial vehicles. Here, we are motivated by off-road autonomy, where real-time planning is essential. Classical approaches use graphs of motion primitives and exploit dominance to mitigate the curse of dimensionality and prune expansions efficiently. However, for complex dynamics, repeatedly solving two-point boundary-value problems makes graph construction too slow for fast kinodynamic planning. Hybrid A* (<monospace>HA*</monospace>) addressed this challenge by searching over a tree of motion primitives and introducing approximate pruning using a grid-based dominance check. However, choosing the grid resolution is difficult: too coarse risks failure, while too fine leads to excessive expansions and slow planning. We propose Incremental Generalized Hybrid A* (<monospace>IGHA*</monospace>), an anytime tree-search framework that dynamically organizes vertex expansions without rigid pruning. <monospace>IGHA*</monospace> provably matches or outperforms <monospace>HA*</monospace>. For both on-road kinematic and off-road kinodynamic planning queries for a car-like robot, variants of <monospace>IGHA*</monospace> use <inline-formula><tex-math>$6times$</tex-math></inline-formula> fewer expansions to the best solution compared to an optimized version of <monospace>HA*</monospace> (<monospace>HA*M</monospace>, an internal baseline). In simulated off-road experiments in a high-fidelity simulator, <monospace>IGHA*</monospace> outperforms <monospace>HA*M</monospace> when both are used in the loop with a model predictive controller. We demonstrate real-time performance both in simulation and on a small-scale off-road vehicle, enabling fast, robust planning under complex dynamics.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"1586-1593"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11 | DOI: 10.1109/LRA.2025.3643293
Gihyeon Lee;Jungwoo Lee;Juwon Kim;Young-Sik Shin;Younggun Cho
Robots are often required to localize in environments with unknown object classes and semantic ambiguity. However, when performing global localization using semantic objects, high semantic ambiguity intensifies object misclassification and increases the likelihood of incorrect associations, which in turn can cause significant errors in the estimated pose. Thus, in this letter, we propose a multi-label likelihood-based semantic graph matching framework for object-level global localization. The key idea is to exploit multi-label graph representations, rather than single-label alternatives, to capture and leverage the inherent semantic context of object observations. Based on these representations, our approach enhances semantic correspondence across graphs by combining the likelihood of each node with the maximum likelihood of its neighbors via context-aware likelihood propagation. For rigorous validation, data association and pose estimation performance are evaluated under both closed-set and open-set detection configurations. In addition, we demonstrate the scalability of our approach to large-vocabulary object categories in both real-world indoor scenes and synthetic environments.
{"title":"MSG-Loc: Multi-Label Likelihood-Based Semantic Graph Matching for Object-Level Global Localization","authors":"Gihyeon Lee;Jungwoo Lee;Juwon Kim;Young-Sik Shin;Younggun Cho","doi":"10.1109/LRA.2025.3643293","DOIUrl":"https://doi.org/10.1109/LRA.2025.3643293","url":null,"abstract":"Robots are often required to localize in environments with unknown object classes and semantic ambiguity. However, when performing global localization using semantic objects, high semantic ambiguity intensifies object misclassification and increases the likelihood of incorrect associations, which in turn can cause significant errors in the estimated pose. Thus, in this letter, we propose a multi-label likelihood-based semantic graph matching framework for object-level global localization. The key idea is to exploit multi-label graph representations, rather than single-label alternatives, to capture and leverage the inherent semantic context of object observations. Based on these representations, our approach enhances semantic correspondence across graphs by combining the likelihood of each node with the maximum likelihood of its neighbors via context-aware likelihood propagation. For rigorous validation, data association and pose estimation performance are evaluated under both closed-set and open-set detection configurations. In addition, we demonstrate the scalability of our approach to large-vocabulary object categories in both real-world indoor scenes and synthetic environments.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"2066-2073"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}