This letter presents a bilateral shared-control teleoperation system that establishes a human-machine-environment cooperative control framework to address performance degradation caused by communication delays and cluttered environments. Specifically, we develop a leader-side robot kinematic model with environment-induced artificial potential-field constraints to dynamically fuse leader intent with follower-side environmental information, enabling delay compensation of leader commands. Based on the compensated commands, a bilateral teleoperation controller with a local prediction-error compensation term is designed, and an auxiliary potential-field-based shared-control term is incorporated via weighting coefficients to enhance the safety and stability of the wheeled mobile robot (WMR) in complex scenarios. Experimental validation in both simple and challenging environments demonstrates that the proposed method significantly reduces the task failure rate and collision risk while improving control efficiency and user experience. Operators' subjective assessments also indicate significantly lower perceived workload and higher satisfaction.
Pengpeng Li; Weihua Li; Zhenwei Lian; Hongjun Xing; Bindi You; Jianfeng Wang; Liang Ding, "A Shared-Control Teleoperation System Based on Potential-Field-Constraint Prediction," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1634-1641. Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643275
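As a rough illustration of the shared-control blending described in the abstract, the sketch below combines a (delay-compensated) operator velocity command with a repulsive artificial-potential-field assist term through a single weighting coefficient. The function names, the FIRAS-style repulsion, and all gains are illustrative assumptions, not the letter's actual controller.

```python
import numpy as np

def apf_repulsive_velocity(robot_xy, obstacles_xy, influence_radius=1.5, gain=0.8):
    """Sum repulsive contributions from obstacle points within an influence radius."""
    v = np.zeros(2)
    for obs in obstacles_xy:
        d_vec = robot_xy - obs
        d = np.linalg.norm(d_vec)
        if 1e-6 < d < influence_radius:
            # Classic FIRAS-style repulsion: grows rapidly as the robot nears the obstacle.
            v += gain * (1.0 / d - 1.0 / influence_radius) * (d_vec / d) / d**2
    return v

def blend_commands(operator_cmd, assist_cmd, weight):
    """Convex combination of the (delay-compensated) operator command and the
    potential-field assist term; weight in [0, 1] sets the level of autonomy."""
    return (1.0 - weight) * np.asarray(operator_cmd) + weight * np.asarray(assist_cmd)

# Example: an obstacle just ahead pushes the blended planar velocity away from it.
robot = np.array([0.0, 0.0])
obstacles = [np.array([0.8, 0.1])]
v_assist = apf_repulsive_velocity(robot, obstacles)
v_cmd = blend_commands([0.5, 0.0], v_assist, weight=0.4)
print(v_cmd)
```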
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643304
Heteng Zhang;Yunjie Jia;Zihao Sun;Yong Song;Bao Pang;Xianfeng Yuan;Rui Song;Simon X. Yang
Multi-robot systems have demonstrated significant potential in accomplishing complex tasks such as cooperative pursuit and search-and-rescue operations. Heterogeneous teams, composed of robots with diverse capabilities and characteristics, show superior adaptability compared with homogeneous teams. However, in practical applications, global information is typically inaccessible, and such composite teams must contend with partial observability and coordination difficulties. To address these issues in heterogeneous multi-robot systems, we propose a novel Intention-Guided reinforcement learning approach with a Dirichlet Energy constraint (IGDE). Specifically, an intention-guided module is designed to derive long-horizon strategies based solely on local observations, enabling foresighted decision-making. In addition, a Dirichlet energy constraint is incorporated into the communication process to enhance the diversity of environmental cognition among different classes of robots. Heterogeneous robots perform class-aware actions driven by distinct cognitive representations, thereby improving cooperative efficiency. Notably, our approach alleviates the need for prior knowledge and heterogeneity modeling. Extensive comparative experiments and ablation studies verify the effectiveness of the proposed framework, and real-world deployment demonstrates its practicality.
"An Intention-Guided Reinforcement Learning Approach With Dirichlet Energy Constraint for Heterogeneous Multi-Robot Cooperation," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1450-1457. DOI: 10.1109/LRA.2025.3643304
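The Dirichlet energy constraint mentioned above has a standard graph form, E(H) = trace(H^T L H) = (1/2) * sum_{i,j} A_ij * ||h_i - h_j||^2, where H stacks per-agent embeddings and L is the Laplacian of the communication graph. The sketch below is a minimal illustration under the assumption that each agent carries such an embedding; how IGDE weights or bounds this quantity during training is not shown here.

```python
import numpy as np

def dirichlet_energy(H, A):
    """Dirichlet energy of node features H (n x d) over a graph with symmetric
    adjacency A (n x n): E = trace(H^T L H), with L = D - A the graph Laplacian."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    return np.trace(H.T @ L @ H)

# Toy example: three agents on a line graph. Identical embeddings give zero energy,
# diverse embeddings give positive energy (a constraint can then encourage diversity).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H_same = np.ones((3, 4))
H_diverse = np.random.default_rng(0).normal(size=(3, 4))
print(dirichlet_energy(H_same, A), dirichlet_energy(H_diverse, A))
```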
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643332
Dane Brouwer;Joshua Citron;Heather Nolte;Jeannette Bohg;Mark Cutkosky
Dense collections of movable objects are common in everyday spaces—from cabinets in a home to shelves in a warehouse. Safely retracting objects from such collections is difficult for robots, yet people do it frequently, leveraging learned experience in tandem with vision and non-prehensile tactile sensing on the sides and backs of their hands and arms. We investigate the role of contact force sensing for training robots to gently reach into constrained clutter and extract objects. The available sensing modalities are 1) “eye-in-hand” vision, 2) proprioception, 3) non-prehensile triaxial tactile sensing, 4) contact wrenches estimated from joint torques, and 5) a measure of object acquisition obtained by monitoring the vacuum line of a suction cup. We use imitation learning to train policies from a set of demonstrations on randomly generated scenes, then conduct an ablation study of wrench and tactile information. We evaluate each policy’s performance across 40 unseen environment configurations. Policies employing any force sensing show fewer excessive force failures, an increased overall success rate, and faster completion times. The best performance is achieved using both tactile and wrench information, producing an 80% improvement above the baseline without force information.
"Gentle Object Retraction in Dense Clutter Using Multimodal Force Sensing and Imitation Learning," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1578-1585. DOI: 10.1109/LRA.2025.3643332
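As a toy illustration of the sensing modalities listed in the abstract, the sketch below assembles a per-step observation and zeroes the force-related channels to mimic the ablation without force information. All dimensions and field names are assumptions for illustration; the wrist-camera image would normally pass through a separate vision encoder rather than being flattened here.

```python
import numpy as np

# Hypothetical per-step observation; dimensions are chosen only for illustration.
obs = {
    "eye_in_hand_rgb": np.zeros((64, 64, 3)),   # wrist-camera image (handled by a vision encoder)
    "proprioception":  np.zeros(7),             # joint positions
    "tactile":         np.zeros((12, 3)),       # triaxial taxels on the sides/back of hand and arm
    "wrench":          np.zeros(6),             # contact wrench estimated from joint torques
    "suction":         np.zeros(1),             # vacuum-line signal (object acquired?)
}

def flatten_obs(obs, use_force=True):
    """Concatenate the low-dimensional signals into a policy input; the force-related
    channels (tactile, wrench) are zeroed to mimic an ablation without force sensing."""
    tactile = obs["tactile"].ravel() if use_force else np.zeros(obs["tactile"].size)
    wrench = obs["wrench"] if use_force else np.zeros_like(obs["wrench"])
    return np.concatenate([obs["proprioception"], tactile, wrench, obs["suction"]])

print(flatten_obs(obs).shape)
```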
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643289
Jie Xu;Xuanxuan Zhang;Yongxin Ma;Yixuan Li;Linji Wang;Xinhang Xu;Shenghai Yuan;Lihua Xie
Visual-inertial odometry (VIO) can estimate robot poses at high frequencies but suffers from accumulated drift over time. Incorporating point cloud maps offers a promising solution, yet existing registration methods between vision and point clouds are limited by heterogeneous feature alignment, leaving much information underutilized and resulting in reduced accuracy, poor robustness, and high computational cost. To address these challenges, this paper proposes a visual-inertial localization system based on colored point cloud maps, consisting of two main components: color map construction and visual-inertial tracking. A gradient-based map sparsification strategy is employed during map construction to preserve salient features while reducing storage and computation. For localization, we propose an image-pyramid-based visual photometric iterated error-state Kalman filter (IESKF), which fuses IMU and photometric observations to estimate precise poses. Gradient-rich feature points are projected onto image pyramids across multiple resolutions to perform iterative updates, effectively avoiding local minima and improving accuracy. Experimental results show that our method achieves stable and accurate localization bounded by map precision, and demonstrates higher efficiency and robustness than existing map-based approaches.
"ColorMap-VIO: A Drift-Free Visual-Inertial Odometry in a Prior Colored Point Cloud Map," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1570-1577. DOI: 10.1109/LRA.2025.3643289
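A building block of such an image-pyramid photometric filter is the per-level photometric residual between projected colored map points and the current image. The sketch below shows one plausible form of that residual under assumed pinhole intrinsics; the iterated error-state update, IMU propagation, and robust weighting used in the paper are not reproduced, and all names and thresholds are illustrative.

```python
import numpy as np

def bilinear(img, u, v):
    """Bilinearly interpolated intensity at the sub-pixel location (u, v)."""
    x0, y0 = int(u), int(v)
    a, b = u - x0, v - y0
    return ((1 - a) * (1 - b) * img[y0, x0] + a * (1 - b) * img[y0, x0 + 1] +
            (1 - a) * b * img[y0 + 1, x0] + a * b * img[y0 + 1, x0 + 1])

def photometric_residuals(points_w, intensities, T_cw, K_level, img_level):
    """Residuals r_i = I(pi(K (R p_i + t))) - c_i for colored map points at one pyramid
    level; img_level and K_level are the image and intrinsics downscaled to that level."""
    R, t = T_cw[:3, :3], T_cw[:3, 3]
    fx, fy, cx, cy = K_level[0, 0], K_level[1, 1], K_level[0, 2], K_level[1, 2]
    res = []
    for p, c in zip(points_w, intensities):
        pc = R @ p + t
        if pc[2] <= 0.1:                     # skip points behind or too close to the camera
            continue
        u, v = fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy
        if 1 <= u < img_level.shape[1] - 2 and 1 <= v < img_level.shape[0] - 2:
            res.append(bilinear(img_level, u, v) - c)
    return np.asarray(res)
```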
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643333
Dunfa Long;Shaoan Chen;Shuai Ao;Zhiqiang Zhang;Chengzhi Hu;Chaoyang Shi
This work introduces a novel compact 7-degree-of-freedom (7-DOF) microsurgical robot with position-orientation decoupling capability for microvascular anastomosis. The proposed system employs a modular architecture that combines a proximal displacement platform for small-stroke 3D translation with a distal compact remote center of motion (RCM) mechanism for wide-range orientation adjustment. This design meets the workspace requirements of microvascular anastomosis, which demands extensive orientation adjustment with minimal positional movement, while reducing the system footprint. A parasitic-motion reverse self-compensation method has been developed for the motorized surgical instruments, effectively reducing operational resistance and improving precision. Theoretical analysis has been performed on both the RCM mechanism and the motorized surgical instruments, and kinematics-based parameter optimization and data-driven calibration have been conducted to further enhance performance. A prototype has been constructed, and experimental validation demonstrated that the system achieves repeatability of 11.24 ± 2.31 μm (XY) and 12.46 ± 4.48 μm (YZ), and absolute positioning accuracy of 29.80 ± 12.27 μm (XY) and 37.02 ± 19.47 μm (YZ), meeting super-microsurgical requirements. Experiments including needle-threading and stamen-peeling tasks demonstrate the robot's superior dexterity and manipulation capabilities.
"Development of a Novel 7-DOF Position-Orientation Decoupled Microsurgical Robot with Motorized Instruments for Microvascular Anastomosis," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1866-1873. DOI: 10.1109/LRA.2025.3643333
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643269
Alan Li;Angela P. Schoellig
6D object pose estimation is a fundamental capability in robotics, enabling efficient interaction with the environment. In industrial bin-picking tasks, the problem becomes especially challenging due to difficult object poses, complex occlusions, and inter-object ambiguities. In this work, we propose a novel self-supervised method that automatically collects, labels, and fine-tunes on real images using an eye-in-hand camera setup. We leverage the mobile camera to first obtain reliable ground-truth estimates through multi-view pose estimation, and then reposition the camera to capture and label real ‘hard case’ samples from the estimated scene. This process closes the sim-to-real gap with large quantities of targeted real training data: differences in model performance between real and synthetically reconstructed scenes indicate specific poses or areas where the mobile camera should capture additional data. We surpass state-of-the-art performance on a challenging bin-picking benchmark: five out of seven objects exceed a 95% correct detection rate, compared to only one out of seven for previous methods.
"Self-Supervised Learning for Object Pose Estimation Through Active Real Sample Capture," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1954-1961. DOI: 10.1109/LRA.2025.3643269
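One plausible way to turn several per-view estimates into a single pseudo-ground-truth label, assuming each view's object pose has already been transformed into a common robot-base frame via the known eye-in-hand kinematics, is to average translations and take a chordal mean of rotations, as sketched below. This is only an illustrative fusion step, not the authors' multi-view pipeline.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def consensus_pose(poses_base_obj):
    """Fuse per-view 4x4 object poses (already expressed in a common robot-base frame)
    into one pseudo-ground-truth label: chordal mean of rotations, mean translation."""
    rots = R.from_matrix([T[:3, :3] for T in poses_base_obj])
    t = np.mean([T[:3, 3] for T in poses_base_obj], axis=0)
    T = np.eye(4)
    T[:3, :3] = rots.mean().as_matrix()
    T[:3, 3] = t
    return T
```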
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643277
Ruochen Li;Junkai Jiang;Jiongqi Wang;Shaobing Xu;Jianqiang Wang
Multi-depot multi-agent collaborative coverage is a representative problem in swarm intelligence, with broad real-world applications. In this problem, multiple agents start from different depots, which differs from the traditional setting in which all agents share a single depot, and must collaboratively cover a given region represented by a structured road network. The objective is to minimize the longest individual route among all agents. The problem is closely related to the $k$-Chinese Postman Problem ($k$-CPP) but is more complex due to the distinct-depot constraint. This letter proposes a novel centralized algorithm, Cycle Clustering (CC), to solve the problem efficiently. The proposed method first transforms the original graph into an Eulerian graph, then partitions the Eulerian graph into multiple small cycles, which are subsequently clustered and assigned to different agents. This design significantly reduces unnecessary route overhead. The algorithm's time complexity and completeness are analyzed theoretically. Experimental results demonstrate that the proposed algorithm outperforms existing methods, reducing the average gap to the theoretical lower bound from 29.10% to 10.29%.
"Cycle Clustering: An Algorithm for Multi-Depot Multi-Agent Collaborative Coverage in Structured Road Network," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1554-1561. DOI: 10.1109/LRA.2025.3643277
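The first step of Cycle Clustering, transforming the road network into an Eulerian graph and extracting a closed walk that covers every edge, can be illustrated with off-the-shelf graph tools. The sketch below uses networkx on a toy network; the subsequent cycle partitioning, clustering, and depot assignment of CC are not shown.

```python
import networkx as nx

# Toy road network with edge lengths. nx.eulerize duplicates edges so that every node
# has even degree; an Eulerian circuit of the result traverses every road edge.
G = nx.Graph()
G.add_weighted_edges_from([
    ("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 1.0),
    ("d", "a", 1.0), ("a", "c", 1.4), ("b", "d", 1.4),
])
H = nx.eulerize(G)
circuit = list(nx.eulerian_circuit(H, source="a"))
print(circuit)  # the covering walk that Cycle Clustering would cut into small cycles
```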
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643271
Sidharth Talia;Oren Salzman;Siddhartha Srinivasa
We address the problem of efficiently organizing search over very large trees, which arises in many applications ranging from autonomous driving to aerial vehicles. Here, we are motivated by off-road autonomy, where real-time planning is essential. Classical approaches use graphs of motion primitives and exploit dominance to mitigate the curse of dimensionality and prune expansions efficiently. However, for complex dynamics, repeatedly solving two-point boundary-value problems makes graph construction too slow for fast kinodynamic planning. Hybrid A* (HA*) addressed this challenge by searching over a tree of motion primitives and introducing approximate pruning using a grid-based dominance check. However, choosing the grid resolution is difficult: too coarse risks failure, while too fine leads to excessive expansions and slow planning. We propose Incremental Generalized Hybrid A* (IGHA*), an anytime tree-search framework that dynamically organizes vertex expansions without rigid pruning. IGHA* provably matches or outperforms HA*. For both on-road kinematic and off-road kinodynamic planning queries for a car-like robot, variants of IGHA* use $6\times$ fewer expansions to the best solution compared to an optimized version of HA* (HA*M, an internal baseline). In simulated off-road experiments in a high-fidelity simulator, IGHA* outperforms HA*M when both are used in the loop with a model predictive controller. We demonstrate real-time performance both in simulation and on a small-scale off-road vehicle, enabling fast, robust planning under complex dynamics.
"Incremental Generalized Hybrid A*," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1586-1593. DOI: 10.1109/LRA.2025.3643271
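For context, the grid-based dominance check that Hybrid A* uses, and whose resolution sensitivity motivates IGHA*, can be sketched as follows: continuous states are hashed into grid cells, and a new vertex is pruned when its cell already holds a cheaper cost-to-come. The resolutions and search skeleton below are illustrative assumptions, not the letter's implementation.

```python
import heapq
import math

def cell(state, xy_res=0.5, yaw_res=math.radians(15)):
    """Discretize a continuous (x, y, yaw) state into the grid cell used for pruning."""
    x, y, yaw = state
    return (int(x // xy_res), int(y // xy_res), int(yaw // yaw_res))

def hybrid_astar_like(start, successors, heuristic, is_goal, max_expansions=100000):
    """Best-first search over a tree of motion primitives with grid-based dominance:
    a new vertex is pruned if its cell already holds a cheaper cost-to-come."""
    best_g = {cell(start): 0.0}
    frontier = [(heuristic(start), 0.0, start, [start])]
    expansions = 0
    while frontier and expansions < max_expansions:
        _, g, state, path = heapq.heappop(frontier)
        expansions += 1
        if is_goal(state):
            return path, g
        for nxt, step_cost in successors(state):   # motion primitives rolled out from `state`
            g2 = g + step_cost
            c = cell(nxt)
            if g2 < best_g.get(c, float("inf")):   # dominance check per grid cell
                best_g[c] = g2
                heapq.heappush(frontier, (g2 + heuristic(nxt), g2, nxt, path + [nxt]))
    return None, float("inf")
```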
Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643293
Gihyeon Lee;Jungwoo Lee;Juwon Kim;Young-Sik Shin;Younggun Cho
Robots are often required to localize in environments with unknown object classes and semantic ambiguity. However, when performing global localization using semantic objects, high semantic ambiguity intensifies object misclassification and increases the likelihood of incorrect associations, which in turn can cause significant errors in the estimated pose. Thus, in this letter, we propose a multi-label likelihood-based semantic graph matching framework for object-level global localization. The key idea is to exploit multi-label graph representations, rather than single-label alternatives, to capture and leverage the inherent semantic context of object observations. Based on these representations, our approach enhances semantic correspondence across graphs by combining the likelihood of each node with the maximum likelihood of its neighbors via context-aware likelihood propagation. For rigorous validation, data association and pose estimation performance are evaluated under both closed-set and open-set detection configurations. In addition, we demonstrate the scalability of our approach to large-vocabulary object categories in both real-world indoor scenes and synthetic environments.
"MSG-Loc: Multi-Label Likelihood-Based Semantic Graph Matching for Object-Level Global Localization," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 2066-2073. DOI: 10.1109/LRA.2025.3643293
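A minimal reading of the context-aware likelihood propagation described above is sketched below: each object node holds a multi-label likelihood vector, and its score for each label is mixed with the maximum likelihood of that label among its scene-graph neighbors. The mixing weight and data layout are assumptions for illustration, not the letter's exact formulation.

```python
import numpy as np

def propagate_likelihoods(node_probs, adjacency, alpha=0.5):
    """Context-aware scores: mix each node's own multi-label likelihood with the
    element-wise maximum likelihood over its graph neighbors."""
    n = len(node_probs)
    out = np.array(node_probs, dtype=float)
    for i in range(n):
        nbrs = [j for j in range(n) if adjacency[i][j] and j != i]
        if nbrs:
            neighbor_max = np.max([node_probs[j] for j in nbrs], axis=0)
            out[i] = alpha * node_probs[i] + (1 - alpha) * neighbor_max
    return out

# Toy example: two objects connected in the scene graph; the ambiguous node 0
# is reinforced on the label that its neighbor observes with high confidence.
probs = np.array([[0.4, 0.4, 0.2],   # node 0: ambiguous between labels 0 and 1
                  [0.1, 0.8, 0.1]])  # node 1: confident in label 1
A = np.array([[0, 1], [1, 0]])
print(propagate_likelihoods(probs, A))
```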
Accurate extrinsic calibration of camera, radar, and LiDAR is critical for multi-modal sensor fusion in autonomous vehicles and mobile robots. Existing methods typically perform pairwise calibration and rely on specialized targets, limiting scalability and flexibility. We introduce a universal calibration framework based on an Iterative Best Match (IBM) algorithm that refines alignment by optimizing correspondences between sensors, eliminating traditional point-to-point matching. IBM naturally extends to simultaneous camera–LiDAR–radar calibration and leverages tracked natural targets (e.g., pedestrians) to establish cross-modal correspondences without predefined calibration markers. Experiments on a realistic multi-sensor platform (fisheye camera, LiDAR, and radar) and the KITTI dataset validate the accuracy, robustness, and efficiency of our method.
Sijie Hu; Alessandro Goldwurm; Martin Mujica; Sylvain Cadou; Frédéric Lerasle, "A Universal Framework for Extrinsic Calibration of Camera, Radar, and LiDAR," IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 1842-1849. Pub Date: 2025-12-11. DOI: 10.1109/LRA.2025.3643292
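To give a flavor of how tracked natural targets can yield cross-modal correspondences for extrinsic estimation, the sketch below associates detections of the same pedestrian across two sensors by nearest timestamp and solves the rigid alignment in closed form (Kabsch). This is a generic baseline sketch, not the IBM algorithm, which instead optimizes correspondences iteratively and extends to joint camera, LiDAR, and radar calibration; all names and the association threshold are assumptions.

```python
import numpy as np

def kabsch(P, Q):
    """Closed-form least-squares rigid transform (R, t) mapping points P onto Q (both N x 3)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cp).T @ (Q - cq))
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def extrinsics_from_tracks(track_a, track_b, max_dt=0.05):
    """Estimate the rigid extrinsic transform between two sensors from tracks of the same
    natural target (e.g., pedestrian centroids): associate detections by nearest timestamp,
    then align in closed form. Each track is a list of (timestamp, xyz ndarray) pairs."""
    P, Q = [], []
    times_b = np.array([t for t, _ in track_b])
    for t_a, p_a in track_a:
        j = int(np.argmin(np.abs(times_b - t_a)))
        if abs(times_b[j] - t_a) <= max_dt:
            P.append(p_a)
            Q.append(track_b[j][1])
    return kabsch(np.asarray(P), np.asarray(Q))
```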