Pub Date: 2026-02-01 | Epub Date: 2025-12-18 | DOI: 10.1109/lra.2025.3645700
Viola Del Bono, Emma Capaldi, Anushka Kelshiker, Ayhan Aktas, Hiroyuki Aihara, Sheila Russo
Soft optical sensors hold potential for enhancing minimally invasive procedures like colonoscopy, yet their complex, multi-modal responses pose significant challenges. This work introduces a machine learning (ML) framework for real-time estimation of 3D shape and contact force in a soft robotic sleeve for colonoscopy. To overcome the limitations of manual calibration and to collect large datasets for ML, we developed an automated platform for collecting data across a range of orientations, curvatures, and contact forces. A cascaded ML architecture was implemented for sequential estimation of contact force and 3D shape, achieving errors of 4.7% for curvature, 2.37% for orientation, and 5.5% for force tracking. We also explored the potential of ML for contact localization by training a model to estimate contact intensity and location across 16 indenters distributed along the sleeve. The force intensity was estimated with an error ranging from 0.06 N to 0.31 N across the indenters. Despite the proximity of the contact points, the system achieved high localization performance, with 8 indenters reaching over 80% accuracy, demonstrating promising spatial resolution.
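The cascaded structure this abstract describes, where a first model's contact-force estimate is appended to the raw sensor features before shape estimation, can be sketched as follows. All weights, the feature layout, and the linear model class are hypothetical placeholders for illustration; the paper's actual networks and optical-sensor format are not given here.

```python
# Minimal sketch of a cascaded (sequential) estimation pipeline.
# W_FORCE, W_CURV, W_ORI are placeholder "learned" weights, not real values.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Stage 1: estimate contact force from raw optical-sensor readings.
W_FORCE = [0.8, -0.2, 0.5]

# Stage 2: estimate curvature and orientation from the raw features
# *augmented with* the stage-1 force estimate -- this is the cascade.
W_CURV = [0.1, 0.3, -0.4, 0.6]
W_ORI = [-0.2, 0.5, 0.1, 0.2]

def cascaded_estimate(sensor):
    force = dot(W_FORCE, sensor)
    augmented = sensor + [force]          # stage-1 output feeds stage 2
    return force, dot(W_CURV, augmented), dot(W_ORI, augmented)

force, curvature, orientation = cascaded_estimate([1.0, 0.5, -0.3])
```

The point of the cascade is that the shape models condition on the force estimate instead of having to disentangle force and shape from the same multi-modal signal simultaneously.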
Title: Multi-modal sensing in colonoscopy: a data-driven approach
IEEE Robotics and Automation Letters, vol. 11, no. 2, pp. 2018-2025.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12811025/pdf/
Pub Date: 2026-01-28 | DOI: 10.1109/LRA.2026.3656038
Title: IEEE Robotics and Automation Society Information
IEEE Robotics and Automation Letters, vol. 11, no. 2, p. C3.
Pub Date: 2026-01-28 | DOI: 10.1109/LRA.2026.3656040
Title: IEEE Robotics and Automation Letters Information for Authors
IEEE Robotics and Automation Letters, vol. 11, no. 2, p. C4.
Machine learning for robot manipulation promises to unlock generalization to novel tasks and environments. But how should we measure the progress of these policies towards generalization? Evaluating and quantifying generalization is the Wild West of modern robotics, with each work proposing and measuring different types of generalization in their own, often difficult-to-reproduce settings. In this work, our goal is (1) to outline the forms of generalization we believe are important for robot manipulation in a comprehensive and fine-grained manner, and (2) to provide reproducible guidelines for measuring these notions of generalization. We first propose $\bigstar$-Gen, a taxonomy of generalization for robot manipulation structured around visual, semantic, and behavioral generalization. Next, we instantiate $\bigstar$-Gen with two case studies on real-world benchmarking: one based on open-source models and the Bridge V2 dataset, and another based on the bimanual ALOHA 2 platform that covers more dexterous and longer-horizon tasks. Our case studies reveal many interesting insights: for example, we observe that open-source vision-language-action models often struggle with semantic generalization, despite pre-training on internet-scale language datasets.
Title: A Taxonomy for Evaluating Generalist Robot Manipulation Policies
Authors: Jensen Gao; Suneel Belkhale; Sudeep Dasari; Ashwin Balakrishna; Dhruv Shah; Dorsa Sadigh
Pub Date: 2026-01-22 | DOI: 10.1109/LRA.2026.3656785
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3182-3189.
Pub Date: 2026-01-22 | DOI: 10.1109/LRA.2026.3656791
Yongjian Zhao;Yuyan Qi;Jiaqi Shao;Bin Sun;Min Wang;Songyi Zhong;Yang Yang
High power density and energy efficiency are critical for achieving agile locomotion and sustained operation in miniature flapping-wing robots. Here, a pneumatic linear reciprocating oscillator is developed as an actuation solution. The oscillator leverages the Bernoulli principle to establish a positive feedback mechanism through coordinated interactions among a soft membrane, a piston, and the airflow. Experimental validation demonstrates that the oscillator-based flapping-wing robot can generate a lift of 0.43 N to enable take-off and sustained flight in unstructured environments. The minimal oscillation unit exhibits maximum input and output specific power of 710.5 W/kg and 220.7 W/kg, respectively, with peak energy conversion efficiency reaching 41.9%. This design represents a paradigm shift from conventional electromechanical systems, offering two fundamental advancements: (i) simplified robotic drive architectures through an oscillator-based mechanism, and (ii) a foundation for hybrid energy systems that reduce reliance on electricity.
Title: An Energy-Efficient and Powerful Oscillator for Micro-Air Vehicles With Electronics-Free Flapping
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3174-3181.
Pub Date: 2026-01-22 | DOI: 10.1109/LRA.2026.3656784
Arjun Gupta;Rishik Sathua;Saurabh Gupta
Many everyday mobile manipulation tasks require precise interaction with small objects, such as grasping a knob to open a cabinet or pressing a light switch. In this letter, we develop Visual Servoing with Vision Models (VSVM), a closed-loop framework that enables a mobile manipulator to tackle such precise tasks involving the manipulation of small objects. VSVM uses state-of-the-art vision foundation models to generate 3D targets for visual servoing to enable diverse tasks in novel environments. Naively doing so fails because of occlusion by the end-effector. VSVM mitigates this using vision models that out-paint the end-effector, thereby significantly enhancing target localization. We demonstrate that, aided by out-painting methods, open-vocabulary object detectors can serve as a drop-in module for VSVM to seek semantic targets (e.g., knobs), and point tracking methods can help VSVM reliably pursue interaction sites indicated by user clicks. We conduct a large-scale evaluation spanning experiments in 10 novel environments across 6 buildings, including 72 different object instances. VSVM obtains a 71% zero-shot success rate on manipulating unseen objects in novel environments in the real world, outperforming an open-loop control method by an absolute 42% and an imitation learning baseline trained on 1000+ demonstrations by an absolute 50%.
Title: Precise Mobile Manipulation of Small Everyday Objects
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3214-3221.
Pub Date: 2026-01-21 | DOI: 10.1109/LRA.2026.3656725
Xuan Xiao;Kefeng Zhang;Jiaqi Zhu;Jianming Wang;Runtian Zhu
Snake robots exhibit remarkable locomotion capabilities in complex environments as their degrees of freedom (DOFs) increase, but at the cost of increased energy consumption. To address this issue, this article proposes a cooperation strategy for snake robots based on a head-tail docking mechanism, which allows multiple short snake robots to combine into a longer one, enabling the execution of complex tasks. The mechanical design and implementation of the dockable snake robots are introduced, featuring passive docking mechanisms at both the head and tail, an embedded controller and a vision camera mounted on the head, and a distributed power supply system. Furthermore, control strategies for the combined robots have been developed to perform the crawler gait and the motion of spanning between parallel pipes. Experiments are conducted to demonstrate the feasibility and performance of the proposed docking mechanism and cooperative control methods. Specifically, two snake robots can autonomously dock under visual guidance. After docking, the combined robots can rapidly traverse flat surfaces by performing the crawler gait at an average speed of 0.168 m/s. Additionally, the robots can perform spanning between parallel pipes and pipe inspection tasks concurrently by separating.
Title: Design and Validation of Docking-Based Cooperative Strategies for Snake Robots in Complex Environments
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3190-3197.
Pub Date: 2026-01-19 | DOI: 10.1109/LRA.2026.3655281
Haoyu Wang;Zhiqiang Miao;Weiwei Zhan;Xiangke Wang;Wei He;Yaonan Wang
To address the limited maneuverability and low energy efficiency of autonomous aerial vehicles (AAVs) in confined spaces, we design and implement the Hybrid Sprawl-Tuned Vehicle (HSTV), a deformable multi-modal robotic platform specifically engineered for operation in complex and spatially constrained environments. Based on the "FSTAR" platform, HSTV is equipped with passive front wheels and actively driven rear wheels. A gear transmission mechanism enables the rear wheels to be driven without dedicated motors, simplifying the system architecture. For both flying and driving modes, detailed kinematics and dynamics models, integrated with a mode-switching strategy, are constructed using the Newton-Euler method. Based on the developed models, a constrained nonlinear model predictive controller is designed to achieve accurate motion performance in both flying and driving modes. Comprehensive experimental results and comparative analysis demonstrate that HSTV achieves high trajectory-tracking accuracy across both flying and driving modes, while saving up to 70.9% energy without significantly increasing structural complexity (maintained at 98.6% simplicity).
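The receding-horizon structure behind a nonlinear model predictive controller, optimizing a cost over a prediction horizon and applying only the first input, can be illustrated with a toy example. This sketch uses a hypothetical 1-D point mass and brute-force search over constant inputs; it shows only the NMPC pattern, not HSTV's constrained multi-modal formulation.

```python
# Illustrative receding-horizon step for a 1-D point mass (toy model, not HSTV).
# Candidate inputs are held constant over the horizon for simplicity.

DT, HORIZON = 0.1, 10
U_GRID = [u / 10 for u in range(-20, 21)]  # admissible accelerations in [-2, 2]

def rollout_cost(x, v, u, target):
    """Simulate the horizon under constant input u and accumulate tracking cost."""
    cost = 0.0
    for _ in range(HORIZON):
        v += u * DT                       # toy double-integrator dynamics
        x += v * DT
        cost += (x - target) ** 2 + 0.01 * u ** 2
    return cost

def nmpc_step(x, v, target):
    """Return the first input of the lowest-cost candidate rollout."""
    return min(U_GRID, key=lambda u: rollout_cost(x, v, u, target))

u0 = nmpc_step(x=0.0, v=0.0, target=1.0)  # accelerate toward the target
```

In a real NMPC loop this optimization is re-solved at every control step from the newly measured state, which is what gives the controller its feedback character.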
Title: Design and NMPC-Based Control of a Hybrid Sprawl-Tuned Vehicle With Flying and Driving Modes
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3222-3229.
Pub Date: 2026-01-19 | DOI: 10.1109/LRA.2026.3655311
Cong Li;Qin Rao;Zheng Tian;Jun Yang
The rubber-tired container gantry crane (RTG) is a type of heavy-duty lifting equipment commonly used in container yards, which is driven by two-side rubber tires and steered via differential drive. While moving along the desired path, the RTG must remain centered in the lane with a restricted heading angle, as deviations may compromise the safety of subsequent yard operations. Due to its underactuated nature and the presence of external disturbances, achieving accurate lane-keeping poses a significant control challenge. To address this issue, a robust safety-critical steering control strategy integrating a disturbance-rejection vector field (VF) with a new state-interlocked control barrier function (SICBF) is proposed. The strategy employs a VF path-following method as the nominal controller. By strategically shrinking the safe set, the SICBF overcomes limitations of traditional CBFs, such as state coupling in the inequality verification and infeasibility when the control coefficient tends to zero. Furthermore, by incorporating a disturbance observer (DOB) into the quadratic programming (QP) framework, the robustness and safety of the control system are significantly enhanced. Comprehensive simulations and experiments are conducted on a practical RTG with a 40-ton load capacity. To the best of our knowledge, the proposed method is one of the very few that has demonstrated successful application to practical RTG systems.
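The CBF-QP pattern the abstract builds on, minimally modifying a nominal control so a barrier constraint stays satisfied, can be sketched for the scalar-input case, where the QP has a closed-form solution. This is a generic CBF safety filter for illustration, not the paper's SICBF or its DOB-augmented QP.

```python
# Hedged sketch of a generic scalar CBF safety filter (not the SICBF).
# Safe set: h(x) >= 0. Dynamics: x_dot = f(x) + g(x) * u.
# QP: min (u - u_nom)^2  s.t.  dh/dx . (f + g u) >= -alpha * h.

def cbf_filter(u_nom, h, dh_f, dh_g, alpha=1.0):
    """Return the minimally modified control satisfying the CBF constraint.

    h    : barrier value h(x)
    dh_f : dh/dx dotted with the drift f(x)
    dh_g : dh/dx dotted with the input gain g(x); assumed nonzero here
           (the infeasibility as dh_g -> 0 is exactly what the SICBF targets).
    """
    bound = (-alpha * h - dh_f) / dh_g
    if dh_g > 0:
        return max(u_nom, bound)   # constraint reads u >= bound
    return min(u_nom, bound)       # constraint reads u <= bound

# A nominal steering command that would violate safety gets clipped:
u_safe = cbf_filter(u_nom=-2.0, h=0.5, dh_f=0.0, dh_g=1.0)
```

With vector inputs the same problem is solved numerically as a QP at each control step; the closed form above only exists because the input is scalar.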
Title: Safety-Critical Steering Control for Rubber-Tired Container Gantry Cranes: A State-Interlocked CBF Approach
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3238-3245.
Voxel-based LiDAR–inertial odometry (LIO) is accurate and efficient but can suffer from geometric inconsistencies when single-Gaussian voxel models indiscriminately merge observations from conflicting viewpoints. To address this limitation, we propose Azimuth-LIO, a robust voxel-based LIO framework that leverages azimuth-aware voxelization and probabilistic fusion. Instead of using a single distribution per voxel, we discretize each voxel into azimuth-sectorized substructures, each modeled by an anisotropic 3D Gaussian to preserve viewpoint-specific spatial features and uncertainties. We further introduce a direction-weighted distribution-to-distribution registration metric to adaptively quantify the contributions of different azimuth sectors, followed by a Bayesian fusion framework that exploits these confidence weights to ensure azimuth-consistent map updates. The performance and efficiency of the proposed method are evaluated on public benchmarks including the M2DGR, MCD, and SubT-MRS datasets, demonstrating superior accuracy and robustness compared to existing voxel-based algorithms.
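The core idea of azimuth-sectorized voxels, keeping a separate distribution per viewing direction instead of one Gaussian per voxel, can be sketched as follows. The sector count and the incremental-mean bookkeeping are illustrative choices; the paper's anisotropic-Gaussian and Bayesian-fusion details are not reproduced here.

```python
import math

# Hedged sketch: per-voxel azimuth sectors with independent running statistics.

N_SECTORS = 8

def sector_index(view_dir):
    """Map a 2-D viewing direction (dx, dy) to one of N_SECTORS azimuth bins."""
    az = math.atan2(view_dir[1], view_dir[0]) % (2 * math.pi)
    return int(az / (2 * math.pi / N_SECTORS))

class SectorizedVoxel:
    """One voxel holding an independent running mean per azimuth sector."""

    def __init__(self):
        self.count = [0] * N_SECTORS
        self.mean = [[0.0, 0.0, 0.0] for _ in range(N_SECTORS)]

    def insert(self, point, view_dir):
        s = sector_index(view_dir)
        self.count[s] += 1
        n = self.count[s]
        # Incremental mean update; each sector accumulates separately, so
        # observations from conflicting viewpoints never blend into one model.
        self.mean[s] = [m + (p - m) / n for m, p in zip(self.mean[s], point)]

voxel = SectorizedVoxel()
voxel.insert((1.0, 0.0, 0.0), view_dir=(1.0, 0.1))   # seen from the front
voxel.insert((1.2, 0.0, 0.0), view_dir=(1.0, 0.2))   # same sector, fused
voxel.insert((0.8, 0.0, 0.0), view_dir=(-1.0, 0.0))  # opposite viewpoint, new sector
```

A full implementation would track per-sector covariances (the anisotropic Gaussians) and weight sectors by direction during registration; this sketch only shows why the sectorization prevents the single-Gaussian merging problem the abstract describes.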
Title: Azimuth-LIO: Robust LiDAR-Inertial Odometry via Azimuth-Aware Voxelization and Probabilistic Fusion
Authors: Zhongguan Liu; Wei Li; Honglei Che; Lu Pan; Shuaidong Yuan
Pub Date: 2026-01-19 | DOI: 10.1109/LRA.2026.3655291
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3158-3165.