Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3653300
Lequn Fu;Xiao Li;Yibin Liu;Xiangan Zeng;Yibo Peng;Youjun Xiong;Shiqi Li
Achieving natural, robust, and energy-efficient locomotion remains a central challenge for humanoid control. While imitation learning enables robots to reproduce human-like behaviors, differences in morphology, actuation, and partial observability often limit direct motion replication. This work proposes a human-inspired reinforcement learning framework that integrates both implicit and explicit guidance. Implicit human motion priors, obtained through adversarial learning, provide style alignment with human data, while explicit biomechanical rewards encode characteristic gait principles to promote symmetry, stability, and adaptability. In addition, a history-based state estimator explicitly reconstructs base velocities from partial observations, mitigating observability gaps and enhancing robustness in real-world settings. To assess human-likeness, we introduce a tri-metric evaluation protocol covering gait symmetry, human–robot similarity, and energy efficiency. Extensive experiments demonstrate that the proposed approach produces locomotion that is not only robust and transferable across diverse terrains but also energy-efficient and recognizably human-like.
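The abstract does not spell out the tri-metric protocol's formulas, so the following is a minimal sketch of how such an evaluation is commonly computed, assuming standard definitions: a Robinson-style symmetry index over left/right step features, cosine similarity between robot and human joint trajectories, and mechanical cost of transport. All function names and formula choices here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def symmetry_index(left: np.ndarray, right: np.ndarray) -> float:
    """Robinson-style symmetry index over a step feature (0 = perfectly symmetric)."""
    xl, xr = float(np.mean(left)), float(np.mean(right))
    return 100.0 * abs(xl - xr) / (0.5 * (xl + xr) + 1e-9)

def human_similarity(robot_traj: np.ndarray, human_traj: np.ndarray) -> float:
    """Cosine similarity between equally sampled robot and human joint trajectories."""
    a, b = robot_traj.ravel(), human_traj.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def cost_of_transport(tau: np.ndarray, qdot: np.ndarray, dt: float,
                      mass: float, distance: float, g: float = 9.81) -> float:
    """Mechanical cost of transport: absolute joint power integrated over the run."""
    energy = float(np.sum(np.abs(tau * qdot)) * dt)
    return energy / (mass * g * distance + 1e-9)
```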
{"title":"Human-Inspired Adaptive Gait Learning for Humanoids Locomotion","authors":"Lequn Fu;Xiao Li;Yibin Liu;Xiangan Zeng;Yibo Peng;Youjun Xiong;Shiqi Li","doi":"10.1109/LRA.2026.3653300","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653300","url":null,"abstract":"Achieving natural, robust, and energy-efficient locomotion remains a central challenge for humanoid control. While imitation learning enables robots to reproduce human-like behaviors, differences in morphology, actuation, and partial observability often limit direct motion replication. This work proposes a human-inspired reinforcement learning framework that integrates both implicit and explicit guidance. Implicit human motion priors, obtained through adversarial learning, provide style alignment with human data, while explicit biomechanical rewards encode characteristic gait principles to promote symmetry, stability, and adaptability. In addition, a history-based state estimator explicitly reconstructs base velocities from partial observations, mitigating observability gaps and enhancing robustness in real-world settings. To assess human-likeness, we introduce a tri-metric evaluation protocol covering gait symmetry, human–robot similarity, and energy efficiency. Extensive experiments demonstrate that the proposed approach produces locomotion that is not only robust and transferable across diverse terrains but also energy-efficient and recognizably human-like.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2458-2465"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146001865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vision-Language-Action (VLA) models excel at robotic tasks by leveraging large-scale 2D vision-language pretraining, but their reliance on RGB images limits spatial reasoning critical for real-world interaction. Retraining these models with 3D data is computationally prohibitive, while discarding existing 2D datasets wastes valuable resources. To bridge this gap, we propose PointVLA, a framework that enhances pre-trained VLAs with point cloud inputs without requiring retraining. Our method freezes the vanilla action expert and injects 3D features via a lightweight modular block. To identify the most effective way of integrating point cloud representations, we conduct a skip-block analysis to pinpoint less useful blocks in the vanilla action expert, ensuring that 3D features are injected only into these blocks—minimizing disruption to pre-trained representations. Extensive experiments demonstrate that PointVLA outperforms state-of-the-art 2D imitation learning methods, such as OpenVLA, Diffusion Policy and DexVLA, across both simulated and real-world robotic tasks. Specifically, we highlight several key advantages of PointVLA enabled by point cloud integration: (1) Few-shot multi-tasking, where PointVLA successfully performs four different tasks using only 20 demonstrations each; (2) Real-vs-photo discrimination, where PointVLA distinguishes real objects from their images, leveraging 3D world knowledge to improve safety and reliability; (3) Height adaptability, where unlike conventional 2D imitation learning methods, PointVLA enables robots to adapt to objects at varying table heights that were unseen in training data. Furthermore, PointVLA achieves strong performance in long-horizon tasks, such as picking and packing objects from a moving conveyor belt, showcasing its ability to generalize across complex, dynamic environments.
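As a concrete illustration of the injection scheme described above (frozen action expert, lightweight modular block, injection only at blocks flagged by the skip-block analysis), here is a hedged PyTorch sketch. The module names (`Inject3D`, `InjectedExpert`), the adapter architecture, and the residual-addition fusion rule are assumptions for illustration; the released PointVLA code may differ.

```python
import torch
import torch.nn as nn

class Inject3D(nn.Module):
    """Lightweight adapter mapping a point-cloud feature into the block width."""
    def __init__(self, pc_dim: int, d_model: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(pc_dim, d_model), nn.GELU(),
                                  nn.Linear(d_model, d_model))

    def forward(self, h: torch.Tensor, pc_feat: torch.Tensor) -> torch.Tensor:
        # h: (batch, tokens, d_model); pc_feat: (batch, pc_dim)
        return h + self.proj(pc_feat).unsqueeze(1)   # broadcast over tokens

class InjectedExpert(nn.Module):
    """Frozen action-expert blocks with adapters only at the chosen indices."""
    def __init__(self, blocks: nn.ModuleList, inject_at: set,
                 pc_dim: int, d_model: int):
        super().__init__()
        self.blocks = blocks
        for p in self.blocks.parameters():       # keep the pre-trained expert intact
            p.requires_grad_(False)
        self.adapters = nn.ModuleDict(
            {str(i): Inject3D(pc_dim, d_model) for i in inject_at})

    def forward(self, h: torch.Tensor, pc_feat: torch.Tensor) -> torch.Tensor:
        for i, blk in enumerate(self.blocks):
            if str(i) in self.adapters:          # inject only into "less useful" blocks
                h = self.adapters[str(i)](h, pc_feat)
            h = blk(h)
        return h
```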
{"title":"PointVLA: Injecting the 3D World Into Vision-Language-Action Models","authors":"Chengmeng Li;Junjie Wen;Yaxin Peng;Yan Peng;Yichen Zhu","doi":"10.1109/LRA.2026.3653303","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653303","url":null,"abstract":"Vision-Language-Action (VLA) models excel at robotic tasks by leveraging large-scale 2D vision-language pretraining, but their reliance on RGB images limits spatial reasoning critical for real-world interaction. Retraining these models with 3D data is computationally prohibitive, while discarding existing 2D datasets wastes valuable resources. To bridge this gap, we propose PointVLA, a framework that enhances pre-trained VLAs with point cloud inputs without requiring retraining. Our method freezes the vanilla action expert and injects 3D features via a lightweight modular block. To identify the most effective way of integrating point cloud representations, we conduct a skip-block analysis to pinpoint less useful blocks in the vanilla action expert, ensuring that 3D features are injected only into these blocks—minimizing disruption to pre-trained representations. Extensive experiments demonstrate that PointVLA outperforms state-of-the-art 2D imitation learning methods, such as OpenVLA, Diffusion Policy and DexVLA, across both simulated and real-world robotic tasks. Specifically, we highlight several key advantages of PointVLA enabled by point cloud integration: (1) <bold>Few-shot multi-tasking</b>, where PointVLA successfully performs four different tasks using only 20 demonstrations each; (2) <bold>Real-vs-photo discrimination</b>, where PointVLA distinguishes real objects from their images, leveraging 3D world knowledge to improve safety and reliability; (3) <bold>Height adaptability</b>, where unlike conventional 2D imitation learning methods, PointVLA enables robots to adapt to objects at varying table heights that were unseen in training data. Furthermore, PointVLA achieves strong performance in long-horizon tasks, such as picking and packing objects from a moving conveyor belt, showcasing its ability to generalize across complex, dynamic environments.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2506-2513"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146001872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3653287
Yizhi Zhou;Yufan Liu;Xuan Wang
This letter studies the problem of Cooperative Localization (CL) for multi-robot systems in 3-D environments, where a group of mobile robots jointly localize themselves by using measurements from onboard sensors and shared information from other robots. To ensure the efficiency of information fusion and observability consistency in a distributed CL system, we propose a distributed multi-robot CL method based on Lie groups, well-suited for 3-D scenarios with full 3-D rotational dynamics and generic nonlinear inter-robot measurement models. Unlike most existing distributed CL algorithms that operate in vector space and are only applicable to simple 2-D environments, the proposed algorithm performs distributed information fusion directly on the manifold, which inherently accounts for the non-Euclidean nature of 3-D rotations and translations. By leveraging the invariance properties of invariant errors, we analytically prove that the proposed algorithm naturally preserves the observability consistency of the CL system. This ensures that the system maintains the correct structure of unobservable directions throughout the estimation process. The effectiveness of the proposed algorithm is validated by several numerical experiments that rigorously investigate the effects of relative information fusion in the distributed CL system.
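To make the invariant-error idea concrete, the following numpy sketch builds SE(3) poses as 4x4 matrices and demonstrates the property the consistency argument relies on: a right-invariant error is unchanged when both the estimate and the ground truth are right-multiplied by a common group element (an unobservable global transform). This is a generic Lie-group utility under that standard definition, not the authors' distributed filter.

```python
import numpy as np

def hat(w: np.ndarray) -> np.ndarray:
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w: np.ndarray) -> np.ndarray:
    """Rodrigues formula: rotation vector -> rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-9:
        return np.eye(3) + hat(w)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * K @ K

def se3(R: np.ndarray, p: np.ndarray) -> np.ndarray:
    T = np.eye(4); T[:3, :3] = R; T[:3, 3] = p
    return T

def right_invariant_error(X_est: np.ndarray, X_true: np.ndarray) -> np.ndarray:
    """eta = X_est * inv(X_true); identity exactly when the estimate is correct."""
    return X_est @ np.linalg.inv(X_true)

# Invariance check: a common right-multiplied transform G cancels out of eta.
X_true = se3(exp_so3(np.array([0.1, 0.2, 0.3])), np.array([1.0, 0.0, 0.5]))
X_est = X_true @ se3(exp_so3(np.array([0.01, 0.0, 0.0])), np.zeros(3))
G = se3(exp_so3(np.array([0.0, 0.0, 1.0])), np.array([5.0, -2.0, 0.0]))
assert np.allclose(right_invariant_error(X_est, X_true),
                   right_invariant_error(X_est @ G, X_true @ G))
```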
{"title":"Distributed 3-D Multi-Robot Cooperative Localization: An Efficient and Consistent Approach","authors":"Yizhi Zhou;Yufan Liu;Xuan Wang","doi":"10.1109/LRA.2026.3653287","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653287","url":null,"abstract":"This letter studies the problem of Cooperative Localization (CL) for multi-robot systems in 3-D environments, where a group of mobile robots jointly localize themselves by using measurements from onboard sensors and shared information from other robots. To ensure the efficiency of information fusion and observability consistency in a distributed CL system, we propose a distributed multi-robot CL method based on Lie groups, well-suited for 3-D scenarios with full 3-D rotational dynamics and generic nonlinear inter-robot measurement models. Unlike most existing distributed CL algorithms that operate in vector space and are only applicable to simple 2-D environments, the proposed algorithm performs distributed information fusion directly on the manifold that inherently accounts for the non-Euclidean nature of 3-D rotations and translations. By leveraging the nice property of invariant errors, we analytically prove that the proposed algorithm naturally preserves the observability consistency of the CL system. This ensures that the system maintains the correct structure of unobservable directions throughout the estimation process. The effectiveness of the proposed algorithm is validated by several numerical experiments conducted to rigorously investigate the effects of relative information fusion in the distributed CL system.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"2306-2313"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although fully autonomous systems still face challenges due to patients' anatomical variability, teleoperated systems appear more practical in current healthcare settings. This paper presents an anatomy-aware control framework for teleoperated lung ultrasound. Leveraging biomechanically accurate 3D modelling, the system applies virtual constraints on the ultrasound probe pose and provides real-time visual feedback to assist in precise probe placement tasks. A twofold evaluation, one with 5 naïve operators on a single volunteer and one with a single experienced operator on 6 volunteers, compared our method with a standard teleoperation baseline. The first evaluation characterised the accuracy of the anatomical model and the improvement in performance perceived by the naïve operators, while the second focused on the system's efficiency in improving probe placement and reducing procedure time compared with traditional teleoperation. The results demonstrate that the proposed framework enhances the physician's ability to execute remote lung ultrasound, reducing execution time by more than 20% on 4-point acquisitions and moving towards faster, more objective, and repeatable exams.
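As an illustration of the virtual-constraint idea, the sketch below clamps the commanded probe position to an allowed region and renders a spring-like guidance force back toward it. The axis-aligned region and the stiffness value are hypothetical stand-ins for the paper's biomechanical 3D model and constraint geometry.

```python
import numpy as np

def project_to_region(p_cmd: np.ndarray, lo: np.ndarray, hi: np.ndarray) -> np.ndarray:
    """Project the commanded probe position onto the allowed region."""
    return np.clip(p_cmd, lo, hi)

def guidance_force(p_cmd: np.ndarray, lo: np.ndarray, hi: np.ndarray,
                   k: float = 300.0) -> np.ndarray:
    """Spring-like virtual-fixture force pulling the probe back inside the region."""
    return k * (project_to_region(p_cmd, lo, hi) - p_cmd)

# Example: a command 2 cm outside the region produces a restoring force.
lo, hi = np.array([-0.05, -0.05, 0.0]), np.array([0.05, 0.05, 0.02])
print(guidance_force(np.array([0.07, 0.0, 0.01]), lo, hi))  # -> [-6, 0, 0] N
```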
{"title":"An Anatomy-Aware Shared Control Approach for Assisted Teleoperation of Lung Ultrasound Examinations","authors":"Davide Nardi;Edoardo Lamon;Daniele Fontanelli;Matteo Saveriano;Luigi Palopoli","doi":"10.1109/LRA.2026.3653292","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653292","url":null,"abstract":"Although fully autonomous systems still face challenges due to patients' anatomical variability, teleoperated systems appear to be more practical in current healthcare settings. This paper presents an anatomy-aware control framework for teleoperated lung ultrasound. Leveraging biomechanically accurate 3D modelling, the system applies virtual constraints on the ultrasound probe pose and provides real-time visual feedback to assist in precise probe placement tasks. A twofold evaluation, one with 5 naïve operators on a single volunteer and the second with a single experienced operator on 6 volunteers, compared our method with a standard teleoperation baseline. The results of the first one characterised the accuracy of the anatomical model and the improved perceived performance by the naïve operators, while the second one focused on the efficiency of the system in improving probe placement and reducing procedure time compared to traditional teleoperation. The results demonstrate that the proposed framework enhances the physician's capabilities in executing remote lung ultrasound, reducing more than 20% of execution time on 4-point acquisitions, towards faster, more objective and repeatable exams.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2570-2577"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11346947","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3653299
Saekwang Nam;Bowen Deng;Loong Yi Lee;Jonathan M. Rossiter;Nathan F. Lepora
We present a tactile-sensorized Fin-Ray finger that enables simultaneous detection of contact location and indentation depth through an indirect sensing approach. A hinge mechanism is integrated between the soft Fin-Ray structure and a rigid sensing module, allowing deformation and translation information to be transferred to a bottom crossbeam upon which is an array of marker-tipped pins based on the biomimetic structure of the TacTip vision-based tactile sensor. Deformation patterns captured by an internal camera are processed using a convolutional neural network to infer contact conditions without directly sensing the finger surface. The finger design was optimized by varying pin configurations and hinge orientations, achieving 0.1 mm depth and 2 mm location-sensing accuracies. The perception generalized robustly to various indenter shapes and sizes and was applied to a pick-and-place task under uncertain picking positions, where the tactile feedback significantly improved placement accuracy. Overall, this work provides a lightweight, flexible, and scalable tactile sensing solution suitable for soft robotic structures where the sensing must be situated away from the contact interface.
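A hedged sketch of the perception stage described above: a small PyTorch CNN regressing contact location and indentation depth from the internal marker image. The layer sizes, input format, and the 3-value output head are illustrative assumptions; the abstract does not specify the network architecture.

```python
import torch
import torch.nn as nn

class ContactNet(nn.Module):
    """Regress (x, y) contact location and indentation depth from a marker image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(64, 3)          # (x_mm, y_mm, depth_mm)

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        # img: (batch, 1, H, W) grayscale view of the marker-tipped pins
        return self.head(self.features(img).flatten(1))

pred = ContactNet()(torch.randn(4, 1, 128, 128))   # -> shape (4, 3)
```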
{"title":"TacFinRay: Soft Tactile Fin-Ray Finger With Indirect Tactile Sensing for Robust Grasping","authors":"Saekwang Nam;Bowen Deng;Loong Yi Lee;Jonathan M. Rossiter;Nathan F. Lepora","doi":"10.1109/LRA.2026.3653299","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653299","url":null,"abstract":"We present a tactile-sensorized Fin-Ray finger that enables simultaneous detection of contact location and indentation depth through an indirect sensing approach. A hinge mechanism is integrated between the soft Fin-Ray structure and a rigid sensing module, allowing deformation and translation information to be transferred to a bottom crossbeam upon which are an array of marker-tipped pins based on the biomimetic structure of the TacTip vision-based tactile sensor. Deformation patterns captured by an internal camera are processed using a convolutional neural network to infer contact conditions without directly sensing the finger surface. The finger design was optimized by varying pin configurations and hinge orientations, achieving 0.1 mm depth and 2 mm location-sensing accuracies. The perception demonstrated robust generalization to various indenter shapes and sizes, which was applied to a pick-and-place task under uncertain picking positions, where the tactile feedback significantly improved placement accuracy. Overall, this work provides a lightweight, flexible, and scalable tactile sensing solution suitable for soft robotic structures where the sensing needs situating away from the contact interface.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2722-2729"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3653394
Jonghyeok Kim;Wan Kyun Chung
Among the many choices in the matrix-vector factorization of the Coriolis and centripetal terms satisfying the skew-symmetry condition in system dynamics, a unique factorization, called the Christoffel-consistent (CC) factorization, has been proposed. We derive the unique CC factorization in the Lie group context and examine the impact of Christoffel inconsistency in the Coriolis matrix factorization on the dynamic behavior of robot systems during both free motion and interaction with humans, particularly in the context of passivity-based controllers and augmented PD controllers. Specifically, the question is: what are the advantages of using the CC factorization, and what is the effect of a non-CC factorization on the robot's dynamic behavior, which has rarely been explored? We show that Christoffel inconsistency generates unwanted torsion, causing the system to deviate from the desired trajectory, and this results in undesirable dynamic behavior when controlling the system, especially when the dynamics of the robot are described by twists and wrenches. Through simulation and a real-world robot experiment, this phenomenon is verified for the first time.
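The CC factorization itself is standard enough to sketch numerically: building C from Christoffel symbols of the first kind makes Mdot - 2C skew-symmetric, which is the property at stake above. The sketch below uses central differences of a user-supplied inertia matrix function M(q) for clarity; it is a generic vector-space illustration, not the paper's Lie group derivation.

```python
import numpy as np

def coriolis_cc(M, q: np.ndarray, qd: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """C(q, qd) from Christoffel symbols of the first kind (the CC choice)."""
    n = len(q)
    dM = np.zeros((n, n, n))                  # dM[i, j, k] = dM_ij / dq_k
    for k in range(n):
        dq = np.zeros(n); dq[k] = eps
        dM[:, :, k] = (M(q + dq) - M(q - dq)) / (2.0 * eps)
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            C[i, j] = sum(0.5 * (dM[i, j, k] + dM[i, k, j] - dM[j, k, i]) * qd[k]
                          for k in range(n))
    return C

def skew_defect(M, q: np.ndarray, qd: np.ndarray, eps: float = 1e-6) -> float:
    """Norm of the symmetric part of (Mdot - 2C); ~0 for the CC factorization."""
    n = len(q)
    Md = np.zeros((n, n))
    for k in range(n):
        dq = np.zeros(n); dq[k] = eps
        Md += (M(q + dq) - M(q - dq)) / (2.0 * eps) * qd[k]
    S = Md - 2.0 * coriolis_cc(M, q, qd, eps)
    return float(np.linalg.norm(S + S.T))
```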
{"title":"Christoffel-Consistent Coriolis Factorization and Its Effect on the Control of a Robot","authors":"Jonghyeok Kim;Wan Kyun Chung","doi":"10.1109/LRA.2026.3653394","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653394","url":null,"abstract":"Among the many choices in the matrix-vector factorization of the Coriolis and centripetal terms satisfying the skew-symmetry condition in system dynamics, the unique factorization, called Christoffel-consistent (CC) factorization, has been proposed. We derived the unique CC factorization in the Lie group context and examined the impact of Christoffel inconsistency in Coriolis matrix factorization on the dynamic behavior of robot systems during both free motion and interaction with humans, particularly in the context of passivity-based controllers and augmented PD controllers. Specifically, the question is: What are the advantages of using the CC factorization, and what is the effect of non-CC factorization on the robot’s dynamic behavior, which has been rarely explored? We showed that Christoffel inconsistency generates unwanted torsion, causing the system to deviate from the desired trajectory, and this results in undesirable dynamic behavior when controlling the system, especially when the dynamics of the robot is described by twist and wrench. Through simulation and a real-world robot experiment, this phenomenon is verified for the first time.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2682-2689"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3652069
Jianxi Zhang;Jingtian Zhang;Hong Zeng;Dapeng Chen;Huijun Li;Aiguo Song
Foreign objects on utility poles may damage power lines and cause significant disruptions in electricity supply. A widely used approach to address this issue is for qualified personnel to climb the pole and remove the foreign objects in a timely manner using an insulating tube. However, prolonged overhead manipulation of the insulating tube in this constrained environment not only leads to considerable upper-limb fatigue but also makes accurate tube positioning increasingly challenging. To address these challenges, wearable robotic limbs with an active control strategy have the potential to effectively reduce upper-limb fatigue and assist in tube positioning. This work presents supernumerary robotic limbs (SRLs) designed to assist electrical workers in a simulated overhead foreign-object removal task. We further propose a shared control method based on finite-horizon non-zero-sum game theory. This method models the cooperation between the SRL and the worker to adaptively modulate the input of the SRL, thereby providing rapid and accurate assistance in tube positioning. Experimental results show that the proposed SRL can reduce primary upper-limb muscle activity (deltoid, biceps brachii, brachioradialis, and flexor carpi radialis) by up to 59.73% compared with performing the task without the SRL. Moreover, compared with a method that ignores human input, the proposed control strategy achieves more accurate positioning during human–SRL cooperation. These results demonstrate the potential of both the SRL and the control strategy for the live-line overhead foreign-object removal task.
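For readers unfamiliar with finite-horizon non-zero-sum games, the sketch below computes feedback Nash gains for a two-player linear-quadratic game by backward recursion, which is the standard machinery such a shared controller can be built on. The dynamics, cost weights, and the fixed-point solution of the coupled gain equations are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def feedback_nash(A, B1, B2, Q1, Q2, R1, R2, T: int, iters: int = 50):
    """Backward recursion for feedback Nash gains of a 2-player LQ game."""
    n = A.shape[0]
    P1, P2 = Q1.copy(), Q2.copy()             # terminal value matrices
    gains = []
    for _ in range(T):
        K1 = np.zeros((B1.shape[1], n))
        K2 = np.zeros((B2.shape[1], n))
        for _ in range(iters):                 # coupled gain equations, fixed point
            K1 = np.linalg.solve(R1 + B1.T @ P1 @ B1, B1.T @ P1 @ (A - B2 @ K2))
            K2 = np.linalg.solve(R2 + B2.T @ P2 @ B2, B2.T @ P2 @ (A - B1 @ K1))
        Acl = A - B1 @ K1 - B2 @ K2            # closed loop under both players
        P1 = Q1 + K1.T @ R1 @ K1 + Acl.T @ P1 @ Acl
        P2 = Q2 + K2.T @ R2 @ K2 + Acl.T @ P2 @ Acl
        gains.append((K1, K2))
    return gains[::-1]                         # gains[t] = (K1_t, K2_t); u_i = -K_i x
```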
{"title":"Development and Control of Supernumerary Robotic Limbs for Overhead Tube Manipulation Task","authors":"Jianxi Zhang;Jingtian Zhang;Hong Zeng;Dapeng Chen;Huijun Li;Aiguo Song","doi":"10.1109/LRA.2026.3652069","DOIUrl":"https://doi.org/10.1109/LRA.2026.3652069","url":null,"abstract":"The foreign objects on utility poles may damage power lines and cause significant disruptions in electricity supply. A widely used approach to address this issue is for qualified personnel to climb on the pole and remove the foreign objects in a timely manner using an insulating tube. However, prolonged overhead manipulation of the insulating tube in the constrained environment not only leads to considerable upper-limb fatigue but also makes accurate tube positioning increasingly challenging. To address these challenges, wearable robotic limbs with an active control strategy have the potential to effectively reduce upper-limb fatigue and assist in tube positioning. This work presents supernumerary robotic limbs (SRLs) designed to assist electrical workers in a simulated overhead foreign objects removal task. We further propose a shared control method based on finite-horizon non-zero-sum game theory. This method models the cooperation between the SRL and the worker to adaptively modulate the input of the SRL, thereby providing rapid and accurate assistance in tube positioning. Experimental results show that the proposed SRL can reduce primary upper-limb muscle activity (deltoid, biceps brachii, brachioradialis and flexor carpi radialis) by up to 59.73% compared with performing the task without the SRL. Moreover, compared with a method that ignores human input, the proposed control strategy achieves more accurate positioning during human-SRLs cooperation. These results demonstrate the potential of both the SRL and the control strategy for the live-line overhead foreign objects removal task.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2634-2641"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3653290
Yuchen Weng;Nuo Li;Peng Yu;Qi Wang;Yongqiang Qi;Shaoze You;Jun Wang
Neural Radiance Fields (NeRF) have significantly advanced photorealistic novel view synthesis. Recently, 3D Gaussian Splatting has emerged as a promising technique with faster training and rendering speeds. However, both methods rely heavily on clear images and precise camera poses, limiting performance under motion blur. To address this, we introduce Event-Informed 3D Deblur Reconstruction with Gaussian Splatting (EiGS), a novel approach leveraging event camera data to enhance 3D Gaussian Splatting, improving sharpness and clarity in scenes affected by motion blur. Our method employs an Adaptive Deviation Estimator to learn Gaussian center shifts as the inverse of complex camera jitter, enabling simulation of motion blur during training. A motion consistency loss ensures global coherence in Gaussian displacements, while Blurriness and Event Integration Losses guide the model toward precise 3D representations. Extensive experiments demonstrate superior sharpness and real-time rendering capabilities compared to existing methods, with ablation studies validating the effectiveness of our components in robust, high-quality reconstruction for complex static scenes.
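The blur-formation idea above can be sketched compactly: synthesize a blurred frame by averaging sharp renders along a learned intra-exposure deviation sequence, then match it to the captured blurry image while keeping neighbouring deviations smooth (a stand-in for the motion consistency loss). `render` is a placeholder for any differentiable renderer; EiGS itself learns per-Gaussian center shifts, so this pose-level version is a simplified assumption.

```python
import torch

def simulate_blur(render, pose: torch.Tensor, deviations) -> torch.Tensor:
    """Approximate a blurry frame as the mean of renders along the jitter path."""
    return torch.stack([render(pose + d) for d in deviations], dim=0).mean(dim=0)

def blur_loss(render, pose, deviations, blurry_img, smooth_w: float = 0.01):
    pred = simulate_blur(render, pose, deviations)
    photometric = (pred - blurry_img).abs().mean()       # match the captured blur
    smooth = sum((deviations[i + 1] - deviations[i]).pow(2).mean()
                 for i in range(len(deviations) - 1))    # consistency along the path
    return photometric + smooth_w * smooth
```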
{"title":"EiGS: Event-Informed 3D Deblur Reconstruction With Gaussian Splatting","authors":"Yuchen Weng;Nuo Li;Peng Yu;Qi Wang;Yongqiang Qi;Shaoze You;Jun Wang","doi":"10.1109/LRA.2026.3653290","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653290","url":null,"abstract":"Neural Radiance Fields (NeRF) have significantly advanced photorealistic novel view synthesis. Recently, 3D Gaussian Splatting has emerged as a promising technique with faster training and rendering speeds. However, both methods rely heavily on clear images and precise camera poses, limiting performance under motion blur. To address this, we introduce Event-Informed 3D Deblur Reconstruction with Gaussian Splatting(EiGS), a novel approach leveraging event camera data to enhance 3D Gaussian Splatting, improving sharpness and clarity in scenes affected by motion blur. Our method employs an Adaptive Deviation Estimator to learn Gaussian center shifts as the inverse of complex camera jitter, enabling simulation of motion blur during training. A motion consistency loss ensures global coherence in Gaussian displacements, while Blurriness and Event Integration Losses guide the model toward precise 3D representations. Extensive experiments demonstrate superior sharpness and real-time rendering capabilities compared to existing methods, with ablation studies validating the effectiveness of our components in robust, high-quality reconstruction for complex static scenes.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2474-2481"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146001871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3653282
Sangmin Lee;Donghyun Choi;Jee-Hwan Ryu
Accurate global localization remains a fundamental challenge in autonomous vehicle navigation. Traditional methods typically rely on high-definition (HD) maps generated through prior traverses or utilize auxiliary sensors, such as a global positioning system (GPS). However, these approaches are often limited by high costs, scalability issues, and decreased reliability where GPS is unavailable. Moreover, prior methods require route-specific sensor calibration and impose modality-specific constraints, which restrict generalization across different sensor types. The proposed framework addresses these limitations by leveraging a shared embedding space, learned via a weight-sharing Vision Transformer (ViT) encoder, that aligns heterogeneous sensor modalities: Light Detection and Ranging (LiDAR) images and geo-tagged StreetView panoramas. This alignment enables reliable cross-modal retrieval and coarse-level localization without HD-map priors or route-specific calibration. Further, to address the heading inconsistency between the query LiDAR and StreetView, an equirectangular perspective-n-point (PnP) solver is proposed to refine the relative pose through patch-level feature correspondences. As a result, the framework achieves coarse 3-degree-of-freedom (DoF) localization from a single LiDAR scan and publicly available StreetView imagery, bridging the gap between place recognition and metric localization. Experiments demonstrate that the proposed method achieves high recall and heading accuracy, offering scalability in urban settings covered by public StreetView without reliance on HD maps.
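The retrieval stage described above reduces to nearest-neighbour search in the shared embedding space. The following numpy sketch assumes both the query LiDAR image and the geo-tagged panoramas have already been encoded by the shared ViT (not reproduced here); the function names and the cosine-similarity ranking are illustrative.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-9)

def coarse_localize(query_emb: np.ndarray, db_embs: np.ndarray,
                    db_positions: np.ndarray, top_k: int = 5):
    """Rank geo-tagged panorama embeddings by cosine similarity to the query."""
    sims = l2_normalize(db_embs) @ l2_normalize(query_emb)
    idx = np.argsort(-sims)[:top_k]
    return db_positions[idx], sims[idx]        # candidate locations and scores
```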
{"title":"LSV-Loc: LiDAR to StreetView Image Cross-Modal Localization","authors":"Sangmin Lee;Donghyun Choi;Jee-Hwan Ryu","doi":"10.1109/LRA.2026.3653282","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653282","url":null,"abstract":"Accurate global localization remains a fundamental challenge in autonomous vehicle navigation. Traditional methods typically rely on high-definition (HD) maps generated through prior traverses or utilize auxiliary sensors, such as a global positioning system (GPS). However, the above approaches are often limited by high costs, scalability issues, and decreased reliability where GPS is unavailable. Moreover, prior methods require route-specific sensor calibration and impose modality-specific constraints, which restrict generalization across different sensor types. The proposed framework addresses this limitation by leveraging a shared embedding space, learned via a weight-sharing Vision Transformer (ViT) encoder, that aligns heterogeneous sensor modalities, Light Detection and Ranging (LiDAR) images, and geo-tagged StreetView panoramas. The proposed alignment enables reliable cross-modal retrieval and coarse-level localization without HD-map priors or route-specific calibration. Further, to address the heading inconsistency between query LiDAR and StreetView, an equirectangular perspective-n-point (PnP) solver is proposed to refine the relative pose through patch-level feature correspondences. As a result, the framework achieves coarse 3-degree-of-freedom (DoF) localization from a single LiDAR scan and publicly available StreetView imagery, bridging the gap between place recognition and metric localization. Experiments demonstrate that the proposed method achieves high recall and heading accuracy, offering scalability in urban settings covered by public Street View without reliance on HD maps.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"2514-2521"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146001877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-12 | DOI: 10.1109/LRA.2026.3653387
Yifei Liu;Kefei Wen
We present a Lie group implicit formulation for kinematically redundant parallel manipulators that yields left-trivialized extended Jacobians for the extended task variable $x=(g,\rho)\in \text{SE}(3)\times \mathcal{R}$. On top of this model we design a gradient-based redundancy flow on the redundancy manifold that empirically maintains a positive manipulability margin along prescribed $\text{SE}(3)$ trajectories. The framework uses right-multiplicative state updates, remains compatible with automatic differentiation, and avoids mechanism-specific analytic Jacobians; it works with either direct inverse kinematics or a numeric solver. A specialization to $\text{SO}(2)^{3}$ provides computation-friendly first- and second-order steps. We validate the approach on two representative mechanisms: a (6+3)-degree-of-freedom (DoF) Stewart platform and a Spherical–Revolute platform. Across dense-coverage orientation trajectories and interactive gamepad commands, the extended Jacobian remained well conditioned while the redundancy planner ran at approximately 2 kHz in software-in-the-loop on a laptop-class CPU. The method integrates cleanly with existing kinematic stacks and is suitable for real-time deployment.
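A minimal numerical illustration of the redundancy flow described above: ascend the manipulability measure $w=\sqrt{\det(J J^{T})}$ with respect to the redundant coordinates while the main solver holds the $\text{SE}(3)$ task. The `jac(q, rho)` callable stands in for the paper's left-trivialized extended Jacobian, which is mechanism-specific; the finite-difference gradient and the gain are illustrative assumptions.

```python
import numpy as np

def manipulability(J: np.ndarray) -> float:
    """Yoshikawa measure w = sqrt(det(J J^T))."""
    return float(np.sqrt(max(np.linalg.det(J @ J.T), 0.0)))

def redundancy_step(jac, q: np.ndarray, rho: np.ndarray,
                    gain: float = 0.1, eps: float = 1e-5) -> np.ndarray:
    """One gradient-ascent step on w along the redundant coordinates rho."""
    grad = np.zeros_like(rho)
    for i in range(len(rho)):
        d = np.zeros_like(rho); d[i] = eps
        grad[i] = (manipulability(jac(q, rho + d)) -
                   manipulability(jac(q, rho - d))) / (2.0 * eps)
    return rho + gain * grad                   # keep the manipulability margin positive
```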
{"title":"Lie Group Implicit Kinematics for Redundant Parallel Manipulators: Left-Trivialized Extended Jacobians and Gradient-Based Online Redundancy Flows for Singularity Avoidance","authors":"Yifei Liu;Kefei Wen","doi":"10.1109/LRA.2026.3653387","DOIUrl":"https://doi.org/10.1109/LRA.2026.3653387","url":null,"abstract":"We present a Lie group implicit formulation for kinematically redundant parallel manipulators that yields left-trivialized extended Jacobians for the extended task variable <inline-formula><tex-math>$x=(g,rho)in text{SE}(3)times mathcal {R}$</tex-math></inline-formula>. On top of this model we design a gradient-based redundancy flow on the redundancy manifold that empirically maintains a positive manipulability margin along prescribed <inline-formula><tex-math>$text{SE}(3)$</tex-math></inline-formula> trajectories. The framework uses right-multiplicative state updates, remains compatible with automatic differentiation, and avoids mechanism-specific analytic Jacobians; it works with either direct inverse kinematics or a numeric solver. A specialization to <inline-formula><tex-math>$text{SO}(2)^{3}$</tex-math></inline-formula> provides computation-friendly first- and second-order steps. We validate the approach on two representative mechanisms: a (6+3)-degree-of-freedom (DoF) Stewart platform and a Spherical–Revolute platform. Across dense-coverage orientation trajectories and interactive gamepad commands, the extended Jacobian remained well conditioned while the redundancy planner ran at approximately 2 kHz in software-in-the-loop on a laptop-class CPU. The method integrates cleanly with existing kinematic stacks and is suitable for real-time deployment.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 2","pages":"2322-2329"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}