
Latest Publications in IEEE Robotics and Automation Letters

Event-Fused Hybrid ANN-SNN Architecture for Low-Latency Object Detection in Automotive Vision
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-09 | DOI: 10.1109/LRA.2026.3662637
Chengjun Zhang;Yuhao Zhang;Jisong Yu;Jie Yang;Mohamad Sawan
In advanced driver-assistance systems, current computer vision algorithms predominantly rely on frame-based RGB cameras, which suffer from high latency in high-speed or sudden-scenario applications due to fixed frame rates. In response to this challenge, event-based cameras have gained attention as a viable substitute, providing markedly higher temporal resolution and greatly diminished latency. However, the asynchronous and sparse nature of event data poses challenges in achieving accuracy comparable to frame-based algorithms. Leveraging the event-driven nature of Spiking Neural Networks (SNNs), we propose an Event-Fused Hybrid (EFH) architecture for automotive vision. EFH combines Artificial Neural Networks (ANNs) for static feature extraction from RGB frames with SNNs that dynamically update these features using event streams. This approach enables high-efficiency, high-frame-rate object detection with minimal latency. Our method achieves state-of-the-art performance in inter-frame object detection by effectively fusing event data, while the SNN branch significantly reduces power consumption during event-stream processing. Furthermore, we deploy the system on a vehicle platform, achieving real-time object detection at 60 FPS using a 15-FPS RGB camera paired with an event camera.
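The EFH implementation itself is not reproduced here, but its core idea (static ANN features from sparse RGB key frames, refined between frames by an event-driven spiking branch) can be caricatured with a leaky integrate-and-fire (LIF) update. This is a minimal sketch under stated assumptions; the function names, the additive-residual fusion, and all constants are illustrative, not the authors' implementation:

```python
import numpy as np

def lif_update(membrane, events, decay=0.9, threshold=1.0):
    """One leaky integrate-and-fire step: leak, integrate event input,
    spike where the threshold is crossed, then hard-reset spiking units."""
    membrane = decay * membrane + events
    spikes = (membrane >= threshold).astype(float)
    membrane = np.where(spikes > 0, 0.0, membrane)
    return membrane, spikes

def fuse_features(ann_features, event_bins, gain=0.1):
    """Start from ANN features of the last RGB frame and refine them with
    spike activity accumulated over inter-frame event bins (assumed additive)."""
    membrane = np.zeros_like(ann_features)
    fused = ann_features.copy()
    for events in event_bins:
        membrane, spikes = lif_update(membrane, events)
        fused += gain * spikes  # event-driven residual update between frames
    return fused
```

Because spikes are binary and sparse, the update between RGB frames touches only the units that actually received events, which is where the claimed power savings of the SNN branch come from.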
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3622–3628.
Citations: 0
Parameter-Robust MPPI for Safe Online Learning of Unknown Parameters
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-09 | DOI: 10.1109/LRA.2026.3662531
Matti Vahs;Jaeyoun Choi;Niklas Schmid;Jana Tumova;Chuchu Fan
Robots deployed in dynamic environments must remain safe even when key physical parameters are uncertain or change over time. We propose Parameter-Robust Model Predictive Path Integral (PRMPPI) control, a framework that integrates online parameter learning with probabilistic safety constraints. PRMPPI maintains a particle-based belief over parameters via Stein Variational Gradient Descent, evaluates safety constraints using Conformal Prediction, and optimizes both a nominal performance-driven and a safety-focused backup trajectory in parallel. This yields a controller that is cautious at first, improves performance as parameters are learned, and ensures safety throughout. Simulation and hardware experiments demonstrate higher success rates, lower tracking error, and more accurate parameter estimates than baselines.
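PRMPPI builds on the standard MPPI update, in which sampled control perturbations are weighted by an exponential transform of their rollout costs. The sketch below shows only that generic update; the paper's particle-based parameter belief (SVGD), conformal safety constraints, and backup trajectory sit on top of it and are not shown:

```python
import numpy as np

def mppi_update(u_nominal, noise, costs, lam=1.0):
    """Path-integral control update: exponentially weight K sampled rollouts
    by total trajectory cost and average their control perturbations.

    u_nominal: (T,) nominal control sequence
    noise:     (K, T) sampled perturbations, one rollout per row
    costs:     (K,) total cost of each rollout
    lam:       temperature; smaller values concentrate weight on the best rollouts
    """
    costs = np.asarray(costs, dtype=float)
    beta = costs.min()                      # baseline for numerical stability
    weights = np.exp(-(costs - beta) / lam)
    weights /= weights.sum()
    return u_nominal + weights @ noise      # weighted perturbation average
```

Subtracting the minimum cost before exponentiating is the usual trick to avoid underflow when costs are large.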
IEEE Robotics and Automation Letters, vol. 11, no. 4, pp. 3931–3938.
Citations: 0
A Hierarchical Framework for Real-Time Path Planning of Microswarm in Dynamic Environments
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-06 | DOI: 10.1109/LRA.2026.3662530
Yamei Li;Ruijian Ge;Aoji Zhu;Jiachi Zhao;Danjing Shi;Yinghan Sun;Yangmin Li;Lidong Yang
Autonomous navigation of magnetic microswarms in dynamic and unstructured environments is essential for biomedical applications, such as targeted therapy and minimally invasive interventions. However, existing path planning methods struggle to simultaneously achieve real-time adaptability and path smoothness in dynamic obstacle environments. To address this, we propose a hierarchical Dynamic Rapidly-exploring Random Tree Star (D-RRT*) path planning framework that integrates dynamic step size adjustment, local target selection, and local planning that considers microswarms' turning capabilities and energy optimization. Comparative simulations and experiments validate the effectiveness of the proposed planning framework, and results show that it can significantly improve the planning efficiency, path smoothness, and collision avoidance in complex dynamic scenarios.
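The dynamic step-size idea can be sketched as a steer function whose extension length shrinks near obstacles and grows in free space. This is a hypothetical illustration of the concept only; the gain of 0.5 and the clamping bounds are assumptions, not values from the paper:

```python
import numpy as np

def adaptive_steer(q_near, q_target, obstacle_dist, step_min=0.1, step_max=1.0):
    """Steer from q_near toward q_target with a step size proportional to
    the clearance from the nearest obstacle, clamped to [step_min, step_max]."""
    step = np.clip(0.5 * obstacle_dist, step_min, step_max)
    direction = q_target - q_near
    dist = np.linalg.norm(direction)
    if dist <= step:
        return q_target          # target reachable within one step
    return q_near + step * direction / dist
```

Short steps near obstacles keep the tree collision-safe, while long steps in open space reduce the number of nodes and speed up replanning in dynamic scenes.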
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3891–3898.
Citations: 0
Distributed Multi-Robot Ergodic Coverage Control for Estimating Time-Varying Spatial Processes
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-06 | DOI: 10.1109/LRA.2026.3662641
Mattia Mantovani;Mattia Catellani;Lorenzo Sabattini
We present a fully distributed framework for multi-robot exploration and coverage of time-varying spatial processes in complex, non-convex environments. Building on heat-equation-driven adaptive coverage (HEDAC) and system ergodicity, the proposed approach enables robots to autonomously navigate arbitrary domains, reconstruct unknown spatial fields, and continuously balance exploration and coverage without centralized coordination. A temporal decay mechanism promotes adaptive monitoring by regulating the relevance of past observations. Simulation and real-world experiments demonstrate the effectiveness and robustness of the method.
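The HEDAC ingredient the abstract refers to reduces, in its simplest form, to diffusing a "temperature" field over the workspace: sources mark unexplored or high-interest regions, robots descend the resulting gradient. A minimal explicit finite-difference step on a 2D grid, with periodic boundaries via `np.roll` as a simplification (the actual method handles arbitrary non-convex domains with proper boundary conditions):

```python
import numpy as np

def hedac_step(u, source, alpha=0.1):
    """One explicit heat-equation step on a 2D grid:
    u <- u + alpha * laplacian(u) + source.
    High-temperature cells attract coverage; in the full method, visited
    regions act as sinks that cool the field back down."""
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
    return u + alpha * lap + source
```

Diffusion smears the source information across the domain, so a robot far from any unexplored cell still sees a usable gradient, which is what makes the scheme work in non-convex environments.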
IEEE Robotics and Automation Letters, vol. 11, no. 4, pp. 3955–3962. Open access: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11373838
Citations: 0
Error-State Model Predictive Path Integral Control of Tendon-Driven Continuum Robots Using Cosserat Rod Dynamics With Strain Parametrization
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-06 | DOI: 10.1109/LRA.2026.3662658
E. Arefinia;N. Feizi;F. C. Pedrosa;R. V. Patel;J. Jayender
This paper presents an error-state Model Predictive Path Integral (MPPI) framework for tendon-driven continuum robots (TDCRs). Tracking-error dynamics are formulated on a Lie group to preserve full pose geometry, yielding precise position–orientation error metrics. A nonlinear Cosserat-rod model with strain parameterization provides a closed-form TDCR dynamics representation and updates in $0.3 \pm 0.3\,\text{ms}$. The model is calibrated via weight-release and actuation experiments on robotic ablation catheters, and its generalized coordinates are estimated through nested optimization. The MPPI controller parallelizes trajectory sampling and evaluation, uses tendon-displacement actuation computed via optimization to eliminate force sensors, and is uncertainty-aware through a simple and efficient exponentially weighted moving-average (EWMA) estimator embedded in the running cost. Control trajectories are sampled around the current best sequence and evaluated with an adaptive cost and exponential weighting to bias low-cost solutions. Experiments comparing conventional model predictive control (MPC), Lie-group MPC, offline Implicit Q-Learning (IQL), and MPPI formulated with Cartesian errors show that our MPPI method achieves the highest accuracy, significantly better computational efficiency than MPC, and better overall accuracy than all baselines. The model further extends naturally to multi-segment TDCRs and can incorporate tendon-actuation friction.
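The EWMA estimator named in the abstract is, in its generic form, a one-line recursion; how the paper embeds it in the running cost is not shown here:

```python
def ewma(prev, sample, alpha=0.2):
    """Exponentially weighted moving average: blend the previous estimate
    with the latest sample. Larger alpha tracks new data faster but is
    noisier; smaller alpha smooths more."""
    return (1 - alpha) * prev + alpha * sample
```

Fed a constant signal from an initial estimate of zero, the recursion converges geometrically: after n steps the estimate is $1 - (1-\alpha)^n$ of the way to the true value.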
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3867–3874.
Citations: 0
A Simple Compliant Gripper That Reconfigures Between Sensing and Grasping Modes
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-06 | DOI: 10.1109/LRA.2026.3662656
Shashank Ramesh;Taylor Girard;Mark Plecnik
In addition to securely handling objects, grippers endowed with a sense of touch are desirable for handling delicate objects. Force sensors can be mounted distally at the fingertips, but a simpler option is to estimate fingertip forces from motor current at the base. This approach, which is not new, reduces wire routing and recesses electronics away from the sensed surface for operation in wet environments. However, for such a strategy to work, the actuating motor must have no or low gearing, which increases its transparency to external torques, but greatly limits its own torque output. Therein lies a trade-off. In addition to gearing, the transmission ratio from motor to fingertip is also defined by the gripper linkage itself. In this work, the trade-off is overcome by introducing a linkage capable of reconfiguring without the need of an extra actuator. Reconfiguration is performed by moving across an output singularity to select between a mode which is biased for force sensing (sense mode) and a mode which is biased for force production (grip mode). These novel kinematics are embodied as a mostly monolithic compliant mechanism, leaving just one traditional pin joint in the entire gripper assembly. Experiments show that grip mode exhibits 3.1× more force output, and sense mode can measure 2.6× smaller forces. Corollary to the latter, sense mode is ideal for estimating the stiffness of objects, including small fruits with stiffness lower than 150 N/m. As an illustration, we demonstrate the usage of sense mode to estimate the ripeness of small fruits, followed by a transition to grip mode in order to pluck the fruit. Ripeness is distinguished with up to $\approx 90\%$ accuracy based on estimates of stiffness and fruit size using sense mode.
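Estimating fingertip force from motor current amounts to chaining the motor's torque constant, the gearing, and the linkage's torque-to-force transmission. A minimal sketch; all constants here (torque constant, gear ratio, the scalar "jacobian" standing in for the linkage transmission) are hypothetical placeholders, not values from the paper:

```python
def fingertip_force(current_a, kt=0.05, gear=1.0, jacobian=0.02, efficiency=1.0):
    """Estimate fingertip force (N) from motor current (A) for a transparent,
    low-gear drive: motor torque kt * I, scaled through gearing and the
    linkage's torque-to-force transmission (a scalar moment arm in meters)."""
    torque = kt * current_a * gear * efficiency  # N*m at the linkage input
    return torque / jacobian                     # N at the fingertip
```

The chain also shows why low gearing matters for sensing: a high gear ratio multiplies friction and reflected inertia, swamping the small external torques the estimate depends on.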
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3796–3803.
Citations: 0
Explicit Offset Learning for Joint Pedestrian Detection and Localization in Weakly Aligned Multispectral Images
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-06 | DOI: 10.1109/LRA.2026.3662631
Xingfang Zhou;Zujun Yu;Tao Ruan;Baoqing Guo;Dingyuan Bai;Tao Sun
Infrared (IR) and visible (VIS) image pairs suffer from position offset of the same object across different modalities in practice. Related work focuses on improving detection performance through feature alignment and fusion, ignoring accurate object localization in different modalities, while complementary information at the object level can help to further explain and judge the object. To address this, we propose a multi-spectral pedestrian detection method based on YOLO, featuring explicit offset learning for both feature alignment and object offset prediction. First, an Adaptive Multi-scale Mask Fusion (AMMF) module is designed to enhance features by learning to dynamically fuse mask predictions from Feature Pyramid Network (FPN) in both modalities. Then, a Region-Aware Supervised Feature Alignment (RASFA) module is proposed with a symmetric design. This module simultaneously predicts both IR-to-VIS and VIS-to-IR offset fields within one efficient framework. The former enables robust feature alignment supervised on target regions, while the latter directly provides object-level offsets. As a result, the detection head can efficiently output IR detections by applying these readily available offsets to the VIS detections, eliminating the need for a separate offset prediction branch. Experiments on three public multispectral pedestrian datasets demonstrate that our method not only improves detection performance but also achieves accurate localization of different modalities, outperforming previous state-of-the-art methods.
IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3645–3652.
Citations: 0
EED: Embodied Environment Description Through Robotic Visual Exploration
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-06 | DOI: 10.1109/LRA.2026.3662652
Kohei Matsumoto;Asako Kanezaki
The optimal way to convey information about a real environment to humans is through natural language descriptions. With the remarkable advancements in large language models and the field of Embodied AI in recent years, it has become possible for robots to autonomously navigate environments while recognizing and understanding their surroundings, much like humans do. In this paper, we propose a new Embodied AI task in which an autonomous mobile robot explores an environment and summarizes the entire environment in natural language. To properly evaluate this task, we use a crowdsourcing service to collect human-generated environment descriptions and construct a benchmark dataset. Additionally, the evaluation is conducted through a crowdsourcing service, and we investigate correlations with existing text evaluation metrics. Furthermore, we propose a baseline reinforcement learning method for the robot's environment exploration behavior to perform this task, demonstrating its superior performance compared to existing visual exploration methods.
IEEE Robotics and Automation Letters, vol. 11, no. 4, pp. 3994–4001. Open access: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11373846
Citations: 0
Incremental 3D Crop Model Association for Real-Time Counting in Dense Orchards
IF 5.3 | CAS Region 2 (Computer Science) | Q2 ROBOTICS | Pub Date: 2026-02-06 | DOI: 10.1109/LRA.2026.3662648
Daesung Park;KwangEun Ko;Dongbum Pyo;Jaehyeon Kang
Accurate real-time crop counting is essential for autonomous agricultural systems. However, existing methods often fail in dense plantings due to heavy foliage, irregular planting patterns, and frequent occlusions. While 2D tracking suffers from double-counting and 3D reconstruction requires offline processing, we propose a real-time crop counting framework that incrementally constructs global 3D crop instances during data collection. Each crop is modeled as a 3D oriented bounding box, initialized upon detection and updated with subsequent observations. To ensure robust association across frames, we employ 3D Generalized Intersection over Union (GIoU) for spatial matching and confidence-based filtering for validation, effectively reducing double-counting in dense orchards. Unlike prior methods, our approach supports on-the-fly counting without post-hoc reconstruction and performs reliably in unstructured field conditions. Experimental results demonstrate the accuracy and real-time capability of the proposed system in dense agricultural settings.
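The 3D GIoU used for spatial matching penalizes non-overlapping box pairs by the empty volume of their smallest enclosing box, so even disjoint detections get a graded similarity. A sketch for axis-aligned boxes given as (min_xyz, max_xyz) pairs; the paper's oriented bounding boxes would additionally require a rotated-overlap routine:

```python
import numpy as np

def giou_3d(a, b):
    """Generalized IoU for two axis-aligned 3D boxes, each a
    (min_xyz, max_xyz) pair. Returns a value in (-1, 1]:
    1 for identical boxes, negative for well-separated ones."""
    a_min, a_max = np.asarray(a[0], float), np.asarray(a[1], float)
    b_min, b_max = np.asarray(b[0], float), np.asarray(b[1], float)
    # Overlap volume (clamped to zero when the boxes are disjoint).
    inter = np.clip(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0, None).prod()
    vol_a = (a_max - a_min).prod()
    vol_b = (b_max - b_min).prod()
    union = vol_a + vol_b - inter
    # Smallest axis-aligned box enclosing both.
    hull = (np.maximum(a_max, b_max) - np.minimum(a_min, b_min)).prod()
    return inter / union - (hull - union) / hull
```

Associating a new detection to the existing instance with the highest GIoU, subject to a minimum threshold, is the kind of gating that suppresses the double-counting the abstract describes.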
{"title":"Incremental 3D Crop Model Association for Real-Time Counting in Dense Orchards","authors":"Daesung Park;KwangEun Ko;Dongbum Pyo;Jaehyeon Kang","doi":"10.1109/LRA.2026.3662648","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662648","url":null,"abstract":"Accurate real-time crop counting is essential for autonomous agricultural systems. However, existing methods often fail in dense plantings due to heavy foliage, irregular planting patterns, and frequent occlusions. While 2D tracking suffers from double-counting and 3D reconstruction requires offline processing, we propose a real-time crop counting framework that incrementally constructs global 3D crop instances during data collection. Each crop is modeled as a 3D oriented bounding box, initialized upon detection and updated with subsequent observations. To ensure robust association across frames, we employ 3D Generalized Intersection over Union (GIoU) for spatial matching and confidence-based filtering for validation, effectively reducing double-counting in dense orchards. Unlike prior methods, our approach supports on-the-fly counting without post-hoc reconstruction and performs reliably in unstructured field conditions. Experimental results demonstrate the accuracy and real-time capability of the proposed system in dense agricultural settings.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"3860-3866"},"PeriodicalIF":5.3,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
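The abstract above associates crop detections across frames with 3D Generalized Intersection over Union (GIoU). The paper uses oriented bounding boxes; as a minimal sketch of the underlying metric, the following assumes the simpler axis-aligned case (function name and box encoding are illustrative, not from the paper):

```python
import numpy as np

def giou_3d_axis_aligned(a, b):
    """Generalized IoU for two axis-aligned 3D boxes.

    Each box is (xmin, ymin, zmin, xmax, ymax, zmax).
    Returns a value in (-1, 1]; 1 means identical boxes, and values
    below 0 penalize boxes that are far apart (unlike plain IoU,
    which is flat zero for all disjoint pairs).
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    # Intersection volume (clipped to zero when the boxes are disjoint).
    lo = np.maximum(a[:3], b[:3])
    hi = np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    union = vol_a + vol_b - inter
    iou = inter / union
    # Smallest axis-aligned box enclosing both inputs.
    lo_c = np.minimum(a[:3], b[:3])
    hi_c = np.maximum(a[3:], b[3:])
    vol_c = np.prod(hi_c - lo_c)
    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (vol_c - union) / vol_c
```

Because disjoint boxes get increasingly negative scores as they separate, thresholding GIoU (rather than IoU) gives the matcher a graded notion of "near miss", which is what makes it useful for associating partially occluded crops between frames.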
Contact-Aware Path Planning for Autonomous Neuroendovascular Navigation
IF 5.3 Zone 2 Computer Science Q2 ROBOTICS Pub Date : 2026-02-06 DOI: 10.1109/LRA.2026.3662633
Aabha Tamhankar;Ron Alterovitz;Ajit S. Puri;Giovanni Pittiglio
We propose a deterministic and time-efficient contact-aware path planner for neurovascular navigation. The algorithm leverages information from pre- and intra-operative images of the vessels to navigate pre-bent passive tools, by intelligently predicting and exploiting interactions with the anatomy. A kinematic model is derived and employed by the sampling-based planner for tree expansion that utilizes simplified motion primitives. This approach enables fast computation of the feasible path, with negligible loss in accuracy, as demonstrated in diverse and representative anatomies of the vessels. In these anatomical demonstrators, the algorithm shows a 100% convergence rate within 22.8 s in the worst case, with sub-millimeter tracking errors (< 0.64 mm), and is found effective on anatomical phantoms representative of ~94% of patients.
{"title":"Contact-Aware Path Planning for Autonomous Neuroendovascular Navigation","authors":"Aabha Tamhankar;Ron Alterovitz;Ajit S. Puri;Giovanni Pittiglio","doi":"10.1109/LRA.2026.3662633","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662633","url":null,"abstract":"We propose a deterministic and time-efficient contact-aware path planner for neurovascular navigation. The algorithm leverages information from pre- and intra-operative images of the vessels to navigate pre-bent passive tools, by intelligently predicting and exploiting interactions with the anatomy. A kinematic model is derived and employed by the sampling-based planner for tree expansion that utilizes simplified motion primitives. This approach enables fast computation of the feasible path, with negligible loss in accuracy, as demonstrated in diverse and representative anatomies of the vessels. In these anatomical demonstrators, the algorithm shows a 100% convergence rate within 22.8 s in the worst case, with sub-millimeter tracking errors (<inline-formula><tex-math>$&lt; {0.64},{mathrm{mm}}$</tex-math></inline-formula>), and is found effective on anatomical phantoms representative of <inline-formula><tex-math>$sim$</tex-math></inline-formula>94% of patients.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 4","pages":"4130-4137"},"PeriodicalIF":5.3,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
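The planner above grows a tree by applying simplified motion primitives from existing nodes. As a generic, hypothetical sketch of that sampling-based pattern (not the paper's contact-aware method — no kinematic model or anatomy interaction is modeled here), a 2D version with translation primitives might look like:

```python
import math
import random

def expand_tree(start, goal, primitives, in_free_space,
                goal_tol=0.05, max_iters=2000, seed=0):
    """RRT-style tree expansion over discrete motion primitives.

    `primitives` is a list of (dx, dy) steps; `in_free_space(p)` says
    whether point p is collision-free. Returns a start-to-goal path as a
    list of points, or None if no path is found within max_iters.
    """
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Goal-biased sampling: aim at the goal 10% of the time.
        sample = goal if rng.random() < 0.1 else (rng.random(), rng.random())
        # Expand from the node nearest the sample.
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        # Try every primitive; keep the collision-free result closest to the sample.
        best = None
        for dx, dy in primitives:
            cand = (nodes[i][0] + dx, nodes[i][1] + dy)
            if in_free_space(cand) and (
                best is None or math.dist(cand, sample) < math.dist(best, sample)
            ):
                best = cand
        if best is None:
            continue
        nodes.append(best)
        parent[len(nodes) - 1] = i
        if math.dist(best, goal) <= goal_tol:
            # Walk parent pointers back to the root to recover the path.
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None
```

Restricting expansion to a fixed primitive set is what keeps each tree step cheap and deterministic to evaluate; the paper's contribution is choosing and validating those primitives against a kinematic model of the tool in contact with the vessel wall, which this sketch deliberately omits.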