In advanced driver-assistance systems, current computer vision algorithms predominantly rely on frame-based RGB cameras, which suffer from high latency in high-speed or sudden-scenario applications due to fixed frame rates. In response to this challenge, event-based cameras have gained attention as a viable substitute, providing markedly higher temporal resolution and greatly diminished latency. However, the asynchronous and sparse nature of event data poses challenges in achieving accuracy comparable to frame-based algorithms. Leveraging the event-driven nature of Spiking Neural Networks (SNNs), we propose an Event-Fused Hybrid (EFH) architecture for automotive vision. EFH combines Artificial Neural Networks (ANNs) for static feature extraction from RGB frames with SNNs that dynamically update these features using event streams. This approach enables high-efficiency, high-frame-rate object detection with minimal latency. Our method achieves state-of-the-art performance in inter-frame object detection by effectively fusing event data, while the SNN branch significantly reduces power consumption during event-stream processing. Furthermore, we deploy the system on a vehicle platform, achieving real-time object detection at 60 FPS using a 15-FPS RGB camera paired with an event camera.
"Event-Fused Hybrid ANN-SNN Architecture for Low-Latency Object Detection in Automotive Vision," by Chengjun Zhang, Yuhao Zhang, Jisong Yu, Jie Yang, and Mohamad Sawan. IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3622–3628, published 2026-02-09. DOI: 10.1109/LRA.2026.3662637
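The SNN branch that updates frame features from event streams is typically built from leaky integrate-and-fire (LIF) neurons, the standard event-driven building block. The abstract does not specify the EFH architecture at this level, so the following is a generic LIF update sketch, not the authors' implementation; all parameter names (`v_th`, `decay`) are illustrative.

```python
import numpy as np

def lif_step(v, spikes_in, w, v_th=1.0, decay=0.9):
    """One leaky integrate-and-fire update: membrane potentials decay,
    integrate weighted input spikes, and emit output spikes on
    threshold crossing, resetting afterward."""
    v = decay * v + w @ spikes_in          # leak + integrate
    out = (v >= v_th).astype(float)        # fire where threshold is crossed
    v = np.where(out > 0, 0.0, v)          # reset fired neurons
    return v, out
```

Because neurons only compute when input spikes arrive, chains of such updates are what make the SNN branch cheap on sparse event data.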
Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662531
Matti Vahs, Jaeyoun Choi, Niklas Schmid, Jana Tumova, Chuchu Fan
Robots deployed in dynamic environments must remain safe even when key physical parameters are uncertain or change over time. We propose Parameter-Robust Model Predictive Path Integral (PRMPPI) control, a framework that integrates online parameter learning with probabilistic safety constraints. PRMPPI maintains a particle-based belief over parameters via Stein Variational Gradient Descent, evaluates safety constraints using Conformal Prediction, and optimizes both a nominal performance-driven and a safety-focused backup trajectory in parallel. This yields a controller that is cautious at first, improves performance as parameters are learned, and ensures safety throughout. Simulation and hardware experiments demonstrate higher success rates, lower tracking error, and more accurate parameter estimates than baselines.
"Parameter-Robust MPPI for Safe Online Learning of Unknown Parameters." IEEE Robotics and Automation Letters, vol. 11, no. 4, pp. 3931–3938. DOI: 10.1109/LRA.2026.3662531
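At the core of any MPPI variant, including PRMPPI, sampled control perturbations are scored by rollout cost and averaged with exponentiated-cost weights. The sketch below shows only that base update under assumed parameter names (`sigma`, `lam`); the paper's parameter beliefs, conformal safety constraints, and backup trajectory are layered on top and are not reproduced here.

```python
import numpy as np

def mppi_step(u_nom, cost_fn, n_samples=256, sigma=0.5, lam=1.0, rng=None):
    """One MPPI iteration: perturb the nominal control sequence,
    score each rollout, and blend perturbations with softmax weights."""
    rng = rng or np.random.default_rng(0)
    eps = rng.normal(0.0, sigma, size=(n_samples,) + u_nom.shape)
    costs = np.array([cost_fn(u_nom + e) for e in eps])
    beta = costs.min()                       # stabilise the exponential
    w = np.exp(-(costs - beta) / lam)
    w /= w.sum()
    return u_nom + np.tensordot(w, eps, axes=1)
```

Low-cost rollouts dominate the weighted average, so repeated calls drive the nominal sequence toward low-cost behavior without gradients.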
Pub Date: 2026-02-06. DOI: 10.1109/LRA.2026.3662530
Yamei Li, Ruijian Ge, Aoji Zhu, Jiachi Zhao, Danjing Shi, Yinghan Sun, Yangmin Li, Lidong Yang
Autonomous navigation of magnetic microswarms in dynamic and unstructured environments is essential for biomedical applications, such as targeted therapy and minimally invasive interventions. However, existing path planning methods struggle to simultaneously achieve real-time adaptability and path smoothness in dynamic obstacle environments. To address this, we propose a hierarchical Dynamic Rapidly-exploring Random Tree Star (D-RRT*) path planning framework that integrates dynamic step size adjustment, local target selection, and local planning that considers microswarms' turning capabilities and energy optimization. Comparative simulations and experiments validate the effectiveness of the proposed planning framework, and results show that it can significantly improve the planning efficiency, path smoothness, and collision avoidance in complex dynamic scenarios.
"A Hierarchical Framework for Real-Time Path Planning of Microswarm in Dynamic Environments." IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3891–3898. DOI: 10.1109/LRA.2026.3662530
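Dynamic step-size adjustment in an RRT*-style planner usually means shrinking the extension length as obstacle clearance drops. The abstract does not give D-RRT*'s exact rule, so this steer function is an illustrative sketch with assumed parameters (`base_step`, `d_safe`, `min_frac`), omitting the turning-capability and energy terms of the paper's local planner.

```python
import math

def adaptive_steer(q_near, q_rand, d_obs, base_step=1.0, d_safe=2.0, min_frac=0.1):
    """Steer from q_near toward q_rand with a step that shrinks
    proportionally as clearance d_obs falls below d_safe
    (clamped at min_frac of the base step)."""
    step = base_step * min(1.0, max(d_obs / d_safe, min_frac))
    dx, dy = q_rand[0] - q_near[0], q_rand[1] - q_near[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        return q_rand
    return (q_near[0] + step * dx / dist, q_near[1] + step * dy / dist)
```

Large steps in open space keep the tree growing quickly; short steps near obstacles give finer-grained, safer extensions in cluttered regions.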
We present a fully distributed framework for multi-robot exploration and coverage of time-varying spatial processes in complex, non-convex environments. Building on heat-equation-driven adaptive coverage (HEDAC) and system ergodicity, the proposed approach enables robots to autonomously navigate arbitrary domains, reconstruct unknown spatial fields, and continuously balance exploration and coverage without centralized coordination. A temporal decay mechanism promotes adaptive monitoring by regulating the relevance of past observations. Simulation and real-world experiments demonstrate the effectiveness and robustness of the method.
"Distributed Multi-Robot Ergodic Coverage Control for Estimating Time-Varying Spatial Processes," by Mattia Mantovani, Mattia Catellani, and Lorenzo Sabattini. IEEE Robotics and Automation Letters, vol. 11, no. 4, pp. 3955–3962, published 2026-02-06. DOI: 10.1109/LRA.2026.3662641. Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11373838
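HEDAC-style methods diffuse an attraction field with the heat equation while injecting sources at uncovered regions, and the temporal decay mentioned above fades old observations. A minimal explicit-Euler step on a periodic grid might look as follows; the parameter names (`alpha`, `decay`) and the decay placement are assumptions, not the paper's formulation.

```python
import numpy as np

def hedac_step(u, source, dt=0.1, alpha=1.0, decay=0.05):
    """One explicit heat-equation step: diffuse the attraction field u
    (5-point Laplacian, periodic boundary), add the source term for
    uncovered regions, then apply temporal decay."""
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
    return (1.0 - decay * dt) * (u + dt * (alpha * lap + source))
```

Robots then follow the local gradient of `u`; the decay factor is what lets the field "forget" regions covered long ago, so they are revisited as the underlying process changes.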
Pub Date: 2026-02-06. DOI: 10.1109/LRA.2026.3662658
E. Arefinia, N. Feizi, F. C. Pedrosa, R. V. Patel, J. Jayender
This paper presents an error-state Model Predictive Path Integral (MPPI) framework for tendon-driven continuum robots (TDCRs). Tracking-error dynamics are formulated on a Lie group to preserve full pose geometry, yielding precise position–orientation error metrics. A nonlinear Cosserat-rod model with strain parameterization provides a closed-form TDCR dynamics representation and updates in 0.3 ± 0.3 ms. The model is calibrated via weight-release and actuation experiments on robotic ablation catheters, and its generalized coordinates are estimated through nested optimization. The MPPI controller parallelizes trajectory sampling and evaluation, uses tendon-displacement actuation computed via optimization to eliminate force sensors, and is uncertainty-aware through a simple and efficient exponentially weighted moving-average (EWMA) estimator embedded in the running cost. Control trajectories are sampled around the current best sequence and evaluated with an adaptive cost and exponential weighting to bias low-cost solutions. Experiments comparing conventional model predictive control (MPC), Lie-group MPC, offline Implicit Q-Learning (IQL), and MPPI formulated with Cartesian errors show that our MPPI method achieves the highest accuracy, significantly better computational efficiency than MPC, and better overall accuracy than all baselines.
"Error-State Model Predictive Path Integral Control of Tendon-Driven Continuum Robots Using Cosserat Rod Dynamics With Strain Parametrization." IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3867–3874. DOI: 10.1109/LRA.2026.3662658
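The EWMA estimator embedded in the running cost is a standard recursive statistic; a minimal version tracking tracking-error magnitude is sketched below. The smoothing constant `alpha` and the choice of statistic are assumptions, since the abstract does not specify them.

```python
class EwmaUncertainty:
    """Exponentially weighted moving average of tracking-error magnitude,
    usable as an uncertainty term added to an MPPI running cost."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha   # weight on the newest observation
        self.value = 0.0     # current smoothed estimate

    def update(self, error):
        # New estimate = alpha * |latest error| + (1 - alpha) * old estimate.
        self.value = self.alpha * abs(error) + (1 - self.alpha) * self.value
        return self.value
```

The recursion needs only one stored scalar per tracked quantity, which is why such estimators are cheap enough to sit inside a sampling-based controller's cost evaluation.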
Pub Date: 2026-02-06. DOI: 10.1109/LRA.2026.3662656
Shashank Ramesh, Taylor Girard, Mark Plecnik
In addition to securely handling objects, grippers endowed with a sense of touch are desirable for handling delicate objects. Force sensors can be mounted distally at the fingertips, but a simpler option is to estimate fingertip forces from motor current at the base. This approach, which is not new, reduces wire routing and recesses electronics away from the sensed surface for operation in wet environments. However, for such a strategy to work, the actuating motor must have no or low gearing, which increases its transparency to external torques but greatly limits its own torque output. Therein lies a trade-off. In addition to gearing, the transmission ratio from motor to fingertip is also defined by the gripper linkage itself. In this work, the trade-off is overcome by introducing a linkage capable of reconfiguring without the need of an extra actuator. Reconfiguration is performed by moving across an output singularity to select between a mode which is biased for force sensing (sense mode) and a mode which is biased for force production (grip mode). These novel kinematics are embodied as a mostly monolithic compliant mechanism, leaving just one traditional pin joint in the entire gripper assembly. Experiments show that grip mode exhibits 3.1× more force output, and sense mode can measure 2.6× smaller forces. Corollary to the latter, sense mode is ideal for estimating the stiffness of objects, including small fruits with stiffness lower than 150 N/m. As an illustration, we demonstrate the usage of sense mode to estimate the ripeness of small fruits, followed by a transition to grip mode in order to pluck the fruit. Ripeness is distinguished with up to ≈90% accuracy based on estimates of stiffness and fruit size using sense mode.
"A Simple Compliant Gripper That Reconfigures Between Sensing and Grasping Modes." IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3796–3803. DOI: 10.1109/LRA.2026.3662656
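Stiffness estimation from sense-mode readings reduces to fitting the slope of force versus fingertip displacement. The least-squares fit below is a generic sketch of that step, not the authors' calibration procedure; units are N and m, giving stiffness in N/m.

```python
def estimate_stiffness(displacements, forces):
    """Least-squares slope of force (N) vs. displacement (m),
    i.e. an estimated linear stiffness in N/m."""
    n = len(displacements)
    mx = sum(displacements) / n
    my = sum(forces) / n
    num = sum((x - mx) * (y - my) for x, y in zip(displacements, forces))
    den = sum((x - mx) ** 2 for x in displacements)
    return num / den
```

Comparing the fitted slope against a threshold (the paper reports discriminating fruits below roughly 150 N/m) is then enough to classify soft versus firm objects.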
Pub Date: 2026-02-06. DOI: 10.1109/LRA.2026.3662631
Xingfang Zhou, Zujun Yu, Tao Ruan, Baoqing Guo, Dingyuan Bai, Tao Sun
Infrared (IR) and visible (VIS) image pairs suffer from position offsets of the same object across modalities in practice. Related work focuses on improving detection performance through feature alignment and fusion while ignoring accurate object localization in each modality, even though complementary information at the object level can help further interpret and judge the object. To address this, we propose a multi-spectral pedestrian detection method based on YOLO, featuring explicit offset learning for both feature alignment and object offset prediction. First, an Adaptive Multi-scale Mask Fusion (AMMF) module is designed to enhance features by learning to dynamically fuse mask predictions from the Feature Pyramid Network (FPN) in both modalities. Then, a Region-Aware Supervised Feature Alignment (RASFA) module is proposed with a symmetric design. This module simultaneously predicts both IR-to-VIS and VIS-to-IR offset fields within one efficient framework. The former enables robust feature alignment supervised on target regions, while the latter directly provides object-level offsets. As a result, the detection head can efficiently output IR detections by applying these readily available offsets to the VIS detections, eliminating the need for a separate offset prediction branch.
"Explicit Offset Learning for Joint Pedestrian Detection and Localization in Weakly Aligned Multispectral Images." IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3645–3652. DOI: 10.1109/LRA.2026.3662631
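The final step described above, producing IR detections by shifting VIS detections with predicted offsets, amounts to a simple per-box translation. The helper below is a deliberately simplified illustration: the paper predicts dense offset fields, whereas here each box is paired with one hypothetical (dx, dy) offset.

```python
def shift_boxes(boxes, offsets):
    """Translate (x1, y1, x2, y2) detections by per-box (dx, dy) offsets,
    e.g. transferring VIS-frame boxes into the IR frame."""
    return [(x1 + dx, y1 + dy, x2 + dx, y2 + dy)
            for (x1, y1, x2, y2), (dx, dy) in zip(boxes, offsets)]
```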
Pub Date: 2026-02-06. DOI: 10.1109/LRA.2026.3662652
Kohei Matsumoto, Asako Kanezaki
Natural language descriptions are among the most effective ways to convey information about a real environment to humans. With the remarkable advancements in large language models and the field of Embodied AI in recent years, it has become possible for robots to autonomously navigate environments while recognizing and understanding their surroundings, much like humans do. In this paper, we propose a new Embodied AI task in which an autonomous mobile robot explores an environment and summarizes the entire environment in natural language. To properly evaluate this task, we use a crowdsourcing service to collect human-generated environment descriptions and construct a benchmark dataset. Additionally, the evaluation is conducted through a crowdsourcing service, and we investigate correlations with existing text evaluation metrics. Furthermore, we propose a baseline reinforcement learning method for the robot's environment exploration behavior, demonstrating its superior performance compared to existing visual exploration methods.
"EED: Embodied Environment Description Through Robotic Visual Exploration." IEEE Robotics and Automation Letters, vol. 11, no. 4, pp. 3994–4001. DOI: 10.1109/LRA.2026.3662652. Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11373846
Pub Date: 2026-02-06. DOI: 10.1109/LRA.2026.3662648
Daesung Park, KwangEun Ko, Dongbum Pyo, Jaehyeon Kang
Accurate real-time crop counting is essential for autonomous agricultural systems. However, existing methods often fail in dense plantings due to heavy foliage, irregular planting patterns, and frequent occlusions: 2D tracking suffers from double-counting, and 3D reconstruction requires offline processing. We therefore propose a real-time crop counting framework that incrementally constructs global 3D crop instances during data collection. Each crop is modeled as a 3D oriented bounding box, initialized upon detection and updated with subsequent observations. To ensure robust association across frames, we employ 3D Generalized Intersection over Union (GIoU) for spatial matching and confidence-based filtering for validation, effectively reducing double-counting in dense orchards. Unlike prior methods, our approach supports on-the-fly counting without post-hoc reconstruction and performs reliably in unstructured field conditions. Experimental results demonstrate the accuracy and real-time capability of the proposed system in dense agricultural settings.
"Incremental 3D Crop Model Association for Real-Time Counting in Dense Orchards." IEEE Robotics and Automation Letters, vol. 11, no. 3, pp. 3860–3866. DOI: 10.1109/LRA.2026.3662648
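3D GIoU extends IoU with a penalty based on the enclosing hull, so even non-overlapping boxes get a graded (negative) score useful for association. The paper uses oriented boxes; the axis-aligned version below is a simplified illustration of the matching score.

```python
def giou_3d(a, b):
    """Generalized IoU for axis-aligned 3D boxes given as
    (xmin, ymin, zmin, xmax, ymax, zmax). Ranges over (-1, 1]."""
    def vol(box):
        return (max(0.0, box[3] - box[0]) *
                max(0.0, box[4] - box[1]) *
                max(0.0, box[5] - box[2]))

    inter = (max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
             min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]))
    vi = vol(inter)                      # intersection volume (0 if disjoint)
    vu = vol(a) + vol(b) - vi            # union volume
    hull = (min(a[0], b[0]), min(a[1], b[1]), min(a[2], b[2]),
            max(a[3], b[3]), max(a[4], b[4]), max(a[5], b[5]))
    vh = vol(hull)                       # smallest enclosing box volume
    iou = vi / vu if vu > 0 else 0.0
    if vh > 0:
        return iou - (vh - vu) / vh      # penalize empty hull space
    return iou
```

Because disjoint boxes still receive distinct scores, a greedy or Hungarian matcher over GIoU can associate a new detection with the nearest existing crop instance even under partial occlusion.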
Pub Date: 2026-02-06. DOI: 10.1109/LRA.2026.3662633
Aabha Tamhankar, Ron Alterovitz, Ajit S. Puri, Giovanni Pittiglio
We propose a deterministic and time-efficient contact-aware path planner for neurovascular navigation. The algorithm leverages information from pre- and intra-operative images of the vessels to navigate pre-bent passive tools by intelligently predicting and exploiting interactions with the anatomy. A kinematic model is derived and employed by the sampling-based planner for tree expansion that utilizes simplified motion primitives. This approach enables fast computation of the feasible path, with negligible loss in accuracy, as demonstrated in diverse and representative anatomies of the vessels. In these anatomical demonstrators, the algorithm shows a 100% convergence rate within 22.8 s in the worst case, with sub-millimeter tracking errors (<0.64 mm), and is found effective on anatomical phantoms representative of ~94% of patients.
"Contact-Aware Path Planning for Autonomous Neuroendovascular Navigation." IEEE Robotics and Automation Letters, vol. 11, no. 4, pp. 4130–4137. DOI: 10.1109/LRA.2026.3662633
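Tree expansion with simplified motion primitives means each node spawns successors from a small discrete set of feasible moves, keeping only collision-free ones. The planar sketch below shows that pattern with hypothetical (heading change, arc length) primitives; the paper's tool kinematics, contact prediction, and image-derived constraints are not modeled here.

```python
import math

def expand(node, primitives, in_collision):
    """Expand a planar tree node (x, y, heading) with a fixed set of
    motion primitives (dtheta, ds), keeping collision-free successors."""
    x, y, th = node
    children = []
    for dth, ds in primitives:
        nx = x + ds * math.cos(th + dth)
        ny = y + ds * math.sin(th + dth)
        if not in_collision((nx, ny)):
            children.append((nx, ny, th + dth))
    return children
```

Restricting growth to a few pre-validated primitives is what makes such planners deterministic and fast: the search space per node is tiny and each successor is cheap to check.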