Pub Date: 2025-10-22. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1660244
Visuo-tactile feedback policies for terminal assembly facilitated by reinforcement learning
Yuchao Li, Ziqi Jin, Jin Liu, Daolin Ma
Industrial terminal assembly tasks are often repetitive and involve handling components with tight tolerances that are susceptible to damage. Learning an effective terminal assembly policy in the real world is challenging, as collisions between parts and the environment can lead to slippage or part breakage. In this paper, we propose a safe reinforcement learning approach to develop a visuo-tactile assembly policy that is robust to variations in grasp poses. Our method minimizes collisions between the terminal head and terminal base by decomposing the assembly task into three distinct phases. In the first grasp phase, a vision-guided model is trained to pick the terminal head from an initial bin. In the second align phase, a tactile-based grasp pose estimation model is employed to align the terminal head with the terminal base. In the final assembly phase, a visuo-tactile policy is learned to precisely insert the terminal head into the terminal base. To ensure safe training, the robot leverages human demonstrations and interventions. Experimental results on PLC terminal assembly demonstrate that the proposed method achieves 100% successful insertions across 100 different initial end-effector and grasp poses, whereas imitation learning and online-RL policies achieve success rates of only 9% and 0%, respectively.
{"title":"Visuo-tactile feedback policies for terminal assembly facilitated by reinforcement learning.","authors":"Yuchao Li, Ziqi Jin, Jin Liu, Daolin Ma","doi":"10.3389/frobt.2025.1660244","DOIUrl":"10.3389/frobt.2025.1660244","url":null,"abstract":"<p><p>Industrial terminal assembly tasks are often repetitive and involve handling components with tight tolerances that are susceptible to damage. Learning an effective terminal assembly policy in real-world is challenging, as collisions between parts and the environment can lead to slippage or part breakage. In this paper, we propose a safe reinforcement learning approach to develop a visuo-tactile assembly policy that is robust to variations in grasp poses. Our method minimizes collisions between the terminal head and terminal base by decomposing the assembly task into three distinct phases. In the first <i>grasp</i> phase,a vision-guided model is trained to pick the terminal head from an initial bin. In the second <i>align</i> phase, a tactile-based grasp pose estimation model is employed to align the terminal head with the terminal base. In the final <i>assembly</i> phase, a visuo-tactile policy is learned to precisely insert the terminal head into the terminal base. To ensure safe training, the robot leverages human demonstrations and interventions. Experimental results on PLC terminal assembly demonstrate that the proposed method achieves 100% successful insertions across 100 different initial end-effector and grasp poses, while imitation learning and online-RL policy yield only 9% and 0%.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1660244"},"PeriodicalIF":3.0,"publicationDate":"2025-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12586048/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145460310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-21. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1693988
Real-time open-vocabulary perception for mobile robots on edge devices: a systematic analysis of the accuracy-latency trade-off
Jongyoon Park, Pileun Kim, Daeil Ko
The integration of Vision-Language Models (VLMs) into autonomous systems is of growing importance for improving Human-Robot Interaction (HRI), enabling robots to operate within complex and unstructured environments and collaborate with non-expert users. For mobile robots to be effectively deployed in dynamic settings such as domestic or industrial areas, the ability to interpret and execute natural language commands is crucial. However, while VLMs offer powerful zero-shot, open-vocabulary recognition capabilities, their high computational cost presents a significant challenge for real-time performance on resource-constrained edge devices. This study provides a systematic analysis of the trade-offs involved in optimizing a real-time robotic perception pipeline on the NVIDIA Jetson AGX Orin 64GB platform. We investigate the relationship between accuracy and latency by evaluating combinations of two open-vocabulary detection models and two prompt-based segmentation models. Each pipeline is optimized using various precision levels (FP32, FP16, and Best) via NVIDIA TensorRT. We present a quantitative comparison of the mean Intersection over Union (mIoU) and latency for each configuration, offering practical insights and benchmarks for researchers and developers deploying these advanced models on embedded systems.
{"title":"Real-time open-vocabulary perception for mobile robots on edge devices: a systematic analysis of the accuracy-latency trade-off.","authors":"Jongyoon Park, Pileun Kim, Daeil Ko","doi":"10.3389/frobt.2025.1693988","DOIUrl":"https://doi.org/10.3389/frobt.2025.1693988","url":null,"abstract":"<p><p>The integration of Vision-Language Models (VLMs) into autonomous systems is of growing importance for improving Human-Robot Interaction (HRI), enabling robots to operate within complex and unstructured environments and collaborate with non-expert users. For mobile robots to be effectively deployed in dynamic settings such as domestic or industrial areas, the ability to interpret and execute natural language commands is crucial. However, while VLMs offer powerful zero-shot, open-vocabulary recognition capabilities, their high computational cost presents a significant challenge for real-time performance on resource-constrained edge devices. This study provides a systematic analysis of the trade-offs involved in optimizing a real-time robotic perception pipeline on the NVIDIA Jetson AGX Orin 64GB platform. We investigate the relationship between accuracy and latency by evaluating combinations of two open-vocabulary detection models and two prompt-based segmentation models. Each pipeline is optimized using various precision levels (FP32, FP16, and Best) via NVIDIA TensorRT. We present a quantitative comparison of the mean Intersection over Union (mIoU) and latency for each configuration, offering practical insights and benchmarks for researchers and developers deploying these advanced models on embedded systems.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1693988"},"PeriodicalIF":3.0,"publicationDate":"2025-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583037/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145453636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-21. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1650228
Towards autonomous robot-assisted transcatheter heart valve implantation: in vivo teleoperation and phantom validation of AI-guided positioning
Jonas Smits, Pierre Schegg, Loic Wauters, Luc Perard, Corentin Languepin, Davide Recchia, Vera Damerjian Pieters, Stéphane Lopez, Didier Tchetche, Kendra Grubb, Jorgen Hansen, Eric Sejor, Pierre Berthet-Rayne
Transcatheter Aortic Valve Implantation (TAVI) is a minimally invasive procedure in which a transcatheter heart valve (THV) is implanted within the patient's diseased native aortic valve. The procedure is increasingly chosen even for intermediate-risk and younger patients, as it combines complication rates comparable to open-heart surgery with the advantage of being far less invasive. Despite its benefits, challenges remain in achieving accurate and repeatable valve positioning, with inaccuracies potentially leading to complications such as THV migration, coronary obstruction, and conduction disturbances (CD). The latter often requires permanent pacemaker implantation as a costly and life-changing mitigation. Robotic assistance may offer solutions, enhancing precision and standardization while reducing radiation exposure for clinicians. This article introduces a novel solution for robot-assisted TAVI, addressing the growing need for skilled clinicians and improving procedural outcomes. We present an in-vivo animal demonstration of robot-assisted TAVI, showing the feasibility of teleoperated instrument control and THV deployment performed by a single operator at a safer distance from radiation sources. Furthermore, THV positioning and deployment under supervised autonomy is demonstrated on a phantom and shown to be feasible using both camera- and fluoroscopy-based imaging feedback and AI. Finally, an initial operator study probes the performance and potential added value of various technology augmentations with respect to a manual expert operator, indicating equivalent or superior accuracy and repeatability using robotic assistance. It is concluded that robot-assisted TAVI is technically feasible in-vivo and presents a strong case for a clinically meaningful application of level-3 autonomy. These findings support the potential of surgical robotic technology to enhance TAVI accuracy and repeatability, ultimately improving patient outcomes and expanding procedural accessibility.
{"title":"Towards autonomous robot-assisted transcatheter heart valve implantation: in vivo teleoperation and phantom validation of AI-guided positioning.","authors":"Jonas Smits, Pierre Schegg, Loic Wauters, Luc Perard, Corentin Languepin, Davide Recchia, Vera Damerjian Pieters, Stéphane Lopez, Didier Tchetche, Kendra Grubb, Jorgen Hansen, Eric Sejor, Pierre Berthet-Rayne","doi":"10.3389/frobt.2025.1650228","DOIUrl":"10.3389/frobt.2025.1650228","url":null,"abstract":"<p><p>Transcatheter Aortic Valve Implantation (TAVI) is a minimally invasive procedure in which a transcatheter heart valve (THV) is implanted within the patient's diseased native aortic valve. The procedure is increasingly chosen even for intermediate-risk and younger patients, as it combines complication rates comparable to open-heart surgery with the advantage of being far less invasive. Despite its benefits, challenges remain in achieving accurate and repeatable valve positioning, with inaccuracies potentially leading to complications such as THV migration, coronary obstruction, and conduction disturbances (CD). The latter often requires a permanent pacemaker implantation as a costly and life-changing mitigation. Robotic assistance may offer solutions, enhancing precision, standardization, and reducing radiation exposure for clinicians. This article introduces a novel solution for robot-assisted TAVI, addressing the growing need for skilled clinicians and improving procedural outcomes. We present an <i>in-vivo</i> animal demonstration of robotic-assisted TAVI, showing feasibility of tele-operative instrument control and THV deployment. This, done at safer distances from radiation sources by a single operator. Furthermore, THV positioning and deployment under supervised autonomy is demonstrated on phantom, and shown to be feasible using both camera- and fluoroscopy-based imaging feedback and AI. Finally, an initial operator study probes performance and potential added value of various technology augmentations with respect to a manual expert operator, indicating equivalent to superior accuracy and repeatability using robotic assistance. It is concluded that robot-assisted TAVI is technically feasible <i>in-vivo</i>, and presents a strong case for a clinically meaningful application of level-3 autonomy. These findings support the potential of surgical robotic technology to enhance TAVI accuracy and repeatability, ultimately improving patient outcomes and expanding procedural accessibility.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1650228"},"PeriodicalIF":3.0,"publicationDate":"2025-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583050/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145453642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-20. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1671673
FROG: a new people detection dataset for knee-high 2D range finders
Fernando Amodeo, Noé Pérez-Higueras, Luis Merino, Fernando Caballero
Mobile robots require knowledge of their environment, especially of humans located in their vicinity. While the most common approaches for detecting humans involve computer vision, an often overlooked hardware feature of robots for people detection is their 2D range finders, which were originally intended for obstacle avoidance and mapping/SLAM tasks. In most robots, they are conveniently located at a height approximately between the ankle and the knee, so they can also be used for detecting people, with a larger field of view and depth resolution compared to cameras. In this paper, we present FROG, a new dataset for people detection using knee-high 2D range finders. This dataset has greater laser resolution, scanning frequency, and more complete annotation data compared to existing datasets such as DROW (Beyer et al., 2018). In particular, the FROG dataset contains annotations for 100% of its laser scans (unlike DROW, which only annotates 5%), 17x more annotated scans, 100x more people annotations, and over twice the distance traveled by the robot. We propose a benchmark based on the FROG dataset and analyze a collection of state-of-the-art people detectors based on 2D range finder data. We also propose and evaluate a new end-to-end deep learning approach for people detection. Our solution works directly with the raw sensor data (no hand-crafted input features are needed), thus avoiding CPU preprocessing and freeing the developer from designing domain-specific heuristics. Experimental results show that the proposed people detector attains results comparable to the state of the art, while an optimized implementation for ROS can operate at more than 500 Hz.
{"title":"FROG: a new people detection dataset for knee-high 2D range finders.","authors":"Fernando Amodeo, Noé Pérez-Higueras, Luis Merino, Fernando Caballero","doi":"10.3389/frobt.2025.1671673","DOIUrl":"10.3389/frobt.2025.1671673","url":null,"abstract":"<p><p>Mobile robots require knowledge of the environment, especially of humans located in its vicinity. While the most common approaches for detecting humans involve computer vision, an often overlooked hardware feature of robots for people detection are their 2D range finders. These were originally intended for obstacle avoidance and mapping/SLAM tasks. In most robots, they are conveniently located at a height approximately between the ankle and the knee, so they can be used for detecting people too, and with a larger field of view and depth resolution compared to cameras. In this paper, we present a new dataset for people detection using knee-high 2D range finders called FROG. This dataset has greater laser resolution, scanning frequency, and more complete annotation data compared to existing datasets such as DROW (Beyer et al., 2018). Particularly, the FROG dataset contains annotations for 100% of its laser scans (unlike DROW which only annotates 5%), 17x more annotated scans, 100x more people annotations, and over twice the distance traveled by the robot. We propose a benchmark based on the FROG dataset, and analyze a collection of state-of-the-art people detectors based on 2D range finder data. We also propose and evaluate a new end-to-end deep learning approach for people detection. Our solution works with the raw sensor data directly (not needing hand-crafted input data features), thus avoiding CPU preprocessing and releasing the developer of understanding specific domain heuristics. Experimental results show how the proposed people detector attains results comparable to the state of the art, while an optimized implementation for ROS can operate at more than 500 Hz.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1671673"},"PeriodicalIF":3.0,"publicationDate":"2025-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12580528/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145446191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-17. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1628213
Deep learning methods for 3D tracking of fish in challenging underwater conditions for future perception in autonomous underwater vehicles
Martin Føre, Emilia May O'Brien, Eleni Kelasidi
Due to their utility in replacing workers in tasks unsuitable for humans, unmanned underwater vehicles (UUVs) have become increasingly common tools in the fish farming industry. However, earlier studies and anecdotal evidence from farmers imply that farmed fish tend to move away from and avoid intrusive objects such as vehicles that are deployed and operated inside net pens. Such responses could imply discomfort associated with the intrusive objects, which, in turn, can lead to stress and impaired welfare in the fish. To prevent this, vehicles and their control systems should be designed to automatically adjust operations when they perceive that they are repelling the fish. A necessary first step in this direction is to develop on-vehicle observation systems for assessing object/vehicle-fish distances in real-time settings that can provide inputs to the control algorithms. Due to their small size and low weight, modern cameras are ideal for this purpose. Moreover, the ongoing rapid developments within deep learning methods are enabling the use of increasingly sophisticated methods for analyzing camera footage. To explore this potential, we developed three new pipelines for the automated assessment of fish-camera distances in video and images. These were complemented with a recently published method, yielding four pipelines in total: SegmentDepth, BBoxDepth, and SuperGlue, which are based on stereo vision, and DepthAnything, which is monocular. The overall performance was investigated using field data by comparing the fish-object distances obtained from the methods with those measured using a sonar. The four methods were then benchmarked by comparing the number of objects detected and the quality and overall accuracy of the stereo matches (stereo-based methods only). SegmentDepth, DepthAnything, and SuperGlue performed well in comparison with the sonar data, yielding mean absolute errors (MAE) of 0.205 m (95% CI: 0.050-0.360), 0.412 m (95% CI: 0.148-0.676), and 0.187 m (95% CI: 0.073-0.300), respectively, and were integrated into the Robot Operating System (ROS2) framework to enable real-time application in fish behavior identification and the control of robotic vehicles such as UUVs.
{"title":"Deep learning methods for 3D tracking of fish in challenging underwater conditions for future perception in autonomous underwater vehicles.","authors":"Martin Føre, Emilia May O'Brien, Eleni Kelasidi","doi":"10.3389/frobt.2025.1628213","DOIUrl":"10.3389/frobt.2025.1628213","url":null,"abstract":"<p><p>Due to their utility in replacing workers in tasks unsuitable for humans, unmanned underwater vehicles (UUVs) have become increasingly common tools in the fish farming industry. However, earlier studies and anecdotal evidence from farmers imply that farmed fish tend to move away from and avoid intrusive objects such as vehicles that are deployed and operated inside net pens. Such responses could imply a discomfort associated with the intrusive objects, which, in turn, can lead to stress and impaired welfare in the fish. To prevent this, vehicles and their control systems should be designed to automatically adjust operations when they perceive that they are repelling the fish. A necessary first step in this direction is to develop on-vehicle observation systems for assessing object/vehicle-fish distances in real-time settings that can provide inputs to the control algorithms. Due to their small size and low weight, modern cameras are ideal for this purpose. Moreover, the ongoing rapid developments within deep learning methods are enabling the use of increasingly sophisticated methods for analyzing footage from cameras. To explore this potential, we developed three new pipelines for the automated assessment of fish-camera distances in video and images. These methods were complemented using a recently published method, yielding four pipelines in total, namely, <i>SegmentDepth</i>, <i>BBoxDepth</i>, and <i>SuperGlue</i> that were based on stereo-vision and <i>DepthAnything</i> that was monocular. The overall performance was investigated using field data by comparing the fish-object distances obtained from the methods with those measured using a sonar. The four methods were then benchmarked by comparing the number of objects detected and the quality and overall accuracy of the stereo matches (only stereo-based methods). <i>SegmentDepth</i>, <i>DepthAnything</i>, and <i>SuperGlue</i> performed well in comparison with the sonar data, yielding mean absolute errors (MAE) of 0.205 m (95% CI: 0.050-0.360), 0.412 m (95% CI: 0.148-0.676), and 0.187 m (95% CI: 0.073-0.300), respectively, and were integrated into the Robot Operating System (ROS2) framework to enable real-time application in fish behavior identification and the control of robotic vehicles such as UUVs.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1628213"},"PeriodicalIF":3.0,"publicationDate":"2025-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12575977/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145432746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-16. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1655242
Enabling scalable inspection of offshore mooring systems using cost-effective autonomous underwater drones
Dong Trong Nguyen, Christian Lindahl Elseth, Jakob Rude Øvstaas, Nikolai Arntzen, Geir Hamre, Dag-Børre Lillestøl
As aquaculture expands to meet global food demand, it remains dependent on manual, costly, infrequent, and high-risk operations due to its reliance on high-end Remotely Operated Vehicles (ROVs). Scalable and autonomous systems are needed to enable safer and more efficient practices. This paper proposes a cost-effective autonomous inspection framework for the monitoring of mooring systems, a critical component ensuring structural integrity and regulatory compliance for both the aquaculture and floating offshore wind (FOW) sectors. The core contribution of this paper is a modular and scalable vision-based inspection pipeline built on the open-source Robot Operating System 2 (ROS 2) and implemented on a low-cost Blueye X3 underwater drone. The system integrates real-time image enhancement, YOLOv5-based object detection, and 4-DOF visual servoing for autonomous tracking of mooring lines. Additionally, the pipeline supports 3D reconstruction of the observed structure using tools such as ORB-SLAM3 and Meshroom, enabling future capabilities in change detection and defect identification. Validation results from simulation, dock, and sea trials showed that the underwater drone can effectively inspect critical mooring-system components with real-time processing on edge hardware. A cost estimation for the proposed approach showed a substantial reduction compared with traditional ROV-based inspections. By increasing the Level of Autonomy (LoA) of off-the-shelf drones, this work provides (1) safer operations by replacing crew-dependent and costly operations that require an ROV and a mothership, (2) scalable monitoring, and (3) regulatory-ready documentation. This offers a practical, cross-industry solution for sustainable offshore infrastructure management.
{"title":"Enabling scalable inspection of offshore mooring systems using cost-effective autonomous underwater drones.","authors":"Dong Trong Nguyen, Christian Lindahl Elseth, Jakob Rude Øvstaas, Nikolai Arntzen, Geir Hamre, Dag-Børre Lillestøl","doi":"10.3389/frobt.2025.1655242","DOIUrl":"10.3389/frobt.2025.1655242","url":null,"abstract":"<p><p>As aquaculture expands to meet global food demand, it remains dependent on manual, costly, infrequent, and high-risk operations due to reliance on high-end Remotely Operated Vehicles (ROVs). Scalable and autonomous systems are needed to enable safer and more efficient practices. This paper proposes a cost-effective autonomous inspection framework for the monitoring of mooring systems, a critical component ensuring structural integrity and regulatory compliance for both the aquaculture and floating offshore wind (FOW) sectors. The core contribution of this paper is a modular and scalable vision-based inspection pipeline built on the open-source Robot Operating System 2 (ROS 2) and implemented on a low-cost Blueye X3 underwater drone. The system integrates real-time image enhancement, YOLOv5-based object detection, and 4-DOF visual servoing for autonomous tracking of mooring lines. Additionally, the pipeline supports 3D reconstruction of the observed structure using tools such as ORB-SLAM3 and Meshroom, enabling future capabilities in change detection and defect identification. Validation results from simulation, dock and sea trials showed that the underwater drone can effective inspect of mooring system critical components with real-time processing on edge hardware. A cost estimation for the proposed approach showed a substantial reduction as compared with traditional ROV-based inspections. By increasing the Level of Autonomy (LoA) of off-the-shelf drones, this work provides (1) safer operations by replacing crew-dependent and costly operations that require a ROV and a mothership, (2) scalable monitoring and (3) regulatory-ready documentation. This offers a practical, cross-industry solution for sustainable offshore infrastructure management.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1655242"},"PeriodicalIF":3.0,"publicationDate":"2025-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12572656/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145432759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-16. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1648737
An in-situ participatory approach for assistive robots: methodology and implementation in a healthcare setting
Ferran Gebellí, Raquel Ros
Introduction: This paper presents a participatory design approach for developing assistive robots, addressing the critical gap between designing robotic applications and real-world user needs. Traditional design methodologies often fail to capture authentic requirements due to users' limited familiarity with robotic technologies and the disconnection between design activities and actual deployment contexts.
Methods: We propose a methodology centred on iterative in-situ co-design, where stakeholders collaborate with researchers using functional low-fidelity prototypes within the actual environment of use. Our approach comprises three phases: observation and inspiration, in-situ co-design through prototyping, which is the core of the methodology, and longitudinal evaluation. We implemented this methodology over 10 months at an intermediate healthcare centre. The process involved healthcare staff in defining functionality, designing interactions, and refining system behaviour through hands-on experience with teleoperated prototypes.
Results: The resulting autonomous patrolling robot operated continuously across a two-month deployment. The evaluation through questionnaires on usability, usage, and understanding of the robotic system, along with open-ended questions, revealed diverse user adoption patterns, with five distinct personas emerging: enthusiastic high-adopter, disillusioned high-adopter, unconvinced mid-adopter, satisfied mid-adopter, and non-adopter, which are discussed in detail.
Discussion: During the final evaluation deployment, user feedback still identified both new needs and practical improvements, as co-design iterations have the potential to continue indefinitely. Moreover, despite some performance issues, the robot's presence seemed to generate a placebo effect on both staff and patients, while staff behaviours also appeared to be influenced by the regular observation of the researchers. The obtained results provide valuable insights into long-term human-robot interaction dynamics, highlighting the importance of context-based requirements gathering.
{"title":"An <i>in-situ</i> participatory approach for assistive robots: methodology and implementation in a healthcare setting.","authors":"Ferran Gebellí, Raquel Ros","doi":"10.3389/frobt.2025.1648737","DOIUrl":"10.3389/frobt.2025.1648737","url":null,"abstract":"<p><strong>Introduction: </strong>This paper presents a participatory design approach for developing assistive robots, addressing the critical gap between designing robotic applications and real-world user needs. Traditional design methodologies often fail to capture authentic requirements due to users' limited familiarity with robotic technologies and the disconnection between design activities and actual deployment contexts.</p><p><strong>Methods: </strong>We propose a methodology centred on iterative <i>in-situ</i> co-design, where stakeholders collaborate with researchers using functional low-fidelity prototypes within the actual environment of use. Our approach comprises three phases: observation and inspiration, <i>in-situ</i> co-design through prototyping, which is the core of the methodology, and longitudinal evaluation. We implemented this methodology over 10 months at an intermediate healthcare centre. The process involved healthcare staff in defining functionality, designing interactions, and refining system behaviour through hands-on experience with teleoperated prototypes.</p><p><strong>Results: </strong>The resulting autonomous patrolling robot operated continuously across a two-month deployment. The evaluation through questionnaires on usability, usage and understanding of the robotic system, along with open-ended questions revealed diverse user adoption patterns, with five distinct personas emerging: enthusiastic high-adopter, disillusioned high-adopter, unconvinced mid-adopter, satisfied mid-adopter and non-adopter, which are discussed in detail.</p><p><strong>Discussion: </strong>During the final evaluation deployment, user feedback still identified both new needs and practical improvements, as co-design iterations have the potential to continue indefinitely. Moreover, despite some performance issues, the robot's presence seemed to generate a placebo effect on both staff and patients, while it appears that staff's behaviours were also influenced by the regular observation of the researchers. The obtained results prove valuable insights into long-term human-robot interaction dynamics, highlighting the importance of context-based requirements gathering.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1648737"},"PeriodicalIF":3.0,"publicationDate":"2025-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12572612/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145432751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-16. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1622206
Considerations for designing socially assistive robots for older adults
Samuel A Olatunji, Veronica Falcon, Anjali Ramesh, Wendy A Rogers
Social robots have the potential to support the health activities of older adults. However, they need to be designed for older adults' specific needs; be accepted by and useful to them; and be integrated into their healthcare ecosystem and care network. We explored the research literature to determine the evidence base to guide design considerations necessary for socially assistive robots (SARs) for older adults in the context of healthcare. We identified various elements of the user-centered design of SARs to meet the needs of older adults within the constraints of a home environment. We emphasized the potential benefits of SARs in empowering older adults and supporting their autonomy for health applications. We identified research gaps and provided a road map for future development and deployment to enhance SAR functionality within digital health systems.
{"title":"Considerations for designing socially assistive robots for older adults.","authors":"Samuel A Olatunji, Veronica Falcon, Anjali Ramesh, Wendy A Rogers","doi":"10.3389/frobt.2025.1622206","DOIUrl":"10.3389/frobt.2025.1622206","url":null,"abstract":"<p><p>Social robots have the potential to support the health activities of older adults. However, they need to be designed for their specific needs; be accepted by and useful to them; and be integrated into their healthcare ecosystem and care network. We explored the research literature to determine the evidence base to guide design considerations necessary for socially assistive robots (SARs) for older adults in the context of healthcare. We identified various elements of the user-centered design of SARs to meet the needs of older adults within the constraints of a home environment. We emphasized the potential benefits of SARs in empowering older adults and supporting their autonomy for health applications. We identified research gaps and provided a road map for future development and deployment to enhance SAR functionality within digital health systems.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1622206"},"PeriodicalIF":3.0,"publicationDate":"2025-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12571604/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145432789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-15. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1646353
Huggable integrated socially assistive robots: exploring the potential and challenges for sustainable use in long-term care contexts
B M Hofstede, S Ipakchian Askari, T R C van Hoesel, R H Cuijpers, L P de Witte, W A IJsselsteijn, H H Nap
With ageing populations and decreasing numbers of care personnel, care technologies such as socially assistive robots (SARs) offer innovative solutions for healthcare workers and older adults, supporting ageing in place. Among other applications, SARs are used for both daytime structure support and social companionship, particularly benefiting people with dementia by providing structure in earlier stages of the disease and comfort in later stages. This research introduces the concept of Huggable Integrated SARs (HI-SARs): a novel subtype of SARs combining a soft, comforting, huggable form with integrated socially assistive functionalities, such as verbal prompts for daytime structure, interactive companionship, and activity monitoring via sensor data, enabling the possibility of more context-aware interaction. While HI-SARs have shown promise in Asian care contexts, their real-world application and potential in diverse long-term care contexts remain limited and underexplored. This research investigates the potential of HI-SARs in Dutch healthcare settings (eldercare, disability care, and rehabilitation) through three studies conducted between September 2023 and December 2024. Study I examined HI-SAR functions and integration in Dutch care practice via focus groups with professionals, innovation managers, and older adults (N = 36). Study II explored user preferences through sessions with clients with intellectual disabilities and professionals (N = 32). Study III involved two case studies in care settings with clients and caregivers (N = 4). Results indicate that HI-SARs were generally well-received by professionals and older adults, who appreciated their support for daily routines and social engagement, particularly for clients with cognitive disabilities such as dementia. However, concerns were raised about hygiene, the functioning of activity monitoring, and limited interactivity. Based on these findings, we recommend four design and implementation strategies to improve the effectiveness of HI-SARs: (1) integrating personalisation options such as customizable voices to increase user acceptance; (2) optimising activity monitoring by simplifying data output and using sensor input more proactively to trigger interactions; (3) considering persons with cognitive impairments as a first target user group; and (4) encouraging individual use to enhance hygiene and tailor experiences to client needs. Overall, this research demonstrates the potential of HI-SARs in diverse long-term care settings, although further research is needed to explore their applicability, usability, and long-term impact.
{"title":"Huggable integrated socially assistive robots: exploring the potential and challenges for sustainable use in long-term care contexts.","authors":"B M Hofstede, S Ipakchian Askari, T R C van Hoesel, R H Cuijpers, L P de Witte, W A IJsselsteijn, H H Nap","doi":"10.3389/frobt.2025.1646353","DOIUrl":"10.3389/frobt.2025.1646353","url":null,"abstract":"<p><p>With ageing populations and decreasing numbers of care personnel, care technologies such as socially assistive robots offer innovative solutions for healthcare workers and older adults, supporting ageing in place. Among others, SARs are used for both daytime structure support and social companionship, particularly benefiting people with dementia by providing structure in earlier stages of the disease and comfort in later stages. This research introduces the concept of Huggable Integrated SARs (HI-SAR): a novel subtype of SARs combining a soft, comforting, huggable form with integrated socially assistive functionalities, such as verbal prompts for daytime structure, interactive companionship, and activity monitoring via sensor data, enabling the possibility of more context-aware interaction. While HI-SARs have shown promise in Asian care contexts, real-world application and potential in diverse long-term care contexts remain limited and underexplored. This research investigates the potential of HI-SARs in Dutch healthcare settings (eldercare, disability care, and rehabilitation) through three studies conducted between September 2023 and December 2024. Study I examined HI-SAR functions and integration in Dutch care practice via focus groups with professionals, innovation managers, and older adults (N = 36). Study II explored user preferences through sessions with clients with intellectual disabilities and professionals (N = 32). Study III involved two case studies in care settings with clients and caregivers (N = 4). Results indicate that HI-SARs were generally well-received by professionals and older adults, who appreciated their support for daily routines and social engagement, particularly for clients with cognitive disabilities such as dementia. However, concerns were raised about hygiene, the functioning of activity monitoring, and limited interactivity. Based on these findings, we recommend four design and implementation strategies to improve the effectiveness of HI-SARs: (1) integrating personalisation options such as customizable voices to increase user acceptance; (2) optimising activity monitoring by simplifying data output and using sensor input more proactively to trigger interactions; (3) considering persons with cognitive impairments as a first target user group; and (4) encouraging individual use to enhance hygiene and tailor experiences to client needs. 
Overall, this research demonstrates the potential of HI-SARs in diverse long-term care settings, although further research is needed to explore their applicability, usability, and long-term impact.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1646353"},"PeriodicalIF":3.0,"publicationDate":"2025-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12569430/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145410435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-15. eCollection Date: 2025-01-01. DOI: 10.3389/frobt.2025.1526287
Should we get involved? Impact of human collaboration and intervention on multi-robot teams
Joseph Bolarinwa, Manuel Giuliani, Paul Bremner
Introduction: The challenges encountered in the design of multi-robot teams (MRT) highlight the need for different levels of human involvement, creating human-in-the-loop multi-robot teams. By integrating human cognitive abilities with the functionalities of the robots in the MRT, we can enhance overall system performance. Designing such a human-in-the-loop MRT requires several decisions based on the specific context of application. Before implementing these systems in real-world scenarios, it is essential to model and simulate the various components of the MRT to evaluate their impact on performance and the different roles a human operator might play.
Methods: We developed a simulation framework for a human-in-the-loop MRT using the Java Agent DEvelopment framework (JADE) and investigated the effects of different numbers of robots in the MRT, MRT architectures, and levels of human involvement (human collaboration and human intervention) on performance metrics.
Results: Results show that task execution outcomes and request completion times (RCT) improve with an increasing number of robots in the MRT. Human collaboration reduced the RCT, while human intervention increased the RCT, regardless of the number of robots in the MRT. The effect of system architecture was only significant when the number of robots in the MRT was low.
Discussion: This study demonstrates that both the number of robots in a multi-robot team (MRT) and the inclusion of a human in the loop significantly influence system performance. The findings also highlight the value of simulation as a cost- and time-efficient strategy to evaluate MRT configurations prior to real-world implementation.
{"title":"Should we get involved? impact of human collaboration and intervention on multi-robot teams.","authors":"Joseph Bolarinwa, Manuel Giuliani, Paul Bremner","doi":"10.3389/frobt.2025.1526287","DOIUrl":"10.3389/frobt.2025.1526287","url":null,"abstract":"<p><strong>Introduction: </strong>The challenges encountered in the design of multi-robot teams (MRT) highlight the need for different levels of human involvement, creating human-in-the-loop multi-robot teams. By integrating human cognitive abilities with the functionalities of the robots in the MRT, we can enhance overall system performance. Designing such a human-in-the-loop MRT requires several decisions based on the specific context of application. Before implementing these systems in real-world scenarios, it is essential to model and simulate the various components of the MRT to evaluate their impact on performance and the different roles a human operator might play.</p><p><strong>Methods: </strong>We developed a simulation framework for a human-in-the-loop MRT using the Java Agent DEvelopment framework (JADE) and investigated the effects of different numbers of robots in the MRT, MRT architectures, and levels of human involvement (human collaboration and human intervention) on performance metrics.</p><p><strong>Results: </strong>Results show that task execution outcomes and request completion times (RCT) improve with an increasing number of robots in the MRT. Human collaboration reduced the RCT, while human intervention increased the RCT, regardless of the number of robots in the MRT. The effect of system architecture was only significant when the number of robots in the MRT was low.</p><p><strong>Discussion: </strong>This study demonstrates that both the number of robots in a multi-robot team (MRT) and the inclusion of a human in the loop significantly influence system performance. The findings also highlight the value of simulation as a cost- and time-efficiency strategy to evaluate MRT configurations prior to real-world implementation.</p>","PeriodicalId":47597,"journal":{"name":"Frontiers in Robotics and AI","volume":"12 ","pages":"1526287"},"PeriodicalIF":3.0,"publicationDate":"2025-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12569544/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145410444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}