Pub Date: 2026-05-01 | Epub Date: 2026-02-04 | DOI: 10.1016/j.robot.2026.105377
Lianxin Zhang, Yang Jiao, Yihan Huang, Ziyou Wang, Huihuan Qian
Self-assembly enables multi-robot systems to merge diverse capabilities and accomplish tasks beyond the reach of individual robots. Incorporating varied docking mechanism layouts (DMLs) can enhance robot versatility or reduce costs. However, assembling multiple heterogeneous robots with diverse DMLs remains a research gap. This paper addresses this problem by introducing CuBoat, an omnidirectional unmanned surface vehicle (USV). CuBoat can be equipped with or without docking systems on each of its four sides, allowing it to emulate heterogeneous robots. We implement a multi-robot system based on multiple CuBoats. To enhance maneuverability, a linear active disturbance rejection control (LADRC) scheme is proposed. Additionally, we present a generalized parallel self-assembly planning algorithm for efficient assembly among CuBoats with different DMLs. Validation is conducted through simulation in two scenarios across four distinct maps, demonstrating the performance of the self-assembly planning algorithm. Moreover, trajectory tracking tests confirm the effectiveness of the LADRC controller. Self-assembly experiments on five maps with different target structures affirm the algorithm’s feasibility and generality. This study advances robotic self-assembly, enabling multi-robot systems to collaboratively tackle complex tasks beyond the capabilities of individual robots.
Article: "Parallel self-assembly for modular USVs with diverse docking mechanism layouts". Robotics and Autonomous Systems, vol. 199, Article 105377.
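The abstract only names the LADRC scheme; as a hedged illustration, a minimal first-order linear ADRC (an extended state observer that estimates the "total disturbance", plus a proportional law that cancels it) can be sketched as below. The toy plant, bandwidths, and gains are assumptions for illustration, not the paper's implementation.

```python
def ladrc_step(z, y, u_prev, r, dt, wo=20.0, wc=5.0, b0=1.0):
    """One control step of a first-order linear ADRC.

    z      -- observer state (y_hat, f_hat): estimated output and estimated
              total disturbance (model error + external load, lumped together)
    y      -- measured plant output
    u_prev -- control applied during the previous interval
    r      -- reference to track
    wo, wc -- observer and controller bandwidths (rad/s)
    b0     -- nominal input gain of the plant
    """
    y_hat, f_hat = z
    e = y - y_hat
    # Linear extended state observer with both poles placed at -wo
    y_hat = y_hat + dt * (f_hat + b0 * u_prev + 2.0 * wo * e)
    f_hat = f_hat + dt * (wo ** 2) * e
    # Cancel the estimated disturbance, then track r with bandwidth wc
    u = (wc * (r - y_hat) - f_hat) / b0
    return (y_hat, f_hat), u

# Toy plant y' = -y + u + d with an unknown constant disturbance d = 0.5
y, u, z = 0.0, 0.0, (0.0, 0.0)
for _ in range(300):                  # 3 s of simulated time at dt = 0.01
    z, u = ladrc_step(z, y, u, r=1.0, dt=0.01)
    y += 0.01 * (-y + u + 0.5)
print(round(y, 2))                    # settles close to the reference 1.0
```

Because the observer estimates the disturbance rather than assuming a plant model, the same loop rejects the unknown offset without retuning, which is the property that makes ADRC attractive for USVs subject to wind and current.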
Pub Date: 2026-05-01 | Epub Date: 2026-01-27 | DOI: 10.1016/j.robot.2026.105369
Tzu-Han Lin, Cih-An Chen, Chi-Hsiang Lo, Li-Chen Fu
This paper proposes an integrated system combining location estimation, human activity recognition (HAR), and plan recognition modules. To improve HAR performance, we propose a location estimation system that fuses ResNet50-Places365 (Zhou et al., 2018) with our own estimator, which leverages the distances between the human and nearby objects. The location information from the location estimation system and the human skeleton information are fed into the HAR module, governed by our activity-location graph convolutional neural network (AL-GCN). To extend the use of the recognized activities, we propose a plan recognition system that continually updates the human’s plan knowledge base while accounting for the human’s habits, so as to make three important predictions: the next activity, the objective, and the plan. In our experiments, we evaluate the system on both a dataset and real-world scenarios. In the dataset evaluation, our location estimation system performs best with 92.83% accuracy, our AL-GCN model outperforms state-of-the-art (SOTA) models with 94.33% accuracy on cross-subject evaluation, and our plan recognition improves when habits are considered and the knowledge base is updated. In the real-world experiments, location estimation achieves 98% accuracy in the living room, and including location information improves our AL-GCN model’s accuracy by 10% to 20%. Finally, our plan recognition shows that updating the knowledge base significantly increases prediction accuracy.
Article: "Household robot utilizing location information for human activity and habit understanding". Robotics and Autonomous Systems, vol. 199, Article 105369.
Pub Date: 2026-05-01 | Epub Date: 2026-02-02 | DOI: 10.1016/j.robot.2026.105367
Dhruba Jyoti Sut, Prabhu Sethuramalingam
Over the past few years, soft robotics has made significant advances, particularly because of its inherent benefits of greater flexibility and safer operation. In various industries, including health care, agriculture, and machinery, the end effector or gripper helps robotic systems grab, transport, manipulate, assemble, and paint. The flexibility and adaptability of the grasping strategy determine the effectiveness of object gripping. Besides non-destructive gripping, soft manipulators improve ductility, safety, adaptability, and flexibility. Because extrinsic influences produce internal nonlinearity and unpredictable deformation, designing, modelling, and operating soft manipulators is difficult. A simple on-off regulator valve is inadequate for effectively regulating pressure in soft pneumatic grippers handling delicate items. This work presents real-time image processing with adaptive gripping force to handle fragile objects without damage. A Convolutional Neural Network (CNN), a CNN-Support Vector Machine (SVM), and Inception v3 are compared to determine which best classifies objects on a new dataset called Obj10, which contains 22,410 images divided into ten classes. A servo system with proportional-integral-derivative (PID) control was employed to regulate the Filter Regulator Lubricator (FRL), ensuring an efficient control mechanism and force acquisition. Pressure sensor data were utilised as feedback for the system. The Inception-v3-based CNN model improves image categorization after compression, and feature extraction creates feature vectors. Retraining the classification layer with these vectors improves object classification accuracy to 97.88%. The proposed framework combines object recognition with a new control method to grab objects in experiments with three grippers. The results show that soft grippers are best suited for non-destructive grasping.
Article: "Deep learning-based object identification for grasping force control of a robotic soft end effector". Robotics and Autonomous Systems, vol. 199, Article 105367.
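The PID-regulated pressure loop described above can be illustrated with a minimal sketch. The first-order valve/chamber model, gains, supply pressure, and setpoint below are assumptions for illustration only, not the authors' tuned system.

```python
class PID:
    """Textbook PID with output clamping and conditional anti-windup."""

    def __init__(self, kp, ki, kd, dt, u_min=0.0, u_max=1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        if self.u_min < u < self.u_max:       # integrate only when unsaturated
            self.integral += error * self.dt
        return min(self.u_max, max(self.u_min, u))

# Toy pressure loop: chamber pressure follows valve opening with lag tau
pid = PID(kp=1.0, ki=2.0, kd=0.0, dt=0.01)
pressure, supply, tau = 0.0, 5.0, 0.5         # bar, bar, seconds
for _ in range(500):                          # 5 s of simulated time
    u = pid.update(setpoint=2.0, measurement=pressure)
    pressure += 0.01 * (supply * u - pressure) / tau
print(round(pressure, 2))                     # approaches the 2.0 bar setpoint
```

The conditional-integration guard matters here: while the valve command is clamped at its limits, accumulating integral error would otherwise cause the overshoot that damages fragile objects.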
Pub Date: 2026-05-01 | Epub Date: 2026-01-28 | DOI: 10.1016/j.robot.2026.105371
Junhui Li, Mohammed A.A. Al-qaness
To enable intuitive and reliable human–robot collaboration, robots must understand human actions at a structural level, making skeleton-based gesture recognition (SGR) a crucial source of precise and robust intention cues. Graph convolutional networks (GCNs) have become a key technology in SGR due to their efficient processing of non-Euclidean data. However, existing methods typically choose between a fixed anatomical prior graph and a fully adaptive dynamic graph, which limits the model’s ability to capture structural invariance and dynamic variability in hand motion simultaneously. To address this challenge, we propose the Structural-Adaptive Spatio-Temporal GCN (SA-STGCN), which relies on an innovative spatiotemporal feature extraction mechanism designed to fuse structural priors with motion-adaptive topology synergistically. Spatially, our Spatio-Temporal Attunement (STA) Block integrates two key components in parallel. Relational Semantics Graph Convolution (RS-GC) constructs a rich structured representation by modeling multiple priors such as physical connectivity, symmetry relationships, and functional groupings, while aggregating features at both the joint and component levels. Meanwhile, Motion Signature Graph Convolution (MS-GC) learns a dynamic, instance-specific topological graph from the data to capture instantaneous motion patterns. Temporally, the Temporal Multi-Scale Aggregation (TMA) Module effectively captures fine-grained motion at varying rates through multi-way dilated convolutions, and the Temporal Saliency Modulator (TSM) further enhances the feature weights of keyframes. These improvements significantly enhance the accuracy and efficiency of gesture recognition. The experimental results demonstrate that our model achieves an accuracy of 97.62% on the 14-class task and 95.36% on the 28-class task of the SHREC’17 Track dataset, as well as 93.22% on the FPHA dataset.
Article: "SA-STGCN: Structural-Adaptive Spatio-Temporal Graph Convolution with Spatio-Temporal Attunement for skeleton-based gesture recognition". Robotics and Autonomous Systems, vol. 199, Article 105371.
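The graph-convolution building block that SGR methods build on can be sketched generically. This is the standard normalized-adjacency GCN propagation rule, not the paper's RS-GC/MS-GC modules; the tiny three-joint "skeleton" and random weights are illustrative.

```python
import numpy as np

def gcn_layer(X, A, W):
    """H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W): each joint aggregates
    features from its graph neighbors (and itself), then a learned
    projection W mixes the channels."""
    A_hat = A + np.eye(A.shape[0])                     # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(0.0, A_norm @ X @ W)

# Three "joints" in a chain (e.g. wrist - palm - fingertip), 2-D features
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4))
H = gcn_layer(X, A, W)
print(H.shape)    # (3, 4): per-joint features after one propagation step
```

A fixed anatomical A encodes the "structural prior" side of the trade-off the abstract describes; making A itself a learned, input-dependent matrix is the "adaptive dynamic graph" side.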
Pub Date: 2026-05-01 | Epub Date: 2026-02-09 | DOI: 10.1016/j.robot.2026.105389
Mohamed Manzour, Catherine M. Elias, Omar M. Shehata, Rubén Izquierdo, Miguel Ángel Sotelo
Research on lane change prediction has gained attention in the last few years. Most existing works in this area have been conducted in simulation environments or with pre-recorded datasets, and they often rely on simplified assumptions about sensing, communication, and traffic behavior that do not always hold in practice. Real-world deployments of lane-change prediction systems are relatively rare, and when they are reported, the practical challenges, limitations, and lessons learned are often under-documented. This study explores cooperative lane-change prediction through a real hardware deployment in mixed traffic and shares the insights that emerged during implementation and testing. The studied architecture integrates stereo-camera perception, wireless communication, a knowledge-graph-based intention prediction module, and automated longitudinal control implemented on embedded platforms. It is implemented on an ego vehicle and a target vehicle and evaluated in a three-vehicle scenario where a third vehicle acts as the preceding vehicle that forces the target vehicle to change lanes. Real-road experiments show that, when the cooperative prediction module is enabled, the ego vehicle can anticipate the target vehicle’s lane-change intention about 4 s before the actual lane crossing and decelerate early to open a safe gap, whereas disabling prediction leads to late reactions and aggressive braking. The experiments also reveal constraints that are critical for real deployments. Perception pipelines are sensitive to outdoor lighting, so tests must be scheduled at times and locations with more stable illumination. A precomputed lookup table keeps prediction fast on embedded devices. Communication reliability and thermal effects on the hardware, especially in hot weather, can noticeably affect the system behavior.
By documenting these experiences together with the observed behavior of the vehicles with and without prediction, the study provides practical guidance for others working on similar cooperative prediction systems.
Article: "Design insights and comparative evaluation of a hardware-based cooperative perception architecture for lane change prediction". Robotics and Autonomous Systems, vol. 199, Article 105389.
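The "precomputed lookup table" mentioned above is a general embedded-systems trick: evaluate the expensive predictor offline over a discretized feature grid, then reduce online inference to a single index operation. The features (gap and closing-speed bins) and the stand-in predictor below are hypothetical; the paper does not specify its table layout.

```python
import itertools

def build_lut(predict, n_gap_bins, n_speed_bins):
    """Evaluate `predict` offline for every discretized (gap, speed) cell."""
    return {(g, s): predict(g, s)
            for g, s in itertools.product(range(n_gap_bins),
                                          range(n_speed_bins))}

# Hypothetical stand-in for the expensive knowledge-graph inference:
# predict a lane change when the gap is small and the closing speed is high.
def slow_predictor(gap_bin, speed_bin):
    return "lane_change" if gap_bin < 2 and speed_bin > 2 else "keep_lane"

LUT = build_lut(slow_predictor, n_gap_bins=8, n_speed_bins=8)
print(LUT[(1, 5)])   # online: one dict lookup instead of model inference
```

The trade-off is memory and quantization error versus deterministic, model-free latency on the embedded platform.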
Pub Date: 2026-05-01 | Epub Date: 2026-01-31 | DOI: 10.1016/j.robot.2026.105375
Manuel Hernandez-Mejia, David Romero, Tamás Ruppert, Federico Guedea, Omkar Salunkhe, Ciro Rodriguez, Johan Stahre
Wire harnesses are critical components for the smartification, connectivity, and electrification trends in consumer and industrial products. However, their manufacturing remains predominantly manual, with over 90% of process tasks still performed by human operators. This dependence on manual labour persists due to the inherent challenges of manipulating wire harnesses, which are classified as Deformable Linear Objects (DLOs). These objects exhibit non-linear and unpredictable deformation behaviours, making them difficult to manipulate with conventional robotic systems. As a result, there is an increasing demand for advanced robotic and cobotic manipulation solutions customised to support wire harness (co-)assembly processes. This systematic literature review explores the current state of the art in robotic manipulation systems for DLOs, with a focus on their application to wire harness (co-)assembly processes, covering three key domains: (i) Wire Harness Manufacturing Process(es) Automation, (ii) Handling and Holding (Manipulation) Systems for DLOs, and (iii) Robot Gripper Design Methodologies. The review addresses three main research questions related to the adaptability of existing design methodologies for DLO manipulation systems, the role of enabling technologies, and the potential development of a reference design methodology for robotic and cobotic manipulation systems for wire harnesses. Findings highlight significant progress in areas such as tactile sensing, soft robotics, dual-arm coordination, and CAD-based robot programming. However, research gaps remain in real-time deformation estimation of DLOs, adaptive robot motion planning, and ergonomic task allocation methods for human-robot collaborative workstations.
The study concludes with a research opportunities heatmap and a future research framework that visually summarises its findings, aiming to guide future research and technological development efforts for flexible, efficient, and scalable robotic or cobotic manipulation systems for wire harness (co-)assembly.
Article: "A review of robotic manipulation solutions for deformable linear objects: The case of wire harnesses (Co-)assembly by robots". Robotics and Autonomous Systems, vol. 199, Article 105375.
Assistive robots can support collaborative manipulation tasks such as carrying heavy or extended objects. As human–human interaction is the basis for human–robot interaction, it is important to understand and quantify primarily haptic interaction. In this study, subjects performed a joint carrying task both with a human partner and with an assistive robot. The subjects’ movements were recorded with a 3D motion capture system to determine spatio-temporal parameters and upper and lower body kinematics. The human–human interaction provided foundational data on human movement in collaborative manipulation tasks. The task with the robot revealed almost no changes in upper body kinematics; however, it was slower and showed adaptations of the human movement in the center of mass motion, in spatio-temporal parameters, and in lower body kinematics. This shows that analyzing the interaction between humans and assistive robots with a focus on human movement is essential for further developing assistive robots.
Article: "Human–human and human–robot co–manipulation: A biomechanical analysis of a joint carrying task" by Fabian Goell, Bjoern Braunstein, Jule Heieis, Daniel Braun, Nadine Reißner, Kirill Safronov, Christian Weiser, Verena Schuengel, Kirsten Albracht. Robotics and Autonomous Systems, vol. 199, Article 105360. Pub Date: 2026-05-01 | DOI: 10.1016/j.robot.2026.105360.
Pub Date: 2026-05-01 | Epub Date: 2026-02-04 | DOI: 10.1016/j.robot.2026.105373
Daud Khan, Sawera Aslam, Sudeb Mondal, KyungHi Chang
Autonomous vehicles require robust and context-aware decision-making to safely navigate complex urban intersections. To address the challenges of perception uncertainty, communication delay, and multi-agent interaction, this paper proposes a novel framework combining multi-modal sensor fusion with confidence-weighted V2X message aggregation and dual-attention reinforcement learning. In the proposed system, roadside units (RSUs) employ an extended Kalman filter (EKF) to integrate LiDAR and camera data with CAM, CPM, and DENM messages over the 5G NR PC5 sidelink, generating a unified environmental representation with confidence weighting. This fused state is periodically broadcast to vehicles, where each onboard unit applies a dual-attention module to extract salient temporal and spatial features for policy learning. A Dual-Attention PPO (DA-PPO) agent then optimizes intersection maneuvers (lane changing, collision avoidance, and traffic flow management) using these context-rich inputs. Simulation results using the V2AIX dataset demonstrate that the proposed DA-PPO achieves up to 97.4% decision accuracy, 15%–20% higher packet-delivery reliability, and 2.3× faster policy convergence compared with baseline A2C (PC5 interface) and PPO models. Furthermore, a decision-accuracy-based autonomy sublevel classification is introduced to benchmark high-autonomy decision performance with reference to SAE autonomy levels within the evaluated intersection scenarios. Overall, the proposed approach enables scalable, interpretable, and communication-aware autonomy for next-generation intelligent transportation systems.
{"title":"Achieving Level-4 autonomy in urban intersections through EKF-based multi-modal fusion enhanced by dual-attention PPO","authors":"Daud Khan, Sawera Aslam, Sudeb Mondal, KyungHi Chang","doi":"10.1016/j.robot.2026.105373","DOIUrl":"10.1016/j.robot.2026.105373","url":null,"abstract":"<div><div>Autonomous vehicles require robust and context-aware decision-making to safely navigate complex urban intersections. To address the challenges of perception uncertainty, communication delay, and multi-agent interaction, this paper proposes a novel framework combining multi-modal sensor fusion with confidence-weighted V2X message aggregation and dual-attention reinforcement learning. In the proposed system, RSUs employ an EKF to integrate LiDAR and camera data with CAM, CPM, and DENM messages over the 5G NR PC5 sidelink, generating a unified environmental representation with confidence weighting. This fused state is periodically broadcast to vehicles, where each onboard unit applies a dual-attention module to extract salient temporal and spatial features for policy learning. A Dual-Attention PPO (DA-PPO) agent then optimizes intersection maneuvers, including lane changing, collision avoidance, and traffic flow management, using these context-rich inputs. Simulation results using the V2AIX dataset demonstrate that the proposed DA-PPO achieves up to 97.4% decision accuracy, 15%–20% higher packet-delivery reliability, and 2.3× faster policy convergence compared with baseline A2C (PC5 interface) and PPO models. Furthermore, a decision-accuracy-based autonomy sublevel classification is introduced to benchmark high-autonomy decision performance with reference to SAE autonomy levels within the evaluated intersection scenarios. 
Overall, the proposed approach enables scalable, interpretable, and communication-aware autonomy for next-generation intelligent transportation systems.</div></div>","PeriodicalId":49592,"journal":{"name":"Robotics and Autonomous Systems","volume":"199 ","pages":"Article 105373"},"PeriodicalIF":5.2,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146174557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-05-01 | Epub Date: 2026-01-30 | DOI: 10.1016/j.robot.2026.105374
Prashant Kumar , Weiwei Wan , Kensuke Harada
Soft pneumatic fingers are of great research interest. However, their significant potential is limited because most of them can generate only one motion, typically bending. The conventional design of soft fingers does not allow them to switch to another motion mode. In this paper, we developed a novel multi-modal, single-actuated soft finger whose motion mode is switched by changing the finger’s temperature. Our soft finger is capable of switching between three distinctive motion modes: bending, twisting, and extension, within approximately five seconds. We carried out a detailed experimental study of the soft finger and evaluated its repeatability and range of motion. It exhibited repeatability of around one millimeter and a fifty percent larger range of motion than a standard bending actuator. We developed an analytical model for a fiber-reinforced soft actuator in twisting motion, which relates the input pressure to the output twist radius. The model was validated experimentally. Further, a soft robotic gripper with multiple grasp modes was developed using three actuators. This gripper can adapt to and grasp objects across a wide range of sizes, shapes, and stiffnesses. We showcased its grasping capabilities by successfully grasping a small berry, a large roll, and a delicate tofu cube.
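The abstract does not give the analytical pressure-to-twist-radius relation, so the following sketch substitutes a crude linear small-twist model in its place: the actuator is treated as an arc of fixed length whose total twist angle grows linearly with pressure. The coefficient `K_TWIST` and length `L_FINGER` are hypothetical placeholders, not the paper's identified parameters.

```python
# Illustrative stand-in for an analytical twist model; both constants
# below are assumed values, not identified from the actual actuator.
K_TWIST = 0.012    # twist angle per unit pressure (rad/kPa), assumed
L_FINGER = 0.10    # actuator length (m), assumed

def twist_radius(pressure_kpa):
    """Return an approximate twist radius for a given input pressure.

    Treats the actuator as an arc of fixed length L_FINGER subtending
    a total twist angle of K_TWIST * pressure, so radius = length/angle.
    """
    theta = K_TWIST * pressure_kpa       # total twist angle (rad)
    if theta == 0.0:
        return float("inf")              # zero pressure: no twist
    return L_FINGER / theta              # radius shrinks as pressure rises
```

Under this assumed linear model, doubling the pressure halves the twist radius, which matches the qualitative trend one would expect from the paper's experiments but not necessarily its quantitative fit.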
{"title":"Temperature driven multi-modal/single-actuated soft finger","authors":"Prashant Kumar , Weiwei Wan , Kensuke Harada","doi":"10.1016/j.robot.2026.105374","DOIUrl":"10.1016/j.robot.2026.105374","url":null,"abstract":"<div><div>Soft pneumatic fingers are of great research interest. However, their significant potential is limited as most of them can generate only one motion, mostly bending. The conventional design of soft fingers does not allow them to switch to another motion mode. In this paper, we developed a novel multi-modal and single-actuated soft finger where its motion mode is switched by changing the finger’s temperature. Our soft finger is capable of switching between three distinctive motion modes: bending, twisting, and extension, within approximately five seconds. We carried out a detailed experimental study of the soft finger and evaluated its repeatability and range of motion. It exhibited repeatability of around one millimeter and a fifty percent larger range of motion than a standard bending actuator. We developed an analytical model for a fiber-reinforced soft actuator for twisting motion. This helped us relate the input pressure to the output twist radius of the twisting motion. This model was validated by experimental verification. Further, a soft robotic gripper with multiple grasp modes was developed using three actuators. This gripper can adapt to and grasp objects across a wide range of sizes, shapes, and stiffnesses. 
We showcased its grasping capabilities by successfully grasping a small berry, a large roll, and a delicate tofu cube.</div></div>","PeriodicalId":49592,"journal":{"name":"Robotics and Autonomous Systems","volume":"199 ","pages":"Article 105374"},"PeriodicalIF":5.2,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146174558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-05-01 | Epub Date: 2026-01-22 | DOI: 10.1016/j.robot.2026.105362
Yali Han , Zhiyang Chen , Han Sun , Songli Sun , Shunyu Liu
This work presents the development of a motor-powered, lasso-driven lower extremity exoskeleton robot together with motion control algorithms for its drive. The proposed methods are designed to suit rehabilitation training at its different stages. In the early stage of rehabilitation, we employ gravity-compensated position control and sliding mode variable structure control to achieve precise trajectory-following movement and to validate the amplification effect of the exoskeleton. In the later rehabilitation stage, we employ bimodal switching control to improve the coordination and interaction between human and machine: impedance control during the support phase and moment feedback control during the swing phase. This research aims to provide a control method for exoskeletons with the central aim of assisting the rehabilitation of patients’ lower limbs.
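The support-phase impedance control with gravity compensation described above can be sketched for a single rotational joint. The spring and damper gains, link mass, and center-of-mass distance below are assumed placeholders, not the paper's identified parameters or control law.

```python
import math

# Assumed single-link parameters (placeholders, not the paper's values)
M_LINK = 4.0    # link mass (kg)
L_COM = 0.25    # distance from joint axis to center of mass (m)
G = 9.81        # gravitational acceleration (m/s^2)

def gravity_torque(theta):
    """Gravity compensation term for one rotational joint, where theta
    is the joint angle measured from the vertical (rad)."""
    return M_LINK * G * L_COM * math.sin(theta)

def impedance_torque(theta, dtheta, theta_d, dtheta_d, k=40.0, b=5.0):
    """Support-phase impedance control: render a virtual spring-damper
    around the desired trajectory on top of gravity compensation."""
    return k * (theta_d - theta) + b * (dtheta_d - dtheta) + gravity_torque(theta)
```

With this structure, the joint feels compliant around the reference motion: a tracking error produces a restoring torque proportional to the stiffness `k`, while the gravity term keeps the limb from sagging at rest.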
{"title":"A motion control study of lower extremity exoskeleton for different stages of rehabilitation","authors":"Yali Han , Zhiyang Chen , Han Sun , Songli Sun , Shunyu Liu","doi":"10.1016/j.robot.2026.105362","DOIUrl":"10.1016/j.robot.2026.105362","url":null,"abstract":"<div><div>This work presents the creation of a motor-powered lower extremity exoskeleton robot that uses motion control algorithms derived from lasso-driven motion. The provided methodologies are designed to optimize the rehabilitative training process at various phases. In the early stages of rehabilitation, we employ gravity-compensated position control and sliding mode variable structure control to achieve precise trajectory-following movement and validate the amplification effect of the exoskeleton. In the later rehabilitation phase, it is advisable to employ bimodal switching control to improve the coordination and interaction between humans and machines. This solution involves implementing impedance control during the support phase and moment feedback control during the swing phase. This research aims to provide a method of control for exoskeletons with the central aim of enhancing the lower limbs of patients.</div></div>","PeriodicalId":49592,"journal":{"name":"Robotics and Autonomous Systems","volume":"199 ","pages":"Article 105362"},"PeriodicalIF":5.2,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146174896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}