Current guide robot systems have two main issues: (1) they support only a single mode of interaction (proactive or reactive) and lack a coordination mechanism, and (2) they rely heavily on predefined content, which hinders a natural, flexible and human-like interaction experience. To address these issues, this paper proposes a dual-mode human–robot interaction (HRI) method based on a large language model (LLM). The method comprises (1) a proactive interaction module, which uses the robot's own sensors to perceive environmental information in real time and provide human-like services such as safety alerts, situational announcements and personalised recommendations, and (2) a reactive interaction module, which integrates a query router with retrieval-augmented generation (RAG) to build an adaptive response mechanism that delivers more accurate responses while optimising response efficiency. Validation in guided-tour scenarios confirms the effectiveness of the proposed method. Results demonstrate that it achieves a 92% F1-score (8 percentage points [PPs] higher than a pure LLM and 6 PPs higher than traditional RAG), reduces response latency by 48.4% compared with the standard retrieval-cosine method (the fastest baseline among static RAG approaches) and obtains higher Likert-scale ratings for naturalness (4.35), intelligence (4.05), dependability (4.48) and stimulation (4.45) than the other evaluated methods. This study offers a scalable technical pathway for advancing HRI systems towards more natural and anthropomorphic interaction paradigms.
{"title":"Research on a Dual-Mode Human–Robot Interaction Method Based on a Large Language Model","authors":"Jingjing Guo, Xi Han","doi":"10.1049/csy2.70037","DOIUrl":"https://doi.org/10.1049/csy2.70037","url":null,"abstract":"<p>Current guide robot systems have two main issues: (1) they only support a single mode of interaction (proactive or reactive) and lack a coordination mechanism and (2) they rely heavily on predefined content, which hinders the realisation of a natural and flexible human-like interaction experience. To address these issues, this paper proposes a dual-mode human–robot interaction (HRI) method based on a large language model (LLM). This method includes the following: (1) proactive interaction module. This module uses the robot's own sensors to perceive environmental information in real time, enabling it to provide various human-like services, such as safety alerts, situational announcements, and personalised recommendations. (2) Reactive interaction module. This integrates a query router with retrieval-augmented generation (RAG) method to build an adaptive response mechanism, which aims to provide more accurate responses while optimising response efficiency. Validation in guided tour scenarios confirms the efficiency of the proposed method. Results demonstrate that the proposed method achieves a 92% <i>F</i>1-score (improving 8 percentage points [PPs] over pure LLM and 6 PPs over traditional RAG), has a 48.4% improvement in response latency compared to the standard retrieval-cosine method (the fastest baseline among static RAG approaches) and achieves higher Likert-scale ratings in naturalness (4.35), intelligence (4.05), dependability (4.48) and stimulation (4.45) than other evaluated methods. This study proposes a scalable technical pathway for advancing human–robot interaction systems towards more natural and anthropomorphic interaction paradigms.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"8 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70037","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145969782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Foreign object debris (FOD) detection is critical to aircraft safety, but existing visual algorithms struggle to detect tiny objects and to operate in low-light conditions. Laser line-scan cameras enable FOD detection in low light, yet how best to exploit the multi-channel images they produce remains an open question. To address these issues, this paper proposes a tiny FOD detection algorithm (TFD-Net) suitable for both laser line-scan and visible-light cameras, along with a new multi-channel information fusion (NMIF) method based on laser line-scan camera image features. TFD-Net is designed specifically for tiny FOD and has three key parts: a loss function based on a two-dimensional (2D) Gaussian distribution, a multi-scale detection head and an improved pooling module. These designs effectively extract tiny-FOD features and achieve high-precision detection. NMIF makes better use of the three-channel image features acquired by the laser line-scan camera, significantly improving the camera's effectiveness in FOD detection.
{"title":"Tiny Foreign Object Debris Detection Considering Multi-Channel Information Fusion and Gaussian Distribution of Pixels","authors":"Zhicong Lu, Guoliang Liu, Changteng Shi, Yichao Cao, Dongxuan Li, Guohui Tian","doi":"10.1049/csy2.70039","DOIUrl":"https://doi.org/10.1049/csy2.70039","url":null,"abstract":"<p>Foreign object debris (FOD) detection is critical to aircraft safety, but existing visual algorithms have difficulty in detecting tiny objects and in low-light conditions. FOD detection in low-light conditions can be achieved using laser line-scan cameras, but there is still a lot of room for research on how to better use the multi-channel images obtained by the camera. To address these issues, this paper proposes a tiny FOD detection algorithm (TFD-Net) suitable for laser line-scan cameras and visible light cameras, along with a new multi-channel information fusion (NMIF) method based on laser line-scan camera image features. The proposed TFD-Net is designed specifically for tiny FOD with three key parts: a loss function based on two-dimensional (2D) Gaussian distribution, a multi-scale detection head and an improved pooling module. These designs can effectively extract tiny FOD features and achieve high-precision detection. The proposed NMIF makes better use of the three-channel image features acquired by the laser line-scan camera, improving the effectiveness of the laser line-scan camera in FOD detection significantly.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"8 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70039","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145887935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haoliang Xu, Syed Muhammad Nashit Arshad, Shichi Peng, Han Xu, Hang Yin, Qiang Li
Dexterous robotic hands are essential for various tasks in dynamic environments, but challenges such as slip detection and grasp stability limit real-time performance. Traditional grasping methods often fail to detect subtle slip events, leading to unstable grasps. This paper proposes a real-time slip detection and force compensation system that uses a hybrid convolutional neural network and long short-term memory (CNN-LSTM) architecture to detect slips and enhance grasp stability. The system combines tactile sensing with deep learning to detect slips and dynamically adjust individual finger grasping forces, ensuring precise and stable object grasping. The hybrid CNN-LSTM architecture captures both the spatial and the temporal features of slip dynamics, enabling robust slip detection and grasp stabilisation. Data augmentation techniques expand the limited experimental data into a comprehensive dataset, improving training efficiency and model generalisation. The approach extends slip detection to individual fingers, allowing real-time monitoring and targeted force compensation when a slip is detected on a specific finger, which ensures adaptive and stable grasping even in dynamic environments. Experimental results demonstrate significant improvements, with the CNN-LSTM model achieving an 82% grasp success rate, outperforming a plain CNN (70%), a plain LSTM (72%) and traditional proportional–integral–derivative (PID) control alone (54%). The system's real-time force adjustment prevents object drops and enhances overall grasp stability, making it well suited to applications in industrial automation, healthcare and service robotics. Although the CNN-LSTM architecture is a well-established approach, it delivers exceptional performance in this task, achieving high accuracy and robustness in slip detection and grasp stabilisation.
{"title":"Slip Detection and Stable Grasping With Multi-Fingered Robotic Hand Using Deep Learning Approach","authors":"Haoliang Xu, Syed Muhammad Nashit Arshad, Shichi Peng, Han Xu, Hang Yin, Qiang Li","doi":"10.1049/csy2.70036","DOIUrl":"10.1049/csy2.70036","url":null,"abstract":"<p>Dexterous robotic hands are essential for various tasks in dynamic environments, but challenges such as slip detection and grasp stability affect real-time performance. Traditional grasping methods often fail to detect subtle slip events, leading to unstable grasps. This paper proposes a real-time slip detection and force compensation system using a hybrid convolutional neural networks and long short-term memory (CNN-LSTM) architecture to detect slip to enhance grasp stability. The system combines tactile sensing with deep learning to detect slips and dynamically adjust individual finger grasping forces, ensuring precise and stable object grasping. The proposed system leverages a hybrid CNN-LSTM architecture to effectively capture both spatial and temporal features of slip dynamics, enabling robust slip detection and grasp stabilisation. By employing data augmentation techniques, the system generates a comprehensive dataset from limited experimental data, enhancing training efficiency and model generalisation. The approach extends slip detection to individual fingers, allowing real-time monitoring and targeted force compensation when a slip is detected on a specific finger. This ensures adaptive and stable grasping, even in dynamic environments. Experimental results demonstrate significant improvements, with the CNN-LSTM model achieving an 82% grasp success rate, outperforming traditional CNN (70%), LSTM (72%), and only traditional proportional–integral–derivative PID (54%) methods. The system's real-time force adjustment capability prevents object drops and enhances overall grasp stability, making it highly scalable for applications in industrial automation, healthcare, and service robots. Despite the CNN-LSTM architecture being a well-established approach, it demonstrates exceptional performance in this task, achieving high accuracy and robustness in slip detection and grasp stabilisation.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70036","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145686457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Han Xu, Mingqi Chen, Gaofeng Li, Lei Wei, Shichi Peng, Haoliang Xu, ZunRan Wang, Huibin Cao, Qiang Li
In robotic bimanual teleoperation, multimodal sensory feedback plays a crucial role, providing operators with a more immersive operating experience, reducing cognitive burden and improving operating efficiency. In this study, we develop an immersive bilateral isomorphic bimanual telerobotic system, which comprises dual arms and dual dexterous hands, with visual and haptic force feedback. To assess the performance of this system, we carried out a series of experiments and investigated the user's teleoperation experience. The results demonstrate that haptic force feedback enhances physical perception capabilities and complex task operating abilities. In addition, it compensates for visual perception deficiencies and reduces the operator's work burden. Consequently, our proposed system achieves more intuitive, realistic and immersive teleoperation, improves operating efficiency and expands the complexity of tasks that robots can perform through teleoperation.
{"title":"An Immersive Virtual Reality Bimanual Telerobotic System With Haptic Feedback","authors":"Han Xu, Mingqi Chen, Gaofeng Li, Lei Wei, Shichi Peng, Haoliang Xu, ZunRan Wang, Huibin Cao, Qiang Li","doi":"10.1049/csy2.70033","DOIUrl":"10.1049/csy2.70033","url":null,"abstract":"<p>In robotic bimanual teleoperation, multimodal sensory feedback plays a crucial role, providing operators with a more immersive operating experience, reducing cognitive burden and improving operating efficiency. In this study, we develop an immersive bilateral isomorphic bimanual telerobotic system, which comprises dual arms and dual dexterous hands, with visual and haptic force feedback. To assess the performance of this system, we carried out a series of experiments and investigated the user's teleoperation experience. The results demonstrate that haptic force feedback enhances physical perception capabilities and complex task operating abilities. In addition, it compensates for visual perception deficiencies and reduces the operator's work burden. Consequently, our proposed system achieves more intuitive, realistic and immersive teleoperation, improves operating efficiency and expands the complexity of tasks that robots can perform through teleoperation.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70033","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145619193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gesture recognition is a key task in the field of human–computer interaction (HCI). To address the low accuracy and poor real-time performance of existing recognition pipelines, this paper designs an HCI system based on gesture recognition. A dynamic gesture dataset for the defined interaction gestures is collected with the Ultraleap 3Di, whose high-precision tracking ensures reliable data collection. Using noncontact gesture interaction as the medium of human–computer collaboration, the paper constructs a framework that combines the advantages of convolutional neural networks (CNNs) and long short-term memory networks (LSTMs). The framework uses a CNN to extract features from the input frames; the extracted feature sequences are then fed into an LSTM to model temporal information, which proves very effective for classifying and recognising the defined dynamic gestures. Finally, an HCI system based on gesture recognition is designed. On the Unity3D platform, the UR5 robotic arm is modelled and the cyclic coordinate descent (CCD) algorithm is applied to solve its inverse kinematics, realising semantic gesture control of the UR5 arm. Experiments verify that the CNN–LSTM network ensures the real-time performance of the whole system and confirm the effectiveness and reliability of the Ultraleap 3Di-based gesture interaction system.
{"title":"Virtual Reality Integrated Human–Computer Interaction System Based on Ultraleap 3Di Hand Gestures Recognition","authors":"Chujie He, Xiangyu Zhou, Jiarui Zhang, Jing Luo, Yahong Chen, Xiaoli Liu, Shifeng Ma, Junjie Sun","doi":"10.1049/csy2.70035","DOIUrl":"10.1049/csy2.70035","url":null,"abstract":"<p>Gesture recognition is a key task in the field of human–computer interaction (HCI). To solve the problems of low accuracy and poor real-time performance in the recognition process, this paper designs a HCI system based on gesture recognition. This paper utilises the Ultraleap 3Di to collect the dynamic gesture dataset for the defined interaction gestures, and the high-precision device guarantees data collection. This paper constructs a framework incorporating the advantages of convolutional neural networks (CNNs) and long short-term memory networks (LSTM) using noncontact gesture interaction as the medium of human–computer collaboration. The framework utilises CNN to perform feature extraction on the input frame information. Then, the extracted feature sequences are fed into LSTM to process the timing information, which is very effective in classifying and recognising the defined dynamic gestures. Finally, a HCI system based on gesture recognition is designed. Based on the Unity3D platform, the UR5 robotic arm was modelled and the cyclic coordinate descent (CCD) algorithm was applied to solve the inverse kinematics, successfully realising the semantic control of gestures on the UR5 robotic arm. The experiment verifies that the CNN–LSTM network can ensure the real-time performance of the whole system and the effectiveness and reliability of the gesture interaction system based on Ultraleap 3Di.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70035","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Industrial anomaly detection is crucial for preventing equipment failures, yet challenges persist due to limited labelled data and complex fault patterns. This paper introduces the condition-adaptive refinement (CARe) framework, a self-supervised approach to anomaly detection that synthesises realistic training data through condition-guided diffusion and adaptive feature refinement. The framework features three innovations: a condition-controllable diffusion (CCD) model that generates pseudo-anomalous samples under spatial constraints, enhancing the synthetic data; an adaptive feature refinement (AFR) module that improves detection accuracy by reconstructing multi-scale features; and an anomaly-identification scheme that analyses reconstruction residuals without requiring labelled data. Experiments validate the method's effectiveness, demonstrating substantial improvements in detection accuracy and generalisability. CARe offers a robust solution for industrial anomaly detection under data scarcity.
{"title":"Self-Supervised Anomaly Detection for Substation Equipment With Realistic Diffusion-Based Synthesis and Adaptive Feature Refinement","authors":"Bo Xu, Jia Liu","doi":"10.1049/csy2.70032","DOIUrl":"10.1049/csy2.70032","url":null,"abstract":"<p>Industrial anomaly detection is crucial for preventing equipment failures, yet challenges persist due to limited labelled data and complex fault patterns. This paper introduces the condition-adaptive refinement (CARe) framework, a self-supervised approach to anomaly detection that synthesises realistic training data through condition-guided diffusion and adaptive feature refinement. The framework features three innovations: a condition-controllable diffusion (CCD) model generates pseudo-anomalous samples using spatial constraints, enhancing synthetic data. An adaptive feature refinement (AFR) module improves detection accuracy by reconstructing multi-scale features. The method identifies anomalies by analysing reconstruction residuals without labelled data. Experiments validate the method's effectiveness, demonstrating substantial improvements in detection accuracy and generalisability. CARe offers a robust solution for industrial anomaly detection under data scarcity.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70032","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, significant advancements have been made in enabling intelligent unmanned agents to achieve autonomous navigation and positioning within large-scale indoor or underground environments. Central to these achievements is simultaneous localization and mapping (SLAM) technology. Concurrently, the rapid evolution of LiDAR technologies has revolutionised SLAM, enhancing localisation and mapping capabilities in extreme environments characterised by high dynamics, sparse features or GPS denial. Although much research has concentrated on camera-based or GPS-fused SLAM, this paper provides a comprehensive review of the development of LiDAR-based multi-sensor fusion SLAM, with particular emphasis on GPS-denied environments and filter-based sensor fusion techniques. The paper is structured as follows: the first section introduces the relevant hardware and datasets; the second section delves into localisation methodologies; the third section discusses mapping processes; and the fourth section addresses open problems and suggests future research directions. Overall, this review offers a thorough analysis of development trends in LiDAR-based SLAM, covering both hardware and software aspects, and provides readers with a clear reference workflow for engineering deployable technologies adaptable to various application scenarios.
{"title":"GPS-Denied LiDAR-Based SLAM—A Survey","authors":"Haolong Jiang, Yikun Cheng, Weichen Dai, Wenbin Wan, Qinyao Liu, Fanxin Wang","doi":"10.1049/csy2.70031","DOIUrl":"10.1049/csy2.70031","url":null,"abstract":"<p>In recent years, significant advancements have been made in enabling intelligent unmanned agents to achieve autonomous navigation and positioning within large-scale indoor or underground environments. Central to these achievements is simultaneous localization and mapping (SLAM) technology. Concurrently, the rapid evolution of LiDAR technologies has revolutionised SLAM, enhancing localisation and mapping capabilities in extreme environments characterised by high dynamics, sparse features or GPS-denied environment. Although much research has concentrated on camera-based SLAM or GPS-fused SLAM, this paper provides a comprehensive review of the development of LiDAR-based multi-sensor fusion SLAM with a particular emphasis on GPS-denied environments and filter-based sensor fusion techniques. The paper is structured as follows: The first section introduces the relevant hardware and datasets. The second section delves into the localisation methodologies employed. The third section discusses the mapping processes involved. The fourth section addresses open problems and suggests future research directions. Overall, this review aims to offer a thorough analysis of the development trends in SLAM with a focus on LiDAR-based methods, covering both hardware and software aspects, providing readers with a clear reference on workflow for engineering deliverable technologies that can be adapted to various application scenarios.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70031","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In human–robot collaboration, ensuring both safety and efficiency in obstacle avoidance remains a critical challenge. This paper proposes a sampling-based danger-aware artificial potential field (SDAPF) method for obstacle avoidance during human–robot collaboration and interaction. Existing methods often struggle with dynamic obstacles and varying environmental complexities, which can hinder their performance. To address these challenges, SDAPF integrates three key components: position sampling for local minimum avoidance; a novel hazard index that quantifies risk based on the distance and relative velocity between the robot and dynamic obstacles; and a dynamic obstacle motion prediction module leveraging depth image data. These features enable intelligent path selection, adaptive step size adjustments based on obstacle dynamics and proactive decision-making for collision-free navigation. The hazard index allows the robot to dynamically assess the urgency of avoiding an obstacle, whereas the motion prediction module anticipates future positions of moving obstacles, enabling the robot to plan paths in advance. The effectiveness of SDAPF is demonstrated through both simulations and real-world experiments, highlighting its potential to significantly enhance safety and operational efficiency in complex human–robot interaction scenarios.
{"title":"Adaptive Obstacle Avoidance Using Vision-Based Dynamic Prediction and Strategic Motion Planning","authors":"Jianhang Shang, Guoliang Liu, Tenglong Zhang, Haoyang He, Guohui Tian, Wei Li, Zhenhua Liu","doi":"10.1049/csy2.70034","DOIUrl":"10.1049/csy2.70034","url":null,"abstract":"<p>In human–robot collaboration, ensuring both safety and efficiency in obstacle avoidance remains a critical challenge. This paper proposes a sampling-based danger-aware artificial potential field (SDAPF) method for obstacle avoidance during human–robot collaboration and interaction. Existing methods often struggle with dynamic obstacles and varying environmental complexities, which can hinder their performance. To address these challenges, SDAPF integrates three key components: position sampling for local minimum avoidance, a novel hazard index that quantifies risk based on the distance and relative velocity between the robot and dynamic obstacles and a dynamic obstacle motion prediction module leveraging depth image data. These features enable intelligent path selection, adaptive step size adjustments based on obstacle dynamics and proactive decision-making for collision-free navigation. The hazard index allows the robot to dynamically assess the urgency of avoiding an obstacle, whereas the motion prediction module anticipates future positions of moving obstacles, enabling the robot to plan paths in advance. The effectiveness of SDAPF is demonstrated through both simulations and real-world experiments, highlighting its potential to significantly enhance safety and operational efficiency in complex human–robot interaction scenarios.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70034","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hua Huang, Hai Zhu, Xiaozhou Zhu, Wenjun Mei, Baosong Deng
Multi-robot source seeking in unknown environments is challenging due to the difficulties in coordinating multi-robot sensing, information fusion and path planning. Existing approaches often struggle with computational scalability and search efficiency, particularly when dealing with multiple sources. In this paper, we develop a distributed multi-robot multi-source seeking strategy that enables robots to discover multiple sources using local sensing and neighbourhood communication. Our approach consists of three key components. First, we design a distributed mapping technique that leverages Gaussian processes for probabilistic inference across the entire environment and adapts it for a decentralised setup. Second, we formulate the source-seeking problem as an informative path planning problem and design a new information-theoretic objective function that combines predicted source locations with environmental uncertainty to prevent robots from being trapped at discovered sources. Third, we develop a tree search algorithm for planning the actions of robots over a fixed-horizon cycle. The algorithm generates a sequence of points leading to the most informative location. Based on the sequence, the robot is guided to the target location by taking a fixed-step movement inspired by the principles of model predictive control. Simulations validate our approach across different scenarios with varying numbers of sources and robots. In particular, the proposed information-theoretic heuristic outperforms the broadly used uncertainty-first and mean-gradient-first approaches, reducing search steps by up to 36.7%. Furthermore, our approach achieves an improvement of up to 63.8% in search efficiency compared to state-of-the-art coverage-based methods for multi-robot multi-source seeking problems. The average computational time of the proposed method is below 90 ms, supporting its feasibility for real-time applications.
{"title":"Online Path Planning for Multi-Robot Multi-Source Seeking Using Distributed Gaussian Processes","authors":"Hua Huang, Hai Zhu, Xiaozhou Zhu, Wenjun Mei, Baosong Deng","doi":"10.1049/csy2.70030","DOIUrl":"10.1049/csy2.70030","url":null,"abstract":"<p>Multi-robot source seeking in unknown environments is challenging due to the difficulties in coordinating multi-robot sensing, information fusion and path planning. Existing approaches often struggle with computational scalability and search efficiency, particularly when dealing with multiple sources. In this paper, we develop a distributed multi-robot multi-source seeking strategy that enables robots to discover multiple sources using local sensing and neighbourhood communication. Our approach consists of three key components. First, we design a distributed mapping technique that leverages Gaussian processes for probabilistic inference across the entire environment and adapts it for a decentralised setup. Second, we formulate the source-seeking problem as an informative path planning problem and design a new information-theoretic objective function that combines predicted source locations with environmental uncertainty to prevent robots from being trapped at discovered sources. Third, we develop a tree search algorithm for planning the actions of robots over a fixed-horizon cycle. The algorithm generates a sequence of points leading to the most informative location. Based on the sequence, the robot is guided to the target location by taking a fixed-step movement inspired by the principles of model predictive control. Simulations validate our approach across different scenarios with varying numbers of sources and robots. In particular, the proposed information-theoretic heuristic outperforms the broadly used uncertainty-first and mean-gradient-first approaches, reducing search steps by up to 36.7%. Furthermore, our approach achieves an improvement of up to 63.8% in search efficiency compared to state-of-the-art coverage-based methods for multi-robot multi-source seeking problems. The average computational time of the proposed method is below 90 ms, supporting its feasibility for real-time applications.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70030","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate identification of fungal species is essential for effective diagnosis and treatment. Traditional microscopy-based methods are often subjective and time-consuming. Deep learning has emerged as a promising tool in this domain. However, existing deep learning models often struggle to generalise in the presence of class imbalance and subtle morphological differences, which are common in fungal image datasets. This study proposes MASA-Net, a deep learning framework that combines a fine-tuned DenseNet201 backbone with a multi-aspect channel–spatial attention (MASA) module. The attention mechanism refines spatial and channel-wise features by capturing multi-scale spatial patterns and adaptively emphasising informative channels. This enhances the network's ability to focus on diagnostically relevant fungal structures while suppressing irrelevant features. The MASA-Net is evaluated on the DeFungi dataset and demonstrates superior performance in terms of accuracy, precision, recall and F1-score. It also outperforms established attention mechanisms such as squeeze-and-excitation networks (SE) and convolutional block attention module (CBAM) under identical conditions. These results highlight MASA-Net's robustness and effectiveness in addressing class imbalance and structural variability, offering a reliable solution for automated fungal species identification.
{"title":"MASA-Net: Multi-Aspect Channel–Spatial Attention Network With Cross-Layer Feature Aggregation for Accurate Fungi Species Identification","authors":"Indranil Bera, Rajesh Mukherjee, Bidesh Chakraborty","doi":"10.1049/csy2.70029","DOIUrl":"https://doi.org/10.1049/csy2.70029","url":null,"abstract":"<p>Accurate identification of fungal species is essential for effective diagnosis and treatment. Traditional microscopy-based methods are often subjective and time-consuming. Deep learning has emerged as a promising tool in this domain. However, existing deep learning models often struggle to generalise in the presence of class imbalance and subtle morphological differences, which are common in fungal image datasets. This study proposes MASA-Net, a deep learning framework that combines a fine-tuned DenseNet201 backbone with a multi-aspect channel–spatial attention (MASA) module. The attention mechanism refines spatial and channel-wise features by capturing multi-scale spatial patterns and adaptively emphasising informative channels. This enhances the network's ability to focus on diagnostically relevant fungal structures while suppressing irrelevant features. The MASA-Net is evaluated on the DeFungi dataset and demonstrates superior performance in terms of accuracy, precision, recall and <i>F</i>1-score. It also outperforms established attention mechanisms such as squeeze-and-excitation networks (SE) and convolutional block attention module (CBAM) under identical conditions. These results highlight MASA-Net's robustness and effectiveness in addressing class imbalance and structural variability, offering a reliable solution for automated fungal species identification.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2025-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70029","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145146841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}