Imitation-relaxation reinforcement learning for sparse badminton strikes via dynamic trajectory generation.
Pub Date: 2025-09-02 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1649870
Yanyan Yuan, Yucheng Tao, Shaowen Cheng, Yanhong Liang, Yongbin Jin, Hongtao Wang
Robotic racket sports provide exceptional benchmarks for evaluating dynamic motion control capabilities in robots. Due to the highly non-linear dynamics of the shuttlecock, the stringent demands on robots' dynamic responses, and the convergence difficulties caused by sparse rewards in reinforcement learning, badminton strikes remain a formidable challenge for robot systems. To address these issues, this study proposes DTG-IRRL, a novel learning framework for badminton strikes that integrates imitation-relaxation reinforcement learning with dynamic trajectory generation. The framework demonstrates significantly improved training efficiency and performance, converging faster and doubling landing accuracy relative to the baseline. Analysis of the reward function within a specific hyperplane of the parameter space intuitively reveals the convergence difficulties arising from the inherent sparsity of rewards in racket sports and demonstrates the framework's effectiveness in mitigating local optima and slow convergence. Implemented on hardware with zero-shot transfer, the framework achieves a 90% hitting rate and a 70% landing accuracy, enabling sustained human-robot rallies. Cross-platform validation using the UR5 robot demonstrates the framework's generalizability while highlighting the requirement for high dynamic performance of robotic arms in racket sports.
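The abstract attributes the convergence difficulty to the sparse task reward and mitigates it by blending in a dense imitation term that is gradually relaxed. Below is a minimal sketch of such a reward schedule; the function names, the exponential shaping, and the annealing schedule are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def imitation_relaxation_reward(q, q_ref, landing_error, progress,
                                w_im0=1.0, relax_rate=3.0):
    """One-step reward blending a dense imitation term with the sparse task term.

    q, q_ref      : current and reference joint positions (arrays).
    landing_error : distance (m) from the shuttlecock's landing point to the
                    target; np.inf until a strike actually lands (sparse term).
    progress      : training progress in [0, 1], used to relax imitation.
    """
    # Dense imitation term: track a reference swing trajectory early in training.
    r_imitate = np.exp(-np.sum((q - q_ref) ** 2))
    # Sparse task term: non-zero only once the shuttlecock is struck and lands.
    r_task = 0.0 if np.isinf(landing_error) else np.exp(-landing_error)
    # Relaxation: anneal the imitation weight so the task term dominates later.
    w_im = w_im0 * np.exp(-relax_rate * progress)
    return w_im * r_imitate + r_task
```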
{"title":"Imitation-relaxation reinforcement learning for sparse badminton strikes via dynamic trajectory generation.","authors":"Yanyan Yuan, Yucheng Tao, Shaowen Cheng, Yanhong Liang, Yongbin Jin, Hongtao Wang","doi":"10.3389/fnbot.2025.1649870","DOIUrl":"10.3389/fnbot.2025.1649870","url":null,"abstract":"<p><p>Robotic racket sports provide exceptional benchmarks for evaluating dynamic motion control capabilities in robots. Due to the highly non-linear dynamics of the shuttlecock, the stringent demands on robots' dynamic responses, and the convergence difficulties caused by sparse rewards in reinforcement learning, badminton strikes remain a formidable challenge for robot systems. To address these issues, this study proposes DTG-IRRL, a novel learning framework for badminton strikes that integrates imitation-relaxation reinforcement learning with dynamic trajectory generation. The framework demonstrates significantly improved training efficiency and performance, achieving faster convergence and twice the landing accuracy. Analysis of the reward function within a specific parameter space hyperplane intuitively reveals the convergence difficulties arising from the inherent sparsity of rewards in racket sports and demonstrates the framework's effectiveness in mitigating local and slow convergence. Implemented on hardware with zero-shot transfer, the framework achieves a 90% hitting rate and a 70% landing accuracy, enabling sustained humanrobot rallies. Cross-platform validation using the UR5 robot demonstrates the framework's generalizability while highlighting the requirement for high dynamic performance of robotic arms in racket sports.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1649870"},"PeriodicalIF":2.8,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12436432/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145079389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Variable admittance control with sEMG-based support for wearable wrist exoskeleton.
Pub Date: 2025-09-01 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1562675
Charles Lambelet, Melvin Mathis, Marc Siegenthaler, Jeremia P O Held, Daniel Woolley, Olivier Lambercy, Roger Gassert, Nicole Wenderoth
Introduction: Wrist function impairment is common after stroke and heavily impacts the execution of daily tasks. Robotic therapy, in particular wearable exoskeletons, has the potential to boost training dose in context-relevant scenarios, promote voluntary effort through motor intent detection, and mitigate the effect of gravity. Portable exoskeletons are often non-backdrivable, which makes it challenging to keep their control safe, reactive, and stable. Admittance control is often used in this case; however, it can become unstable when the supported biological joint stiffens. Variable admittance control adapts its parameters dynamically to allow free motion and stabilize the human-robot interaction.
Methods: In this study, we implemented a variable admittance control scheme on a one-degree-of-freedom wearable wrist exoskeleton. The damping parameter of the admittance scheme is adjusted in real time to cope with instabilities and varying wrist stiffness. In addition to the admittance control scheme, sEMG- and gravity-based controllers were implemented, characterized, and optimized on ten healthy participants and tested on six stroke survivors.
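As a concrete illustration of the scheme described above, here is a minimal one-DOF variable admittance step in Python. The specific damping-adaptation rule (asymmetric first-order tracking of an instability index) and all parameter values are assumptions for illustration; the paper's actual adaptation law is not specified in this abstract.

```python
import numpy as np

def variable_admittance_step(tau_int, theta, omega, d, instability, dt,
                             inertia=0.01, d_min=0.02, d_max=0.5,
                             k_up=5.0, k_down=1.0):
    """One step of the admittance dynamics  J*domega/dt + D(t)*omega = tau_int.

    tau_int     : interaction torque measured at the wrist (Nm).
    theta, omega: current setpoint angle (rad) and velocity (rad/s).
    d           : current damping value, adapted online.
    instability : scalar in [0, 1], e.g. normalized high-frequency force energy.
    """
    # Map the instability index to a target damping value.
    d_target = d_min + (d_max - d_min) * np.clip(instability, 0.0, 1.0)
    # Asymmetric tracking: raise damping quickly for stability, lower it
    # slowly for transparency, mirroring the trade-off noted in the Results.
    rate = k_up if d_target > d else k_down
    d += rate * (d_target - d) * dt
    # Integrate the admittance model to obtain the new position setpoint.
    domega = (tau_int - d * omega) / inertia
    omega += domega * dt
    theta += omega * dt
    return theta, omega, d
```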
Results: The results show that (1) the variable admittance control scheme could stabilize the interaction, but at the cost of a decrease in transparency, and (2) when coupled with the variable admittance controller, the sEMG-based control enhanced wrist functionality of stroke survivors in the most extreme angular positions.
Discussion: Our variable admittance control scheme with sEMG- and gravity-based support was most beneficial for patients with higher levels of impairment, improving range of motion and promoting voluntary effort. Future work could combine both controllers to customize and fine-tune the stability of the support for a wider range of impairment levels and types.
{"title":"Variable admittance control with sEMG-based support for wearable wrist exoskeleton.","authors":"Charles Lambelet, Melvin Mathis, Marc Siegenthaler, Jeremia P O Held, Daniel Woolley, Olivier Lambercy, Roger Gassert, Nicole Wenderoth","doi":"10.3389/fnbot.2025.1562675","DOIUrl":"10.3389/fnbot.2025.1562675","url":null,"abstract":"<p><strong>Introduction: </strong>Wrist function impairment is common after stroke and heavily impacts the execution of daily tasks. Robotic therapy, and more specifically wearable exoskeletons, have the potential to boost training dose in context-relevant scenarios, promote voluntary effort through motor intent detection, and mitigate the effect of gravity. Portable exoskeletons are often non-backdrivable and it is challenging to make their control safe, reactive and stable. Admittance control is often used in this case, however, this type of control can become unstable when the supported biological joint stiffens. Variable admittance control adapts its parameters dynamically to allow free motion and stabilize the human-robot interaction.</p><p><strong>Methods: </strong>In this study, we implemented a variable admittance control scheme on a one degree of freedom wearable wrist exoskeleton. The damping parameter of the admittance scheme is adjusted in real-time to cope with instabilities and varying wrist stiffness. In addition to the admittance control scheme, sEMG- and gravity-based controllers were implemented, characterized and optimized on ten healthy participants and tested on six stroke survivors.</p><p><strong>Results: </strong>The results show that (1) the variable admittance control scheme could stabilize the interaction but at the cost of a decrease in transparency, and (2) when coupled with the variable admittance controller the sEMG-based control enhanced wrist functionality of stroke survivors in the most extreme angular positions.</p><p><strong>Discussion: </strong>Our variable admittance control scheme with sEMG- and gravity-based support was most beneficial for patients with higher levels of impairment by improving range of motion and promoting voluntary effort. Future work could combine both controllers to customize and fine tune the stability of the support to a wider range of impairment levels and types.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1562675"},"PeriodicalIF":2.8,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12434121/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145075000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
4D trajectory lightweight prediction algorithm based on knowledge distillation technique.
Pub Date: 2025-08-22 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1643919
Weizhen Tang, Jie Dai, Zhousheng Huang, Boyang Hao, Weizheng Xie
Introduction: To address the key challenges of current 4D trajectory prediction, namely limited multi-factor feature extraction and excessive computational cost, this study develops a lightweight prediction framework tailored for real-time air-traffic management.
Methods: We propose a hybrid RCBAM-TCN-LSTM architecture enhanced with a teacher-student knowledge distillation mechanism. The Residual Convolutional Block Attention Module (RCBAM) serves as the teacher network to extract high-dimensional spatial features via residual structures and channel-spatial attention. The student network adopts a Temporal Convolutional Network-LSTM (TCN-LSTM) design, integrating dilated causal convolutions and two LSTM layers for efficient temporal modeling. Historical ADS-B trajectory data from Zhuhai Jinwan Airport are preprocessed using cubic spline interpolation and a uniform-step sliding window to ensure data alignment and temporal consistency. In the distillation process, soft labels from the teacher and hard labels from actual observations jointly guide student training.
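The soft-label/hard-label guidance described above is typically realized as a weighted sum of two regression losses. Here is a minimal PyTorch sketch under that assumption; the weighting α and the use of MSE for both terms are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_pred, teacher_pred, target, alpha=0.5):
    """Weighted sum of soft (teacher) and hard (ground-truth) regression losses.

    All tensors share a shape such as (batch, horizon, 4), e.g. the predicted
    longitude/latitude/altitude/time of each future trajectory step.
    """
    soft = F.mse_loss(student_pred, teacher_pred.detach())  # follow the teacher
    hard = F.mse_loss(student_pred, target)                 # match observations
    return alpha * soft + (1.0 - alpha) * hard
```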
Results: In multi-step prediction experiments, the distilled RCBAM-TCN-LSTM model achieved average reductions of 40%-60% in MAE, RMSE, and MAPE compared with the original RCBAM and TCN-LSTM models, while improving R² by 4%-6%. The approach maintained high accuracy across different prediction horizons while reducing computational complexity.
Discussion: The proposed method effectively balances high-precision modeling of spatiotemporal dependencies with lightweight deployment requirements, enabling real-time air-traffic monitoring and early warning on standard CPUs and embedded devices. This framework offers a scalable solution for enhancing the operational safety and efficiency of modern air-traffic control systems.
{"title":"4D trajectory lightweight prediction algorithm based on knowledge distillation technique.","authors":"Weizhen Tang, Jie Dai, Zhousheng Huang, Boyang Hao, Weizheng Xie","doi":"10.3389/fnbot.2025.1643919","DOIUrl":"10.3389/fnbot.2025.1643919","url":null,"abstract":"<p><strong>Introduction: </strong>To address the challenges of current 4D trajectory prediction-specifically, limited multi-factor feature extraction and excessive computational cost-this study develops a lightweight prediction framework tailored for real-time air-traffic management.</p><p><strong>Methods: </strong>We propose a hybrid RCBAM-TCN-LSTM architecture enhanced with a teacher-student knowledge distillation mechanism. The Residual Convolutional Block Attention Module (RCBAM) serves as the teacher network to extract high-dimensional spatial features via residual structures and channel-spatial attention. The student network adopts a Temporal Convolutional Network-LSTM (TCN-LSTM) design, integrating dilated causal convolutions and two LSTM layers for efficient temporal modeling. Historical ADS-B trajectory data from Zhuhai Jinwan Airport are preprocessed using cubic spline interpolation and a uniform-step sliding window to ensure data alignment and temporal consistency. In the distillation process, soft labels from the teacher and hard labels from actual observations jointly guide student training.</p><p><strong>Results: </strong>In multi-step prediction experiments, the distilled RCBAM-TCN-LSTM model achieved average reductions of 40%-60% in MAE, RMSE, and MAPE compared with the original RCBAM and TCN-LSTM models, while improving <i>R</i> <sup>²</sup> by 4%-6%. The approach maintained high accuracy across different prediction horizons while reducing computational complexity.</p><p><strong>Discussion: </strong>The proposed method effectively balances high-precision modeling of spatiotemporal dependencies with lightweight deployment requirements, enabling real-time air-traffic monitoring and early warning on standard CPUs and embedded devices. This framework offers a scalable solution for enhancing the operational safety and efficiency of modern air-traffic control systems.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1643919"},"PeriodicalIF":2.8,"publicationDate":"2025-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12411499/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145014961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tri-manual interaction in hybrid BCI-VR systems: integrating gaze, EEG control for enhanced 3D object manipulation.
Pub Date: 2025-08-14 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1628968
Jian Teng, Sukyoung Cho, Shaw-Mung Lee
Brain-computer interface (BCI) integration with virtual reality (VR) has progressed from single-limb control to multi-limb coordination, yet achieving intuitive tri-manual operation remains challenging. This study presents a consumer-grade hybrid BCI-VR framework enabling simultaneous control of two biological hands and a virtual third limb through the integration of Tobii eye-tracking, NeuroSky single-channel EEG, and non-haptic controllers. The system employs e-Sense attention thresholds (>80% for 300 ms) to trigger virtual hand activation, combined with gaze-driven targeting within 45° visual cones. A soft maximum weighted arbitration algorithm resolves spatiotemporal conflicts between manual and virtual inputs with a 92.4% success rate. Experimental validation with eight participants across 160 trials demonstrated an 87.5% virtual hand success rate and a 41% spatial error reduction (σ = 0.23 mm vs. 0.39 mm) compared to traditional dual-hand control. The framework achieved 320 ms activation latency and a 22% NASA-TLX workload reduction through adaptive cognitive load management. Time-frequency analysis revealed characteristic beta-band (15-20 Hz) energy modulations during successful virtual limb control, providing neurophysiological evidence for attention-mediated supernumerary limb embodiment. These findings demonstrate that sophisticated algorithmic approaches can compensate for consumer-grade hardware limitations, enabling laboratory-grade precision in accessible tri-manual VR applications for rehabilitation, training, and assistive technologies.
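A minimal sketch of the two interaction mechanisms named above: the attention-threshold trigger (attention above 80 sustained for 300 ms) and a softmax-weighted ("soft maximum") arbitration over competing command sources. The class and function names, the temperature parameter, and the confidence-score inputs are illustrative assumptions.

```python
import time
import numpy as np

class AttentionTrigger:
    """Fire when e-Sense attention stays above a threshold for a hold time."""

    def __init__(self, threshold=80.0, hold_s=0.300):
        self.threshold, self.hold_s = threshold, hold_s
        self._above_since = None

    def update(self, attention, now=None):
        now = time.monotonic() if now is None else now
        if attention > self.threshold:
            if self._above_since is None:
                self._above_since = now
            return (now - self._above_since) >= self.hold_s
        self._above_since = None
        return False

def soft_max_arbitration(commands, scores, temperature=0.5):
    """Blend competing hand commands using softmax weights over confidences."""
    w = np.exp(np.asarray(scores, dtype=float) / temperature)
    w /= w.sum()
    return sum(wi * np.asarray(ci, dtype=float) for wi, ci in zip(w, commands))
```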
{"title":"Tri-manual interaction in hybrid BCI-VR systems: integrating gaze, EEG control for enhanced 3D object manipulation.","authors":"Jian Teng, Sukyoung Cho, Shaw-Mung Lee","doi":"10.3389/fnbot.2025.1628968","DOIUrl":"10.3389/fnbot.2025.1628968","url":null,"abstract":"<p><p>Brain-computer interface (BCI) integration with virtual reality (VR) has progressed from single-limb control to multi-limb coordination, yet achieving intuitive tri-manual operation remains challenging. This study presents a consumer-grade hybrid BCI-VR framework enabling simultaneous control of two biological hands and a virtual third limb through integration of Tobii eye-tracking, NeuroSky single-channel EEG, and non-haptic controllers. The system employs e-Sense attention thresholds (>80% for 300 ms) to trigger virtual hand activation combined with gaze-driven targeting within 45° visual cones. A soft maximum weighted arbitration algorithm resolves spatiotemporal conflicts between manual and virtual inputs with 92.4% success rate. Experimental validation with eight participants across 160 trials demonstrated 87.5% virtual hand success rate and 41% spatial error reduction (<i>σ</i> = 0.23 mm vs. 0.39 mm) compared to traditional dual-hand control. The framework achieved 320 ms activation latency and 22% NASA-TLX workload reduction through adaptive cognitive load management. Time-frequency analysis revealed characteristic beta-band (15-20 Hz) energy modulations during successful virtual limb control, providing neurophysiological evidence for attention-mediated supernumerary limb embodiment. These findings demonstrate that sophisticated algorithmic approaches can compensate for consumer-grade hardware limitations, enabling laboratory-grade precision in accessible tri-manual VR applications for rehabilitation, training, and assistive technologies.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1628968"},"PeriodicalIF":2.8,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12390853/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144950948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fine-grained image classification using the MogaNet network and a multi-level gating mechanism.
Pub Date: 2025-08-06 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1630281
Dahai Li, Su Chen
Fine-grained image classification faces challenges such as labeling difficulty, sample scarcity, and small inter-category differences. To address these problems, this study proposes a novel fine-grained image classification method based on the MogaNet network and a multi-level gating mechanism. A feature extraction network based on MogaNet is constructed and combined with multi-scale feature fusion to fully mine image information. A contextual information extractor is designed to align and filter the more discriminative local features using the semantic context of the network, strengthening the network's ability to capture detailed features. Meanwhile, a multi-level gating mechanism is introduced to obtain the saliency features of images, and a feature elimination strategy is proposed to suppress the interference of fuzzy-class features and background noise. A loss function is designed to constrain both the elimination of fuzzy-class features and the classification prediction. Experimental results demonstrate that the new method performs well on 5-shot tasks across four public datasets: Mini-ImageNet, CUB-200-2011, Stanford Dogs, and Stanford Cars. The accuracy rates reach 79.33%, 87.58%, 79.34%, and 83.82%, respectively, outperforming other state-of-the-art image classification methods.
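A hedged sketch of what a multi-level gating block with feature elimination could look like in PyTorch: per-stage channel gates produce saliency scores, and the lowest-saliency channels are masked out before fusion. The module structure, the keep-ratio thresholding, and the fusion by concatenation are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class MultiLevelGate(nn.Module):
    """Gate and fuse global descriptors from several network stages."""

    def __init__(self, channels):  # channels: per-stage channel counts
        super().__init__()
        self.gates = nn.ModuleList([nn.Sequential(nn.Linear(c, c), nn.Sigmoid())
                                    for c in channels])
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, feats, keep_ratio=0.8):
        fused = []
        for f, gate in zip(feats, self.gates):
            v = self.pool(f).flatten(1)   # (B, C) global descriptor per stage
            g = gate(v)                   # per-channel saliency gate in (0, 1)
            # Feature elimination: zero out the least salient channels,
            # suppressing fuzzy-class features and background noise.
            k = max(1, int(g.shape[1] * keep_ratio))
            thresh = g.topk(k, dim=1).values[:, -1:]
            mask = (g >= thresh).float()
            fused.append(g * mask * v)
        return torch.cat(fused, dim=1)    # concatenated salient descriptor
```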
{"title":"Fine-grained image classification using the MogaNet network and a multi-level gating mechanism.","authors":"Dahai Li, Su Chen","doi":"10.3389/fnbot.2025.1630281","DOIUrl":"10.3389/fnbot.2025.1630281","url":null,"abstract":"<p><p>Fine-grained image classification tasks face challenges such as difficulty in labeling, scarcity of samples, and small category differences. To address this problem, this study proposes a novel fine-grained image classification method based on the MogaNet network and a multi-level gating mechanism. A feature extraction network based on MogaNet is constructed, and multi-scale feature fusion is combined to fully mine image information. The contextual information extractor is designed to align and filter more discriminative local features using the semantic context of the network, thereby strengthening the network's ability to capture detailed features. Meanwhile, a multi-level gating mechanism is introduced to obtain the saliency features of images. A feature elimination strategy is proposed to suppress the interference of fuzzy class features and background noise. A loss function is designed to constrain the elimination of fuzzy class features and classification prediction. Experimental results demonstrate that the new method can be applied to 5-shot tasks across four public datasets: Mini-ImageNet, CUB-200-2011, Stanford Dogs, and Stanford Cars. The accuracy rates reach 79.33, 87.58, 79.34, and 83.82%, respectively, which shows better performance than other state-of-the-art image classification methods.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1630281"},"PeriodicalIF":2.8,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12364808/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144950965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integrated neural network framework for multi-object detection and recognition using UAV imagery.
Pub Date: 2025-07-30 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1643011
Mohammed Alshehri, Tingting Xue, Ghulam Mujtaba, Yahya AlQahtani, Nouf Abdullah Almujally, Ahmad Jalal, Hui Liu
Introduction: Accurate vehicle analysis from aerial imagery has become increasingly vital for emerging technologies and public-service applications such as intelligent traffic management, urban planning, autonomous navigation, and military surveillance. However, analyzing UAV-captured video poses several inherent challenges, such as the small size of target vehicles, occlusions, cluttered urban backgrounds, motion blur, and fluctuating lighting conditions, which hinder the accuracy and consistency of conventional perception systems. To address these complexities, our research proposes a fully end-to-end, deep learning-driven perception pipeline specifically optimized for UAV-based traffic monitoring. The proposed framework integrates multiple advanced modules: RetinexNet for preprocessing, HRNet-based segmentation to preserve high-resolution semantic information, and YOLOv11-based vehicle detection. Deep SORT is employed for efficient vehicle tracking, while CSRNet facilitates high-density vehicle counting. LSTM networks predict vehicle trajectories from temporal patterns, and a combination of DenseNet and SuperPoint is utilized for robust feature extraction. Finally, classification is performed using Vision Transformers (ViTs), leveraging attention mechanisms to ensure accurate recognition across diverse categories. The modular yet unified architecture is designed to handle spatiotemporal dynamics, making it suitable for real-time deployment on diverse UAV platforms.
Method: The framework combines state-of-the-art neural networks, each addressing a distinct sub-problem of aerial vehicle analysis. RetinexNet is used in preprocessing to normalize the illumination of each input frame. HRNet performs semantic segmentation, allowing accurate separation of vehicles from their surroundings. YOLOv11 provides fast, high-precision vehicle detection, and Deep SORT maintains reliable identity-preserving tracking of individual vehicles. CSRNet performs vehicle counting that remains robust to occlusions and congestion. LSTM models capture temporal motion patterns to forecast future vehicle positions. Feature extraction combines DenseNet and SuperPoint embeddings refined with an AutoEncoder. Finally, attention-based Vision Transformer models classify vehicles viewed from above. Each component is developed and integrated to improve performance in real-world UAV deployment.
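The per-frame data flow of this pipeline can be summarized in a short orchestration skeleton. Every model object and method below (enhance, segment, detect, update, count, classify) is a hypothetical stub standing in for the corresponding component; none of these are real library APIs.

```python
from dataclasses import dataclass, field

@dataclass
class FramePerception:
    boxes: list = field(default_factory=list)   # detections (YOLOv11 stage)
    tracks: list = field(default_factory=list)  # track IDs (Deep SORT stage)
    count: int = 0                              # density count (CSRNet stage)
    labels: list = field(default_factory=list)  # class labels (ViT stage)

def process_frame(frame, models):
    """Run one frame through the staged pipeline in the order described above.

    `models` maps stage names to hypothetical wrapper objects; the method
    names are placeholders, not actual APIs of the cited networks.
    """
    frame = models["retinex"].enhance(frame)          # illumination normalization
    masks = models["hrnet"].segment(frame)            # vehicle/background masks
    boxes = models["yolo"].detect(frame)              # vehicle detection
    tracks = models["deepsort"].update(boxes, frame)  # identity-preserving tracking
    count = models["csrnet"].count(frame, masks)      # occlusion-robust counting
    labels = [models["vit"].classify(frame, b) for b in boxes]
    return FramePerception(boxes, tracks, count, labels)
```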
Results: Our proposed framework significantly improves the accuracy, reliability, and efficiency of vehicle analysis from UAV imagery. The pipeline was rigorously evaluated on two widely used datasets, AU-AIR and Roundabout. On the AU-AIR dataset, the system achieved a detection accuracy of 97.8%, a tracking accuracy of 96.5%, and a classification accuracy of 98.4%. Similarly, on the Roundabout dataset, it reached 96.9% det
{"title":"Integrated neural network framework for multi-object detection and recognition using UAV imagery.","authors":"Mohammed Alshehri, Tingting Xue, Ghulam Mujtaba, Yahya AlQahtani, Nouf Abdullah Almujally, Ahmad Jalal, Hui Liu","doi":"10.3389/fnbot.2025.1643011","DOIUrl":"10.3389/fnbot.2025.1643011","url":null,"abstract":"<p><strong>Introduction: </strong>Accurate vehicle analysis from aerial imagery has become increasingly vital for emerging technologies and public service applications such as intelligent traffic management, urban planning, autonomous navigation, and military surveillance. However, analyzing UAV-captured video poses several inherent challenges, such as the small size of target vehicles, occlusions, cluttered urban backgrounds, motion blur, and fluctuating lighting conditions which hinder the accuracy and consistency of conventional perception systems. To address these complexities, our research proposes a fully end-to-end deep learning-driven perception pipeline specifically optimized for UAV-based traffic monitoring. The proposed framwork integrates multiple advanced modules: RetinexNet for preprocessing, segmentation using HRNet to preserve high-resolution semantic information, and vehicle detection using the YOLOv11 framework. Deep SORT is employed for efficient vehicle tracking, while CSRNet facilitates high-density vehicle counting. LSTM networks are integrated to predict vehicle trajectories based on temporal patterns, and a combination of DenseNet and SuperPoint is utilized for robust feature extraction. Finally, classification is performed using Vision Transformers (ViTs), leveraging attention mechanisms to ensure accurate recognition across diverse categories. The modular yet unified architecture is designed to handle spatiotemporal dynamics, making it suitable for real-time deployment in diverse UAV platforms.</p><p><strong>Method: </strong>The framework suggests using today's best neural networks that are made to solve different problems in aerial vehicle analysis. RetinexNet is used in preprocessing to make the lighting of each input frame consistent. Using HRNet for semantic segmentation allows for accurate splitting between vehicles and their surroundings. YOLOv11 provides high precision and quick vehicle detection and Deep SORT allows reliable tracking without losing track of individual cars. CSRNet are used for vehicle counting that is unaffected by obstacles or traffic jams. LSTM models capture how a car moves in time to forecast future positions. Combining DenseNet and SuperPoint embeddings that were improved with an AutoEncoder is done during feature extraction. In the end, using an attention function, Vision Transformer-based models classify vehicles seen from above. Every part of the system is developed and included to give the improved performance when the UAV is being used in real life.</p><p><strong>Results: </strong>Our proposed framework significantly improves the accuracy, reliability, and efficiency of vehicle analysis from UAV imagery. Our pipeline was rigorously evaluated on two famous datasets, AU-AIR and Roundabout. On the AU-AIR dataset, the system achieved a detection accuracy of 97.8%, a tracking accuracy of 96.5%, and a classification accuracy of 98.4%. 
Similarly, on the Roundabout dataset, it reached 96.9% det","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1643011"},"PeriodicalIF":2.8,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12343587/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144845568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NeuroVI-based wave compensation system control for offshore wind turbines.
Pub Date: 2025-07-30 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1648713
Fengshuang Ma, Xiangyong Liu, Zhiqiang Xu, Tianhong Ding
In deep-sea areas, the hoisting operation of offshore wind turbines is seriously affected by waves, and secondary impacts are prone to occur between the turbine and the pile foundation. To address this issue, this study proposes an integrated wave compensation system for offshore wind turbines based on a neuromorphic vision (NeuroVI) camera. The system employs a NeuroVI camera to achieve non-contact, high-precision, low-latency displacement detection of hydraulic cylinders, overcoming the limitations of traditional magnetostrictive displacement sensors, which respond slowly and are susceptible to interference in harsh marine conditions. A dynamic simulation model was developed using AMESim-Simulink co-simulation to analyze the compensation performance of the NeuroVI-based system under step and sinusoidal wave disturbances. Comparative results demonstrate that the NeuroVI feedback system achieves faster response times and superior stability over conventional sensors. Laboratory-scale model tests and real-world application in the installation of a 5.2 MW offshore wind turbine validated the system's feasibility and robustness, enabling real-time collaborative control of turbine and cylinder displacement to effectively mitigate multi-impact risks. This research provides an innovative approach for deploying neural perception technology in complex marine scenarios and advances the development of neuro-robotic systems in ocean engineering.
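The control objective described above, extending the cylinder to cancel wave-induced heave with the NeuroVI displacement estimate as feedback, can be sketched as a simple PI loop. The gains, the saturation limit, and the availability of a wave-heave signal are illustrative assumptions; the paper's actual controller is not specified in this abstract.

```python
def compensation_step(z_cyl_neurovi, z_wave, state, dt,
                      kp=2.0, ki=0.4, u_max=1.0):
    """One control step: drive the cylinder to cancel wave-induced heave.

    z_cyl_neurovi : cylinder displacement estimated by the NeuroVI camera (m).
    z_wave        : measured or predicted wave heave (m).
    state         : dict carrying the integrator, e.g. {"integral": 0.0}.
    Returns a saturated valve command for the hydraulic cylinder.
    """
    error = -z_wave - z_cyl_neurovi          # target extension opposes heave
    state["integral"] += error * dt          # PI integral term
    u = kp * error + ki * state["integral"]  # valve command
    return max(-u_max, min(u_max, u))        # actuator saturation
```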
{"title":"NeuroVI-based wave compensation system control for offshore wind turbines.","authors":"Fengshuang Ma, Xiangyong Liu, Zhiqiang Xu, Tianhong Ding","doi":"10.3389/fnbot.2025.1648713","DOIUrl":"10.3389/fnbot.2025.1648713","url":null,"abstract":"<p><p>In deep-sea areas, the hoisting operation of offshore wind turbines is seriously affected by waves, and the secondary impact is prone to occur between the turbine and the pile foundation. To address this issue, this study proposes an integrated wave compensation system for offshore wind turbines based on a neuromorphic vision (NeuroVI) camera. The system employs a NeuroVI camera to achieve non-contact, high-precision, and low-latency displacement detection of hydraulic cylinders, overcoming the limitations of traditional magnetostrictive displacement sensors, which exhibit slow response and susceptibility to interference in harsh marine conditions. A dynamic simulation model was developed using AMESim-Simulink co-simulation to analyze the compensation performance of the NeuroVI-based system under step and sinusoidal wave disturbances. Comparative results demonstrate that the NeuroVI feedback system achieves faster response times and superior stability over conventional sensors. Laboratory-scale model tests and real-world application in the installation of a 5.2 MW offshore wind turbine validated the system's feasibility and robustness, enabling real-time collaborative control of turbine and cylinder displacement to effectively mitigate multi-impact risks. This research provides an innovative approach for deploying neural perception technology in complex marine scenarios and advances the development of neuro-robotic systems in ocean engineering.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1648713"},"PeriodicalIF":2.8,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12343490/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144845569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pre-training, personalization, and self-calibration: all a neural network-based myoelectric decoder needs.
Pub Date: 2025-07-28 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1604453
Chenfei Ma, Xinyu Jiang, Kianoush Nazarpour
Myoelectric control systems translate electromyographic (EMG) signals from muscles into movement intentions, allowing control over various interfaces such as prosthetics, wearable devices, and robotics. A major challenge lies in enhancing the system's ability to generalize, personalize, and adapt to the high variability of EMG signals. Artificial intelligence, particularly neural networks, has shown promising decoding performance when applied to large datasets. However, highly parameterized deep neural networks usually require extensive user-specific data with ground-truth labels to learn each individual's unique EMG patterns. Moreover, the characteristics of the EMG signal can change significantly over time, even for the same user, leading to performance degradation during extended use. In this work, we propose an innovative three-stage neural network training scheme designed to progressively develop an adaptive workflow, improving and maintaining network performance on 28 subjects over 2 days. Experiments demonstrate the importance and necessity of each stage in the proposed framework.
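A skeleton of the three-stage scheme follows, under the assumption that personalization is plain fine-tuning on a small labeled user set and that self-calibration uses confident pseudo-labels during use; the paper's actual mechanisms for stages 2 and 3 may differ.

```python
import torch

def three_stage_training(model, pooled_loader, user_loader, stream,
                         epochs=(50, 10), lr=1e-3, conf_thresh=0.9):
    """Sketch: pre-training -> personalization -> self-calibration."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()

    # Stage 1: pre-train on pooled multi-user EMG data for generalization.
    for _ in range(epochs[0]):
        for x, y in pooled_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    # Stage 2: personalize on a small labeled set from the target user.
    for _ in range(epochs[1]):
        for x, y in user_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    # Stage 3: self-calibrate during use with confident pseudo-labels,
    # tracking the slow drift of EMG characteristics over time.
    for x in stream:
        with torch.no_grad():
            probs = torch.softmax(model(x), dim=1)
            conf, pseudo = probs.max(dim=1)
        keep = conf > conf_thresh
        if keep.any():
            opt.zero_grad()
            loss_fn(model(x[keep]), pseudo[keep]).backward()
            opt.step()
    return model
```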
{"title":"Pre-training, personalization, and self-calibration: all a neural network-based myoelectric decoder needs.","authors":"Chenfei Ma, Xinyu Jiang, Kianoush Nazarpour","doi":"10.3389/fnbot.2025.1604453","DOIUrl":"10.3389/fnbot.2025.1604453","url":null,"abstract":"<p><p>Myoelectric control systems translate electromyographic signals (EMG) from muscles into movement intentions, allowing control over various interfaces, such as prosthetics, wearable devices, and robotics. However, a major challenge lies in enhancing the system's ability to generalize, personalize, and adapt to the high variability of EMG signals. Artificial intelligence, particularly neural networks, has shown promising decoding performance when applied to large datasets. However, highly parameterized deep neural networks usually require extensive user-specific data with ground truth labels to learn individual unique EMG patterns. However, the characteristics of the EMG signal can change significantly over time, even for the same user, leading to performance degradation during extended use. In this work, we propose an innovative three-stage neural network training scheme designed to progressively develop an adaptive workflow, improving and maintaining the network performance on 28 subjects over 2 days. Experiments demonstrate the importance and necessity of each stage in the proposed framework.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1604453"},"PeriodicalIF":2.8,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12336220/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144821257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design and analysis of combined discrete-time zeroing neural network for solving time-varying nonlinear equation with robot application.
Pub Date: 2025-07-11 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1576473
Zhisheng Ma, Shaobin Huang
The zeroing neural network (ZNN) is viewed as an effective approach to solving the time-varying nonlinear equation (TVNE). This paper extends that line of work by proposing a novel combined discrete-time ZNN (CDTZNN) model for solving the TVNE. Specifically, a new difference formula, called the Taylor difference formula, is constructed from the Taylor series expansion to approximate first-order derivatives. The Taylor difference formula is then used to discretize the continuous-time ZNN model from the previous study. The resulting DTZNN model requires direct inversion of the Jacobian matrix, which is time-consuming; a second DTZNN model that computes the inverse of the Jacobian matrix is therefore established to remove this limitation. The novel CDTZNN model for solving the TVNE is developed by combining the two models. Theoretical analysis and numerical results demonstrate the efficacy of the proposed CDTZNN model, and its applicability is further demonstrated by applying it to the motion planning of robot manipulators.
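For context, the ZNN design formula and a representative Taylor-type (four-instant) difference formula from the DTZNN literature are shown below; the paper's exact formula may differ, but substituting such a formula into the continuous model yields an update of the stated form.

```latex
% ZNN design formula: drive the error e(t) = f(x(t), t) toward zero,
% which for the TVNE f(x, t) = 0 gives the continuous-time model
\dot{e}(t) = -\gamma\, e(t)
  \;\Longrightarrow\;
  J(x, t)\,\dot{x} = -\gamma f(x, t) - f_t(x, t)

% A Taylor-type (four-instant) difference formula, truncation error O(\tau^2):
\dot{x}_k \approx \frac{2x_{k+1} - 3x_k + 2x_{k-1} - x_{k-2}}{2\tau}

% Substituting and solving for x_{k+1} gives a DTZNN update of the form
x_{k+1} = \tfrac{3}{2}x_k - x_{k-1} + \tfrac{1}{2}x_{k-2}
          - \tau\, J^{-1}(x_k, t_k)\bigl(\gamma f(x_k, t_k) + f_t(x_k, t_k)\bigr)
```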
{"title":"Design and analysis of combined discrete-time zeroing neural network for solving time-varying nonlinear equation with robot application.","authors":"Zhisheng Ma, Shaobin Huang","doi":"10.3389/fnbot.2025.1576473","DOIUrl":"10.3389/fnbot.2025.1576473","url":null,"abstract":"<p><p>Zeroing neural network (ZNN) is viewed as an effective solution to time-varying nonlinear equation (TVNE). In this paper, a further study is shown by proposing a novel combined discrete-time ZNN (CDTZNN) model for solving TVNE. Specifically, a new difference formula, which is called the Taylor difference formula, is constructed for first-order derivative approximation by following Taylor series expansion. The Taylor difference formula is then used to discretize the continuous-time ZNN model in the previous study. The corresponding DTZNN model is obtained, where the direct Jacobian matrix inversion is required (being time consuming). Another DTZNN model for computing the inverse of Jacobian matrix is established to solve the aforementioned limitation. The novel CDTZNN model for solving the TVNE is thus developed by combining the two models. Theoretical analysis and numerical results demonstrate the efficacy of the proposed CDTZNN model. The CDTZNN applicability is further indicated by applying the proposed model to the motion planning of robot manipulators.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1576473"},"PeriodicalIF":2.8,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12289663/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144729707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A robust and effective framework for 3D scene reconstruction and high-quality rendering in nasal endoscopy surgery.
Pub Date: 2025-06-27 | eCollection Date: 2025-01-01 | DOI: 10.3389/fnbot.2025.1630728
Xueqin Ji, Shuting Zhao, Di Liu, Feng Wang, Xinrong Chen
In nasal endoscopic surgery, the narrow nasal cavity restricts the surgical field of view and the manipulation of surgical instruments. Precise real-time intraoperative navigation, which provides accurate 3D information, therefore plays a crucial role in avoiding critical areas dense with blood vessels and nerves. Although significant progress has been made in endoscopic 3D reconstruction methods, their application in nasal scenarios still faces numerous challenges. On the one hand, high-quality, annotated nasal endoscopy datasets are lacking; on the other, issues such as motion blur and soft-tissue deformation complicate the reconstruction process. To tackle these challenges, a series of nasal endoscopy examination videos were collected, with the pose information recorded for each frame. Additionally, a novel model named Mip-EndoGS is proposed, which integrates 3D Gaussian Splatting for reconstruction and rendering with a diffusion module that reduces image blurring in endoscopic data. Meanwhile, by incorporating an adaptive low-pass filter into the rendering pipeline, aliasing artifacts (jagged edges) that occur during rendering are mitigated. Extensive quantitative and visual experiments show that the proposed model can reconstruct 3D scenes within the nasal cavity in real time, offering surgeons more detailed and precise information about the surgical scene. Moreover, the proposed approach holds great potential for integration with AR-based surgical navigation systems to enhance intraoperative guidance.
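For reference, standard 3D Gaussian Splatting compositing and the covariance-dilation form of a screen-space low-pass filter are shown below; the dilation constant s (0.3 in the original 3DGS rasterizer) would presumably be made adaptive in Mip-EndoGS, though the abstract does not give the exact rule.

```latex
% Standard 3DGS alpha compositing along a ray (front-to-back order):
C = \sum_{i \in \mathcal{N}} c_i\, \alpha_i \prod_{j=1}^{i-1} (1 - \alpha_j)

% Low-pass filtering by dilating each projected 2D covariance, so that no
% Gaussian falls below the pixel sampling rate (anti-aliasing):
\hat{\Sigma}_{2D} = \Sigma_{2D} + s\, I, \qquad
\alpha_i = o_i \exp\!\Bigl(-\tfrac{1}{2}(\mathbf{p} - \boldsymbol{\mu}_i)^{\top}
\hat{\Sigma}_{2D}^{-1}(\mathbf{p} - \boldsymbol{\mu}_i)\Bigr)
```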
{"title":"A robust and effective framework for 3D scene reconstruction and high-quality rendering in nasal endoscopy surgery.","authors":"Xueqin Ji, Shuting Zhao, Di Liu, Feng Wang, Xinrong Chen","doi":"10.3389/fnbot.2025.1630728","DOIUrl":"10.3389/fnbot.2025.1630728","url":null,"abstract":"<p><p>In nasal endoscopic surgery, the narrow nasal cavity restricts the surgical field of view and the manipulation of surgical instruments. Therefore, precise real-time intraoperative navigation, which can provide precise 3D information, plays a crucial role in avoiding critical areas with dense blood vessels and nerves. Although significant progress has been made in endoscopic 3D reconstruction methods, their application in nasal scenarios still faces numerous challenges. On the one hand, there is a lack of high-quality, annotated nasal endoscopy datasets. On the other hand, issues such as motion blur and soft tissue deformations complicate the nasal endoscopy reconstruction process. To tackle these challenges, a series of nasal endoscopy examination videos are collected, and the pose information for each frame is recorded. Additionally, a novel model named Mip-EndoGS is proposed, which integrates 3D Gaussian Splatting for reconstruction and rendering and a diffusion module to reduce image blurring in endoscopic data. Meanwhile, by incorporating an adaptive low-pass filter into the rendering pipeline, the aliasing artifacts (jagged edges) are mitigated, which occur during the rendering process. Extensive quantitative and visual experiments show that the proposed model is capable of reconstructing 3D scenes within the nasal cavity in real-time, thereby offering surgeons more detailed and precise information about the surgical scene. Moreover, the proposed approach holds great potential for integration with AR-based surgical navigation systems to enhance intraoperative guidance.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"19 ","pages":"1630728"},"PeriodicalIF":2.6,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12245865/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144626010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}