Pub Date: 2024-10-29 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1503038
Paloma de la Puente, Markus Vincze, Diego Guffanti, Daniel Galan
{"title":"Editorial: Assistive and service robots for health and home applications (RH3 - Robot Helpers in Health and Home).","authors":"Paloma de la Puente, Markus Vincze, Diego Guffanti, Daniel Galan","doi":"10.3389/fnbot.2024.1503038","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1503038","url":null,"abstract":"","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1503038"},"PeriodicalIF":2.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11554614/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142618571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-22 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1488337
Lei Wang, Danping Liu, Jun Wang
Ensuring the representativeness of collected samples is the most critical requirement of water sampling. Unmanned surface vehicles (USVs) have been widely adopted for water sampling, but current USV sampling path planning tends to overemphasize path optimization while neglecting the collection of representative samples. This study proposes a modified A* algorithm that incorporates remote sensing techniques while considering both path length and the representativeness of collected samples. Water quality parameters were first retrieved from satellite remote sensing imagery using a deep belief network model, and the parameter value was incorporated as a coefficient Q in the heuristic function of the A* algorithm. An adjustment coefficient k was then introduced into the coefficient Q to tune the trade-off between sampling representativeness and path length. To evaluate the effectiveness of the algorithm, chlorophyll-a concentration (Chl-a) was employed as the test parameter, with Chaohu Lake as the study area. Results showed that the algorithm was effective in collecting more representative samples under real-world conditions. As the coefficient k increased, the representativeness of the collected samples improved, indicated by the sampled Chl-a closely approximating the overall mean Chl-a and exhibiting a gradient distribution; this improvement came at the cost of increased path length. This study is significant for USV water sampling and water environment protection.
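The abstract does not give the exact cost form, but the idea of folding a water-quality coefficient Q and an adjustment coefficient k into A* can be sketched as follows. The step cost 1 + k·q, where q is a cell's deviation from the regional mean, is an illustrative assumption, not the paper's formula; larger k steers the path through more representative (low-deviation) cells at the price of extra length.

```python
import heapq

def modified_astar(grid_q, start, goal, k=1.0):
    """A* on a grid where each cell carries a water-quality deviation q.

    Step cost is 1 + k * q, so raising k trades path length for sampling
    representativeness (an illustrative stand-in for the paper's Q/k scheme).
    Returns (total cost, path) or None if the goal is unreachable.
    """
    rows, cols = len(grid_q), len(grid_q[0])

    def h(p):
        # Manhattan distance; admissible because every step costs >= 1.
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_set = [(h(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_set:
        f, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return g, path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = node[0] + dr, node[1] + dc
            if 0 <= r < rows and 0 <= c < cols:
                ng = g + 1.0 + k * grid_q[r][c]
                if ng < best_g.get((r, c), float("inf")):
                    best_g[(r, c)] = ng
                    heapq.heappush(
                        open_set, (ng + h((r, c)), ng, (r, c), path + [(r, c)])
                    )
    return None
```

With k = 0 the planner reduces to plain shortest-path A*; with k > 0 it detours around cells whose parameter value deviates strongly from the mean.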
{"title":"A modified A* algorithm combining remote sensing technique to collect representative samples from unmanned surface vehicles.","authors":"Lei Wang, Danping Liu, Jun Wang","doi":"10.3389/fnbot.2024.1488337","DOIUrl":"10.3389/fnbot.2024.1488337","url":null,"abstract":"<p><p>Ensuring representativeness of collected samples is the most critical requirement of water sampling. Unmanned surface vehicles (USVs) have been widely adopted in water sampling, but current USV sampling path planning tend to overemphasize path optimization, neglecting the representative samples collection. This study proposed a modified A* algorithm that combined remote sensing technique while considering both path length and the representativeness of collected samples. Water quality parameters were initially retrieved using satellite remote sensing imagery and a deep belief network model, with the parameter value incorporated as coefficient <i>Q</i> in the heuristic function of A* algorithm. The adjustment coefficient <i>k</i> was then introduced into the coefficient <i>Q</i> to optimize the trade-off between sampling representativeness and path length. To evaluate the effectiveness of this algorithm, Chlorophyll-a concentration (Chl-a) was employed as the test parameter, with Chaohu Lake as the study area. Results showed that the algorithm was effective in collecting more representative samples in real-world conditions. As the coefficient <i>k</i> increased, the representativeness of collected samples enhanced, indicated by the Chl-a closely approximating the overall mean Chl-a and exhibiting a gradient distribution. This enhancement was also associated with increased path length. 
This study is significant in USV water sampling and water environment protection.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1488337"},"PeriodicalIF":2.6,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535655/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142582574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-21 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1443177
Libo Ma, Yan Tong
Currently, the application of robotics technology in sports training and competitions is rapidly increasing. Traditional methods mainly rely on image or video data, neglecting the effective utilization of textual information. To address this issue, we propose TL-CStrans Net, a vision robot for table tennis player action recognition driven by a CS-Transformer. This multimodal approach combines the CS-Transformer, CLIP, and transfer learning to effectively integrate visual and textual information. First, we employ the CS-Transformer model as the neural computing backbone; it processes visual information extracted from table tennis game scenes, enabling accurate stroke recognition. Next, we introduce the CLIP model, which bridges computer vision and natural language processing: CLIP jointly learns representations of images and text, aligning the visual and textual modalities. Finally, to reduce training and computational requirements, we apply the pre-trained CS-Transformer and CLIP models to the table tennis stroke recognition task via transfer learning, reusing knowledge they have already acquired from related domains. Experimental results demonstrate the outstanding performance of TL-CStrans Net in table tennis stroke recognition. Our research is significant for promoting the application of multimodal robotics technology in sports and for bridging neural computing, computer vision, and neuroscience.
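The CLIP-style alignment step described above can be sketched as zero-shot matching: an image embedding is compared against the text embedding of each stroke label by cosine similarity, and the closest label wins. The embeddings and labels below are placeholders; in the paper they would come from the pre-trained CS-Transformer and CLIP encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def classify_stroke(image_embedding, text_embeddings):
    """CLIP-style zero-shot classification: return the stroke label whose
    text embedding is most similar to the image embedding."""
    return max(text_embeddings,
               key=lambda label: cosine(image_embedding, text_embeddings[label]))
```

For example, an image embedding pointing mostly along the "forehand" text direction is classified as a forehand stroke.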
{"title":"TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer.","authors":"Libo Ma, Yan Tong","doi":"10.3389/fnbot.2024.1443177","DOIUrl":"10.3389/fnbot.2024.1443177","url":null,"abstract":"<p><p>Currently, the application of robotics technology in sports training and competitions is rapidly increasing. Traditional methods mainly rely on image or video data, neglecting the effective utilization of textual information. To address this issue, we propose: TL-CStrans Net: A vision robot for table tennis player action recognition driven via CS-Transformer. This is a multimodal approach that combines CS-Transformer, CLIP, and transfer learning techniques to effectively integrate visual and textual information. Firstly, we employ the CS-Transformer model as the neural computing backbone. By utilizing the CS-Transformer, we can effectively process visual information extracted from table tennis game scenes, enabling accurate stroke recognition. Then, we introduce the CLIP model, which combines computer vision and natural language processing. CLIP allows us to jointly learn representations of images and text, thereby aligning the visual and textual modalities. Finally, to reduce training and computational requirements, we leverage pre-trained CS-Transformer and CLIP models through transfer learning, which have already acquired knowledge from relevant domains, and apply them to table tennis stroke recognition tasks. Experimental results demonstrate the outstanding performance of TL-CStrans Net in table tennis stroke recognition. 
Our research is of significant importance in promoting the application of multimodal robotics technology in the field of sports and bridging the gap between neural computing, computer vision, and neuroscience.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1443177"},"PeriodicalIF":2.6,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11532032/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142575211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-21 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1508032
[This corrects the article DOI: 10.3389/fnbot.2024.1452019.].
{"title":"Erratum: Swimtrans Net: a multimodal robotic system for swimming action recognition driven via Swin-Transformer.","authors":"","doi":"10.3389/fnbot.2024.1508032","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1508032","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.3389/fnbot.2024.1452019.].</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1508032"},"PeriodicalIF":2.6,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11551536/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142618573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Panoptic segmentation plays a crucial role in enabling robots to comprehend their surroundings, providing fine-grained scene understanding for robots' intelligent tasks. Although existing methods have made some progress, they are prone to failure in areas with weak textures, small objects, and the like. Inspired by biological vision research, we propose a cascaded contour-enhanced panoptic segmentation network, CCPSNet, which attempts to enhance the discriminability of instances through structural knowledge. To acquire the scene structure, a cascade contour detection stream is designed that extracts comprehensive scene contours using a channel-regulation structural perception module and a coarse-to-fine cascade strategy. Furthermore, a contour-guided multi-scale feature enhancement stream is developed to boost the discrimination of small objects and weak textures; it integrates contour information with multi-scale context features through a structure-aware feature modulation module and an inverse aggregation technique. Experimental results show that our method improves accuracy on the Cityscapes (61.2 PQ) and COCO (43.5 PQ) datasets while remaining robust in simulated real-world scenarios that challenge robots, such as dirty cameras and rainy conditions. The proposed network promises to help robots perceive real scenes. In future work, an unsupervised training strategy could be explored to reduce the training cost.
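The idea of contour-guided feature modulation can be illustrated with a toy gating function: feature values are re-weighted by a sigmoid gate driven by the contour response at the same location, so pixels on strong contours are amplified. The sigmoid form and its gain/bias parameters are assumptions standing in for the learned module, not the paper's architecture.

```python
import math

def contour_modulate(features, contours, gain=2.0, bias=-1.0):
    """Toy structure-aware feature modulation.

    Each feature value is multiplied by sigmoid(gain * contour + bias),
    so locations with a strong contour response keep more of their
    feature energy.  gain and bias stand in for learned parameters.
    """
    def gate(c):
        return 1.0 / (1.0 + math.exp(-(gain * c + bias)))
    return [[f * gate(c) for f, c in zip(frow, crow)]
            for frow, crow in zip(features, contours)]
```

In the full network this gating would be applied per channel and per scale; here a single 2-D map suffices to show the effect.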
{"title":"Cascade contour-enhanced panoptic segmentation for robotic vision perception.","authors":"Yue Xu, Runze Liu, Dongchen Zhu, Lili Chen, Xiaolin Zhang, Jiamao Li","doi":"10.3389/fnbot.2024.1489021","DOIUrl":"10.3389/fnbot.2024.1489021","url":null,"abstract":"<p><p>Panoptic segmentation plays a crucial role in enabling robots to comprehend their surroundings, providing fine-grained scene understanding information for robots' intelligent tasks. Although existing methods have made some progress, they are prone to fail in areas with weak textures, small objects, etc. Inspired by biological vision research, we propose a cascaded contour-enhanced panoptic segmentation network called CCPSNet, attempting to enhance the discriminability of instances through structural knowledge. To acquire the scene structure, a cascade contour detection stream is designed, which extracts comprehensive scene contours using channel regulation structural perception module and coarse-to-fine cascade strategy. Furthermore, the contour-guided multi-scale feature enhancement stream is developed to boost the discrimination ability for small objects and weak textures. The stream integrates contour information and multi-scale context features through structural-aware feature modulation module and inverse aggregation technique. Experimental results show that our method improves accuracy on the Cityscapes (61.2 PQ) and COCO (43.5 PQ) datasets while also demonstrating robustness in challenging simulated real-world complex scenarios faced by robots, such as dirty cameras and rainy conditions. The proposed network promises to help the robot perceive the real scene. 
In future work, an unsupervised training strategy for the network could be explored to reduce the training cost.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1489021"},"PeriodicalIF":2.6,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11532083/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-18 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1477232
Zhiquan Chen, Jiabao Guo, Yishan Liu, Mengqian Tian, Xingsong Wang
In this work, the mechanical principles of external fixation and resistance training for a wrist affected by a distal radius fracture (DRF) are revealed. Based on the biomechanical analysis, two wearable exoskeleton devices are proposed to facilitate DRF rehabilitation. Chronologically, the adjustable fixation device (AFD) provides fixation and limited mobilization of the fractured wrist in the early stage, while functional recovery of the relevant muscles is achieved by the resistance training device (RTD) in the later stage. Based on the designed mechatronic systems of the AFD and RTD, experimental prototypes of the two apparatuses were built. Experiments investigated the actual motion ranges of the AFD and validated the feasibility of monitoring joint angles. Meanwhile, the resistance effects of the RTD were analyzed using surface electromyography (sEMG) signal features; the results demonstrate that training-induced muscle strength enhancement generally increases with external resistance. The exoskeleton devices presented in this work would be beneficial for the active rehabilitation of patients with DRF.
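Two standard sEMG amplitude features commonly used for this kind of muscle-effort analysis are the root-mean-square (RMS) and mean absolute value (MAV) of a signal window. The paper's exact feature set is not specified here; this is a minimal sketch of the common choices.

```python
import math

def semg_features(window):
    """Return (RMS, MAV) amplitude features of one sEMG signal window.

    RMS = sqrt(mean(x^2)); MAV = mean(|x|).  Both grow with contraction
    intensity, which is how rising muscle effort under increasing external
    resistance would show up in the signal.
    """
    n = len(window)
    rms = math.sqrt(sum(x * x for x in window) / n)
    mav = sum(abs(x) for x in window) / n
    return rms, mav
```

In practice these features are computed over short sliding windows (e.g. 100-250 ms) of the band-passed sEMG signal and compared across resistance levels.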
{"title":"Design and analysis of exoskeleton devices for rehabilitation of distal radius fracture.","authors":"Zhiquan Chen, Jiabao Guo, Yishan Liu, Mengqian Tian, Xingsong Wang","doi":"10.3389/fnbot.2024.1477232","DOIUrl":"10.3389/fnbot.2024.1477232","url":null,"abstract":"<p><p>In this work, the mechanical principles of external fixation and resistance training for the wrist affected by a distal radius fracture (DRF) are revealed. Based on the biomechanical analysis, two wearable exoskeleton devices are proposed to facilitate the DRF rehabilitation progress. Chronologically, the adjustable fixation device (AFD) provides fixed protection and limited mobilization of the fractured wrist in the early stage, while the functional recovery of relevant muscles is achieved by the resistance training device (RTD) in the later stage. According to the designed mechatronic systems of AFD and RTD, the experimental prototypes for these two apparatuses are established. By experiments, the actual motion ranges of AFD are investigated, and the feasibility in monitoring joint angles are validated. Meanwhile, the resistant influences of RTD are analyzed based on the surface electromyography (sEMG) signal features, the results demonstrate that the training-induced muscle strength enhancement is generally increased with the increment in external resistance. 
The exoskeleton devices presented in this work would be beneficial for the active rehabilitation of patients with DRF.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1477232"},"PeriodicalIF":2.6,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11527727/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142570965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-14 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1484088
Zixin Huang, Xuesong Tao, Xinyuan Liu
Object detection plays a crucial role in robotic vision, focusing on accurately identifying and localizing objects within images. However, many existing methods encounter limitations, particularly when it comes to effectively implementing a one-to-many matching strategy. To address these challenges, we propose NAN-DETR (Noising Multi-Anchor Detection Transformer), an innovative framework based on DETR (Detection Transformer). NAN-DETR introduces three key improvements to transformer-based object detection: a decoder-based multi-anchor strategy, a centralization noising mechanism, and the integration of Complete Intersection over Union (CIoU) loss. The multi-anchor strategy leverages multiple anchors per object, significantly enhancing detection accuracy by improving the one-to-many matching process. The centralization noising mechanism mitigates conflicts among anchors by injecting controlled noise into the detection boxes, thereby increasing the robustness of the model. Additionally, CIoU loss, which incorporates both aspect ratio and spatial distance in its calculations, results in more precise bounding box predictions compared to the conventional IoU loss. Although NAN-DETR may not drastically improve real-time processing capabilities, its exceptional performance positions it as a highly reliable solution for diverse object detection scenarios.
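The CIoU loss mentioned above has a standard closed form: 1 − IoU, plus a center-distance term normalized by the enclosing box's diagonal, plus an aspect-ratio consistency term. A minimal implementation for axis-aligned (x1, y1, x2, y2) boxes:

```python
import math

def ciou_loss(b1, b2, eps=1e-9):
    """Complete IoU loss between two boxes in (x1, y1, x2, y2) form."""
    # Intersection and union areas.
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    iou = inter / (a1 + a2 - inter + eps)
    # Squared distance between box centers.
    rho2 = (((b1[0] + b1[2]) - (b2[0] + b2[2])) ** 2
            + ((b1[1] + b1[3]) - (b2[1] + b2[3])) ** 2) / 4.0
    # Squared diagonal of the smallest enclosing box.
    cw = max(b1[2], b2[2]) - min(b1[0], b2[0])
    ch = max(b1[3], b2[3]) - min(b1[1], b2[1])
    c2 = cw * cw + ch * ch + eps
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4.0 / math.pi ** 2) * (math.atan((b1[2] - b1[0]) / (b1[3] - b1[1]))
                                - math.atan((b2[2] - b2[0]) / (b2[3] - b2[1]))) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - iou + rho2 / c2 + alpha * v
```

Unlike plain IoU loss, the distance and aspect-ratio terms keep the gradient informative even for non-overlapping or badly shaped predictions.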
{"title":"NAN-DETR: noising multi-anchor makes DETR better for object detection.","authors":"Zixin Huang, Xuesong Tao, Xinyuan Liu","doi":"10.3389/fnbot.2024.1484088","DOIUrl":"10.3389/fnbot.2024.1484088","url":null,"abstract":"<p><p>Object detection plays a crucial role in robotic vision, focusing on accurately identifying and localizing objects within images. However, many existing methods encounter limitations, particularly when it comes to effectively implementing a one-to-many matching strategy. To address these challenges, we propose NAN-DETR (Noising Multi-Anchor Detection Transformer), an innovative framework based on DETR (Detection Transformer). NAN-DETR introduces three key improvements to transformer-based object detection: a decoder-based multi-anchor strategy, a centralization noising mechanism, and the integration of Complete Intersection over Union (CIoU) loss. The multi-anchor strategy leverages multiple anchors per object, significantly enhancing detection accuracy by improving the one-to-many matching process. The centralization noising mechanism mitigates conflicts among anchors by injecting controlled noise into the detection boxes, thereby increasing the robustness of the model. Additionally, CIoU loss, which incorporates both aspect ratio and spatial distance in its calculations, results in more precise bounding box predictions compared to the conventional IoU loss. 
Although NAN-DETR may not drastically improve real-time processing capabilities, its exceptional performance positions it as a highly reliable solution for diverse object detection scenarios.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1484088"},"PeriodicalIF":2.6,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11513373/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142521681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-14 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1466571
Xu Liao, Le Li, Chuangxia Huang, Xian Zhao, Shumin Tan
Improving the success rate of autonomous underwater vehicle (AUV) path planning while reducing travel time as much as possible is a challenging and crucial problem for practical AUV applications in complex ocean-current environments. Traditional reinforcement learning algorithms explore the environment insufficiently, and the strategies the agent learns may not generalize well to other environments. To address these challenges, we propose a novel AUV path planning algorithm, the Noisy Dueling Double Deep Q-Network (ND3QN), which generalizes the traditional D3QN algorithm by modifying the reward function and introducing a noisy network. Compared with classical algorithms (e.g., Rapidly-exploring Random Trees Star (RRT*), DQN, and D3QN) in simulation experiments conducted on realistic terrain and ocean currents, the proposed ND3QN algorithm demonstrates a higher success rate of AUV path planning, shorter travel time, and smoother paths.
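Two of ND3QN's ingredients have compact standard forms: the dueling aggregation Q(s, a) = V(s) + A(s, a) − mean(A), and the Double DQN target, in which the online network selects the next action while the target network evaluates it. A sketch of both (the noisy-network exploration layer and the paper's modified reward are omitted):

```python
def dueling_q(value, advantages):
    """Dueling head aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN bootstrap target.

    The online network's argmax chooses the next action; the target
    network's value for that action is used, reducing the over-estimation
    bias of vanilla DQN.
    """
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]
```

In ND3QN a NoisyNet layer would replace epsilon-greedy exploration by adding learned parametric noise to the network weights.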
{"title":"Noisy Dueling Double Deep Q-Network algorithm for autonomous underwater vehicle path planning.","authors":"Xu Liao, Le Li, Chuangxia Huang, Xian Zhao, Shumin Tan","doi":"10.3389/fnbot.2024.1466571","DOIUrl":"10.3389/fnbot.2024.1466571","url":null,"abstract":"<p><p>How to improve the success rate of autonomous underwater vehicle (AUV) path planning and reduce travel time as much as possible is a very challenging and crucial problem in the practical applications of AUV in the complex ocean current environment. Traditional reinforcement learning algorithms lack exploration of the environment, and the strategies learned by the agent may not generalize well to other different environments. To address these challenges, we propose a novel AUV path planning algorithm named the Noisy Dueling Double Deep Q-Network (ND3QN) algorithm by modifying the reward function and introducing a noisy network, which generalizes the traditional D3QN algorithm. Compared with the classical algorithm [e.g., Rapidly-exploring Random Trees Star (RRT*), DQN, and D3QN], with simulation experiments conducted in realistic terrain and ocean currents, the proposed ND3QN algorithm demonstrates the outstanding characteristics of a higher success rate of AUV path planning, shorter travel time, and smoother paths.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1466571"},"PeriodicalIF":2.6,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11513341/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142521682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-11 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1453571
Hong LinLin, Lee Sangheang, Song GuanTing
Introduction: Assistive robots and human-robot interaction have become integral parts of sports training. However, existing methods often fail to provide real-time and accurate feedback, and they often lack integration of comprehensive multi-modal data.
Methods: To address these issues, we propose CAM-Vtrans, a Cross-Attention Multi-modal Visual Transformer. By leveraging state-of-the-art techniques such as Vision Transformers (ViT) and models like CLIP, together with cross-attention mechanisms, CAM-Vtrans harnesses visual and textual information to provide athletes with accurate and timely feedback. Through the use of multi-modal robot data, CAM-Vtrans offers valuable assistance, enabling athletes to optimize their performance while minimizing potential injury risks. This approach represents a significant advancement in the field, offering a solution that overcomes the limitations of existing methods and enhances the precision and efficiency of sports training programs.
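The cross-attention mechanism at the core of an architecture like this can be sketched as single-head scaled dot-product attention, with queries from one modality attending over keys/values from the other; the learned projection matrices and multi-head split are omitted for brevity.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    queries come from one modality (e.g. text tokens) and attend over
    keys/values from the other (e.g. visual patches); output is one
    attended vector per query.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out
```

A query aligned with one key pulls the output toward that key's value, which is how textual feedback cues could be grounded in specific visual regions.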
{"title":"CAM-Vtrans: real-time sports training utilizing multi-modal robot data.","authors":"Hong LinLin, Lee Sangheang, Song GuanTing","doi":"10.3389/fnbot.2024.1453571","DOIUrl":"10.3389/fnbot.2024.1453571","url":null,"abstract":"<p><strong>Introduction: </strong>Assistive robots and human-robot interaction have become integral parts of sports training. However, existing methods often fail to provide real-time and accurate feedback, and they often lack integration of comprehensive multi-modal data.</p><p><strong>Methods: </strong>To address these issues, we propose a groundbreaking and innovative approach: CAM-Vtrans-Cross-Attention Multi-modal Visual Transformer. By leveraging the strengths of state-of-the-art techniques such as Visual Transformers (ViT) and models like CLIP, along with cross-attention mechanisms, CAM-Vtrans harnesses the power of visual and textual information to provide athletes with highly accurate and timely feedback. Through the utilization of multi-modal robot data, CAM-Vtrans offers valuable assistance, enabling athletes to optimize their performance while minimizing potential injury risks. 
This novel approach represents a significant advancement in the field, offering an innovative solution to overcome the limitations of existing methods and enhance the precision and efficiency of sports training programs.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1453571"},"PeriodicalIF":2.6,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11502466/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142516399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-11 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1443432
Qi Lu
Introduction: Accurately recognizing and understanding human motion actions presents a key challenge in the development of intelligent sports robots. Traditional methods often encounter significant drawbacks, such as high computational resource requirements and suboptimal real-time performance. To address these limitations, this study proposes a novel approach called Sports-ACtrans Net.
Methods: In this approach, the Swin Transformer processes visual data to extract spatial features, while the Spatio-Temporal Graph Convolutional Network (ST-GCN) models human motion as graphs to handle skeleton data. By combining these outputs, a comprehensive representation of motion actions is created. Reinforcement learning is employed to optimize the action recognition process, framing it as a sequential decision-making problem. Deep Q-learning is utilized to learn the optimal policy, thereby enhancing the robot's ability to accurately recognize and engage in motion.
Results and discussion: Experiments demonstrate significant improvements over state-of-the-art methods. This research advances the fields of neural computation, computer vision, and neuroscience, aiding in the development of intelligent robotic systems capable of understanding and participating in sports activities.
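The spatial step of an ST-GCN layer operates on a symmetrically normalized skeleton adjacency matrix; a minimal sketch of that normalization follows (the temporal convolution, graph partitioning, and learned weights are omitted).

```python
import math

def normalized_adjacency(adj):
    """Symmetrically normalized skeleton adjacency for a spatial graph conv:
    A_hat = D^{-1/2} (A + I) D^{-1/2}, with self-loops added so each joint
    also aggregates its own features."""
    n = len(adj)
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    d_inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    return [[d_inv_sqrt[i] * a[i][j] * d_inv_sqrt[j] for j in range(n)]
            for i in range(n)]
```

One spatial graph-convolution step is then X' = A_hat · X · W, applied per frame before the temporal convolution mixes information across time.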
{"title":"Sports-ACtrans Net: research on multimodal robotic sports action recognition driven via ST-GCN.","authors":"Qi Lu","doi":"10.3389/fnbot.2024.1443432","DOIUrl":"10.3389/fnbot.2024.1443432","url":null,"abstract":"<p><strong>Introduction: </strong>Accurately recognizing and understanding human motion actions presents a key challenge in the development of intelligent sports robots. Traditional methods often encounter significant drawbacks, such as high computational resource requirements and suboptimal real-time performance. To address these limitations, this study proposes a novel approach called Sports-ACtrans Net.</p><p><strong>Methods: </strong>In this approach, the Swin Transformer processes visual data to extract spatial features, while the Spatio-Temporal Graph Convolutional Network (ST-GCN) models human motion as graphs to handle skeleton data. By combining these outputs, a comprehensive representation of motion actions is created. Reinforcement learning is employed to optimize the action recognition process, framing it as a sequential decision-making problem. Deep Q-learning is utilized to learn the optimal policy, thereby enhancing the robot's ability to accurately recognize and engage in motion.</p><p><strong>Results and discussion: </strong>Experiments demonstrate significant improvements over state-of-the-art methods. 
This research advances the fields of neural computation, computer vision, and neuroscience, aiding in the development of intelligent robotic systems capable of understanding and participating in sports activities.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 ","pages":"1443432"},"PeriodicalIF":2.6,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11502397/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142498770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}