YOLOv8-ACU: improved YOLOv8-pose for facial acupoint detection
Pub Date: 2024-01-17 | DOI: 10.3389/fnbot.2024.1355857
Zijian Yuan, Pengwei Shao, Jinran Li, Yinuo Wang, Zixuan Zhu, Weijie Qiu, Buqun Chen, Yan Tang, Aiqing Han
Introduction
Acupoint localization is integral to Traditional Chinese Medicine (TCM) acupuncture diagnosis and treatment. Employing intelligent detection models for recognizing facial acupoints can substantially enhance localization accuracy.
Methods
This study introduces YOLOv8-ACU, an advancement of the YOLOv8-pose keypoint detection algorithm tailored for facial acupoints. The model enhances acupoint feature extraction by integrating ECA attention, replaces the original neck module with a lighter Slim-neck module, and improves the loss function by adopting GIoU.
Results
The YOLOv8-ACU model achieves an mAP@0.5 of 97.5% and an mAP@0.5–0.95 of 76.9% on our self-constructed datasets, while reducing model parameters by 0.44M, model size by 0.82 MB, and GFLOPs by 9.3%.
Discussion
With its enhanced recognition accuracy and efficiency, along with good generalization ability, YOLOv8-ACU provides significant reference value for facial acupoint localization and detection. This is particularly beneficial for Chinese medicine practitioners engaged in facial acupoint research and intelligent detection.
{"title":"YOLOv8-ACU: improved YOLOv8-pose for facial acupoint detection","authors":"Zijian Yuan, Pengwei Shao, Jinran Li, Yinuo Wang, Zixuan Zhu, Weijie Qiu, Buqun Chen, Yan Tang, Aiqing Han","doi":"10.3389/fnbot.2024.1355857","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1355857","url":null,"abstract":"<sec><title>Introduction</title><p>Acupoint localization is integral to Traditional Chinese Medicine (TCM) acupuncture diagnosis and treatment. Employing intelligent detection models for recognizing facial acupoints can substantially enhance localization accuracy.</p></sec><sec><title>Methods</title><p>This study introduces an advancement in the YOLOv8-pose keypoint detection algorithm, tailored for facial acupoints, and named YOLOv8-ACU. This model enhances acupoint feature extraction by integrating ECA attention, replaces the original neck module with a lighter Slim-neck module, and improves the loss function for GIoU.</p></sec><sec><title>Results</title><p>The YOLOv8-ACU model achieves impressive accuracy, with an mAP@0.5 of 97.5% and an mAP@0.5–0.95 of 76.9% on our self-constructed datasets. It also marks a reduction in model parameters by 0.44M, model size by 0.82 MB, and GFLOPs by 9.3%.</p></sec><sec><title>Discussion</title><p>With its enhanced recognition accuracy and efficiency, along with good generalization ability, YOLOv8-ACU provides significant reference value for facial acupoint localization and detection. This is particularly beneficial for Chinese medicine practitioners engaged in facial acupoint research and intelligent detection.</p></sec>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"26 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139658451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bidirectional feature pyramid attention-based temporal convolutional network model for motor imagery electroencephalogram classification
Pub Date: 2024-01-15 | DOI: 10.3389/fnbot.2024.1343249
Xinghe Xie, Liyan Chen, Shujia Qin, Fusheng Zha, Xinggang Fan
Introduction
As an interactive method gaining popularity, brain-computer interfaces (BCIs) aim to facilitate communication between the brain and external devices. Among the various research topics in BCIs, the classification of motor imagery using electroencephalography (EEG) signals has the potential to greatly improve the quality of life for people with disabilities.
Methods
This technology can assist them in controlling computers or other devices such as prosthetic limbs, wheelchairs, and drones. However, current EEG decoding performance is not sufficient for real-world applications based on motor imagery EEG (MI-EEG). To address this issue, this study proposes an attention-based bidirectional feature pyramid temporal convolutional network model for MI-EEG classification. The model incorporates a multi-head self-attention mechanism to weigh significant features in the MI-EEG signals and a temporal convolutional network (TCN) to extract high-level temporal features. The signals are augmented using a sliding-window technique, and channel and time-domain information of the MI-EEG signals is extracted through convolution.
Results
Additionally, a bidirectional feature pyramid structure is employed to apply attention across different scales and multiple frequency bands of the MI-EEG signals. The model is evaluated on the BCI Competition IV-2a and IV-2b datasets, where it outperforms the state-of-the-art baseline model with subject-dependent accuracies of 87.5% and 86.3%, respectively.
Discussion
In conclusion, the BFATCNet model offers a novel approach for EEG-based motor imagery classification in BCIs, effectively capturing relevant features through attention mechanisms and temporal convolutional networks. Its superior performance on the BCI Competition IV-2a and IV-2b datasets highlights its potential for real-world applications. However, its performance on other datasets may vary, necessitating further research on data augmentation techniques and integration with multiple modalities to enhance interpretability and generalization. Additionally, reducing computational complexity for real-time applications is an important area for future work.
{"title":"Bidirectional feature pyramid attention-based temporal convolutional network model for motor imagery electroencephalogram classification","authors":"Xinghe Xie, Liyan Chen, Shujia Qin, Fusheng Zha, Xinggang Fan","doi":"10.3389/fnbot.2024.1343249","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1343249","url":null,"abstract":"<sec><title>Introduction</title><p>As an interactive method gaining popularity, brain-computer interfaces (BCIs) aim to facilitate communication between the brain and external devices. Among the various research topics in BCIs, the classification of motor imagery using electroencephalography (EEG) signals has the potential to greatly improve the quality of life for people with disabilities.</p></sec><sec><title>Methods</title><p>This technology assists them in controlling computers or other devices like prosthetic limbs, wheelchairs, and drones. However, the current performance of EEG signal decoding is not sufficient for real-world applications based on Motor Imagery EEG (MI-EEG). To address this issue, this study proposes an attention-based bidirectional feature pyramid temporal convolutional network model for the classification task of MI-EEG. The model incorporates a multi-head self-attention mechanism to weigh significant features in the MI-EEG signals. It also utilizes a temporal convolution network (TCN) to separate high-level temporal features. The signals are enhanced using the sliding-window technique, and channel and time-domain information of the MI-EEG signals is extracted through convolution.</p></sec><sec><title>Results</title><p>Additionally, a bidirectional feature pyramid structure is employed to implement attention mechanisms across different scales and multiple frequency bands of the MI-EEG signals. The performance of our model is evaluated on the BCI Competition IV-2a dataset and the BCI Competition IV-2b dataset, and the results showed that our model outperformed the state-of-the-art baseline model, with an accuracy of 87.5 and 86.3% for the subject-dependent, respectively.</p></sec><sec><title>Discussion</title><p>In conclusion, the BFATCNet model offers a novel approach for EEG-based motor imagery classification in BCIs, effectively capturing relevant features through attention mechanisms and temporal convolutional networks. Its superior performance on the BCI Competition IV-2a and IV-2b datasets highlights its potential for real-world applications. However, its performance on other datasets may vary, necessitating further research on data augmentation techniques and integration with multiple modalities to enhance interpretability and generalization. Additionally, reducing computational complexity for real-time applications is an important area for future work.</p></sec>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"123 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139581115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing hazardous material vehicle detection with advanced feature enhancement modules using HMV-YOLO
Pub Date: 2024-01-15 | DOI: 10.3389/fnbot.2024.1351939
Ling Wang, Bushi Liu, Wei Shao, Zhe Li, Kailu Chang, Wenjie Zhu
The transportation of hazardous chemicals on roadways has raised significant safety concerns. Incidents involving these substances often lead to severe and devastating consequences. Consequently, there is a pressing need for real-time detection systems tailored for hazardous material vehicles. However, existing detection methods face challenges in accurately identifying smaller targets and achieving high precision. This paper introduces a novel solution, HMV-YOLO, an enhancement of the YOLOv7-tiny model designed to address these challenges. Within this model, two innovative modules, CBSG and G-ELAN, are introduced. The CBSG module's mathematical model incorporates components such as Convolution (Conv2d), Batch Normalization (BN), SiLU activation, and Global Response Normalization (GRN) to mitigate feature collapse issues and enhance neuron activity. The G-ELAN module, building upon CBSG, further advances feature fusion. Experimental results showcase the superior performance of the enhanced model compared to the original one across various evaluation metrics. This advancement shows great promise for practical applications, particularly in the context of real-time monitoring systems for hazardous material vehicles.
{"title":"Enhancing hazardous material vehicle detection with advanced feature enhancement modules using HMV-YOLO","authors":"Ling Wang, Bushi Liu, Wei Shao, Zhe Li, Kailu Chang, Wenjie Zhu","doi":"10.3389/fnbot.2024.1351939","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1351939","url":null,"abstract":"<p>The transportation of hazardous chemicals on roadways has raised significant safety concerns. Incidents involving these substances often lead to severe and devastating consequences. Consequently, there is a pressing need for real-time detection systems tailored for hazardous material vehicles. However, existing detection methods face challenges in accurately identifying smaller targets and achieving high precision. This paper introduces a novel solution, HMV-YOLO, an enhancement of the YOLOv7-tiny model designed to address these challenges. Within this model, two innovative modules, CBSG and G-ELAN, are introduced. The CBSG module's mathematical model incorporates components such as Convolution (Conv2d), Batch Normalization (BN), SiLU activation, and Global Response Normalization (GRN) to mitigate feature collapse issues and enhance neuron activity. The G-ELAN module, building upon CBSG, further advances feature fusion. Experimental results showcase the superior performance of the enhanced model compared to the original one across various evaluation metrics. This advancement shows great promise for practical applications, particularly in the context of real-time monitoring systems for hazardous material vehicles.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"14 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139581112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Re-framing bio-plausible collision detection: identifying shared meta-properties through strategic prototyping
Pub Date: 2024-01-12 | DOI: 10.3389/fnbot.2024.1349498
Haotian Wu, Shigang Yue, Cheng Hu
Insects exhibit remarkable abilities in navigating complex natural environments, whether evading predators, capturing prey, or seeking out conspecifics, all of which rely on their compact yet reliable neural systems. We explore bio-inspired robotic vision systems, focusing on locust-inspired Lobula Giant Movement Detector (LGMD) models. Existing LGMD models are thoroughly evaluated to identify the common meta-properties essential to their functionality. This analysis reveals a shared framework, characterized by layered structures and computational strategies, that is crucial for extending bio-inspired models to diverse applications. Its result is the Strategic Prototype, which embodies the identified meta-properties and represents a modular, more flexible method for developing responsive and adaptable robotic visual systems. This perspective highlights the potential of the Strategic Prototype, the LGMD-Universally Prototype (LGMD-UP), as a key to re-framing LGMD models and advancing our understanding and implementation of bio-inspired visual systems in robotics, opening more flexible and adaptable avenues for research and practical applications.
{"title":"Re-framing bio-plausible collision detection: identifying shared meta-properties through strategic prototyping","authors":"Haotian Wu, Shigang Yue, Cheng Hu","doi":"10.3389/fnbot.2024.1349498","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1349498","url":null,"abstract":"<p>Insects exhibit remarkable abilities in navigating complex natural environments, whether it be evading predators, capturing prey, or seeking out con-specifics, all of which rely on their compact yet reliable neural systems. We explore the field of bio-inspired robotic vision systems, focusing on the locust inspired Lobula Giant Movement Detector (LGMD) models. The existing LGMD models are thoroughly evaluated, identifying their common meta-properties that are essential for their functionality. This article reveals a common framework, characterized by layered structures and computational strategies, which is crucial for enhancing the capability of bio-inspired models for diverse applications. The result of this analysis is the Strategic Prototype, which embodies the identified meta-properties. It represents a modular and more flexible method for developing more responsive and adaptable robotic visual systems. The perspective highlights the potential of the Strategic Prototype: LGMD-Universally Prototype (LGMD-UP), the key to re-framing LGMD models and advancing our understanding and implementation of bio-inspired visual systems in robotics. It might open up more flexible and adaptable avenues for research and practical applications.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"330 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139559140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Editorial: Neurorobotics and strategies for adaptive human-machine interaction, volume II
Pub Date: 2024-01-09 | DOI: 10.3389/fnbot.2023.1354389
F. Cordella, S. Soekadar, L. Zollo
{"title":"Editorial: Neurorobotics and strategies for adaptive human-machine interaction, volume II","authors":"F. Cordella, S. Soekadar, L. Zollo","doi":"10.3389/fnbot.2023.1354389","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1354389","url":null,"abstract":"","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"49 3","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139444898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Editorial: Safety and security of robotic systems: intelligent algorithms
Pub Date: 2024-01-08 | DOI: 10.3389/fnbot.2023.1342742
Chengwei Wu
{"title":"Editorial: Safety and security of robotic systems: intelligent algorithms","authors":"Chengwei Wu","doi":"10.3389/fnbot.2023.1342742","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1342742","url":null,"abstract":"","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"26 13","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139445254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Context-aware SAR image ship detection and recognition network
Pub Date: 2024-01-03 | DOI: 10.3389/fnbot.2024.1293992
Chao Li, Chenke Yue, Hanfu Li, Zhile Wang
With the development of deep learning, synthetic aperture radar (SAR) ship detection and recognition based on deep learning have gained widespread application and advancement. However, two challenging issues remain. First, the SAR imaging mechanism introduces significant noise, making it difficult to separate background noise from ship target features in complex backgrounds such as ports and urban areas. Second, the heterogeneous scales of ship targets leave smaller targets susceptible to information loss, rendering them elusive to detection. In this article, we propose a context-aware one-stage ship detection network that is highly sensitive to scale variations and robust to noise interference. We introduce a local feature refinement module (LFRM), which uses multiple receptive fields of different sizes to extract local multi-scale information, followed by a two-branch channel-wise attention approach to capture local cross-channel interactions. To minimize the effect of complex backgrounds on the target, we design a global context aggregation module (GCAM) that enhances the target's feature representation and suppresses noise interference by acquiring long-range dependencies. Finally, we validate the effectiveness of our method on three publicly available SAR ship detection datasets: the SAR-Ship-Dataset, the high-resolution SAR images dataset (HRSID), and the SAR ship detection dataset (SSDD). Experimental results show that our method is more competitive, with AP50 scores of 96.3%, 93.3%, and 96.2% on the three datasets, respectively.
{"title":"Context-aware SAR image ship detection and recognition network","authors":"Chao Li, Chenke Yue, Hanfu Li, Zhile Wang","doi":"10.3389/fnbot.2024.1293992","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1293992","url":null,"abstract":"<p>With the development of deep learning, synthetic aperture radar (SAR) ship detection and recognition based on deep learning have gained widespread application and advancement. However, there are still challenging issues, manifesting in two primary facets: firstly, the imaging mechanism of SAR results in significant noise interference, making it difficult to separate background noise from ship target features in complex backgrounds such as ports and urban areas; secondly, the heterogeneous scales of ship target features result in the susceptibility of smaller targets to information loss, rendering them elusive to detection. In this article, we propose a context-aware one-stage ship detection network that exhibits heightened sensitivity to scale variations and robust resistance to noise interference. Then we introduce a Local feature refinement module (LFRM), which utilizes multiple receptive fields of different sizes to extract local multi-scale information, followed by a two-branch channel-wise attention approach to obtain local cross-channel interactions. To minimize the effect of a complex background on the target, we design the global context aggregation module (GCAM) to enhance the feature representation of the target and suppress the interference of noise by acquiring long-range dependencies. Finally, we validate the effectiveness of our method on three publicly available SAR ship detection datasets, SAR-Ship-Dataset, high-resolution SAR images dataset (HRSID), and SAR ship detection dataset (SSDD). The experimental results show that our method is more competitive, with AP50s of 96.3, 93.3, and 96.2% on the three publicly available datasets, respectively.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"83 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139476612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ID-YOLOv7: an efficient method for insulator defect detection in power distribution network
Pub Date: 2023-12-31 | DOI: 10.3389/fnbot.2023.1331427
Bojian Chen, Weihao Zhang, Wenbin Wu, Yiran Li, Zhuolei Chen, Chenglong Li
Insulators play a pivotal role in the reliability of power distribution networks, necessitating precise defect detection. However, compared with aerial insulator images of transmission networks, insulator images of power distribution networks contain more complex backgrounds and subtler defects, which leads to high false-detection and miss rates in current mainstream detection algorithms. In response, this study presents ID-YOLOv7, a tailored convolutional neural network. First, we design a novel Edge Detailed Shape Data Augmentation (EDSDA) method to enhance the model's sensitivity to insulators' edge shapes. We also propose a Cross-Channel and Spatial Multi-Scale Attention (CCSMA) module, which models interactions across channels and spatial domains to strengthen the network's attention to high-level insulator defect features. Second, we design a Re-BiC module to fuse multi-scale contextual features and reconstruct the Neck component, alleviating the loss of critical features during inter-layer interaction in traditional FPN structures. Finally, we utilize the MPDIoU function to compute the model's localization loss, effectively reducing redundant computational costs. We perform comprehensive experiments on the Su22kV_broken and PASCAL VOC 2007 datasets to validate the algorithm's effectiveness. On Su22kV_broken, our approach attains 85.7% mAP on a single NVIDIA RTX 2080 Ti graphics card, a 7.2% increase over the original YOLOv7. On PASCAL VOC 2007, we achieve 90.3% mAP at a processing speed of 53 FPS, a 2.9% improvement over the original YOLOv7.
{"title":"ID-YOLOv7: an efficient method for insulator defect detection in power distribution network","authors":"Bojian Chen, Weihao Zhang, Wenbin Wu, Yiran Li, Zhuolei Chen, Chenglong Li","doi":"10.3389/fnbot.2023.1331427","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1331427","url":null,"abstract":"<p>Insulators play a pivotal role in the reliability of power distribution networks, necessitating precise defect detection. However, compared with aerial insulator images of transmission network, insulator images of power distribution network contain more complex backgrounds and subtle insulator defects, it leads to high false detection rates and omission rates in current mainstream detection algorithms. In response, this study presents ID-YOLOv7, a tailored convolutional neural network. First, we design a novel Edge Detailed Shape Data Augmentation (EDSDA) method to enhance the model's sensitivity to insulator's edge shapes. Meanwhile, a Cross-Channel and Spatial Multi-Scale Attention (CCSMA) module is proposed, which can interactively model across different channels and spatial domains, to augment the network's attention to high-level insulator defect features. Second, we design a Re-BiC module to fuse multi-scale contextual features and reconstruct the Neck component, alleviating the issue of critical feature loss during inter-feature layer interaction in traditional FPN structures. Finally, we utilize the MPDIoU function to calculate the model's localization loss, effectively reducing redundant computational costs. We perform comprehensive experiments using the Su22kV_broken and PASCAL VOC 2007 datasets to validate our algorithm's effectiveness. On the Su22kV_broken dataset, our approach attains an 85.7% mAP on a single NVIDIA RTX 2080ti graphics card, marking a 7.2% increase over the original YOLOv7. On the PASCAL VOC 2007 dataset, we achieve an impressive 90.3% mAP at a processing speed of 53 FPS, showing a 2.9% improvement compared to the original YOLOv7.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"8 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2023-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139469309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Loop closure detection of visual SLAM based on variational autoencoder
Pub Date: 2023-12-26 | DOI: 10.3389/fnbot.2023.1301785
Shibin Song, Fengjie Yu, Xiaojie Jiang, Jie Zhu, Weihao Cheng, Xiao Fang
Loop closure detection is an important module in simultaneous localization and mapping (SLAM): correct detection of loops reduces cumulative drift in positioning. Because traditional detection methods rely on handcrafted features, false positives can occur when the environment changes, resulting in incorrect estimates and inaccurate maps. In this paper, a loop closure detection method based on a variational autoencoder (VAE) is proposed. The VAE serves as a feature extractor, using neural networks to extract image features in place of the handcrafted features used in traditional methods, and it represents each image as a low-dimensional vector. An attention mechanism is added to the network, and constraints are added to the loss function to obtain better image representations. In the back-end feature-matching process, geometric checking is used to filter out wrong matches, addressing the false-positive problem. Numerical experiments demonstrate that the proposed method achieves a better precision-recall curve than the traditional bag-of-words model and other deep learning methods and is highly robust to environmental changes. Experiments on datasets from three different scenarios further demonstrate that the method can be applied in real-world settings with good performance.
{"title":"Loop closure detection of visual SLAM based on variational autoencoder","authors":"Shibin Song, Fengjie Yu, Xiaojie Jiang, Jie Zhu, Weihao Cheng, Xiao Fang","doi":"10.3389/fnbot.2023.1301785","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1301785","url":null,"abstract":"<p>Loop closure detection is an important module for simultaneous localization and mapping (SLAM). Correct detection of loops can reduce the cumulative drift in positioning. Because traditional detection methods rely on handicraft features, false positive detections can occur when the environment changes, resulting in incorrect estimates and an inability to obtain accurate maps. In this research paper, a loop closure detection method based on a variational autoencoder (VAE) is proposed. It is intended to be used as a feature extractor to extract image features through neural networks to replace the handicraft features used in traditional methods. This method extracts a low-dimensional vector as the representation of the image. At the same time, the attention mechanism is added to the network and constraints are added to improve the loss function for better image representation. In the back-end feature matching process, geometric checking is used to filter out the wrong matching for the false positive problem. Finally, through numerical experiments, the proposed method is demonstrated to have a better precision-recall curve than the traditional method of the bag-of-words model and other deep learning methods and is highly robust to environmental changes. In addition, experiments on datasets from three different scenarios also demonstrate that the method can be applied in real-world scenarios and that it has a good performance.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"84 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2023-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139499694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments
Pub Date: 2023-12-26 | DOI: 10.3389/fnbot.2023.1302898
Xiaoran Kong, Yatong Zhou, Zhe Li, Shaohai Wang
Target assignment and path planning are crucial for the cooperativity of multiple unmanned aerial vehicle (UAV) systems, but both are challenging given dynamic environments and the partial observability of UAVs. In this article, the multi-UAV target assignment and path planning problem is formulated as a partially observable Markov decision process (POMDP), and a novel deep reinforcement learning (DRL)-based algorithm is proposed to address it. Specifically, a target assignment network is introduced into the twin-delayed deep deterministic policy gradient (TD3) algorithm to solve the target assignment and path planning problems simultaneously. At each step, the target assignment network assigns a target to every UAV, while TD3 guides the UAVs to plan paths based on the assignment result and provides training labels for optimizing the assignment network. Experimental results demonstrate that the proposed approach ensures an optimal complete target allocation and a collision-free path for each UAV in three-dimensional (3D) dynamic multiple-obstacle environments, with superior target-completion performance and better adaptability to complex environments than existing methods.
{"title":"Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments","authors":"Xiaoran Kong, Yatong Zhou, Zhe Li, Shaohai Wang","doi":"10.3389/fnbot.2023.1302898","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1302898","url":null,"abstract":"<p>Target assignment and path planning are crucial for the cooperativity of multiple unmanned aerial vehicles (UAV) systems. However, it is a challenge considering the dynamics of environments and the partial observability of UAVs. In this article, the problem of multi-UAV target assignment and path planning is formulated as a partially observable Markov decision process (POMDP), and a novel deep reinforcement learning (DRL)-based algorithm is proposed to address it. Specifically, a target assignment network is introduced into the twin-delayed deep deterministic policy gradient (TD3) algorithm to solve the target assignment problem and path planning problem simultaneously. The target assignment network executes target assignment for each step of UAVs, while the TD3 guides UAVs to plan paths for this step based on the assignment result and provides training labels for the optimization of the target assignment network. Experimental results demonstrate that the proposed approach can ensure an optimal complete target allocation and achieve a collision-free path for each UAV in three-dimensional (3D) dynamic multiple-obstacle environments, and present a superior performance in target completion and a better adaptability to complex environments compared with existing methods.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"8 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2023-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139515443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}