
International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023): Latest Publications

Multi-model ensemble-based pneumonia X-ray image classification
Guanglong Zheng
Pneumonia is a life-threatening respiratory infection that affects millions of individuals worldwide. Early and accurate diagnosis is crucial for effective treatment and patient care. In recent years, deep learning techniques have shown remarkable promise in automating pneumonia diagnosis from X-ray images, yet the inherent variability of X-ray images and the complexity of pneumonia patterns still make high classification accuracy difficult to achieve. In this paper, we propose a novel approach to pneumonia X-ray image classification based on a multi-model ensemble. Our method leverages the strengths of diverse deep learning architectures and achieves superior classification performance compared with single models. Extensive experiments on a public and a private dataset show accuracy improvements of 7.53 and 3.36, respectively, indicating that the proposed method is highly usable.
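A multi-model ensemble of this kind is often realized as soft voting over member networks. Below is a minimal sketch assuming that setup: the softmax outputs of several pretrained backbones are averaged. The backbone choices (ResNet-18, DenseNet-121), the 224×224 input, and the two-class setup are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical soft-voting ensemble for 2-class pneumonia X-ray classification.
import torch
import torch.nn.functional as F
from torchvision import models

NUM_CLASSES = 2  # assumption: normal vs. pneumonia

def build_members():
    members = []
    for ctor, head in [(models.resnet18, "fc"), (models.densenet121, "classifier")]:
        net = ctor(weights="DEFAULT")                                # ImageNet-pretrained
        in_feats = getattr(net, head).in_features
        setattr(net, head, torch.nn.Linear(in_feats, NUM_CLASSES))   # new classifier head
        members.append(net.eval())
    return members

@torch.no_grad()
def ensemble_predict(members, x):
    # x: (B, 3, 224, 224) normalized chest X-ray batch; average member probabilities
    probs = torch.stack([F.softmax(net(x), dim=1) for net in members]).mean(dim=0)
    return probs.argmax(dim=1), probs

labels, probs = ensemble_predict(build_members(), torch.randn(4, 3, 224, 224))
print(labels)
```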
{"title":"Multimodel ensemble-based Pneumonia x-ray image classification","authors":"Guanglong Zheng","doi":"10.1117/12.3014404","DOIUrl":"https://doi.org/10.1117/12.3014404","url":null,"abstract":"Pneumonia is a life-threatening respiratory infection that affects millions of individuals worldwide. Early and accurate diagnosis of pneumonia is crucial for effective treatment and patient care. In recent years, deep learning techniques have shown remarkable promise in automating the diagnosis of pneumonia from X-ray images. However, the inherent variability in X-ray images and the complexity of pneumonia patterns pose significant challenges to achieving high classification accuracy. In this paper, we propose a novel approach for pneumonia X-ray image classification based on multiple model ensemble. Our method leverages the strengths of diverse deep learning architectures and achieves superior classification performance compared to single models. We conducted extensive experiments on both public and private datasets, and the proposed method achieved accuracy improvements of 7.53 and 3.36, respectively. The experimental results indicate that the proposed method has high usability.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
FBS_YOLO3 vehicle detection algorithm based on viewpoint information
Chunbao Huo, Zengwen Chen, Zhibo Tong, Ya Zheng
The FBS_YOLO3 vehicle detection algorithm is a novel solution to the challenge of detecting vehicles in unstructured road scenarios with limited warning information. The algorithm builds upon the YOLOv3 model to deliver multi-scale target detection. Firstly, FBS_YOLO3 incorporates four convolutional residual structures into the YOLOv3 backbone to extract deeper features via down-sampling. Secondly, the feature fusion network is improved with a PAN structure, which enhances the accuracy and robustness of viewpoint recognition through top-down and bottom-up feature fusion. Lastly, the anchor boxes are redefined by K-means clustering with an IoU-based (cross-ratio) loss, resolving the mismatch between the YOLOv3 network's predetermined anchor sizes and the data. Experiments on a self-built dataset show that FBS_YOLO3 improves mAP by 3.15% over the original network while maintaining a detection rate of 37 fps. Moreover, FBS_YOLO3 accurately detects vehicles, identifies viewpoint information, and effectively alleviates the lack of warning information in unstructured road scenarios.
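The K-means anchor redefinition can be sketched concretely: cluster ground-truth box width/height pairs with 1 - IoU as the distance and take cluster medians as anchors. The synthetic box sizes, k = 9, and the median update rule are assumptions for illustration, not the paper's exact procedure.

```python
# Illustrative K-means anchor clustering with 1 - IoU as the distance.
import numpy as np

def iou_wh(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between box and anchor shapes (width/height only, corners aligned)."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = ((boxes[:, 0] * boxes[:, 1])[:, None] +
             (anchors[:, 0] * anchors[:, 1])[None, :] - inter)
    return inter / union

def kmeans_anchors(wh: np.ndarray, k: int = 9, iters: int = 100, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(iou_wh(wh, anchors), axis=1)   # nearest anchor under 1 - IoU
        new = np.array([np.median(wh[assign == j], axis=0) if np.any(assign == j)
                        else anchors[j] for j in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors.prod(axis=1))]      # sort anchors by area

wh = np.random.default_rng(1).uniform(8, 320, size=(1000, 2))  # synthetic box sizes
print(np.round(kmeans_anchors(wh), 1))
```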
{"title":"FBS_YOLO3 vehicle detection algorithm based on viewpoint information","authors":"Chunbao Huo, Zengwen Chen, Zhibo Tong, Ya Zheng","doi":"10.1117/12.3014408","DOIUrl":"https://doi.org/10.1117/12.3014408","url":null,"abstract":"The FBS_YOLO3 vehicle detection algorithm is a novel solution to the challenge of detecting vehicles in unstructured road scenarios with limited warning information. This algorithm builds upon the YOLOv3 model to deliver advanced multi-scale target detection. Firstly, FBS_YOLO3 incorporates four convolutional residual structures into the YOLOv3 backbone network to obtain deeper feature knowledge via down-sampling. Secondly, the feature fusion network is improved by implementing a PAN network structure which enhances the accuracy and robustness of viewpoint recognition through top-down and bottom-up feature fusion. Lastly, the K-means clustering fusion cross-comparison loss function is utilized to redefine the anchor frame by employing a K-means fusion cross-ratio loss function. This innovative approach solves the issue of mismatching the predetermined anchor frame size of the YOLOv3 network. Experimental results demonstrate that FBS_YOLO3 on a self-built dataset can improve mAP by 3.15% compared with the original network, while maintaining a quick detection rate of 37 fps. Moreover, FBS_YOLO3 can accurately detect vehicles, identify viewpoint information, and effectively solve the warning information insufficiency problem in unstructured road scenarios.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Detection algorithm for diabetic retinopathy based on ResNet and transfer learning
Weihua Wang, Li Lei
DR (diabetic retinopathy) is a chronic, progressive disease that impairs eyesight and can even cause blindness. Identifying DR and grading its severity is significant for the timely diagnosis and treatment of patients, for the quality of life of people, especially the elderly, and for the efficiency of diagnosis. In this study, aiming at efficient and accurate grading of DR levels, a DR recognition and classification algorithm based on ResNet and transfer learning is proposed. First, the shallow feature extraction module of ResNet-18 is used to obtain retinal image features, and a fully connected classification structure for DR is designed on top of it. Transfer learning is then used to train the network weights and improve the generalization ability of the model, with ResNet-18 selected as the backbone network for feature extraction. Results show that the accuracy on the training set reaches a level that provides useful guidance for automatic DR diagnosis and effectively alleviates the problem of low DR classification accuracy.
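The described recipe, a pretrained ResNet-18 backbone with a new fully connected DR classifier trained by transfer learning, can be sketched as follows; the five DR grades and the frozen-backbone policy are assumptions, since the abstract does not state them.

```python
# Minimal transfer-learning sketch: pretrained ResNet-18 plus a new DR head.
import torch.nn as nn
from torchvision import models

def build_dr_model(num_grades: int = 5, freeze_backbone: bool = True) -> nn.Module:
    model = models.resnet18(weights="DEFAULT")   # transfer ImageNet features
    if freeze_backbone:
        for p in model.parameters():
            p.requires_grad = False              # keep pretrained weights fixed
    model.fc = nn.Linear(model.fc.in_features, num_grades)  # new, trainable head
    return model

model = build_dr_model()
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # only the new head's parameters are updated
```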
{"title":"Detection algorithm for diabetic retinopathy based on ResNet and transfer learning","authors":"Weihua Wang, Li Lei","doi":"10.1117/12.3014400","DOIUrl":"https://doi.org/10.1117/12.3014400","url":null,"abstract":"DR (Diabetic retinopathy) a chronic progressive disease which affects eyesight and even causes blindness. It is significance to carry out the identification and severity diagnosis of DR, timely diagnosis and treatment of DR Patients, improve the people’s quality, especially the elderly, and improve the efficiency of diagnosis. In this study, with the goal of efficient and accurate division of DR Levels, a DR Recognition and classification algorithm based on ResNet and transfer learning is proposed. Firstly, shallow feature extraction module of ResNet18 is used to get retinal image feature, and then the fully connected classification structure model of DR Is designed. Then the transfer learning method is combined to train the network weights to improve the generalization ability of the model, ResNet-18 is selected as the backbone network model for feature extracting. Results show that the accuracy of the training set reaches to provide useful guidance for DR Automatic diagnosis, and effectively alleviates the problem of low accuracy of DR Classification","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Research on dark level correction method for CMOS image sensors
Yizhe Wang, Zhongjie Guo, Youmei Guo
To obtain higher imaging quality, the dark level generated during the operation of CMOS image sensors (CIS) needs to be corrected. In this paper, a dark level correction circuit is designed around a 4T active pixel; it comprises a dark current cancellation circuit and a switched-capacitor amplifier circuit. First, the dark current is sampled in real time from the dark pixels at the periphery of the pixel array, and the dark current noise read out there is subtracted from the image signals output by the columns, yielding a more accurate output signal and eliminating the dark level caused by the dark current. A switched-capacitor amplifier then collects and amplifies the signals to facilitate subsequent ADC processing. The proposed circuit was verified in a 110 nm process; the results show that, by sampling the peripheral dark pixels in real time, the designed correction circuit suppresses more than 85% of the original dark-current noise in the exposure stage.
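A software analogue of the correction clarifies the idea: estimate the dark level from the optically shielded border pixels and subtract it per column before further processing. The four dark rows, the array size, and the 12 DN offset below are made-up illustration values, not the circuit's parameters.

```python
# Software analogue of dark-level correction using shielded border pixels.
import numpy as np

def correct_dark_level(raw: np.ndarray, dark_rows: int = 4) -> np.ndarray:
    """raw: (H, W) readout whose top `dark_rows` rows are optically shielded."""
    column_offset = raw[:dark_rows, :].mean(axis=0)   # per-column dark-level estimate
    return raw[dark_rows:, :] - column_offset         # active rows with offset removed

rng = np.random.default_rng(0)
offset = 12.0                                          # assumed dark level in DN
dark = rng.normal(offset, 2.0, size=(4, 1920))         # shielded border pixels
active = rng.normal(100.0 + offset, 2.0, size=(1080, 1920))  # signal + dark offset
frame = np.vstack([dark, active])
print(correct_dark_level(frame).mean())                # ~100 after correction
```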
{"title":"Research on dark level correction method for CMOS image sensors","authors":"Yizhe Wang, Zhongjie Guo, Youmei Guo","doi":"10.1117/12.3014385","DOIUrl":"https://doi.org/10.1117/12.3014385","url":null,"abstract":"To obtain higher imaging quality, the dark level generated during the operation of CMOS image sensors (CIS) needs to be corrected. In this paper, a dark level correction circuit is designed based on a 4 T active pixel, which includes a dark current cancellation circuit and a switched capacitor amplifier circuit. First, the dark current is collected in real time by using the dark pixels in the periphery of the face array, and the dark current noise is read out and differed from the image signals output from the columns to obtain a more accurate output signal, thus eliminating the dark level caused by the dark current. Then the switched-capacitor amplifier is used to collect and amplify the signals to facilitate the subsequent ADC processing. Based on the 110 nm process for the proposed method of specific circuit design verification, the verification results show that the dark level correction circuit designed in this paper through a real-time sampling of the dark pixels of the periphery of the array can be reduced to the exposure stage of the dark current noise to more than 85% of the original.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Detection and recognition of pulmonary nodules based on convolutional neural network
Qiangchao Shi, Zhibing Shu
Lung cancer is the cancer with the highest incidence and mortality in China and seriously threatens human life. Pulmonary nodules are the main precursor of lung cancer, and their precise identification plays a crucial role in clinical diagnosis. This paper proposes a lung nodule detection model that incorporates global image information to address these issues. The model is based on an improved YOLOv5 network. Finally, comparative experiments verify the accuracy and effectiveness of the model.
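The abstract states only that global image information is fused into an improved YOLOv5. One plausible reading, shown below as a hypothetical module rather than the authors' design, is to pool a global context vector and broadcast it back onto the local feature maps.

```python
# Hypothetical global-context fusion module (illustrative, not the paper's design).
import torch
import torch.nn as nn

class GlobalContextFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # global image statistics
        self.proj = nn.Conv2d(channels, channels, 1)  # re-embed the context

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ctx = self.proj(self.pool(x))   # (B, C, 1, 1) global descriptor
        return x + ctx                  # broadcast-add context onto local features

feat = torch.randn(2, 256, 40, 40)
print(GlobalContextFusion(256)(feat).shape)  # torch.Size([2, 256, 40, 40])
```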
{"title":"Detection and recongnition of pulmonary nodules based on convolution neural network","authors":"Qiangchao Shi, Zhibing Shu","doi":"10.1117/12.3014478","DOIUrl":"https://doi.org/10.1117/12.3014478","url":null,"abstract":"Lung cancer is the disease with the highest incidence rate and mortality of cancer in China, which seriously threatens human life safety. Pulmonary nodules are the main factor leading to lung cancer, and their precise identification plays a crucial role in clinical diagnosis. This paper proposes a lung nodule detection model that combines global image information to address issues. The model is based on improved YOLOV5 network. Finally, comparative experiments have verified the accuracy and effectiveness of this model.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Iterative segmentation and propagation based interactive video object segmentation
Sihan Luo, Sizhe Yang, Xia Yuan
Interactive video object segmentation (iVOS) aims to efficiently produce high-quality segmentation masks of a target object in a video from user interactions. Numerous recent works advance iVOS, but their use of user intent is limited. First, typical modules generate the segmentation directly, without further exploiting the input interaction, and thus miss valuable information. Second, recent iVOS approaches also ignore the raw interaction information, so the final segmentation results are poisoned by erroneous information carried over from the previous round's segmentation masks. To address these weaknesses, this paper proposes an Iterative Segmentation and Propagation based iVOS method, named ISP, that explores user intent more thoroughly. ISP models user intent directly in its PGI2M and TP modules. Specifically, ISP first extracts a coarse-grained segmentation mask by analyzing the user's input; this mask then serves as a prior that aids the PGI2M module. Second, ISP introduces a new interaction-driven self-attention module that recalls the user's intent in the TP module. Extensive experiments on two public datasets show the superiority of ISP over existing methods.
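The round-by-round data flow described above can be sketched as a skeleton in which the coarse intent mask, the PGI2M refinement, and the TP propagation are placeholder callables; their internals are not published in the abstract, so the toy stand-ins below exist only to make the loop runnable.

```python
# Skeleton of the described interaction loop; model internals are placeholders.
import numpy as np
from typing import List

def ivos_session(frames, get_scribbles, coarse_mask, pgi2m, tp, rounds: int = 3):
    masks: List[np.ndarray] = [np.zeros(f.shape[:2], dtype=np.uint8) for f in frames]
    for _ in range(rounds):
        idx, scribble = get_scribbles(frames, masks)        # user marks one frame
        prior = coarse_mask(frames[idx], scribble)          # coarse intent mask
        masks[idx] = pgi2m(frames[idx], prior, masks[idx])  # refine with the prior
        masks = tp(frames, masks, idx, scribble)            # propagate, reusing intent
    return masks

# toy stand-ins so the skeleton runs end to end
frames = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(5)]
get_scribbles = lambda fs, ms: (2, np.ones((64, 64), dtype=np.uint8))
coarse_mask = lambda f, s: s
pgi2m = lambda f, prior, prev: np.maximum(prior, prev)
tp = lambda fs, ms, idx, s: [ms[idx].copy() for _ in fs]
print(sum(m.sum() for m in ivos_session(frames, get_scribbles, coarse_mask, pgi2m, tp)))
```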
{"title":"Iterative segmentation and propagation based interactive video object segmentation","authors":"Sihan Luo, Sizhe Yang, Xia Yuan","doi":"10.1117/12.3014487","DOIUrl":"https://doi.org/10.1117/12.3014487","url":null,"abstract":"Interactive video object segmentation (iVOS), which aims to efficiently produce high-quality segmentation masks of the target object in a video with user interactions. Recently, numerous works are proposed to advance the task of iVOS. However, their usages on user intent are limited. First, typical modules usually try to direct generate the segmentation without any further exploration on the input interaction, which misses valuable information. Second, recent iVOS approaches also do not consider the raw interactive information. As a result, the final segmentation results will be poisoned by the erroneous information given by the previous round’s segmentation masks. To solve the aforementioned weaknesses, in this paper, an Iterative Segmentation and Propagation based iVOS method is proposed to conduct better user intent exploration, namely ISP. ISP directly models user intent into the PGI2M module and TP module. Specifically, ISP first extracts a coarse-grained segmentation mask by analyzing the user’s input. Subsequently, this mask is used as a prior to aid the PGI2M module. Secondly, ISP presents a new interaction-driven self-attention module to recall the user’s intent in the TP module. Extensive experiments on two public datasets show the superiority of ISP over existing methods.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Research on object detection for small objects in agriculture: taking red bayberry as an example
Shan Hua, Kaiyuan Han, Shuangwei Li, Minjie Xu, Shouyan Zhu, Zhifu Xu
With the continuous improvement of intelligent management in red bayberry orchards, the demand for automatic picking and automatic sorting is becoming increasingly apparent. The prerequisite for these automated processes is quickly identifying the maturity of red bayberries through object detection. In this study, we classified red bayberries into 8 maturity levels and achieved an object detection precision of 88.9%, using a fast object detection model combined with small-object optimization methods and small-object feature extraction layers.
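For reference, the reported 88.9% is a detection precision, i.e. TP / (TP + FP); the snippet below shows that calculation per maturity level with synthetic counts, which are not the paper's data.

```python
# Precision per maturity level from hypothetical TP/FP tallies.
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) else 0.0

counts = {level: (178 - level, 22) for level in range(1, 9)}  # made-up tallies
per_level = {level: precision(tp, fp) for level, (tp, fp) in counts.items()}
macro = sum(per_level.values()) / len(per_level)              # mean over 8 levels
print({k: round(v, 3) for k, v in per_level.items()}, round(macro, 3))
```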
{"title":"Research on object detection for small objects in agriculture: taking red bayberry as an example","authors":"Shan Hua, Kaiyuan Han, Shuangwei Li, Minjie Xu, Shouyan Zhu, Zhifu Xu","doi":"10.1117/12.3014464","DOIUrl":"https://doi.org/10.1117/12.3014464","url":null,"abstract":"With the continuous improvement of intelligent management level in red bayberry orchards, the demand for automatic picking and automatic sorting is becoming increasingly apparent. The prerequisite for achieving these automated processes is to quickly identify the maturity of red bayberries by object detection. In this study, we classified red bayberry into 8 levels of maturity and achieved an object detection precision of 88.9%. We used a fast object detection model, combined with small object optimization methods and small feature extraction layers to get higher precision.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Rice extraction from Sentinel-2A image based on feature optimization and UPerNet-Swin Transformer model
Yu Wei, Bo Wei, Xianhua Liang, Zhiwei Qi
Given that rice extraction from remote sensing images still lacks effective feature construction and extraction models, this work considers feature optimization together with a combined deep learning model. Taking a Sentinel-2A image as the data source, a multi-dimensional feature data set is constructed, covering spectral features, red-edge features, vegetation indices, water indices, and texture features. The ReliefF-RFE algorithm is used to optimize the features of the data set for rice extraction, and the combined UPerNet-Swin Transformer model extracts rice in the study area from the optimized features. Comparison with other feature combination schemes and deep learning models demonstrates that: (1) the features optimized by the ReliefF-RFE algorithm give the best segmentation results for rice extraction, with accuracy, recall, F1 score, and IoU reaching 92.77%, 92.28%, 92.52%, and 86.09%, respectively; and (2) compared with PSPNet, UNet, DeepLabv3+, and the original UPerNet under the same optimal feature combination, the combined UPerNet-Swin Transformer model produces fewer misclassifications and omissions, with the F1 score and IoU increased by 11.12% and 17.46%, respectively.
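The ReliefF half of the ReliefF-RFE pipeline can be sketched directly: sample instances, find each one's nearest hit and nearest miss, and reward features that separate the classes. The scorer below is a minimal binary variant on synthetic data; the RFE stage (recursive elimination with a wrapped classifier) and the multi-class details are omitted.

```python
# Minimal binary ReliefF-style feature scoring (a stand-in for the ReliefF stage).
import numpy as np

def relieff_scores(X: np.ndarray, y: np.ndarray, n_samples: int = 200, seed: int = 0):
    rng = np.random.default_rng(seed)
    X = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)  # scale features to [0, 1]
    w = np.zeros(X.shape[1])
    idx = rng.choice(len(X), size=min(n_samples, len(X)), replace=False)
    for i in idx:
        d = np.abs(X - X[i]).sum(axis=1)                   # distance to every sample
        d[i] = np.inf                                      # exclude the sample itself
        hit = np.argmin(np.where(y == y[i], d, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(y != y[i], d, np.inf))   # nearest other-class sample
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / len(idx)

# synthetic stand-in for the spectral/red-edge/index/texture feature stack
X = np.random.default_rng(1).random((500, 20))
y = (X[:, 3] + X[:, 7] > 1.0).astype(int)
print(np.argsort(relieff_scores(X, y))[::-1][:5])  # indices of the top-5 features
```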
{"title":"Rice extraction from Sentinel-2A image based on feature optimization and UPerNet:Swin Transformer model","authors":"Yu Wei, Bo Wei, Xianhua Liang, Zhiwei Qi","doi":"10.1117/12.3014406","DOIUrl":"https://doi.org/10.1117/12.3014406","url":null,"abstract":"Starting from the problem that rice extraction from remote sensing images still faces effective feature construction and extraction model, the feature optimization and combined deep learning model are considered. Taking Sentinel-2A image as data source, a multi-dimensional feature data set including spectral features, red edge features, vegetation index, water index and texture features is constructed. The ReliefF-RFE algorithm is used to optimize the features of the data set for rice extraction, and the combined UPerNet-Swin Transformer model is used to extract the rice from the study area based on the optimized features. Comparison with other feature combination schemes and deep learning models demonstrates that: (1) using the optimized features based on the ReliefF-RFE algorithm has the best segmentation effect for rice extraction, which its accuracy, recall rate, F1 score and IoU reach 92.77%, 92.28%, 92.52% and 86.09%, respectively, and (2) compared with PSPNet, Unet, DeepLabv3+ and the original UPerNet models, the combined UPerNet-Swin Transformer model has fewer misclassifications and omissions under the same optimal feature combination schemes, which the F1 score and IoU are increased by 11.12% and 17.46%, respectively","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Research on brocade defect detection algorithm based on deep learning
Ning Yun
The brocade weaving craft has a long history, with exquisite patterns and profound cultural connotations; it is an outstanding representative of Chinese silk culture and an eye-catching part of the intangible cultural heritage of mankind. Making brocade is a very complicated craft. To detect defects in time during production, and in view of the low efficiency of defect detection in traditional production, the deployment burden of large models, and the shortcomings of DB-YOLOv3, an improved SE-SSD fabric defect detection algorithm is proposed. By improving the network structure and optimizing the prior-box adjustment mechanism, the algorithm strengthens model feature extraction while greatly reducing the parameters and computation of the network. Experimental results show that the SE-SSD algorithm effectively reduces missed detections of linear and weak target defects. Compared with the SSD network, detection accuracy increases by 27.55%, reaching 93.08% mAP; detection speed rises to 49 fps; and network parameters shrink by 51.5%, improving the practicability of the algorithm. The ability to detect small target defects still needs improvement.
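Assuming the "SE" in SE-SSD denotes a squeeze-and-excitation channel-attention block, which the abstract does not spell out, the standard module such a detector would insert into its feature extractor looks like this.

```python
# Standard squeeze-and-excitation block (assumed meaning of "SE" in SE-SSD).
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))           # squeeze: global average pool
        w = self.fc(s).view(b, c, 1, 1)  # excitation: per-channel weights
        return x * w                     # recalibrate feature channels

print(SEBlock(64)(torch.randn(1, 64, 38, 38)).shape)  # torch.Size([1, 64, 38, 38])
```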
{"title":"Research on brocade defect detection algorithm based on deep learning","authors":"Ning Yun","doi":"10.1117/12.3014538","DOIUrl":"https://doi.org/10.1117/12.3014538","url":null,"abstract":"The brocade weaving craft has a long history, with exquisite patterns and profound cultural connotations. It is an excellent representative of Chinese silk culture and an eye-catching business card in the intangible cultural heritage of mankind. The process of making brocade is a very complicated craft. In order to be able to detect defects in time during the production process, an improved SE-SSD fabric defect detection algorithm is proposed for the low efficiency of defect detection in traditional production, the large model affects the deployment and the shortcomings of DB-YOLOv3. By improving the network structure and optimizing the prior frame adjustment mechanism, the algorithm improves the ability of model feature extraction and greatly reduces the parameters and calculation of the network. The experimental results show that the SE-SSD algorithm effectively improves the missed detection of linear and weak target defects. Compared with the SSD network, the detection accuracy is increased by 27.55%, reaching 93.08% mAP, the detection speed is increased to 49FPS, and the network parameters are reduced. 51.5%, which improves the practicability of the algorithm, and the ability to detect small target defects still needs to be improved.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Research on mine moving target detection method based on deep learning
Jiaheng Zhang, Peng Mei, Yongsheng Yang
To address the low accuracy of moving target detection in minefield images caused by indistinct target features, complex background information, and frequent occlusions, this paper proposes a deep-learning-based method for minefield moving target detection. Firstly, a fully dynamic convolutional structure is incorporated into the convolutional blocks of the backbone feature extraction network to reduce redundant information and enhance feature extraction. Secondly, the Swin Transformer structure is introduced during feature fusion to strengthen the perception of local geometric information. Finally, a coordinate attention mechanism updates the fused feature maps, improving the network's ability to detect occluded targets and targets in low-light conditions. Ablation experiments on a self-built minefield dataset and the Pascal VOC dataset show that the proposed algorithm significantly improves the average accuracy of target detection in minefield images.
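The coordinate attention mechanism added in the final step is a published design (Hou et al., 2021) that factorizes pooling along the two spatial axes; a sketch follows, with the reduction ratio and feature sizes as illustrative choices.

```python
# Sketch of a coordinate-attention block in the spirit of Hou et al. (2021).
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        ph = x.mean(dim=3, keepdim=True)                       # (B, C, H, 1) pool along W
        pw = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (B, C, W, 1) pool along H
        y = self.act(self.bn(self.conv1(torch.cat([ph, pw], dim=2))))
        yh, yw = torch.split(y, [h, w], dim=2)                 # split back per axis
        ah = torch.sigmoid(self.conv_h(yh))                    # (B, C, H, 1) attention
        aw = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # (B, C, 1, W) attention
        return x * ah * aw                                     # direction-aware reweighting

print(CoordAttention(128)(torch.randn(2, 128, 32, 32)).shape)  # torch.Size([2, 128, 32, 32])
```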
{"title":"Research on mine moving target detection method based on deep learning","authors":"Jiaheng Zhang, Peng Mei, Yongsheng Yang","doi":"10.1117/12.3014398","DOIUrl":"https://doi.org/10.1117/12.3014398","url":null,"abstract":"In response to the problem of low accuracy in detecting moving targets in minefield images due to indistinct target features, complex background information, and frequent occlusions, this paper proposes a deep learning-based method for minefield moving target detection. Firstly, a fully dynamic convolutional structure is incorporated into the convolutional block of the backbone feature extraction network to reduce redundant information and enhance feature extraction capability. Secondly, the Swin Transformer network structure is introduced during the feature fusion process to enhance the perception of local geometric information. Finally, a coordinate attention mechanism is added to update the fused feature maps, thus enhancing the network's ability to detect occluded targets and targets in low-light conditions. The proposed algorithm is evaluated on a self-built minefield dataset and the Pascal VOC dataset through ablation experiments, and the results show that it significantly improves the average accuracy of target detection in minefield images.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140512087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0