International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023): latest articles

Research on automatic scoring algorithm for English composition based on machine learning

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014482

Hui Li

English composition scoring methods based on hand-crafted features struggle to extract deep semantic features, while methods based on neural networks struggle to capture shallow features such as word count, so each family of methods has its own limitations. Building on existing research, this paper proposes an English composition scoring method that combines hand-crafted feature extraction with deep learning. The method uses manually designed features to extract shallow word- and sentence-level features from the composition, draws on existing methods to extract its semantic features, and performs regression on the deep and shallow features to obtain the total score. The experiments use the Pearson correlation coefficient to measure the correlation between the predicted and true total scores under the combined method. Compared with the average results of 0.747 and 0.645 for baseline models such as BiLSTM and RNN, the proposed algorithm improves by 0.068 and 0.17, respectively, which demonstrates the effectiveness of the method.
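
The evaluation described above can be illustrated with a minimal sketch. The feature set in `shallow_features` and both function names are hypothetical choices for illustration; the abstract does not list the actual features or the regression model, so only the word and sentence counts and the Pearson coefficient are shown.

```python
import math
import re


def shallow_features(essay: str) -> dict:
    """Word- and sentence-level counts (an illustrative feature set, not the paper's)."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    return {
        "num_words": len(words),
        "num_sentences": len(sentences),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Calling `pearson(predicted_scores, true_scores)` on two lists of essay scores gives the kind of figure the abstract reports. Read literally, the reported numbers imply the same score for the proposed method against both baselines, since 0.747 + 0.068 and 0.645 + 0.17 both give 0.815.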

Citations: 0

Enhancing audio perception in augmented reality: a dynamic vocal information processing framework

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014440

Danqing Zhao, Shuyi Xin, Lechen Liu, Yihan Sun, Anqi Du

The development of the Metaverse has attracted widespread attention among researchers, and many technologies have accordingly been developed to improve the user's sense of reality in the Metaverse. In particular, Extended Reality (XR), an indispensable technology and research direction in the study of the Metaverse, aims to bring seamless transitions between the virtual world and the real world together with an immersive experience. However, what is currently lacking is the ability to simultaneously separate, classify, and locate dynamic human voice information so as to enhance voice perception in complex noise environments. This article proposes a framework that uses an FCNN for separation, algebraic models for positioning to obtain estimated distances, and an SVM for classification. The dataset is built to simulate distance-related changes with accurate ground-truth labels. The results show that the method can effectively separate, classify, and locate mixed sound data, providing users with comprehensive information about the content, gender, and distance of speakers in complex sound environments and enhancing their immersive experience and perception. The innovation lies in combining three audio processing technologies, and the proposed framework may well inspire future work on related topics.

Citations: 0

Research on surface defect classification method of hot rolled strip steel based on comparative learning

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014479

Xingshuai Zang, Shengnan Zhang, Yu He

Hot-rolled steel plates and strips are thin, and the vast majority of their defects are surface defects that can easily lead to production accidents. Defect detection is further limited by insufficient datasets and a large amount of unlabeled data. To address these problems, this paper proposes a comparative learning method. A dual data augmentation strategy is adopted: first, the original images are augmented through manual processing, and CycleGAN is introduced for style transfer to enrich the dataset. Then a ResNet152 network is used for feature extraction, and several comparative learning methods are applied to compare their accuracy in hot-rolled strip defect detection. Finally, the improved comparative learning method proposed in this article improves the accuracy of surface defect classification for hot-rolled strip steel. Through this research, we aim to provide more reliable quality control methods for industrial production and to reduce the risk of production accidents.
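
The abstract does not name the specific learning objectives it compares. As a rough illustration only, and assuming that "comparative learning" here refers to contrastive learning over two augmented views of the same image, the widely used InfoNCE loss for a single anchor can be sketched as follows. The function names and the temperature value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE loss for one anchor: -log(exp(s_pos/t) / sum_k exp(s_k/t))."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]
```

In this setting the anchor and positive would be features (for example from the ResNet152 backbone) of two augmentations of the same strip image, and the negatives would come from other images. The loss is small when the positive is closer to the anchor than the negatives are.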

Citations: 0

Multi-objective vehicle routing problem with time windows under uncertain conditions

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014402

Jiashuo Guo, Yuxin Liu

In this paper, we study the multi-objective vehicle routing problem with time windows under uncertainty. To solve it efficiently, a robust multi-objective particle swarm optimization algorithm that incorporates simulated annealing is proposed. The new algorithm aims to improve the local search ability of the particles. Experimental results show that the proposed algorithm outperforms the traditional robust multi-objective particle swarm optimization algorithm on the selected problem sets as the intensity of the uncertain disturbance increases.
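
How simulated annealing strengthens a particle's local search can be sketched with the standard Metropolis acceptance rule. This is a single-objective simplification with hypothetical names and parameter values; the abstract does not describe how the authors handle multiple objectives, robustness, or routing constraints.

```python
import math
import random


def sa_accept(current_cost, candidate_cost, temperature, rng=random):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if candidate_cost <= current_cost:
        return True
    return rng.random() < math.exp((current_cost - candidate_cost) / temperature)


def sa_local_search(position, cost, step=0.1, t0=1.0, cooling=0.95, iters=200, seed=0):
    """Refine one particle position by annealed random perturbation."""
    rng = random.Random(seed)
    best, best_cost = list(position), cost(position)
    cur, cur_cost = list(best), best_cost
    t = t0
    for _ in range(iters):
        cand = [x + rng.uniform(-step, step) for x in cur]
        cand_cost = cost(cand)
        if sa_accept(cur_cost, cand_cost, t, rng):
            cur, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = list(cur), cur_cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost
```

A swarm update would call `sa_local_search` on selected particles after the usual velocity and position update. Accepting occasional worse moves at high temperature is what lets a particle leave a poor local region.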

Citations: 0

Design and realization of cross-border e-commerce logistics intelligent monitoring and early warning system based on improved genetic algorithm

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014648

Fei Lei, Zicen Liao, Mingxiu Huang, Hui Tian

The rapid development of the cross-border e-commerce market has increased logistics complexity, and intelligent monitoring and early warning systems are needed to meet the resulting challenges. The objective of this study is to design and implement a cross-border e-commerce logistics monitoring and early warning system based on an improved genetic algorithm to enhance the reliability of transportation quality. The system collects data related to cross-border e-commerce logistics transportation quality, analyzes and optimizes the improved genetic algorithm in one system, and uses the improved genetic algorithm for decision-making and planning. The system has a real-time monitoring function that discovers potential transportation quality problems and conducts predictive analysis to identify them in advance for timely warning. The system can provide cross-border e-commerce enterprises with more efficient logistics and transportation quality management, reduce costs, and improve customer satisfaction. It helps enterprises cope with logistics challenges, provide more reliable services, and promote the continued development and prosperity of cross-border e-commerce.

Citations: 0

Jamming detection based on phase feature for SAR images

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014617

Haoyu Zhang, Sinong Quan, Shiqi Xing, Yitao Liu

Synthetic Aperture Radar (SAR) can produce high-resolution complex-valued images, which have extensive applications in both civil and military domains. Among these applications, SAR electronic countermeasures are currently a prominent area of research interest. In radar electronic countermeasures, the difference between the features of real and false targets is shrinking, which makes jamming detection increasingly challenging. This paper examines the phase of SAR images and presents a method for identifying SAR jamming regions based on phase features. The first step organizes the cluttered phase information into neighborhood phase differences. This information is then combined with the amplitude to obtain the weighted phase difference, a metric that effectively captures the extent of phase distortion caused by jamming. Simulation experiments demonstrate that the proposed feature and method can accurately identify and filter out the jamming region in SAR images. The work also demonstrates the potential of phase information in SAR image interpretation and electronic countermeasures.
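
One possible reading of the two steps above is sketched here. The abstract does not define the neighborhood or the weighting, so this sketch assumes a 4-connected neighborhood, the mean absolute wrapped phase difference, and a weight equal to the amplitude normalized by the image maximum. The function name is hypothetical.

```python
import cmath


def weighted_phase_difference(img):
    """img: 2-D list of complex samples. Returns a 2-D list of floats."""
    rows, cols = len(img), len(img[0])
    max_amp = max(abs(z) for row in img for z in row) or 1.0
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            diffs = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # The phase of z * conj(neighbor) is the wrapped phase difference.
                    prod = img[r][c] * img[rr][cc].conjugate()
                    diffs.append(abs(cmath.phase(prod)))
            mean_diff = sum(diffs) / len(diffs) if diffs else 0.0
            out[r][c] = (abs(img[r][c]) / max_amp) * mean_diff
    return out
```

Under these assumptions a region with smoothly varying phase yields values near zero, while a region whose phase differs sharply from its neighbors yields larger values, which could then be thresholded to flag candidate jamming regions.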

Citations: 0

Research on Green glass of Cantonese colored windows based on color model

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014495

Xiaoqing Wang, Ying Du

Based on color models, this paper investigates the green glass in Cantonese colored windows, aiming to establish a standard value for the green of Cantonese colored windows and to provide relevant data and suggestions for protecting the design concept of Cantonese architectural decoration and regulating the use of color in colored windows. Samples were collected, and with the help of image processing and color analysis software the green glass was located and analyzed in both the RGB and LAB color models. In the RGB color model, the threshold intervals of green glass are mainly concentrated in the Forest Green and Green regions; in the LAB color model, the threshold intervals of the two color channels of green glass are mainly distributed in the range of medium and low saturation. This study provides digitized standard values and reference data for the green of Cantonese colored windows, which helps to maintain the design style of traditional architecture and promote the development and application of colored windows.
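
The abstract does not say which software or conversion the authors used to move between the two color models. As a sketch, the standard sRGB to CIELAB conversion under a D65 white point (an assumption here) looks like this; the function name is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIELAB (D65 white point). Returns (L, a, b)."""

    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB to CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)
```

In CIELAB the `a` channel runs from green (negative) to red (positive) and the `b` channel from blue (negative) to yellow (positive), so green glass samples should show a negative `a` value. These are presumably the two color channels whose threshold intervals the study reports.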

Citations: 0

An improved dung beetle optimizer

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014472

Jinxiang Feng, Jingyang Li, Yufeng Zhang, H. Baoyin

Dung Beetle Optimizer (DBO) is an effective metaheuristic algorithm proposed in 2022. However, DBO suffers from an imbalance between local and global search during exploration, tends to fall into local optima, and has exploitation ability that needs further improvement. We therefore propose an improved DBO algorithm, named CDBO, to address these shortcomings. First, Tent chaotic mapping is used to initialize the population, which improves the quality of the initial solutions, increases population diversity, and strengthens the global search capability of the algorithm. Second, dynamic weighting factors are introduced so that the algorithm can search local areas thoroughly while still taking global exploration into account. To assess the effectiveness of CDBO, 12 benchmark test functions were used to evaluate its performance, and CDBO was compared with other widely recognized metaheuristic algorithms. The results show that CDBO improves search accuracy and convergence speed. Finally, CDBO was applied to an airfoil optimization problem, verifying the feasibility of applying it to practical engineering problems.
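
Tent-map initialization can be sketched as below. The map parameter `a = 0.7` and seed value `x0` are assumptions, not values from the abstract. A skewed parameter is chosen because the symmetric map (`a = 0.5`) collapses to zero after a few dozen iterations in binary floating point.

```python
def tent_map_population(pop_size, dim, lower, upper, a=0.7, x0=0.37):
    """Initialize a population from a Tent chaotic sequence scaled to [lower, upper]."""
    x = x0
    population = []
    for _ in range(pop_size):
        individual = []
        for _ in range(dim):
            # Tent map: rising branch on [0, a), falling branch on [a, 1]
            x = x / a if x < a else (1.0 - x) / (1.0 - a)
            individual.append(lower + x * (upper - lower))
        population.append(individual)
    return population
```

For example, `tent_map_population(30, 10, -100.0, 100.0)` returns 30 candidate solutions of dimension 10. Compared with plain pseudo-random sampling, the chaotic sequence is deterministic given `x0`, and it is often used in the hope of covering the search space more evenly.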

Citations: 0

Research on target detection algorithm based on vehicle detection

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014382

Yanguo Huang, Zehao Rao, Luo Li

To address unsatisfactory vehicle detection in complex scenes, an improved vehicle target detection network model is proposed. First, the Res2Net residual network is fused into CSP to form the CSP_R structure, so that the model can extract deeper feature information and better characterize small-scale targets. Second, an attention mechanism is introduced and the C3_CBAM module is designed to strengthen attention to detection targets while avoiding an increase in the model's computational cost. Third, the MPDIoU loss is introduced for bounding-box regression; it optimizes the loss by combining the length, width, and area losses between the predicted box and the ground-truth box with quantitative indicators, improving the convergence speed and robustness of the model. Finally, the model is validated on the SODA10M dataset. The experimental results show that the detection speed reaches 32 frames per second and the average detection accuracy reaches 83.7%, an improvement of 7.8 percentage points over YOLOv5s.
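
The abstract does not give the loss formula. The sketch below follows the MPDIoU definition as it is commonly stated: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared diagonal of the input image. Treat the normalization and the function name as assumptions rather than the authors' exact implementation.

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2. Returns 1 - MPDIoU."""
    # Intersection over union
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter)

    # Squared distances between matching corners
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right
    norm = img_w ** 2 + img_h ** 2

    return 1.0 - (iou - d1 / norm - d2 / norm)
```

Because the corner distances remain informative even when two boxes do not overlap, the loss still gives a useful signal where plain IoU would be flat at zero.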

Citations: 0

The application of target tracking algorithm in intelligent video system to flight support

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)

Pub Date : 2024-01-09 DOI: 10.1117/12.3014375

Jianjun Peng, Jialei Zhai, Xiang Jin, Chengshuang Hu, Zaigang Li

As the global pandemic gradually eases and the aviation transport industry continues to grow steadily, high-density flight operations are becoming the new normal. Making flight support processes intelligent is a crucial avenue for enhancing both the safety and efficiency of flight operations. With the advancement of computer vision technology, video-based object tracking has shown significant potential in flight support processes. However, in real airport environments, object tracking often encounters challenges such as occlusion, scale variation, rotation, and changes in lighting conditions, which reduce tracking accuracy and can even cause target loss. In this paper, our focus is on overcoming tracking failures caused by occlusion, deformation, and lighting variations. Taking into consideration the unique characteristics of airport environments and the specific requirements of flight support processes, we conducted the following work: (i) We used features at three levels, namely Histogram of Oriented Gradients (HOG), Color Names, and Convolutional Neural Network (CNN) features, to describe the texture, color, and high-level semantics of video images, respectively. (ii) We employed a multi-feature fusion approach using a trilinear interpolation function to integrate information from the various sources. (iii) We implemented improved ECO algorithms for tracking moving objects in the airport environment. Finally, we validated this object tracking system using real surveillance videos from the airport. Experimental results demonstrate the effectiveness and practicality of the method under challenging conditions.

Citations: 0
