Zhengang Shi, C. Wu, W. Fu, Peng Tao, Linhao Zhang, Bo Gao
To enhance the performance of intelligent watt hour meters, a visual recognition and comparison system based on convolutional neural networks is proposed for intelligent watt hour meter chips. Firstly, the overall framework of the chip visual recognition comparison system is designed. Secondly, the hardware part of the system comprises the image acquisition module and image data transmission module of intelligent watt hour meter chips. In the software part, the classification function is selected based on the structural characteristics and operational principle of convolutional neural networks, and iterative training is used to complete the identification and comparison of smart meter chips. The experimental results demonstrate that this proposed system can significantly improve the accuracy of visual recognition and comparison, while also reducing the time consumption when compared to traditional recognition and comparison systems.
{"title":"Visual recognition and comparison system and method of intelligent watt hour meter chip based on convolutional neural network","authors":"Zhengang Shi, C. Wu, W. Fu, Peng Tao, Linhao Zhang, Bo Gao","doi":"10.1117/12.3014476","DOIUrl":"https://doi.org/10.1117/12.3014476","url":null,"abstract":"To enhance the performance of intelligent watt hour meters, a visual recognition and comparison system based on convolutional neural networks is proposed for intelligent watt hour meter chips. Firstly, the overall framework of the chip visual recognition comparison system is designed. Secondly, the hardware part of the system comprises the image acquisition module and image data transmission module of intelligent watt hour meter chips. In the software part, the classification function is selected based on the structural characteristics and operational principle of convolutional neural networks, and iterative training is used to complete the identification and comparison of smart meter chips. The experimental results demonstrate that this proposed system can significantly improve the accuracy of visual recognition and comparison, while also reducing the time consumption when compared to traditional recognition and comparison systems.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"169 2","pages":"1296921 - 1296921-7"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiacheng Fu, Dengbin Liao, Chunzhi Meng, Anni Huang, Junbing Pan
Conventional collaborative resource scheduling algorithms for heterogeneous cloud environments mainly allocate resources from quantified data characteristics of the heterogeneous cloud resources, which leads to low integrated scheduling efficiency because resource attributes differ. To address this, a collaborative and comprehensive resource scheduling algorithm for heterogeneous cloud environments is proposed. First, the heterogeneous cloud resource information is sampled and processed, and resource quality is graded. The scheduling task model is then built by constructing a mapping function from scheduling tasks to assigned sub-nodes, and a hierarchical scheduling strategy is proposed in combination with the ant colony algorithm. In the experiments, the scheduling efficiency of the designed algorithm is tested. The results indicate that the algorithm achieves a lower average delay and better integrated scheduling efficiency when scheduling heterogeneous cloud resources.
{"title":"Research on collaborative and integrated resource schedulingalgorithm in heterogeneous cloud environment","authors":"Jiacheng Fu, Dengbin Liao, Chunzhi Meng, Anni Huang, Junbing Pan","doi":"10.1117/12.3014391","DOIUrl":"https://doi.org/10.1117/12.3014391","url":null,"abstract":"The current conventional collaborative resource scheduling algorithms in heterogeneous cloud environments mainly process the allocation through the quantified results of data characteristics of heterogeneous cloud resources, which leads to low integrated scheduling efficiency due to the differences in the attributes of resources. In this regard, a collaborative and comprehensive resource scheduling algorithm in heterogeneous cloud environment is proposed. Firstly, the heterogeneous cloud resource information data is sampled and processed, and the resource quality is graded. The scheduling task model is constructed by constructing the mapping function of scheduling task assignment sub-nodes, and the hierarchical scheduling strategy is proposed by combining with ant colony algorithm. In the experiments, the designed collaborative integrated scheduling algorithm is tested for the scheduling efficiency. 
The final results can prove that the algorithm has a lower average delay and a more desirable integrated scheduling efficiency when the proposed method is used for scheduling heterogeneous cloud resources.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"143 3","pages":"129690J - 129690J-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper proposes a fine-grained image classification architecture using multi-task learning. The network uses ResNeSt as the shared feature extractor in a hard-parameter-sharing multi-task setup with two branches: a fine-grained category label regression branch based on multi-hot labels derived from naming conventions, and a classification branch trained with cross-entropy loss on one-hot labels. The two branches are coupled through a weighting hyperparameter, which enables multi-task classification. Comparison and ablation experiments were performed on the public Stanford Cars, CUB-200-2011, and FGVC-Aircraft datasets. The results show that multi-label regression, multi-task learning, and label smoothing effectively improve the generalization of the model, increasing the inter-class distance and reducing the intra-class distance in the layer preceding the network output.
{"title":"Naming conventions-based multi-label and multi-task learning for fine-grained classification","authors":"Qinbang Zhou, Kezhi Zhang, Feng Yue, Zhaoliang Zhang, Hui Yu","doi":"10.1117/12.3014589","DOIUrl":"https://doi.org/10.1117/12.3014589","url":null,"abstract":"This paper proposes a fine-grained image classification architecture using multi-task learning. The structure of the fine-grained classification network uses ResNest as the feature extraction layer of the multi-task hard parameter sharing mode with the fine-grained category label regression branch based on multi-hot naming conventions and classification branch based on cross-entropy loss with one-hot encoding. The coupling between the two branches enables multi-task classification through hyperparameter weighting. Subsequently, comparison and ablation experiments were performed on the public datasets of Stanford Cars, CUB-200-2011 and FGVC-Aircraft. The experimental result shows multi-label regression, multi-task learning and label smoothing can effectively improve the generalization of the model and increase the inter-class distance of the previous layer at the network output terminal, and reduces the intra-class distance.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":" 30","pages":"129691D - 129691D-7"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139640394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
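The coupling of the two branches through a weighting hyperparameter can be sketched as a weighted sum of two losses. This is a minimal illustration under stated assumptions, not the authors' implementation: the regression branch is scored here with mean squared error against the multi-hot vector, and the names `multi_task_loss` and `weight` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross-entropy against a one-hot target given by its class index."""
    return -math.log(softmax(logits)[target_index])


def multi_hot_mse(preds, multi_hot):
    """Mean squared error against a multi-hot label vector (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(preds, multi_hot)) / len(multi_hot)


def multi_task_loss(class_logits, target_index, label_preds, multi_hot, weight=0.5):
    """Weighted sum of the classification and label-regression losses.

    `weight` plays the role of the coupling hyperparameter; its value
    here is arbitrary.
    """
    cls = cross_entropy(class_logits, target_index)
    reg = multi_hot_mse(label_preds, multi_hot)
    return cls + weight * reg


if __name__ == "__main__":
    # A multi-hot vector could mark name tokens such as make, model and year.
    print(multi_task_loss([2.0, 0.5, -1.0], 0, [0.9, 0.1, 0.8], [1, 0, 1]))
```

Setting `weight` to zero recovers plain one-hot classification, which is one way to run the kind of ablation the abstract mentions.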
Embroidery is an important intangible cultural heritage in China. The development of digital technology has changed how traditional culture is transmitted and inherited. Research on the digital simulation of embroidery is still relatively scarce, and existing approaches suffer from weak generalization ability and a weak three-dimensional appearance. Based on the characteristics of embroidery artworks, this paper proposes an embroidery style generation method that combines an attention mechanism with cycle-consistent adversarial networks. The attention module guides the generator and discriminator so that style transfer is confined to the target area of the embroidery style image, thereby digitally simulating the embroidery art style. The results show that the proposed method generalizes better than traditional embroidery digital simulation methods and produces more realistic embroidery than existing deep learning models.
{"title":"Embroidery style generation with machine learning","authors":"Luojia Wang, Fei Guo","doi":"10.1117/12.3014501","DOIUrl":"https://doi.org/10.1117/12.3014501","url":null,"abstract":"Embroidery is an important intangible cultural heritage in China. The development of digital technology has changed the way of transmission and inheritance of traditional culture. At present, the research on digital simulation of embroidery is still relatively small, and there are some problems such as weak generalization ability and weak three-dimensional sense. According to the characteristics of embroidery art works, this paper proposes an embroidery style generation method combining attention mechanism and cycle-consistent adversarial networks. The attention mechanism module is used to guide the generator and discriminator to control the target area migration of embroidery style images, so as to digitally simulate the embroidery art style. The results show that the proposed method has stronger generalization ability than the traditional embroidery digital simulation method, and has greater optimization in embroidery reality compared with the existing deep learning model.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"10 9","pages":"129691P - 129691P-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139640413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
3D reconstruction technology uses 3D data to create models of physical objects. Cameras, laser scanners, and other sensors gather 3D data of objects, which is then processed with computer graphics techniques to build 3D models. In engineering, high-precision reconstructed models can stand in for physical pipes so that pipe diameters can be measured automatically. This paper proposes a target segmentation-based optimization method for single-frame reconstruction, which enables precise diameter measurement of pipes. Experimental results show that single-frame reconstruction based on target segmentation produces excellent results in the current application scenario. The proposed method is better adapted to complex construction conditions than the complex reconstruction methods. Complex backgrounds include excessive and unevenly distributed light and interfering objects. Using image-processing-based target segmentation, the MIVOS user-interactive video segmentation tool produces and propagates the target object mask based on the user's interaction with a video frame. Removing complex backgrounds improves the quality of the sample images used for reconstruction. MIVOS is used to segment the pipe area in the image and remove most of the background noise, which lessens the interference of background noise in the reconstruction results. The proposed method shows a clear improvement in measuring both the inner and outer diameters of pipes compared with both multi-frame and single-frame reconstruction methods, with an average measurement error of no more than 1 mm. The proposed method provides technical guidance for measuring the inner and outer diameters of pipes under complex conditions.
{"title":"3D pipeline reconstruction and diameter measurement method based on target segmentation","authors":"Guanghai Wu, Hao Zhang, Zhiqi Yan, Haoyu Wang, Zhihao Zhong, Ziao Yin","doi":"10.1117/12.3014403","DOIUrl":"https://doi.org/10.1117/12.3014403","url":null,"abstract":"3D reconstruction technology utilizes 3D data to create models of physical objects. Cameras, laser scanners, and other sensors can be used to gather 3D data of objects, which can be processed using computer graphics technology for creating 3D models through 3D reconstruction technology. In engineering, high-precision 3D reconstruction models can substitute physical pipes for automatic measuring of pipe diameters. This paper proposes a target segmentation-based optimization method for single-frame reconstruction, which enables precise diameter measurement of pipes. Experimental results show that single-frame reconstruction, based on target segmentation technology, produces excellent results in the current application scenario. The proposed method is better adapted to complex construction conditions than the complex reconstruction methods. Complex backgrounds include excessive and uneven distributed light and interfering objects. Using target segmentation technology based on image processing, the MIVOS user-interactive video can produce and distribute the target object mask based on the user's interaction with the video frame. Complex background removal can improve the quality of reconstructed sample images. MIVOS is used to segment the pipe area in the image and remove most of the background noise. Consequently, the process lessens the interference of background noise in the reconstruction results. The proposed method exhibits significant progress in measuring both the inner and outer diameters of pipes when compared to both multi-frame and single-frame reconstruction methods. Their measurements have an average error of no more than 1 mm. 
The proposed method provides technical guidance for measuring the inner and outer diameters of pipes under complex conditions.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"17 1","pages":"1296929 - 1296929-13"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139640406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xun Wang, Yinghui Tan, Tao Li, Chuang Liu, Guanghao Yang, Qian Wang
To predict the multi-dimensional situation of a digital twin active power grid, a prediction method based on the LSTM algorithm is proposed. First, an LSTM-based multi-dimensional situation prediction algorithm for key power grid indicators is established to predict changes in the key indicator attributes of the digital twin active power grid. Second, data for several key indicators, such as load characteristics, are collected, and a multi-dimensional system prediction model is established, which can control the state of the active power grid. The LSTM prediction algorithm is used to fit the characteristics of the multi-dimensional data, and the predicted next-stage multi-dimensional data is mapped to the power digital twin, so that operation planning of the smart energy system can be carried out synchronously and regulated intelligently. Finally, a simulation test model is established, and an example shows that the deep-learning-based multi-dimensional situation prediction method for the digital twin power grid can better predict and distinguish the power grid situation and provide decision support for accurate planning of future energy systems.
{"title":"Multi-dimensional situation prediction of digital twin active power grid based on LSTM algorithm","authors":"Xun Wang, Yinghui Tan, Tao Li, Chuang Liu, Guanghao Yang, Qian Wang","doi":"10.1117/12.3014395","DOIUrl":"https://doi.org/10.1117/12.3014395","url":null,"abstract":"In order to understand the multidimensional situation prediction of digital twin active power grid, a research on multidimensional situation prediction of digital twin active power grid based on LSTM algorithm is proposed. In this paper, firstly, a multi-dimensional situation prediction algorithm of power grid key indicators based on LSTM is established to realize the change prediction of key indicators attributes of digital twin active power grid. Secondly, the data of several key indicators such as load characteristics are collected, and a multi-dimensional system prediction model is established, which can control the state of active power grid; The LSTM prediction algorithm is proposed to fit the characteristics of multi-dimensional data, and the next stage of multi-dimensional data prediction is mapped to the power digital twin, so as to realize the synchronous implementation and intelligent regulation of smart energy system operation planning. 
Finally, a simulation test model is established, and an example shows that the multi-dimensional situation prediction method of digital twin power grid based on deep learning can better predict and distinguish the power grid situation, and provide decision support for accurate planning of energy system in the future.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":" 25","pages":"129690Y - 129690Y-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139640519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper, based on the principles of brushless DC motors, constructs a mathematical model and combines genetic algorithms with traditional PID control. Genetic algorithms are used for parameter optimization to obtain the optimal solution for PID control, achieving higher control precision and stability. A simulation model of the motor and control system is developed using Simulink, and various operational conditions, including normal startup and sudden speed changes during operation, are simulated. The results show significant improvements in the motor's response speed and control precision. There is no overshoot during the startup phase, and the error during constant-speed operation is below 0.5%.
{"title":"Simulation study of brushless DC motor speed control system based on GA-PID","authors":"Yang Tang, Hao Chen, Faxin Zhu","doi":"10.1117/12.3014483","DOIUrl":"https://doi.org/10.1117/12.3014483","url":null,"abstract":"This paper, based on the principles of brushless DC motors, constructs a mathematical model and combines genetic algorithms with traditional PID control. Genetic algorithms are used for parameter optimization to obtain the optimal solution for PID control, achieving higher control precision and stability. A simulation model of the motor and control system is developed using Simulink, and various operational conditions, including normal startup and sudden speed changes during operation, are simulated. The results show significant improvements in the motor's response speed and control precision. There is no overshoot during the startup phase, and the error during constant-speed operation is below 0.5%.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"62 2","pages":"129692X - 129692X-5"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
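The idea of letting a genetic algorithm search for PID gains can be sketched in a few lines. This is only an illustration under loud assumptions: the plant is a first-order lag standing in for the motor (not a brushless DC motor model), the cost is the integrated absolute error, and the gain bounds, population size, and names such as `tune_pid` are hypothetical. The paper itself uses Simulink.

```python
import random

# Hypothetical search bounds for (kp, ki, kd).
BOUNDS = [(0.0, 10.0), (0.0, 10.0), (0.0, 0.1)]


def simulate_cost(kp, ki, kd, setpoint=1.0, dt=0.01, steps=500, tau=0.5):
    """Integrated absolute error of a PID loop around dy/dt = (u - y) / tau."""
    y = 0.0
    integral = 0.0
    prev_error = setpoint - y
    cost = 0.0
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative
        u = max(-10.0, min(10.0, u))  # actuator saturation keeps the loop bounded
        y += dt * (u - y) / tau       # forward-Euler step of the stand-in plant
        prev_error = error
        cost += abs(error) * dt
    return cost


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(ind, rng, rate=0.2):
    """Gaussian mutation, clipped back into the search bounds."""
    out = []
    for gene, (lo, hi) in zip(ind, BOUNDS):
        if rng.random() < rate:
            gene += rng.gauss(0.0, 0.1 * (hi - lo))
        out.append(max(lo, min(hi, gene)))
    return out


def tune_pid(generations=20, pop_size=20, seed=0):
    """Elitist GA; returns (best gains, best cost, best cost per generation)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda g: simulate_cost(*g))
        history.append(simulate_cost(*population[0]))
        elite = population[: pop_size // 4]  # survivors carried over unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = elite + children
    best = min(population, key=lambda g: simulate_cost(*g))
    return best, simulate_cost(*best), history


if __name__ == "__main__":
    gains, cost, _ = tune_pid()
    print("kp, ki, kd =", gains, "cost =", cost)
```

Because the elite individuals are carried over unchanged, the best cost never gets worse from one generation to the next. A cost that also penalizes overshoot would be closer to the startup behaviour the abstract reports.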
Advances in information technology have driven rapid progress in deep learning-based face recognition. Traditional face detection and recognition algorithms perform well under constrained conditions, but under unconstrained conditions their performance degrades sharply when they encounter low-quality images and partially occluded faces. Building on MTCNN and FaceNet, this paper adopts one strategy for each of these two problems. To handle low-quality pictures, a face image quality assessment function is introduced: before face detection, the quality of the face image is assessed, and only images whose quality score reaches the threshold are passed to the model. To handle partial occlusion, the coordinate attention mechanism is introduced, which improves the recognition ability of the model by adaptively increasing the weight of the unoccluded regions of the face. Experimental results show that the accuracy of the proposed algorithm is significantly higher than that of existing algorithms.
{"title":"Research on face recognition algorithm based on FaceNet and coordinate attention","authors":"Tao Zhang, zewu ke","doi":"10.1117/12.3014505","DOIUrl":"https://doi.org/10.1117/12.3014505","url":null,"abstract":"The development of information technology has made the field of deep learning face recognition develop rapidly. The traditional face detection and recognition algorithm can perform well under constrained conditions, but under unconstrained conditions, its effect will be greatly discounted when low quality images and partial occlusion of faces are encountered. Based on MTCNN and FaceNet, this paper adopts two strategies to solve the above two problems respectively. On the one hand, by introducing the face image quality assessment function to solve the problem of low quality pictures, before face detection, a quality assessment of the face image is done, and only the image whose quality score reaches the threshold can be input into the model. On the other hand, the Coordinate attention mechanism is introduced to deal with the problem of partial occlusion of the face, which improves the recognition ability of the model by adaptively enhancing the weight of the unocclusion area of the face. 
Experimental results show that compared with existing algorithms, the accuracy of the proposed algorithm is significantly improved.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"11 3","pages":"129690O - 129690O-5"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Continuous updating and maintenance of feasible solutions is crucial when solving constrained multi-objective optimization problems (CMOPs). However, most existing constrained multi-objective evolutionary algorithms (CMOEAs) are not efficient enough in updating and preserving competitive feasible solutions, thus reducing population diversity. To address this issue, this paper proposes a dual-population (i.e., mainPop and auxPop) constrained multi-objective evolutionary algorithm with a feasible archive set for CMOPs, named DPFAS. The two populations have different functions in the algorithm. Specifically, the mainPop considers both objectives and constraints for solving the original CMOPs, while the auxPop is used only for the optimization of objectives without considering constraints. In addition, a feasible archive set is used to store feasible solutions that are competitive in the auxPop and provide useful information for the mainPop. Moreover, a fitness assignment strategy is designed to speed up the algorithm’s convergence. Particularly, the population converges faster by selecting better-nondominated solutions into the matching pool. Finally, experimental studies on 23 benchmark functions show that the proposed algorithm was more competitive compared with five state-of-the-art CMOEAs.
{"title":"A dual population constrained multiobjective evolutionary algorithm with a feasible archive set","authors":"Xinchang Yu, Yumeng Wang, Tong Zhang, Huaqing Xu","doi":"10.1117/12.3014412","DOIUrl":"https://doi.org/10.1117/12.3014412","url":null,"abstract":"Continuous updating and maintenance of feasible solutions is crucial when solving constrained multi-objective optimization problems (CMOPs). However, most existing constrained multi-objective evolutionary algorithms (CMOEAs) are not efficient enough in updating and preserving competitive feasible solutions, thus reducing population diversity. To address this issue, this paper proposes a dual-population (i.e., mainPop and auxPop) constrained multi-objective evolutionary algorithm with a feasible archive set for CMOPs, named DPFAS. The two populations have different functions in the algorithm. Specifically, the mainPop considers both objectives and constraints for solving the original CMOPs, while the auxPop is used only for the optimization of objectives without considering constraints. In addition, a feasible archive set is used to store feasible solutions that are competitive in the auxPop and provide useful information for the mainPop. Moreover, a fitness assignment strategy is designed to speed up the algorithm’s convergence. Particularly, the population converges faster by selecting better-nondominated solutions into the matching pool. 
Finally, experimental studies on 23 benchmark functions show that the proposed algorithm was more competitive compared with five state-of-the-art CMOEAs.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"590 1","pages":"1296904 - 1296904-5"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To address the large parameter count and high computational cost of YOLOv7 for pedestrian detection in complex street scenarios, this paper proposes a lightweight improvement to the YOLOv7 algorithm. Within the YOLOv7 framework, Partial Convolution (PConv) replaces part of the convolution in the original convolution layers, and the SEAttention attention module is introduced to preserve the detection accuracy of the lightweight algorithm. Experimental results on a self-built dataset show that, compared with the original YOLOv7 algorithm, the improved algorithm reduces the number of model parameters by 11.0% and the computation by 19.4%, while retaining the high accuracy of the original YOLOv7. The algorithm thus reduces both parameters and computation and balances model size against accuracy.
{"title":"Pedestrian object detection algorithm based on lightweight YOLOv7 in complex street scenarios","authors":"Shangqi Cheng, Hongxia Niu","doi":"10.1117/12.3014518","DOIUrl":"https://doi.org/10.1117/12.3014518","url":null,"abstract":"In view of the problems of excessive parameter setting and large calculation of YOLOv7 in pedestrian object detection in complex street scenarios, this paper proposes a lightweight method to improve YOLOv7 algorithm. Under the YOLOv7 framework, Partial Convolution (PConv) is integrated into the convolution of the original algorithm, replacing part of the convolution in the original convolution layer, and the SEAttention attention module is introduced to ensure the detection accuracy of the lightweight algorithm. The experimental results on the home-made data set show that, compared with the original YOLOv7 algorithm, the number of model parameters decreased by 11.0% in the improved YOLOv7 algorithm, and the algorithm calculation volume decreased by 19.4%, while ensuring the high accuracy of the original YOLOv7 algorithm. In this paper, the algorithm reduces the number of parameters and calculations, and achieves the balance of lightweight and accuracy.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"35 1","pages":"129690E - 129690E-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140511684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
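To see why replacing ordinary convolutions with PConv shrinks a model, the parameter and multiply-accumulate counts can be compared directly. The sketch below assumes that PConv means the FasterNet-style partial convolution, which convolves only a fraction of the channels and passes the rest through unchanged. The 1/4 ratio and the function names are illustrative assumptions, not values taken from the paper.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Weight count of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def conv_macs(c_in, c_out, k, h, w):
    """Multiply-accumulates for a stride-1, same-padding k x k convolution."""
    return h * w * k * k * c_in * c_out


def pconv_channels(channels, ratio=0.25):
    """Number of channels that are actually convolved (assumed ratio)."""
    return max(1, int(channels * ratio))


def pconv_params(channels, k, ratio=0.25):
    """Weights of a partial convolution: a k x k conv on the first c_p channels."""
    c_p = pconv_channels(channels, ratio)
    return conv_params(c_p, c_p, k)


def pconv_macs(channels, k, h, w, ratio=0.25):
    """Multiply-accumulates of the same partial convolution."""
    c_p = pconv_channels(channels, ratio)
    return conv_macs(c_p, c_p, k, h, w)


if __name__ == "__main__":
    full = conv_params(64, 64, 3)
    part = pconv_params(64, 3)
    print(full, part, full / part)
```

The whole-network savings reported in the abstract are much smaller than this per-layer ratio because, as the abstract states, only part of the convolution is replaced.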