Title: Research on the application of YOLOv5 in station interlocking test
Authors: Hao Cheng, T. He, Rui Tian
DOI: https://doi.org/10.1117/12.3003775
Published: International Conference on Image Processing and Intelligent Control, 2023-08-09
Abstract: With the rapid development of high-speed railways, ensuring the safety of railway operation is paramount, and the computer interlocking system is the key equipment guaranteeing operational safety within the station. It is a real-time system with stringent safety and reliability requirements, and it must undergo comprehensive and rigorous testing before being put into service; interlocking testing is therefore essential to verify that every function of the system behaves as specified. Drawing on recent advances in deep learning and image processing, this paper studies the result-decision module of an automatic interlocking test system with the aim of further improving test efficiency. The YOLOv5 object detection algorithm is adopted to locate and recognize the signal, switch, and section icons on the interlocking interface.
Title: Non-interactive GrabCut image segmentation method
Authors: Hanning Wang, Jiang Wang, Chuangzhan Zeng, Chen Wang
DOI: https://doi.org/10.1117/12.3000781
Abstract: The GrabCut image segmentation algorithm, which is grounded in graph theory, has been used extensively in computer vision. Its shortcoming is that it relies on human-computer interaction to select the ROI before the foreground can be segmented, so it cannot meet the requirements of fully automated image processing. To eliminate this interaction, this paper proposes a method that automatically generates and fine-tunes the ROI, turning GrabCut into an intelligent, non-interactive segmentation method. Experimental results show that the method handles both single-target and multi-target foreground segmentation.
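The paper does not spell out how the automatic ROI is generated and fine-tuned, but the fine-tuning step can be pictured as padding a coarse candidate rectangle and clamping it to the image bounds, so that the rectangle handed to GrabCut safely encloses the foreground. A minimal sketch (the `fine_tune_roi` helper and its margin parameter are illustrative assumptions, not the authors' code):

```python
def fine_tune_roi(roi, margin, img_w, img_h):
    """Expand a coarse ROI rectangle (x, y, w, h) by a relative margin
    and clamp it to the image bounds, so GrabCut receives a rectangle
    that safely encloses the candidate foreground."""
    x, y, w, h = roi
    dx, dy = int(w * margin), int(h * margin)
    x0 = max(0, x - dx)
    y0 = max(0, y - dy)
    x1 = min(img_w, x + w + dx)
    y1 = min(img_h, y + h + dy)
    return (x0, y0, x1 - x0, y1 - y0)

# A coarse 100x80 ROI at (50, 40) in a 640x480 image, padded by 10%:
print(fine_tune_roi((50, 40, 100, 80), 0.10, 640, 480))  # (40, 32, 120, 96)
```

The clamping matters near image borders, where a naive expansion would push the rectangle outside the frame.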
Title: A method of extracting outer eye corners of terra cotta warriors based on point cloud data
Authors: Yue Cheng, Xianglei Liu
DOI: https://doi.org/10.1117/12.3000790
Abstract: The Terracotta Warriors and Horses of the First Qin Emperor are among the most highly prized relics in China and the world. Some believe the warriors are realistic representations of Qin-dynasty individuals, while others regard them as artistic re-creations. To test the warriors' "realism", this study quantifies the resemblance between the warriors and real people. Representative points at the outer corners of the eyes are selected on the warriors' heads and faces, and a feature-point extraction method based on approximated cross-sectional lines is proposed for point cloud data; converting the 3D problem into 2D makes the computation more accurate and easier to carry out. The experimental results show that the curvature at the warriors' outer eye corners essentially matches the curvature pattern of real human eyes, verifying that the warriors' facial features are highly correlated with those of real people and supporting their "realistic" nature.
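The curvature comparison rests on estimating discrete curvature along the extracted 2D cross-sectional lines. One standard estimator, shown here as an illustrative sketch rather than the authors' exact procedure, takes three consecutive points and returns the reciprocal of the radius of their circumscribed circle:

```python
import math

def curvature_3pt(p1, p2, p3):
    """Discrete curvature at p2 from three consecutive 2D cross-section
    points: k = 4 * triangle_area / (a * b * c), i.e. the reciprocal of
    the circumscribed circle's radius. Collinear points give k = 0."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a = math.dist(p1, p2)
    b = math.dist(p2, p3)
    c = math.dist(p1, p3)
    area = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0
    if a * b * c == 0:
        return 0.0
    return 4.0 * area / (a * b * c)

# Three points sampled from a circle of radius 2 -> curvature 1/2:
pts = [(2 * math.cos(t), 2 * math.sin(t)) for t in (0.0, 0.5, 1.0)]
print(round(curvature_3pt(*pts), 6))  # 0.5
```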
Title: Vehicle pedestrian detection algorithm at ferry entrance based on improved YOLOX
Authors: Yushan Liu, Xinyi Yang, Weikang Liu, Qinghua Liu, Mengdi Zhao
DOI: https://doi.org/10.1117/12.3001323
Abstract: To address the heavy traffic, complex environment, and sluggish detection speed at ferry entrances, this study introduces several enhancements to the YOLOX-S object detection network. In YOLOX-S's backbone feature extraction network, CSPDarknet, the conventional residual blocks, which carry a large number of parameters and demand capable hardware, are replaced with MBConv modules in the deep layers and Fused-MBConv modules in the shallow layers. The enhanced model reaches an mAP of 83.39%, 2.7 percentage points above the baseline. The experimental findings demonstrate that the improved method is suitable for real-time detection of moving objects, such as vehicles and pedestrians, in the vicinity of the ferry entrance.
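The mAP figure quoted above is evaluated at an IoU threshold: a detection counts as correct when its intersection-over-union with a ground-truth box is at least 0.5 (for mAP@0.5). A minimal IoU computation, with boxes in `(x1, y1, x2, y2)` pixel coordinates and illustrative example values:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).
    At mAP@0.5 a prediction is a true positive when IoU >= 0.5."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A prediction shifted 10 px from a 100x100 ground-truth box:
print(round(iou((0, 0, 100, 100), (10, 10, 110, 110)), 4))  # 0.6807
```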
Title: A simple and efficient deep learning-based framework for vegetable recognition
Authors: Xian Gong
DOI: https://doi.org/10.1117/12.3000777
Abstract: Since the beginning of the 21st century, artificial intelligence has advanced continuously across many fields, agriculture in particular. Vegetables, a critical component of agriculture and of human diets, have always been a focal point of cultivation, production, and sales. Whereas traditional vegetable classification requires professional knowledge and experience, AI can automate the task with computer vision. This study presents a deep learning-based vegetable recognition system that automates the identification and classification of vegetables. The system is built on a convolutional neural network (CNN) with the standard architecture of convolutional, pooling, and fully connected layers. Compared with other vegetable recognition systems on the market, it processes and classifies vegetable images with a simpler architecture while markedly improving recognition accuracy and compatibility. The work comprises data collection, image preprocessing, model training, and model testing. Experimental results demonstrate that the system identifies and classifies a variety of vegetables quickly and accurately, with an average accuracy above 95% on the test set, showing high practical value.
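The convolutional and pooling layers in such a stack shrink feature maps according to the standard size formula floor((size − kernel + 2·padding) / stride) + 1. A small sketch tracing a hypothetical three-stage stack (the 224×224 input size and layer counts are assumptions for illustration, not the paper's configuration):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer:
    floor((size - kernel + 2*padding) / stride) + 1."""
    return (size - kernel + 2 * padding) // stride + 1

# A hypothetical stack for 224x224 vegetable images:
# each stage is a 3x3 conv (pad 1, size-preserving) then a 2x2 max-pool.
s = 224
for _ in range(3):            # three conv + pool stages
    s = conv_out(s, 3, 1, 1)  # 3x3 conv, stride 1, pad 1 -> size unchanged
    s = conv_out(s, 2, 2, 0)  # 2x2 max-pool, stride 2   -> size halved
print(s)  # 28
```

The final 28×28 map would then be flattened and passed to the fully connected classifier layers.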
Title: A control system for fine farming of apple trees
Authors: Xuehua Liu, Haojie Liu, Siyuan Yu, Zhenpeng Zhong
DOI: https://doi.org/10.1117/12.3001228
Abstract: Apples are susceptible to diseases during growth that reduce yields and cause economic losses. The common apple leaf diseases are spotted leaf drop, brown spot, grey spot, mosaic, and rust. This paper designs a control system for the fine farming of apple trees that targets these five diseases. The system uses a convolutional neural network to build a CNN model for identifying diseases on apple leaves: the dataset is first processed with a pretrained Xception model, and the processed data is fed into the built model. Experiments show that disease recognition with this model is highly accurate and that fine farming of apple trees can be achieved through the control system.
Title: Research on UI design and optimization of digital media with artifical intelligence
Authors: Meng Xia
DOI: https://doi.org/10.1117/12.3000795
Abstract: With the arrival of the Internet era, smartphones, tablets, and other terminal devices have become widespread. On this basis, mobile applications built on the Android platform have emerged and developed rapidly, spanning two categories: multimedia UI design and embedded HTML5 webpage creation. To meet users' demand for information products, the two must be combined effectively to deliver efficient information processing. China has begun to develop intelligent operating systems, represented by Android, in which multimedia UI design is an important component. Because this work is mainly carried out by software development teams, developers need a high level of professional skill, and the resulting programs must remain flexible and compatible. In addition, integrating computer technology more deeply into multimedia UI design requires preparing the corresponding data so that the design provides users with good visual effects and a satisfying interactive experience.
Title: YOLO-H: a lightweight object detection framework for helmet wearing detection
Authors: Jian Pan, Z. Li, Yi Wei, Cong Huang, Dong Liang, Tong Lu, Zhibin Chen, Yin Nong, Binkai Zhou, Weiwei Liu
DOI: https://doi.org/10.1117/12.3000832
Abstract: In construction, coal mining, tobacco manufacturing, and other industries, wearing a helmet is a crucial safety measure for workers, and monitoring helmet use plays a significant role in maintaining production safety. Manual monitoring, however, demands substantial human, material, and financial resources, and is inefficient and error-prone. We therefore propose YOLO-H, a lightweight real-time deep learning-based detection framework for automatic helmet-wearing detection. YOLO-H builds on YOLOv5-n by introducing state-of-the-art techniques such as re-parameterization, a decoupled head, an improved label assignment strategy, and a refined loss function, making it both more efficient and more effective. On a private dataset, the framework achieved 94.5% mAP@0.5 and 65.2% mAP@0.5:0.95 at 82 FPS (frames per second), surpassing YOLOv5 by a large margin; compared with other methods, it also showed superior speed and accuracy. More importantly, the framework can be applied to other object detection scenarios.
Title: Improved SRGAN model
Authors: Cong Zhu, Fei Wang, Sheng Liang, Keke Liu
DOI: https://doi.org/10.1117/12.3000809
Abstract: Image super-resolution reconstruction is an ill-posed problem: a single low-resolution image can correspond to many high-resolution images. Models such as SRCNN and SRDenseNet train with a mean squared error (MSE) loss and therefore produce blurry outputs that average over multiple plausible high-quality images, whereas a GAN can reconstruct a more realistic distribution of high-quality images. This paper modifies the SRGAN model by using an L1 loss for the discriminator, which yields more stable training, and by computing the perceptual loss from VGG16 features instead of VGG19, which produces better results. The content loss is a weighted sum of the VGG loss and the MSE loss, striking a better balance between PSNR and human perception.
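The weighted content loss described above can be sketched in a few lines. In the real model the perceptual term is an MSE over VGG16 feature maps; here plain Python lists stand in for pixels and features, and the weights are illustrative rather than the paper's values:

```python
def mse(a, b):
    """Element-wise mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def content_loss(sr_pixels, hr_pixels, sr_feats, hr_feats,
                 w_vgg=0.006, w_mse=1.0):
    """Weighted content loss in the spirit of the paper's modification:
    a pixel-space MSE term plus a perceptual term computed as MSE between
    feature vectors (extracted by VGG16 in the actual model). The weights
    are illustrative assumptions, not the paper's values."""
    return w_mse * mse(sr_pixels, hr_pixels) + w_vgg * mse(sr_feats, hr_feats)

# Toy 4-pixel images and 2-dimensional "feature" vectors:
print(round(content_loss([0.0, 0.5, 1.0, 0.2], [0.0, 0.6, 0.9, 0.2],
                         [1.0, 2.0], [1.5, 2.5]), 6))  # 0.0065
```

Raising `w_vgg` pushes the optimum toward perceptually sharper textures at some cost in PSNR, which is the trade-off the paper tunes.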
Title: Analysis of binocular visual perception technology of underwater robot
Authors: Zhuang Sheng, Qiang Zhao, Gang Wang, Yingjie Song
DOI: https://doi.org/10.1117/12.3001201
Abstract: Vision technology plays an important role when AUVs (autonomous underwater vehicles) operate underwater. This paper constructs a three-dimensional model of binocular stereo vision to localize targets, then examines the problem of underwater image distortion and introduces a method for correcting distorted images. Based on the physics of underwater imaging, underwater image processing methods are divided into image enhancement and image restoration; the research status of both approaches is analyzed and reviewed, their advantages and disadvantages are summarized and discussed, and future development trends are predicted.
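Once a binocular rig is calibrated and rectified, target positioning reduces to triangulation: depth Z = f·B/d, where f is the focal length in pixels, B the baseline between the two cameras, and d the horizontal disparity of the matched point. A minimal sketch with illustrative rig parameters (not taken from the paper):

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Triangulated depth for a rectified binocular rig: Z = f * B / d,
    with f in pixels, B in metres, and d (horizontal disparity) in pixels.
    Larger disparity means a closer target."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a visible point")
    return f_px * baseline_m / disparity_px

# Example rig (values are illustrative): f = 700 px, B = 0.12 m, d = 35 px
print(depth_from_disparity(700, 0.12, 35))  # 2.4 (metres)
```

The formula also shows why underwater distortion correction matters: refraction-induced errors in f and d propagate directly into the depth estimate.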