International Conference on Image Processing and Intelligent Control: Latest Publications
Research on the application of YOLOv5 in station interlocking test
Pub Date : 2023-08-09 DOI: 10.1117/12.3003775
Hao Cheng, T. He, Rui Tian
With the rapid development of high-speed railway, ensuring the safety of railway operation is essential, and the computer interlocking system is the key equipment that guarantees operational safety within the station. It is a real-time system with stringent safety and reliability requirements, which must undergo comprehensive and rigorous testing before being put into service. Because the interlocking system must correctly perform every one of its functions, interlocking testing is critically important. Building on recent advances in deep learning and image processing, and in order to further improve the test efficiency of the computer interlocking system, this paper studies the result-decision module of automatic interlocking testing. The object detection algorithm YOLOv5 is adopted to locate and recognize the signal, switch, and section icons on the interlocking interface.
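The result-decision step the abstract describes can be sketched as follows: compare the icons a YOLOv5-style detector reports against the states the interlocking test case expects. This is a hypothetical illustration, not the paper's implementation; the detection format (class name, bounding box, confidence) follows common YOLO output conventions, and the expected-state table and thresholds are assumptions.

```python
# Hypothetical sketch of a result-decision module: match detected interface
# icons against the states expected by the interlocking test case.
def decide_result(detections, expected, iou_thresh=0.5, conf_thresh=0.25):
    """Return (passed, mismatches) for one screenshot of the interlocking UI.

    detections: list of {"cls": str, "box": (x1, y1, x2, y2), "conf": float}
    expected:   list of (name, box, expected_class) from the test case
    """
    def iou(a, b):
        ax1, ay1, ax2, ay2 = a
        bx1, by1, bx2, by2 = b
        ix1, iy1 = max(ax1, bx1), max(ay1, by1)
        ix2, iy2 = min(ax2, bx2), min(ay2, by2)
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
        return inter / union if union else 0.0

    mismatches = []
    for name, exp_box, exp_state in expected:
        # find a confident detection overlapping the expected icon location
        match = next((d for d in detections
                      if d["conf"] >= conf_thresh
                      and iou(d["box"], exp_box) >= iou_thresh), None)
        if match is None or match["cls"] != exp_state:
            mismatches.append(name)
    return (not mismatches), mismatches
```

A test case would then list each signal/switch/section icon with its expected state, and any mismatch marks the interlocking test step as failed.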
Citations: 0
Non-interactive GrabCut image segmentation method
Pub Date : 2023-08-09 DOI: 10.1117/12.3000781
Hanning Wang, Jiang Wang, Chuangzhan Zeng, Chen Wang
The GrabCut image segmentation algorithm, based on graph theory, has been used extensively in computer vision. Its shortcoming, however, is that it requires human-computer interaction to select the ROI before the foreground can be segmented, so it cannot meet the requirements of fully automatic image processing. To eliminate this interaction and realize smart region selection, this paper proposes an ROI smart-region generating and fine-tuning method that improves the GrabCut method and achieves intelligent image segmentation. The experimental results show that our method handles both single-target and multi-target foreground image segmentation.
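A minimal stand-in for the automatic ROI seeding idea (the paper's own generating and fine-tuning method is not reproduced here): estimate a foreground rectangle from a grayscale image by thresholding, then pad it by a margin. The resulting rectangle is what an interactive user would normally hand to GrabCut (e.g. `cv2.grabCut` with `cv2.GC_INIT_WITH_RECT`); the threshold and margin values below are assumptions.

```python
import numpy as np

# Illustrative automatic ROI estimation: threshold the image, take the
# bounding box of the bright pixels, and pad it so GrabCut can refine it.
def auto_roi(gray, thresh=None, margin=0.05):
    """gray: 2D uint8 array. Returns (x, y, w, h) enclosing bright pixels."""
    g = np.asarray(gray, dtype=np.float32)
    t = g.mean() if thresh is None else thresh      # crude global threshold
    ys, xs = np.nonzero(g > t)
    if xs.size == 0:                                # fall back to full frame
        return 0, 0, g.shape[1], g.shape[0]
    x1, x2 = xs.min(), xs.max() + 1
    y1, y2 = ys.min(), ys.max() + 1
    mx = int((x2 - x1) * margin)                    # pad for GrabCut slack
    my = int((y2 - y1) * margin)
    x1, y1 = max(0, x1 - mx), max(0, y1 - my)
    x2 = min(g.shape[1], x2 + mx)
    y2 = min(g.shape[0], y2 + my)
    return int(x1), int(y1), int(x2 - x1), int(y2 - y1)
```

In a full pipeline the rectangle would be refined iteratively, which is where the paper's fine-tuning step comes in.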
Citations: 0
A method of extracting outer eye corners of terra cotta warriors based on point cloud data
Pub Date : 2023-08-09 DOI: 10.1117/12.3000790
Yue Cheng, Xianglei Liu
The Terracotta Warriors and Horses of the First Qin Emperor are highly prized relics in China and around the world. Some believe the Terracotta Warriors are realistic representations of Qin-dynasty people, while others hold that they are the result of artistic re-creation. To verify the "realism" of the terracotta warriors, the degree of resemblance between the warriors and real people was quantified. Representative points at the outer corners of the eyes on the heads and faces of the warriors are selected, and a feature-point extraction method based on approximated cross-sectional lines through the point cloud data is proposed. By converting the 3D problem into 2D, the method becomes more accurate and easier to compute. The final experimental results show that the curvature at the outer eye-corner points of the terracotta warriors essentially matches the curvature pattern of real eyes. The method verifies that the facial features of the terracotta warriors are highly correlated with those of real people and, at the same time, demonstrates the "realistic" nature of the terracotta warriors.
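One common way to estimate curvature along a 2D profile (an illustration of the kind of quantity compared here, not the paper's exact formulation): once a 3D cross-section through the eye corner is projected to 2D, the curvature at each interior point of the profile polyline equals the reciprocal of the circumradius of three consecutive points, k = 4·area / (|AB|·|BC|·|CA|).

```python
import numpy as np

# Discrete curvature of a 2D polyline from the circumscribed circle of each
# triple of consecutive points: k = 4 * triangle_area / (|AB|*|BC|*|CA|).
def polyline_curvature(pts):
    """pts: (n, 2) array of 2D profile points. Returns (n-2,) curvatures."""
    p = np.asarray(pts, dtype=np.float64)
    a, b, c = p[:-2], p[1:-1], p[2:]
    ab, bc, ca = b - a, c - b, a - c
    # 2D cross product gives twice the signed triangle area
    cross = ab[:, 0] * bc[:, 1] - ab[:, 1] * bc[:, 0]
    area = np.abs(cross) / 2.0
    lens = (np.linalg.norm(ab, axis=1) * np.linalg.norm(bc, axis=1)
            * np.linalg.norm(ca, axis=1))
    return np.where(lens > 0, 4.0 * area / np.maximum(lens, 1e-12), 0.0)
```

Points sampled from a circle of radius r give curvature 1/r, and collinear points give 0, which makes the estimator easy to sanity-check before comparing warrior and human profiles.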
Citations: 0
Vehicle pedestrian detection algorithm at ferry entrance based on improved YOLOX
Pub Date : 2023-08-09 DOI: 10.1117/12.3001323
Yushan Liu, Xinyi Yang, Weikang Liu, Qinghua Liu, Mengdi Zhao
This study introduces a number of enhancements to the YOLOX-S object detection network to address heavy traffic at the ferry, the complex traffic environment, and sluggish detection speed. In YOLOX-S's backbone feature extraction network, CSPDarknet, the conventional residual block, which has a large number of parameters and high hardware requirements, is replaced by the MBConv module in the deep layers and by the Fuse-MBConv module in the shallow layers. The enhanced model's mAP reaches 83.39%, 2.7% more than the baseline method. The experimental findings demonstrate that the enhanced method is suitable for real-time detection of moving objects, such as vehicles and pedestrians, near the ferry entrance.
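A back-of-the-envelope illustration of why MBConv-style blocks shrink the backbone: an MBConv block replaces dense 3×3 convolutions with a 1×1 expansion, a 3×3 depthwise convolution, and a 1×1 projection. The channel widths and expansion ratio below are illustrative, not YOLOX-S's actual configuration; biases and batch-norm parameters are omitted.

```python
# Parameter-count comparison (biases/BN omitted): a conventional two-conv
# residual block vs. an MBConv-style inverted-bottleneck block.
def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k

def residual_block_params(c):
    # two dense 3x3 convolutions, as in a conventional residual block
    return 2 * conv_params(c, c, 3)

def mbconv_params(c, expand=4):
    mid = c * expand
    return (conv_params(c, mid, 1)      # 1x1 expansion
            + mid * 3 * 3               # 3x3 depthwise: one filter per channel
            + conv_params(mid, c, 1))   # 1x1 projection

c = 256
print(residual_block_params(c), mbconv_params(c))
```

At 256 channels the dense residual block costs roughly 1.18M weights against about 0.53M for the MBConv block, which is the kind of saving that motivates the swap in the deep layers.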
Citations: 0
A simple and efficient deep learning-based framework for vegetable recognition
Pub Date : 2023-08-09 DOI: 10.1117/12.3000777
Xian Gong
Since the beginning of the 21st century, artificial intelligence has been evolving continuously across many fields, particularly agriculture. Vegetables, as a critical component of agriculture and the human diet, have always been a focal point of cultivation, production, and sales. Compared with traditional vegetable classification, which requires professional knowledge and experience, AI technology uses computer vision to achieve automated classification. This study presents a deep learning-based vegetable recognition system aimed at automating the identification and classification of vegetables. The system uses a convolutional neural network (CNN) as its fundamental algorithm, following the traditional CNN architecture of convolutional layers, pooling layers, and fully connected layers. Compared with other vegetable recognition systems on the market, this system uses a simpler architecture for processing and classifying vegetable images, significantly improving recognition accuracy and compatibility. The research comprises data collection, image preprocessing, model training, and model testing. Experimental results demonstrate that the system can rapidly and accurately identify and classify various vegetables, with an average accuracy exceeding 95% on the test dataset, showing high practical value.
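The conv/pool/fully-connected stack the abstract describes can be traced with simple shape arithmetic. The layer sizes below (128×128 RGB input, three conv+pool stages) are assumptions for illustration, not the paper's actual architecture.

```python
# Shape bookkeeping for a small conv/pool/fc stack; layer sizes are
# illustrative assumptions, not the paper's actual configuration.
def conv2d_shape(h, w, k, stride=1, pad=0):
    return ((h + 2 * pad - k) // stride + 1,
            (w + 2 * pad - k) // stride + 1)

def pool2d_shape(h, w, k=2, stride=2):
    return ((h - k) // stride + 1, (w - k) // stride + 1)

h, w, c = 128, 128, 3          # input RGB image
for filters in (32, 64, 128):  # three conv+pool stages
    h, w = conv2d_shape(h, w, k=3, pad=1)   # 'same' 3x3 convolution
    h, w = pool2d_shape(h, w)               # 2x2 max pooling halves each side
    c = filters
flattened = h * w * c          # size of the fully connected layer's input
print(h, w, c, flattened)
```

Each pooling stage halves the spatial size, so 128×128 shrinks to 16×16 before flattening, which keeps the fully connected classifier head small.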
Citations: 0
A control system for fine farming of apple trees
Pub Date : 2023-08-09 DOI: 10.1117/12.3001228
Xuehua Liu, Haojie Liu, Siyuan Yu, Zhenpeng Zhong
Apples are susceptible to diseases during growth that reduce yields and cause economic losses. The common diseases of apple leaves are mainly spotted leaf drop, brown spot, grey spot, tobacco leaf blossom, and rust. This paper designs a control system for the fine farming of apple trees to address these five diseases affecting apple tree growth. The system uses a convolutional neural network to build a CNN model for identifying apple-leaf diseases. The dataset is first processed with a pre-processing model (Xception), and the processed data is loaded into the built model. The experiments show that the disease-recognition accuracy of this model is high and that fine farming of apple trees can be achieved through the control system.
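As a sketch of the preprocessing step the paper attributes to Xception: Keras' Xception `preprocess_input` maps uint8 pixels to [-1, 1] via x / 127.5 - 1. The snippet reproduces that scaling in plain NumPy; the image size and dataset layout are not specified by the abstract and are assumptions here.

```python
import numpy as np

# Xception-style input scaling: uint8 pixel values -> float32 in [-1, 1].
def preprocess(img_uint8):
    """img_uint8: (H, W, 3) uint8 leaf image. Returns float32 in [-1, 1]."""
    x = np.asarray(img_uint8, dtype=np.float32)
    return x / 127.5 - 1.0
```

The scaled arrays would then be batched and fed to the CNN for training and inference.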
Citations: 0
Research on UI design and optimization of digital media with artificial intelligence
Pub Date : 2023-08-09 DOI: 10.1117/12.3000795
Meng Xia
With the arrival of the Internet era, smartphones, tablets, and other terminal devices have become widespread. On this basis, mobile applications based on the Android platform have emerged and developed rapidly, spanning two categories: multimedia UI design and embedded HTML5 webpage creation. To meet people's demand for information products, the two need to be combined effectively to achieve efficient information processing. At present, China has begun to vigorously develop intelligent operating systems represented by the Android system, and multimedia UI design is one of their important components. The work is mainly done by software development teams, so developers are required to have a high level of professional skill while ensuring the flexibility and compatibility of the designed program. In addition, to integrate computer technology better into multimedia UI design, the corresponding data must be prepared so that users receive good visual effects and a good interactive experience.
Citations: 1
YOLO-H: a lightweight object detection framework for helmet wearing detection
Pub Date : 2023-08-09 DOI: 10.1117/12.3000832
Jian Pan, Z. Li, Yi Wei, Cong Huang, Dong Liang, Tong Lu, Zhibin Chen, Yin Nong, Binkai Zhou, Weiwei Liu
In construction, coal mining, tobacco manufacturing, and other industries, wearing a helmet is a crucial safety measure for workers, and monitoring helmet wearing plays a significant role in maintaining production safety. However, manual monitoring demands substantial human, material, and financial resources, suffers from low efficiency, and is error-prone. We therefore propose a lightweight real-time deep learning-based detection framework called YOLO-H for automatic helmet-wearing detection. The YOLO-H model was developed on the foundation of YOLOv5-n by introducing state-of-the-art techniques such as re-parameterization, a decoupled head, a label-assignment strategy, and an improved loss function. On a private dataset, the proposed framework achieved 94.5% mAP@0.5 and 65.2% mAP@0.5:0.95 at 82 FPS (frames per second), surpassing YOLOv5 by a large margin. Compared to other methods, our framework also showed overwhelming performance in terms of speed and accuracy. More importantly, the developed framework can be applied to other object detection scenarios.
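The mAP@0.5 figure quoted above reduces, per class, to an average-precision computation once each detection has been matched to ground truth at IoU ≥ 0.5. A minimal sketch (the matching step itself is omitted, and the interpolation follows the common all-point scheme, which may differ from the evaluator the authors used):

```python
# Single-class AP from detections sorted by descending confidence:
# tp_flags[i] is True if detection i matched a ground-truth box (IoU >= 0.5).
def average_precision(tp_flags, num_gt):
    precisions, recalls = [], []
    tp = fp = 0
    for is_tp in tp_flags:
        tp += 1 if is_tp else 0
        fp += 0 if is_tp else 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # all-point interpolation: precision at recall r is the best precision
    # achieved at any recall >= r
    ap, prev_recall = 0.0, 0.0
    for i, r in enumerate(recalls):
        p = max(precisions[i:])
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```

mAP@0.5 is then the mean of these per-class APs; mAP@0.5:0.95 repeats the computation at IoU thresholds from 0.5 to 0.95 in steps of 0.05 and averages the results.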
Citations: 0
Improved SRGAN model
Pub Date : 2023-08-09 DOI: 10.1117/12.3000809
Cong Zhu, Fei Wang, Sheng Liang, Keke Liu
Image super-resolution reconstruction is an ill-posed problem, as a low-resolution image can correspond to multiple high-resolution images. The models SRCNN and SRDenseNet produce high-resolution images using the mean square error (MSE) loss function, which results in blurry images that are the average of multiple high-quality images. However, the GAN model is capable of reconstructing a more realistic distribution of high-quality images. In this paper, we propose modifications to the SRGAN model by utilizing L1 norm loss for the discriminator's loss function, resulting in a more stable model. We also use VGG16 features for perceptual loss instead of VGG19, which produces better results. The content loss is calculated by weighting both the VGG loss and MSE loss, achieving a better balance between PSNR and human perception.
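The weighted content loss and the L1 discriminator objective described above can be sketched in NumPy. Here `phi` stands in for a VGG16 feature extractor, and the weights are illustrative, not the paper's tuned values.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def content_loss(sr, hr, phi, w_vgg=0.006, w_pix=1.0):
    """Weighted sum of VGG-feature MSE and pixel-space MSE.

    sr, hr: super-resolved and ground-truth images (arrays)
    phi:    feature extractor standing in for VGG16 (a callable)
    """
    return w_vgg * mse(phi(sr), phi(hr)) + w_pix * mse(sr, hr)

def discriminator_l1_loss(d_real, d_fake):
    """L1-style discriminator objective: push D(real)->1, D(fake)->0."""
    return float(np.mean(np.abs(1.0 - d_real)) + np.mean(np.abs(d_fake)))
```

Replacing the usual log-based adversarial loss with an L1 penalty bounds the gradient magnitude, which is one plausible reading of the stability claim.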
Citations: 0
Analysis of binocular visual perception technology of underwater robot
Pub Date : 2023-08-09 DOI: 10.1117/12.3001201
Zhuang Sheng, Qiang Zhao, Gang Wang, Yingjie Song
Vision technology plays an important role when AUVs (autonomous underwater vehicles) operate underwater. In this paper, a three-dimensional model of binocular stereo vision is constructed to locate the target. The paper then explains the problem of distorted underwater images and introduces a method to correct the distortion. Based on underwater physical imaging, underwater image processing methods are divided into underwater image enhancement and underwater image restoration, and the research status of the two approaches is analyzed and reviewed. The advantages and disadvantages of these methods are summarized and discussed, and future development trends are predicted.
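The binocular positioning the abstract mentions rests on standard rectified-stereo triangulation: with focal length f (pixels), baseline B (metres), and disparity d = x_left - x_right (pixels), depth is Z = f·B / d. A minimal sketch (the camera parameters in the test are illustrative, and real underwater use would first require the distortion correction the paper discusses):

```python
# Rectified-stereo triangulation: recover a 3D point in the left-camera frame
# from one matched pixel pair.
def triangulate(x_left, x_right, y, f, baseline, cx=0.0, cy=0.0):
    """Return (X, Y, Z) for a match; f in pixels, baseline in metres."""
    d = x_left - x_right                 # disparity
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front")
    Z = f * baseline / d                 # depth from disparity
    X = (x_left - cx) * Z / f            # back-project through the pinhole
    Y = (y - cy) * Z / f
    return X, Y, Z
```

Depth resolution degrades as disparity shrinks, which is why baseline and focal length are key design parameters for an AUV stereo rig.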
Citations: 0