Proceedings of the 2023 Asia Conference on Computer Vision, Image Processing and Pattern Recognition: Latest Papers

Skeleton-based Generative Adversarial Networks for Font Shape Style Transfer: Learning text style from some characters and transferring the style to any unseen characters
Thanaphon Thanusan, K. Patanukhom
This paper presents a new font shape style transfer technique that employs a generative adversarial network (GAN) and skeleton-based input feature maps to modify a target text to match a target font shape while retaining the original text content. Our GAN model is modified from a Shape-Matching GAN, which uses a StyleNet generator and a PatchGAN discriminator. Rather than feeding base-font character images to the generator, as other existing font transfer models do, we use the proposed skeleton-based features as input. The experimental results show that our model produces unseen characters in the desired font style better than an existing method.
DOI: https://doi.org/10.1145/3596286.3596288 (published 2023-04-28)
Citations: 0
Hierarchical Attention based Feature Learning for Interpretable Social Event Prediction
Yinsen Wang, Xin Zhang, Yan Pan, Zexin Fu
Major social events, e.g., civil unrest, generally affect both social stability and civil life. Anticipating the occurrence of such events in advance is therefore of great significance to decision makers. By mining earlier indicators of the event type of interest from open-source data, we can infer whether a particular event of that type will occur at some point in the future. In recent years, data-driven approaches of this kind have been proposed to predict social events. However, some challenges remain to be addressed: (I) modeling the preceding features of a particular event from limited, obtainable data sources; (II) mining temporal dependences between complicated information in different periods; (III) explaining prediction results from a reasonable perspective. To address these issues, we propose a hierarchical attention-based feature learning framework for interpretable social event prediction. We model the evolution processes prior to the onset of an event of interest using a sequence of temporal event graphs. We then employ a graph neural network (GNN) for graph mining and an attention mechanism on multi-level data for feature learning. For model explanation, an importance evaluation indicator is proposed to identify, at distinct feature levels, the past factors that most influenced the occurrence of the event. We conduct experiments on four real-world datasets to verify the proposed method. The results indicate that it outperforms other baseline models on protest prediction tasks.
DOI: https://doi.org/10.1145/3596286.3596298 (published 2023-04-28)
Citations: 0
Comparative Analysis of Deep Learning Models for Predicting Online Review Helpfulness
Sirinda Palahan
The exponential growth of online customer reviews has created challenges for potential buyers to filter and identify helpful reviews, directly affecting their shopping experience. Accurate prediction of review helpfulness can improve the selection and presentation of valuable reviews, leading to a better user experience and more informed purchasing decisions. To address the limitations of traditional machine learning methods that rely on handcrafted features and fail to capture semantic context, this paper presents a comparative analysis of existing deep learning models to predict the helpfulness of online reviews. Our study employs larger and more diverse datasets from three popular e-commerce platforms: TripAdvisor, Amazon, and Yelp, and compares multiple deep learning models, including Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and DistilBert, to identify the most accurate and effective predictions. Additionally, the study compares the deep learning models to the traditional machine learning algorithm XGBoost. Understanding the benefits and limitations of each model can lead to improved model selection and optimization, resulting in more accurate and efficient predictions for a wide range of applications. The results show that CNN consistently outperforms the other deep learning models and XGBoost regarding Mean Squared Error (MSE) and training time across all datasets.
DOI: https://doi.org/10.1145/3596286.3596300 (published 2023-04-28)
Citations: 0
Deep Object Detection for Complex Architectural Floor Plans with Efficient Receptive Fields
Zhongguo Xu, N. Jha, Syed Mehadi, M. Mandal
Architectural floor plans play an important role in sharing building information among engineers, designers, and clients. Automatic floor plan analysis can help improve work efficiency and accuracy. Object detection and recognition are critical to understanding and analyzing a floor plan document. However, little research has been conducted to date on automatic object detection in architectural floor plans. In this paper, a convolutional neural network named ArchNet is proposed to detect various visual objects, such as doors, windows, and stairs. ArchNet is a modified version of the YOLO network and consists of five modules: backbone, multiscale receptive fields, neck, head, and non-maximum suppression. ArchNet is used here to detect 13 object classes commonly found in architectural floor plans. Experimental results show that the proposed architecture achieves a mean average precision of 75%, which is superior to state-of-the-art techniques.
DOI: https://doi.org/10.1145/3596286.3596295 (published 2023-04-28)
Citations: 0
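The last of the five ArchNet modules named in the abstract above is non-maximum suppression. As a rough illustration of that step only, the sketch below shows the generic greedy algorithm in plain Python. It is not the authors' implementation; the function names, the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit candidates from most to least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In a multi-class detector this is typically applied per class, so that a door and a window that overlap do not suppress each other.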
3D Ultrasound Tomography Image Reconstruction Algorithm by GPU
Jiaduo Gong
At present, X-ray imaging, B-mode ultrasound, and magnetic resonance imaging each have shortcomings in detecting female breast cancer, so early detection of breast cancer remains an important challenge. Ultrasound tomography (UT) can address these problems well. This project uses the TVAL3 algorithm to reconstruct the original image from the information collected by the UT system for clinical use. The TVAL3 algorithm involves a large number of matrix-vector multiplications and transposed matrix-vector multiplications, which are time-consuming with traditional CPU methods. To exploit the structure of matrix-vector multiplication, this project uses CUDA to run the computation in parallel on a GPU. To speed up the calculation further, part of the unchanged data is placed on the GPU in advance, which reduces the time spent on transfers. Final speedups of 20x, 10x, and 5x were achieved for matrix-vector multiplication, transposed matrix-vector multiplication, and total time, respectively.
DOI: https://doi.org/10.1145/3596286.3596290 (published 2023-04-28)
Citations: 0
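The two operations that the abstract above singles out, the matrix-vector product and the transposed matrix-vector product, can be written down as a small reference in plain Python. This is only a sketch of the arithmetic, not the author's CUDA code, and the function names are hypothetical. It shows why the work parallelizes well: every output element is an independent dot product.

```python
def matvec(A, x):
    """y = A x. Each y[i] uses only row i of A, so all rows can be computed in parallel."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def matvec_transposed(A, y):
    """z = A^T y, computed without building A^T.

    Each z[j] reads column j of A, so all columns can be computed in parallel.
    """
    n_rows, n_cols = len(A), len(A[0])
    return [sum(A[i][j] * y[i] for i in range(n_rows)) for j in range(n_cols)]
```

On a GPU, one would typically assign each output element (or a tile of them) to a thread or thread block and keep `A` resident in device memory across calls, which matches the abstract's remark about uploading unchanged data in advance.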
Improved Mask R-CNN Network Method for PV Panel Defect Detection
Wangwang Yang, Z. Deng, Enwen Hu, Yao Zhang
With the increasing popularity of photovoltaic power generation, industry demand for photovoltaic panel defect detection is also increasing. Deep learning can automatically extract individual photovoltaic panels from images or videos and perform defect detection on them. To address the low detection accuracy of existing deep learning-based photovoltaic panel defect detection methods, an improved Mask R-CNN photovoltaic panel defect detection algorithm is proposed. First, to improve training performance, the feature pyramid network (FPN) structure is improved, and a cascade network based on attention guidance is adopted to fuse more features and to limit the loss of shallow semantic information. Second, Group Normalization (GN) replaces the Batch Normalization (BN) used in traditional high-performance deep neural network models. The quality of the self-made dataset is improved by Mosaic data augmentation to prevent accuracy loss due to an insufficient number of samples. The effectiveness of the algorithm is verified on the self-made dataset and the public COCO2017 dataset. The improved Mask R-CNN algorithm has a detection accuracy of more than 89% on the self-made photovoltaic panel dataset, and achieves 44.6% bounding box average precision (APbbox) and 41.5% mask average precision (APmask) on the COCO2017 dataset, which are 6.4% and 5.8% higher than the original Mask R-CNN algorithm, respectively. Finally, to analyze the performance of the improved algorithm on photovoltaic panel defect detection tasks comprehensively, the common deep learning-based defect detection algorithms for photovoltaic panels are summarized, and the improved algorithm is compared against them.
DOI: https://doi.org/10.1145/3596286.3596287 (published 2023-04-28)
Citations: 0
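The abstract above replaces Batch Normalization with Group Normalization. The sketch below shows what Group Normalization computes for a single sample, in plain Python. It is a generic illustration rather than the authors' code: the function name and the data layout (a list of channels, each a flat list of values) are assumptions, and the learnable scale and shift parameters are omitted.

```python
import math


def group_norm(x, num_groups, eps=1e-5):
    """Normalize one sample x, given as a list of channels (each a flat list).

    Channels are split into num_groups contiguous groups. Mean and variance are
    computed per group, over every channel and position in that group.
    """
    num_channels = len(x)
    if num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = num_channels // num_groups
    out = []
    for g in range(num_groups):
        group = x[g * size:(g + 1) * size]
        values = [v for channel in group for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * inv_std for v in channel] for channel in group])
    return out
```

Because the statistics come from one sample only, the result does not depend on batch size, which is the usual motivation for preferring GN over BN when detection models are trained with small batches.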
An Infrared Dim-small Target Detection Method Based on Improved YOLOv7
Yujie Zheng, Yuyong Cui, Xinyi Gao
Efficient, accurate detection of dim-small targets is a difficult task in infrared target tracking, because the tiny size of small infrared targets significantly reduces the accuracy of conventional models. To address this issue, this paper improves YOLOv7 so that it can be applied to the detection of infrared dim-small targets. First, an enhanced MPConv-based pooling structure is proposed, which reduces the high false detection rate caused by white point noise. Then, a CBAM attention module is added to the backbone, which employs both spatial and channel attention to preserve more of the original characteristics of faint infrared targets. Finally, the EIOU loss is used in the Head module to speed up model convergence. Experiments show that the improved algorithm achieves an mAP of 70.8% on the dim-small target dataset, a 3.4% improvement over YOLOv7, and outperforms other conventional algorithms.
DOI: https://doi.org/10.1145/3596286.3596289 (published 2023-04-28)
Citations: 0
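The abstract above uses the EIOU loss in the detection head. As commonly defined, that loss adds three penalties to `1 - IoU`: the squared center distance divided by the squared diagonal of the smallest enclosing box, and the squared width and height differences divided by the squared width and height of that enclosing box. The sketch below computes it for one box pair in plain Python. It is an illustration under that standard definition, not the authors' code; the function name, the `(x1, y1, x2, y2)` box format, and the `eps` guard are assumptions.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two boxes in (x1, y1, x2, y2) form."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Penalties: center distance, width difference, height difference.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)
    return 1.0 - iou + center_term + width_term + height_term
```

Unlike plain IoU loss, the value still changes when the boxes do not overlap at all, so the optimizer receives a useful signal for badly placed predictions, which is one reason the loss is associated with faster convergence.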
Self-adaptive Methods with Flexible Detection Criteria of the Battery Cell Dent Defect
Chen De, Yanrui Dong, Zhou Jun Xiong, Wang Hai, Du Yi Xian, Li Shi Peng
For battery cell production lines that use automatic defect detection equipment, it is necessary to control fluctuations in the line's yield rate and to adapt to differences between cell replacements and between the incoming material processes of each batch. The defect detection algorithm therefore needs to adjust its detection criteria according to the target preferential rate range, which is a new challenge for intelligent manufacturing in the battery industry. Taking the dent defect on the side of the battery cell as an example, two self-adaptive methods for adjusting the defect detection criteria are proposed, one based on a traditional algorithm and one based on a deep learning algorithm. The traditional algorithm uses linear interpolation to classify good and bad products based on depth and area information, and calculates the optimal critical values of depth and area that meet the target yield rate. The deep learning algorithm combines a convolutional neural network with support vector machine classifiers, using the histogram of oriented gradients feature as the classifier input, to classify products by degree of defect. The test results show that the detection criteria for side dent defects can be adjusted flexibly and automatically, so that the algorithm adapts itself to the target yield rate, saves manual operation time, and improves production efficiency.
DOI: https://doi.org/10.1145/3596286.3596296 (published 2023-04-28)
Citations: 0
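The abstract above does not spell out how the critical values are computed, so the following is only one plausible reading of "linear interpolation to meet a target yield rate", reduced to a single feature. It assumes a list of measured dent depths and picks the threshold as a linearly interpolated quantile, so that roughly the target fraction of cells falls at or below it. The function names and the one-feature simplification are hypothetical; the paper uses both depth and area.

```python
def critical_value_for_yield(measurements, target_yield):
    """Return a threshold t such that about target_yield of the values are <= t.

    The threshold is linearly interpolated between neighboring sorted values.
    """
    if not measurements:
        raise ValueError("measurements must not be empty")
    if not 0.0 <= target_yield <= 1.0:
        raise ValueError("target_yield must be in [0, 1]")
    values = sorted(measurements)
    position = target_yield * (len(values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(values) - 1)
    fraction = position - lower
    return values[lower] + fraction * (values[upper] - values[lower])


def classify(measurements, threshold):
    """Label a cell 'good' when its dent measurement is within the threshold."""
    return ["good" if m <= threshold else "bad" for m in measurements]
```

Recomputing the threshold for each batch is what would make the criterion self-adaptive: the same target yield maps to a different critical depth when the incoming material changes.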
Analysis of Fuzzy Network Selection Algorithm for Heterogeneous Wireless Mobile Networks
T. Thumthawatworn, K. Nongpong, Pawut Satitsuksanoh
Over the last few years, the pandemic has made work operations mobile and heavily reliant on real-time, traffic-intensive applications such as online classrooms and meetings. Since working life requires seamless mobility and stable wireless connectivity, heterogeneous wireless networks are gaining attention as key infrastructure for meeting communication needs. Intelligent handover decisions are deemed necessary to select the appropriate wireless network, and an intelligent mechanism such as fuzzy logic has been shown to enhance such decision-making. Different membership functions used in a fuzzy inference system lead to different network selection performance. This work evaluates different fuzzy membership functions so that appropriate ones can be used in the design of fuzzy-based network selection mechanisms for heterogeneous wireless networks.
DOI: https://doi.org/10.1145/3596286.3596299 (published 2023-04-28)
Citations: 0
Application of visible light-infrared image fusion technology in power system fault detection
Sichao Chen, Yang Luo, Jianbo Yin, Guohua Zhou, Dilong Shen, Liang Shen
Infrared thermal imaging is widely used in industrial inspection because of advantages such as passive identification, non-contact detection, long detection distance, and strong environmental adaptability. In power systems, infrared thermal imaging can be used for live detection of power equipment to prevent or examine potential risks and threats. This paper presents a fault detection method for power equipment based on visible light-infrared image fusion. Infrared image information is collected with an infrared thermal imager, and the infrared image is preprocessed. The scale-invariant feature transform (SIFT) feature point detection algorithm is used to extract the differences between the visible light image and the infrared image. The feature points are screened and registered with the random sample consensus (RANSAC) algorithm to fuse the visible light and infrared images of the power equipment, so that the working status of the equipment can be monitored and the fault source accurately located when a fault occurs.
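The RANSAC screening step can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' pipeline: it assumes the matched feature points are already available (SIFT detection and matching are not shown), it fits a translation-only model where a real visible/infrared registration would more likely need an affine or projective transform, and the function name, threshold, and point coordinates are all made up for the example.

```python
import random


def ransac_translation(matches, threshold=3.0, iterations=200, seed=0):
    """Estimate a 2D translation from putative point matches using RANSAC.

    matches: list of ((x_vis, y_vis), (x_ir, y_ir)) pairs.
    Returns ((dx, dy), inliers), where inliers are the matches consistent
    with the best translation found.
    """
    if not matches:
        raise ValueError("at least one match is required")
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A translation has two unknowns, so one match is a minimal sample.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Count the matches whose reprojection error is within the threshold.
        inliers = []
        for (px, py), (qx, qy) in matches:
            error_sq = (px + dx - qx) ** 2 + (py + dy - qy) ** 2
            if error_sq <= threshold ** 2:
                inliers.append(((px, py), (qx, qy)))
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set: the least-squares translation is the mean offset.
    n = len(best_inliers)
    dx = sum(q[0] - p[0] for p, q in best_inliers) / n
    dy = sum(q[1] - p[1] for p, q in best_inliers) / n
    return (dx, dy), best_inliers


# Made-up matches: eight correct pairs shifted by (12, -5) plus three mismatches.
visible_points = [(10, 20), (40, 80), (100, 35), (150, 90),
                  (60, 60), (200, 10), (75, 140), (120, 120)]
good_matches = [((x, y), (x + 12, y - 5)) for x, y in visible_points]
bad_matches = [((30, 30), (90, 10)), ((50, 100), (20, 160)), ((180, 70), (100, 75))]

offset, inliers = ransac_translation(good_matches + bad_matches)
print("estimated offset:", offset)
print("inliers kept:", len(inliers), "of", len(good_matches) + len(bad_matches))
```

Once the mismatched pairs are screened out, the estimated transform can be used to align the infrared image to the visible light image before the two are fused.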
DOI: https://doi.org/10.1145/3596286.3596294 (published 2023-04-28)
Citations: 0