
Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition: Latest Papers

Speech Recognition Method based on CTC Multilayer Loss
Deyu Luo, Xianhong Chen, Mao-shen Jia, C. Bao
Because a CTC model relies on a conditional independence assumption, a language model is usually added to improve its speech recognition performance. However, adding a language model increases complexity and computation cost. We therefore propose a simple and effective speech recognition method based on a CTC multilayer loss. Unlike the traditional CTC model, which optimizes only the CTC loss of the last layer, this method guides training with a CTC multilayer loss obtained as a weighted sum of the CTC losses of different layers. Optimizing the losses of different layers takes the information in those layers into account, so the model draws on more comprehensive information and achieves better recognition performance. With a small amount of code modification, the CTC multilayer loss can effectively regulate CTC training and improve speech recognition performance. Because the method changes only the loss function, not the structure of the CTC model or its testing process, training remains simple and testing incurs no extra memory or computation cost. We evaluated the method on the Aishell-1 dataset with WeNet as the baseline; it reduced the character error rate (CER) by 7.5% and improved speech recognition performance without adding a language model.
DOI: https://doi.org/10.1145/3581807.3581864 | Published: 2022-11-17 | Citations: 0
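The weighted summation of per-layer CTC losses described in the abstract above can be illustrated with a minimal sketch. It assumes each selected layer already yields per-frame symbol probabilities; the layer choice, the weights 0.3 and 0.7, and the names `ctc_loss` and `ctc_multilayer_loss` are hypothetical and are not taken from the paper or from WeNet.

```python
import math

def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (forward algorithm).

    probs[t][k] is the probability of symbol k at frame t. Probabilities are
    multiplied directly, which is fine for a toy case but underflows on long
    utterances, where a log-domain version would be needed.
    """
    # Interleave blanks: [blank, y1, blank, y2, ..., blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    alpha = [0.0] * len(ext)
    alpha[0] = probs[0][ext[0]]
    if len(ext) > 1:
        alpha[1] = probs[0][ext[1]]
    for frame in probs[1:]:
        prev = alpha
        alpha = [0.0] * len(ext)
        for s, sym in enumerate(ext):
            total = prev[s]
            if s >= 1:
                total += prev[s - 1]
            # The blank between two labels may be skipped only if they differ.
            if s >= 2 and sym != blank and sym != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * frame[sym]
    likelihood = alpha[-1] + (alpha[-2] if len(ext) > 1 else 0.0)
    return -math.log(likelihood)

def ctc_multilayer_loss(layer_probs, target, weights):
    """Weighted sum of the CTC losses computed from several layers' outputs."""
    if len(layer_probs) != len(weights):
        raise ValueError("need exactly one weight per layer")
    return sum(w * ctc_loss(p, target) for w, p in zip(weights, layer_probs))

# Toy example: 2 frames, vocabulary {0: blank, 1: "a"}, target "a".
middle_layer = [[0.5, 0.5], [0.5, 0.5]]   # hypothetical intermediate-layer output
last_layer = [[0.1, 0.9], [0.8, 0.2]]     # hypothetical final-layer output
loss = ctc_multilayer_loss([middle_layer, last_layer], [1], weights=[0.3, 0.7])
```

Only the training objective changes in this sketch; decoding would still read the last layer alone, which matches the abstract's point that testing has no extra cost.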
Inferring Potential Drug-Target Interactions using Multiple Similarities and Network Consistency Projection
Jianhua Li, Haoran Ren, Dayu Xiao, Botao Deng
Developing new drugs is time-consuming, labor-intensive, and costly. Identifying new targets for existing drugs can help discover new therapeutic uses for old drugs and reduce the cost of drug development. Drug-target interactions are usually inferred by searching for similar drugs and targets, and the various biomedical databases now available provide useful data for predicting them. We propose a novel computational model for discovering Drug-Target Interactions using Network consistency projection (DTIN). The Gaussian kernel similarities of drugs and targets are derived from known drug-target interactions with a Gaussian kernel function, so DTIN incorporates six types of similarity: drug chemical structure similarity, drug ATC similarity, drug Gaussian kernel similarity, target sequence similarity, target function similarity, and target Gaussian kernel similarity. We use logistic regression to process the integrated similarity and predict scores for interacting drug-target pairs by network consistency projection. Five-fold cross-validation on a benchmark dataset shows that DTIN is effective and outperforms two advanced models.
DOI: https://doi.org/10.1145/3581807.3581860 | Published: 2022-11-17 | Citations: 0
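As a rough illustration of how a Gaussian kernel similarity can be derived from known drug-target interactions, the sketch below compares binary interaction profiles. The abstract does not give the exact formula, so the bandwidth normalization by the mean squared profile norm, the `gamma_prime` parameter, and the toy matrix are all assumptions.

```python
import math

def gaussian_kernel_similarity(profiles, gamma_prime=1.0):
    """Pairwise Gaussian kernel similarity between interaction profiles."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Assumed convention: scale the bandwidth by the mean squared profile norm.
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / len(profiles)
    gamma = gamma_prime / mean_sq_norm
    return [[math.exp(-gamma * sq_dist(p, q)) for q in profiles] for p in profiles]

# Rows are drugs, columns are targets; 1 marks a known interaction (toy data).
interactions = [[1, 0], [1, 0], [0, 1]]
drug_sim = gaussian_kernel_similarity(interactions)
# Target profiles are the columns of the same matrix.
target_sim = gaussian_kernel_similarity([list(col) for col in zip(*interactions)])
```

Drugs with identical interaction profiles get similarity 1, and the similarity decays as the profiles diverge. In the model described above, these matrices would be two of the six similarities that are integrated before projection.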
Medium and Long Wave Infrared Image Enhancement Fusion Method Based on Edge Preserving
Shubin Lou, Xin Zheng, Bin Yue, Qiang Wu
Medium- and long-wave infrared image fusion suffers from several problems: overemphasis on detail retention, which often weakens thermal information; poor contrast in the fused images; and heavy noise. To address them, a medium- and long-wave image fusion method based on an improved non-subsampled shearlet transform (NSST) is proposed. First, the mid-wave and long-wave infrared images are each processed in a targeted manner: an adaptive contrast enhancement algorithm adjusts the pixel values of the target and background areas of the mid-wave infrared image, enhancing the target by widening the relative pixel difference between the thermal target and the background. Second, mean curvature filtering and Gaussian filtering decompose each source image into a detail layer, a structure layer, and a regional layer. The regional layer is fused with an energy attribute fusion strategy guided by the energy differential feature, the structure layer with a maximum fusion strategy, and the detail layer with a directional contrast fusion strategy. Finally, the three fused layers are added to reconstruct the final fused image. Experimental results show that the algorithm effectively fuses mid-wave and long-wave infrared images, retaining the mid-wave infrared thermal radiation information while largely preserving edge detail in the fusion result. Subjective and objective evaluation indicators show that the proposed algorithm achieves better fusion performance than other algorithms.
DOI: https://doi.org/10.1145/3581807.3581852 | Published: 2022-11-17 | Citations: 0
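The decompose, fuse, and add-back pipeline in the abstract above can be sketched in a few lines. This is not the paper's method: a box blur stands in for both the mean curvature filter and the Gaussian filter, larger-magnitude selection stands in for the maximum and directional contrast strategies, and plain averaging stands in for the energy-guided strategy. All function names are hypothetical.

```python
def box_blur(img, radius):
    """Mean filter with edge clamping; a stand-in smoothing filter."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in range(-radius, radius + 1)
                    for dx in range(-radius, radius + 1)]
            out[y][x] = sum(vals) / len(vals)
    return out

def combine(a, b, op):
    """Apply a binary operation pixel by pixel."""
    return [[op(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def decompose(img):
    """Split an image into detail, structure, and regional layers."""
    fine = box_blur(img, 1)
    coarse = box_blur(fine, 2)
    detail = combine(img, fine, lambda p, q: p - q)
    structure = combine(fine, coarse, lambda p, q: p - q)
    return detail, structure, coarse  # the three layers sum back to img

def fuse(img_a, img_b):
    """Fuse two images layer by layer, then add the fused layers."""
    da, sa, ra = decompose(img_a)
    db, sb, rb = decompose(img_b)
    pick_larger = lambda p, q: p if abs(p) >= abs(q) else q
    detail = combine(da, db, pick_larger)               # stand-in rule
    structure = combine(sa, sb, pick_larger)            # larger magnitude wins
    region = combine(ra, rb, lambda p, q: (p + q) / 2)  # stand-in rule
    add = lambda p, q: p + q
    return combine(combine(detail, structure, add), region, add)

mid_wave = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]   # toy images
long_wave = [[8.0, 7.0, 6.0], [5.0, 4.0, 3.0], [2.0, 1.0, 0.0]]
fused = fuse(mid_wave, long_wave)
```

Because the layers are built as successive differences, adding them reproduces the source image exactly, which is why a final summation is enough to reconstruct the fused result.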
Multi-Scale Features Integration for Handwritten Mathematical Expression Recognition
Xianghao Liu, Da-han Wang, Shunzhi Zhu
Handwritten mathematical expression recognition (HMER) is a challenging task because of the complex two-dimensional structure of mathematical expressions and the similarity of handwritten symbols. Most existing HMER methods consider only single-scale features and ignore the multi-scale features that are very important to HMER. A few works have explored multi-scale feature fusion in HMER, but they rely on an extra branch that adds parameters and computation. In this paper, we propose an end-to-end method that integrates multi-scale features in a unified model. Specifically, we adapt Dense Atrous Spatial Pyramid Pooling (DenseASPP) to our backbone network to capture multi-scale features of the input image while expanding the receptive field. Moreover, we add a symbol classifier trained with focal loss to better discriminate and recognize similar symbols, further improving HMER performance. Experiments on the Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME) 2014, 2016, and 2019 datasets show that the proposed method outperforms most state-of-the-art methods, demonstrating its effectiveness.
DOI: https://doi.org/10.1145/3581807.3581844 | Published: 2022-11-17 | Citations: 0
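To show why focal loss helps a classifier concentrate on easily confused symbols, here is a minimal single-sample sketch of the standard formulation, FL = -(1 - p_t)^gamma * log(p_t). The abstract does not state the focusing parameter, so `gamma=2.0` is only a commonly used default, and the logits are invented toy values.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: cross-entropy scaled by (1 - p_t) ** gamma."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

confident = focal_loss([4.0, 0.0, 0.0], target=0)   # symbol classified easily
ambiguous = focal_loss([0.2, 0.0, 0.1], target=0)   # symbol confused with others
```

With `gamma=0` the expression reduces to ordinary cross-entropy. Larger values shrink the contribution of confident predictions, so the gradient is dominated by ambiguous samples such as visually similar symbols.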
Intelligent IoT Scheduling Mechanism Based on Data Traffic Prediction
Shuai Hou, Jizhe Lu, Enguo Zhu, Hailong Zhang, Aliaosha Ye
To improve the efficiency of data collection, transmission, and application in the electric power Internet of Things (IoT), much research has been devoted to resource allocation and scheduling algorithms. However, few studies focus on how dynamic changes in data volume affect decision-making. In this paper, we propose an intelligent IoT scheduling mechanism based on data traffic prediction. First, we propose an IoT data traffic prediction model (IoT-DTP) to accurately predict future data volume. On this basis, we construct a data-driven IoT scheduling mechanism (PESM) that achieves higher real-time data transmission capability and faster service response. For instance, it enables efficient data interaction for app launch, release, and update in the intelligent IoT software platform. Finally, theoretical analysis and experimental evaluation verify the efficiency of the proposed method.
DOI: https://doi.org/10.1145/3581807.3581899 | Published: 2022-11-17 | Citations: 0
Combining Attention Mechanism and Local-Global Features Association Network for Vehicle Re-identification
Caiyu Li, X. Du, Yun Wu, Da-han Wang
Vehicle re-identification (Re-ID) aims to retrieve a target vehicle from a large dataset of vehicle images captured by multiple cameras. Most vehicles are difficult to recognize under low resolution, occlusion, and viewpoint change, which makes vehicle Re-ID challenging. Existing work usually uses additional attribute information, such as color, viewpoint, and model, to distinguish different vehicles. However, this requires expensive manual annotation. We therefore propose a three-branch network based on an attention mechanism and local-global feature association (AM-LGFA) to improve the accuracy of vehicle Re-ID. The global branch extracts the global features of the vehicle. The attention branch introduces a multi-scale channel attention module to suppress irrelevant information and extract important channel features. The local branch divides the features extracted from the backbone into horizontal stripe features, and each stripe feature is then connected with the global information to enhance the context between features. Finally, the features extracted from the three branches are concatenated as the feature representation used in the test phase. The experimental results show that the features extracted by the AM-LGFA network are complementary. The effectiveness of the method is verified on two challenging public datasets, VehicleID and VeRi-776.
DOI: https://doi.org/10.1145/3581807.3581842 | Published: 2022-11-17 | Citations: 0
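The local branch described above splits backbone features into horizontal stripes and ties each stripe to global information. The sketch below shows one way this could look on a plain nested-list feature map. The abstract does not say how the association is computed, so adding the global vector to each stripe vector is purely a hypothetical choice; the attention branch is omitted.

```python
def mean_pool(feature_map, rows):
    """Average each channel over the given rows and all columns."""
    pooled = []
    for channel in feature_map:
        vals = [v for r in rows for v in channel[r]]
        pooled.append(sum(vals) / len(vals))
    return pooled

def local_global_descriptor(feature_map, num_stripes=2):
    """Concatenate a global vector with globally informed stripe vectors.

    feature_map is indexed as [channel][row][column]. The height is assumed
    to be divisible by num_stripes; any trailing rows would be ignored.
    """
    height = len(feature_map[0])
    global_feat = mean_pool(feature_map, range(height))
    stripe_h = height // num_stripes
    descriptor = list(global_feat)
    for i in range(num_stripes):
        rows = range(i * stripe_h, (i + 1) * stripe_h)
        stripe = mean_pool(feature_map, rows)
        # Hypothetical association rule: add the global context to the stripe.
        descriptor += [s + g for s, g in zip(stripe, global_feat)]
    return descriptor

# Toy feature map: 1 channel, 2 rows, 2 columns.
toy_map = [[[1.0, 1.0], [3.0, 3.0]]]
descriptor = local_global_descriptor(toy_map, num_stripes=2)
```

The descriptor length grows as channels times (1 + number of stripes), so the stripe count trades spatial detail against descriptor size.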
PV infrared hot spot classification algorithm with multi-branch feature fusion
Han Zhou
This paper designs a multi-branch feature fusion classification algorithm to improve on the accuracy of classical deep learning-based infrared image algorithms. First, the algorithm builds the overall network architecture by connecting multi-resolution sub-networks in parallel. Then, a lightweight structural module is designed to reduce the computational load of the network weight parameters, and a channel attention module is introduced to refine feature channels and improve detection accuracy. Finally, a parallel spatial pyramid connection is designed to strengthen the semantic expressiveness of the features. The experimental results show that the proposed model improves accuracy and optimizes the parameters; the accuracy reaches 97.6%. The proposed algorithm improves on current mainstream classification algorithms and shows good potential for wider application.
DOI: https://doi.org/10.1145/3581807.3581836 | Published: 2022-11-17 | Citations: 0
A Mask Detection Algorithm Based on Improved Yolov5s
Xin Zhang, Yalan Zeng, Shunyong Zhou
In response to the severe situation of novel coronavirus epidemic prevention and control, a target detection algorithm is proposed to detect whether masks are worn in public places. Ghostnet and SElayer modules, which have fewer parameters, replace the BottleneckCSP part of the original Yolov5s network, reducing the computational complexity of the model and improving detection accuracy. The bounding box regression loss function DIOU is optimized: the DGIOU loss function is used for bounding box regression, and the distance between the center coordinates of the two bounding boxes is taken into account to achieve better convergence. In the feature pyramid, depthwise separable convolution (DW) replaces ordinary convolution, which further reduces the number of parameters and the loss of feature information caused by multiple convolutions. Experimental results show that, compared with the Yolov5s algorithm, the proposed method improves mAP by 4.6% and detection speed by 10.7 frames/s in mask-wearing detection. Compared with other mainstream algorithms, the improved Yolov5s algorithm has better generalization ability and practicality.
DOI: https://doi.org/10.1145/3581807.3581818 | Published: 2022-11-17 | Citations: 0
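The abstract above starts from the DIOU bounding box loss, which adds a center-distance penalty to the IoU term. The sketch below implements only that standard baseline, 1 - IoU + d^2 / c^2, where d is the distance between box centers and c is the diagonal of the smallest enclosing box. The paper's DGIOU variant is not specified in the abstract and is not reproduced here; the function name and toy boxes are hypothetical.

```python
def diou_loss(box_a, box_b):
    """Distance-IoU loss for two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2, so both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distance between the two box centers.
    center_dist_sq = (((ax1 + ax2) - (bx1 + bx2)) / 2) ** 2 \
                   + (((ay1 + ay2) - (by1 + by2)) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both.
    diag_sq = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
            + (max(ay2, by2) - min(ay1, by1)) ** 2

    return 1.0 - iou + center_dist_sq / diag_sq

overlapping = diou_loss((0, 0, 2, 2), (1, 1, 3, 3))
disjoint = diou_loss((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike plain IoU loss, the penalty term still changes when the boxes do not overlap, so a predicted box far from the ground truth keeps receiving a useful gradient.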
An Encrypted Traffic Identification Method Based on RepVGG
Kemeng Wang, Quan Zhou, Zhikang Zeng, Menglong Chen
With the emergence of encrypted traffic, more and more researchers use AI technology to improve the accuracy of traffic identification. However, machine learning relies on human experience to extract features, and training deep learning models depends on a large number of labeled samples. To solve these problems, we propose an encrypted traffic identification method based on RepVGG. First, the RepVGG-A0 model pre-trained on the ImageNet dataset is transferred to the encrypted traffic dataset, and a dropout layer is added before the linear classifier to avoid overfitting. Then, to reduce the impact of sample imbalance, different weight parameters are assigned to different categories during training. Finally, we compare the method with other traffic identification methods. The experimental results show that the proposed method achieves 99.98% accuracy in binary classification and 97% accuracy in multi-class classification, which demonstrates its effectiveness.
DOI: https://doi.org/10.1145/3581807.3581896 | Published: 2022-11-17
Citations: 0
Bayesian Personalized Ranking based on Knowledge Graph
Ran Ma, Xiaotian Yang, Jiang Li, Fei Gao
Collaborative filtering algorithms suffer from serious data sparsity and cold-start problems as the amount of data increases and the movie dataset keeps growing. To solve these problems, this paper proposes combining the knowledge graph with a matrix factorization algorithm. Starting from the user's historical interests, the method mines the user's similar interests on the knowledge graph to form candidate items, which are then used to predict the user's interests. Finally, Bayesian personalized recommendation predicts the user's rating of the candidate items to achieve top-K recommendation. Experiments demonstrate that the proposed algorithm significantly improves the recommendation performance of the matrix factorization model, reaching AUC = 0.9348 and ACC = 0.8474 on the movie dataset.
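For readers unfamiliar with the ranking objective named in the title, the following is a minimal sketch of the standard Bayesian Personalized Ranking pairwise loss on top of matrix factorization scores. It shows the generic formulation only; the abstract does not give the authors' exact model or how knowledge graph candidates enter it, and the function names, learning rate, and regularization value are assumptions chosen for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bpr_loss(user_vec, pos_vec, neg_vec):
    """Pairwise loss -ln sigmoid(score(u, i) - score(u, j)).

    i is an item the user interacted with, j is one the user did not.
    """
    diff = dot(user_vec, pos_vec) - dot(user_vec, neg_vec)
    return -math.log(sigmoid(diff))


def bpr_sgd_step(user_vec, pos_vec, neg_vec, lr=0.05, reg=0.01):
    """One stochastic gradient step on the L2-regularized BPR objective."""
    diff = dot(user_vec, pos_vec) - dot(user_vec, neg_vec)
    g = 1.0 - sigmoid(diff)  # derivative of ln sigmoid(diff) w.r.t. diff
    new_user = [u + lr * (g * (p - n) - reg * u)
                for u, p, n in zip(user_vec, pos_vec, neg_vec)]
    new_pos = [p + lr * (g * u - reg * p) for u, p in zip(user_vec, pos_vec)]
    new_neg = [n + lr * (-g * u - reg * n) for u, n in zip(user_vec, neg_vec)]
    return new_user, new_pos, new_neg


# Toy latent factors for one (user, positive item, negative item) triple.
user, pos, neg = [0.1, 0.2], [0.3, 0.1], [0.2, 0.4]
loss_before = bpr_loss(user, pos, neg)
user, pos, neg = bpr_sgd_step(user, pos, neg)
loss_after = bpr_loss(user, pos, neg)
```

Each step pushes the score of the observed item above that of the unobserved one. A top-K list for a user is then obtained by sorting candidate items by `dot(user_vec, item_vec)` and keeping the first K.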
DOI: https://doi.org/10.1145/3581807.3581887 | Published: 2022-11-17
Citations: 0