
Latest publications: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition

Tongue Image Retrieval Based On Reinforcement Learning
A. Farooq, Xinfeng Zhang
In Chinese medicine, the patient's body constitution plays a crucial role in determining the course of treatment because it is intrinsically linked to the patient's physiological and pathological processes. Traditional Chinese medicine practitioners use tongue diagnosis to determine a person's constitutional type during an examination. Before a tongue-image constitution recognition system can be deployed on a non-invasive mobile device for fast, efficient, and accurate constitution recognition, an effective solution is needed to overcome the complexity of this setting. We use deep deterministic policy gradients to implement tongue image retrieval: we propose a new method for image retrieval systems based on Deep Deterministic Policy Gradients (DDPG) that aims to boost the precision of database searches for query images. Our strategy uses the complexity of individual instances to split the dataset into two subsets for independent classification with deep reinforcement learning. Experiments on tongue datasets gauge the efficacy of the proposed approach: deep reinforcement learning techniques are applied to build a retrieval system for images of tongues affected by various disorders, with databases of such images serving as examples, and the proposed strategy may enhance retrieval accuracy through improved recognition of tongue diseases. The experimental results suggest that the new approach to computing the main colour histogram outperforms the prior one; although the statistical difference is small, the improved retrieval effect is clearly visible to the human eye. The tongue is likewise brought to the fore to emphasise the importance of the required verbal statement. Both investigations used tongue images classified into five distinct categories.
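As a rough illustration of the deterministic policy gradient at the heart of DDPG, the toy sketch below ascends a linear critic's Q-value through a linear actor. All dimensions and weights are invented for illustration; the paper's actual retrieval networks are not specified here.

```python
import numpy as np

# Toy DDPG-style actor update: a = W s (deterministic policy),
# critic Q(s, a) = a . w_action (linear in the action for simplicity).
rng = np.random.default_rng(0)

state_dim, action_dim = 4, 2
W_actor = rng.normal(size=(action_dim, state_dim)) * 0.1  # policy weights
w_action = rng.normal(size=action_dim)                    # critic weight on a

def actor(s):
    return W_actor @ s

def critic_q(s, a):
    return a @ w_action

# Deterministic policy gradient: dQ/dW = (dQ/da) outer s
s = rng.normal(size=state_dim)
lr = 0.05
q_before = critic_q(s, actor(s))
for _ in range(50):
    grad_a = w_action                       # dQ/da for this linear critic
    W_actor += lr * np.outer(grad_a, s)     # ascend Q through the actor
q_after = critic_q(s, actor(s))
print(q_after > q_before)  # True: the actor climbed the critic's Q
```

In a full DDPG system the critic itself is trained from replayed transitions; here it is frozen to isolate the actor update.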
DOI: https://doi.org/10.1145/3581807.3581848 · Published 2022-11-17
Citations: 0
ResAsapp: An Effective Convolution to Distinguish Adjacent Pixels For Scene Text Detection
Kangming Weng, X. Du, Kunze Chen, Dahan Wang, Shunzhi Zhu
The segmentation-based approach is an essential direction in scene text detection: it can detect text of arbitrary shape, including curved text, and has attracted increasing attention from researchers. However, extensive research has shown that segmentation-based methods are disturbed by adjoining pixels and cannot effectively identify text boundaries. To tackle this problem, we propose ResAsapp Conv, built on the PSE algorithm. This convolution structure provides visual fields of different scales around an object, allowing the network to recognize text boundaries effectively. The method's effectiveness is validated on three benchmark datasets: CTW1500, Total-Text, and ICDAR2015. In particular, on CTW1500, a dataset full of hard-to-distinguish long curved text in all kinds of scenes, our network achieves an F-measure of 81.2%.
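The PSE algorithm this paper builds on grows labelled text "kernels" outward inside the full text mask so adjacent instances stay separate. A minimal breadth-first sketch on a made-up 3x6 grid (not the paper's network outputs):

```python
from collections import deque

# Toy progressive scale expansion: two seed kernels grow ring by ring
# inside the binary text mask; the zero column keeps them apart.
mask = [
    [1, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1],
]
kernels = {(1, 1): 1, (1, 4): 2}  # seed pixels with instance labels

def expand(mask, seeds):
    labels = dict(seeds)
    q = deque(labels)
    while q:
        r, c = q.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < len(mask) and 0 <= nc < len(mask[0]) \
               and mask[nr][nc] and (nr, nc) not in labels:
                labels[(nr, nc)] = labels[(r, c)]  # first kernel to arrive wins
                q.append((nr, nc))
    return labels

labels = expand(mask, kernels)
print(sorted(set(labels.values())))  # [1, 2]: two separate text instances
```

The first-come-first-served rule is what prevents adjoining pixels from merging two nearby text regions into one.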
DOI: https://doi.org/10.1145/3581807.3581854 · Published 2022-11-17
Citations: 0
An Unstructured Data Desensitization Approach for Futures Industry
Xiaofan Zhi, Li Xue, Sihao Xie
The development of Big Data and artificial intelligence technologies provides a powerful boost to financial institutions' data mining, while also bringing challenges in preventing private data disclosure. Data desensitization is one way to protect private data. Compared with structured data desensitization, unstructured data desensitization still faces several challenges. On one hand, the accuracy of text recognition from images, voice, video, and other unstructured data seriously affects desensitization performance. On the other hand, conventional sensitive-information recognition methods, which are rule- and matching-based, often produce unacceptable desensitized results on complicated financial data. To address these issues, this paper proposes a new method for unstructured data desensitization. It first uses an evaluation model based on multi-level fine-grained verification of text-conversion accuracy to improve text recognition, and then introduces a sensitive-information recognition model based on hybrid analysis to reduce missed and false detections of sensitive information. The method achieved satisfactory results on real datasets.
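The rules-and-matching baseline the abstract contrasts with can be sketched as a few regular-expression substitutions. The patterns below are illustrative stand-ins, not the paper's actual rule set:

```python
import re

# Rule-based masking of sensitive strings in free text: each rule is a
# (pattern, replacement token) pair applied in order.
RULES = [
    (re.compile(r"\b1\d{10}\b"), "[PHONE]"),    # 11-digit mobile number
    (re.compile(r"\b\d{17}[\dXx]\b"), "[ID]"),  # 18-character ID number
]

def desensitize(text: str) -> str:
    for pattern, token in RULES:
        text = pattern.sub(token, text)
    return text

print(desensitize("Call 13812345678, ID 11010519491231002X."))
# Call [PHONE], ID [ID].
```

Such fixed patterns are exactly what breaks down on complicated financial text, which motivates the hybrid-analysis model the paper proposes.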
DOI: https://doi.org/10.1145/3581807.3581885 · Published 2022-11-17
Citations: 0
RGFGM-LXMERT-An Improve Architecture Based On LXMERT
Renjie Yu
LXMERT (Learning Cross-Modality Encoder Representations from Transformers) is a two-stream cross-modality pre-trained model that performs well on downstream tasks comprising two visual question answering datasets and a challenging visual-reasoning task (i.e., VQA, GQA, and NLVR). But the large-scale model still has considerable room for improvement: its accuracy is low, its generalization ability is weak, and it is easily fooled by adversarial attacks. Furthermore, training LXMERT takes a lot of time and money, so improvement is urgently needed. I therefore try to improve the model's training speed, generalization ability, and accuracy by enhancing both the training method and the model structure. In the training method, FGM (Fast Gradient Method) adversarial training is introduced in the fine-tuning phase by adding perturbations to the weights of both the language embedding layer and the visual-feature linear layer, which effectively improves accuracy and generalization. In the model structure, a weighted residual block improves training speed by 1.6% in the pre-training phase without losing performance. Next, the most important structure, the Encoder, is redesigned to make the model converge better: the Encoder's FFN (Feed-Forward Neural Network) is replaced by a GLU (Gated Linear Unit), which also improves model fitting and performance. The improved model performs better on the VQA task than the benchmark (i.e., LXMERT). Finally, detailed ablation studies show that the enhancement strategies are effective for LXMERT and quantify the effect of each measure on the model.
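The FFN-to-GLU swap can be sketched in a few lines: a GLU computes (xW + b) gated elementwise by sigmoid(xV + c), so one linear projection modulates another. Dimensions below are toy values; LXMERT's actual hidden sizes are not reproduced:

```python
import numpy as np

# Minimal GLU layer: GLU(x) = (x W + b) * sigmoid(x V + c).
rng = np.random.default_rng(1)
d_model, d_ff = 8, 16

W, V = rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_model, d_ff))
b, c = np.zeros(d_ff), np.zeros(d_ff)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def glu(x):
    # gate in (0, 1) scales each channel of the linear projection
    return (x @ W + b) * sigmoid(x @ V + c)

x = rng.normal(size=(2, d_model))  # a batch of two token vectors
out = glu(x)
print(out.shape)  # (2, 16)
```

Because the gate lies strictly between 0 and 1, each output channel is a damped copy of the corresponding linear projection, which is one intuition for the smoother optimization behavior claimed above.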
DOI: https://doi.org/10.1145/3581807.3581879 · Published 2022-11-17
Citations: 0
Few-Shot Data Augmentation for Industrial Character Recognition
Hongchao Gao, Xiaoqian Huang, Bofeng Liu
The task of industrial character recognition is to extract character content from workpiece surfaces during industrial production. Limited training data, incomplete coverage of character categories, and non-standardized character styles encountered in actual production significantly reduce the recognition performance of deep learning-based methods such as scene text recognition and Optical Character Recognition (OCR). In this paper, we propose an augmentation strategy for industrial character recognition based on the Generative Adversarial Network (GAN). The strategy consists of two modules: a character detection module and a synthetic data generation module. The results show that the augmentation strategy achieves the best generation results, and a recognition network trained on the augmented dataset achieves the best results on four types of industrial datasets.
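As a minimal stand-in for the generation module's purpose, enlarging a tiny character-image dataset, the sketch below expands a toy dataset with random shifts. The paper's GAN-based generator is far richer; this only illustrates the small-dataset-to-larger-dataset step, with all data invented:

```python
import numpy as np

# Expand a few toy "character images" (5x5 arrays) by random 1-pixel shifts.
rng = np.random.default_rng(2)

def random_shift(img):
    dy, dx = rng.integers(-1, 2, size=2)   # shift in {-1, 0, +1} per axis
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

dataset = [np.eye(5)] * 4                  # four identical toy characters
augmented = dataset + [random_shift(im) for im in dataset for _ in range(3)]
print(len(augmented))  # 16: 4 originals + 12 shifted copies
```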
DOI: https://doi.org/10.1145/3581807.3581841 · Published 2022-11-17
Citations: 0
Network Bandwidth Prediction Method Based on Hidden Markov model in High-speed Railway
Luyao Wang, Jia Guo, Ye Zhu, Heying Song, Yanmin Wei, Jinao Wang
In the context of full commercial deployment of 5G, high-speed rail passengers have ever higher requirements for wireless network service quality. However, in current high-speed rail 5G streaming media transmission, the high travel speed causes frequent base-station handovers, and user bandwidth does not match the streaming bit rate, resulting in a poor network experience and a poor streaming media experience. To address these problems, this paper focuses on bandwidth prediction for network users in the high-speed rail environment and proposes a bandwidth prediction algorithm, High-speed 5G Environment Bandwidth Predict (H5EBP), based on a hidden Markov model over the different states of the high-speed rail, so as to improve the user's streaming media experience. Comparative evaluation against existing bandwidth prediction algorithms shows that H5EBP greatly improves prediction accuracy, thereby improving the user's streaming media experience.
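One-step bandwidth prediction with a discrete hidden Markov model can be sketched with the forward algorithm: maintain a belief over hidden states, then propagate it one step to score the next observation. The two link-quality states, three bandwidth bins, and all probabilities below are invented, not H5EBP's trained parameters:

```python
import numpy as np

# Toy HMM: hidden "link quality" states, observed bandwidth bins.
A = np.array([[0.8, 0.2],        # state transition probabilities
              [0.3, 0.7]])
B = np.array([[0.7, 0.2, 0.1],   # emission: P(bandwidth bin | state)
              [0.1, 0.3, 0.6]])
pi = np.array([0.5, 0.5])        # initial state distribution

def predict_next_obs(obs_seq):
    alpha = pi * B[:, obs_seq[0]]        # forward algorithm (unnormalized)
    for o in obs_seq[1:]:
        alpha = (alpha @ A) * B[:, o]
    belief = alpha / alpha.sum()         # P(state | observations so far)
    return (belief @ A) @ B              # P(next bandwidth bin)

probs = predict_next_obs([0, 0, 1])      # mostly low-bin observations
print(probs, probs.argmax())
```

Having seen mostly bin-0 observations, the model assigns the highest probability to bin 0 next, which is the kind of short-horizon forecast a rate-adaptation layer would consume.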
DOI: https://doi.org/10.1145/3581807.3581900 · Published 2022-11-17
Citations: 0
DQN Method Analysis for Network Routing of Electric Optical Communication Network
Yuqing Zhong, Xiong Wei Zhang, Wuhua Xu
Route planning in electric optical communication networks plays a crucial role in communication reliability and performance. To apply reinforcement learning and obtain optimized routing results, the Deep Q Network (DQN), which has proven to be a high-performance neural network model, is analyzed for electric optical network routing. Depending on network function and structure, a large-scale electric optical communication network can be divided into several sub-networks for better training speed. An advanced DQN model is analyzed and trained on a 200-node communication network and a 700-node communication network. Training results for the different network scales, reported with reward data and running time for comparison, demonstrate the effectiveness of the method, which can be used for dynamic route planning in large-scale electric communication networks.
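As a toy stand-in for the DQN routing idea, tabular Q-learning on a four-node graph shows how per-hop rewards yield an optimized route. The paper trains a neural Q-network on 200- and 700-node networks; the topology, costs, and rewards here are invented:

```python
import random

# Q-learning routing on a toy graph: node 3 is the destination,
# edge costs are negative rewards, reaching 3 pays a bonus.
random.seed(0)
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
cost = {(0, 1): 1, (0, 2): 5, (1, 3): 1, (2, 3): 1}

Q = {(n, m): 0.0 for n in graph for m in graph[n]}
for _ in range(200):                       # training episodes
    node = 0
    while node != 3:
        nxt = random.choice(graph[node])   # random exploration
        reward = -cost[(node, nxt)] + (10 if nxt == 3 else 0)
        future = max((Q[(nxt, m)] for m in graph[nxt]), default=0.0)
        Q[(node, nxt)] += 0.5 * (reward + future - Q[(node, nxt)])
        node = nxt

best_first_hop = max(graph[0], key=lambda m: Q[(0, m)])
print(best_first_hop)  # the cheap path 0 -> 1 -> 3 should win
```

A DQN replaces the Q table with a neural network so the same update rule scales to graphs far too large to enumerate, which is the sub-network splitting motivation above.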
DOI: https://doi.org/10.1145/3581807.3581898 · Published 2022-11-17
Citations: 0
Comparative Study on EEG Feature Recognition based on Deep Belief Network
Guangrong Liu, Bin Hao, Abdelkader Nasreddine Belkacem, Jiaxin Zhang, Penghai Li, Jun Liang, Changming Wang, Chao Chen
In brain-computer interface (BCI) systems, motor imagery suffers from problems such as difficulty in extracting EEG signal features, low classification and recognition accuracy, long training times, and gradient saturation in feature classification based on traditional deep neural networks. This paper proposes a deep belief network (DBN) model. The fast Fourier transform (FFT) and wavelet transform (WT), combined with the deep machine learning model DBN, are used to extract feature vectors of the time-frequency signals of different leads, which are superimposed and averaged before classification experiments. The number of DBN layers and the number of neurons in each layer were determined by iteration. Through reverse fine-tuning, the optimal weight coefficients W and bias terms b are determined layer by layer, solving the training and optimization problems of deep neural networks. A motor imagery and motion observation (MI-AO) experiment is designed and compared against the public dataset BCI Competition IV 2a. Compared with other algorithms, the DBN model achieves an average binary classification accuracy of 83.81% and an average four-class classification accuracy of 80.77%.
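The FFT feature step can be sketched as band-power computation on a synthetic EEG-like signal. The sampling rate, band edges, and test signal below are illustrative assumptions, not the paper's recording setup:

```python
import numpy as np

# Band-power features from a synthetic 2-second signal sampled at 250 Hz:
# a strong 10 Hz (alpha-band) component plus a weaker 22 Hz (beta-band) one.
fs = 250
t = np.arange(0, 2, 1 / fs)
signal = np.sin(2 * np.pi * 10 * t) + 0.3 * np.sin(2 * np.pi * 22 * t)

spectrum = np.abs(np.fft.rfft(signal)) ** 2
freqs = np.fft.rfftfreq(len(signal), 1 / fs)

def band_power(lo, hi):
    return spectrum[(freqs >= lo) & (freqs < hi)].sum()

alpha = band_power(8, 13)   # the 10 Hz component lands here
beta = band_power(13, 30)   # the 22 Hz component lands here
print(alpha > beta)         # True: alpha power dominates as constructed
```

Per-lead feature vectors like (alpha, beta, ...) are the kind of input a DBN would then superimpose, average, and classify.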
DOI: https://doi.org/10.1145/3581807.3581871 · Published 2022-11-17
Citations: 0
Nutrient Deficiency Diagnosis of Plants Based on Transfer Learning and Lightweight Convolutional Neural Networks MobileNetV3-Large
Qian Yan, Xuhong Lin, Wenwen Gong, Caicong Wu, Yifei Chen
Nutrient deficiency diagnosis of plants is an important application in precision agriculture. At present it depends mainly on manual identification, which makes it difficult to ensure efficiency and accuracy. Therefore, focusing on the difficult convergence and poor real-time performance of existing deep convolutional neural networks in plant nutrient-deficiency detection, this study proposes a lightweight deep learning model, UMNet (Nutrient-MobileNetV3-Network). The model augments the collected rice-leaf images to expand the dataset, transfers the knowledge learned by the MobileNetV3-Large network on the ImageNet dataset to UMNet, redesigns a new fully connected layer, and uses a new activation function. The experimental results show that: (1) transfer learning solves the problem of insufficient training data; compared with training without transfer learning, accuracy increases by 7.22%∼9.63%, greatly improving the model's convergence speed and recognition accuracy; (2) compared with complex convolutional neural networks (CNNs) such as InceptionV3, InceptionResNetV2, and VGG16, the lightweight UMNet has lower storage requirements and shorter training time while still ensuring high accuracy, and its recognition accuracy exceeds that of lightweight networks of comparable complexity: ShuffleNetV2, EfficientNetB0, and Xception. The plant nutrient-deficiency detection model UMNet constructed in this paper reaches an identification accuracy of 97.80%, with a single-epoch training time of about 46.4 s, and takes only 1.45 s to predict the nutrient deficiency of a single object. This realizes intelligent detection of plant nutrient deficiency and will promote academic exploration of deep learning in plant phenotype research.
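The frozen-backbone transfer-learning recipe can be sketched generically: keep a fixed feature extractor and train only a new classification head. In the sketch below a random projection stands in for MobileNetV3-Large features and the two-class data are synthetic; only the train-the-head-only pattern matches the paper:

```python
import numpy as np

# Frozen "backbone" + trainable softmax head on a toy 2-class problem.
rng = np.random.default_rng(3)

backbone = rng.normal(size=(6, 4))                 # frozen: never updated
X = rng.normal(size=(40, 6)) + np.repeat([[2], [-2]], 20, axis=0)
y = np.repeat([0, 1], 20)

feats = np.tanh(X @ backbone)                      # fixed feature extractor
W = np.zeros((4, 2))                               # the only trainable part
for _ in range(300):                               # plain gradient descent
    logits = feats @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    onehot = np.eye(2)[y]
    W -= 0.1 * feats.T @ (p - onehot) / len(y)     # cross-entropy gradient

acc = (np.argmax(feats @ W, axis=1) == y).mean()
print(acc)
```

Because only the small head is optimized, training converges quickly even with little data, which is the effect the 7.22%∼9.63% accuracy gain from transfer learning reflects.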
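The abstract mentions that UMNet "uses a new activation function" without naming it. The MobileNetV3-Large backbone it builds on is known to use hard-swish, a cheap piecewise approximation of swish; a minimal self-contained sketch of that function (not necessarily the exact activation the authors chose) is:

```python
def relu6(x: float) -> float:
    """ReLU capped at 6, the clipping used inside MobileNetV3 blocks."""
    return min(max(0.0, x), 6.0)


def hard_swish(x: float) -> float:
    """Hard-swish: x * ReLU6(x + 3) / 6.

    A piecewise-linear approximation of swish (x * sigmoid(x)) that
    avoids the exponential, which is why MobileNetV3 favors it on
    mobile hardware.
    """
    return x * relu6(x + 3.0) / 6.0
```

For inputs below -3 the output is exactly 0, and for inputs above +3 it equals the input, so the function only deviates from identity/zero in the narrow band in between.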
Qian Yan, Xuhong Lin, Wenwen Gong, Caicong Wu, Yifei Chen, "Nutrient Deficiency Diagnosis of Plants Based on Transfer Learning and Lightweight Convolutional Neural Networks MobileNetV3-Large," Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022-11-17. doi:10.1145/3581807.3581812
Citations: 0
VRGNet: A Robust Visible Region-Guided Network for Occluded Pedestrian Detection
Xin Mao, Chaoqi Yan, Hong Zhang, J. Song, Ding Yuan
Pedestrian detection has made significant progress in both academia and industry. However, occlusion scenes still pose challenging problems. In this paper, we propose a novel and robust visible region-guided network (VRGNet) that specifically improves occluded pedestrian detection performance. Specifically, we leverage an adapted FPN-based framework to extract multi-scale features and fuse them to encode more precise localization and semantic information. In addition, we construct a pedestrian part pool that covers almost all scales of the different occluded body regions. Meanwhile, we propose a new occlusion handling strategy that integrates prior knowledge of the different visible body regions with visibility prediction into the detection framework, handling pedestrians with different degrees of occlusion. Extensive experiments demonstrate that our VRGNet achieves leading performance under different evaluation settings on the Caltech-USA dataset, especially for occluded pedestrians. It also achieves competitive results of 48.4%, 9.3%, and 6.7% under the Heavy, Partial, and Bare settings respectively on the CityPersons dataset compared with other state-of-the-art pedestrian detection algorithms, while keeping a better speed-accuracy trade-off.
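The Heavy/Partial/Bare settings referenced above are defined by how much of a pedestrian's full bounding box is visible; CityPersons annotations provide both the full and the visible box. A minimal sketch of that visibility ratio and the resulting bucketing (the exact thresholds are illustrative assumptions, roughly matching the commonly used CityPersons cutoffs) is:

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def visibility_ratio(full_box, visible_box):
    """Fraction of the pedestrian's full bounding box that is visible."""
    full = box_area(full_box)
    return box_area(visible_box) / full if full > 0 else 0.0


def occlusion_level(full_box, visible_box):
    """Bucket a pedestrian by visibility.

    Illustrative thresholds: >= 0.9 visible -> Bare,
    0.65-0.9 -> Partial, below 0.65 -> Heavy.
    """
    v = visibility_ratio(full_box, visible_box)
    if v >= 0.9:
        return "Bare"
    if v >= 0.65:
        return "Partial"
    return "Heavy"
```

For example, a pedestrian whose lower half is hidden behind a parked car has a visibility ratio of 0.5 and falls into the Heavy subset, which is where occlusion-aware detectors like VRGNet aim to make their gains.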
Xin Mao, Chaoqi Yan, Hong Zhang, J. Song, Ding Yuan, "VRGNet: A Robust Visible Region-Guided Network for Occluded Pedestrian Detection," Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022-11-17. doi:10.1145/3581807.3581817
Citations: 0
Journal
Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition