Pub Date: 2023-03-01, DOI: 10.1109/PRMVIA58252.2023.00034
Wensi Zhang, Xiao Liang, Yifang Zhang, Hanchen Su
The exploitation of big data in industrial fields faces several challenges, such as data privacy and security, data integration and interoperability, and data analysis and visualization. Data privacy and security are a major concern, because data collected in industrial settings often contain sensitive information. The particular nature of the industrial field raises two further challenges: (1) the distribution of data across categories is extremely uneven; (2) the short texts that make up the metadata contain many industry terms, which makes semantic representation difficult. Both challenges substantially degrade the performance of existing models. To address them, this paper proposes a pre-training model based on probability distribution for classifying sensitive data in the power industry. The model consists of three modules: (1) a data enhancement module that uses synonym expansion and noise introduction so that the model can extract classification features for sensitive data that make up only a small proportion of the samples; (2) a pre-training module that adopts the BERT model to capture the semantics of industry terms in short texts; (3) a probability prediction module that regularizes the distribution of the test data to match that of the training data. Compared with the traditional classification model and the deep-learning-based classification model, the F1-score improves by 36.68% and 6.39%, respectively.
Title: Sensitive Data Classification of Imbalanced Short Text Based on Probability Distribution BERT in Electric power industry
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
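The data enhancement step described in this abstract (synonym expansion plus noise introduction for under-represented classes) can be illustrated with a minimal sketch. It is not the authors' implementation: the `SYNONYMS` table, the token-dropping noise, the function names, and the `target` count are all assumptions made for illustration.

```python
import random

# Hypothetical synonym table; a real system would use a domain lexicon.
SYNONYMS = {
    "customer": ["client", "user"],
    "address": ["location"],
    "meter": ["gauge"],
}


def synonym_expand(tokens, rng, p=0.5):
    """Replace each token that has a synonym with probability p."""
    return [
        rng.choice(SYNONYMS[t]) if t in SYNONYMS and rng.random() < p else t
        for t in tokens
    ]


def add_noise(tokens, rng, p=0.1):
    """Drop each token with probability p (assumed noise model), keeping at least one."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]


def oversample_minority(samples, target, seed=0):
    """Append augmented (text, label) pairs until every label has `target` items."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(text)
    out = list(samples)
    for label, texts in by_label.items():
        for _ in range(max(0, target - len(texts))):
            tokens = rng.choice(texts).split()
            tokens = add_noise(synonym_expand(tokens, rng), rng)
            out.append((" ".join(tokens), label))
    return out
```

Calling `oversample_minority(data, target=n)` leaves the original samples untouched and only appends synthetic ones, so the augmented set can be passed to any downstream classifier.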
Pub Date: 2023-03-01, DOI: 10.1109/prmvia58252.2023.00030
Zongluo Zhao, Zhixin Zhao, Qiangqiang Li, Jiaxi Zhuang, Xiaoming Ju
Anomaly detection techniques for electrical power systems are advancing steadily. Applying neural networks to patrol images saves analysts a great deal of time, but because object contours have relatively low resolution in severe weather, recognition of anomalies in electrical equipment remains poor. Multi-modal methods can bring in additional information about the detected objects and may therefore improve the detection success rate. In this paper, we propose a feature-fusion model that uses cross-attention learning to augment anomaly features with text describing the anomaly and the environmental conditions. In comparative experiments on self-constructed image and text datasets, our model achieves state-of-the-art results on multiple metrics. More importantly, ablation experiments show that adding further features to the model yields better results, which indicates that the model can be extended.
Title: Multi-Modal Cross-Attention Learning on Detecting Anomalies of Electrical Equipment
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
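As a rough illustration of the cross-attention fusion named in this abstract, the sketch below lets each image feature attend over a set of text features and concatenates the result. It assumes plain scaled dot-product attention with no learned projections, and the function names and the concatenation step are illustrative choices; the paper's actual architecture is not specified here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    d = len(keys[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values))
             for j in range(len(values[0]))]
        )
    return attended


def fuse(image_feats, text_feats):
    """Concatenate each image feature with its text-attended context (assumed fusion)."""
    context = cross_attention(image_feats, text_feats, text_feats)
    return [img + ctx for img, ctx in zip(image_feats, context)]
```

Here the image features act as queries and the text features as keys and values, so the text modality supplies extra context for each image region without changing the number of image features.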
Pub Date: 2023-03-01, DOI: 10.1109/prmvia58252.2023.00009
Yi Zhu, Xiu Li
Multimedia data has exploded in both quantity and form. Against this background, cross-modal retrieval has become a research hot spot in recent years. We address the image-to-text and text-to-image retrieval problems by proposing a symmetric two-stream pre-training framework. The architecture is based on the CLIP model and consists of a BERT-pretrained text encoder and a Vision Transformer (ViT)-pretrained image encoder. We use not only a cross-modal contrastive loss but also two symmetric uni-modal contrastive losses to train the model in an unsupervised manner. In addition, we propose novel training strategies, including a multi-stage training scheme and an iterative training strategy with clustered hard negative data. Experimental results show that introducing the uni-modal self-supervised branch and losses yields better performance than the CLIP model alone.
Title: Iterative Uni-modal and Cross-modal Clustered Contrastive Learning for Image-text Retrieval
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
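The combination of one cross-modal contrastive loss with two symmetric uni-modal losses can be sketched as follows. This is a simplified illustration assuming an InfoNCE-style loss on L2-normalized embeddings; the temperature value, the weighting factor `lam`, and the use of augmented views (`img_aug`, `txt_aug`) as uni-modal positives are assumptions, not details taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]


def info_nce(anchors, positives, temperature=0.07):
    """Mean cross-entropy where positives[i] is the match for anchors[i]."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    loss = 0.0
    for i, ai in enumerate(a):
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]
    return loss / len(a)


def symmetric(a, b, temperature=0.07):
    """Average the a-to-b and b-to-a contrastive losses."""
    return 0.5 * (info_nce(a, b, temperature) + info_nce(b, a, temperature))


def total_loss(img, txt, img_aug, txt_aug, lam=1.0):
    """Cross-modal term plus two uni-modal terms weighted by lam (assumed weighting)."""
    cross = symmetric(img, txt)
    uni = symmetric(img, img_aug) + symmetric(txt, txt_aug)
    return cross + lam * uni
```

The loss is small when matching image and text embeddings point in the same direction and large when they are mismatched, which is the behavior a retrieval model is trained to exploit.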
Pub Date: 2023-03-01, DOI: 10.1109/prmvia58252.2023.00021
Yuran Sun
In recent years, rapid urban development has been accompanied by rapid growth of landscape sculpture in cities. Because of its important role in regional beautification and humanistic education, landscape sculpture has also become increasingly rich and varied. As computer virtual technology is applied and promoted, and given the diversity and interactivity of landscape culture, the wide application of computer virtual technology in landscape sculpture creation has become an urgent need for the field. The technology offers significant advantages in production speed, cost reduction, and customization, and it optimizes the production process of landscape sculpture. This article discusses and analyzes how computer virtual technology is applied to and intervenes in landscape sculpture creation, with particular attention to its material value and positive significance for landscape sculpture. According to the experimental research, the image frame size of the 3D landscape sculpture in the virtual space of this paper can store more than 48 frames, the real-time performance of the system is well guaranteed, and speed and power output reach their best level without affecting the response of other system functions.
Title: Application Research of Landscape Sculpture Design Aided by Computer Virtual Technology
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
Pub Date: 2023-03-01, DOI: 10.1109/PRMVIA58252.2023.00041
Haowen Sun, Liming Zhao
Intelligent storage has become an important part of many logistics industries, and task assignment for multiple mobile robots is a key component of intelligent storage. In this paper, robot transport cost, no-load operation cost, task completion time, and task assignment balance are used as optimization objectives. An improved genetic algorithm is proposed to optimize task assignment for multiple mobile robots. A mathematical model is established, and the crossover probability and fitness function of the improved genetic algorithm are adjusted adaptively to improve the convergence speed and convergence of the population. Simulation examples show that the improved genetic algorithm converges faster and produces better assignments.
Title: Research on multi-AGV scheduling for intelligent storage based on improved genetic algorithm
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
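A weighted multi-objective fitness and an adaptively adjusted crossover probability, as mentioned in this abstract, might look like the sketch below. The abstract does not give the formulas, so everything here is assumed: the cost combines only total cost, completion time, and load imbalance (no-load cost is left out for brevity), and the crossover rule is a common linear scheme that lowers the probability for above-average parents.

```python
def assignment_cost(assignment, task_cost, n_robots, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of total cost, completion time, and load imbalance.

    assignment[t] is the robot index for task t; task_cost[t][r] is the
    (hypothetical) cost of robot r executing task t.
    """
    load = [0.0] * n_robots
    for task, robot in enumerate(assignment):
        load[robot] += task_cost[task][robot]
    w_total, w_time, w_balance = weights
    return (w_total * sum(load)
            + w_time * max(load)
            + w_balance * (max(load) - min(load)))


def fitness(assignment, task_cost, n_robots):
    """Map cost to a fitness in (0, 1]; lower cost gives higher fitness."""
    return 1.0 / (1.0 + assignment_cost(assignment, task_cost, n_robots))


def adaptive_crossover_prob(f_parent, f_avg, f_max, pc_high=0.9, pc_low=0.6):
    """Use pc_high for below-average parents and shrink linearly toward
    pc_low as the parent's fitness approaches the population maximum."""
    if f_parent < f_avg or f_max == f_avg:
        return pc_high
    return pc_high - (pc_high - pc_low) * (f_parent - f_avg) / (f_max - f_avg)
```

The intent of such a rule is to preserve good individuals late in the search while still recombining weak ones aggressively, which is one way to trade convergence speed against premature convergence.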
Pub Date: 2023-03-01, DOI: 10.1109/PRMVIA58252.2023.00007
Guangyi Liu, Qifan Liu, Wenming Cao
Few-shot learning is a challenging machine learning task that aims to recognize novel classes from a small number of labeled samples. To address this problem, researchers have proposed several methods, with metric-based methods being among the most effective. These methods learn a transferable embedding space for classification by computing the similarity between samples. In this context, Graph Neural Networks (GNNs) have been employed to describe the association between support samples and query samples. However, existing GNN-based methods are limited in how deep they can be made, which restricts their ability to propagate information from the support images to the query images. To overcome this limitation, we propose a deep adaptive residual graph convolution network with deeper layers that better explores the relationship between support and query sets. Additionally, we design a hybrid attention module to learn the metric distributions, which helps alleviate the over-fitting problem that can occur with few samples. Comprehensive experiments on five benchmark datasets show that the proposed method is effective.
Title: Hybrid Attention Deep Adaptive Residual Graph Convolution Network for Few-shot Classification
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
Pub Date: 2023-03-01, DOI: 10.1109/PRMVIA58252.2023.00040
Yingying Mei, Yuanyuan Wang
Twitter text sentiment analysis has important applications in public sentiment monitoring. The results of sentiment analysis based on traditional machine learning models and sentiment dictionaries are often unsatisfactory, and improving the performance of public opinion sentiment analysis has become an important challenge in this field. This paper uses the deep-learning-based BERT model for the language understanding task and compares its performance with traditional approaches. The results show that the BERT model achieves better performance, reaching more than 90%. The model was then used for three-class classification of Twitter comments posted during the COVID-19 outbreak; overall, positive and neutral sentiment dominated. In addition, we analyze word frequency, word clouds, and changes over time in order to gain a comprehensive understanding of the social-emotional state during the epidemic.
Title: Sentiment Analysis of the COVID-19 Epidemic Based on Deep Learning
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
Pub Date: 2023-03-01, DOI: 10.1109/PRMVIA58252.2023.00045
Yangzhao Yu, Bin He, Guangjie Yu, Faxin Zhong
Effective resilience training can prevent early post-traumatic stress disorder, but limitations in emotion induction and recognition make it extremely challenging. This paper therefore presents a fear mental resilience training method that uses virtual reality exposure therapy and introduces two key techniques: construction of virtual scenarios and dynamic weighted decision fusion. First, virtual reality (VR) is used to construct three disaster scenarios that induce different levels of fear, and VR is combined with the Stroop test to improve ecological validity. Then, three different weights are designed by analyzing modal and cross-modal information, and a fear emotion classification model based on dynamic weighted decision fusion is established. Finally, the VR scenarios are combined with exposure therapy to achieve progressive fear resilience training, and the training effect is evaluated according to the individual’s emotional state and Stroop performance. The results demonstrate that the designed VR scenarios effectively induce fear, and that the proposed data fusion method performs dynamic weighted fusion according to the weight design and effectively integrates multimodal information, thereby improving the classification performance of the model. Mental resilience training based on VR and dynamic weighted decision fusion is of great significance for enhancing the mental resilience of the subjects.
Title: Research on fear mental resilience training based on virtual reality and dynamic decision fusion
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
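Decision-level fusion with dynamically computed weights, as described in this abstract, can be sketched in a few lines. The abstract does not say how its three weights are derived, so the sketch assumes a simple rule: each modality's class probabilities are weighted by a fixed prior times a per-sample confidence (one minus normalized entropy). The modality names and the confidence measure are hypothetical.

```python
import math


def confidence(probs):
    """1 minus normalized entropy: 1 for a one-hot vector, 0 for uniform."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return max(0.0, 1.0 - h / math.log(len(probs)))


def fuse_decisions(modal_probs, prior_weights=None):
    """Fuse per-modality class probabilities with per-sample dynamic weights.

    modal_probs maps a modality name to its class-probability list.
    Returns (fused probabilities, normalized weights per modality).
    """
    names = list(modal_probs)
    prior = prior_weights or {n: 1.0 for n in names}
    raw = {n: prior[n] * confidence(modal_probs[n]) for n in names}
    total = sum(raw.values())
    if total == 0:
        # Every modality is maximally uncertain: fall back to equal weights.
        weights = {n: 1.0 / len(names) for n in names}
    else:
        weights = {n: raw[n] / total for n in names}
    n_classes = len(modal_probs[names[0]])
    fused = [
        sum(weights[n] * modal_probs[n][c] for n in names)
        for c in range(n_classes)
    ]
    return fused, weights
```

Because the weights are recomputed for every sample, a modality that is uncertain on one input contributes little there but can still dominate on inputs where it is confident.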
Pub Date: 2023-03-01, DOI: 10.1109/prmvia58252.2023.00051
Kaixin Su, Xianhui Yu, Xiaoying Cai, Yangfeng Lai
Building a campus innovation laboratory characterized by "3D visualization", "informatization" and "digitalization" is the basis for constructing a smart campus. Most current campus visualization systems were built with conventional modeling tools and GIS platforms on a C/S architecture. They tend to be flat, offer a poor level of visualization, and lack consistent integration of model information. It has become increasingly necessary to solve the problem of real-time acquisition and full-space three-dimensional display of innovation laboratories in the Internet environment. This study enhances the interactivity of the innovation laboratory display by using Pano2VR to fuse panorama and video. It also builds indoor and outdoor models of the innovation laboratory, employs layered display technology to show the building floor by floor, integrates the indoor and outdoor views of the campus, and builds a Web-based three-dimensional visualization management system for innovative campuses to support online learning. This method can more effectively address the problems of shared labs and limited offline space that institutions currently face.
Title: University Innovation Lab full-space 3D visualization display system based on 3D real sense and panoramic technology
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)
Pub Date: 2023-03-01, DOI: 10.1109/prmvia58252.2023.00048
Zhilin Zhang, Ting Zhang, Zhaoying Liu, Yujian Li
Fine-grained ship image recognition aims to discriminate between subcategories of ships. Because ship datasets are scarce and the identification task has its own particularities, fine-grained ship recognition is challenging. We designed a part assignment module that assigns features to parts and extracts information about the important parts. We then added the module to the SimCLR contrastive learning framework. The method uses the module to assign the information in the feature map to parts, extract the key information from key regions, and strengthen the ability of contrastive learning to capture that key information, which in turn improves the accuracy of fine-grained classification.
Title: Contrastive Learning with Part Assignment for Fine-grained Ship Image Recognition
Venue: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms (PRMVIA)