Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177467 | pp. 317-321
Research on Product Style Design Based on Genetic Algorithm
Cheng Yang, Lei Kong
Satisfying the differentiated requirements of different user groups, and even of individuals, demands abundant design solutions, which leads to a significant increase in design time and labor cost. This paper proposes an intelligent product design method based on an interactive evolutionary algorithm to solve the problem of generating batches of appearance design schemes, directly obtaining a design-scheme "population" that matches the target style image. First, the stylized parameters of the product are modeled. Then, a neural network is used to establish a mapping model between the style image space and the appearance parameters of the product. Finally, intelligent generation of product design solutions is realized through a collaborative evolution mechanism. The results show that the method greatly eases the evaluation-fatigue problem of evolutionary computation while still obtaining product designs in the target style.
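The pipeline — parameterize the product, learn a style-to-parameter mapping, then evolve a population toward the target style — can be illustrated with a minimal genetic-algorithm loop. The sketch below is an assumption-laden illustration, not the authors' implementation: `style_score` stands in for the trained neural mapping model, and the encoding, population size, and crossover details are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PARAMS = 8          # appearance parameters per design (assumed encoding)
POP, GENS = 40, 100   # population size and generations (illustrative)

def style_score(params, target):
    """Stand-in for the trained neural mapping model: rates how closely a
    parameter vector matches the target style vector (higher is better)."""
    return -np.linalg.norm(params - target)

def evolve(target):
    pop = rng.random((POP, N_PARAMS))
    for _ in range(GENS):
        fitness = np.array([style_score(p, target) for p in pop])
        # Binary tournament selection: each slot keeps the fitter of two picks.
        idx = rng.integers(0, POP, (POP, 2))
        parents = pop[np.where(fitness[idx[:, 0]] > fitness[idx[:, 1]],
                               idx[:, 0], idx[:, 1])]
        # Uniform crossover between shuffled parent pairs, then Gaussian mutation.
        mates = parents[rng.permutation(POP)]
        mask = rng.random((POP, N_PARAMS)) < 0.5
        pop = np.where(mask, parents, mates) + rng.normal(0, 0.02, (POP, N_PARAMS))
    return pop  # the final "population" of design schemes, not a single winner

designs = evolve(target=rng.random(N_PARAMS))
```

Because fitness comes from the learnt mapping rather than a human rating each individual, the loop runs without the user-evaluation step that normally causes fatigue in interactive evolutionary computation.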
{"title":"Research on Product Style Design Based on Genetic Algorithm","authors":"Cheng Yang, Lei Kong","doi":"10.1109/ICIVC50857.2020.9177467","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177467","url":null,"abstract":"To satisfy the differentiated requirements of different user groups and even individuals needs abundant design solutions, leading to a significant increase in design time and labor costs. Through the interactive evolutionary algorithm, a new product intelligent design method is proposed to solve the problem of intelligent generation of batched appearance design schemes, and the design scheme “population” that matches the target style image is directly obtained. First, model the stylized parameters of the product. Then, a neural network is used to establish a mapping model of the style image space and the appearance parameters of the product. Finally, through the collaborative evolution mechanism, the intelligent generation of product design solutions is realized. The results show that this method greatly eases the evaluation fatigue problem in evolutionary calculations while obtaining the target style product design scheme.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"6 1","pages":"317-321"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73029350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177491 | pp. 232-236
Application of CEM Algorithm in the Field of Tunnel Crack Identification
Bingqing Niu, Hongtao Wu, Ying Meng
Cracks are among the most common and serious defects of tunnel linings; they seriously threaten vehicle safety and require regular inspection and measurement. To address the underexposure, uneven illumination, and heavy noise of images collected inside tunnels, the image is first evened out, and then a denoising method combining median filtering and bilateral filtering is applied, which filters out most of the noise while protecting the details of crack edges. Because tunnel linings contain many mechanical scratches and distracting textures, EMAP is used to enhance features after Gabor filtering, and an improved CEM segmentation algorithm is used to overcome the inaccurate segmentation of traditional algorithms and obtain binary crack images. Experimental results show that the proposed algorithm identifies tunnel lining cracks with an accuracy above 92%, verifying its effectiveness.
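The combined denoising step maps directly onto standard OpenCV calls. This is a minimal sketch of the median-plus-bilateral idea only; the equalization stand-in, kernel sizes, filter parameters, and file names are illustrative assumptions, not values from the paper.

```python
import cv2

img = cv2.imread("lining.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Histogram equalization as a stand-in for the paper's illumination-evening step.
even = cv2.equalizeHist(img)

# Median filtering removes impulse noise; bilateral filtering then smooths
# remaining noise while preserving crack-edge detail (parameters illustrative).
den = cv2.medianBlur(even, 3)
den = cv2.bilateralFilter(den, d=9, sigmaColor=50, sigmaSpace=50)

cv2.imwrite("denoised.png", den)
```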
{"title":"Application of CEM Algorithm in the Field of Tunnel Crack Identification","authors":"Bingqing Niu, Hongtao Wu, Ying Meng","doi":"10.1109/ICIVC50857.2020.9177491","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177491","url":null,"abstract":"Cracks are one of the most common and serious diseases of tunnel lining, which seriously threatens the safety of vehicles and requires regular inspection and measurement. In view of the problems of underexposure, uneven illumination and serious noise of the collected images in the tunnel, after the image is evenly processed, a denoising method combined with median filtering and bilateral filtering is constructed, which can filter out a lot of noise on the basis of protecting the details of the crack edge. Due to the large number of mechanical scratches and disturbing textures in the tunnel lining, EMAP is used to enhance features after Gabor filtering, and the improved CEM segmentation algorithm is used to effectively overcome the inaccurate segmentation of traditional algorithms and obtain binary images of cracks. The experimental results show that the proposed algorithm can identify the accuracy of tunnel lining cracks by more than 92%, which verifies the effectiveness of the proposed algorithm.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"1 1","pages":"232-236"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85011439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177448 | pp. 312-316
Real-Time Measurement of Thread Number of Rail Fastener
X. Luo, Ke-bin Jia, Pengyu Liu, Daoquan Xiong, Xiuchen Tian
A real-time method for measuring the thread number of rail fasteners is proposed, achieved by processing thread images. Analysis shows that fastener threads have an elliptical profile, so two parameters, elliptical similarity and elliptical integrity, are proposed, and the thread region is located based on them. Combining the light-intensity distribution of the thread region with the thread positions, the thread number is measured. 200 samples were chosen to verify the validity of the method; the experimental data show that its limit error reaches 0.1985 threads.
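The abstract does not define the two parameters, but an "elliptical similarity" test can be sketched by fitting an ellipse to a candidate contour and measuring how well the contour region matches the fitted ellipse. Everything below — the overlap-based score, the threshold, and the file name — is an assumed illustration, not the paper's formula.

```python
import cv2
import numpy as np

def elliptical_similarity(contour, shape):
    """Assumed score: area overlap (IoU) between the contour region and its
    best-fit ellipse; 1.0 means the region is perfectly elliptical."""
    if len(contour) < 5:                      # cv2.fitEllipse needs >= 5 points
        return 0.0
    ellipse = cv2.fitEllipse(contour)
    m_cnt = np.zeros(shape, np.uint8)
    m_ell = np.zeros(shape, np.uint8)
    cv2.drawContours(m_cnt, [contour], -1, 255, -1)
    cv2.ellipse(m_ell, ellipse, 255, -1)
    inter = cv2.countNonZero(cv2.bitwise_and(m_cnt, m_ell))
    union = cv2.countNonZero(cv2.bitwise_or(m_cnt, m_ell))
    return inter / union if union else 0.0

img = cv2.imread("fastener.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
thread_like = [c for c in contours
               if elliptical_similarity(c, img.shape) > 0.8]  # assumed threshold
```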
{"title":"Real-Time Measurement of Thread Number of Rail Fastener","authors":"X. Luo, Ke-bin Jia, Pengyu Liu, Daoquan Xiong, Xiuchen Tian","doi":"10.1109/ICIVC50857.2020.9177448","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177448","url":null,"abstract":"A real-time measurement method for the thread number of rail fastener is proposed. The measurement is achieved by the processing of thread images. After analysis, the fastener thread has the elliptical feature, so two parameters‐‐‐elliptical similarity and elliptical integrity are proposed. Based on the two parameters, the thread region is located. Combining with the light intensity distribution of thread region and thread positions, the thread number is measured. 200 samples are chosen to verify the validity of this method. Experimental data show that the limitation error of this method can reach 0.1985 threads.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"9 1","pages":"312-316"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85559113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177488 | pp. 19-23
3D Object Recognition Method Based on Improved Canny Edge Detection Algorithm in Augmented Reality
Tianhang Gao, Zhenhao Yang
Augmented reality (AR) superimposes computer-generated virtual objects on real scenes to create an immersive experience, and effective recognition of 3D objects in real scenes is a fundamental requirement of AR. The traditional Canny edge detection algorithm discards important boundary information about the object, which decreases recognition accuracy. In this paper, we improve Canny to propose a novel 3D object recognition method in which median filtering, rather than Gaussian blurring, is adopted to extract the object's contour. An operator based on a wedge template is designed to improve boundary detection at corners. Local feature descriptors are then introduced to describe the local feature points of the object. Finally, SLAM is used to ensure that the virtual model is stably superimposed on the 3D object. Experimental results show that the proposed method retains the edge information of the object well and, combined with local feature descriptors, accurately recognizes 3D objects.
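The substitution of median filtering for Gaussian blurring in the Canny pipeline is easy to sketch; the wedge-template corner operator is not reproduced here, and the thresholds and file name are illustrative.

```python
import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Classical Canny smooths with a Gaussian, which can soften true boundaries:
edges_gauss = cv2.Canny(cv2.GaussianBlur(img, (5, 5), 1.4), 50, 150)

# The paper's variant replaces the Gaussian with a median filter, which
# suppresses noise while keeping contour edges sharp:
edges_median = cv2.Canny(cv2.medianBlur(img, 5), 50, 150)
```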
{"title":"3D Object Recognition Method Based on Improved Canny Edge Detection Algorithm in Augmented Reality","authors":"Tianhang Gao, Zhenhao Yang","doi":"10.1109/ICIVC50857.2020.9177488","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177488","url":null,"abstract":"Augmented reality (AR) superimposes computer-generated virtual objects on real scenes to gain immersive experience. Effective recognition of 3D objects in real scenes is the fundamental requirement in AR. The traditional Canny edge detection algorithm ignores the important boundary information about the object, thus decreasing the recognition accuracy. In this paper, we improve Canny to propose a novel 3D object recognition method, where median filtering is adopted in order to extract the contour of the object instead of Gaussian fuzzy. An operator based on wedge template is designed to improve the boundary detection effect of the corner. Local feature descriptors are then introduced to describe the local feature points of the object. Finally, SLAM technology is conducted to ensure that the virtual model is stably superimposed above the 3D object. The experimental results show that the proposed method is able to retain the edge information of the object well and can be combined with local feature descriptors to accurately recognize 3D objects.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"22 1","pages":"19-23"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87898239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177452 | pp. 204-210
Deep Learning Approach in Gregg Shorthand Word to English-Word Conversion
Dionis A. Padilla, Nicole Kim U. Vitug, Julius Benito S. Marquez
Shorthand, or stenography, is used in a variety of fields, particularly by court stenographers, who must write quickly and accurately to record every detail of a hearing. In the Philippines, stenographers still write shorthand the conventional way, by hand. Transcribing shorthand is time-consuming and sometimes confusing because of the many characters and words involved, and only a stenographer can understand and translate shorthand writing. What if no stenographer is available to decipher a document? A deep learning approach was used to develop an automated Gregg-shorthand-word to English-word converter. The Convolutional Neural Network (CNN) model used was Inception-v3 on the TensorFlow platform, an open-source architecture for object classification. The training set consists of 135 legal terminologies with 120 images per word, for a total of 16,200 images. The trained model achieved a validation accuracy of 91%. For testing, 10 trials per legal terminology were executed, for a total of 1,350 handwritten Gregg shorthand words; the system correctly translated 739 of them, an accuracy of 54.74%.
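Training Inception-v3 as a 135-class shorthand-word classifier follows the standard transfer-learning recipe. This Keras sketch assumes images arranged in per-class folders (the directory path is hypothetical), and the optimizer, epoch count, and frozen-backbone choice are illustrative rather than taken from the paper.

```python
import tensorflow as tf

NUM_CLASSES = 135  # legal terminologies in the paper's dataset

train = tf.keras.utils.image_dataset_from_directory(
    "gregg_words/train", image_size=(299, 299), batch_size=32)  # hypothetical path

base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # keep pretrained features; train only the new head

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # InceptionV3 expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train, epochs=10)
```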
{"title":"Deep Learning Approach in Gregg Shorthand Word to English-Word Conversion","authors":"Dionis A. Padilla, Nicole Kim U. Vitug, Julius Benito S. Marquez","doi":"10.1109/ICIVC50857.2020.9177452","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177452","url":null,"abstract":"Shorthand or Stenography has been used in a variety of fields of practice, particularly by court stenographers. To record every detail of the hearing, a stenographer must write fast and accurate In the Philippines, the stenographers still used the conventional way of writing shorthand, which is by hand. Transcribing shorthand writing is time-consuming and sometimes confusing because of a lot of characters or words to be transcribed. Another problem is that only a stenographer can understand and translate shorthand writing. What if there is no stenographer available to decipher a document? A deep learning approach was used to implement and developed an automated Gregg shorthand word to English-word conversion. The Convolutional Neural Network (CNN) model used was the Inception-v3 in TensorFlow platform, an open-source algorithm used for object classification. The training datasets consist of 135 Legal Terminologies with 120 images per word with a total of 16,200 datasets. The trained model achieved a validation accuracy of 91%. For testing, 10 trials per legal terminology were executed with a total of 1,350 handwritten Gregg Shorthand words tested. The system correctly translated a total of 739 words resulting in 54.74% accuracy.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"35 1","pages":"204-210"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82796768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177460 | pp. 171-175
Remote Sensing Scene Classification with Dual Attention-Aware Network
Yue Gao, Jun Shi, Jun Li, Ruoyu Wang
Remote sensing scene classification is of great importance to remote sensing image analysis. Most existing methods based on Convolutional Neural Networks (CNNs) fail to pick out the crucial information from complex scene content because of intraclass diversity. In this paper, we propose a dual attention-aware network for remote sensing scene classification. Specifically, we use two kinds of attention modules (channel and spatial attention) to explore contextual dependencies along the channel and spatial dimensions respectively. The channel attention module captures channel-wise feature dependencies and thereby exploits significant semantic attention, while the spatial attention module concentrates on attentive spatial locations and thus discovers the discriminative parts of the scene. The outputs of the two attention modules are finally integrated into an attention-aware feature representation that improves classification performance. Experimental results on the RSSCN7 and AID benchmark datasets show the effectiveness and superiority of the proposed method for scene classification in remote sensing imagery.
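The two modules can be sketched as small PyTorch layers: a squeeze-and-excitation-style channel attention and a convolutional spatial attention whose outputs re-weight the backbone feature map and are then fused. This follows the generic channel/spatial attention pattern (as in SE and CBAM), not the paper's exact architecture; the feature-map shape is an assumption.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style sketch: re-weight channels by their global context."""
    def __init__(self, c, r=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(c, c // r), nn.ReLU(),
            nn.Linear(c // r, c), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x).view(x.size(0), -1, 1, 1)   # per-channel weights
        return x * w

class SpatialAttention(nn.Module):
    """Conv over channel-pooled maps to re-weight spatial locations."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.max(1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

feat = torch.randn(2, 512, 7, 7)  # backbone feature map (assumed shape)
fused = ChannelAttention(512)(feat) + SpatialAttention()(feat)  # integrated output
```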
{"title":"Remote Sensing Scene Classification with Dual Attention-Aware Network","authors":"Yue Gao, Jun Shi, Jun Li, Ruoyu Wang","doi":"10.1109/ICIVC50857.2020.9177460","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177460","url":null,"abstract":"Remote sensing scene classification is of great importance to remote sensing image analysis. Most existing methods based on Convolutional Neural Network (CNN) fail to discriminate the crucial information from the complex scene content due to the intraclass diversity. In this paper, we propose a dual attention-aware network for remote sensing scene classification. Specifically, we use two kinds of attention modules (i.e. channel and spatial attentions) to explore the contextual dependencies from the channel and spatial dimensions respectively. The channel attention module intends to capture the channel-wise feature dependencies and further exploit the significant semantic attention. On the other hand, the spatial attention module aims to concentrate the attentive spatial locations and thus discover the discriminative parts inside the scene. The outputs of two attention modules are finally integrated as the attention-aware feature representation for improving classification performance. Experimental results on RSSCN7 and AID benchmark datasets show the effectiveness and superiority of the proposed methods for scene classification in remote sensing imagery.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"109 1","pages":"171-175"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88648111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177449 | pp. 296-302
Air Quality Inference with Deep Convolutional Conditional Random Field
Zhe Luo, You Yu, Daijun Zhang, Shijie Feng, H. Yu, Yongxin Chang, Wei Shen
The Conditional Random Field (CRF) is a discriminative model for time-series data. In this paper, we propose an improved CRF and apply it to the task of air quality inference. Unlike the classical CRF, our linear-chain CRF is built on a deep convolutional neural network trained in a supervised fashion, giving it strong learning ability and fast processing speed on engineering-scale data. Specifically, we model both the state feature function and the state-transition feature function with deep convolutional neural networks, whose parameter space can store the richer feature expressions learned from large amounts of data. For the state-transition feature function of the linear-chain CRF, we additionally let it depend on the input sequence. By modelling and learning both vertex and edge features from data, we obtain a more powerful and more efficient CRF. Experiments on natural language and air quality data show that our CRF achieves higher accuracy.
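At inference time, a linear-chain CRF with network-produced potentials reduces to Viterbi decoding over per-step state scores and input-dependent transition scores. The sketch below assumes the two CNNs' outputs have already been computed and substitutes random arrays for them; the sequence length and label count are invented for the example.

```python
import numpy as np

def viterbi(state_scores, trans_scores):
    """state_scores: (T, K) unary scores from the state-feature network.
    trans_scores: (T-1, K, K) transition scores; making them vary with t
    reflects the paper's input-dependent transition function."""
    T, K = state_scores.shape
    best = state_scores[0].copy()          # best score ending in each state
    back = np.zeros((T - 1, K), dtype=int)  # backpointers
    for t in range(1, T):
        # cand[i, j]: best path ending in state i at t-1, then moving to j.
        cand = best[:, None] + trans_scores[t - 1] + state_scores[t]
        back[t - 1] = cand.argmax(axis=0)
        best = cand.max(axis=0)
    path = [int(best.argmax())]
    for t in range(T - 2, -1, -1):          # walk the backpointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]

rng = np.random.default_rng(1)
T, K = 24, 6  # e.g. 24 time steps, 6 air-quality levels (assumed)
labels = viterbi(rng.normal(size=(T, K)), rng.normal(size=(T - 1, K, K)))
```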
{"title":"Air Quality Inference with Deep Convolutional Conditional Random Field","authors":"Zhe Luo, You Yu, Daijun Zhang, Shijie Feng, H. Yu, Yongxin Chang, Wei Shen","doi":"10.1109/ICIVC50857.2020.9177449","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177449","url":null,"abstract":"Conditional Random Field is a discriminative model for time series data. In this paper, we propose an improved CRF and apply it to the task of air quality inference. Different from the classical CRF, our linear chain CRF is a supervised learning based on the deep convolution neural network, which has a strong learning ability and fast processing speed for the engineering big data. Specifically, we model the state feature function and the state transition feature function with deep convolutional neural network. The parameter space can store more feature expressions learned from a large number of data. For the state transition feature function of linear conditional random field, we add the influence of input sequence on this function. Through the modelling and learning both vertex features and edge features from data, we obtain a more powerful and more efficient CRF. Experiments on natural language and air quality data show our CRF can achieve higher accuracy.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"9 1","pages":"296-302"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74354346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177462 | pp. 199-203
A Method to Improve the Lining Images Quality in Complex Tunnel Scenes
Ying Meng, Hongtao Wu, Bingqing Niu
Lining images collected by tunnel inspection equipment suffer from uneven gray-level distribution, owing to the constraints of the tunnel's site environment and hardware resources. In serious cases the whole image is dim and blurry, and defect features cannot be distinguished from the image background. To solve these problems, this paper proposes an adaptive image-smoothing and high-frequency edge-preserving optimization algorithm for the tunnel lining environment. Compared with traditional image preprocessing and denoising algorithms, the proposed algorithm alleviates the jumps and loss of defect gray-level information caused by gray-level imbalance and noise interference in lining images, and preserves the information of the defect regions of interest in the original image. Extensive experimental comparisons show that the improved algorithm offers large improvements in convergence speed and image quality.
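The paper's specific optimization is not reproduced here, but the general recipe it addresses — flatten the illumination, then smooth while keeping high-frequency defect edges — can be sketched with standard operations. The background-subtraction step, CLAHE substitute, all parameters, and the file name are assumptions for illustration only.

```python
import cv2

img = cv2.imread("lining.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Estimate the slowly varying illumination field with a large morphological
# opening, then subtract it to flatten the gray-level distribution.
bg = cv2.morphologyEx(img, cv2.MORPH_OPEN,
                      cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (51, 51)))
flat = cv2.normalize(cv2.subtract(img, bg), None, 0, 255, cv2.NORM_MINMAX)

# CLAHE lifts dim regions without blowing out bright ones; a bilateral
# filter then smooths noise while preserving high-frequency defect edges.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
out = cv2.bilateralFilter(clahe.apply(flat), 9, 50, 50)
```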
{"title":"A Method to Improve the Lining Images Quality in Complex Tunnel Scenes","authors":"Ying Meng, Hongtao Wu, Bingqing Niu","doi":"10.1109/ICIVC50857.2020.9177462","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177462","url":null,"abstract":"The lining image collected by the tunnel detection equipment will be degraded by the uneven gray distribution of the collected image due to the restriction of the site environment and hardware resources of the tunnel. In serious cases, the whole image is dim and fuzzy, and the disease feature information cannot be identified from the image background. In order to solve these problems, this paper proposes an image adaptive smoothing and image high frequency edge preserving optimization algorithm for tunnel lining environment. Compared with the traditional image preprocessing and image denoising algorithm, this algorithm improves the problem of the disease gray feature information jumping and information loss in the tunnel lining image due to the imbalance of gray level and the noise interference, and ensures the effectiveness of the original image interested in the disease target area information. Compared with a large number of experimental data, the improved algorithm has a great improvement in convergence speed and image quality.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"57 5 1","pages":"199-203"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79830096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177484 | pp. 5-10
Multi-scale Fast Detection of Objects in High Resolution Remote Sensing Images
Longwei Li, Jiangbo Xi, Wandong Jiang, Ming Cong, Ling Han, Yun Yang
Object detection in high resolution (HR) remote sensing images plays an important role in the modern military, national defense, and commercial fields. Because objects vary widely in type and size, rapid detection of multi-scale objects in high resolution remote sensing imagery, which supports subsequent decision-making, is difficult to achieve. This paper proposes a multi-scale fast detection method for remote sensing image objects using the deep learning model YOLOv3. The COCO data model is used to establish a high resolution remote sensing image set based on the NWPU data. The proposed model learns object features automatically, generalizes well, and is robust; it thus overcomes the deficiency of traditional object detection methods, which need manual feature design for each kind of object. Experimental results show that the average detection accuracy for objects of different sizes in high resolution remote sensing images reaches 93.50%, demonstrating that the proposed method achieves rapid detection of multiple types of multi-scale objects.
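Running a trained YOLOv3 model over remote sensing tiles can be sketched with OpenCV's DNN module. The cfg/weights/image file names are hypothetical stand-ins for a model fine-tuned on the NWPU-based set, and the confidence threshold and input size are illustrative.

```python
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3-rs.cfg", "yolov3-rs.weights")  # hypothetical files
img = cv2.imread("tile.png")
h, w = img.shape[:2]

# YOLOv3 predicts at three scales, which is what gives it multi-scale coverage.
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outs = net.forward(net.getUnconnectedOutLayersNames())

boxes = []
for out in outs:                       # one output array per detection scale
    for det in out:                    # det = [cx, cy, bw, bh, obj, class scores...]
        scores = det[5:]
        conf = float(det[4] * scores.max())
        if conf > 0.5:                 # illustrative threshold
            cx, cy, bw, bh = det[:4] * np.array([w, h, w, h])  # relative -> pixels
            boxes.append((int(cx - bw / 2), int(cy - bh / 2),
                          int(bw), int(bh), int(scores.argmax()), conf))
```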
{"title":"Multi-scale Fast Detection of Objects in High Resolution Remote Sensing Images","authors":"Longwei Li, Jiangbo Xi, Wandong Jiang, Ming Cong, Ling Han, Yun Yang","doi":"10.1109/ICIVC50857.2020.9177484","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177484","url":null,"abstract":"Objects detection in high resolution (HR) remote sensing images plays an important role in modern military, national defense, and commercial field. Because of a variety of object types and different sizes, it is difficulty to realize the rapid detection of multi-scale high resolution remote sensing objects, and provides support for succeeding decision making responses. This paper proposes a multi-scale fast detection method of remote sensing image objects with deep learning model, named YOLOv3. The COCO data model is used to establish the high resolution remote sensing image set based on the NWPU data. The proposed model can realize automatic learning of object features, which has good properties on generalization and robustness. It can also overcome the deficiency of traditional object detection method needing manual feature design for different objects. The experimental results show that the average detection accuracy of objects with different sizes in high resolution remote sensing images can reach 93.50%, which demonstrates that the proposed method can achieve rapid detection of different types of multi-scale objects.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"18 1","pages":"5-10"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84101658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-07-01 | DOI: 10.1109/ICIVC50857.2020.9177445 | pp. 251-255
Dictionary Learning for Visual Tracking with Dimensionality Reduction
Jun Wang, Yuanyun Wang, Shaoquan Zhang, Chenguang Xu, Chengzhi Deng
Recently, visual tracking has seen much progress in both accuracy and speed. However, owing to drastic illumination variation, partial occlusion, scale variation, and out-of-plane rotation, visual tracking remains a challenging task, and dealing with complicated appearance variations is an open issue. Existing trackers represent target candidates by a combination of target templates or previous tracking results under some constraints; when a drastic appearance variation occurs, or several appearance variations occur simultaneously, such target representations are not robust. In this paper, we present a target representation based on discriminative dictionary learning, in which a target candidate is represented as a linear combination of atoms in a learnt dictionary. Online dictionary learning captures the appearance variations that arise during tracking, so the learnt dictionary can cover many kinds of appearance variations. Based on this target representation, a novel tracking algorithm is proposed. Extensive experiments on challenging sequences from a popular tracking benchmark demonstrate competitive performance against state-of-the-art trackers.
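The core representation — a candidate expressed as a sparse linear combination of learnt dictionary atoms — can be sketched with scikit-learn. The dimensionality-reduction details, feature size, sparsity level, and the reconstruction-error scoring rule below are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

rng = np.random.default_rng(0)
templates = rng.random((50, 256))   # vectorized target observations so far (assumed)

# Learn a dictionary online from target appearances seen during tracking.
dico = MiniBatchDictionaryLearning(n_components=32, alpha=0.5, random_state=0)
D = dico.fit(templates).components_          # (32 atoms, 256 features)

def candidate_score(patch):
    """Score a target candidate by how well the dictionary reconstructs it:
    small reconstruction error -> the candidate likely is the target."""
    code = sparse_encode(patch[None, :], D, algorithm="omp",
                         n_nonzero_coefs=5)   # sparse combination of atoms
    return -np.linalg.norm(patch - code @ D)

# Pick the best of a set of sampled candidates (random stand-ins here).
best = max((rng.random(256) for _ in range(10)), key=candidate_score)
```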
{"title":"Dictionary Learning for Visual Tracking with Dimensionality Reduction","authors":"Jun Wang, Yuanyun Wang, Shaoquan Zhang, Chenguang Xu, Chengzhi Deng","doi":"10.1109/ICIVC50857.2020.9177445","DOIUrl":"https://doi.org/10.1109/ICIVC50857.2020.9177445","url":null,"abstract":"Recently, visual tracking has seen much progress in either accuracy or speed. However, due to drastic illumination variation, partial occlusion, scale variation and out-of-plane rotation, visual tracking remains a challenging task. Dealing with complicated appearance variations is an open issue in visual tracking. Existing trackers represent target candidates by a combination of target templates or previous tracking results under some constraints. When a drastic appearance variation occurs or some appearance variations occur simultaneously, such target representations are not robust. In this paper, we present a discriminative dictionary learning based target representation. A target candidate is represented via a linear combination of atoms in a learnt dictionary. The online dictionary learning can learn the appearance variations in tracking processing. So, the learnt dictionary can cover all of kinds of appearance variations. Based on this kind of target representation, a novel tracking algorithm is proposed. Extensive experiments on challenging sequences in popular tracking benchmark demonstrate competing tracking performances against some state-of-the-art trackers.","PeriodicalId":6806,"journal":{"name":"2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC)","volume":"7 1","pages":"251-255"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85812513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}