{"title":"Deep object tracking with multi-modal data","authors":"Xuezhi Zhang, Yuan Yuan, Xiaoqiang Lu","doi":"10.1109/CITS.2016.7546403","DOIUrl":null,"url":null,"PeriodicalId":340958,"journal":{"name":"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 International Conference on Computer, Information and Telecommunication Systems (CITS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CITS.2016.7546403","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Object tracking is a challenging problem in computer vision because tracker performance is easily degraded by occlusion, illumination change, background clutter, and scale variation, among other factors. In this paper, we introduce a robust tracking algorithm that fuses information from visible and infrared (IR) images. The proposed algorithm incorporates convolutional feature maps from the visible channel and employs a scale pyramid representation from the IR channel. We estimate the target location by fusing multilayer convolutional feature maps and predict the target scale from the scale pyramid. The pipeline is as follows. First, hierarchical convolutional feature maps are obtained from the visible images using VGG-Nets. Then, the target location is predicted from the maximum response of correlation filters applied to the visible-image feature maps. Finally, the object scale is obtained with a scale pyramid built from the infrared images, where the difference between the target and the background is clear. To verify the performance of the proposed method, we captured six video sequences under different conditions, each containing both a visible channel and an IR channel. We compare our method with ten state-of-the-art tracking algorithms, and the experimental results show the effectiveness of the proposed tracker.
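
The three pipeline steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the filter formulation, the fusion weights, or the scale model, so everything below is an assumption. Random arrays stand in for VGG feature maps (assumed already resized to a common spatial size), the filter is the common closed-form ridge-regression correlation filter in the Fourier domain, the layer weights are arbitrary, and the scale pyramid is scored with plain normalized cross-correlation against a stored IR template. All function names are hypothetical.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    """Desired filter output: a Gaussian peaked at index (0, 0), wrapped circularly."""
    ys = np.minimum(np.arange(h), h - np.arange(h))[:, None]
    xs = np.minimum(np.arange(w), w - np.arange(w))[None, :]
    return np.exp(-(ys ** 2 + xs ** 2) / (2 * sigma ** 2))

def train_filter(feat, label, lam=1e-4):
    """Closed-form multi-channel correlation filter (assumed form). feat: (C, H, W)."""
    F = np.fft.fft2(feat, axes=(-2, -1))
    Y = np.fft.fft2(label)
    energy = np.sum(F * np.conj(F), axis=0).real  # per-frequency feature energy
    return Y * np.conj(F) / (energy + lam)

def response(filt, feat):
    """Correlation response map of one layer's filter on new features."""
    Z = np.fft.fft2(feat, axes=(-2, -1))
    return np.real(np.fft.ifft2(np.sum(filt * Z, axis=0)))

def locate(filters, feats, layer_weights):
    """Fuse per-layer responses by a weighted sum and return the peak as (dy, dx)."""
    fused = sum(lw * response(f, z) for lw, f, z in zip(layer_weights, filters, feats))
    py, px = np.unravel_index(np.argmax(fused), fused.shape)
    h, w = fused.shape
    dy = int(py) if py <= h // 2 else int(py) - h  # unwrap circular index to signed shift
    dx = int(px) if px <= w // 2 else int(px) - w
    return dy, dx

def crop_resize(img, cy, cx, size, out=32):
    """Nearest-neighbour crop of a size x size box centred at (cy, cx), resampled to out x out."""
    offs = (np.arange(out) + 0.5) / out - 0.5
    ys = np.clip(np.round(cy + offs * size).astype(int), 0, img.shape[0] - 1)
    xs = np.clip(np.round(cx + offs * size).astype(int), 0, img.shape[1] - 1)
    return img[np.ix_(ys, xs)]

def ncc(a, b):
    """Normalized cross-correlation between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def estimate_scale(ir, template, cy, cx, base_size, step=1.1, levels=7):
    """Score each pyramid level on the IR frame and return the best scale factor."""
    factors = step ** (np.arange(levels) - levels // 2)
    scores = [ncc(crop_resize(ir, cy, cx, base_size * s, template.shape[0]), template)
              for s in factors]
    return float(factors[int(np.argmax(scores))])
```

In this sketch, `locate` would be called on the visible-channel features of each new frame to update the target centre, and `estimate_scale` would then be called on the IR frame at that centre. Keeping the two steps separate mirrors the split described in the abstract: location from visible features, scale from IR.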