{"title":"样本平衡和iou引导无锚视觉跟踪","authors":"Jueyu Zhu, Yu Qin, Kai Wang, Gao zhi Zeng","doi":"10.5566/ias.2929","DOIUrl":null,"url":null,"abstract":"Siamese network-based visual tracking algorithms have achieved excellent performance in recent years, but challenges such as fast target motion, shape and scale variations have made the tracking extremely difficult. The regression of anchor-free tracking has low computational complexity, strong real-time performance, and is suitable for visual tracking. Based on the anchor-free siamese tracking framework, this paper firstly introduces balance factors and modulation coefficients into the cross-entropy loss function to solve the classification inaccuracy caused by the imbalance between positive and negative samples as well as the imbalance between hard and easy samples during the training process, so that the model focuses more on the positive samples and the hard samples that make the major contribution to the training. Secondly, the intersection over union (IoU) loss function of the regression branch is improved, not only focusing on the IoU between the predicted box and the ground truth box, but also considering the aspect ratios of the two boxes and the minimum bounding box area that accommodate the two, which guides the generation of more accurate regression offsets. The overall loss of classification and regression is iteratively minimized and improves the accuracy and robustness of visual tracking. 
Experiments on four public datasets, OTB2015, VOT2016, UAV123 and GOT-10k, show that the proposed algorithm achieves the state-of-the-art performance.","PeriodicalId":49062,"journal":{"name":"Image Analysis & Stereology","volume":"46 6","pages":"0"},"PeriodicalIF":0.8000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Sample-balanced and IoU-guided anchor-free visual tracking\",\"authors\":\"Jueyu Zhu, Yu Qin, Kai Wang, Gao zhi Zeng\",\"doi\":\"10.5566/ias.2929\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Siamese network-based visual tracking algorithms have achieved excellent performance in recent years, but challenges such as fast target motion, shape and scale variations have made the tracking extremely difficult. The regression of anchor-free tracking has low computational complexity, strong real-time performance, and is suitable for visual tracking. Based on the anchor-free siamese tracking framework, this paper firstly introduces balance factors and modulation coefficients into the cross-entropy loss function to solve the classification inaccuracy caused by the imbalance between positive and negative samples as well as the imbalance between hard and easy samples during the training process, so that the model focuses more on the positive samples and the hard samples that make the major contribution to the training. Secondly, the intersection over union (IoU) loss function of the regression branch is improved, not only focusing on the IoU between the predicted box and the ground truth box, but also considering the aspect ratios of the two boxes and the minimum bounding box area that accommodate the two, which guides the generation of more accurate regression offsets. The overall loss of classification and regression is iteratively minimized and improves the accuracy and robustness of visual tracking. 
Experiments on four public datasets, OTB2015, VOT2016, UAV123 and GOT-10k, show that the proposed algorithm achieves the state-of-the-art performance.\",\"PeriodicalId\":49062,\"journal\":{\"name\":\"Image Analysis & Stereology\",\"volume\":\"46 6\",\"pages\":\"0\"},\"PeriodicalIF\":0.8000,\"publicationDate\":\"2023-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Image Analysis & Stereology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5566/ias.2929\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Image Analysis & Stereology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5566/ias.2929","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY","Score":null,"Total":0}
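The "balance factor" and "modulation coefficient" described in the abstract match the general form of a focal-loss-style cross-entropy, where a factor α re-weights positive versus negative samples and an exponent γ down-weights easy samples. The paper's exact formulation and hyperparameter values are not given in this abstract, so the following is only an illustrative sketch under that assumption:

```python
import math

def balanced_focal_ce(p, is_positive, alpha=0.25, gamma=2.0):
    """Cross-entropy with a balance factor (alpha) for the positive/negative
    imbalance and a modulation coefficient (gamma) for the hard/easy
    imbalance. p is the predicted foreground probability in (0, 1)."""
    if is_positive:
        # Hard positives (low p) get a large (1 - p)**gamma modulation;
        # easy positives (p near 1) are down-weighted toward zero.
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    # For negatives, easy samples (low p) are suppressed by p**gamma.
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
```

With these example values, a hard positive (p = 0.1) incurs a far larger loss than an easy positive (p = 0.9), which is the behavior the abstract attributes to the modulation coefficient.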
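The regression loss described above (IoU plus the aspect ratios of the two boxes and the area of their minimum enclosing box) resembles the penalties used in GIoU- and CIoU-style losses. The sketch below combines those two standard penalty terms as an illustration; it is not the paper's exact formula. Boxes are assumed to be (x1, y1, x2, y2) with x2 > x1 and y2 > y1:

```python
import math

def iou_guided_loss(pred, gt):
    """IoU loss extended with a minimum-enclosing-box penalty (GIoU-style)
    and an aspect-ratio consistency penalty (CIoU-style v term)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Intersection and union areas.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    union = area_p + area_g - inter
    iou = inter / union

    # Penalty from the minimum bounding box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    enclose = cw * ch
    giou_penalty = (enclose - union) / enclose

    # Aspect-ratio consistency penalty between the two boxes.
    v = (4.0 / math.pi ** 2) * (
        math.atan((gx2 - gx1) / (gy2 - gy1))
        - math.atan((px2 - px1) / (py2 - py1))
    ) ** 2

    return 1.0 - iou + giou_penalty + v
```

A perfect prediction gives zero loss, and both extra terms keep the gradient informative when the boxes barely overlap or differ in shape, which is what "guides the generation of more accurate regression offsets" refers to.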
About the journal:
Image Analysis and Stereology is the official journal of the International Society for Stereology & Image Analysis. It promotes the exchange of scientific, technical, organizational and other information on the quantitative analysis of data having a geometrical structure, including stereology, differential geometry, image analysis, image processing, mathematical morphology, stochastic geometry, statistics, pattern recognition, and related topics. The fields of application are not restricted and range from biomedicine, materials sciences and physics to geology and geography.