BoostTrack: boosting the similarity measure and detection confidence for improved multiple object tracking

IF 2.3; Zone 4, Computer Science; Q3 Computer Science, Artificial Intelligence; Machine Vision and Applications; Pub Date: 2024-04-12; DOI: 10.1007/s00138-024-01531-5

Vukasin D. Stanojevic, Branimir T. Todorovic



Abstract

Handling unreliable detections and avoiding identity switches are crucial for the success of multiple object tracking (MOT). Ideally, an MOT algorithm should use true positive detections only, work in real time, and produce no identity switches. To approach this ideal, we present BoostTrack, a simple yet effective tracking-by-detection MOT method that uses several lightweight plug-and-play additions to improve MOT performance. We design a detection-tracklet confidence score and use it to scale the similarity measure, implicitly favouring pairs with high detection confidence and high tracklet confidence in one-stage association. To reduce the ambiguity arising from using intersection over union (IoU), we propose novel Mahalanobis distance and shape similarity additions to boost the overall similarity measure. To utilize bounding boxes with low detection scores in one-stage association, we propose to boost the confidence scores of two groups of detections: those we assume to correspond to an existing tracked object, and those we assume to correspond to a previously undetected object. The proposed additions are orthogonal to existing approaches, and we combine them with interpolation and camera motion compensation to achieve results comparable to the standard benchmark solutions while retaining real-time execution speed. When combined with appearance similarity, our method outperforms all standard benchmark solutions on the MOT17 and MOT20 datasets. It ranks first among online methods in the HOTA metric in the MOT Challenge on the MOT17 and MOT20 test sets. We make our code available at https://github.com/vukasin-stanojevic/BoostTrack.
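To make the idea of a confidence-scaled similarity concrete, the following is a minimal Python sketch. It is not the authors' implementation and does not reproduce the paper's formulas: the function names, the exponential form of the shape term, and the `shape_weight` parameter are all assumptions chosen for illustration, and the Mahalanobis distance term is left out. Boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shape_similarity(a, b):
    """Hypothetical shape term: 1.0 for equal width and height,
    decaying as the relative size difference grows."""
    wa, ha = a[2] - a[0], a[3] - a[1]
    wb, hb = b[2] - b[0], b[3] - b[1]
    eps = 1e-9  # guards against division by zero for degenerate boxes
    dw = abs(wa - wb) / max(wa, wb, eps)
    dh = abs(ha - hb) / max(ha, hb, eps)
    return math.exp(-(dw + dh))


def boosted_similarity(det_box, det_conf, trk_box, trk_conf, shape_weight=0.25):
    """Scale IoU by a detection-tracklet confidence product and add a
    confidence-weighted shape term (illustrative assumption only)."""
    pair_conf = det_conf * trk_conf
    base = pair_conf * iou(det_box, trk_box)
    shape = shape_weight * pair_conf * shape_similarity(det_box, trk_box)
    return base + shape


if __name__ == "__main__":
    track = (0.0, 0.0, 10.0, 10.0)
    detection = (5.0, 0.0, 15.0, 10.0)
    # The same geometry scores higher when both confidences are high.
    print(boosted_similarity(detection, 0.9, track, 0.9))
    print(boosted_similarity(detection, 0.3, track, 0.9))
```

In this sketch a low-confidence detection contributes less to the similarity matrix than a high-confidence one with identical geometry, which is one simple way to favour high-confidence pairs without a separate second association stage.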




Source journal

Machine Vision and Applications, Engineering and Technology: Engineering, Electrical and Electronic

CiteScore

6.30

Self-citation rate

3.00%

Articles published

Review time

8.7 months

About the journal: Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submittals in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision are all within the scope of the journal. Particular emphasis is placed on engineering and technology aspects of image processing and computer vision. The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.