Title: FastTrack: A Highly Efficient and Generic GPU-Based Multi-object Tracking Method with Parallel Kalman Filter
Authors: Chongwei Liu, Haojie Li, Zhihui Wang
Journal: International Journal of Computer Vision (JCR Q1, Computer Science, Artificial Intelligence; impact factor 11.6)
Publication type: Journal Article
Publication date: 2023-11-21
DOI: 10.1007/s11263-023-01933-4 (https://doi.org/10.1007/s11263-023-01933-4)
Citations: 0
Abstract
The Kalman Filter based on a uniform-motion assumption has been a crucial motion estimation module in trackers. However, it has limitations in non-uniform motion modeling and computational efficiency when applied to large-scale object tracking scenarios. To address these issues, we propose a novel Parallel Kalman Filter (PKF), which simplifies conventional state variables to reduce computational load and enable effective non-uniform modeling. Within PKF, we propose a non-uniform formulation that models non-uniform motion as uniform motion by transforming the time interval \(\Delta t\) from a constant into a variable related to displacement, and we incorporate a deceleration strategy into the control-input model of the formulation to tackle the escape problem in Multi-Object Tracking (MOT). We also propose an innovative parallel computation method that transposes the computation graph of PKF from the matrix form to the quadratic form, significantly reducing the computational load and facilitating parallel computation between distinct tracklets via CUDA, thus making the time consumption of PKF independent of the input tracklet scale, i.e., O(1). Based on PKF, we introduce Fast, the first fully GPU-based tracker paradigm, which significantly enhances tracking efficiency in large-scale object tracking scenarios, and FastTrack, the MOT system composed of Fast and a general detector, offering high efficiency and generality. Within FastTrack, Fast requires only bounding boxes with scores and class ids for a single association during one iteration, and introduces innovative GPU-based tracking modules, such as an efficient GPU 2D-array data structure for tracklet management, a novel cost matrix implemented in CUDA for automatic association priority determination, a new association metric called HIoU, and the first implementation of the Auction Algorithm in CUDA for the asymmetric assignment problem.
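To make the assignment step concrete, the following is a minimal sequential Python sketch of a forward auction for the asymmetric case, where there are fewer rows than columns (say, fewer tracklets than detections) and only the rows must all be matched. It is not the paper's CUDA implementation: the function name `auction_assign`, the `benefit` matrix (for example an IoU-like similarity, where higher is better), and the bid increment `eps` are all assumptions made for illustration.

```python
def auction_assign(benefit, eps=1e-3):
    """Forward auction: give every row a distinct column, maximizing total benefit.

    Hypothetical helper for illustration. `benefit[i][j]` is the value of
    matching row i to column j; requires rows <= columns.
    Returns a list where entry i is the column assigned to row i.
    """
    n_rows, n_cols = len(benefit), len(benefit[0])
    if n_rows > n_cols:
        raise ValueError("expected at most as many rows as columns")

    prices = [0.0] * n_cols      # all columns start at the same price
    owner = [-1] * n_cols        # row currently holding each column
    assigned = [-1] * n_rows     # column currently held by each row
    unassigned = list(range(n_rows))

    while unassigned:
        i = unassigned.pop()
        # Net value of each column for row i at the current prices.
        values = [benefit[i][j] - prices[j] for j in range(n_cols)]
        best = max(range(n_cols), key=values.__getitem__)
        second = max(
            (v for j, v in enumerate(values) if j != best),
            default=values[best],
        )
        # Bid: raise the price by the margin over the runner-up, plus eps.
        prices[best] += values[best] - second + eps

        # Evict the previous holder, if any, and let it bid again later.
        prev = owner[best]
        if prev != -1:
            assigned[prev] = -1
            unassigned.append(prev)
        owner[best] = i
        assigned[i] = best

    return assigned


# Two tracklets, three detections (made-up similarity scores).
matches = auction_assign([[0.9, 0.1, 0.3], [0.8, 0.7, 0.2]])
```

Starting every price at zero keeps unassigned columns at the lowest price, which is the usual condition under which a forward-only auction stays valid when there are more columns than rows. A GPU version would typically let all unassigned rows bid at once rather than one at a time.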
Experiments show that the average time per iteration of PKF on a GTX 1080Ti is only 0.2 ms; Fast can achieve a real-time efficiency of 250 FPS on a GTX 1080Ti and 42 FPS even on a Jetson AGX Xavier, outperforming conventional CPU-based trackers. Concurrently, FastTrack demonstrates state-of-the-art performance on four public benchmarks, specifically MOT17, MOT20, KITTI, and DanceTrack, and attains the highest speed in the large-scale tracking scenarios of MOT20.
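The idea of replacing matrix operations with per-tracklet scalar arithmetic can be illustrated with a small Python sketch. The abstract does not give PKF's state variables or its rule for \(\Delta t\), so this uses a textbook constant-velocity Kalman filter for one coordinate, with the 2x2 matrix products expanded by hand. The names `kf_step` and `step_all` and the noise parameters `q_pos`, `q_vel`, and `r` are assumptions, and `dt` is a plain argument rather than the displacement-dependent variable the paper describes.

```python
def kf_step(state, z, dt=1.0, q_pos=1e-2, q_vel=1e-2, r=1e-1):
    """One predict+update step for one coordinate of one tracklet.

    state = (x, v, p00, p01, p11): position, velocity, and the three
    unique entries of the symmetric 2x2 covariance matrix.
    z is the measured position. All names here are illustrative.
    """
    x, v, p00, p01, p11 = state

    # Predict with F = [[1, dt], [0, 1]], expanded into scalars.
    x_pred = x + dt * v
    p00_pred = p00 + 2.0 * dt * p01 + dt * dt * p11 + q_pos
    p01_pred = p01 + dt * p11
    p11_pred = p11 + q_vel

    # Update with H = [1, 0]: the innovation covariance is a scalar,
    # so no matrix inverse is needed.
    s = p00_pred + r
    k0 = p00_pred / s
    k1 = p01_pred / s
    y = z - x_pred

    return (
        x_pred + k0 * y,
        v + k1 * y,
        (1.0 - k0) * p00_pred,
        (1.0 - k0) * p01_pred,
        p11_pred - k1 * p01_pred,
    )


def step_all(states, measurements, dt=1.0):
    """Apply kf_step to every tracklet; the iterations share no data."""
    return [kf_step(s, z, dt) for s, z in zip(states, measurements)]


# Two tracklets with made-up initial states and measurements.
states = [(0.0, 1.0, 1.0, 0.0, 1.0), (5.0, -0.5, 1.0, 0.0, 1.0)]
states = step_all(states, measurements=[1.2, 4.4])
```

Because each call to `kf_step` touches only its own tracklet's scalars, the loop in `step_all` could in principle be mapped to one GPU thread per tracklet, which is the kind of independence the abstract relies on for its O(1) claim.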
About the journal:
The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs.
Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. These articles showcase substantial progress in computer vision.
Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. They provide a quicker means for sharing new findings with the computer vision community.
Survey articles, comprising up to 30 pages, offer critical evaluations of the current state of the art in computer vision or tutorial presentations of relevant topics. These articles provide comprehensive and insightful overviews of specific subject areas.
In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures. These contributions serve to complement the technical content and provide valuable perspectives.
The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software. This additional material enhances the understanding and reproducibility of the published research.
Overall, the International Journal of Computer Vision is a comprehensive publication that caters to researchers in this rapidly growing field. It covers a range of article types, offers additional online resources, and facilitates the dissemination of impactful research.