Online Multiple Object Tracking Using Min-Cost Flow on Temporal Window for Autonomous Driving

World Electric Vehicle Journal (IF 2.6, Q2, Engineering, Electrical & Electronic). Pub Date: 2023-09-02. DOI: 10.3390/wevj14090243

Hongjian Wei, Yingping Huang, Qian Zhang, Zhiyang Guo

Abstract

Multiple object tracking (MOT) is a core technology for environment perception in autonomous driving and has attracted considerable research attention. Drawing on the advantages of batch global optimization, we present a novel online MOT framework for autonomous driving that consists of feature extraction and data association on a temporal window. In the feature extraction stage, we design a three-channel appearance feature extraction network based on metric learning, using ResNet50 as the backbone network and the triplet loss function, and we employ a Kalman filter with a constant-acceleration motion model to refine and predict the object bounding boxes, so as to obtain reliable and discriminative object representation features. For data association, to reduce ID switches, we introduce min-cost flow global association within a temporal window composed of consecutive frames. The trajectories within the temporal window are divided into two categories, active and inactive, and the appearance and motion affinities between each category of trajectories and the detections are calculated separately. From these affinities a sparse affinity network is constructed, and data association is achieved by solving the min-cost flow problem on this network. Qualitative experimental results on the KITTI MOT public benchmark dataset and on real-world campus scenario sequences validate the effectiveness and robustness of our method. Compared with comparable vision-based MOT methods, quantitative experimental results demonstrate that our method is competitive in terms of higher order tracking accuracy (HOTA), association accuracy, and ID switches.
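To make the data association step more concrete, the following is a minimal sketch of min-cost flow matching between trajectories and detections. It is not the authors' implementation: the abstract does not give the graph layout, the cost definition, or how active and inactive trajectories are treated within the temporal window. The sketch assumes a single combined affinity in [0, 1] per trajectory-detection pair, an edge cost of minus the affinity, and a hypothetical threshold `min_affinity` that drops weak pairs to keep the network sparse. The function name `min_cost_flow_association` is likewise illustrative.

```python
def min_cost_flow_association(affinity, min_affinity=0.3):
    """Match trajectories (rows) to detections (columns) via min-cost flow.

    Assumptions (not from the paper): affinity[i][j] lies in [0, 1], the edge
    cost is -affinity, and pairs below min_affinity get no edge at all.
    Returns a dict {trajectory_index: detection_index}.
    """
    n_trk = len(affinity)
    n_det = len(affinity[0]) if n_trk else 0
    source, sink = 0, n_trk + n_det + 1
    n_nodes = sink + 1
    edges = []                      # residual edges [to, capacity, cost]; k and k ^ 1 are paired
    graph = [[] for _ in range(n_nodes)]

    def add_edge(u, v, cost):
        graph[u].append(len(edges))
        edges.append([v, 1, cost])      # forward edge, unit capacity
        graph[v].append(len(edges))
        edges.append([u, 0, -cost])     # reverse edge used to undo a choice

    for i in range(n_trk):
        add_edge(source, 1 + i, 0.0)
        for j in range(n_det):
            if affinity[i][j] >= min_affinity:   # sparse: skip weak pairs
                add_edge(1 + i, 1 + n_trk + j, -affinity[i][j])
    for j in range(n_det):
        add_edge(1 + n_trk + j, sink, 0.0)

    inf = float("inf")
    while True:
        # Bellman-Ford shortest path in the residual graph (costs may be negative).
        dist = [inf] * n_nodes
        prev_edge = [-1] * n_nodes
        dist[source] = 0.0
        for _ in range(n_nodes - 1):
            updated = False
            for u in range(n_nodes):
                if dist[u] == inf:
                    continue
                for k in graph[u]:
                    v, cap, cost = edges[k]
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev_edge[v] = k
                        updated = True
            if not updated:
                break
        if dist[sink] >= 0:         # no remaining path lowers the total cost
            break
        v = sink                    # push one unit of flow along the path
        while v != source:
            k = prev_edge[v]
            edges[k][1] -= 1
            edges[k ^ 1][1] += 1
            v = edges[k ^ 1][0]

    matches = {}
    for k in range(0, len(edges), 2):            # forward edges only
        u, v = edges[k ^ 1][0], edges[k][0]
        if 1 <= u <= n_trk and n_trk < v < sink and edges[k][1] == 0:
            matches[u - 1] = v - 1 - n_trk
    return matches


# Two trajectories, three detections (made-up affinities for illustration).
example = [[0.9, 0.4, 0.0],
           [0.8, 0.1, 0.0]]
print(min_cost_flow_association(example))  # {0: 1, 1: 0}
```

In this toy example a greedy matcher would pair trajectory 0 with detection 0 (affinity 0.9) and leave trajectory 1 unmatched, whereas the flow solution reroutes trajectory 0 to detection 1 so that both trajectories are matched with a higher total affinity. Detection 2 stays unmatched and could seed a new trajectory.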
