STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking

Jianbo Ma, Chuanming Tang, Fei Wu, Can Zhao, Jianlin Zhang, Zhiyong Xu
{"title":"STCMOT:基于无人机的多目标跟踪时空聚合学习","authors":"Jianbo Ma, Chuanming Tang, Fei Wu, Can Zhao, Jianlin Zhang, Zhiyong Xu","doi":"arxiv-2409.11234","DOIUrl":null,"url":null,"abstract":"Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is\nimportant for diverse applications in computer vision. Current MOT trackers\nrely on accurate object detection results and precise matching of target\nreidentification (ReID). These methods focus on optimizing target spatial\nattributes while overlooking temporal cues in modelling object relationships,\nespecially for challenging tracking conditions such as object deformation and\nblurring, etc. To address the above-mentioned issues, we propose a novel\nSpatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT), which\nutilizes historical embedding features to model the representation of ReID and\ndetection features in a sequential order. Concretely, a temporal embedding\nboosting module is introduced to enhance the discriminability of individual\nembedding based on adjacent frame cooperation. While the trajectory embedding\nis then propagated by a temporal detection refinement module to mine salient\ntarget locations in the temporal field. Extensive experiments on the\nVisDrone2019 and UAVDT datasets demonstrate our STCMOT sets a new\nstate-of-the-art performance in MOTA and IDF1 metrics. The source codes are\nreleased at https://github.com/ydhcg-BoBo/STCMOT.","PeriodicalId":501130,"journal":{"name":"arXiv - CS - Computer Vision and Pattern Recognition","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking\",\"authors\":\"Jianbo Ma, Chuanming Tang, Fei Wu, Can Zhao, Jianlin Zhang, Zhiyong Xu\",\"doi\":\"arxiv-2409.11234\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is\\nimportant for diverse applications in computer vision. Current MOT trackers\\nrely on accurate object detection results and precise matching of target\\nreidentification (ReID). These methods focus on optimizing target spatial\\nattributes while overlooking temporal cues in modelling object relationships,\\nespecially for challenging tracking conditions such as object deformation and\\nblurring, etc. To address the above-mentioned issues, we propose a novel\\nSpatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT), which\\nutilizes historical embedding features to model the representation of ReID and\\ndetection features in a sequential order. Concretely, a temporal embedding\\nboosting module is introduced to enhance the discriminability of individual\\nembedding based on adjacent frame cooperation. While the trajectory embedding\\nis then propagated by a temporal detection refinement module to mine salient\\ntarget locations in the temporal field. Extensive experiments on the\\nVisDrone2019 and UAVDT datasets demonstrate our STCMOT sets a new\\nstate-of-the-art performance in MOTA and IDF1 metrics. 
The source codes are\\nreleased at https://github.com/ydhcg-BoBo/STCMOT.\",\"PeriodicalId\":501130,\"journal\":{\"name\":\"arXiv - CS - Computer Vision and Pattern Recognition\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Computer Vision and Pattern Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.11234\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11234","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Multiple object tracking (MOT) in unmanned aerial vehicle (UAV) videos is important for diverse applications in computer vision. Current MOT trackers rely on accurate object detection results and precise re-identification (ReID) matching. These methods focus on optimizing targets' spatial attributes while overlooking temporal cues when modelling object relationships, especially under challenging tracking conditions such as object deformation and blurring. To address these issues, we propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT), which uses historical embedding features to model the ReID and detection representations in sequential order. Concretely, a temporal embedding boosting module is introduced to enhance the discriminability of individual embeddings through adjacent-frame cooperation. The trajectory embedding is then propagated by a temporal detection refinement module to mine salient target locations in the temporal domain. Extensive experiments on the VisDrone2019 and UAVDT datasets demonstrate that STCMOT achieves new state-of-the-art performance on the MOTA and IDF1 metrics. The source code is released at https://github.com/ydhcg-BoBo/STCMOT.
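The abstract describes two temporal components but gives no implementation details. Below is a minimal PyTorch-style sketch, under stated assumptions, of what adjacent-frame embedding aggregation and trajectory-guided heatmap refinement might look like; the module names, tensor shapes, residual fusion, and the single learned blend weight are illustrative guesses, not the authors' code.

```python
# Hypothetical sketch in the spirit of STCMOT's two temporal modules.
# Shapes, layer choices, and fusion scheme are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalEmbeddingBoosting(nn.Module):
    """Blend the current ReID embedding map with the previous frame's map
    so each identity embedding also reflects adjacent-frame appearance."""

    def __init__(self, channels: int = 128):
        super().__init__()
        # 1x1 conv learns how to combine current and historical embeddings
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, emb_t: torch.Tensor, emb_prev: torch.Tensor) -> torch.Tensor:
        # emb_t, emb_prev: (B, C, H, W) embedding maps of frames t and t-1
        fused = self.fuse(torch.cat([emb_t, emb_prev], dim=1))
        # residual connection keeps the current-frame appearance dominant
        return F.normalize(emb_t + fused, dim=1)


class TemporalDetectionRefinement(nn.Module):
    """Propagate trajectory embeddings into the detection branch: correlate
    each active track's embedding with the current embedding map and use the
    response to sharpen the center heatmap around likely target locations."""

    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.5))  # learned blend weight

    def forward(self, heatmap: torch.Tensor, emb_map: torch.Tensor,
                traj_embs: torch.Tensor) -> torch.Tensor:
        # heatmap:   (B, 1, H, W) raw center heatmap from the detector head
        # emb_map:   (B, C, H, W) current (boosted) embedding map
        # traj_embs: (N, C) embeddings of active trajectories
        B, C, H, W = emb_map.shape
        flat = F.normalize(emb_map, dim=1).view(B, C, H * W)          # (B, C, HW)
        sim = torch.einsum("nc,bcl->bnl",
                           F.normalize(traj_embs, dim=1), flat)       # (B, N, HW)
        prior = sim.max(dim=1).values.view(B, 1, H, W)                # best-matching track response
        # raise detection confidence where historical trajectories respond strongly
        return torch.sigmoid(heatmap + self.alpha * prior.relu())
```

In a full tracker these modules would sit between a shared backbone and the detection/ReID heads, with `traj_embs` maintained by the online association step; the particular fusion used in the paper should be taken from the released source code rather than this sketch.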