Robust Multi-Object Tracking with pseudo-information guided motion and enhanced semantic vision

Expert Systems with Applications (IF 7.5; CAS Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence) Pub Date: 2025-05-10 Epub Date: 2025-02-17 DOI: 10.1016/j.eswa.2025.126846
Yukuan Zhang, Shengsheng Wang, Zihao Fu, Limin Zhao, Jiarui Zhao
Volume 273, Article 126846 (Journal Article). Full text: https://www.sciencedirect.com/science/article/pii/S0957417425004683
Citations: 0

Abstract

The key to Multi-Object Tracking is to differentiate multiple instances in a video sequence and maintain the continuity of their identities. To achieve this goal, most methods model the motion or appearance cues of instances. However, in complex scenarios such as camera motion, occlusion, and crowding, trackers often lack discriminative capability. In this paper, we propose a robust tracker, named RccTrack, that combines motion cues guided by pseudo-information with enhanced visual cues to overcome these issues. Specifically, pseudo-observation information is constructed to guide trajectory localization and generate interference-resistant trajectories, and pseudo-state information is constructed to guide the calculation of inter-frame target motion directions. This pseudo-information enhances the discriminative power of the motion cues. For visual cues, a semantic fusion network is designed to extract strongly discriminative appearance information and store it in our hierarchical fusion embedding clusters, thus enhancing the discriminative power of the visual cues. In addition, we design a cascade matching method that performs association based on trajectory length information to distinguish confusing targets. In the matching stage, the two cues are combined to enhance the discriminative power of the tracker. Experimental results demonstrate that RccTrack achieves state-of-the-art performance on the MOT16, MOT17, MOT20, and DanceTrack benchmarks.
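
The matching stage described above (fusing a motion cue with an appearance cue, then associating tracks in a cascade ordered by trajectory length) can be illustrated with a minimal sketch. This is not the RccTrack implementation: the abstract does not specify the cost functions, the weights, or the cascade order, so everything here is an assumption made for illustration. In particular, the motion cue is approximated by plain IoU, the appearance cue by cosine distance, longer trajectories are assumed to be matched first, and the names `fused_cost`, `cascade_match`, `w_motion`, `length_split`, and `max_cost` are hypothetical.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def fused_cost(track, det, w_motion=0.5):
    """Weighted sum of a motion cost and an appearance cost (assumed form)."""
    motion = 1.0 - iou(track["box"], det["box"])
    appearance = cosine_distance(track["emb"], det["emb"])
    return w_motion * motion + (1.0 - w_motion) * appearance


def cascade_match(tracks, dets, length_split=30, max_cost=0.7):
    """Match long trajectories first, then short ones, on leftover detections.

    Returns (matches, unmatched) where matches maps track index to
    detection index and unmatched is a sorted list of detection indices.
    """
    free = set(range(len(dets)))
    matches = {}
    long_ids = [i for i, t in enumerate(tracks) if t["length"] >= length_split]
    short_ids = [i for i, t in enumerate(tracks) if t["length"] < length_split]
    for tier in (long_ids, short_ids):
        # Greedy assignment: visit candidate pairs from lowest cost upward.
        candidates = sorted(
            (fused_cost(tracks[i], dets[j]), i, j) for i in tier for j in free
        )
        for cost, i, j in candidates:
            if cost > max_cost:
                break
            if i not in matches and j in free:
                matches[i] = j
                free.discard(j)
    return matches, sorted(free)


if __name__ == "__main__":
    tracks = [
        {"box": (0, 0, 10, 10), "emb": (1.0, 0.0), "length": 50},
        {"box": (20, 0, 30, 10), "emb": (0.0, 1.0), "length": 3},
    ]
    dets = [
        {"box": (21, 0, 31, 10), "emb": (0.0, 1.0)},
        {"box": (1, 0, 11, 10), "emb": (1.0, 0.0)},
        {"box": (100, 100, 110, 110), "emb": (1.0, 1.0)},
    ]
    print(cascade_match(tracks, dets))
```

Greedy assignment keeps the sketch dependency-free; an optimal assignment solver such as `scipy.optimize.linear_sum_assignment` could be substituted within each tier. Detections left unmatched after both tiers would typically seed new trajectories.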
Source journal
Expert Systems with Applications (Engineering Technology - Engineering: Electrical & Electronic)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.