Robust Multi-Object Tracking with pseudo-information guided motion and enhanced semantic vision
Yukuan Zhang, Shengsheng Wang, Zihao Fu, Limin Zhao, Jiarui Zhao
Expert Systems with Applications, Volume 273, Article 126846 (published 2025-02-17)
DOI: 10.1016/j.eswa.2025.126846
URL: https://www.sciencedirect.com/science/article/pii/S0957417425004683
Impact factor: 7.5; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
The key to Multi-Object Tracking is to differentiate multiple instances in a video sequence and maintain their identity continuity. To achieve this goal, most methods model the motion or appearance cues of instances. However, when faced with complex scenarios such as camera motion, occlusion, and crowding, trackers often lack discriminative capability. In this paper, we propose a robust tracker, named RccTrack, that combines motion cues guided by pseudo-information with enhanced visual cues to overcome these issues. Specifically, pseudo-observation information is constructed to guide trajectory localization and generate interference-resistant trajectories. Pseudo-state information is constructed to guide the calculation of inter-frame target motion directions. This pseudo-information is used to enhance the discriminative power of the motion cues. For visual cues, a semantic fusion network is designed to extract strongly discriminative appearance information and store it in our hierarchical fusion embedding clusters, thus enhancing the discriminative power of the visual cues. In addition, we design a cascade matching method, which performs the association task based on trajectory length information to distinguish confusing targets. In the matching stage, the two cues mentioned above are combined to enhance the discriminative power of the tracker. Experimental results demonstrate that RccTrack achieves state-of-the-art performance on the MOT16, MOT17, MOT20, and DanceTrack benchmarks.
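The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of fusing a motion cue with an appearance cue and matching in stages ordered by trajectory length. It is not RccTrack's actual method. Everything in it is an assumption made for illustration: the names `Track`, `Detection`, `fused_cost`, and `cascade_match`, the use of IoU as the motion cue and cosine distance as the appearance cue, the equal weighting, the length threshold, the choice to match longer trajectories first, and the greedy assignment (used in place of an optimal assignment solver to keep the example dependency-free).

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Track:
    track_id: int
    length: int                 # frames this trajectory has existed (hypothetical field)
    box: Box                    # predicted box for the current frame
    embedding: Sequence[float]  # appearance feature


@dataclass
class Detection:
    box: Box
    embedding: Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def fused_cost(track: Track, det: Detection, weight: float = 0.5) -> float:
    """Weighted sum of a motion cost (1 - IoU) and an appearance cost."""
    motion_cost = 1.0 - iou(track.box, det.box)
    appearance_cost = cosine_distance(track.embedding, det.embedding)
    return weight * motion_cost + (1.0 - weight) * appearance_cost


def greedy_match(tracks: List[Track], detections: List[Detection],
                 det_indices: List[int], max_cost: float) -> Dict[int, int]:
    """Greedily pair tracks with detections in order of increasing cost."""
    candidates = sorted(
        (fused_cost(t, detections[j]), t.track_id, j)
        for t in tracks for j in det_indices
    )
    matches: Dict[int, int] = {}
    used = set()
    for cost, track_id, j in candidates:
        if cost > max_cost:
            break  # remaining candidates are all worse
        if track_id in matches or j in used:
            continue
        matches[track_id] = j
        used.add(j)
    return matches


def cascade_match(tracks: List[Track], detections: List[Detection],
                  length_threshold: int = 10,
                  max_cost: float = 0.7) -> Dict[int, int]:
    """Two-stage matching: long trajectories first (an assumption), then short."""
    long_tracks = [t for t in tracks if t.length >= length_threshold]
    short_tracks = [t for t in tracks if t.length < length_threshold]
    remaining = list(range(len(detections)))
    matches: Dict[int, int] = {}
    for group in (long_tracks, short_tracks):
        stage = greedy_match(group, detections, remaining, max_cost)
        matches.update(stage)
        taken = set(stage.values())
        remaining = [j for j in remaining if j not in taken]
    return matches
```

In this sketch, `cascade_match` returns a mapping from track ID to detection index. Detections left unmatched after both stages would typically start new trajectories. A real tracker would likely replace the greedy step with an optimal assignment solver.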
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.