Indoor scene multi-object tracking based on region search and memory buffer pool

IF 7.6; Zone 1, Computer Science; Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pattern Recognition; Pub Date: 2025-03-22; DOI: 10.1016/j.patcog.2025.111623

Yang Li, Guanci Yang, Zhidong Su, Shaobo Li, Jing Yang, Ling He

Pattern Recognition, Volume 165, Article 111623. Publication type: Journal Article. https://www.sciencedirect.com/science/article/pii/S0031320325002833

Citations: 0

Abstract

This study proposes a new task, Indoor Scene Multi-Object Tracking (IS-MOT), which aims at multi-granularity parsing and continuous tracking of humans in indoor scenes. To foster the IS-MOT task, we draw on the basic composition of human movement and the characteristics of indoor human motion to construct a large-scale multi-object tracking benchmark from the perspective of an indoor social robot, termed Multi-Resident Tracking (MRT). To address the insufficient persistent tracking capability of existing MOT methods when they are extended to the IS-MOT task, we design a persistent visual multi-object tracking method based on region search and a memory buffer pool (PeViTrack). PeViTrack is mainly composed of a Homogeneous Semantic Memory Buffer Pool (HSMBP) that integrates a Motion State Estimation Module (MSEM) and a Hierarchical Matching Correlation Mechanism (HMCM). HSMBP allows the network to construct an allocation representation based on high- and low-confidence detection boxes, thereby establishing homogeneous and heterogeneous semantic embedding decision spaces in the spatial domain; this forces the network to search for and accurately associate the homogeneous and heterogeneous features of objects efficiently. Extensive experiments on the constructed MRT benchmark and the well-recognized DanceTrack dataset show that PeViTrack achieves state-of-the-art tracking performance. The code and datasets will be made available at https://github.com/funweb/PeViTrack.
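
The abstract mentions an allocation built from high- and low-confidence detection boxes and a hierarchical matching step, but gives no implementation details. The sketch below is not PeViTrack. It is a generic two-stage, confidence-split association using greedy IoU matching, and is meant only to illustrate the general idea of matching confident detections first and using weaker detections to recover the remaining tracks. All names (`iou`, `greedy_match`, `hierarchical_associate`) and the thresholds `high_thr` and `min_iou` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box], det_ids: List[int], boxes: List[Box], min_iou: float
) -> Dict[int, int]:
    """Greedily pair track ids with detection indices, highest IoU first."""
    pairs = sorted(
        ((iou(tb, boxes[d]), tid, d) for tid, tb in tracks.items() for d in det_ids),
        reverse=True,
    )
    matched: Dict[int, int] = {}
    used = set()
    for score, tid, d in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if tid in matched or d in used:
            continue
        matched[tid] = d
        used.add(d)
    return matched


def hierarchical_associate(
    tracks: Dict[int, Box],
    boxes: List[Box],
    scores: List[float],
    high_thr: float = 0.6,  # hypothetical confidence split
    min_iou: float = 0.3,   # hypothetical overlap gate
) -> Dict[int, int]:
    """Stage 1: match high-confidence boxes. Stage 2: match leftover tracks
    against low-confidence boxes. Returns {track_id: detection_index}."""
    high = [i for i, s in enumerate(scores) if s >= high_thr]
    low = [i for i, s in enumerate(scores) if s < high_thr]
    first = greedy_match(tracks, high, boxes, min_iou)
    remaining = {t: b for t, b in tracks.items() if t not in first}
    second = greedy_match(remaining, low, boxes, min_iou)
    return {**first, **second}


if __name__ == "__main__":
    tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
    boxes = [(1, 1, 11, 11), (21, 21, 31, 31), (100, 100, 110, 110)]
    scores = [0.9, 0.3, 0.2]
    print(hierarchical_associate(tracks, boxes, scores))
```

In this toy input, track 1 is matched to the confident detection in the first stage, and track 2 is recovered from a low-confidence detection in the second stage. Greedy matching is used here only to keep the sketch dependency-free; an optimal assignment solver could be substituted in either stage.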




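Source journal

Pattern Recognition (Engineering Technology; Engineering: Electrical and Electronic)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months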

About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.