CTIFTrack: Continuous Temporal Information Fusion for object track

IF 7.5 | CAS Tier 1 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence) | Expert Systems with Applications | Pub Date: 2024-11-06 | DOI: 10.1016/j.eswa.2024.125654
Zhiguo Zhang, Zhiqing Guo, Liejun Wang, Yongming Li
{"title":"CTIFTrack:用于物体跟踪的连续时态信息融合技术","authors":"Zhiguo Zhang ,&nbsp;Zhiqing Guo ,&nbsp;Liejun Wang,&nbsp;Yongming Li","doi":"10.1016/j.eswa.2024.125654","DOIUrl":null,"url":null,"abstract":"<div><div>In visual tracking tasks, researchers usually focus on increasing the complexity of the model or only discretely focusing on the changes in the object itself to achieve accurate recognition and tracking of the moving object. However, they often overlook the significant contribution of video-level linear temporal information fusion and continuous spatiotemporal mapping to tracking tasks. This oversight may lead to poor tracking performance or insufficient real-time ability of the model in complex scenes. Therefore, this paper proposes a real-time tracker, namely Continuous Temporal Information Fusion Tracker (CTIFTrack). The key of CTIFTrack lies in its well-designed Temporal Information Fusion (TIF) module, which cleverly performs a linear fusion of the temporal information between the <span><math><mrow><mrow><mo>(</mo><mi>t</mi><mtext>-</mtext><mn>1</mn><mo>)</mo></mrow><mtext>-th</mtext></mrow></math></span> and the <span><math><mrow><mi>t</mi><mtext>-th</mtext></mrow></math></span> frames and completes the spatiotemporal mapping. This enables the tracker to better understand the overall spatiotemporal information and contextual spatiotemporal correlations within the video, thereby having a positive impact on the tracking task. In addition, this paper also proposes the Object Template Feature Refinement (OTFR) module, which effectively captures the global information and local details of the object, and further improves the tracker’s understanding of the object features. Extensive experiments are conducted on seven benchmarks, such as LaSOT, GOT-10K, UAV123, NFS, TrackingNet, VOT2018 and OTB-100. The experimental results validate the significant contribution of the TIF module and OTFR module to the tracking task, as well as the effectiveness of CTIFTrack. It is worth noting that while maintaining excellent tracking performance, CTIFTrack also shows outstanding real-time tracking speed. On the Nvidia Tesla T4-16GB GPU, the <span><math><mrow><mi>F</mi><mi>P</mi><mi>S</mi></mrow></math></span> of CTIFTrack reaches 71.98. The code and demo materials will be available at <span><span>https://github.com/vpsg-research/CTIFTrack</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"262 ","pages":"Article 125654"},"PeriodicalIF":7.5000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CTIFTrack: Continuous Temporal Information Fusion for object track\",\"authors\":\"Zhiguo Zhang ,&nbsp;Zhiqing Guo ,&nbsp;Liejun Wang,&nbsp;Yongming Li\",\"doi\":\"10.1016/j.eswa.2024.125654\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>In visual tracking tasks, researchers usually focus on increasing the complexity of the model or only discretely focusing on the changes in the object itself to achieve accurate recognition and tracking of the moving object. However, they often overlook the significant contribution of video-level linear temporal information fusion and continuous spatiotemporal mapping to tracking tasks. This oversight may lead to poor tracking performance or insufficient real-time ability of the model in complex scenes. 
Therefore, this paper proposes a real-time tracker, namely Continuous Temporal Information Fusion Tracker (CTIFTrack). The key of CTIFTrack lies in its well-designed Temporal Information Fusion (TIF) module, which cleverly performs a linear fusion of the temporal information between the <span><math><mrow><mrow><mo>(</mo><mi>t</mi><mtext>-</mtext><mn>1</mn><mo>)</mo></mrow><mtext>-th</mtext></mrow></math></span> and the <span><math><mrow><mi>t</mi><mtext>-th</mtext></mrow></math></span> frames and completes the spatiotemporal mapping. This enables the tracker to better understand the overall spatiotemporal information and contextual spatiotemporal correlations within the video, thereby having a positive impact on the tracking task. In addition, this paper also proposes the Object Template Feature Refinement (OTFR) module, which effectively captures the global information and local details of the object, and further improves the tracker’s understanding of the object features. Extensive experiments are conducted on seven benchmarks, such as LaSOT, GOT-10K, UAV123, NFS, TrackingNet, VOT2018 and OTB-100. The experimental results validate the significant contribution of the TIF module and OTFR module to the tracking task, as well as the effectiveness of CTIFTrack. It is worth noting that while maintaining excellent tracking performance, CTIFTrack also shows outstanding real-time tracking speed. On the Nvidia Tesla T4-16GB GPU, the <span><math><mrow><mi>F</mi><mi>P</mi><mi>S</mi></mrow></math></span> of CTIFTrack reaches 71.98. The code and demo materials will be available at <span><span>https://github.com/vpsg-research/CTIFTrack</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50461,\"journal\":{\"name\":\"Expert Systems with Applications\",\"volume\":\"262 \",\"pages\":\"Article 125654\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Expert Systems with Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0957417424025211\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems with Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0957417424025211","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

In visual tracking tasks, researchers usually focus on increasing model complexity, or attend only discretely to changes in the object itself, in order to recognize and track a moving object accurately. However, they often overlook the significant contribution that video-level linear temporal-information fusion and continuous spatiotemporal mapping make to the tracking task. This oversight can lead to poor tracking performance or insufficient real-time capability in complex scenes. This paper therefore proposes a real-time tracker, the Continuous Temporal Information Fusion Tracker (CTIFTrack). The key to CTIFTrack is its Temporal Information Fusion (TIF) module, which linearly fuses the temporal information between the (t-1)-th and t-th frames and completes the spatiotemporal mapping. This enables the tracker to better understand the overall spatiotemporal information and the contextual spatiotemporal correlations within the video, which benefits the tracking task. In addition, this paper proposes the Object Template Feature Refinement (OTFR) module, which effectively captures both the global information and the local details of the object, further improving the tracker's understanding of object features. Extensive experiments are conducted on seven benchmarks: LaSOT, GOT-10K, UAV123, NFS, TrackingNet, VOT2018 and OTB-100. The results validate the significant contribution of the TIF and OTFR modules to the tracking task, as well as the effectiveness of CTIFTrack. Notably, while maintaining excellent tracking performance, CTIFTrack also achieves outstanding real-time speed: on an Nvidia Tesla T4 (16 GB) GPU it runs at 71.98 FPS. The code and demo materials will be available at https://github.com/vpsg-research/CTIFTrack.
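To make the TIF idea more concrete, the following is a minimal sketch of a linear temporal-information fusion step between the (t-1)-th and t-th frame features, based only on the description in the abstract. The module name, the learnable weight, and the 1×1 mapping convolution are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a linear temporal-information fusion step between the
# (t-1)-th and t-th frame features, as described in the abstract.
# Names and structure are illustrative, not taken from the paper.
import torch
import torch.nn as nn


class TemporalInfoFusion(nn.Module):
    """Linearly fuses feature maps from two consecutive frames."""

    def __init__(self, channels: int):
        super().__init__()
        # Learnable scalar weight for the linear combination of the two frames.
        self.alpha = nn.Parameter(torch.tensor(0.5))
        # 1x1 convolution as a stand-in for the "spatiotemporal mapping"
        # mentioned in the abstract.
        self.mapping = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat_prev: torch.Tensor, feat_curr: torch.Tensor) -> torch.Tensor:
        # Linear fusion: weighted sum of the (t-1)-th and t-th frame features.
        fused = self.alpha * feat_prev + (1.0 - self.alpha) * feat_curr
        return self.mapping(fused)


# Example usage with dummy feature maps (batch, channels, height, width).
tif = TemporalInfoFusion(channels=256)
f_prev = torch.randn(1, 256, 16, 16)   # features of frame t-1
f_curr = torch.randn(1, 256, 16, 16)   # features of frame t
print(tif(f_prev, f_curr).shape)        # torch.Size([1, 256, 16, 16])
```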
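Similarly, one plausible, purely illustrative reading of the OTFR idea is a refinement block with a global branch and a local branch over the template features. The squeeze-and-excitation-style global weighting and the depthwise local convolution below are assumptions made for the sketch, not the published design.

```python
# Illustrative sketch of an object-template refinement block that combines a
# global (pooled) descriptor with local detail features, in the spirit of the
# OTFR module described above. Structure and names are assumptions.
import torch
import torch.nn as nn


class TemplateFeatureRefinement(nn.Module):
    """Refines template features with a global branch and a local branch."""

    def __init__(self, channels: int):
        super().__init__()
        # Global branch: pool spatial dims to a channel descriptor, then
        # produce per-channel weights (squeeze-and-excitation style).
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Local branch: depthwise 3x3 convolution to capture local details.
        self.local_branch = nn.Conv2d(channels, channels, kernel_size=3,
                                      padding=1, groups=channels)

    def forward(self, template: torch.Tensor) -> torch.Tensor:
        global_weight = self.global_branch(template)   # (B, C, 1, 1)
        local_detail = self.local_branch(template)     # (B, C, H, W)
        # Combine global re-weighting with local details via a residual path.
        return template + global_weight * local_detail


# Example usage with a dummy template feature map.
otfr = TemplateFeatureRefinement(channels=256)
z = torch.randn(1, 256, 8, 8)
print(otfr(z).shape)  # torch.Size([1, 256, 8, 8])
```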
Source journal
Expert Systems with Applications (Engineering Technology – Engineering: Electrical & Electronic)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles per year: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.