MOTT: A new model for multi-object tracking based on green learning paradigm

AI Open Pub Date : 2023-01-01 DOI:10.1016/j.aiopen.2023.09.002

Shan Wu , Amnir Hadachi , Chaoru Lu , Damien Vivet

AI Open, Volume 4, Pages 145-153. Journal Article. https://www.sciencedirect.com/science/article/pii/S2666651023000165

Citation count: 0

Abstract


Multi-object tracking (MOT) is one of the most essential and challenging tasks in computer vision (CV). Unlike object detectors, current MOT systems are more complicated and consist of several neural network models, so the balance between system performance and runtime is crucial for online scenarios. While some works achieve improvements by adding more modules, we propose a pruned model that leverages a state-of-the-art Transformer backbone. Our model saves up to 62% of the FLOPs compared with other Transformer-based models and is almost twice as fast. Its results remain competitive with state-of-the-art methods. Moreover, we will open-source our modified Transformer backbone model for general CV tasks, as well as the MOT system.
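
To relate the two efficiency figures in the abstract, here is a minimal sketch that is not taken from the paper. It assumes, purely for illustration, that runtime scales linearly with FLOPs, and computes the speedup ceiling implied by a given FLOPs saving. The function name `ideal_speedup` is hypothetical.

```python
def ideal_speedup(flops_saved_fraction: float) -> float:
    """Speedup ceiling if runtime were proportional to FLOPs (an assumption).

    flops_saved_fraction: share of FLOPs removed, e.g. 0.62 for "62% saved".
    """
    if not 0.0 <= flops_saved_fraction < 1.0:
        raise ValueError("flops_saved_fraction must be in [0, 1)")
    # Remaining work is (1 - saved); speedup is the inverse of that share.
    return 1.0 / (1.0 - flops_saved_fraction)


saving = 0.62  # "up to 62%" figure quoted in the abstract
print(f"{ideal_speedup(saving):.2f}x")  # prints "2.63x"
```

Under this simplifying assumption the ceiling is about 2.63x, so a measured speedup of almost 2x sits below the compute-bound limit. Real runtime also depends on factors such as memory access and the other components of the tracking pipeline, which this sketch does not model.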
