Attention in Crowd Counting Using the Transformer and Density Map to Improve Counting Result

P. Do
Published in: 2021 8th NAFOSTED Conference on Information and Computer Science (NICS)
Publication date: 2021-12-21
DOI: 10.1109/NICS54270.2021.9701500 (https://doi.org/10.1109/NICS54270.2021.9701500)
Citations: 6

Abstract

With the rapid development of convolutional neural networks (CNNs), most crowd counting methods use a CNN to estimate a density map and then infer the count from it. However, these methods are constrained by limited receptive fields, background noise, and similar factors. The Transformer, introduced in natural language processing, can also be applied to the crowd counting problem. Because the Transformer models global context, it helps overcome the limited receptive field. In addition, its attention mechanism lets the model focus on regions where people are concentrated, which helps suppress background noise. In this paper, we propose a crowd counting model that combines a Transformer and a density map (TDCrowd) to estimate the number of people in a crowd. With the Transformer, TDCrowd can also be trained without information about the locations of people in the crowd, using only the count. Experiments on three datasets, ShanghaiTech, UCF-QNRF, and JHU-Crowd++, show that TDCrowd gives better results than both regression-based methods (which need only count information) and density-map-based methods (which need both count and location information).
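
The abstract refers to estimating a density map and then inferring the count from it. The following is a minimal sketch of that general idea only, not the TDCrowd model: it builds a ground-truth-style density map by placing one unit-mass Gaussian at each annotated head position and recovers the count by summing the map. The function names, the `sigma` value, and the example coordinates are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def density_map(points, height, width, sigma=4.0):
    """Build a density map from (row, col) head annotations.

    Assumption: a fixed-width Gaussian per person; sigma is illustrative.
    """
    rows = np.arange(height)[:, None]   # column vector of row indices
    cols = np.arange(width)[None, :]    # row vector of column indices
    dmap = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        g = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Normalise so each person contributes exactly 1 to the total,
        # even when the Gaussian is clipped by the image border.
        dmap += g / g.sum()
    return dmap


def count_from_density(dmap):
    """Infer the crowd count as the sum (discrete integral) of the map."""
    return float(dmap.sum())


# Hypothetical annotations: three people in a 64x64 image.
heads = [(10, 12), (30, 40), (31, 42)]
dmap = density_map(heads, 64, 64)
estimated_count = count_from_density(dmap)
```

Because each Gaussian is normalised to unit mass, the sum of the map equals the number of annotated people. A count-only training setup such as the one the abstract describes would supervise just this scalar, without the per-person locations used to build the map here.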