RGB-T Multi-Modal Crowd Counting Based on Transformer

Zhengyi Liu, liuzywen
{"title":"RGB-T Multi-Modal Crowd Counting Based on Transformer","authors":"Zhengyi Liu, liuzywen","doi":"10.48550/arXiv.2301.03033","DOIUrl":null,"url":null,"abstract":"Crowd counting aims to estimate the number of persons in a scene. Most state-of-the-art crowd counting methods based on color images can't work well in poor illumination conditions due to invisible objects. With the widespread use of infrared cameras, crowd counting based on color and thermal images is studied. Existing methods only achieve multi-modal fusion without count objective constraint. To better excavate multi-modal information, we use count-guided multi-modal fusion and modal-guided count enhancement to achieve the impressive performance. The proposed count-guided multi-modal fusion module utilizes a multi-scale token transformer to interact two-modal information under the guidance of count information and perceive different scales from the token perspective. The proposed modal-guided count enhancement module employs multi-scale deformable transformer decoder structure to enhance one modality feature and count information by the other modality. Experiment in public RGBT-CC dataset shows that our method refreshes the state-of-the-art results. https://github.com/liuzywen/RGBTCC","PeriodicalId":72437,"journal":{"name":"BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference","volume":"35 1","pages":"427"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2301.03033","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Crowd counting aims to estimate the number of persons in a scene. Most state-of-the-art crowd counting methods based on color images can't work well in poor illumination conditions due to invisible objects. With the widespread use of infrared cameras, crowd counting based on color and thermal images is studied. Existing methods only achieve multi-modal fusion without count objective constraint. To better excavate multi-modal information, we use count-guided multi-modal fusion and modal-guided count enhancement to achieve the impressive performance. The proposed count-guided multi-modal fusion module utilizes a multi-scale token transformer to interact two-modal information under the guidance of count information and perceive different scales from the token perspective. The proposed modal-guided count enhancement module employs multi-scale deformable transformer decoder structure to enhance one modality feature and count information by the other modality. Experiment in public RGBT-CC dataset shows that our method refreshes the state-of-the-art results. https://github.com/liuzywen/RGBTCC
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于变压器的RGB-T多模态人群计数
人群计数的目的是估计一个场景中的人数。大多数最先进的基于彩色图像的人群计数方法由于不可见的物体而无法在较差的照明条件下很好地工作。随着红外摄像机的广泛应用,研究了基于彩色和热成像的人群计数。现有方法只能实现多模态融合,没有计数目标约束。为了更好地挖掘多模态信息,我们使用了计数引导的多模态融合和模态引导的计数增强来获得令人印象深刻的性能。提出的计数引导多模态融合模块利用多尺度令牌转换器在计数信息的引导下对两模态信息进行交互,并从令牌角度感知不同的尺度。该模态引导计数增强模块采用多尺度可变形变压器译码器结构,增强一种模态特征,对另一种模态进行计数。在公开RGBT-CC数据集上的实验表明,我们的方法刷新了最先进的结果。https://github.com/liuzywen/RGBTCC
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Learning Anatomically Consistent Embedding for Chest Radiography. Single Pixel Spectral Color Constancy DiffSketching: Sketch Control Image Synthesis with Diffusion Models Defect Transfer GAN: Diverse Defect Synthesis for Data Augmentation Mitigating Bias in Visual Transformers via Targeted Alignment
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1