{"title":"DBCNet:面向智能车辆RGB-T城市场景理解的动态双边交叉融合网络","authors":"Wujie Zhou;Tingting Gong;Jingsheng Lei;Lu Yu","doi":"10.1109/TSMC.2023.3298921","DOIUrl":null,"url":null,"abstract":"Understanding urban scenes is a fundamental capability required of intelligent vehicles. Depth cues provide useful geometric information for semantic segmentation, thus complementing RGB (color) data. Although single-modal RGB images are improved by depth information, semantic segmentation may be degraded in poor-visibility conditions. Thermal imaging can address some limitations of depth data. Therefore, we leverage the multimodal information in RGB-and-thermal (RGB-T) images by introducing a dynamic bilateral cross-fusion network (DBCNet) for RGB-T urban scene understanding. First, RGB-T features extracted by a given backbone are regrouped as high- or low-level features. Second, multimodal high-level features are sent to a dynamic bilateral cross-fusion module for further refinement. Third, a bounded high-level semantic-feature integration module is added to provide feature guidance, and a multitask supervision mechanism is used for fine-tuning. 
Extensive experiments on two RGB-T urban scene-understanding datasets indicate that DBCNet aggregates multilevel deep features effectively and outperforms state-of-the-art deep-learning scene-understanding methods.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":null,"pages":null},"PeriodicalIF":8.6000,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"DBCNet: Dynamic Bilateral Cross-Fusion Network for RGB-T Urban Scene Understanding in Intelligent Vehicles\",\"authors\":\"Wujie Zhou;Tingting Gong;Jingsheng Lei;Lu Yu\",\"doi\":\"10.1109/TSMC.2023.3298921\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Understanding urban scenes is a fundamental capability required of intelligent vehicles. Depth cues provide useful geometric information for semantic segmentation, thus complementing RGB (color) data. Although single-modal RGB images are improved by depth information, semantic segmentation may be degraded in poor-visibility conditions. Thermal imaging can address some limitations of depth data. Therefore, we leverage the multimodal information in RGB-and-thermal (RGB-T) images by introducing a dynamic bilateral cross-fusion network (DBCNet) for RGB-T urban scene understanding. First, RGB-T features extracted by a given backbone are regrouped as high- or low-level features. Second, multimodal high-level features are sent to a dynamic bilateral cross-fusion module for further refinement. Third, a bounded high-level semantic-feature integration module is added to provide feature guidance, and a multitask supervision mechanism is used for fine-tuning. 
Extensive experiments on two RGB-T urban scene-understanding datasets indicate that DBCNet aggregates multilevel deep features effectively and outperforms state-of-the-art deep-learning scene-understanding methods.\",\"PeriodicalId\":48915,\"journal\":{\"name\":\"IEEE Transactions on Systems Man Cybernetics-Systems\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":8.6000,\"publicationDate\":\"2023-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Systems Man Cybernetics-Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10217340/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Systems Man Cybernetics-Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10217340/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
DBCNet: Dynamic Bilateral Cross-Fusion Network for RGB-T Urban Scene Understanding in Intelligent Vehicles
Understanding urban scenes is a fundamental capability required of intelligent vehicles. Depth cues provide useful geometric information for semantic segmentation, thus complementing RGB (color) data. Although single-modal RGB images are improved by depth information, semantic segmentation may be degraded in poor-visibility conditions. Thermal imaging can address some limitations of depth data. Therefore, we leverage the multimodal information in RGB-and-thermal (RGB-T) images by introducing a dynamic bilateral cross-fusion network (DBCNet) for RGB-T urban scene understanding. First, RGB-T features extracted by a given backbone are regrouped as high- or low-level features. Second, multimodal high-level features are sent to a dynamic bilateral cross-fusion module for further refinement. Third, a bounded high-level semantic-feature integration module is added to provide feature guidance, and a multitask supervision mechanism is used for fine-tuning. Extensive experiments on two RGB-T urban scene-understanding datasets indicate that DBCNet aggregates multilevel deep features effectively and outperforms state-of-the-art deep-learning scene-understanding methods.
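The abstract describes cross-fusing high-level RGB and thermal features with dynamic refinement. The paper's actual module design is not given here, so the following is only a minimal, hypothetical sketch of one common cross-fusion pattern: each modality's features are re-weighted by a gate computed from the other modality's global statistics, then summed. The function name `dynamic_cross_fuse` and the gating scheme are illustrative assumptions, not the published DBCNet architecture.

```python
import numpy as np

def dynamic_cross_fuse(rgb_feat, thermal_feat):
    """Hypothetical bilateral cross-fusion sketch (NOT the paper's exact
    module): thermal statistics gate the RGB features and vice versa,
    so each modality modulates the other before aggregation."""
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Global average pooling over spatial dims -> per-channel descriptors.
    g_rgb = rgb_feat.mean(axis=(1, 2), keepdims=True)      # (C, 1, 1)
    g_thermal = thermal_feat.mean(axis=(1, 2), keepdims=True)

    # Cross gates: each modality is scaled by a gate derived from the other.
    fused = sigmoid(g_thermal) * rgb_feat + sigmoid(g_rgb) * thermal_feat
    return fused

# Toy high-level feature maps in (channels, height, width) layout.
rgb = np.random.rand(64, 8, 8)
thermal = np.random.rand(64, 8, 8)
out = dynamic_cross_fuse(rgb, thermal)
print(out.shape)  # (64, 8, 8)
```

The gating keeps the fused map at the same resolution as the inputs, so it can slot into a decoder the way the abstract's "further refinement" stage suggests; a real implementation would learn the gates with convolutional layers rather than fixed pooling.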
Journal introduction:
The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.