{"title":"DBCNet: Dynamic Bilateral Cross-Fusion Network for RGB-T Urban Scene Understanding in Intelligent Vehicles","authors":"Wujie Zhou;Tingting Gong;Jingsheng Lei;Lu Yu","doi":"10.1109/TSMC.2023.3298921","DOIUrl":null,"url":null,"abstract":"Understanding urban scenes is a fundamental capability required of intelligent vehicles. Depth cues provide useful geometric information for semantic segmentation, thus complementing RGB (color) data. Although single-modal RGB images are improved by depth information, semantic segmentation may be degraded in poor-visibility conditions. Thermal imaging can address some limitations of depth data. Therefore, we leverage the multimodal information in RGB-and-thermal (RGB-T) images by introducing a dynamic bilateral cross-fusion network (DBCNet) for RGB-T urban scene understanding. First, RGB-T features extracted by a given backbone are regrouped as high- or low-level features. Second, multimodal high-level features are sent to a dynamic bilateral cross-fusion module for further refinement. Third, a bounded high-level semantic-feature integration module is added to provide feature guidance, and a multitask supervision mechanism is used for fine-tuning. 
Extensive experiments on two RGB-T urban scene-understanding datasets indicate that DBCNet aggregates multilevel deep features effectively and outperforms state-of-the-art deep-learning scene-understanding methods.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":null,"pages":null},"PeriodicalIF":8.6000,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Systems Man Cybernetics-Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10217340/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 8
Abstract
Understanding urban scenes is a fundamental capability required of intelligent vehicles. Depth cues provide useful geometric information for semantic segmentation, thus complementing RGB (color) data. However, although depth information improves segmentation over single-modal RGB images, performance may still degrade in poor-visibility conditions. Thermal imaging can address some limitations of depth data. Therefore, we leverage the multimodal information in RGB-and-thermal (RGB-T) images by introducing a dynamic bilateral cross-fusion network (DBCNet) for RGB-T urban scene understanding. First, RGB-T features extracted by a given backbone are regrouped as high- or low-level features. Second, multimodal high-level features are sent to a dynamic bilateral cross-fusion module for further refinement. Third, a bounded high-level semantic-feature integration module is added to provide feature guidance, and a multitask supervision mechanism is used for fine-tuning. Extensive experiments on two RGB-T urban scene-understanding datasets indicate that DBCNet aggregates multilevel deep features effectively and outperforms state-of-the-art deep-learning scene-understanding methods.
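The abstract's three-stage pipeline (regroup backbone features by level, then cross-fuse the high-level RGB and thermal features) can be sketched in rough outline. The sketch below is purely illustrative and is not the paper's implementation: the `regroup` split point, the pooling-based modality weights, and the multiplicative cross term are all assumptions standing in for the learned modules described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def regroup(features):
    """Split multilevel backbone features into low-level (early stages)
    and high-level (late stages) groups, as the abstract's first step.
    The midpoint split is an illustrative assumption."""
    mid = len(features) // 2
    return features[:mid], features[mid:]

def cross_fuse(rgb, thermal):
    """Hypothetical stand-in for the dynamic bilateral cross-fusion module:
    per-modality weights from global average pooling + softmax, plus a
    multiplicative cross term so each modality modulates the other."""
    scores = np.array([rgb.mean(), thermal.mean()])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights[0] * rgb + weights[1] * thermal + rgb * thermal

# Toy multilevel features for each modality: four (C, H, W) maps per stream.
rgb_feats = [rng.standard_normal((8, 16, 16)) for _ in range(4)]
th_feats = [rng.standard_normal((8, 16, 16)) for _ in range(4)]

_, rgb_high = regroup(rgb_feats)
_, th_high = regroup(th_feats)

# Fuse only the high-level pairs, matching the abstract's second step.
fused_high = [cross_fuse(r, t) for r, t in zip(rgb_high, th_high)]
print(len(fused_high), fused_high[0].shape)
```

In a real network the fusion weights would come from learned layers rather than raw means; the point here is only the data flow: level-wise regrouping followed by pairwise cross-modal fusion that preserves feature-map shape.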
Journal description
The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.