Knowledge Distillation SegFormer-Based Network for RGB-T Semantic Segmentation

IEEE Transactions on Systems Man Cybernetics-Systems | IF 8.7 | Zone 1 (Computer Science) | JCR Q1 (Automation & Control Systems) | Pub Date: 2024-12-27 | DOI: 10.1109/TSMC.2024.3517732

Wujie Zhou; Tingting Gong; Weiqing Yan

Volume 55, no. 3, pages 2170-2182 | Journal Article | Open access: no | Indexed via: Semanticscholar | IEEE Xplore: https://ieeexplore.ieee.org/document/10817074/

Citations: 0


Knowledge Distillation SegFormer-Based Network for RGB-T Semantic Segmentation

Deep-learning-based semantic segmentation has received increasing research attention in recent years. However, owing to complex architectures, existing approaches have failed to achieve high accuracies in real-time applications. In this article, a novel knowledge distillation (KD) SegFormer-based network, called KDSNet-S*, is proposed to explore the tradeoff between accuracy and efficiency. Specifically, a structured KD scheme is designed to transfer the rich advanced features of a teacher network (KDSNet-T) to a student network (KDSNet-S). Thereafter, the KDSNet-S network learns the precise segmentation ability of the KDSNet-T network. Additionally, a multifield perceptual fusion model is proposed to learn more integrated features for a single modality and obtain discriminative and comprehensive feature representations. Furthermore, a high-level feature integration module is introduced to refine multimodality high-level features. Finally, multilevel features are fused, and a label-decoupling-based three-stream decoder that decomposes the original semantic segmentation map into center and contour diffusion maps for different supervision tasks is introduced. Experimental results on two public red-green-blue-thermal (RGB-T) semantic segmentation datasets indicate the superiority of KDSNet-S* over compared state-of-the-art methods. KDSNet-S* reduces the number of parameters and floating-point operations (FLOPs) by 91.1% and 81.9%, respectively, compared with KDSNet-T. The source codes and results will be available at https://github.com/purple-ting/KDSNet.
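The abstract does not spell out the structured KD scheme, so the following is only a minimal sketch of a generic temperature-softened, pixel-wise distillation loss of the kind commonly used in teacher-student training. It is not the KDSNet loss itself. The function names, the default temperature, and the input layout (a list of pixels, each a list of class logits) are assumptions made for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax over one pixel's class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def pixelwise_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over pixels, scaled by temperature**2.

    Both arguments are lists of pixels; each pixel is a list of class logits.
    This is a generic soft-label distillation term (an assumption), not the
    structured scheme described in the abstract.
    """
    if len(student_logits) != len(teacher_logits):
        raise ValueError("student and teacher must cover the same pixels")
    if not student_logits:
        raise ValueError("at least one pixel is required")

    total = 0.0
    for s_pix, t_pix in zip(student_logits, teacher_logits):
        log_p_t = log_softmax(t_pix, temperature)
        log_p_s = log_softmax(s_pix, temperature)
        # KL divergence for this pixel: sum_c p_t(c) * (log p_t(c) - log p_s(c))
        total += sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_p_t, log_p_s))

    # The temperature**2 factor keeps gradient magnitudes comparable
    # across different temperatures.
    return (temperature ** 2) * total / len(student_logits)


if __name__ == "__main__":
    # Hypothetical logits for two pixels and three classes.
    teacher = [[4.0, 1.0, 0.5], [0.2, 3.5, 0.1]]
    student = [[2.0, 1.5, 0.5], [0.5, 2.0, 0.4]]
    print(pixelwise_kd_loss(student, teacher))
```

In practice a term like this would be added to the ordinary supervised segmentation loss, so the student is trained against both the ground-truth labels and the teacher's softened predictions. The loss is zero when the student reproduces the teacher's class distribution at every pixel and positive otherwise.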


Source journal

IEEE Transactions on Systems Man Cybernetics-Systems (Automation & Control Systems; Computer Science, Cybernetics)

CiteScore: 18.50
Self-citation rate: 11.50%
Articles published: 812
Review time: 6 months

About the journal: The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.