Lu Yang;Yiwen Bai;Fenglei Ren;Chongke Bi;Ronghui Zhang
{"title":"LCFNets:自动驾驶实时语义分割的补偿策略","authors":"Lu Yang;Yiwen Bai;Fenglei Ren;Chongke Bi;Ronghui Zhang","doi":"10.1109/TIV.2024.3363830","DOIUrl":null,"url":null,"abstract":"Semantic segmentation is an important research topic in the environment perception of intelligent vehicles. Many semantic segmentation networks based on bilateral architecture have been proven effective. However, semantic segmentation networks based on this architecture has the risk of pixel classification errors and small objects being overwhelmed. In this paper, we solve the problem by proposing a novel three-branch architecture network called LCFNets. Compared to existing bilateral architecture, LCFNets introduce compensation branch for the first time to preserve the features of original images. Through two efficient modules, Lightweight Detail Guidance Fusion Module (L-DGF) and Lightweight Semantic Guidance Fusion Module (L-SGF), detail and semantic branches are allowed to selectively extract features from this branch. To balance the three-branch features and guide them to fuse effectively, a novel aggregation layer is designed. Depth-wise Convolution Pyramid Pooling module (DCPP) and Total Guidance Fusion Module (TGF) enable the aggregation layer to extract the global receptive field and realize multi-branch aggregation with fewer calculation complexity. Extensive experiments on Cityscapes and CamVid datasets have shown that our family of LCFNets provide a better trade-off between speed and accuracy. With the full resolution input and no ImageNet pre-training, LCFNet-slim achieves 76.86% mIoU at 114.36 FPS and LCFNet achieves 77.96% mIoU at 92.37 FPS on Cityscapes. On the other hand, LCFNet-super achieves 79.10% mIoU at 47.46 FPS.","PeriodicalId":36532,"journal":{"name":"IEEE Transactions on Intelligent Vehicles","volume":"9 4","pages":"4715-4729"},"PeriodicalIF":14.0000,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"LCFNets: Compensation Strategy for Real-Time Semantic Segmentation of Autonomous Driving\",\"authors\":\"Lu Yang;Yiwen Bai;Fenglei Ren;Chongke Bi;Ronghui Zhang\",\"doi\":\"10.1109/TIV.2024.3363830\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Semantic segmentation is an important research topic in the environment perception of intelligent vehicles. Many semantic segmentation networks based on bilateral architecture have been proven effective. However, semantic segmentation networks based on this architecture has the risk of pixel classification errors and small objects being overwhelmed. In this paper, we solve the problem by proposing a novel three-branch architecture network called LCFNets. Compared to existing bilateral architecture, LCFNets introduce compensation branch for the first time to preserve the features of original images. Through two efficient modules, Lightweight Detail Guidance Fusion Module (L-DGF) and Lightweight Semantic Guidance Fusion Module (L-SGF), detail and semantic branches are allowed to selectively extract features from this branch. To balance the three-branch features and guide them to fuse effectively, a novel aggregation layer is designed. Depth-wise Convolution Pyramid Pooling module (DCPP) and Total Guidance Fusion Module (TGF) enable the aggregation layer to extract the global receptive field and realize multi-branch aggregation with fewer calculation complexity. 
Extensive experiments on Cityscapes and CamVid datasets have shown that our family of LCFNets provide a better trade-off between speed and accuracy. With the full resolution input and no ImageNet pre-training, LCFNet-slim achieves 76.86% mIoU at 114.36 FPS and LCFNet achieves 77.96% mIoU at 92.37 FPS on Cityscapes. On the other hand, LCFNet-super achieves 79.10% mIoU at 47.46 FPS.\",\"PeriodicalId\":36532,\"journal\":{\"name\":\"IEEE Transactions on Intelligent Vehicles\",\"volume\":\"9 4\",\"pages\":\"4715-4729\"},\"PeriodicalIF\":14.0000,\"publicationDate\":\"2024-02-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Intelligent Vehicles\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10428050/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Intelligent Vehicles","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10428050/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
LCFNets: Compensation Strategy for Real-Time Semantic Segmentation of Autonomous Driving
Semantic segmentation is an important research topic in the environment perception of intelligent vehicles. Many semantic segmentation networks based on a bilateral architecture have proven effective. However, networks built on this architecture run the risk of pixel misclassification and of small objects being overwhelmed. In this paper, we address the problem by proposing a novel three-branch architecture called LCFNets. Compared to existing bilateral architectures, LCFNets introduce a compensation branch for the first time to preserve the features of the original images. Through two efficient modules, the Lightweight Detail Guidance Fusion Module (L-DGF) and the Lightweight Semantic Guidance Fusion Module (L-SGF), the detail and semantic branches can selectively extract features from this branch. To balance the features of the three branches and guide their effective fusion, a novel aggregation layer is designed. Its Depth-wise Convolution Pyramid Pooling module (DCPP) and Total Guidance Fusion Module (TGF) enable the aggregation layer to capture a global receptive field and realize multi-branch aggregation with lower computational complexity. Extensive experiments on the Cityscapes and CamVid datasets show that our family of LCFNets provides a better trade-off between speed and accuracy. With full-resolution input and no ImageNet pre-training, LCFNet-slim achieves 76.86% mIoU at 114.36 FPS and LCFNet achieves 77.96% mIoU at 92.37 FPS on Cityscapes, while LCFNet-super achieves 79.10% mIoU at 47.46 FPS.
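To make the three-branch idea concrete, the following is a minimal PyTorch sketch of how a compensation branch might be fused into detail and semantic branches through gated guidance modules. It is illustrative only: the class names (GuidanceFusion, ThreeBranchSketch), channel widths, and strides are assumptions standing in for the paper's L-DGF, L-SGF, DCPP, and TGF modules, whose exact definitions are not given in this abstract.

import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_relu(in_ch, out_ch, stride=1):
    # Basic 3x3 conv block used by all illustrative branches.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class GuidanceFusion(nn.Module):
    """Hypothetical stand-in for L-DGF / L-SGF: lets a branch selectively
    take features from the compensation branch via a learned sigmoid gate."""

    def __init__(self, branch_ch, comp_ch):
        super().__init__()
        self.proj = nn.Conv2d(comp_ch, branch_ch, 1, bias=False)
        self.gate = nn.Sequential(nn.Conv2d(branch_ch, branch_ch, 1), nn.Sigmoid())

    def forward(self, branch_feat, comp_feat):
        # Resize compensation features to the branch resolution, then gate and add.
        comp = self.proj(F.interpolate(comp_feat, size=branch_feat.shape[2:],
                                       mode="bilinear", align_corners=False))
        return branch_feat + self.gate(branch_feat) * comp


class ThreeBranchSketch(nn.Module):
    def __init__(self, num_classes=19):
        super().__init__()
        self.detail = nn.Sequential(conv_bn_relu(3, 64, 2), conv_bn_relu(64, 64))   # 1/2 res
        self.semantic = nn.Sequential(conv_bn_relu(3, 64, 2), conv_bn_relu(64, 128, 2),
                                      conv_bn_relu(128, 128, 2))                     # 1/8 res
        self.comp = conv_bn_relu(3, 32, 2)            # compensation branch: keeps image features
        self.detail_fuse = GuidanceFusion(64, 32)     # role of L-DGF
        self.semantic_fuse = GuidanceFusion(128, 32)  # role of L-SGF
        self.head = nn.Conv2d(64 + 128, num_classes, 1)  # stand-in for aggregation layer + classifier

    def forward(self, x):
        c = self.comp(x)
        d = self.detail_fuse(self.detail(x), c)
        s = self.semantic_fuse(self.semantic(x), c)
        s = F.interpolate(s, size=d.shape[2:], mode="bilinear", align_corners=False)
        out = self.head(torch.cat([d, s], dim=1))
        return F.interpolate(out, size=x.shape[2:], mode="bilinear", align_corners=False)


if __name__ == "__main__":
    logits = ThreeBranchSketch()(torch.randn(1, 3, 512, 1024))
    print(logits.shape)  # torch.Size([1, 19, 512, 1024])

The sketch only shows the data flow the abstract describes (a third branch feeding both main branches before aggregation); the real LCFNets modules and their lightweight design choices differ in detail.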
Journal Introduction:
The IEEE Transactions on Intelligent Vehicles (T-IV) is a premier platform for publishing peer-reviewed articles that present innovative research concepts, application results, significant theoretical findings, and application case studies in the field of intelligent vehicles. With a particular emphasis on automated vehicles within roadway environments, T-IV aims to raise awareness of pressing research and application challenges.
Our focus is on providing critical information to the intelligent vehicle community, serving as a dissemination vehicle for IEEE ITS Society members and others interested in learning about the state-of-the-art developments and progress in research and applications related to intelligent vehicles. Join us in advancing knowledge and innovation in this dynamic field.