Semantic segmentation of road scene based on multi-scale feature extraction and deep supervision

Longfei Wang, Chunman Yan
Published in: International Conference on Digital Image Processing, 2022-10-12
DOI: 10.1117/12.2644695 (https://doi.org/10.1117/12.2644695)
Citations: 2

Abstract

The traditional U-Net model produces inaccurate segmentation edges, adapts poorly to multi-scale road targets, and is prone to false and missing segmentation when road targets are subject to varied and changing occlusions. To address these problems, a semantic segmentation model for road scenes based on multi-scale feature extraction and a deep supervision module is proposed. First, a dual attention module is embedded in the U-Net encoder so that the model can capture global context information along both the channel and spatial dimensions and thereby enhance road features. Second, before upsampling, the feature map containing high-level semantic information is fed into an ASPP module to obtain road features at different scales. Finally, a deep supervision module is introduced into the upsampling part to learn feature representations at different levels and retain more road detail. Experiments are carried out on the CamVid and Cityscapes datasets. The results show that the network effectively segments road targets of different scales and yields more complete and clearer road contours, improving semantic segmentation accuracy while maintaining a certain segmentation speed.
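The multi-scale step can be illustrated with a toy example. ASPP (atrous spatial pyramid pooling) applies the same kind of convolution at several dilation rates in parallel, so each branch sees a different-sized neighbourhood. The sketch below is a minimal 1-D, pure-Python illustration of that idea only; it is not the authors' implementation. The function names, the fixed kernel, and the rates (1, 2, 4) are assumptions chosen for illustration, since the abstract does not state the rates or layer configuration used in the paper.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Dilated 1-D cross-correlation with zero padding; output has the input's length."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Kernel taps are spaced `rate` samples apart instead of 1.
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def aspp_1d(signal: Sequence[float], kernel: Sequence[float],
            rates: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Hypothetical toy ASPP: one branch per dilation rate, stacked per position."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


def receptive_field(kernel_size: int, rate: int) -> int:
    """Span of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for r in (1, 2, 4):
        print(r, receptive_field(3, r), atrous_conv1d(impulse, [1, 1, 1], r))
```

With a 3-tap kernel, rates 1, 2, and 4 cover 3, 5, and 9 input samples respectively without adding weights, which is why stacking such branches yields features at several scales. A real ASPP module operates on 2-D feature maps with learned weights and typically fuses the branches with a further convolution.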