STDC-MA Network for Semantic Segmentation

Xiaochun Lei, Linjun Lu, Zetao Jiang, Zhaoting Gong, Chang Lu, Jiaming Liang
IET Image Process., vol. 32 1, pp. 3758-3767. Published 2022-05-10. DOI: 10.48550/arXiv.2205.04639 (https://doi.org/10.48550/arXiv.2205.04639)

Abstract

Semantic segmentation is widely used in autonomous driving and intelligent transportation, where methods place high demands on both spatial and semantic information. Here, an STDC-MA network is proposed to meet these demands. First, STDC-MA adopts the STDC-Seg structure to keep the network lightweight and efficient. Next, a feature alignment module (FAM) learns the offset between high-level and low-level features, which addresses the pixel offset introduced by upsampling the high-level feature map and enables effective fusion of high-level and low-level features. A hierarchical multiscale attention mechanism then captures the relationship between the attention regions obtained from two different input sizes of the same image. Using this relationship, regions that receive high attention are integrated into the segmentation result, which reduces the unfocused regions of the input image and makes better use of multiscale features. STDC-MA maintains the segmentation speed of the STDC-Seg network while improving segmentation accuracy on small objects. STDC-MA was evaluated on the Cityscapes validation set, where it attained 76.81% mIoU with 0.5x-scale input, 3.61% higher than STDC-Seg.
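
The two-scale attention fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the code below assumes a common formulation for hierarchical multiscale attention: an attention map predicted at the lower scale weights the lower-scale prediction, and its complement weights the full-scale prediction. The function names, the nearest-neighbour upsampling, and the single-channel grids are all hypothetical simplifications, not the authors' implementation.

```python
from typing import List

Grid = List[List[float]]


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling of a 2-D grid by an integer factor."""
    out: Grid = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_two_scales(pred_low: Grid, attn_low: Grid, pred_full: Grid,
                    factor: int = 2) -> Grid:
    """Blend a low-scale and a full-scale prediction per pixel.

    Assumed rule (not given in the abstract):
        fused = up(attn_low * pred_low) + (1 - up(attn_low)) * pred_full
    """
    # Weight the low-scale prediction by its attention map.
    weighted_low = [
        [a * p for a, p in zip(attn_row, pred_row)]
        for attn_row, pred_row in zip(attn_low, pred_low)
    ]
    # Bring both the weighted prediction and the attention to full resolution.
    up_weighted = upsample_nearest(weighted_low, factor)
    up_attn = upsample_nearest(attn_low, factor)
    # The full-scale prediction fills in wherever low-scale attention is weak.
    return [
        [w + (1.0 - a) * f for w, a, f in zip(w_row, a_row, f_row)]
        for w_row, a_row, f_row in zip(up_weighted, up_attn, pred_full)
    ]


if __name__ == "__main__":
    pred_low = [[1.0]]                    # 0.5x-scale score for one class
    attn_low = [[0.25]]                   # attention predicted at 0.5x scale
    pred_full = [[0.0, 1.0], [0.5, 0.0]]  # 1.0x-scale score for the same class
    print(fuse_two_scales(pred_low, attn_low, pred_full))
    # prints [[0.25, 1.0], [0.625, 0.25]]
```

With a low attention value of 0.25, the fused map follows the full-scale prediction closely, which is the behaviour one would want for small objects that are only resolved at the larger input size.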