DCM: A Dense-Attention Context Module For Semantic Segmentation
Shenghua Li, Quan Zhou, Jia Liu, Jie Wang, Yawen Fan, Xiaofu Wu, Longin Jan Latecki
2020 IEEE International Conference on Image Processing (ICIP), October 2020
DOI: 10.1109/ICIP40778.2020.9190675
Citations: 2
Abstract
For image semantic segmentation, a fully convolutional network is usually employed as the encoder to extract visual features from the input image, and a carefully designed decoder is used to decode the backbone's final feature map. The output resolution of backbones designed for image classification is too low for segmentation. Most existing methods for obtaining the final high-resolution feature map cannot fully exploit the information contained in the different layers of the backbone. To adequately extract the information of each single layer, the multi-scale context information across layers, and the global information of the backbone, we present a new attention-augmented module named Dense-attention Context Module (DCM), which connects common backbones to various decoding heads. Experiments on the Cityscapes dataset show the promising results of our method.
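The abstract describes the general pattern behind such a module: feature maps from several backbone stages (at different resolutions) are densely aggregated into one high-resolution representation, which is then reweighted by an attention mechanism before being passed to the decoding head. The sketch below is not the paper's DCM; it is a minimal illustration of that pattern, assuming nearest-neighbor upsampling, channel concatenation as the dense aggregation, and a simple squeeze-and-gate channel attention (all names and design choices here are hypothetical, for illustration only).

```python
import numpy as np

def upsample_nearest(x, factor):
    # x: (C, H, W); nearest-neighbor upsampling by an integer factor,
    # standing in for the learned upsampling a real decoder would use.
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def channel_attention(x):
    # Squeeze: global average pool each channel to a scalar -> shape (C,)
    squeezed = x.mean(axis=(1, 2))
    # Excite: sigmoid gate per channel (a real module would insert a
    # small learned MLP between squeeze and gate).
    gates = 1.0 / (1.0 + np.exp(-squeezed))
    return x * gates[:, None, None]

def dense_context_fuse(features):
    # features: list of (C_i, H_i, W_i) maps from shallow to deep stages,
    # with spatial sizes that evenly divide the shallowest map's size.
    target_h = features[0].shape[1]
    ups = [upsample_nearest(f, target_h // f.shape[1]) for f in features]
    fused = np.concatenate(ups, axis=0)  # dense aggregation along channels
    return channel_attention(fused)      # attention-reweighted context map

# Three mock backbone stages: resolution halves, channel count doubles.
f1 = np.random.rand(4, 16, 16)
f2 = np.random.rand(8, 8, 8)
f3 = np.random.rand(16, 4, 4)
out = dense_context_fuse([f1, f2, f3])
print(out.shape)  # (28, 16, 16): all stages fused at the shallowest resolution
```

The fused map keeps the shallowest stage's spatial resolution while carrying channels from every stage, which is the property the abstract asks of the module connecting the backbone to the decoding head.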