Title: Encoding context and decoding aggregated information for semantic segmentation
Authors: Guodong Zhang, Wenzhu Yang, Guoyu Zhou
DOI: 10.1016/j.cag.2024.104144
Journal: Computers & Graphics, Volume 126, Article 104144 (February 2025)
Abstract
During feature extraction, existing encoder–decoder network models often rely on repeated downsampling to expand the receptive field, which reduces the model's ability to capture fine-grained spatial information and makes it difficult to recover the lost spatial details when the feature map is upsampled. In addition, directly fusing the encoder's features with the decoder's causes detailed features to be masked by semantic features. To address these challenges, we build a new semantic segmentation model named ECDAISeg. The model adopts an encoder–decoder structure, and we embed a Context Propagation Module (CPM) and a Blend Feature Balance Module (BFBM) between the encoder and the decoder. The role of the CPM is to recover detail information lost during feature extraction and to provide multi-scale contextual information for a better understanding of global semantics. The BFBM balances high-level semantic information against low-level detailed information through an attention mechanism, thereby filtering out redundant information while preserving important details. Evaluations on the PASCAL VOC 2012 and Cityscapes validation sets show that ECDAISeg achieves 82.85% and 74.49% mIoU, respectively, yielding better segmentation results than various representative segmentation models.
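The abstract does not specify the BFBM's internal design, but the general idea it describes, using an attention weight to balance low-level detail features against high-level semantic features before fusion, can be illustrated with a minimal sketch. The function `blend_feature_balance` and its gating scheme below are assumptions for illustration only, not the paper's actual module:

```python
import numpy as np

def blend_feature_balance(low, high):
    """Hypothetical sketch of attention-balanced feature fusion.

    `low` and `high` are feature maps of shape (C, H, W): low-level
    detail features and high-level semantic features (already upsampled
    to the same resolution). A per-channel gate derived from global
    average pooling weights the blend, so neither stream simply masks
    the other. The real BFBM is not described in this level of detail.
    """
    # Global average pool each stream over the spatial dimensions.
    stats = low.mean(axis=(1, 2)) - high.mean(axis=(1, 2))  # shape (C,)
    # Sigmoid gate: per-channel balance weight in (0, 1).
    w = 1.0 / (1.0 + np.exp(-stats))
    w = w[:, None, None]  # broadcast over H and W
    # Weighted blend: w emphasizes detail, (1 - w) emphasizes semantics.
    return low * w + high * (1.0 - w)

low = np.ones((4, 8, 8))    # stand-in low-level detail features
high = np.zeros((4, 8, 8))  # stand-in high-level semantic features
out = blend_feature_balance(low, high)
print(out.shape)  # (4, 8, 8)
```

In a real network the gate would be learned (e.g. small convolutions instead of a fixed sigmoid of pooled statistics), but the shape of the computation, pool, gate, weighted sum, is the same.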
About the journal:
Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on:
1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains.
2. State-of-the-art papers on late-breaking, cutting-edge research on CG.
3. Information on innovative uses of graphics principles and technologies.
4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.