Encoding context and decoding aggregated information for semantic segmentation

IF 2.8 · CAS Region 4 (Computer Science) · JCR Q2 (Computer Science, Software Engineering) · Computers & Graphics-Uk · Pub Date: 2025-02-01 · Epub Date: 2024-12-18 · DOI: 10.1016/j.cag.2024.104144
Guodong Zhang , Wenzhu Yang , Guoyu Zhou
Computers & Graphics-Uk, Volume 126, Article 104144.
Cited by: 0

Abstract

During feature extraction, existing encoder–decoder networks typically rely on repeated downsampling to enlarge the receptive field. This directly reduces the model's ability to capture fine-grained spatial information, and in turn makes the lost spatial details difficult to recover when the feature map is upsampled. In addition, directly fusing encoder and decoder features causes detailed features to be masked by semantic features. To address these challenges, we build a new semantic segmentation model named ECDAISeg. The model adopts an encoder–decoder structure, with a Context Propagation Module (CPM) and a Blend Feature Balance Module (BFBM) embedded between the encoder and the decoder. The role of the CPM is to recover detail information lost during feature extraction and to provide multi-scale contextual information for a better understanding of global semantics. The BFBM balances high-level semantic information against low-level detail information through an attention mechanism, thereby filtering out redundant information while preserving important details. Evaluations on the PASCAL VOC 2012 and Cityscapes validation sets show that ECDAISeg achieves 82.85% and 74.49% mIoU respectively, realizing better segmentation results than various representative segmentation models.
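The abstract does not give the BFBM's exact formulation. As an illustrative sketch only, a per-pixel attention gate that balances low-level (detail) encoder features against high-level (semantic) decoder features could look like the following; `attention_fusion` and the learned projection `w` are hypothetical names introduced here, not the paper's actual API:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fusion(low, high, w):
    """Blend detail and semantic feature maps with a scalar gate per position.

    low, high : (H, W, C) feature maps at the same resolution.
    w         : (2C, 1) projection producing the per-pixel gate.
    """
    feats = np.concatenate([low, high], axis=-1)   # (H, W, 2C)
    gate = sigmoid(feats @ w)                      # (H, W, 1), in (0, 1)
    # gate -> 1 keeps detail features; gate -> 0 keeps semantic features
    return gate * low + (1.0 - gate) * high        # (H, W, C)

rng = np.random.default_rng(0)
low = rng.standard_normal((4, 4, 8))
high = rng.standard_normal((4, 4, 8))
w = rng.standard_normal((16, 1)) * 0.1
fused = attention_fusion(low, high, w)
print(fused.shape)
```

A real module would learn `w` (typically as a small convolution) jointly with the network; this sketch only shows the gating arithmetic.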
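The reported mIoU figures are the standard metric: per-class intersection-over-union between predicted and ground-truth label maps, averaged over classes. A minimal reference computation (the standard definition, not the authors' evaluation code):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean Intersection-over-Union over classes present in either map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0, 1],
                 [1, 1, 2]])
gt = np.array([[0, 1, 1],
               [1, 1, 2]])
print(mean_iou(pred, gt, 3))  # → 0.75
```

Benchmark evaluations additionally ignore "void" pixels and accumulate the confusion matrix over the whole validation set before averaging; this toy version computes the metric for a single pair of maps.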


Source journal
Computers & Graphics-Uk (Engineering Technology – Computer Science: Software Engineering)
CiteScore: 5.30
Self-citation rate: 12.00%
Articles per year: 173
Review time: 38 days
Journal description: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics, with particular interest in novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge CG research. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.