Encoding context and decoding aggregated information for semantic segmentation

IF 2.8 · CAS Region 4 (Computer Science) · JCR Q2 (Computer Science, Software Engineering) · Computers & Graphics-Uk · Pub Date: 2025-02-01 · Epub Date: 2024-12-18 · DOI: 10.1016/j.cag.2024.104144
Guodong Zhang , Wenzhu Yang , Guoyu Zhou
Computers & Graphics-Uk, Volume 126, Article 104144.
Cited by: 0

Abstract

During feature extraction, existing encoder–decoder networks typically rely on repeated downsampling to enlarge the receptive field. This directly reduces the model's ability to capture fine-grained spatial information, and in turn makes the lost spatial details difficult to recover when the feature map is upsampled. In addition, directly fusing encoder and decoder features causes detailed features to be masked by semantic features. To address these challenges, we build a new semantic segmentation model named ECDAISeg. The model adopts an encoder–decoder structure, with a Context Propagation Module (CPM) and a Blend Feature Balance Module (BFBM) embedded between the encoder and the decoder. The role of the CPM is to recover detail information lost during feature extraction and to provide multi-scale contextual information for a better understanding of global semantics. The BFBM balances high-level semantic information against low-level detail information through an attention mechanism, thereby filtering out redundant information while preserving important details. Evaluations on the PASCAL VOC 2012 and Cityscapes validation sets show that ECDAISeg achieves 82.85% and 74.49% mIoU respectively, realizing better segmentation results than various representative segmentation models.
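The abstract does not give the BFBM's exact formulation. As an illustrative sketch only, a per-pixel attention gate that balances low-level (detail) encoder features against high-level (semantic) decoder features could look like the following; `attention_fusion` and the learned projection `w` are hypothetical names introduced here, not the paper's actual API:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fusion(low, high, w):
    """Blend detail and semantic feature maps with a scalar gate per position.

    low, high : (H, W, C) feature maps at the same resolution.
    w         : (2C, 1) projection producing the per-pixel gate.
    """
    feats = np.concatenate([low, high], axis=-1)   # (H, W, 2C)
    gate = sigmoid(feats @ w)                      # (H, W, 1), in (0, 1)
    # gate -> 1 keeps detail features; gate -> 0 keeps semantic features
    return gate * low + (1.0 - gate) * high        # (H, W, C)

rng = np.random.default_rng(0)
low = rng.standard_normal((4, 4, 8))
high = rng.standard_normal((4, 4, 8))
w = rng.standard_normal((16, 1)) * 0.1
fused = attention_fusion(low, high, w)
print(fused.shape)
```

A real module would learn `w` (typically as a small convolution) jointly with the network; this sketch only shows the gating arithmetic.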
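The reported mIoU figures are the standard metric: per-class intersection-over-union between predicted and ground-truth label maps, averaged over classes. A minimal reference computation (the standard definition, not the authors' evaluation code):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean Intersection-over-Union over classes present in either map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0, 1],
                 [1, 1, 2]])
gt = np.array([[0, 1, 1],
               [1, 1, 2]])
print(mean_iou(pred, gt, 3))  # → 0.75
```

Benchmark evaluations additionally ignore "void" pixels and accumulate the confusion matrix over the whole validation set before averaging; this toy version computes the metric for a single pair of maps.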


Source journal
Computers & Graphics-Uk (Engineering Technology – Computer Science: Software Engineering)
CiteScore: 5.30
Self-citation rate: 12.00%
Articles per year: 173
Review time: 38 days
Journal description: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics, with particular interest in novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge CG research. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.