Chunqi Tian, Weirong Xu, Lizhi Bai, Jun Yang, Yanjun Xu
{"title":"GANet: geometry-aware network for RGB-D semantic segmentation","authors":"Chunqi Tian, Weirong Xu, Lizhi Bai, Jun Yang, Yanjun Xu","doi":"10.1007/s10489-025-06337-0","DOIUrl":null,"url":null,"abstract":"<div><p>The field of RGB-D semantic segmentation has attracted considerable interest in recent times. The challenge is to develop an effective method for combining RGB images, which capture colour variations, with depth images, which provide robust information about object geometry regardless of lighting conditions. Treating both image types equally through the same convolution operator fails to take into account their inherent differences. Thus, in this paper, we propose a novel approach that combines a geometry-aware convolution (<i>GAConv</i>) module and a multiscale fusion module (MFM) with the aim of enhancing the performance of RGB-D image segmentation. The <i>GAConv</i> module effectively captures fine-grained geometric details from depth images, while the MFM module enables efficient integration of multi-scale features, allowing the network to utilise both spatial and semantic information. Extensive experimentation was conducted on the NYUv2 and SUN RGB-D datasets, wherein our model demonstrated consistent superiority over existing state-of-the-art methods in terms of pixel accuracy and mean intersection over union (mIoU).</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-025-06337-0","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
The field of RGB-D semantic segmentation has attracted considerable interest in recent times. The challenge is to develop an effective method for combining RGB images, which capture colour variations, with depth images, which provide robust information about object geometry regardless of lighting conditions. Treating both image types equally through the same convolution operator fails to take into account their inherent differences. Thus, in this paper, we propose a novel approach that combines a geometry-aware convolution (GAConv) module and a multiscale fusion module (MFM) with the aim of enhancing the performance of RGB-D image segmentation. The GAConv module effectively captures fine-grained geometric details from depth images, while the MFM module enables efficient integration of multi-scale features, allowing the network to utilise both spatial and semantic information. Extensive experimentation was conducted on the NYUv2 and SUN RGB-D datasets, wherein our model demonstrated consistent superiority over existing state-of-the-art methods in terms of pixel accuracy and mean intersection over union (mIoU).
期刊介绍:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.