On the relevance of edge-conditioned convolution for GNN-based semantic image segmentation using spatial relationships
Authors: P. Coupeau, Jean-Baptiste Fasquel, M. Dinomais
Published in: 2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)
Publication date: 2022-04-19
DOI: https://doi.org/10.1109/IPTA54936.2022.9784143
Citation count: 1
Abstract
This paper addresses the fundamental task of semantic image segmentation by exploiting structural information (spatial relationships between image regions). To perform this task, we propose to combine a deep neural network (CNN) with inexact “many-to-one-or-none” graph matching, where graphs efficiently encode class probabilities and structural information related to regions segmented by the CNN. To achieve node classification, a basic 2-layer graph neural network (GNN) based on the edge-conditioned convolution operator (ECConv), which handles both node and edge attributes, is considered. Preliminary experiments are performed on both a synthetic dataset and a public dataset of face images (FASSEG). Thanks to a graph-coarsening preprocessing step, our approach is shown to be resilient to small training datasets, which often limit the performance of deep learning. Results show that the proposal reaches perfect accuracy on the synthetic dataset and improves the performance of the CNN by 6% (bounding-box Dice index) on FASSEG. Moreover, it improves the initial Hausdorff distance (i.e., with the CNN only) by 27% when using the entire training dataset and by 41% with only 75% of the training samples.
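To make the idea of an edge-conditioned convolution more concrete, the following is a minimal pure-Python sketch of a 2-layer GNN in which each edge attribute generates its own weight matrix. It is an illustration under stated assumptions, not the authors' implementation: the mean aggregation over incoming edges, the ReLU between layers, the softmax output, and all names (`ecconv`, `two_layer_gnn`, `scaled_identity`) are hypothetical choices, and the filter-generating function is hand-written here rather than learned.

```python
import math

def ecconv(x, edges, edge_attr, filter_net, bias):
    """One ECConv-style layer (assumed form): for each node, average
    Theta(e_ij) @ x_j over its incoming edges, then add a bias.

    x          : list of node feature vectors (each of length d_in)
    edges      : list of (src, dst) node index pairs
    edge_attr  : one attribute vector per edge
    filter_net : callable mapping an edge attribute vector to a
                 d_out x d_in weight matrix (learned in practice)
    bias       : list of length d_out
    """
    d_out = len(bias)
    acc = [[0.0] * d_out for _ in x]
    deg = [0] * len(x)
    for (src, dst), e in zip(edges, edge_attr):
        theta = filter_net(e)  # edge-specific weight matrix
        for r in range(d_out):
            acc[dst][r] += sum(w * v for w, v in zip(theta[r], x[src]))
        deg[dst] += 1
    # Mean aggregation; nodes without incoming edges keep only the bias.
    return [[a / max(deg[i], 1) + b for a, b in zip(acc[i], bias)]
            for i in range(len(x))]

def relu(h):
    return [[max(0.0, v) for v in row] for row in h]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [v / total for v in exps]

def two_layer_gnn(x, edges, edge_attr, net1, b1, net2, b2):
    """Two stacked ECConv-style layers followed by a per-node softmax."""
    hidden = relu(ecconv(x, edges, edge_attr, net1, b1))
    logits = ecconv(hidden, edges, edge_attr, net2, b2)
    return [softmax(row) for row in logits]

# Toy filter generator (hypothetical): a scalar edge attribute scales
# the identity matrix, so a message is just the neighbor's features
# multiplied by the edge attribute.
def scaled_identity(e):
    return [[e[0], 0.0], [0.0, e[0]]]

# Two regions with 2-class scores as node attributes, connected in both
# directions; the edge attribute stands in for a spatial relationship.
x = [[1.0, 2.0], [3.0, 4.0]]
edges = [(0, 1), (1, 0)]
edge_attr = [[2.0], [0.5]]

layer_out = ecconv(x, edges, edge_attr, scaled_identity, [0.0, 0.0])
probs = two_layer_gnn(x, edges, edge_attr,
                      scaled_identity, [0.0, 0.0],
                      scaled_identity, [0.0, 0.0])
```

In this toy graph, node 0 receives node 1's features scaled by 0.5 and node 1 receives node 0's features scaled by 2.0, which shows how the same node features yield different messages depending on the edge attribute. In a trained model, `filter_net` would be a small neural network whose parameters are learned jointly with the rest of the GNN.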