Semantic Segmentation of RGB-D Images Using 3D and Local Neighbouring Features

F. Fooladgar, S. Kasaei
2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA), published 2015-11-01
DOI: 10.1109/DICTA.2015.7371307 (https://doi.org/10.1109/DICTA.2015.7371307)
Citations: 5

Abstract

3D scene understanding is one of the most important problems in computer vision. Although considerable attention has been devoted to 2D scene understanding over the past decades, with the development of depth sensors (such as Microsoft Kinect), 3D scene understanding has now become a very challenging task. Traditionally, scene understanding has been treated as the semantic labeling of each image pixel. Semantic labeling of RGB-D images has not attained success comparable to that of RGB semantic labeling, owing to the lack of a challenging dataset. The introduction of the NYU-V2 RGB-D dataset made it possible to propose a novel method that improves labeling accuracy. In this paper, a semantic segmentation algorithm for RGB-D images is presented. The proposed algorithm focuses on the feature description and classification steps. In the feature description step, the more discriminative features from the RGB images and the 3D point cloud data are grouped with local neighboring features to incorporate their context into the classification step. In the classification step, a pairwise multi-class conditional random field framework is used, in which the unary potential function is the probabilistic output of a random forest classifier. The proposed algorithm is evaluated on the NYU-V2 dataset, and its performance is compared with that of other methods in the literature. The proposed algorithm achieves state-of-the-art results on the NYU-V2 dataset.
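The classification step described above can be illustrated with a small sketch. The abstract does not state the form of the pairwise term or the inference method, so the code below assumes a Potts pairwise penalty and iterated conditional modes (ICM) purely for illustration. The per-node class probabilities stand in for the output of a random forest, and all function names and the `weight` parameter are hypothetical.

```python
import math

def unary_potentials(probs, eps=1e-9):
    """Convert per-node class probabilities into unary costs, -log p."""
    return [[-math.log(max(p, eps)) for p in node] for node in probs]

def energy(labels, unary, edges, weight):
    """CRF energy: unary costs plus a Potts penalty for each edge whose ends disagree."""
    total = sum(unary[i][labels[i]] for i in range(len(labels)))
    total += sum(weight for i, j in edges if labels[i] != labels[j])
    return total

def icm(probs, edges, weight=1.0, max_iters=10):
    """Approximate the minimum-energy labelling with iterated conditional modes.

    Assumption: the paper's inference method is not given in the abstract;
    ICM is used here only because it is short and easy to follow.
    """
    unary = unary_potentials(probs)
    n = len(probs)
    neighbours = [[] for _ in range(n)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    # Start from the classifier's independent per-node decision.
    labels = [min(range(len(u)), key=u.__getitem__) for u in unary]
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            def cost(k, i=i):
                disagree = sum(1 for j in neighbours[i] if labels[j] != k)
                return unary[i][k] + weight * disagree
            best = min(range(len(unary[i])), key=cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Three regions in a chain; the middle one is ambiguous on its own.
probs = [[0.9, 0.1], [0.45, 0.55], [0.9, 0.1]]
edges = [(0, 1), (1, 2)]
smoothed = icm(probs, edges, weight=1.0)
independent = icm(probs, edges, weight=0.0)
```

With `weight=0.0` each node keeps its most probable class, while a positive weight lets confident neighbours override an ambiguous node, which is the role the pairwise term plays in such a framework.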