Indoor Scene Recognition with a Visual Attention-Driven Spatial Pooling Strategy

Tarek Elguebaly, N. Bouguila
2014 Canadian Conference on Computer and Robot Vision, published 2014-05-06. DOI: 10.1109/CRV.2014.43 (https://doi.org/10.1109/CRV.2014.43)
Citations: 5

Abstract

Scene recognition is an important research topic in robotics and computer vision. Although scene recognition has been studied in depth, progress on indoor scene categorization has been slow. Indoor scene recognition is challenging because of high intra-class variability, which stems mainly from the intrinsic variety of objects that may be present, and because of inter-class similarities among man-made indoor structures. As a result, most scene recognition techniques that work well for outdoor scenes perform poorly on indoor scenes. In this paper, we present a simple yet effective method for indoor scene recognition. Our approach proceeds as follows. First, we extract dense SIFT descriptors. Then, we combine a saliency-driven perceptual pooling with a simple spatial pooling scheme. Once the spatial and saliency-driven encodings have been determined, we use vector quantization to compute histograms of local features from each sub-region. The histograms from all sub-regions are then concatenated to generate the final representation of the image. Finally, a model-based mixture classifier, which uses mixture models to characterize class densities, is applied. To model the non-Gaussian data that are largely present in our final image representation, we use the generalized Gaussian mixture (GGM), which can be a good alternative to the Gaussian thanks to its shape flexibility. The proposed statistical model is learned using the rival penalized expectation-maximization (RPEM) algorithm, which performs model selection and parameter learning together in a single step. Furthermore, we address the feature selection problem by determining a set of relevant features for each data cluster, which speeds up the learning algorithm and removes noisy, redundant, or uninformative features. To validate the proposed method, we test it on the MIT indoor scenes data set.
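
The encoding stage described in the abstract (vector quantization, per-region histograms, concatenation) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how the saliency-driven pooling defines its sub-regions, so the two-way split by a saliency threshold is an assumption, as are the function names `quantize` and `pooled_histogram`, the 2x2 spatial grid, and the per-region L1 normalization. Descriptors are toy 2-D points standing in for dense SIFT vectors.

```python
from math import dist


def quantize(descriptor, codebook):
    """Vector quantization: index of the nearest codeword (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def pooled_histogram(words, positions, saliency, vocab_size,
                     image_size, grid=2, threshold=0.5):
    """Concatenate visual-word histograms from spatial and saliency sub-regions.

    Regions 0 .. grid*grid-1 are spatial grid cells in row-major order.
    The last two regions hold descriptors with saliency >= threshold and
    saliency < threshold. ASSUMPTION: this threshold split is illustrative
    only; the source does not specify the saliency pooling rule.
    """
    width, height = image_size
    n_spatial = grid * grid
    hists = [[0] * vocab_size for _ in range(n_spatial + 2)]

    for word, (x, y), s in zip(words, positions, saliency):
        # Spatial pooling: locate the grid cell containing the descriptor.
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        hists[row * grid + col][word] += 1
        # Saliency pooling: salient vs. non-salient region.
        hists[n_spatial + (0 if s >= threshold else 1)][word] += 1

    # L1-normalize each region, then concatenate into one feature vector.
    feature = []
    for h in hists:
        total = sum(h)
        feature.extend(c / total if total else 0.0 for c in h)
    return feature


# Toy usage: 3 descriptors, a 2-word codebook, a 100x100 image.
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8)]
words = [quantize(d, codebook) for d in descriptors]
feature = pooled_histogram(
    words,
    positions=[(10, 10), (90, 10), (90, 90)],
    saliency=[0.9, 0.2, 0.7],
    vocab_size=2,
    image_size=(100, 100),
)
```

The resulting vector has (grid*grid + 2) * vocab_size entries. In the paper's pipeline such a vector would then be passed to the GGM-based mixture classifier, which this sketch does not cover.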