Regional Relation Modeling for Visual Place Recognition

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Pub Date: 2020-07-25. DOI: 10.1145/3397271.3401176

Yingying Zhu, Biao Li, Jiong Wang, Zhou Zhao


Citations: 4

Abstract

In the process of visual perception, humans perceive not only the appearance of objects existing in a place but also their relationships (e.g. spatial layout). However, the dominant works on visual place recognition are always based on the assumption that two images depict the same place if they contain enough similar objects, while the relation information is neglected. In this paper, we propose a regional relation module which models the regional relationships and converts the convolutional feature maps to the relational feature maps. We further design a cascaded pooling method to get discriminative relation descriptors by preventing the influence of confusing relations and preserving as much useful information as possible. Extensive experiments on two place recognition benchmarks demonstrate that training with the proposed regional relation module improves the appearance descriptors and the relation descriptors are complementary to appearance descriptors. When these two kinds of descriptors are concatenated together, the resulting combined descriptors outperform the state-of-the-art methods.
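
The final step of the abstract, concatenating appearance and relation descriptors and using the combined descriptor for recognition, can be illustrated with a minimal sketch. The abstract does not say how descriptors are normalized or compared, so the L2 normalization, the cosine-similarity ranking, the toy vectors, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def combine(appearance, relation):
    """Assumed scheme: normalize each descriptor, concatenate, then renormalize."""
    return l2_normalize(l2_normalize(appearance) + l2_normalize(relation))

def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs have unit length."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, database):
    """Return database indices ordered from most to least similar to the query."""
    scores = [cosine(query, d) for d in database]
    return sorted(range(len(database)), key=lambda i: scores[i], reverse=True)

# Hypothetical toy descriptors: 3-D appearance, 2-D relation.
db = [
    combine([1.0, 0.0, 0.0], [1.0, 0.0]),  # place 0: matching objects, different layout
    combine([1.0, 0.0, 0.0], [0.0, 1.0]),  # place 1: matching objects and layout
    combine([0.0, 1.0, 0.0], [1.0, 0.0]),  # place 2: different objects and layout
]
query = combine([0.9, 0.1, 0.0], [0.1, 0.9])
ranking = retrieve(query, db)
print(ranking)  # [1, 0, 2]
```

In this toy setup, places 0 and 1 have identical appearance descriptors, so appearance alone cannot separate them; the relation part of the combined descriptor is what ranks place 1 first. This mirrors the abstract's claim that the two descriptor types are complementary, though only on made-up data.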


