Patch-NetVLAD+: Learned patch descriptor and weighted matching strategy for place recognition

2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI); Pub Date: 2022-02-11; DOI: 10.1109/MFI55806.2022.9913860

Yingfeng Cai, Junqiao Zhao, Jiafeng Cui, Fenglin Zhang, Chen Ye, T. Feng

Citations: 2

Abstract

Visual Place Recognition (VPR) in areas with similar scenes, such as urban or indoor environments, is a major challenge. Existing VPR methods that use global descriptors have difficulty capturing the local specific regions (LSRs) in a scene and are therefore prone to localization confusion in such scenarios. Finding the LSRs that are critical for location recognition thus becomes key. To address this challenge, we introduce Patch-NetVLAD+, which is inspired by patch-based VPR research. Our method proposes a fine-tuning strategy with triplet loss to make NetVLAD suitable for extracting patch-level descriptors. Moreover, unlike existing methods that treat all patches in an image equally, our method extracts the patches of LSRs, which appear less frequently throughout the dataset, and gives them an important role in VPR by assigning them proper weights. Experiments on the Pittsburgh30k and Tokyo247 datasets show that our approach achieves up to a 9.3% performance improvement over existing patch-based methods.
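The abstract names two ingredients, a triplet loss for fine-tuning patch descriptors and a weighting that favors patches seen rarely in the dataset, but gives no formulas. The NumPy sketch below is only an illustration under stated assumptions: it uses the standard triplet margin loss on squared Euclidean distances, and a hypothetical inverse-frequency weight `1 / (1 + count)`. The function names, the `margin` and `threshold` parameters, and the weighting rule are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Standard triplet margin loss on squared Euclidean distances.

    Each argument is an (N, D) array of patch descriptors.
    The margin value is an arbitrary choice for this sketch.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())


def rarity_weights(query_patches, database_patches, threshold=0.8):
    """Hypothetical weighting: patches with few look-alikes get more weight.

    Descriptors are assumed to be L2-normalized, so the dot product
    is the cosine similarity. This rule is an assumption, not the
    weighting used in the paper.
    """
    sims = query_patches @ database_patches.T
    counts = (sims > threshold).sum(axis=1)  # look-alikes per query patch
    weights = 1.0 / (1.0 + counts)
    return weights / weights.sum()


def weighted_match_score(query_patches, reference_patches, weights):
    """Weighted sum of each query patch's best similarity to the reference."""
    best = (query_patches @ reference_patches.T).max(axis=1)
    return float(np.sum(weights * best))


# Toy data: patch type a occurs three times in the database, b only once.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
database = np.stack([a, a, a, b])
query = np.stack([a, b])

weights = rarity_weights(query, database)
score_with_rare_patch = weighted_match_score(query, np.stack([b]), weights)
score_with_common_patch = weighted_match_score(query, np.stack([a]), weights)
```

In this toy setup the rarer patch `b` receives the larger weight, so a reference image that contains only `b` scores higher than one that contains only the common patch `a`, which is the qualitative behavior the abstract describes.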

Source journal

2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)

Self-citation rate

0.00%

Number of articles published