Yingfeng Cai, Junqiao Zhao, Jiafeng Cui, Fenglin Zhang, Chen Ye, T. Feng
{"title":"patch - netvlad +:学习了用于位置识别的补丁描述符和加权匹配策略","authors":"Yingfeng Cai, Junqiao Zhao, Jiafeng Cui, Fenglin Zhang, Chen Ye, T. Feng","doi":"10.1109/MFI55806.2022.9913860","DOIUrl":null,"url":null,"abstract":"Visual Place Recognition (VPR) in areas with similar scenes such as urban or indoor scenarios is a major challenge. Existing VPR methods using global descriptors have difficulty capturing local specific region (LSR) in the scene and are therefore prone to localization confusion in such scenarios. As a result, finding the LSRs that are critical for location recognition becomes key. To address this challenge, we introduced Patch-NetVLAD+, which was inspired by patch-based VPR researches. Our method proposed a fine-tuning strategy with triplet loss to make NetVLAD suitable for extracting patch-level descriptors. Moreover, unlike existing methods that treat all patches in an image equally, our method extracts patches of LSR, which present less frequently throughout the dataset, and makes them play an important role in VPR by assigning proper weights to them. Experiments on Pittsburgh30k and Tokyo247 datasets show that our approach achieved up to 9.3% performance improvement than existing patch-based methods.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Patch-NetVLAD+: Learned patch descriptor and weighted matching strategy for place recognition\",\"authors\":\"Yingfeng Cai, Junqiao Zhao, Jiafeng Cui, Fenglin Zhang, Chen Ye, T. Feng\",\"doi\":\"10.1109/MFI55806.2022.9913860\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Visual Place Recognition (VPR) in areas with similar scenes such as urban or indoor scenarios is a major challenge. Existing VPR methods using global descriptors have difficulty capturing local specific region (LSR) in the scene and are therefore prone to localization confusion in such scenarios. As a result, finding the LSRs that are critical for location recognition becomes key. To address this challenge, we introduced Patch-NetVLAD+, which was inspired by patch-based VPR researches. Our method proposed a fine-tuning strategy with triplet loss to make NetVLAD suitable for extracting patch-level descriptors. Moreover, unlike existing methods that treat all patches in an image equally, our method extracts patches of LSR, which present less frequently throughout the dataset, and makes them play an important role in VPR by assigning proper weights to them. Experiments on Pittsburgh30k and Tokyo247 datasets show that our approach achieved up to 9.3% performance improvement than existing patch-based methods.\",\"PeriodicalId\":344737,\"journal\":{\"name\":\"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-02-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MFI55806.2022.9913860\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MFI55806.2022.9913860","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Patch-NetVLAD+: Learned patch descriptor and weighted matching strategy for place recognition
Visual Place Recognition (VPR) in areas with similar scenes such as urban or indoor scenarios is a major challenge. Existing VPR methods using global descriptors have difficulty capturing local specific region (LSR) in the scene and are therefore prone to localization confusion in such scenarios. As a result, finding the LSRs that are critical for location recognition becomes key. To address this challenge, we introduced Patch-NetVLAD+, which was inspired by patch-based VPR researches. Our method proposed a fine-tuning strategy with triplet loss to make NetVLAD suitable for extracting patch-level descriptors. Moreover, unlike existing methods that treat all patches in an image equally, our method extracts patches of LSR, which present less frequently throughout the dataset, and makes them play an important role in VPR by assigning proper weights to them. Experiments on Pittsburgh30k and Tokyo247 datasets show that our approach achieved up to 9.3% performance improvement than existing patch-based methods.