全局辅助局部:视场约束图像地理定位的有效空中表示

2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Pub Date : 2022-01-01 DOI:10.1109/WACV51458.2022.00275

Royston Rodrigues, Masahiro Tani

{"title":"全局辅助局部:视场约束图像地理定位的有效空中表示","authors":"Royston Rodrigues, Masahiro Tani","doi":"10.1109/WACV51458.2022.00275","DOIUrl":null,"url":null,"abstract":"When we humans recognize places from images, we not only infer about the objects that are available but even think about landmarks that might be surrounding it. Current place recognition approaches lack the ability to go beyond objects that are available in the image and hence miss out on understanding the scene completely. In this paper, we take a step towards holistic scene understanding. We address the problem of image geo-localization by retrieving corresponding aerial views from a large database of geotagged aerial imagery. One of the main challenges in tackling this problem is the limited Field of View (FoV) nature of query images which needs to be matched to aerial views which contain 360°FoV details. State-of-the-art method DSM-Net [19] tackles this challenge by matching aerial images locally within fixed FoV sectors. We show that local matching limits complete scene understanding and is inadequate when partial buildings are visible in query images or when local sectors of aerial images are covered by dense trees. Our approach considers both local and global properties of aerial images and hence is robust to such conditions. Experiments on standard benchmarks demonstrates that the proposed approach improves top-1% image recall rate on the CVACT [9] data-set from 57.08% to 77.19% and from 61.20% to 75.21% on the CVUSA [28] data-set for 70°FoV. We also achieve state-of-the art results for 90°FoV on both CVACT [9] and CVUSA [28] data-sets demonstrating the effectiveness of our proposed method.","PeriodicalId":297092,"journal":{"name":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Global Assists Local: Effective Aerial Representations for Field of View Constrained Image Geo-Localization\",\"authors\":\"Royston Rodrigues, Masahiro Tani\",\"doi\":\"10.1109/WACV51458.2022.00275\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When we humans recognize places from images, we not only infer about the objects that are available but even think about landmarks that might be surrounding it. Current place recognition approaches lack the ability to go beyond objects that are available in the image and hence miss out on understanding the scene completely. In this paper, we take a step towards holistic scene understanding. We address the problem of image geo-localization by retrieving corresponding aerial views from a large database of geotagged aerial imagery. One of the main challenges in tackling this problem is the limited Field of View (FoV) nature of query images which needs to be matched to aerial views which contain 360°FoV details. State-of-the-art method DSM-Net [19] tackles this challenge by matching aerial images locally within fixed FoV sectors. We show that local matching limits complete scene understanding and is inadequate when partial buildings are visible in query images or when local sectors of aerial images are covered by dense trees. Our approach considers both local and global properties of aerial images and hence is robust to such conditions. Experiments on standard benchmarks demonstrates that the proposed approach improves top-1% image recall rate on the CVACT [9] data-set from 57.08% to 77.19% and from 61.20% to 75.21% on the CVUSA [28] data-set for 70°FoV. We also achieve state-of-the art results for 90°FoV on both CVACT [9] and CVUSA [28] data-sets demonstrating the effectiveness of our proposed method.\",\"PeriodicalId\":297092,\"journal\":{\"name\":\"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WACV51458.2022.00275\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACV51458.2022.00275","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

当我们人类从图像中识别地点时，我们不仅会推断出可用的物体，甚至还会考虑周围可能存在的地标。目前的位置识别方法缺乏超越图像中可用物体的能力，因此无法完全理解场景。在本文中，我们向整体场景理解迈出了一步。我们通过从大型地理标记航空图像数据库中检索相应的鸟瞰图来解决图像地理定位问题。解决这个问题的主要挑战之一是查询图像的有限视场(FoV)性质，需要与包含360°FoV细节的鸟瞰图相匹配。最先进的方法DSM-Net[19]通过在固定视场区域内局部匹配航空图像来解决这一挑战。我们表明，局部匹配限制了完整的场景理解，当查询图像中可见部分建筑物或航拍图像的局部区域被茂密的树木覆盖时，局部匹配是不充分的。我们的方法考虑了航空图像的局部和全局特性，因此对这些条件具有鲁棒性。标准基准实验表明，该方法将CVACT[9]数据集上前1%的图像召回率从57.08%提高到77.19%，在CVUSA[28]数据集上从61.20%提高到75.21%。我们还在CVACT[9]和CVUSA[28]数据集上实现了90°视场的最先进结果，证明了我们提出的方法的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Global Assists Local: Effective Aerial Representations for Field of View Constrained Image Geo-Localization

When we humans recognize places from images, we not only infer about the objects that are available but even think about landmarks that might be surrounding it. Current place recognition approaches lack the ability to go beyond objects that are available in the image and hence miss out on understanding the scene completely. In this paper, we take a step towards holistic scene understanding. We address the problem of image geo-localization by retrieving corresponding aerial views from a large database of geotagged aerial imagery. One of the main challenges in tackling this problem is the limited Field of View (FoV) nature of query images which needs to be matched to aerial views which contain 360°FoV details. State-of-the-art method DSM-Net [19] tackles this challenge by matching aerial images locally within fixed FoV sectors. We show that local matching limits complete scene understanding and is inadequate when partial buildings are visible in query images or when local sectors of aerial images are covered by dense trees. Our approach considers both local and global properties of aerial images and hence is robust to such conditions. Experiments on standard benchmarks demonstrates that the proposed approach improves top-1% image recall rate on the CVACT [9] data-set from 57.08% to 77.19% and from 61.20% to 75.21% on the CVUSA [28] data-set for 70°FoV. We also achieve state-of-the art results for 90°FoV on both CVACT [9] and CVUSA [28] data-sets demonstrating the effectiveness of our proposed method.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

自引率

0.00%

发文量