GeoDTR+:通过几何解缠实现通用跨视图地理定位

Xiaohan Zhang, Xingyu Li, Waqas Sultani, Chen Chen, Safwan Wshah
{"title":"GeoDTR+:通过几何解缠实现通用跨视图地理定位","authors":"Xiaohan Zhang, Xingyu Li, Waqas Sultani, Chen Chen, Safwan Wshah","doi":"10.1109/TPAMI.2024.3443652","DOIUrl":null,"url":null,"abstract":"<p><p>Cross-View Geo-Localization (CVGL) estimates the location of a ground image by matching it to a geo-tagged aerial image in a database. Recent works achieve outstanding progress on CVGL benchmarks. However, existing methods still suffer from poor performance in cross-area evaluation, in which the training and testing data are captured from completely distinct areas. We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models' overfitting to low-level details. Our preliminary work [1] introduced a Geometric Layout Extractor (GLE) to capture the geometric layout from input features. However, the previous GLE does not fully exploit information in the input feature. In this work, we propose GeoDTR+ with an enhanced GLE module that better models the correlations among visual features. To fully explore the LS techniques from our preliminary work, we further propose Contrastive Hard Samples Generation (CHSG) to facilitate model training. Extensive experiments show that GeoDTR+ achieves state-of-the-art (SOTA) results in cross-area evaluation on CVUSA [2], CVACT [3], and VIGOR [4] by a large margin ( 16.44%, 22.71%, and 13.66% without polar transformation) while keeping the same-area performance comparable to existing SOTA. Moreover, we provide detailed analyses of GeoDTR+. Our code will be available at https://gitlab.com/vail-uvm/geodtr_plus.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GeoDTR+: Toward Generic Cross-View Geolocalization via Geometric Disentanglement.\",\"authors\":\"Xiaohan Zhang, Xingyu Li, Waqas Sultani, Chen Chen, Safwan Wshah\",\"doi\":\"10.1109/TPAMI.2024.3443652\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Cross-View Geo-Localization (CVGL) estimates the location of a ground image by matching it to a geo-tagged aerial image in a database. Recent works achieve outstanding progress on CVGL benchmarks. However, existing methods still suffer from poor performance in cross-area evaluation, in which the training and testing data are captured from completely distinct areas. We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models' overfitting to low-level details. Our preliminary work [1] introduced a Geometric Layout Extractor (GLE) to capture the geometric layout from input features. However, the previous GLE does not fully exploit information in the input feature. In this work, we propose GeoDTR+ with an enhanced GLE module that better models the correlations among visual features. To fully explore the LS techniques from our preliminary work, we further propose Contrastive Hard Samples Generation (CHSG) to facilitate model training. Extensive experiments show that GeoDTR+ achieves state-of-the-art (SOTA) results in cross-area evaluation on CVUSA [2], CVACT [3], and VIGOR [4] by a large margin ( 16.44%, 22.71%, and 13.66% without polar transformation) while keeping the same-area performance comparable to existing SOTA. Moreover, we provide detailed analyses of GeoDTR+. Our code will be available at https://gitlab.com/vail-uvm/geodtr_plus.</p>\",\"PeriodicalId\":94034,\"journal\":{\"name\":\"IEEE transactions on pattern analysis and machine intelligence\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on pattern analysis and machine intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TPAMI.2024.3443652\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2024.3443652","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

跨视图地理定位(CVGL)通过将地面图像与数据库中带有地理标记的航空图像进行匹配,来估算地面图像的位置。最近的工作在 CVGL 基准方面取得了突出进展。然而,现有方法在跨区域评估中仍然表现不佳,在跨区域评估中,训练数据和测试数据来自完全不同的区域。我们将这一缺陷归咎于缺乏提取视觉特征几何布局的能力以及模型对低级细节的过度拟合。我们的前期工作[1]引入了几何布局提取器(GLE),从输入特征中捕捉几何布局。然而,之前的 GLE 并没有充分利用输入特征中的信息。在这项工作中,我们提出了带有增强型 GLE 模块的 GeoDTR+,该模块能更好地模拟视觉特征之间的相关性。为了充分挖掘前期工作中的 LS 技术,我们进一步提出了对比硬样本生成(CHSG)技术,以促进模型训练。广泛的实验表明,GeoDTR+ 在 CVUSA [2]、CVACT [3] 和 VIGOR [4] 的跨区域评估中以较大的优势(16.44%、22.71% 和 13.66%,无极性变换)取得了最先进的(SOTA)结果,同时保持了与现有 SOTA 相当的同区域性能。此外,我们还对 GeoDTR+ 进行了详细分析。我们的代码将发布在 https://gitlab.com/vail-uvm/geodtr_plus 网站上。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
GeoDTR+: Toward Generic Cross-View Geolocalization via Geometric Disentanglement.

Cross-View Geo-Localization (CVGL) estimates the location of a ground image by matching it to a geo-tagged aerial image in a database. Recent works achieve outstanding progress on CVGL benchmarks. However, existing methods still suffer from poor performance in cross-area evaluation, in which the training and testing data are captured from completely distinct areas. We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models' overfitting to low-level details. Our preliminary work [1] introduced a Geometric Layout Extractor (GLE) to capture the geometric layout from input features. However, the previous GLE does not fully exploit information in the input feature. In this work, we propose GeoDTR+ with an enhanced GLE module that better models the correlations among visual features. To fully explore the LS techniques from our preliminary work, we further propose Contrastive Hard Samples Generation (CHSG) to facilitate model training. Extensive experiments show that GeoDTR+ achieves state-of-the-art (SOTA) results in cross-area evaluation on CVUSA [2], CVACT [3], and VIGOR [4] by a large margin ( 16.44%, 22.71%, and 13.66% without polar transformation) while keeping the same-area performance comparable to existing SOTA. Moreover, we provide detailed analyses of GeoDTR+. Our code will be available at https://gitlab.com/vail-uvm/geodtr_plus.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Diversifying Policies with Non-Markov Dispersion to Expand the Solution Space. Integrating Neural Radiance Fields End-to-End for Cognitive Visuomotor Navigation. Variational Label Enhancement for Instance-Dependent Partial Label Learning. TagCLIP: Improving Discrimination Ability of Zero-Shot Semantic Segmentation. Efficient Neural Collaborative Search for Pickup and Delivery Problems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1