Tingyu Wang;Zihao Yang;Quan Chen;Yaoqi Sun;Chenggang Yan
IEEE Signal Processing Letters, vol. 31, pp. 3005-3009, published 2024-10-21. DOI: 10.1109/LSP.2024.3484330. URL: https://ieeexplore.ieee.org/document/10723789/
Impact factor: 3.2. JCR: Q2 (Engineering, Electrical & Electronic).
Rethinking Pooling for Multi-Granularity Features in Aerial-View Geo-Localization
Vision-based aerial-view geo-localization aims to match drone and satellite views of the same geographical location. Several feature partition strategies divide spatial features to mine contextual information. However, the compression from fine-grained features to visual descriptors is ill-considered: classical pooling destroys discriminative features while increasing the sensitivity of networks to contextual information. To clarify this, we first review existing pooling layers and analyze their pros and cons when applied to feature compression. Inspired by the appearance of aerial views, we then summarize an ideal feature compression operation, i.e., precisely highlighting the central target while maximizing the use of environmental information in a feature-smoothing manner. To achieve this, we propose a distance-dependent parameter initialization strategy and form a novel pooling called $D^{2}$-GeM pooling, which can explicitly guide the network to compress fine-grained features in multiple patterns. Extensive experiments on the public benchmark University-1652 substantiate that our strategy attains more appealing results without additional costs.
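The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It uses standard generalized-mean (GeM) pooling, (mean(x^p))^(1/p), which behaves like average pooling at p = 1 and approaches max pooling as p grows. Everything else is an assumption made for illustration: the square-ring partition, the helper names (`ring_index`, `init_p_by_distance`, `ring_gem_descriptors`), the linear decay of p, and the values `p_center=6.0` and `p_edge=1.0`. The abstract does not give the actual initialization rule; the sketch only reads "highlighting the central target" as a larger p near the centre and "feature-smoothing" as a smaller p near the border. In a real network p would be a learnable parameter and pooling would run per channel.

```python
def gem_pool(values, p, eps=1e-6):
    """Generalized-mean (GeM) pooling over a list of activations."""
    if not values:
        raise ValueError("cannot pool an empty region")
    clamped = [max(v, eps) for v in values]  # GeM assumes positive inputs
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)


def ring_index(row, col, size, num_rings):
    """Map a cell of a size x size grid to a square ring; ring 0 is the centre."""
    centre = (size - 1) / 2.0
    dist = max(abs(row - centre), abs(col - centre))  # Chebyshev distance
    max_dist = centre if centre > 0 else 1.0
    return min(int(dist / max_dist * num_rings), num_rings - 1)


def init_p_by_distance(num_rings, p_center=6.0, p_edge=1.0):
    """Hypothetical rule: p decays linearly from the centre ring to the edge ring."""
    if num_rings == 1:
        return [p_center]
    step = (p_center - p_edge) / (num_rings - 1)
    return [p_center - i * step for i in range(num_rings)]


def ring_gem_descriptors(feature_map, num_rings=3):
    """Pool each ring of a single-channel square map with its own initial p."""
    size = len(feature_map)
    ps = init_p_by_distance(num_rings)
    rings = [[] for _ in range(num_rings)]
    for r in range(size):
        for c in range(size):
            rings[ring_index(r, c, size, num_rings)].append(feature_map[r][c])
    return [gem_pool(vals, p) for vals, p in zip(rings, ps)]


if __name__ == "__main__":
    # Toy 6 x 6 single-channel map with a strong response near the centre.
    fmap = [[0.1] * 6 for _ in range(6)]
    fmap[2][2] = fmap[3][3] = 1.0
    print(ring_gem_descriptors(fmap, num_rings=3))
```

With this partition a grid that is too small for the requested number of rings can leave a ring empty, in which case `gem_pool` raises `ValueError` rather than returning a meaningless value.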
About the journal:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.