A Frequency Domain Auxiliary Network for Image Retrieval
Authors: Zhiming Zhang; Jiao Liu; Yongfeng Dong; Jun Zhang
Journal: IEEE Signal Processing Letters (Journal Article; impact factor 3.2; JCR Q2, Engineering, Electrical & Electronic)
Publication date: 2024-09-09
DOI: 10.1109/LSP.2024.3456632
URL: https://ieeexplore.ieee.org/document/10670001/
Image retrieval aims to find the images in a database that are most semantically similar to a query. Existing deep hash-based retrieval algorithms use data augmentation strategies to generate hash codes that generalize better. However, simple data augmentation improves the accuracy of hash codes only from the perspective of sample diversity, without fully exploiting the inherent characteristics of the images. In this letter, we explore the frequency domain information of images and propose a Frequency Domain Auxiliary Network (FDANet) for deep hash retrieval. To capture frequency domain information that can cope with image transformations, we develop a spectrum enhancement module (SEM) in FDANet. The SEM uses the Fourier transform to extract the amplitude component, which reflects the low-level statistics of the image. Then, leveraging the extracted amplitude components, the retrieval network enhances its perception of regions undergoing relative changes in the original spatial domain. Experiments on several image retrieval benchmarks demonstrate that our method outperforms other state-of-the-art hashing algorithms on the evaluation metrics.
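The amplitude extraction mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's SEM, whose architecture is not described here. It only shows how a 2D Fourier transform splits an image into amplitude and phase, and that the amplitude is unchanged by a circular shift, one simple kind of image transformation. The function names (`dft2`, `amplitude_and_phase`, `circular_shift`), the tiny 4x4 grayscale image, and the naive standard-library DFT are all assumptions made for illustration; a real pipeline would likely use a library FFT.

```python
import cmath
import math


def dft2(image):
    """Naive 2D discrete Fourier transform of a list-of-lists grayscale image."""
    rows, cols = len(image), len(image[0])
    spectrum = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = -2 * math.pi * (u * x / rows + v * y / cols)
                    total += image[x][y] * cmath.exp(1j * angle)
            spectrum[u][v] = total
    return spectrum


def amplitude_and_phase(image):
    """Split the spectrum into its amplitude (magnitude) and phase components."""
    spectrum = dft2(image)
    amplitude = [[abs(c) for c in row] for row in spectrum]
    phase = [[cmath.phase(c) for c in row] for row in spectrum]
    return amplitude, phase


def circular_shift(image, dx, dy):
    """Shift the image by (dx, dy) pixels with wrap-around at the borders."""
    rows, cols = len(image), len(image[0])
    return [
        [image[(x - dx) % rows][(y - dy) % cols] for y in range(cols)]
        for x in range(rows)
    ]


# Hypothetical toy image: a small bright patch on a dark background.
image = [
    [0, 0, 0, 0],
    [0, 9, 5, 0],
    [0, 3, 1, 0],
    [0, 0, 0, 0],
]

amp, _ = amplitude_and_phase(image)
amp_shifted, _ = amplitude_and_phase(circular_shift(image, 1, 1))

# Largest element-wise difference between the two amplitude spectra.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(amp, amp_shifted)
    for a, b in zip(row_a, row_b)
)
print("DC amplitude (sum of pixels):", amp[0][0])
print("max amplitude difference after shift:", max_diff)
```

A shift changes only the phase of each Fourier coefficient, so the two amplitude spectra agree up to floating-point error. This is one reason an amplitude-based feature can be insensitive to where content sits in the image.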
About the journal:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.