Online Multi-modal Hashing with Dynamic Query-adaption

Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. Pub Date: 2019-07-18. DOI: 10.1145/3331184.3331217

X. Lu, Lei Zhu, Zhiyong Cheng, Liqiang Nie, Huaxiang Zhang


Citations: 106

Abstract

Multi-modal hashing is an effective technique for large-scale multimedia retrieval because it encodes heterogeneous multi-modal features into compact, similarity-preserving binary codes. Although great progress has been achieved, existing methods still suffer from several problems: 1) They adopt fixed modality combination weights in the online hashing process to generate the query hash codes, a strategy that cannot adapt to the variations between queries. 2) They either suffer from insufficient semantics (unsupervised methods) or incur high computation and storage costs (supervised methods, which rely on a pair-wise semantic matrix). 3) They solve for the hash codes with a relaxed optimization strategy or with bit-by-bit discrete optimization, which results in significant quantization loss or consumes considerable computation time, respectively. To address these limitations, we propose Online Multi-modal Hashing with Dynamic Query-adaption (OMH-DQ). Specifically, a self-weighted fusion strategy is designed to adaptively preserve the multi-modal feature information in the hash codes by exploiting the complementarity of the modalities. The hash codes are learned under the supervision of pair-wise semantic labels to enhance their discriminative capability, while avoiding the challenging symmetric similarity matrix factorization. Under this learning framework, the binary hash codes can be obtained directly with efficient operations and without quantization errors. Accordingly, our method benefits from the semantic labels while avoiding high computational complexity. Moreover, to accurately capture query variations at the online retrieval stage, we design a parameter-free online hashing module that adaptively learns the query hash codes according to the dynamic query contents. Extensive experiments demonstrate the state-of-the-art performance of the proposed approach from various aspects.
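The contrast between fixed and query-adaptive modality weights can be illustrated with a minimal sketch. It is not the OMH-DQ algorithm: the abstract does not state the update rules, so the inverse-residual reweighting below is an assumed stand-in borrowed from common self-weighted multi-view schemes. All names (`project`, `adaptive_query_code`, `hamming_rank`) and the per-modality projection matrices are hypothetical and chosen only for illustration.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def project(W: Matrix, x: Vector) -> List[float]:
    """Linear projection W @ x: one real value per hash bit."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sign_bits(v: Vector) -> List[int]:
    """Binarize to {-1, +1}; zero is mapped to +1 by convention."""
    return [1 if t >= 0 else -1 for t in v]


def adaptive_query_code(
    projections: Sequence[Matrix],
    features: Sequence[Vector],
    iters: int = 5,
) -> Tuple[List[int], List[float]]:
    """Alternate between the query code and per-query modality weights.

    ASSUMPTION: weights are set inversely proportional to each modality's
    residual against the current code. This is an illustrative rule, not
    the update derived in the paper. No weight hyperparameter is tuned.
    """
    projected = [project(W, x) for W, x in zip(projections, features)]
    n_bits = len(projected[0])
    weights = [1.0 / len(projected)] * len(projected)  # start from uniform
    code = [1] * n_bits
    for _ in range(iters):
        # Fuse the modalities with the current weights, then binarize.
        fused = [
            sum(mu * p[k] for mu, p in zip(weights, projected))
            for k in range(n_bits)
        ]
        code = sign_bits(fused)
        # Modalities that agree with the code (small residual) gain weight.
        residuals = [
            sqrt(sum((b - t) ** 2 for b, t in zip(code, p))) for p in projected
        ]
        inverse = [1.0 / max(r, 1e-12) for r in residuals]
        total = sum(inverse)
        weights = [v / total for v in inverse]
    return code, weights


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of differing bits between two codes of equal length."""
    return sum(1 for x, y in zip(a, b) if x != y)


def hamming_rank(query: Sequence[int], database: Sequence[Sequence[int]]) -> List[int]:
    """Database indices ordered by increasing Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
```

Calling `adaptive_query_code` with one projection matrix and one feature vector per modality returns the query code together with the weights it settled on, so a query whose text features are weak ends up leaning on its image features, and vice versa. A fixed-weight baseline corresponds to running zero reweighting steps and binarizing the uniform fusion once.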



