Deep multi-similarity hashing via label-guided network for cross-modal retrieval

IF 5.5 | CAS Tier 2, Computer Science | Q1, Computer Science, Artificial Intelligence | Neurocomputing | Pub Date: 2024-11-17 | DOI: 10.1016/j.neucom.2024.128830
Lei Wu, Qibing Qin, Jinkui Hou, Jiangyan Dai, Lei Huang, Wenfeng Zhang
{"title":"Deep multi-similarity hashing via label-guided network for cross-modal retrieval","authors":"Lei Wu ,&nbsp;Qibing Qin ,&nbsp;Jinkui Hou ,&nbsp;Jiangyan Dai ,&nbsp;Lei Huang ,&nbsp;Wenfeng Zhang","doi":"10.1016/j.neucom.2024.128830","DOIUrl":null,"url":null,"abstract":"<div><div>Due to low storage cost and efficient retrieval advantages, hashing technologies have gained broad attention in the field of cross-modal retrieval in recent years. However, most current cross-modal hashing usually employs random sampling or semi-hard negative mining to construct training batches for model optimization, which ignores the distribution relationships between raw samples, generating redundant and unbalanced pairs, and resulting in sub-optimal embedding spaces. In this work, we address this dilemma with a novel deep cross-modal hashing framework, called Deep Multi-similarity Hashing via Label-Guided Networks (DMsH-LN), to learn a high separability public embedding space and generate discriminative binary descriptors. Specifically, by utilizing pair mining and weighting to jointly calculate self-similarity and relative similarity between pairs, the multi-similarity loss is extended to cross-modal hashing to alleviate the negative impacts caused by redundant and imbalanced samples on hash learning, enhancing the distinguishing ability of the obtained discrete codes. Besides, to capture fine-grained semantic supervised signals, the Label-guided Network is proposed to learn class-specific semantic signals, which could effectively guide the parameter optimization of the Image Network and Text Network. Extensive experiments are conducted on four benchmark datasets, which demonstrate that the DMsH-LN framework achieves excellent retrieval performance. The source codes of DMsH-LN are downloaded from <span><span>https://github.com/QinLab-WFU/DMsH-LN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128830"},"PeriodicalIF":5.5000,"publicationDate":"2024-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231224016011","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Due to their low storage cost and efficient retrieval, hashing technologies have gained broad attention in the field of cross-modal retrieval in recent years. However, most current cross-modal hashing methods employ random sampling or semi-hard negative mining to construct training batches for model optimization, which ignores the distribution relationships among raw samples, generates redundant and unbalanced pairs, and results in sub-optimal embedding spaces. In this work, we address this dilemma with a novel deep cross-modal hashing framework, called Deep Multi-similarity Hashing via Label-Guided Networks (DMsH-LN), which learns a highly separable public embedding space and generates discriminative binary descriptors. Specifically, by utilizing pair mining and weighting to jointly calculate the self-similarity and relative similarity between pairs, the multi-similarity loss is extended to cross-modal hashing to alleviate the negative impact of redundant and imbalanced samples on hash learning, enhancing the distinguishing ability of the obtained discrete codes. In addition, to capture fine-grained semantic supervision signals, the Label-guided Network is proposed to learn class-specific semantic signals, which effectively guide the parameter optimization of the Image Network and Text Network. Extensive experiments on four benchmark datasets demonstrate that the DMsH-LN framework achieves excellent retrieval performance. The source code of DMsH-LN is available at https://github.com/QinLab-WFU/DMsH-LN.
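To make the pair mining and weighting described above concrete, the following is a minimal sketch, in PyTorch, of how a multi-similarity loss can be extended to cross-modal pairs. This is not the authors' released implementation: the cosine-similarity formulation over continuous codes, the shared-label rule for defining positives, and the hyper-parameters alpha, beta, lam, and eps are illustrative assumptions.

```python
# Minimal sketch of a cross-modal multi-similarity loss (illustrative, not the
# official DMsH-LN code). Positives are assumed to be image-text pairs that
# share at least one label; alpha, beta, lam, eps are assumed hyper-parameters.
import torch
import torch.nn.functional as F


def cross_modal_multi_similarity_loss(img_feats, txt_feats, labels,
                                      alpha=2.0, beta=50.0, lam=0.5, eps=0.1):
    """img_feats, txt_feats: (B, K) continuous hash features before binarization.
    labels: (B, C) multi-hot label matrix."""
    img_feats = F.normalize(img_feats, dim=1)
    txt_feats = F.normalize(txt_feats, dim=1)
    labels = labels.float()
    sim = img_feats @ txt_feats.t()                  # (B, B) image-to-text cosine similarities
    pos_mask = (labels @ labels.t() > 0).float()     # shared-label pairs are treated as positives
    neg_mask = 1.0 - pos_mask

    loss, valid = img_feats.new_zeros(()), 0
    for i in range(sim.size(0)):                     # image anchor i against all text candidates
        pos_sim = sim[i][pos_mask[i] > 0]
        neg_sim = sim[i][neg_mask[i] > 0]
        if pos_sim.numel() == 0 or neg_sim.numel() == 0:
            continue
        # Pair mining: keep only informative (hard) positives and negatives.
        hard_neg = neg_sim[neg_sim + eps > pos_sim.min()]
        hard_pos = pos_sim[pos_sim - eps < neg_sim.max()]
        if hard_neg.numel() == 0 or hard_pos.numel() == 0:
            continue
        # Pair weighting: soft log-sum-exp terms weight pairs by their hardness.
        pos_term = torch.log1p(torch.exp(-alpha * (hard_pos - lam)).sum()) / alpha
        neg_term = torch.log1p(torch.exp(beta * (hard_neg - lam)).sum()) / beta
        loss = loss + pos_term + neg_term
        valid += 1
    return loss / max(valid, 1)
```

In practice such a loss would presumably be applied symmetrically (text anchors mined against image candidates as well) and combined with the label-guided supervision described in the abstract, with the continuous codes binarized, for example by a sign function, at retrieval time; those details are likewise assumptions of this sketch.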
Source journal

Neurocomputing (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Annual publications: 1382
Review time: 70 days
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.
Latest articles in this journal

Monocular thermal SLAM with neural radiance fields for 3D scene reconstruction
Learning a more compact representation for low-rank tensor completion
An HVS-derived network for assessing the quality of camouflaged targets with feature fusion
Global Span Semantic Dependency Awareness and Filtering Network for nested named entity recognition
A user behavior-aware multi-task learning model for enhanced short video recommendation