Federated Learning for Supervised Cross-Modal Retrieval

Ang Li, Yawen Li, Yingxia Shao
{"title":"针对有监督的跨模态检索的联合学习","authors":"Ang Li, Yawen Li, Yingxia Shao","doi":"10.1007/s11280-024-01249-4","DOIUrl":null,"url":null,"abstract":"<p>In the last decade, the explosive surge in multi-modal data has propelled cross-modal retrieval into the forefront of information retrieval research. Exceptional cross-modal retrieval algorithms are crucial for meeting user requirements effectively and offering invaluable support for subsequent tasks, including cross-modal recommendations, multi-modal content generation, and so forth. Previous methods for cross-modal retrieval typically search for a single common subspace, neglecting the possibility of multiple common subspaces that may mutually reinforce each other in reality, thereby resulting in the poor performance of cross-modal retrieval. To address this issue, we propose a <b>Fed</b>erated <b>S</b>upervised <b>C</b>ross-<b>M</b>odal <b>R</b>etrieval approach (FedSCMR), which leverages competition to learn the optimal common subspace, and adaptively aggregates the common subspaces of multiple clients for dynamic global aggregation. To reduce the differences between modalities, FedSCMR minimizes the semantic discrimination and consistency in the common subspace, in addition to modeling semantic discrimination in the label space. Additionally, it minimizes modal discrimination and semantic invariance across common subspaces to strengthen cross-subspace constraints and promote learning of the optimal common subspace. In the aggregation stage for federated learning, we design an adaptive model aggregation scheme that can dynamically and collaboratively evaluate the model contribution based on data volume, data category, model loss, and mean average precision, to adaptively aggregate multi-party common subspaces. Experimental results on two publicly available datasets demonstrate that our proposed FedSCMR surpasses state-of-the-art cross-modal retrieval methods.</p>","PeriodicalId":501180,"journal":{"name":"World Wide Web","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Federated learning for supervised cross-modal retrieval\",\"authors\":\"Ang Li, Yawen Li, Yingxia Shao\",\"doi\":\"10.1007/s11280-024-01249-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>In the last decade, the explosive surge in multi-modal data has propelled cross-modal retrieval into the forefront of information retrieval research. Exceptional cross-modal retrieval algorithms are crucial for meeting user requirements effectively and offering invaluable support for subsequent tasks, including cross-modal recommendations, multi-modal content generation, and so forth. Previous methods for cross-modal retrieval typically search for a single common subspace, neglecting the possibility of multiple common subspaces that may mutually reinforce each other in reality, thereby resulting in the poor performance of cross-modal retrieval. To address this issue, we propose a <b>Fed</b>erated <b>S</b>upervised <b>C</b>ross-<b>M</b>odal <b>R</b>etrieval approach (FedSCMR), which leverages competition to learn the optimal common subspace, and adaptively aggregates the common subspaces of multiple clients for dynamic global aggregation. 
To reduce the differences between modalities, FedSCMR minimizes the semantic discrimination and consistency in the common subspace, in addition to modeling semantic discrimination in the label space. Additionally, it minimizes modal discrimination and semantic invariance across common subspaces to strengthen cross-subspace constraints and promote learning of the optimal common subspace. In the aggregation stage for federated learning, we design an adaptive model aggregation scheme that can dynamically and collaboratively evaluate the model contribution based on data volume, data category, model loss, and mean average precision, to adaptively aggregate multi-party common subspaces. Experimental results on two publicly available datasets demonstrate that our proposed FedSCMR surpasses state-of-the-art cross-modal retrieval methods.</p>\",\"PeriodicalId\":501180,\"journal\":{\"name\":\"World Wide Web\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"World Wide Web\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s11280-024-01249-4\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"World Wide Web","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s11280-024-01249-4","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In the last decade, the explosive surge in multi-modal data has propelled cross-modal retrieval into the forefront of information retrieval research. Exceptional cross-modal retrieval algorithms are crucial for meeting user requirements effectively and offering invaluable support for subsequent tasks, including cross-modal recommendations, multi-modal content generation, and so forth. Previous methods for cross-modal retrieval typically search for a single common subspace, neglecting the possibility of multiple common subspaces that may mutually reinforce each other in reality, thereby resulting in the poor performance of cross-modal retrieval. To address this issue, we propose a Federated Supervised Cross-Modal Retrieval approach (FedSCMR), which leverages competition to learn the optimal common subspace, and adaptively aggregates the common subspaces of multiple clients for dynamic global aggregation. To reduce the differences between modalities, FedSCMR minimizes the semantic discrimination and consistency in the common subspace, in addition to modeling semantic discrimination in the label space. Additionally, it minimizes modal discrimination and semantic invariance across common subspaces to strengthen cross-subspace constraints and promote learning of the optimal common subspace. In the aggregation stage for federated learning, we design an adaptive model aggregation scheme that can dynamically and collaboratively evaluate the model contribution based on data volume, data category, model loss, and mean average precision, to adaptively aggregate multi-party common subspaces. Experimental results on two publicly available datasets demonstrate that our proposed FedSCMR surpasses state-of-the-art cross-modal retrieval methods.
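The abstract describes the adaptive aggregation scheme only at a high level, so the following is a minimal Python sketch of one way such contribution-weighted aggregation could look: each client's weight is derived from its data volume, number of data categories, local model loss, and mean average precision, and the global common-subspace projection is a weighted average of the client projections. The multiplicative scoring rule, all function names, and the numbers are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def aggregation_weights(n_samples, n_classes, losses, maps, eps=1e-8):
    # Toy contribution score: clients with more data, more categories, and
    # higher retrieval mAP get larger weights; higher training loss shrinks
    # the weight. The multiplicative form is an assumption for illustration.
    n = np.asarray(n_samples, dtype=float)
    c = np.asarray(n_classes, dtype=float)
    l = np.asarray(losses, dtype=float)
    m = np.asarray(maps, dtype=float)
    score = (n / n.sum()) * (c / c.sum()) * m / (l + eps)
    return score / score.sum()  # normalize so the weights sum to 1

def aggregate_subspaces(client_projections, weights):
    # Global common-subspace projection as a weighted average of the
    # per-client projection matrices.
    return sum(w * p for w, p in zip(weights, client_projections))

# Hypothetical round with three clients, each holding a 512 -> 128
# projection into its local common subspace.
rng = np.random.default_rng(0)
client_projections = [rng.standard_normal((512, 128)) for _ in range(3)]
weights = aggregation_weights(
    n_samples=[1000, 400, 250],   # data volume per client
    n_classes=[10, 8, 5],         # data categories per client
    losses=[0.42, 0.61, 0.55],    # local model loss
    maps=[0.71, 0.63, 0.58],      # local mean average precision
)
global_projection = aggregate_subspaces(client_projections, weights)
print(weights.round(3), global_projection.shape)
```

In a real federated round these statistics would be reported to the server and the weights re-derived before each aggregation, so clients whose subspaces currently generalize better dominate the global update.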

Latest articles in this journal

- HetFS: a method for fast similarity search with ad-hoc meta-paths on heterogeneous information networks
- A SHAP-based controversy analysis through communities on Twitter
- pFind: Privacy-preserving lost object finding in vehicular crowdsensing
- Use of prompt-based learning for code-mixed and code-switched text classification
- Drug traceability system based on semantic blockchain and on a reputation method