Dynamic collaborative learning with heterogeneous knowledge transfer for long-tailed visual recognition

IF 14.7 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Information Fusion Pub Date : 2024-10-15 DOI:10.1016/j.inffus.2024.102734
Hao Zhou , Tingjin Luo , Yongming He
{"title":"Dynamic collaborative learning with heterogeneous knowledge transfer for long-tailed visual recognition","authors":"Hao Zhou ,&nbsp;Tingjin Luo ,&nbsp;Yongming He","doi":"10.1016/j.inffus.2024.102734","DOIUrl":null,"url":null,"abstract":"<div><div>Solving the long-tailed visual recognition with deep convolutional neural networks is still a challenging task. As a mainstream method, multi-experts models achieve SOTA accuracy for tackling this problem, but the uncertainty in network learning and the complexity in fusion inference constrain the performance and practicality of the multi-experts models. To remedy this, we propose a novel dynamic collaborative learning with heterogeneous knowledge transfer model (DCHKT) in this paper, in which experts with different expertise collaborate to make predictions. DCHKT consists of two core components: dynamic adaptive weight adjustment and heterogeneous knowledge transfer learning. First, the dynamic adaptive weight adjustment is designed to shift the focus of model training between the global expert and domain experts via dynamic adaptive weight. By modulating the trade-off between the learning of features and classifier, the dynamic adaptive weight adjustment can enhance the discriminative ability of each expert and alleviate the uncertainty of model learning. Then, heterogeneous knowledge transfer learning, which measures the distribution differences between the fusion logits of multiple experts and the predicted logits of each expert with different specialties, can achieve message passing between experts and enhance the consistency of ensemble prediction in model training and inference to promote their collaborations. Finally, extensive experimental results on public long-tailed datasets: CIFAR-LT, ImageNet-LT, Place-LT and iNaturalist2018, demonstrate the effectiveness and superiority of our DCHKT.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"115 ","pages":"Article 102734"},"PeriodicalIF":14.7000,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253524005128","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Solving the long-tailed visual recognition with deep convolutional neural networks is still a challenging task. As a mainstream method, multi-experts models achieve SOTA accuracy for tackling this problem, but the uncertainty in network learning and the complexity in fusion inference constrain the performance and practicality of the multi-experts models. To remedy this, we propose a novel dynamic collaborative learning with heterogeneous knowledge transfer model (DCHKT) in this paper, in which experts with different expertise collaborate to make predictions. DCHKT consists of two core components: dynamic adaptive weight adjustment and heterogeneous knowledge transfer learning. First, the dynamic adaptive weight adjustment is designed to shift the focus of model training between the global expert and domain experts via dynamic adaptive weight. By modulating the trade-off between the learning of features and classifier, the dynamic adaptive weight adjustment can enhance the discriminative ability of each expert and alleviate the uncertainty of model learning. Then, heterogeneous knowledge transfer learning, which measures the distribution differences between the fusion logits of multiple experts and the predicted logits of each expert with different specialties, can achieve message passing between experts and enhance the consistency of ensemble prediction in model training and inference to promote their collaborations. Finally, extensive experimental results on public long-tailed datasets: CIFAR-LT, ImageNet-LT, Place-LT and iNaturalist2018, demonstrate the effectiveness and superiority of our DCHKT.

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
针对长尾视觉识别的异构知识转移动态协作学习
利用深度卷积神经网络解决长尾视觉识别问题仍然是一项具有挑战性的任务。作为一种主流方法,多专家模型在解决这一问题时可以达到 SOTA 的精度,但网络学习的不确定性和融合推理的复杂性限制了多专家模型的性能和实用性。为了解决这一问题,我们在本文中提出了一种新颖的异构知识转移动态协作学习模型(DCHKT),在该模型中,具有不同专业知识的专家共同协作进行预测。DCHKT 由两个核心部分组成:动态自适应权重调整和异构知识转移学习。首先,动态自适应权重调整旨在通过动态自适应权重在全局专家和领域专家之间转移模型训练的重点。通过调节特征学习和分类器学习之间的权衡,动态自适应权重调整可以增强每位专家的判别能力,缓解模型学习的不确定性。然后,异质知识转移学习通过测量多位专家的融合对数与每位专家不同专业预测对数之间的分布差异,实现专家间的信息传递,增强模型训练和推理中集合预测的一致性,促进专家间的合作。最后,在公共长尾数据集上取得了大量实验结果:最后,在 CIFAR-LT、ImageNet-LT、Place-LT 和 iNaturalist2018 等公共长尾数据集上的大量实验结果证明了我们的 DCHKT 的有效性和优越性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Information Fusion
Information Fusion 工程技术-计算机:理论方法
CiteScore
33.20
自引率
4.30%
发文量
161
审稿时长
7.9 months
期刊介绍: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.
期刊最新文献
Pretraining graph transformer for molecular representation with fusion of multimodal information Pan-Mamba: Effective pan-sharpening with state space model An autoencoder-based confederated clustering leveraging a robust model fusion strategy for federated unsupervised learning FairDPFL-SCS: Fair Dynamic Personalized Federated Learning with strategic client selection for improved accuracy and fairness M-IPISincNet: An explainable multi-source physics-informed neural network based on improved SincNet for rolling bearings fault diagnosis
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1