Title: Dynamic collaborative learning with heterogeneous knowledge transfer for long-tailed visual recognition
Authors: Hao Zhou, Tingjin Luo, Yongming He
DOI: 10.1016/j.inffus.2024.102734
Journal: Information Fusion, Volume 115, Article 102734 (Q1, Computer Science, Artificial Intelligence; Impact Factor 14.7)
Published: 2024-10-15 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S1566253524005128
Citations: 0
Abstract
Solving long-tailed visual recognition with deep convolutional neural networks remains a challenging task. As a mainstream approach, multi-expert models achieve state-of-the-art accuracy on this problem, but the uncertainty in network learning and the complexity of fusion inference constrain their performance and practicality. To remedy this, we propose a novel dynamic collaborative learning model with heterogeneous knowledge transfer (DCHKT), in which experts with different expertise collaborate to make predictions. DCHKT consists of two core components: dynamic adaptive weight adjustment and heterogeneous knowledge transfer learning. First, dynamic adaptive weight adjustment shifts the focus of training between the global expert and the domain experts via a dynamically adapted weight. By modulating the trade-off between feature learning and classifier learning, it enhances the discriminative ability of each expert and alleviates the uncertainty of model learning. Second, heterogeneous knowledge transfer learning measures the distribution differences between the fused logits of the multiple experts and the predicted logits of each individual expert; this enables message passing between experts and enhances the consistency of the ensemble prediction during training and inference, promoting their collaboration. Finally, extensive experiments on the public long-tailed benchmarks CIFAR-LT, ImageNet-LT, Places-LT, and iNaturalist 2018 demonstrate the effectiveness and superiority of DCHKT.
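The two components described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's exact formulation: the averaging fusion rule, the KL divergence as the distribution-difference measure, and the quadratic weight schedule are all assumptions made here for illustration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) along the last axis, with eps for numerical safety."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def dynamic_weight(epoch, total_epochs):
    """Hypothetical schedule: emphasis shifts from global (feature) learning
    toward domain-expert (classifier) learning as training progresses.
    The quadratic shape is an assumption, not taken from the paper."""
    return (epoch / total_epochs) ** 2

def transfer_loss(expert_logits):
    """Mean KL divergence between the fused prediction (here simply the
    average of expert logits, an assumed fusion rule) and each expert's
    own prediction, encouraging consistency across experts.
    expert_logits: array of shape (num_experts, batch, num_classes)."""
    fused = softmax(expert_logits.mean(axis=0))
    per_expert = [kl_divergence(fused, softmax(l)).mean()
                  for l in expert_logits]
    return float(np.mean(per_expert))
```

As a sanity check, the transfer loss vanishes when all experts agree and is strictly positive when their logits differ, which is the behavior a consistency-promoting term needs.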
Journal introduction:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.