DOI: 10.1016/j.neucom.2024.128530
Journal: Neurocomputing (Q1, Computer Science, Artificial Intelligence; Impact Factor 5.5)
Publication date: 2024-09-03
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S0925231224013018
Feature balanced re-enhanced network with multi-factor margin loss for long-tailed visual recognition
Real-world data often exhibit a long-tailed distribution, where the number of training samples for head classes far exceeds that for tail classes. This class imbalance poses significant challenges for training deep neural networks. Existing class-aware loss methods typically focus only on the numerical relationships among class sample counts, blindly favoring the optimization of tail classes while neglecting sample difficulty and the similarity between the current class and other classes. Relying solely on sample counts can easily lead to over-fitting of tail classes and fails to fully exploit the potential information in the data. We therefore propose the Multi-Factor Margin Loss (MFMLoss), which consists of a positive margin loss and a negative margin loss. MFMLoss considers three factors, operating at the overall, class, and sample levels respectively: (1) quantitative (sample-count) relationships, (2) inter-class similarity, and (3) sample recognition difficulty. Jointly considering these three factors lets the model attend to confusable classes and difficult samples during training, rather than to tail classes alone, achieving optimization from coarse-grained to fine-grained. To further mitigate the negative impact of the head-tail imbalance on feature learning, we design a new network architecture, F-BREN, consisting of two components: a feature balancing network and a feature re-enhancement network. The former is trained with the negative margin loss, which reduces the separability of easy samples; the latter is trained with the positive margin loss, which directs more attention to hard samples, thereby balancing the model's attention across all samples.
We conducted extensive experiments on four long-tailed benchmark datasets (CIFAR10-LT, CIFAR100-LT, ImageNet-LT, and iNaturalist 2018), comparing the recognition accuracy of our method against eight state-of-the-art methods. The experimental results demonstrate that our method outperforms all eight compared methods.
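The abstract does not give the exact margin formula, but the construction it describes — a per-class margin that grows with class rarity, inter-class similarity, and sample difficulty, applied to the target logit before the softmax — can be sketched as follows. The factor weightings, exponents, and function names below are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def multi_factor_margin(n_samples, similarity, difficulty, base=0.5):
    """Hypothetical per-class margin combining the three factors:
    (1) class sample count, (2) inter-class similarity, (3) sample
    recognition difficulty. The product form and the 0.25 exponent
    are illustrative choices, not taken from the paper."""
    n = np.asarray(n_samples, dtype=float)
    freq_term = (n.max() / n) ** 0.25            # rarer classes get larger margins
    return base * freq_term * (1.0 + similarity) * (1.0 + difficulty)

def margin_cross_entropy(logits, label, margins):
    """Subtract a positive margin from the target logit before the
    softmax, so tail/hard cases must be classified with extra slack."""
    z = np.asarray(logits, dtype=float).copy()
    z[label] -= margins[label]
    z -= z.max()                                  # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])
```

Under this sketch, the feature re-enhancement network's positive margin corresponds to subtracting from the target logit (raising the loss on hard samples), while the feature balancing network's negative margin would add to it instead, reducing the separability of easy samples.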
Journal overview:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.