Chenxun Deng , Dafang Li , Lin Ji , Chengyang Zhang , Baican Li , Hongying Yan , Jiyuan Zheng , Lifeng Wang , Junguo Zhang
{"title":"ChatDiff:基于 ChatGPT 的长尾分类扩散模型。","authors":"Chenxun Deng , Dafang Li , Lin Ji , Chengyang Zhang , Baican Li , Hongying Yan , Jiyuan Zheng , Lifeng Wang , Junguo Zhang","doi":"10.1016/j.neunet.2024.106794","DOIUrl":null,"url":null,"abstract":"<div><div>Long-tailed data distributions have been a major challenge for the practical application of deep learning. Information augmentation intends to expand the long-tailed data into uniform distribution, which provides a feasible way to mitigate the data starvation of underrepresented classes. However, most existing augmentation methods face two significant challenges: (1) limited diversity in generated samples, and (2) the adverse effect of generated negative samples on downstream classification performance. In this paper, we propose a novel information augmentation method, named ChatDiff, to provide diverse positive samples for underrepresented classes, and eliminate generated negative samples. Specifically, we start with a prompt template to extract textual prior knowledge from the ChatGPT-3.5 model, enhancing the feature space for underrepresented classes. Then using this prior knowledge, a conditional diffusion model generates semantic-rich image samples for tail classes. Moreover, the proposed ChatDiff leverages a CLIP-based discriminator to screen and remove generated negative samples. This process avoids neural network learning the invalid or erroneous features, and further, improves long-tailed classification performance. Comprehensive experiments conducted on long-tailed benchmarks such as CIFAR10-LT, CIFAR100-LT, ImageNet-LT, and iNaturalist 2018, validate the effectiveness of our ChatDiff method.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"181 ","pages":"Article 106794"},"PeriodicalIF":6.0000,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ChatDiff: A ChatGPT-based diffusion model for long-tailed classification\",\"authors\":\"Chenxun Deng , Dafang Li , Lin Ji , Chengyang Zhang , Baican Li , Hongying Yan , Jiyuan Zheng , Lifeng Wang , Junguo Zhang\",\"doi\":\"10.1016/j.neunet.2024.106794\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Long-tailed data distributions have been a major challenge for the practical application of deep learning. Information augmentation intends to expand the long-tailed data into uniform distribution, which provides a feasible way to mitigate the data starvation of underrepresented classes. However, most existing augmentation methods face two significant challenges: (1) limited diversity in generated samples, and (2) the adverse effect of generated negative samples on downstream classification performance. In this paper, we propose a novel information augmentation method, named ChatDiff, to provide diverse positive samples for underrepresented classes, and eliminate generated negative samples. Specifically, we start with a prompt template to extract textual prior knowledge from the ChatGPT-3.5 model, enhancing the feature space for underrepresented classes. Then using this prior knowledge, a conditional diffusion model generates semantic-rich image samples for tail classes. Moreover, the proposed ChatDiff leverages a CLIP-based discriminator to screen and remove generated negative samples. This process avoids neural network learning the invalid or erroneous features, and further, improves long-tailed classification performance. Comprehensive experiments conducted on long-tailed benchmarks such as CIFAR10-LT, CIFAR100-LT, ImageNet-LT, and iNaturalist 2018, validate the effectiveness of our ChatDiff method.</div></div>\",\"PeriodicalId\":49763,\"journal\":{\"name\":\"Neural Networks\",\"volume\":\"181 \",\"pages\":\"Article 106794\"},\"PeriodicalIF\":6.0000,\"publicationDate\":\"2024-10-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0893608024007184\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608024007184","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
ChatDiff: A ChatGPT-based diffusion model for long-tailed classification
Long-tailed data distributions have been a major challenge for the practical application of deep learning. Information augmentation intends to expand the long-tailed data into uniform distribution, which provides a feasible way to mitigate the data starvation of underrepresented classes. However, most existing augmentation methods face two significant challenges: (1) limited diversity in generated samples, and (2) the adverse effect of generated negative samples on downstream classification performance. In this paper, we propose a novel information augmentation method, named ChatDiff, to provide diverse positive samples for underrepresented classes, and eliminate generated negative samples. Specifically, we start with a prompt template to extract textual prior knowledge from the ChatGPT-3.5 model, enhancing the feature space for underrepresented classes. Then using this prior knowledge, a conditional diffusion model generates semantic-rich image samples for tail classes. Moreover, the proposed ChatDiff leverages a CLIP-based discriminator to screen and remove generated negative samples. This process avoids neural network learning the invalid or erroneous features, and further, improves long-tailed classification performance. Comprehensive experiments conducted on long-tailed benchmarks such as CIFAR10-LT, CIFAR100-LT, ImageNet-LT, and iNaturalist 2018, validate the effectiveness of our ChatDiff method.
期刊介绍:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.