{"title":"An Improved D2GAN‐based oversampling algorithm for imbalanced data classification","authors":"Xiaoqiang Zhao, Qi Yao","doi":"10.1002/sam.11640","ListUrlMain":"https://doi.org/10.1002/sam.11640","journal":{"name":"Statistical Analysis and Data Mining: The ASA Data Science Journal","volume":"17 1"},"publicationDate":"2023-08-25","publicationTypes":"Journal Article","platform":"Semanticscholar"}
An Improved D2GAN‐based oversampling algorithm for imbalanced data classification
To address the problems of mode collapse, uncontrollable data generation, and high overlap rate that arise when a generative adversarial network (GAN) oversamples imbalanced data, we propose an oversampling algorithm for imbalanced data based on improved dual discriminator generative adversarial nets (D2GAN). First, we integrate the positive class attribute information into the generator and the discriminator so that the generator produces only positive class samples, which overcomes the problem of uncontrollable data generation. Second, we introduce a classifier into D2GAN to discriminate between the generated samples and the original data. This avoids overlap between the generated samples and the negative class samples and ensures the diversity of the generated samples, thereby solving the problem of mode collapse. Finally, we evaluate the proposed algorithm in oversampling experiments on 9 datasets, using SVM and neural network classifiers. The results show that the proposed algorithm effectively improves the classification performance on imbalanced data.
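The abstract does not say how the positive class attribute is fed to the generator, so the following is only a minimal sketch under stated assumptions: the class attribute is one-hot encoded and concatenated with the noise vector (a common conditional-GAN convention, not confirmed by the paper), and enough samples are requested to bring the positive class up to the size of the negative class. All function names and the `noise_dim` parameter are hypothetical.

```python
import random


def one_hot(label, num_classes):
    """Return a one-hot list encoding `label` among `num_classes` classes."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def conditional_input(noise, label, num_classes=2):
    """Concatenate a noise vector with the class attribute.

    Assumption: concatenation is how the class information is injected;
    the abstract only says the attribute is 'integrated'.
    """
    return list(noise) + one_hot(label, num_classes)


def num_to_generate(labels, positive=1):
    """Number of positive samples needed to match the negative class size.

    Assumption: the target is a fully balanced dataset.
    """
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    return max(n_neg - n_pos, 0)


def make_generator_inputs(labels, noise_dim=4, positive=1, seed=0):
    """Build one conditioned input vector per sample still to be generated."""
    rng = random.Random(seed)
    count = num_to_generate(labels, positive)
    return [
        conditional_input(
            [rng.gauss(0.0, 1.0) for _ in range(noise_dim)], positive
        )
        for _ in range(count)
    ]


# Toy imbalanced label set: 8 negative samples, 2 positive samples.
labels = [0] * 8 + [1] * 2
inputs = make_generator_inputs(labels)
print(len(inputs), len(inputs[0]))
```

Every vector built this way carries the positive class code in its last two entries, so a generator trained on such inputs is only ever asked for positive class samples. The generator, the two discriminators, and the added classifier are deliberately left out, since their architectures and losses are not given in the abstract.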