One-vs-all for class imbalance learning

Bilal Mirza, Zhiping Lin
{"title":"一对一的班级不平衡学习","authors":"Bilal Mirza, Zhiping Lin","doi":"10.1109/ICICS.2013.6782785","DOIUrl":null,"url":null,"abstract":"The performance of support vector machines (SVMs) can deteriorate when the number of samples in one class is much greater than that in the other. Existing methods tackle this problem by modifying the learning algorithms or resampling the datasets. In this paper, we propose a new method called one-vs-all for class imbalance learning (OVACIL) which neither modifies the SVM learning algorithms nor resamples the datasets. In the OVACIL method, we re-group a given imbalanced dataset into a number of new datasets comprising of all the original samples and train standard SVM classifiers using each of the datasets. The output scores of these classifiers on a testing sample are then compared and a final decision is made without a fixed decision threshold. This comparison is not biased toward any particular class, resulting in high accuracies of both classes. The Gmean and Fmeasure values obtained by OVACIL on 18 real-world imbalanced datasets surpass the previous best values reported by other state-of-the-art CIL methods on most of these datasets.","PeriodicalId":184544,"journal":{"name":"2013 9th International Conference on Information, Communications & Signal Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"One-vs-all for class imbalance learning\",\"authors\":\"Bilal Mirza, Zhiping Lin\",\"doi\":\"10.1109/ICICS.2013.6782785\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The performance of support vector machines (SVMs) can deteriorate when the number of samples in one class is much greater than that in the other. Existing methods tackle this problem by modifying the learning algorithms or resampling the datasets. In this paper, we propose a new method called one-vs-all for class imbalance learning (OVACIL) which neither modifies the SVM learning algorithms nor resamples the datasets. In the OVACIL method, we re-group a given imbalanced dataset into a number of new datasets comprising of all the original samples and train standard SVM classifiers using each of the datasets. The output scores of these classifiers on a testing sample are then compared and a final decision is made without a fixed decision threshold. This comparison is not biased toward any particular class, resulting in high accuracies of both classes. 
The Gmean and Fmeasure values obtained by OVACIL on 18 real-world imbalanced datasets surpass the previous best values reported by other state-of-the-art CIL methods on most of these datasets.\",\"PeriodicalId\":184544,\"journal\":{\"name\":\"2013 9th International Conference on Information, Communications & Signal Processing\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 9th International Conference on Information, Communications & Signal Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICICS.2013.6782785\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 9th International Conference on Information, Communications & Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICICS.2013.6782785","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6

Abstract

The performance of support vector machines (SVMs) can deteriorate when the number of samples in one class is much greater than that in the other. Existing methods tackle this problem by modifying the learning algorithms or resampling the datasets. In this paper, we propose a new method called one-vs-all for class imbalance learning (OVACIL), which neither modifies the SVM learning algorithms nor resamples the datasets. In the OVACIL method, we re-group a given imbalanced dataset into a number of new datasets comprising all the original samples and train a standard SVM classifier on each of them. The output scores of these classifiers on a test sample are then compared and a final decision is made without a fixed decision threshold. This comparison is not biased toward any particular class, resulting in high accuracy on both classes. The G-mean and F-measure values obtained by OVACIL on 18 real-world imbalanced datasets surpass the previous best values reported by other state-of-the-art CIL methods on most of these datasets.
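The abstract describes the OVACIL procedure only at a high level, so the following Python sketch is an illustrative interpretation rather than the authors' implementation. It assumes the majority class is partitioned into roughly minority-sized subsets, that each subset (and the minority class itself) gets a standard one-vs-rest SVM trained on all samples, and that a test sample is assigned to the minority class whenever its minority-vs-rest score exceeds every majority-group score. The function names train_ovacil_like and predict_ovacil_like, the partitioning rule, and the exact score comparison are all assumptions made for illustration.

```python
# Hypothetical sketch of a one-vs-all style class-imbalance scheme in the
# spirit of OVACIL.  The regrouping and decision rule below are assumptions;
# the original paper should be consulted for the authors' exact method.
import numpy as np
from sklearn.svm import SVC


def train_ovacil_like(X, y, minority_label, random_state=0):
    """Train one standard SVM per group in a one-vs-rest fashion.

    Assumed regrouping: the majority class is split into chunks of roughly
    the minority-class size; each chunk, and the minority class itself,
    defines one one-vs-rest SVM trained on *all* original samples.
    """
    rng = np.random.RandomState(random_state)
    minority_idx = np.where(y == minority_label)[0]
    majority_idx = np.where(y != minority_label)[0]
    rng.shuffle(majority_idx)

    n_chunks = max(1, len(majority_idx) // max(1, len(minority_idx)))
    groups = [minority_idx] + list(np.array_split(majority_idx, n_chunks))

    classifiers = []
    for group in groups:
        target = np.zeros(len(y))          # one-vs-rest labels for this group
        target[group] = 1
        clf = SVC(kernel="rbf", gamma="scale")
        clf.fit(X, target)                 # every classifier sees all samples
        classifiers.append(clf)
    return classifiers


def predict_ovacil_like(classifiers, X, minority_label, majority_label):
    """Compare raw SVM scores; no fixed decision threshold is applied."""
    scores = np.column_stack([clf.decision_function(X) for clf in classifiers])
    # classifiers[0] is the minority-vs-rest SVM (by construction above);
    # predict minority whenever its score beats every majority-group score.
    is_minority = scores[:, 0] >= scores[:, 1:].max(axis=1)
    return np.where(is_minority, minority_label, majority_label)
```

Because the final label comes from comparing raw decision scores across classifiers rather than thresholding a single score, no fixed decision threshold is involved, which matches the comparison-based decision rule described in the abstract.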