Active learning for imbalance problem using L-GEM of RBFNN

Junjie Hu
2012 International Conference on Machine Learning and Cybernetics, published 2012-07-15. DOI: 10.1109/ICMLC.2012.6358972 (https://doi.org/10.1109/ICMLC.2012.6358972)
Citations: 7

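Abstract

In many important applications, such as malignant cell detection, network intrusion detection, and error signal detection in power systems, the distributions of the positive and negative classes are usually imbalanced. Many classifiers perform poorly on imbalanced data. The major problem is that classifiers tend to ignore the samples and the accuracy of the minority class, disregarding the higher cost of misclassifying that class. Pattern classification for imbalanced data has therefore become an active challenge for both academia and industry. In this paper, we propose an active learning method for imbalanced data that uses the stochastic sensitivity measure (ST-SM) of a Radial Basis Function Neural Network (RBFNN). A large ST-SM indicates that the RBFNN is uncertain and yields a large output fluctuation around a particular sample. In each round, the samples yielding large ST-SM values are selected and added to the training set. Empirically, samples with large output perturbation (i.e., large ST-SM) should lie near the classification boundary and are of great significance for training the classifier. With respect to the imbalance of the data set, the ST-SM should reduce the number of redundant majority-class samples selected, rebalance the sample distribution of the training set, and ultimately improve the performance of the classifier.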
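The selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the sensitivity is estimated by Monte Carlo sampling of uniform perturbations rather than by any analytical ST-SM formula, every labeled sample is used as an RBF center, the Gaussian width is fixed, and all function and parameter names (`fit_rbfnn`, `st_sm`, `active_learning`, `q`, `batch`) are hypothetical.

```python
import numpy as np

def rbf_features(X, centers, width):
    # Gaussian activations, shape (n_samples, n_centers).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbfnn(X, y, centers, width):
    # Assumption: output weights come from plain linear least squares,
    # with labels encoded as -1 (majority) and +1 (minority).
    Phi = rbf_features(X, centers, width)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def predict(X, centers, width, w):
    return rbf_features(X, centers, width) @ w

def st_sm(X, centers, width, w, q=0.1, n_draws=50, rng=None):
    # Monte Carlo estimate of E[(f(x + dx) - f(x))^2] with dx ~ U[-q, q]^d.
    # Assumption: this stands in for the sensitivity measure in the abstract.
    rng = np.random.default_rng() if rng is None else rng
    base = predict(X, centers, width, w)
    total = np.zeros(len(X))
    for _ in range(n_draws):
        dx = rng.uniform(-q, q, size=X.shape)
        total += (predict(X + dx, centers, width, w) - base) ** 2
    return total / n_draws

def active_learning(X_pool, y_pool, init_idx, rounds=5, batch=5,
                    width=1.0, q=0.1, seed=0):
    # y_pool plays the role of the labeling oracle: a label is only read
    # once its sample has been selected.
    rng = np.random.default_rng(seed)
    labeled = list(init_idx)
    for _ in range(rounds):
        centers = X_pool[labeled]
        w = fit_rbfnn(X_pool[labeled], y_pool[labeled], centers, width)
        unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled)
        if len(unlabeled) == 0:
            break
        scores = st_sm(X_pool[unlabeled], centers, width, w, q, rng=rng)
        # Add the samples with the largest estimated sensitivity.
        picked = unlabeled[np.argsort(scores)[::-1][:batch]]
        labeled.extend(picked.tolist())
    # Refit on the final training set.
    centers = X_pool[labeled]
    w = fit_rbfnn(X_pool[labeled], y_pool[labeled], centers, width)
    return labeled, centers, w
```

A call such as `active_learning(X, y, init_idx=[0, 1, 90], rounds=3, batch=4)` returns the indices of the selected training samples together with the final centers and output weights. Whether the selected set ends up better balanced than the pool depends on the data and on the choices of `width` and `q`, which this sketch does not tune.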