2-Stage instance selection algorithm for KNN based on Nearest Unlike Neighbors

Chunru Dong, P. Chan, Wing W. Y. Ng, D. Yeung
2010 International Conference on Machine Learning and Cybernetics, published 2010-07-11. DOI: 10.1109/ICMLC.2010.5581078 (https://doi.org/10.1109/ICMLC.2010.5581078)
Citations: 2

Abstract

Because of its simplicity, high generalization capability, and low training cost, the K-Nearest-Neighbor (KNN) classifier is widely used in pattern recognition and machine learning. However, the computational complexity of the KNN classifier grows with the size of the training set, so its efficiency drops sharply on large data sets. This paper proposes a general two-stage training set condensing algorithm for the general KNN classifier. First, noisy data points are identified and removed from the original training set. Second, a general condensed nearest neighbor rule based on the Nearest Unlike Neighbor (NUN) is presented to further eliminate redundant samples from the training set. To verify the performance of the proposed method, numerical experiments are conducted on several UCI benchmark databases.
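
The abstract names the two stages but gives no procedural detail, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes stage 1 is a Wilson-style edit (drop a point whose label disagrees with the majority of its k nearest neighbors) and stage 2 keeps only those points that are the Nearest Unlike Neighbor of at least one sample. All function names (`remove_noise`, `nun_indices`, `condense_by_nun`, `two_stage_select`) are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(X):
    # Squared Euclidean distances between all rows of X (O(n^2) memory).
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.einsum("ijk,ijk->ij", diff, diff)

def remove_noise(X, y, k=3):
    """Stage 1 (ASSUMED rule, Wilson-style editing): drop points whose label
    differs from the majority label of their k nearest neighbors."""
    d = pairwise_sq_dists(X)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    keep = np.zeros(len(X), dtype=bool)
    for i in range(len(X)):
        nn = np.argsort(d[i])[:k]
        labels, counts = np.unique(y[nn], return_counts=True)
        keep[i] = labels[np.argmax(counts)] == y[i]
    return X[keep], y[keep]

def nun_indices(X, y):
    """For each sample, the index of its Nearest Unlike Neighbor:
    the closest sample carrying a different class label."""
    d = pairwise_sq_dists(X)
    d[y[:, None] == y[None, :]] = np.inf  # mask same-class pairs, including self
    return np.argmin(d, axis=1)

def condense_by_nun(X, y):
    """Stage 2 (ASSUMED rule): keep only samples that are the NUN of
    at least one other sample, i.e. points near the class boundary."""
    keep = np.unique(nun_indices(X, y))
    return X[keep], y[keep]

def two_stage_select(X, y, k=3):
    # Run noise removal first, then NUN-based condensing on the cleaned set.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_clean, y_clean = remove_noise(X, y, k)
    return condense_by_nun(X_clean, y_clean)

# Toy data: two 1-D clusters plus one mislabeled point at 1.5.
X_toy = np.array([[0], [1], [2], [3], [7], [8], [9], [10], [1.5]])
y_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])
X_sel, y_sel = two_stage_select(X_toy, y_toy, k=3)
```

On the toy data, the mislabeled point at 1.5 is dropped in stage 1, and stage 2 keeps only the two boundary points (3 and 7). Whether such an aggressive reduction preserves KNN accuracy on real data depends on the data set and on the actual rule used in the paper, which this sketch does not reproduce.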