KB-CB-N分类:面向监督学习的无监督方法

2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-11 DOI:10.1109/CIDM.2011.5949435

Z. Abdallah, M. Gaber

{"title":"KB-CB-N分类:面向监督学习的无监督方法","authors":"Z. Abdallah, M. Gaber","doi":"10.1109/CIDM.2011.5949435","DOIUrl":null,"url":null,"abstract":"Data classification has attracted considerable research attention in the field of computational statistics and data mining due to its wide range of applications. K Best Cluster Based Neighbour (KB-CB-N) is our novel classification technique based on the integration of three different similarity measures for cluster based classification. The basic principle is to apply unsupervised learning on the instances of each class in the dataset and then use the output as an input for the classification algorithm to find the K best neighbours of clusters from the density, gravity and distance perspectives. Clustering is applied as an initial step within each class to find the inherent in-class grouping in the dataset. Different data clustering techniques use different similarity measures. Each measure has its own strength and weakness. Thus, combining the three measures can benefit from the strength of each one and eliminate encountered problems of using an individual measure. Extensive experimental results using eight real datasets have evidenced that our new technique typically shows improved or equivalent performance over other existing state-of-the-art classification methods.","PeriodicalId":211565,"journal":{"name":"2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"KB-CB-N classification: Towards unsupervised approach for supervised learning\",\"authors\":\"Z. Abdallah, M. Gaber\",\"doi\":\"10.1109/CIDM.2011.5949435\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data classification has attracted considerable research attention in the field of computational statistics and data mining due to its wide range of applications. K Best Cluster Based Neighbour (KB-CB-N) is our novel classification technique based on the integration of three different similarity measures for cluster based classification. The basic principle is to apply unsupervised learning on the instances of each class in the dataset and then use the output as an input for the classification algorithm to find the K best neighbours of clusters from the density, gravity and distance perspectives. Clustering is applied as an initial step within each class to find the inherent in-class grouping in the dataset. Different data clustering techniques use different similarity measures. Each measure has its own strength and weakness. Thus, combining the three measures can benefit from the strength of each one and eliminate encountered problems of using an individual measure. Extensive experimental results using eight real datasets have evidenced that our new technique typically shows improved or equivalent performance over other existing state-of-the-art classification methods.\",\"PeriodicalId\":211565,\"journal\":{\"name\":\"2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-04-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CIDM.2011.5949435\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIDM.2011.5949435","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 12

摘要

数据分类由于其广泛的应用，在计算统计和数据挖掘领域引起了相当大的研究关注。基于最佳聚类邻居(KB-CB-N)是一种基于三种不同相似性度量的聚类分类新技术。基本原理是对数据集中每个类的实例应用无监督学习，然后将输出作为分类算法的输入，从密度、重力和距离的角度找到K个簇的最佳邻居。聚类是在每个类中应用的初始步骤，以找到数据集中固有的类内分组。不同的数据聚类技术使用不同的相似性度量。每种措施都有其优缺点。因此，将三个度量结合起来可以从每个度量的优势中获益，并消除使用单个度量所遇到的问题。使用8个真实数据集的广泛实验结果证明，我们的新技术通常比其他现有的最先进的分类方法表现出改进或同等的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

KB-CB-N classification: Towards unsupervised approach for supervised learning

Data classification has attracted considerable research attention in the field of computational statistics and data mining due to its wide range of applications. K Best Cluster Based Neighbour (KB-CB-N) is our novel classification technique based on the integration of three different similarity measures for cluster based classification. The basic principle is to apply unsupervised learning on the instances of each class in the dataset and then use the output as an input for the classification algorithm to find the K best neighbours of clusters from the density, gravity and distance perspectives. Clustering is applied as an initial step within each class to find the inherent in-class grouping in the dataset. Different data clustering techniques use different similarity measures. Each measure has its own strength and weakness. Thus, combining the three measures can benefit from the strength of each one and eliminate encountered problems of using an individual measure. Extensive experimental results using eight real datasets have evidenced that our new technique typically shows improved or equivalent performance over other existing state-of-the-art classification methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)

自引率

0.00%

发文量

期刊最新文献

A multi-Biclustering Combinatorial Based algorithm Active classifier training with the 3DS strategy Link Pattern Prediction with tensor decomposition in multi-relational networks Using gaming strategies for attacker and defender in recommender systems Generating materialized views using ant based approaches and information retrieval technologies