{"title":"Feature selection based on partition clustering","authors":"Shuang Liu, Qiang Zhao, Xiang Wu","doi":"10.3233/KES-140293","journal":{"name":"Int. J. Knowl. Based Intell. Eng. Syst.","volume":"48 1","pages":"0"},"publicationDate":"2014-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"2","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.3233/KES-140293"}
Feature selection plays an important role in data mining, machine learning, and pattern recognition, especially for large-scale, high-dimensional data. Many selection techniques have been proposed in recent years. Their general aim is to use some metric to measure how relevant or irrelevant each feature is to a given task, and then to select fewer features without degrading discriminative capability. No single technique, however, outperforms all others on every kind of data, because real data are characterized by incorrectness, incompleteness, inconsistency, and diversity. Motivated by this, the paper proposes a new scheme for feature selection based on partition clustering; the scheme is a preprocessing procedure that is independent of the selection technique used. Experiments on UCI data sets show that, in most cases, selection techniques combined with the proposed scheme perform better than the same techniques applied without it.
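
The abstract does not say how the partitions are formed or how they feed into the selector, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes that the samples are partitioned with plain k-means, that each feature is scored inside every partition by its absolute Pearson correlation with the target, and that the per-partition scores are combined by a size-weighted sum. The function names `kmeans`, `relevance`, and `select_features` are hypothetical, and any other filter metric could stand in for the correlation score.

```python
import random
from statistics import mean, pstdev


def kmeans(rows, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed partitioning step); returns one cluster index per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every row to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centers[c])))
            for r in rows
        ]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


def relevance(column, target):
    """Assumed relevance metric: absolute Pearson correlation with the target."""
    sx, sy = pstdev(column), pstdev(target)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column (or constant target) carries no signal here
    mx, my = mean(column), mean(target)
    cov = mean((x - mx) * (y - my) for x, y in zip(column, target))
    return abs(cov / (sx * sy))


def select_features(rows, target, n_keep, k=2):
    """Partition the samples, score features per partition, keep the top n_keep indices."""
    labels = kmeans(rows, k)
    n_features = len(rows[0])
    totals = [0.0] * n_features
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if len(idx) < 2:
            continue  # too few samples in this partition to estimate a correlation
        part_y = [target[i] for i in idx]
        for j in range(n_features):
            column = [rows[i][j] for i in idx]
            # Size-weighted sum of per-partition scores (an assumed combination rule).
            totals[j] += len(idx) * relevance(column, part_y)
    ranked = sorted(range(n_features), key=lambda j: -totals[j])
    return sorted(ranked[:n_keep])
```

Because the partitioning happens before any scoring, `relevance` can be replaced by a different filter metric without touching the clustering step, which is the sense in which such a scheme is independent of the selection technique.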