N. Díaz-Díaz, J. Aguilar-Ruiz, Juan A. Nepomuceno, Jorge García
{"title":"Feature selection based on bootstrapping","authors":"N. Díaz-Díaz, J. Aguilar-Ruiz, Juan A. Nepomuceno, Jorge García","doi":"10.1109/CIMA.2005.1662338","DOIUrl":null,"url":null,"abstract":"The results of feature selection methods have a great influence on the success of data mining processes, especially when the data sets have high dimensionality. In order to find the optimal result from feature selection methods, we should check each possible subset of features to obtain the precision on classification, i.e., an exhaustive search through the search space. However, it is an unfeasible task due to its computational complexity. In this paper, we propose a novel method of feature selection based on bootstrapping techniques. Our approach shows that it is not necessary to try every subset of features, but only a very small subset of combinations to achieve the same performance as the exhaustive approach. The experiments have been carried out using very high-dimensional datasets (thousands of features) and they show that it is possible to maintain the precision at the same time that the complexity is reduced substantially","PeriodicalId":306045,"journal":{"name":"2005 ICSC Congress on Computational Intelligence Methods and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2005 ICSC Congress on Computational Intelligence Methods and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIMA.2005.1662338","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
The results of feature selection methods have a great influence on the success of data mining processes, especially when the data sets have high dimensionality. In order to find the optimal result from feature selection methods, we should check each possible subset of features to obtain the precision on classification, i.e., an exhaustive search through the search space. However, it is an unfeasible task due to its computational complexity. In this paper, we propose a novel method of feature selection based on bootstrapping techniques. Our approach shows that it is not necessary to try every subset of features, but only a very small subset of combinations to achieve the same performance as the exhaustive approach. The experiments have been carried out using very high-dimensional datasets (thousands of features) and they show that it is possible to maintain the precision at the same time that the complexity is reduced substantially