Soft partitions lead to better learned ensembles
S. Eschrich, L. Hall
2002 Annual Meeting of the North American Fuzzy Information Processing Society Proceedings. NAFIPS-FLINT 2002 (Cat. No. 02TH8622), 2002-08-07
DOI: https://doi.org/10.1109/NAFIPS.2002.1018094
Citations: 12
Abstract
Ensembles of classifiers often provide better classification accuracy than a single classifier. One approach to creating ensembles is to create different subsets of the training data. We present a method of creating ensembles of classifiers by partitioning the dataset into regions using clustering. Learners are assigned to each region and the ensemble classification occurs by querying the learned classifier. The first strategy considered for partitioning the training set is to generate a hard, non-overlapping partition. This approach is shown to perform worse than a single classifier using the entire training set. However, the use of soft partitions significantly improves the overall ensemble performance. Three different methods of creating soft partitions are considered: a simple distance ratio, and both the fuzzy c-means and possibilistic c-means membership functions. All three methods are found to improve overall classifier performance beyond hard partitioning and often perform better than the base classifier using the entire training set. Experiments on six datasets illustrate the improved accuracy from creating ensembles on soft partitions of data.
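To make the idea of a soft partition concrete, here is a minimal sketch, not the authors' implementation, of how the standard fuzzy c-means membership function could be used to place a training example in more than one region. The function names, the fuzzifier m = 2, the membership threshold of 0.3, and the rule that every point also joins its nearest cluster are all assumptions made for illustration; the abstract does not give these details.

```python
import math


def fcm_memberships(point, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership of one point in each cluster.

    The memberships are non-negative and sum to 1. `m` is the fuzzifier
    (assumed value) and `eps` guards against a zero distance.
    """
    dists = [max(math.dist(point, c), eps) for c in centers]
    weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def soft_partition(points, centers, threshold=0.3):
    """Return one list of point indices per cluster; regions may overlap.

    Hypothetical rule: a point joins a region when its membership reaches
    `threshold`, and it always joins its highest-membership region.
    """
    regions = [[] for _ in centers]
    for i, p in enumerate(points):
        u = fcm_memberships(p, centers)
        nearest = max(range(len(centers)), key=lambda k: u[k])
        for k, u_k in enumerate(u):
            if u_k >= threshold or k == nearest:
                regions[k].append(i)
    return regions


# Toy data: two cluster centers and four points on a line.
centers = [(0.0, 0.0), (4.0, 0.0)]
points = [(0.0, 0.0), (4.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
regions = soft_partition(points, centers)
print(regions)
```

In this toy example the point halfway between the two centers has membership 0.5 in each cluster, so it appears in both regions, whereas a hard partition would assign it to only one. A separate learner would then be trained on each overlapping region.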