{"title":"集成过滤特征选择模型中聚合方法的效率","authors":"N. Noureldien, Saffa Mohmoud","doi":"10.14738/tmlai.94.10101","DOIUrl":null,"url":null,"abstract":"Ensemble feature selection is recommended as it proves to produce a more stable subset of features and a better classification accuracy when compared to the individual feature selection methods. In this approach, the output of feature selection methods, called base selectors, are combined using some aggregation methods. For filter feature selection methods, a list aggregation method is needed to aggregate output ranked lists into a single list, and since many list aggregation methods have been proposed the decision on which method to use to build the optimum ensemble model is a de facto question. \n In this paper, we investigate the efficiency of four aggregation methods, namely; Min, Median, Arithmetic Mean, and Geometric Mean. The performance of aggregation methods is evaluated using five datasets from different scientific fields with a variant number of instances and features. Besides, the classifies used in the evaluation are selected from three different classes, Trees, Rules, and Bayes. \n The experimental results show that 11 out of the 15 best performance results are corresponding to ensemble models. And out of the 11 best performance ensemble models, the most efficient aggregation methods are Median (5/11), followed by Arithmetic Mean (3/11) and Min (3/11). Also, results show that as the number of features increased, the efficient aggregation method changes from Min to Median to Arithmetic Mean. This may suggest that for a very high number of features the efficient aggregation method is the Arithmetic Mean. And generally, there is no aggregation method that is the best for all cases.","PeriodicalId":119801,"journal":{"name":"Transactions on Machine Learning and Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"The Efficiency of Aggregation Methods in Ensemble Filter Feature Selection Models\",\"authors\":\"N. Noureldien, Saffa Mohmoud\",\"doi\":\"10.14738/tmlai.94.10101\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Ensemble feature selection is recommended as it proves to produce a more stable subset of features and a better classification accuracy when compared to the individual feature selection methods. In this approach, the output of feature selection methods, called base selectors, are combined using some aggregation methods. For filter feature selection methods, a list aggregation method is needed to aggregate output ranked lists into a single list, and since many list aggregation methods have been proposed the decision on which method to use to build the optimum ensemble model is a de facto question. \\n In this paper, we investigate the efficiency of four aggregation methods, namely; Min, Median, Arithmetic Mean, and Geometric Mean. The performance of aggregation methods is evaluated using five datasets from different scientific fields with a variant number of instances and features. Besides, the classifies used in the evaluation are selected from three different classes, Trees, Rules, and Bayes. \\n The experimental results show that 11 out of the 15 best performance results are corresponding to ensemble models. 
And out of the 11 best performance ensemble models, the most efficient aggregation methods are Median (5/11), followed by Arithmetic Mean (3/11) and Min (3/11). Also, results show that as the number of features increased, the efficient aggregation method changes from Min to Median to Arithmetic Mean. This may suggest that for a very high number of features the efficient aggregation method is the Arithmetic Mean. And generally, there is no aggregation method that is the best for all cases.\",\"PeriodicalId\":119801,\"journal\":{\"name\":\"Transactions on Machine Learning and Artificial Intelligence\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-08-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Transactions on Machine Learning and Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.14738/tmlai.94.10101\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Transactions on Machine Learning and Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.14738/tmlai.94.10101","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The Efficiency of Aggregation Methods in Ensemble Filter Feature Selection Models
Ensemble feature selection is recommended because it produces a more stable subset of features and better classification accuracy than individual feature selection methods. In this approach, the outputs of the feature selection methods, called base selectors, are combined using an aggregation method. For filter feature selection methods, a list aggregation method is needed to merge the output ranked lists into a single list, and since many list aggregation methods have been proposed, deciding which method to use to build the optimal ensemble model remains an open question.
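As a minimal sketch of the setup described above (not the authors' code), several filter methods can each produce a ranked list of features, which is then merged into a single list. The choice of filters here (chi-squared, ANOVA F, mutual information) and the example dataset are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Each base selector assigns every feature a relevance score.
scores = [
    chi2(X, y)[0],                              # chi-squared statistic
    f_classif(X, y)[0],                         # ANOVA F-value
    mutual_info_classif(X, y, random_state=0),  # mutual information
]

# Convert scores to ranked lists (rank 1 = most relevant) so the lists are
# comparable, then aggregate them into a single list, e.g. by median rank.
ranks = np.vstack([rankdata(-s) for s in scores])   # shape: (selectors, features)
ensemble_rank = np.median(ranks, axis=0)
top10 = np.argsort(ensemble_rank)[:10]              # indices of the 10 best features
```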
In this paper, we investigate the efficiency of four aggregation methods, namely Min, Median, Arithmetic Mean, and Geometric Mean. The performance of the aggregation methods is evaluated on five datasets from different scientific fields with varying numbers of instances and features. In addition, the classifiers used in the evaluation are drawn from three different classes: Trees, Rules, and Bayes.
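The four aggregation rules can be sketched as follows; this is a self-contained toy illustration rather than the paper's experimental code, and the small rank matrix (rows are base selectors, columns are features, rank 1 = most relevant) is an assumed example.

```python
import numpy as np
from scipy.stats import gmean

ranks = np.array([[1, 3, 2, 4],    # ranks from base selector A
                  [2, 1, 4, 3],    # ranks from base selector B
                  [1, 4, 2, 3]])   # ranks from base selector C

aggregators = {
    "Min":             lambda r: r.min(axis=0),
    "Median":          lambda r: np.median(r, axis=0),
    "Arithmetic Mean": lambda r: r.mean(axis=0),
    "Geometric Mean":  lambda r: gmean(r, axis=0),
}

for name, agg in aggregators.items():
    combined = agg(ranks)               # one aggregated rank per feature
    final_order = np.argsort(combined)  # feature indices ordered best-first
    print(f"{name:>15}: {final_order}")
```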
The experimental results show that 11 of the 15 best-performing results correspond to ensemble models. Among these 11 ensemble models, the most efficient aggregation method is Median (5/11), followed by Arithmetic Mean (3/11) and Min (3/11). The results also show that as the number of features increases, the most efficient aggregation method shifts from Min to Median to Arithmetic Mean, which may suggest that for a very large number of features the Arithmetic Mean is the most efficient choice. In general, no single aggregation method is best in all cases.