The Efficiency of Aggregation Methods in Ensemble Filter Feature Selection Models

N. Noureldien, Saffa Mohmoud
Transactions on Machine Learning and Artificial Intelligence. Published: 2021-08-17. DOI: 10.14738/tmlai.94.10101. Citations: 2.

Abstract

Ensemble feature selection is recommended because it produces a more stable subset of features and better classification accuracy than individual feature selection methods. In this approach, the outputs of several feature selection methods, called base selectors, are combined using an aggregation method. For filter feature selection methods, a list aggregation method is needed to merge the output ranked lists into a single list, and since many list aggregation methods have been proposed, deciding which one to use to build the optimum ensemble model remains an open question.

In this paper, we investigate the efficiency of four aggregation methods: Min, Median, Arithmetic Mean, and Geometric Mean. Their performance is evaluated on five datasets from different scientific fields with varying numbers of instances and features. The classifiers used in the evaluation are drawn from three different families: Trees, Rules, and Bayes.

The experimental results show that 11 of the 15 best-performing results correspond to ensemble models. Among these 11 best-performing ensemble models, the most efficient aggregation method is Median (5/11), followed by Arithmetic Mean (3/11) and Min (3/11). The results also show that as the number of features increases, the most efficient aggregation method shifts from Min to Median to Arithmetic Mean, which may suggest that for a very large number of features the most efficient aggregation method is the Arithmetic Mean. In general, no single aggregation method is best for all cases.
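The list aggregation step described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: each base selector produces a per-feature rank (1 = most important), and the aggregation method (Min, Median, Arithmetic Mean, or Geometric Mean) combines the ranks of each feature across selectors into one score, which is then used to re-rank the features. The feature names and selector labels are invented for the example.

```python
import statistics

def aggregate_ranks(rank_lists, method="median"):
    """Combine per-selector feature ranks into a single ranked list.

    rank_lists: one dict per base selector, mapping feature -> rank
    (1 = most important). Returns features sorted by aggregated score,
    best first.
    """
    features = list(rank_lists[0].keys())
    combine = {
        "min": min,
        "median": statistics.median,
        "mean": statistics.mean,
        "geomean": statistics.geometric_mean,
    }[method]
    # Aggregate each feature's ranks across all base selectors.
    scores = {f: combine([r[f] for r in rank_lists]) for f in features}
    # A lower aggregated rank score means a more important feature.
    return sorted(features, key=lambda f: scores[f])

# Hypothetical ranked lists from three filter base selectors.
ranks = [
    {"f1": 1, "f2": 3, "f3": 2},   # selector A
    {"f1": 2, "f2": 1, "f3": 3},   # selector B
    {"f1": 1, "f2": 2, "f3": 3},   # selector C
]
print(aggregate_ranks(ranks, "median"))   # ['f1', 'f2', 'f3']
```

The top-k features of the aggregated list would then be fed to a classifier (e.g., a tree, rule, or Bayes learner) to evaluate the ensemble model, as in the paper's experiments.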