A Multi-Objective Hybrid Filter-Wrapper Evolutionary Approach for Feature Construction on High-Dimensional Data

Marwa Hammami, Slim Bechikh, C. Hung, L. B. Said
Published in: 2018 IEEE Congress on Evolutionary Computation (CEC), 2018-07-01
DOI: 10.1109/CEC.2018.8477771 (https://doi.org/10.1109/CEC.2018.8477771)
Citations: 6

Abstract

Feature selection and feature construction are important pre-processing techniques in data mining. They can reduce dimensionality and also improve classifier accuracy and efficiency, which makes them especially important for high-dimensional data. Feature construction for high-dimensional data remains very challenging because the search space of feature combinations is large and its size grows with the number of features. Recently, researchers have used Genetic Programming (GP) for feature construction with promising results. Unfortunately, the wrapper evaluation of each feature subset, in which a feature may itself be constructed from a combination of features, is computationally intensive because it requires running the classifier on the datasets. Motivated by this observation, we propose a hybrid multi-objective evolutionary approach for efficient feature construction and selection. Our approach uses two filter objectives and one wrapper objective, the latter corresponding to classification accuracy. The whole population is evaluated using the two filter objectives, whereas only the non-dominated (best) feature subsets are improved by an indicator-based local search that optimizes the three objectives simultaneously. Our approach has been assessed on six high-dimensional datasets and compared with two existing prominent GP approaches, using three different classifiers for accuracy evaluation. The results show that our approach performs competitively with, or better than, the two competitor GP algorithms tested in this study.
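The cost-saving idea in the abstract (cheap filter objectives for every candidate, the expensive wrapper objective only for the non-dominated ones) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the two filter objectives, so `subset_size` and `toy_redundancy` below are hypothetical stand-ins, `toy_wrapper` is a placeholder for training and scoring a real classifier, and the indicator-based local search itself is omitted. All objectives are assumed to be minimized.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = Tuple[int, ...]  # a feature subset, encoded as feature indices (assumption)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance for minimization: a is no worse everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return indices of score vectors that no other vector dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def hybrid_evaluate(
    population: List[Candidate],
    filter_objectives: List[Callable[[Candidate], float]],
    wrapper_objective: Callable[[Candidate], float],
) -> Dict[int, Tuple[float, ...]]:
    """Score everyone with the filters; call the wrapper only on the filter front."""
    filter_scores = [tuple(f(ind) for f in filter_objectives) for ind in population]
    front = non_dominated(filter_scores)
    # The expensive wrapper objective is appended only for non-dominated candidates.
    return {i: filter_scores[i] + (wrapper_objective(population[i]),) for i in front}


# Hypothetical filter objectives (placeholders, not taken from the paper).
def subset_size(ind: Candidate) -> float:
    return float(len(ind))


def toy_redundancy(ind: Candidate) -> float:
    return 1.0 / (1 + sum(ind))


wrapper_calls: List[Candidate] = []


def toy_wrapper(ind: Candidate) -> float:
    """Placeholder for classifier error; records each call to show the saving."""
    wrapper_calls.append(ind)
    return 0.1 * len(ind)


population = [(0,), (0, 1), (0, 1, 2), (1, 2)]
result = hybrid_evaluate(population, [subset_size, toy_redundancy], toy_wrapper)
print(sorted(result))      # indices of candidates that received a wrapper score
print(len(wrapper_calls))  # number of wrapper evaluations actually performed
```

In this toy run only two of the four candidates reach the wrapper, which is the kind of saving the hybrid scheme is designed to provide; in the approach described above, the surviving candidates would then be refined by the local search over all three objectives.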