Behrouz Ahadzadeh, Moloud Abdar, Mahdieh Foroumandi, Fatemeh Safara, Abbas Khosravi, Salvador García, Ponnuthurai Nagaratnam Suganthan
Swarm and Evolutionary Computation, volume 91, Article 101715. Published 2024-09-06. DOI: 10.1016/j.swevo.2024.101715. Impact factor: 8.2 (JCR Q1, Computer Science, Artificial Intelligence). Article page: https://www.sciencedirect.com/science/article/pii/S2210650224002530
UniBFS: A novel uniform-solution-driven binary feature selection algorithm for high-dimensional data
Feature selection (FS) is a crucial technique in machine learning and data mining. It simplifies model construction, facilitates knowledge discovery, improves computational efficiency, and reduces memory consumption. Despite its importance, the constantly increasing search space of high-dimensional datasets poses significant challenges to FS methods, including the "curse of dimensionality," susceptibility to local optima, and high computational and memory costs. To overcome these challenges, this study develops a new FS algorithm named Uniform-solution-driven Binary Feature Selection (UniBFS). UniBFS exploits the inherent characteristic of binary algorithms, namely binary coding, to search the entire problem space, identifying relevant features while avoiding irrelevant ones. To improve the effectiveness and efficiency of UniBFS, the paper also presents a Redundant Features Elimination (RFE) algorithm. RFE performs a local search in a very small subspace of the solutions obtained by UniBFS at different stages and removes redundant features that do not increase classification accuracy. Moreover, the study proposes a hybrid algorithm that combines UniBFS with two filter-based FS methods, ReliefF and Fisher, to identify pertinent features during the global search phase. The proposed algorithms are evaluated on 30 high-dimensional datasets ranging from 2,000 to 54,676 dimensions. According to the authors, the comparison with state-of-the-art techniques shows that the proposed algorithms are superior in both effectiveness and efficiency.
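The redundancy-removal idea in the abstract can be illustrated with a minimal sketch. This is not the authors' RFE algorithm, whose details the abstract does not give; it is a plain greedy backward pass over a binary feature mask that drops a selected feature whenever doing so does not lower a score. The names `eliminate_redundant` and `toy_score` are hypothetical, and the toy scorer stands in for what would in practice be a classifier's accuracy.

```python
from typing import Callable, List, Sequence


def eliminate_redundant(
    mask: Sequence[int],
    score: Callable[[Sequence[int]], float],
) -> List[int]:
    """Greedy backward pass over a binary feature mask (illustrative only).

    Each selected feature (bit 1) is tentatively switched off; the removal
    is kept if the score does not decrease, otherwise the bit is restored.
    """
    current = list(mask)
    best = score(current)
    for i in range(len(current)):
        if current[i] == 0:
            continue  # feature is not selected, nothing to remove
        current[i] = 0
        trial = score(current)
        if trial >= best:
            best = trial  # the feature was redundant: keep it removed
        else:
            current[i] = 1  # the feature mattered: restore it
    return current


def toy_score(mask: Sequence[int]) -> float:
    """Hypothetical stand-in for classification accuracy.

    Assumes features 0 and 1 carry the same signal (so one of them is
    redundant), feature 3 carries a second signal, and the remaining
    features are irrelevant.
    """
    signal_a = 1 if (mask[0] or mask[1]) else 0
    signal_b = 1 if mask[3] else 0
    return (signal_a + signal_b) / 2


if __name__ == "__main__":
    start = [1, 1, 1, 1, 0]
    print(eliminate_redundant(start, toy_score))  # prints [0, 1, 0, 1, 0]
```

In the toy run, feature 0 is dropped because feature 1 duplicates its signal, feature 2 is dropped as irrelevant, and features 1 and 3 are kept because removing either lowers the score. Accepting ties (`>=`) is a design choice in this sketch that favors smaller subsets.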
About the journal:
Swarm and Evolutionary Computation is a peer-reviewed journal focused on the latest research and advancements in nature-inspired intelligent computation using swarm and evolutionary algorithms. It covers theoretical, experimental, and practical aspects of these paradigms and their hybrids, promoting interdisciplinary research. The journal prioritizes the publication of high-quality, original articles that push the boundaries of evolutionary computation and swarm intelligence. Additionally, it welcomes survey papers on current topics and novel applications. Topics of interest include but are not limited to: Genetic Algorithms and Genetic Programming; Evolution Strategies and Evolutionary Programming; Differential Evolution; Artificial Immune Systems; Particle Swarms; Ant Colony; Bacterial Foraging; Artificial Bees; Fireflies Algorithm; Harmony Search; Artificial Life; Digital Organisms; Estimation of Distribution Algorithms; Stochastic Diffusion Search; Quantum Computing; Nano Computing; Membrane Computing; Human-centric Computing; Hybridization of Algorithms; Memetic Computing; Autonomic Computing; Self-organizing Systems; and Combinatorial, Discrete, Binary, Constrained, Multi-objective, Multi-modal, Dynamic, and Large-scale Optimization.