Title: Feature Subspace Learning-Based Binary Differential Evolution Algorithm for Unsupervised Feature Selection
Authors: Tao Li, Yuhua Qian, Feijiang Li, Xinyan Liang, Zhi-Hui Zhan
Journal: IEEE Transactions on Big Data, Volume 11, Issue 1, pp. 99-114
Publication date: 2024-03-19
DOI: 10.1109/TBDATA.2024.3378090
URL: https://ieeexplore.ieee.org/document/10473134/
Impact factor: 7.5 (JCR Q1, Computer Science, Information Systems)
Citation count: 0
Abstract
Selecting informative features that preserve the manifold structure of the original feature space is a challenging task. Many unsupervised feature selection methods still suffer from poor clustering performance on the selected feature subset. To tackle this problem, a feature subspace learning-based binary differential evolution algorithm is proposed for unsupervised feature selection. First, a new unsupervised feature selection framework based on evolutionary computation is designed, in which feature subspace learning and the population search mechanism are combined into a unified unsupervised feature selection process. Second, a local manifold structure learning strategy and a sample pseudo-label learning strategy are presented to calculate the importance of the selected feature subspace. Third, a binary differential evolution algorithm is developed to optimize the selected feature subspace, in which a binary information migration mutation operator and an adaptive crossover operator are designed to promote the search for the globally optimal feature subspace. Experimental results on various types of real-world datasets demonstrate that the proposed algorithm obtains a more informative feature subset and competitive clustering performance compared with eight state-of-the-art unsupervised feature selection methods.
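To make the general idea concrete, the following is a minimal sketch of a generic binary differential evolution loop over feature masks. It is not the authors' method: the abstract does not specify the binary information migration mutation operator, the adaptive crossover operator, or the manifold and pseudo-label scoring. The sketch substitutes a common XOR-based mutation, plain binomial crossover with a fixed rate, and a stand-in fitness that rewards preserving pairwise sample distances with few features. All names and parameters (`make_fitness`, `binary_de`, `size_penalty`, `F`, `CR`) are hypothetical.

```python
import random
from itertools import combinations


def pairwise_sq_dists(X, mask):
    """Squared distances between all sample pairs, using only features with mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    return [sum((X[a][j] - X[b][j]) ** 2 for j in cols)
            for a, b in combinations(range(len(X)), 2)]


def pearson(u, v):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def make_fitness(X, size_penalty=0.1):
    """Stand-in unsupervised score (an assumption, not the paper's criterion)."""
    d = len(X[0])
    full = pairwise_sq_dists(X, [1] * d)

    def fitness(mask):
        if not any(mask):
            return float("-inf")  # an empty feature subset is invalid
        structure = pearson(full, pairwise_sq_dists(X, mask))
        return structure - size_penalty * sum(mask) / d

    return fitness


def binary_de(fitness, d, pop_size=12, generations=30, F=0.5, CR=0.7, seed=0):
    """Generic binary DE; returns (best_mask, best_score, best_score_history)."""
    rng = random.Random(seed)
    pop = []
    for _ in range(pop_size):
        ind = [rng.randint(0, 1) for _ in range(d)]
        if not any(ind):
            ind[rng.randrange(d)] = 1  # guarantee at least one selected feature
        pop.append(ind)
    scores = [fitness(ind) for ind in pop]
    history = [max(scores)]
    for _ in range(generations):
        for i in range(pop_size):
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            # Mutation: where the two donors disagree, flip the base bit with probability F.
            mutant = []
            for j in range(d):
                differ = pop[r2][j] != pop[r3][j]
                flip = differ and rng.random() < F
                mutant.append(1 - pop[r1][j] if flip else pop[r1][j])
            # Binomial crossover; position j_rand always comes from the mutant.
            j_rand = rng.randrange(d)
            trial = [mutant[j] if (j == j_rand or rng.random() < CR) else pop[i][j]
                     for j in range(d)]
            # Greedy selection: keep the trial only if it is at least as good.
            s = fitness(trial)
            if s >= scores[i]:
                pop[i], scores[i] = trial, s
        history.append(max(scores))
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best], history


# Toy data: 10 samples, 6 features; the last three features are near-constant noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 3) for _ in range(3)] + [data_rng.gauss(0, 0.01) for _ in range(3)]
     for _ in range(10)]
mask, score, history = binary_de(make_fitness(X), d=6)
print(mask, round(score, 3))
```

Because selection is greedy, the best score in the population can never decrease from one generation to the next, which makes the `history` list a convenient sanity check. A real implementation would replace `make_fitness` with a criterion closer to the one the paper describes.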
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.