Title: Adaptive and flexible \(\ell_1\)-norm graph embedding for unsupervised feature selection
Authors: Kun Jiang, Ting Cao, Lei Zhu, Qindong Sun
Journal: Applied Intelligence, vol. 54, no. 22, pp. 11732-11751 (Impact Factor 3.4; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-09-02
DOI: 10.1007/s10489-024-05760-z
URL: https://link.springer.com/article/10.1007/s10489-024-05760-z
Citations: 0
Abstract
Unsupervised feature selection (UFS) is a fundamental dimension reduction method for large amounts of high-dimensional unlabeled data. Without label information, manifold learning is leveraged to compensate for the lack of discriminative information in the selected features. However, it remains challenging to capture the geometrical structure of practical data, which are often contaminated by noise and outliers. Additionally, UFS models that embed a predetermined graph suffer from parameter tuning problems and from separated model optimization procedures. To generate more compact and discriminative feature subsets, we propose a Robust UFS model with Adaptive and Flexible \(\ell_{1}\)-norm Graph (RAFG) embedding. Specifically, the \(\ell_{2,1}\)-norm is imposed on the flexible regression term to alleviate the adverse effects of both noisy features and outliers, and an \(\ell_{2,p}\)-norm regularization term is incorporated to ensure that the transformation matrix is sufficiently sparse. Moreover, the adaptive \(\ell_{1}\)-norm graph learning characterizes the clustering distribution via consistent embeddings, which avoids time-consuming distance computations in the high-dimensional feature space. To solve the resulting optimization problem, we propose an efficient alternating update algorithm with an iteratively reweighted strategy, together with the necessary convergence and complexity analyses. Finally, experimental results on two synthetic datasets and eight benchmark datasets illustrate the effectiveness and superiority of the proposed RAFG method compared with state-of-the-art methods.
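For readers unfamiliar with the row-wise norms mentioned in the abstract, the following is a minimal sketch of how an \(\ell_{2,p}\)-norm of a transformation matrix can be computed and how features are commonly ranked by the \(\ell_2\) norms of its rows. It is not the authors' RAFG implementation: the function names `l2p_norm` and `rank_features`, the toy matrix `W`, and the convention \(\|W\|_{2,p} = (\sum_i \|w_i\|_2^p)^{1/p}\) with \(w_i\) the \(i\)-th row are assumptions made for illustration only.

```python
import numpy as np


def l2p_norm(W, p=1.0):
    """Assumed convention: (sum_i ||w_i||_2^p)^(1/p), where w_i is the i-th row.

    With p = 1 this reduces to the l_{2,1} norm (sum of row l2 norms).
    """
    row_norms = np.sqrt(np.sum(W * W, axis=1))
    return float(np.sum(row_norms ** p) ** (1.0 / p))


def rank_features(W, k):
    """Return the indices of the k rows of W with the largest l2 norm.

    Each row of W corresponds to one input feature, so rows that are
    (near) zero mark features that a row-sparse W effectively discards.
    """
    row_norms = np.linalg.norm(W, axis=1)
    order = np.argsort(-row_norms, kind="stable")  # descending by row norm
    return order[:k]


# Hypothetical 3-feature, 2-dimensional transformation matrix.
W = np.array([[3.0, 4.0],   # row norm 5
              [0.0, 0.0],   # row norm 0: feature would be dropped
              [0.6, 0.8]])  # row norm 1

print(l2p_norm(W, p=1.0))    # about 6 (row norms 5 + 0 + 1)
print(rank_features(W, 2))   # [0 2]
```

Penalizing a norm of this kind drives whole rows of the matrix toward zero, which is why the surviving row norms can serve as feature scores; smaller values of `p` generally favor sparser solutions.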
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.