A general adaptive unsupervised feature selection with auto-weighting

Neural Networks (IF 6.0; Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence) | Volume 181, Article 106840 | Pub Date: 2024-10-31 | DOI: 10.1016/j.neunet.2024.106840

Huming Liao, Hongmei Chen, Tengyu Yin, Zhong Yuan, Shi-Jinn Horng, Tianrui Li


Citations: 0



A general adaptive unsupervised feature selection with auto-weighting

Feature selection (FS) is essential in machine learning and data mining as it makes handling high-dimensional data more efficient and reliable. More attention has been paid to unsupervised feature selection (UFS) due to the extra resources required to obtain labels for data in the real world. Most of the existing embedded UFS utilize a sparse projection matrix for FS. However, this may introduce additional regularization terms, and it is difficult to control the sparsity of the projection matrix well. Moreover, such methods may seriously destroy the original feature structure in the embedding space. Instead, avoiding projecting the original data into the low-dimensional embedding space and identifying features directly from the raw features that perform well in the process of making the data show a distinct cluster structure is a feasible solution. Inspired by this, this paper proposes a model called A General Adaptive Unsupervised Feature Selection with Auto-weighting (GAWFS), which utilizes two techniques, non-negative matrix factorization, and adaptive graph learning, to simulate the process of dividing data into clusters, and identifies the features that are most discriminative in the clustering process by a feature weighting matrix Θ. Since the weighting matrix is sparse, it also plays the role of FS or a filter. Finally, experiments comparing GAWFS with several state-of-the-art UFS methods on synthetic datasets and real-world datasets are conducted, and the results demonstrate the superiority of the GAWFS.
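The remark that a sparse weighting matrix "also plays the role of FS or a filter" can be illustrated with a small sketch. It is not the GAWFS algorithm: the abstract does not say how Θ is structured or learned, so the code below assumes Θ is diagonal with non-negative entries and uses a hand-picked weight vector `theta`. The helper names `apply_feature_weights` and `selected_features` are hypothetical.

```python
import numpy as np


def apply_feature_weights(X, theta):
    """Scale each feature (column) of X by its weight, i.e. X @ diag(theta)."""
    theta = np.asarray(theta, dtype=float)
    # Broadcasting over columns is equivalent to X @ np.diag(theta).
    return X * theta


def selected_features(theta, k=None, tol=1e-12):
    """Return the indices of the features kept by a sparse weight vector.

    With k=None, every feature whose weight exceeds tol is kept.
    With an integer k, the k largest-weight features are kept instead.
    """
    theta = np.asarray(theta, dtype=float)
    if k is not None:
        order = np.argsort(-theta, kind="stable")
        return np.sort(order[:k])
    return np.flatnonzero(theta > tol)


# Toy data: 3 samples, 4 features (values are illustrative only).
X = np.array([[1.0, 5.0, 0.2, 3.0],
              [2.0, 4.0, 0.1, 1.0],
              [3.0, 6.0, 0.3, 2.0]])

# Assumed sparse weights: features 1 and 2 receive zero weight.
theta = np.array([0.6, 0.0, 0.0, 0.4])

X_weighted = apply_feature_weights(X, theta)  # zero-weight columns vanish
kept = selected_features(theta)               # indices with non-zero weight
X_selected = X[:, kept]                       # raw features, unprojected
```

Because the weights act on the raw columns rather than on a projection, the surviving columns of `X_selected` are original features, which matches the abstract's motivation of avoiding a low-dimensional embedding.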


Source journal

Neural Networks (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.90

Self-citation rate: 7.70%

Articles published: 425

Review time: 67 days

Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.