FWCEC: An Enhanced Feature Weighting Method via Causal Effect for Clustering
Authors: Fuyuan Cao; Xuechun Jing; Kui Yu; Jiye Liang
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 2, pp. 685-697 (Impact Factor 8.9; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-11-28
Publication type: Journal Article
DOI: 10.1109/TKDE.2024.3508057
URL: https://ieeexplore.ieee.org/document/10770832/
Citations: 0
Abstract
Feature weighting aims to assign different weights to features based on their importance in machine learning tasks. In clustering tasks, the existing methods learn feature importance based on the clustering results derived from the collaborative contribution of all features, which overlooks the independent effect of each feature. In fact, there are underlying causal relationships between features and the clustering results, and the features with high causal effects are always more crucial for clustering. Therefore, we propose an enhanced Feature Weighting method via Causal Effect for Clustering (FWCEC), calculating the causal effect of each feature on the clustering results for obtaining the independent contribution of each feature. Specifically, we start by identifying the causal relationships among the features and utilizing the causal relationships to generate a reasonable treatment group. Next, we compare the changes in the data distribution between the treatment and control groups to determine the causal effect of each feature. Finally, the causal effects of features are used for enhancing the clustering-driven weight learning. Moreover, we present a theory of relative order consistency in causal effect. Experimental results demonstrate that utilizing causal effect in weight learning facilitates efficient convergence and achieves superior accuracy compared to state-of-the-art clustering algorithms.
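The following is a minimal sketch of the general idea in the abstract, not the authors' FWCEC algorithm. It makes two loudly simplifying assumptions: shuffling one feature at a time stands in for the paper's causally informed treatment-group generation, and the fraction of points whose cluster assignment changes stands in for the paper's measure of distribution change between treatment and control groups. All function names (`weighted_kmeans`, `assign`, `intervention_effects`, `effects_to_weights`) are hypothetical.

```python
import numpy as np


def assign(X, centers, weights):
    """Assign each point to the nearest center under a feature-weighted squared distance."""
    diffs = X[:, None, :] - centers[None, :, :]
    dists = ((diffs ** 2) * weights).sum(axis=2)
    return dists.argmin(axis=1)


def weighted_kmeans(X, k, weights, n_iter=100, seed=0):
    """Lloyd-style k-means in which every feature carries a fixed weight."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers, weights)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the previous center if a cluster empties
                centers[j] = members.mean(axis=0)
    return assign(X, centers, weights), centers


def intervention_effects(X, centers, weights, seed=0):
    """Per-feature effect: share of points whose cluster changes when only that
    feature is shuffled (assumed stand-in for a treatment group)."""
    rng = np.random.default_rng(seed)
    control = assign(X, centers, weights)  # control group: untouched data
    effects = np.zeros(X.shape[1])
    for f in range(X.shape[1]):
        treated = X.copy()
        treated[:, f] = rng.permutation(treated[:, f])  # intervene on feature f only
        effects[f] = np.mean(assign(treated, centers, weights) != control)
    return effects


def effects_to_weights(effects, eps=1e-12):
    """Normalize non-negative effects into weights that sum to one."""
    shifted = effects + eps
    return shifted / shifted.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data: two clusters separated along feature 0; features 1 and 2 are noise.
    X = np.vstack([
        rng.normal([-5.0, 0.0, 0.0], 1.0, size=(100, 3)),
        rng.normal([5.0, 0.0, 0.0], 1.0, size=(100, 3)),
    ])
    w = np.full(3, 1.0 / 3.0)
    labels, centers = weighted_kmeans(X, k=2, weights=w)
    effects = intervention_effects(X, centers, w)
    print("effects:", effects)
    print("weights:", effects_to_weights(effects))
```

In this toy setting the per-feature effects isolate each feature's independent contribution, in the spirit of the abstract: only the feature that actually separates the clusters should receive a large weight. Feeding the resulting weights back into `weighted_kmeans` would give one round of effect-enhanced weight learning under the assumptions above.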
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.