Fairness in constrained spectral clustering

Neurocomputing | IF 6.5 | Zone 2 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence) | Pub Date: 2025-06-14 | Epub Date: 2025-03-01 | DOI: 10.1016/j.neucom.2025.129815

Laxita Agrawal, V. Vijaya Saradhi, Teena Sharma

Neurocomputing, volume 634, Article 129815. Publication type: Journal Article. Open access: no. Full text: https://www.sciencedirect.com/science/article/pii/S0925231225004874

Citations: 0

Abstract


Fairness in constrained spectral clustering

Semi-supervised clustering methods have gained significant attention in both theoretical research and real-world applications, including economics, finance, marketing, and healthcare. Among these methods, constrained spectral clustering enhances clustering quality by incorporating pairwise constraints, namely, must-link and cannot-link constraints, which guide the clustering process by specifying whether certain data points should or should not belong to the same cluster. However, traditional constrained spectral clustering methods may inadvertently propagate biases present in the data or constraints, leading to unequal representation of sensitive groups, such as different genders or racial groups, across clusters. This imbalance raises concerns about fairness, an issue that remains largely unexplored in constrained spectral clustering. To address this gap, this paper proposes a novel method named fair-constrained Spectral Clustering (fair-cSC). The proposed method integrates fairness into the must-link and cannot-link constraints by defining a fair constraint matrix, ensuring that pairwise relationships do not introduce bias against any particular group. Additionally, a balance constraint is incorporated to enforce fairness across input data points, promoting equal representation of sensitive groups within clusters. Comprehensive experiments on six benchmarked datasets, including ablation studies, demonstrate that the proposed fair-cSC method effectively enhances fairness while preserving clustering quality. Furthermore, the ablation study provides insights into the method’s performance under different settings, reinforcing its robustness and applicability in real-world scenarios.
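The abstract names two ingredients, pairwise must-link/cannot-link constraints and a balance requirement over sensitive groups, without giving their definitions. The sketch below illustrates both on toy data under stated assumptions: constraints are encoded in a symmetric matrix with +1 for must-link and -1 for cannot-link (a common convention in the constrained spectral clustering literature), and balance is measured as the smallest ratio between the least and most represented group in any cluster (a common fairness measure for clustering). The function names are hypothetical, and none of this is taken from the fair-cSC paper itself.

```python
import numpy as np


def constraint_matrix(n, must_link, cannot_link):
    """Symmetric n x n matrix: +1 for must-link pairs, -1 for cannot-link pairs, 0 elsewhere.

    Assumed encoding; the paper's "fair constraint matrix" may differ.
    """
    Q = np.zeros((n, n))
    for i, j in must_link:
        Q[i, j] = Q[j, i] = 1.0
    for i, j in cannot_link:
        Q[i, j] = Q[j, i] = -1.0
    return Q


def constraint_satisfaction(Q, labels):
    """Fraction of constrained pairs that a cluster assignment respects."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    upper = np.triu_indices_from(Q, k=1)  # count each unordered pair once
    q, s = Q[upper], same[upper]
    constrained = q != 0
    if not constrained.any():
        return 1.0
    ok = ((q > 0) & s) | ((q < 0) & ~s)
    return float(ok[constrained].mean())


def cluster_balance(labels, groups):
    """Minimum over clusters of (smallest group count / largest group count).

    1.0 means every cluster holds the groups in equal numbers;
    0.0 means some cluster is missing a group entirely.
    """
    labels = np.asarray(labels)
    groups = np.asarray(groups)
    group_ids = np.unique(groups)
    worst = 1.0
    for c in np.unique(labels):
        members = groups[labels == c]
        counts = np.array([(members == g).sum() for g in group_ids])
        worst = min(worst, counts.min() / counts.max())
    return float(worst)


# Toy data: 8 points, two sensitive groups of 4 points each.
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
Q = constraint_matrix(8, must_link=[(0, 4)], cannot_link=[(0, 2)])

mixed = np.array([0, 0, 1, 1, 0, 0, 1, 1])       # each cluster: 2 + 2
segregated = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # clusters follow the groups

for name, labels in [("mixed", mixed), ("segregated", segregated)]:
    print(name, cluster_balance(labels, groups), constraint_satisfaction(Q, labels))
```

On this toy input the mixed assignment satisfies both constraints and is perfectly balanced, while the segregated assignment violates both constraints and has zero balance. Real data will generally involve a trade-off between the two scores, which is the tension the abstract describes.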


Source journal

Neurocomputing (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days

Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.