Fair Clustering Ensemble With Equal Cluster Capacity

Peng Zhou;Rongwen Li;Zhaolong Ling;Liang Du;Xinwang Liu
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 3, pp. 1729-1746
Published: 2024-11-28
DOI: 10.1109/TPAMI.2024.3507857
URL: https://ieeexplore.ieee.org/document/10770826/

Abstract

Clustering ensembles have been widely studied in data mining and machine learning. However, existing clustering ensemble methods do not account for fairness, which matters in real-world applications, especially those involving humans. To address this issue, this paper proposes a novel fair clustering ensemble method that takes multiple base clustering results as input and learns a fair consensus clustering result. While designing the algorithm, we observe that one widely used definition of fairness may cause a cluster imbalance problem. To tackle this problem, we give a new definition of fairness that simultaneously characterizes fairness and cluster capacity equality. Based on this new definition, we design an extremely simple yet effective regularization term that promotes both fairness and cluster capacity equality, and we plug this term into our clustering ensemble framework to obtain the new fair clustering ensemble method. Extensive experiments show that, compared with state-of-the-art clustering ensemble methods, our method achieves comparable or even better clustering performance while producing much fairer results with more equal cluster capacities, which demonstrates its effectiveness.
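
The cluster imbalance issue mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which fairness definition it refers to, nor does it give the form of the new definition or the regularization term, so the code below is only an assumption-laden illustration: it assumes a proportional-representation notion of fairness (each cluster should mirror the dataset's group proportions) and uses hypothetical helper names (`max_proportion_gap`, `capacity_ratio`) that do not come from the paper.

```python
from collections import Counter


def max_proportion_gap(labels, groups):
    """Largest absolute gap between a group's share inside any cluster and
    its share in the whole dataset (0.0 means perfectly proportional).
    Assumed fairness notion for illustration; not the paper's definition."""
    n = len(groups)
    overall = Counter(groups)
    per_cluster = {}
    for cluster, group in zip(labels, groups):
        per_cluster.setdefault(cluster, Counter())[group] += 1
    gap = 0.0
    for counts in per_cluster.values():
        size = sum(counts.values())
        for group, total in overall.items():
            gap = max(gap, abs(counts.get(group, 0) / size - total / n))
    return gap


def capacity_ratio(labels):
    """Smallest cluster size divided by the largest (1.0 means equal capacity)."""
    sizes = Counter(labels).values()
    return min(sizes) / max(sizes)


# Toy data: 12 points alternating between two protected groups (6 each).
groups = ["a", "b"] * 6

# Two candidate consensus clusterings with two clusters each.
skewed = [0] * 10 + [1] * 2   # cluster sizes 10 and 2
even = [0] * 6 + [1] * 6      # cluster sizes 6 and 6

for name, labels in [("skewed", skewed), ("even", even)]:
    print(name, max_proportion_gap(labels, groups), capacity_ratio(labels))
```

Under this assumed notion both clusterings are perfectly proportional, yet the skewed one has very unequal cluster sizes. A proportion-only criterion cannot tell the two apart, which is one way to read the motivation for a definition that also accounts for cluster capacity.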