Clustering what Matters: Optimal Approximation for Clustering with Outliers

IF 4.5 | Region 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Journal of Artificial Intelligence Research | Pub Date: 2023-09-15 | DOI: 10.1613/jair.1.14883
Akanksha Agrawal, Tanmay Inamdar, Saket Saurabh, Jie Xue
Citations: 0

Abstract

Clustering with outliers is one of the most fundamental problems in computer science. Given a set X of n points and two numbers k and m, the problem asks to exclude m points from X and partition the remaining points into k clusters so as to minimize a certain cost function. In this paper, we give a general approach for solving clustering with outliers, which results in a fixed-parameter tractable (FPT) algorithm in k and m (i.e., an algorithm with running time of the form f(k, m) · n^{O(1)} for some function f) that almost matches the approximation ratio of its outlier-free counterpart. As a corollary, we obtain FPT approximation algorithms with optimal approximation ratios for k-Median and k-Means with outliers in general and Euclidean metrics. We also exhibit further applications of our approach to other variants of the problem that impose additional constraints on the clustering, such as fairness or matroid constraints.
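To make the objective in the abstract concrete, the following is a minimal sketch of the problem definition only, not the paper's algorithm. It assumes points in the Euclidean plane, centers chosen from the input points, and hypothetical helper names (`outlier_cost`, `brute_force`). The exhaustive search tries every set of k centers, so it takes time on the order of n^k and is usable only on toy inputs; the paper's contribution is an approximation algorithm running in f(k, m) · n^{O(1)} time.

```python
from itertools import combinations
from math import dist


def outlier_cost(points, centers, m, power=1):
    """Cost of `centers` after discarding the m most expensive points.

    power=1 gives the k-Median objective, power=2 the k-Means objective.
    """
    # Cost of each point is its distance to the nearest center, raised to `power`.
    costs = sorted(min(dist(p, c) for c in centers) ** power for p in points)
    # For fixed centers, the best outliers are the m points with the largest cost.
    keep = max(len(costs) - m, 0)
    return sum(costs[:keep])


def brute_force(points, k, m, power=1):
    """Try every k-subset of the input points as centers (illustration only)."""
    return min(
        (outlier_cost(points, cs, m, power), cs)
        for cs in combinations(points, k)
    )


# Two tight groups plus one far-away point that should be treated as the outlier.
X = [(0, 0), (0, 1), (10, 0), (10, 1), (100, 100)]
cost, centers = brute_force(X, k=2, m=1)
print(cost, centers)
```

On this toy input, allowing one outlier lets both centers sit in the two tight groups; with m = 0 the far point would have to be served by some center and would dominate the cost.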
Source journal
Journal of Artificial Intelligence Research (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 9.60
Self-citation rate: 4.00%
Articles published: 98
Review time: 4 months
About the journal: JAIR (ISSN 1076-9757) covers all areas of artificial intelligence (AI), publishing refereed research articles, survey articles, and technical notes. Established in 1993 as one of the first electronic scientific journals, JAIR is indexed by INSPEC, Science Citation Index, and MathSciNet. JAIR reviews papers within approximately three months of submission and publishes accepted articles on the internet immediately upon receiving the final versions. JAIR articles are published for free distribution on the internet by the AI Access Foundation, and for purchase in bound volumes by AAAI Press.
Latest articles from this journal
Collective Belief Revision
Competitive Equilibria with a Constant Number of Chores
Improving Resource Allocations by Sharing in Pairs
A General Model for Aggregating Annotations Across Simple, Complex, and Multi-Object Annotation Tasks
Asymptotics of K-Fold Cross Validation