Ismail Naderi, Mohsen Rezapour, Mohammad R. Salavatipour
Title: Approximation schemes for Min-Sum k-Clustering
Journal: Discrete Optimization, volume 54, Article 100860
Publication date: 2024-09-17
DOI: 10.1016/j.disopt.2024.100860
URL: https://www.sciencedirect.com/science/article/pii/S1572528624000392
Citations: 0
Abstract
We consider the Min-Sum $k$-Clustering ($k$-MSC) problem. Given a set of points in a metric represented by an edge-weighted graph $G=(V,E)$ and a parameter $k$, the goal is to partition the points $V$ into $k$ clusters such that the sum of distances between all pairs of points within the same cluster is minimized.
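To make the objective concrete, here is a minimal sketch (not taken from the paper) that evaluates the min-sum cost of a given partition. It assumes the metric is supplied as a symmetric distance matrix and that each unordered pair of points is counted once; the function name `min_sum_cost` and the toy points on a line are illustrative assumptions.

```python
from itertools import combinations


def min_sum_cost(dist, clusters):
    """Sum of distances over unordered pairs of points sharing a cluster."""
    total = 0
    for cluster in clusters:
        for u, v in combinations(cluster, 2):
            total += dist[u][v]
    return total


# Toy metric (assumed for illustration): four points on a line.
positions = [0, 1, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]

good = [[0, 1], [2, 3]]  # groups nearby points together
bad = [[0, 2], [1, 3]]   # pairs up far-apart points

print(min_sum_cost(dist, good))  # 2
print(min_sum_cost(dist, bad))   # 20
```

With k = 2, the partition that keeps nearby points together costs 2, while the one that mixes them costs 20, which is the kind of difference the objective is meant to capture.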
The $k$-MSC problem is known to be APX-hard on general metrics. The best known approximation algorithms for the problem, obtained by Behsaz et al. (2019), achieve an approximation ratio of $O(\log |V|)$ in polynomial time for general metrics and an approximation ratio of $2+\epsilon$ in quasi-polynomial time for metrics with bounded doubling dimension. No approximation scheme for $k$-MSC (when $k$ is part of the input) was known for any non-trivial metric prior to our work. In fact, most previous works rely on the simple fact that there is a 2-approximate reduction from $k$-MSC to the balanced $k$-median problem, and design approximation algorithms for the latter to obtain an approximation for $k$-MSC.
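The factor of 2 in this reduction can be checked on small instances. The sketch below is an illustration, not the authors' algorithm. It assumes the usual definition of the balanced median cost of a cluster C with center c, namely |C| times the sum of d(v, c) over v in C, restricts the center to lie inside the cluster, and counts each unordered pair once in the min-sum cost. Under these assumptions the triangle inequality gives: min-sum cost ≤ balanced cost with the best center ≤ 2 × min-sum cost. The function names are hypothetical.

```python
from itertools import combinations


def min_sum_cost(dist, cluster):
    """Sum of distances over unordered pairs inside one cluster."""
    return sum(dist[u][v] for u, v in combinations(cluster, 2))


def balanced_median_cost(dist, cluster):
    """Smallest |C| * sum_v d(v, c) over centers c chosen inside the cluster."""
    size = len(cluster)
    return min(size * sum(dist[v][c] for v in cluster) for c in cluster)


# Toy metric (assumed for illustration): five points on a line.
positions = [0, 1, 3, 7, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
cluster = list(range(len(positions)))

msc = min_sum_cost(dist, cluster)
bkm = balanced_median_cost(dist, cluster)

# Sandwich implied by the triangle inequality: msc <= bkm <= 2 * msc.
print(msc, bkm, msc <= bkm <= 2 * msc)
```

Because the two costs can differ by a factor of up to 2 on a single cluster, an algorithm that only optimizes the balanced median cost cannot by itself certify a ratio better than 2 for the min-sum objective.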
In this paper, we obtain the first Quasi-Polynomial Time Approximation Schemes (QPTAS) for the problem on metrics induced by graphs of bounded treewidth, graphs of bounded highway dimension, graphs of bounded doubling dimension (including fixed-dimensional Euclidean metrics), and planar and minor-free graphs. We bypass the barrier of 2 for $k$-MSC by introducing a new clustering problem, which we call min-hub clustering. It generalizes balanced $k$-median and is a trade-off between center-based clustering problems (such as balanced $k$-median) and pairwise clustering (such as Min-Sum $k$-Clustering). We then show how one can find approximation schemes for min-hub clustering on certain classes of metrics.
About the journal:
Discrete Optimization publishes research papers on the mathematical, computational and applied aspects of all areas of integer programming and combinatorial optimization. In addition to reports on mathematical results pertinent to discrete optimization, the journal welcomes submissions on algorithmic developments, computational experiments, and novel applications (in particular, large-scale and real-time applications). The journal also publishes clearly labelled surveys, reviews, short notes, and open problems. Manuscripts submitted for possible publication to Discrete Optimization should report on original research, should not have been previously published, and should not be under consideration for publication by any other journal.