Title: A Sublinear-Time Quantum Algorithm for Approximating Partition Functions
Authors: A. Cornelissen, Yassine Hamoudi
Pub Date: 2022-07-18 | DOI: 10.1137/1.9781611977554.ch46
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1245-1264

Abstract: We present a novel quantum algorithm for estimating Gibbs partition functions in sublinear time with respect to the logarithm of the size of the state space. This is the first speed-up of this type to be obtained over the seminal nearly-linear time algorithm of Štefankovič, Vempala and Vigoda [JACM, 2009]. Our result also preserves the quadratic speed-up in precision and spectral gap achieved in previous work by exploiting the properties of quantum Markov chains. As an application, we obtain new polynomial improvements over the best-known algorithms for computing the partition function of the Ising model, counting the number of $k$-colorings, matchings or independent sets of a graph, and estimating the volume of a convex body. Our approach relies on developing new variants of the quantum phase and amplitude estimation algorithms that return nearly unbiased estimates with low variance and without destroying their initial quantum state. We extend these subroutines into a nearly unbiased quantum mean estimator that reduces the variance quadratically faster than the classical empirical mean. No such estimator was known to exist prior to our work. These properties, which are of general interest, lead to better convergence guarantees within the paradigm of simulated annealing for computing partition functions.
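The simulated-annealing paradigm this abstract builds on estimates a partition function $Z(\beta)$ as a telescoping product of ratios $Z(\beta_{i+1})/Z(\beta_i)$, each a mean over Gibbs samples at the current temperature. The sketch below is a classical illustration of that paradigm only (not the paper's quantum algorithm): for a toy state space it samples the Gibbs distribution exactly, where a real implementation would use Markov chain samples. All names (`anneal_estimate_Z`, the energy list, the stage counts) are illustrative choices, not from the paper.

```python
import math
import random

def exact_Z(energies, beta):
    """Exact partition function of a finite state space, for comparison."""
    return sum(math.exp(-beta * E) for E in energies)

def anneal_estimate_Z(energies, beta, stages=20, samples=2000, rng=None):
    """Telescoping-product estimator:
    Z(beta) = Z(0) * prod_i Z(b_{i+1}) / Z(b_i),
    where each ratio is E_{x ~ Gibbs(b_i)}[exp(-(b_{i+1} - b_i) * E(x))]."""
    rng = rng or random.Random(0)
    betas = [beta * i / stages for i in range(stages + 1)]
    est = float(len(energies))  # Z(0) = size of the state space
    for b0, b1 in zip(betas, betas[1:]):
        # Exact Gibbs sampling at inverse temperature b0 (toy stand-in
        # for the Markov chain sampler used in practice).
        w = [math.exp(-b0 * E) for E in energies]
        xs = rng.choices(range(len(energies)), weights=w, k=samples)
        ratio = sum(math.exp(-(b1 - b0) * energies[x]) for x in xs) / samples
        est *= ratio
    return est
```

Each ratio is close to 1 by design, which keeps the variance of the product small; the quantum speed-ups in the paper come from accelerating exactly this mean-estimation step.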
Title: Quantum tomography using state-preparation unitaries
Authors: Joran van Apeldoorn, A. Cornelissen, András Gilyén, G. Nannicini
Pub Date: 2022-07-18 | DOI: 10.48550/arXiv.2207.08800
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1265-1318

Abstract: We describe algorithms to obtain an approximate classical description of a $d$-dimensional quantum state when given access to a unitary (and its inverse) that prepares it. For pure states we characterize the query complexity for $\ell_q$-norm error up to logarithmic factors. As a special case, we show that it takes $\widetilde{\Theta}(d/\varepsilon)$ applications of the unitaries to obtain an $\varepsilon$-$\ell_2$-approximation of the state. For mixed states we consider a similar model, where the unitary prepares a purification of the state. In this model we give an efficient algorithm for obtaining Schatten $q$-norm estimates of a rank-$r$ mixed state, giving query upper bounds that are close to optimal. In particular, we show that a trace-norm ($q=1$) estimate can be obtained with $\widetilde{\mathcal{O}}(dr/\varepsilon)$ queries. This improves (assuming our stronger input model) the $\varepsilon$-dependence over the algorithm of Haah et al. (2017) that uses a joint measurement on $\widetilde{\mathcal{O}}(dr/\varepsilon^2)$ copies of the state. To our knowledge, the most sample-efficient results for pure-state tomography come from setting the rank to $1$ in generic mixed-state tomography algorithms, which can be computationally demanding. We describe sample-optimal algorithms for pure states that are easy and fast to implement. Along the way we show that an $\ell_\infty$-norm estimate of a normalized vector induces a (slightly worse) $\ell_q$-norm estimate for that vector, without losing a dimension-dependent factor in the precision. We also develop an unbiased and symmetric version of phase estimation, where the probability distribution of the estimate is centered around the true value. Finally, we give an efficient method for estimating multiple expectation values, improving over the recent result by Huggins et al. (2021) when the measurement operators do not fully overlap.
Title: Private Convex Optimization in General Norms
Authors: Sivakanth Gopi, Y. Lee, Daogao Liu, Ruoqi Shen, Kevin Tian
Pub Date: 2022-07-18 | DOI: 10.48550/arXiv.2207.08347
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 5068-5089

Abstract: We propose a new framework for differentially private optimization of convex functions which are Lipschitz in an arbitrary norm $\|\cdot\|$. Our algorithms are based on a regularized exponential mechanism which samples from the density $\propto \exp(-k(F+\mu r))$, where $F$ is the empirical loss and $r$ is a regularizer which is strongly convex with respect to $\|\cdot\|$, generalizing a recent work of [Gopi, Lee, Liu '22] to non-Euclidean settings. We show that this mechanism satisfies Gaussian differential privacy and solves both DP-ERM (empirical risk minimization) and DP-SCO (stochastic convex optimization) by using localization tools from convex geometry. Our framework is the first to apply to private convex optimization in general normed spaces and directly recovers non-private SCO rates achieved by mirror descent as the privacy parameter $\epsilon \to \infty$. As applications, for Lipschitz optimization in $\ell_p$ norms for all $p \in (1, 2)$, we obtain the first optimal privacy-utility tradeoffs; for $p = 1$, we improve tradeoffs obtained by the recent works [Asi, Feldman, Koren, Talwar '21; Bassily, Guzman, Nandi '21] by at least a logarithmic factor. Our $\ell_p$ norm and Schatten-$p$ norm optimization frameworks are complemented with polynomial-time samplers whose query complexity we explicitly bound.
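To make the "sample from a density $\propto \exp(-k(F+\mu r))$" step concrete, here is a minimal one-dimensional sketch that discretizes the parameter space onto a grid and samples by weight. This is a toy illustration of the exponential-mechanism idea only; the paper's samplers are far more sophisticated, and the loss (mean absolute error), regularizer ($\theta^2/2$), grid, and function name are all assumptions made for the example.

```python
import math
import random

def exp_mechanism_sample(data, k=50.0, mu=0.1, grid=None, rng=None):
    """Sample theta with density proportional to exp(-k * (F(theta) + mu * r(theta))),
    discretized onto a 1-D grid (illustrative only)."""
    rng = rng or random.Random(0)
    if grid is None:
        grid = [i / 100.0 for i in range(-200, 201)]  # theta in [-2, 2]

    def F(theta):  # empirical loss: mean absolute error (1-Lipschitz)
        return sum(abs(theta - x) for x in data) / len(data)

    def r(theta):  # strongly convex regularizer
        return 0.5 * theta * theta

    scores = [-k * (F(t) + mu * r(t)) for t in grid]
    m = max(scores)  # subtract the max to stabilize the exponentials
    weights = [math.exp(s - m) for s in scores]
    return rng.choices(grid, weights=weights, k=1)[0]
```

Larger $k$ concentrates the samples near the minimizer of $F + \mu r$, which is the tension the privacy analysis has to control: more concentration means better utility but weaker privacy.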
Title: Almost Tight Bounds for Online Facility Location in the Random-Order Model
Authors: Haim Kaplan, David Naori, D. Raz
Pub Date: 2022-07-18 | DOI: 10.48550/arXiv.2207.08783
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1523-1544

Abstract: We study the online facility location problem with uniform facility costs in the random-order model. Meyerson's algorithm [FOCS'01] is arguably the most natural and simple online algorithm for the problem, with several advantages and appealing properties. Its analysis in the random-order model is one of the cornerstones of random-order analysis beyond the secretary problem. Meyerson's algorithm was shown to be (asymptotically) optimal in the standard worst-case adversarial-order model and $8$-competitive in the random-order model. While this bound in the random-order model is the long-standing state-of-the-art, it is not known to be tight, and the true competitive ratio of Meyerson's algorithm remained an open question for more than two decades. We resolve this question and prove tight bounds on the competitive ratio of Meyerson's algorithm in the random-order model, showing that it is exactly $4$-competitive. Following our tight analysis, we introduce a generic parameterized version of Meyerson's algorithm that retains all the advantages of the original version. We show that the best algorithm in this family is exactly $3$-competitive. On the other hand, we show that no online algorithm for this problem can achieve a competitive ratio better than $2$. Finally, we prove that the algorithms in this family are robust to partial adversarial arrival orders.
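Meyerson's algorithm is indeed simple: when a point arrives at distance $d$ from the nearest open facility, open a new facility at that point with probability $\min(d/f, 1)$ (where $f$ is the uniform facility cost), and otherwise connect it, paying $d$. A minimal 1-D sketch, with metric and function names chosen for illustration:

```python
import random

def meyerson_ofl(points, facility_cost, rng=None):
    """Meyerson's online facility location on the line: open a facility at
    an arriving point with probability min(dist / facility_cost, 1),
    otherwise connect it to the nearest open facility."""
    rng = rng or random.Random(0)
    facilities, total = [], 0.0
    for p in points:
        if not facilities:
            facilities.append(p)       # first point always opens
            total += facility_cost
            continue
        d = min(abs(p - q) for q in facilities)  # 1-D metric for simplicity
        if rng.random() < min(d / facility_cost, 1.0):
            facilities.append(p)       # open here
            total += facility_cost
        else:
            total += d                 # connect to nearest facility
    return facilities, total
```

The coin flip balances the two costs in expectation: a far-away point likely justifies a new facility, a nearby one likely does not. The paper's parameterized family tunes exactly this opening probability.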
Title: Online Lewis Weight Sampling
Authors: David P. Woodruff, T. Yasuda
Pub Date: 2022-07-17 | DOI: 10.48550/arXiv.2207.08268
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 4622-4666

Abstract: The seminal work of Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community, yielding fast row sampling algorithms for approximating $d$-dimensional subspaces of $\ell_p$ up to $(1+\epsilon)$ error. Several works have extended this important primitive to other settings, including the online coreset and sliding window models. However, these results are only for $p \in \{1,2\}$, and results for $p=1$ require suboptimal $\tilde{O}(d^2/\epsilon^2)$ samples. In this work, we design the first nearly optimal $\ell_p$ subspace embeddings for all $p \in (0,\infty)$ in the online coreset and sliding window models. In both models, our algorithms store $\tilde{O}(d^{1 \lor (p/2)}/\epsilon^2)$ rows. This answers a substantial generalization of the main open question of [BDMMUWZ2020], and gives the first results for all $p \notin \{1,2\}$. Towards our result, we give the first analysis of "one-shot" Lewis weight sampling, in which rows are sampled proportionally to their Lewis weights, with sample complexity $\tilde{O}(d^{p/2}/\epsilon^2)$ for $p>2$. Previously, this scheme was only known to have sample complexity $\tilde{O}(d^{p/2}/\epsilon^5)$, whereas $\tilde{O}(d^{p/2}/\epsilon^2)$ is known if a more sophisticated recursive sampling is used. The recursive sampling cannot be implemented online, thus necessitating an analysis of one-shot Lewis weight sampling. Our analysis uses a novel connection to online numerical linear algebra. As an application, we obtain the first one-pass streaming coreset algorithms for $(1+\epsilon)$ approximation of important generalized linear models, such as logistic regression and $p$-probit regression. Our upper bounds are parameterized by a complexity parameter $\mu$ introduced by [MSSW2018], and we show the first lower bounds showing that a linear dependence on $\mu$ is necessary.
Title: Shrunk subspaces via operator Sinkhorn iteration
Authors: Cole Franks, Tasuku Soma, M. Goemans
Pub Date: 2022-07-17 | DOI: 10.48550/arXiv.2207.08311
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1655-1668

Abstract: A recent breakthrough in Edmonds' problem showed that the noncommutative rank can be computed in deterministic polynomial time, and various algorithms for it were devised. However, only quite complicated algorithms are known for finding a so-called shrunk subspace, which acts as a dual certificate for the value of the noncommutative rank. In particular, the operator Sinkhorn algorithm, perhaps the simplest algorithm to compute the noncommutative rank with operator scaling, does not find a shrunk subspace. Finding a shrunk subspace plays a key role in applications, such as separation in the Brascamp-Lieb polytope, one-parameter subgroups in the null-cone membership problem, and primal-dual algorithms for matroid intersection and fractional matroid matching. In this paper, we provide a simple Sinkhorn-style algorithm to find the smallest shrunk subspace over the complex field in deterministic polynomial time. To this end, we introduce a generalization of the operator scaling problem, where the spectra of the marginals must be majorized by specified vectors. Then we design an efficient Sinkhorn-style algorithm for the generalized operator scaling problem. Applying this to the shrunk subspace problem, we show that a sufficiently long run of the algorithm also finds an approximate shrunk subspace close to the minimum exact shrunk subspace. Finally, we show that the approximate shrunk subspace can be rounded if it is sufficiently close. Along the way, we also provide a simple randomized algorithm to find the smallest shrunk subspace. As applications, we design a faster algorithm for fractional linear matroid matching and efficient weak membership and optimization algorithms for the rank-2 Brascamp-Lieb polytope.
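For readers unfamiliar with the Sinkhorn template the paper generalizes: the classical (matrix) Sinkhorn iteration alternately rescales the rows and columns of a positive matrix, converging to a doubly stochastic matrix. The operator version replaces these two normalizations with conjugations of a completely positive map. A minimal sketch of the classical iteration (illustrative only, not the paper's operator algorithm):

```python
def sinkhorn(matrix, iters=500):
    """Alternately normalize rows and columns of a positive square matrix;
    converges to a doubly stochastic matrix (classical Sinkhorn iteration)."""
    A = [row[:] for row in matrix]
    n = len(A)
    for _ in range(iters):
        for i in range(n):                       # row normalization
            s = sum(A[i])
            A[i] = [x / s for x in A[i]]
        for j in range(n):                       # column normalization
            s = sum(A[i][j] for i in range(n))
            for i in range(n):
                A[i][j] /= s
    return A
```

When scaling fails to converge, the obstruction certifies structure (for matrices, a zero pattern blocking doubly stochastic scalings; for operators, a shrunk subspace), which is why extracting such a certificate from a Sinkhorn-style run is the natural question the paper answers.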
Title: Approximation Algorithms for Steiner Tree Augmentation Problems
Authors: R. Ravi, Weizhong Zhang, Michael Zlatin
Pub Date: 2022-07-16 | DOI: 10.1137/1.9781611977554.ch94
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 2429-2448

Abstract: In the Steiner Tree Augmentation Problem (STAP), we are given a graph $G = (V,E)$, a set of terminals $R \subseteq V$, and a Steiner tree $T$ spanning $R$. The edges $L := E \setminus E(T)$ are called links and have non-negative costs. The goal is to augment $T$ by adding a minimum cost set of links, so that there are 2 edge-disjoint paths between each pair of vertices in $R$. This problem is a special case of the Survivable Network Design Problem, which can be approximated to within a factor of 2 using iterative rounding [J2001]. We give the first polynomial time algorithm for STAP with approximation ratio better than 2. In particular, we achieve an approximation ratio of $(1.5 + \varepsilon)$. To do this, we employ the Local Search approach of [TZ2022] for the Tree Augmentation Problem and generalize their main decomposition theorem from links (of size two) to hyper-links. We also consider the Node-Weighted Steiner Tree Augmentation Problem (NW-STAP), in which the non-terminal nodes have non-negative costs. We seek a cheapest subset $S \subseteq V \setminus R$ so that $G[R \cup S]$ is 2-edge-connected. Using a result of Nutov [N2010], there exists an $O(\log |R|)$-approximation for this problem. We provide an $O(\log^2 |R|)$-approximation algorithm for NW-STAP using a greedy algorithm leveraging the spider decomposition of optimal solutions.
Title: Online Prediction in Sub-linear Space
Authors: Binghui Peng, Fred Zhang
Pub Date: 2022-07-16 | DOI: 10.48550/arXiv.2207.07974
Journal: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1611-1634

Abstract: We provide the first sub-linear space and sub-linear regret algorithm for online learning with expert advice (against an oblivious adversary), addressing an open question raised recently by Srinivas, Woodruff, Xu and Zhou (STOC 2022). We also demonstrate a separation between oblivious and (strong) adaptive adversaries by proving a linear memory lower bound for any sub-linear regret algorithm against an adaptive adversary. Our algorithm is based on a novel pool selection procedure that bypasses the traditional wisdom of leader selection for online learning, and a generic reduction that transforms any weakly sub-linear regret $o(T)$ algorithm to a $T^{1-\alpha}$-regret algorithm, which may be of independent interest. Our lower bound utilizes the connection of no-regret learning and equilibrium computation in zero-sum games, leading to a proof of a strong lower bound against an adaptive adversary.
Pub Date: 2022-07-16 · DOI: 10.48550/arXiv.2207.07949
C. Grunau, Ahmet Alper Özüdoğru, Václav Rozhoň, Jakub Tětek
The famous $k$-means++ algorithm of Arthur and Vassilvitskii [SODA 2007] is the most popular way of solving the $k$-means problem in practice. The algorithm is very simple: it samples the first center uniformly at random, and each of the following $k-1$ centers is then sampled with probability proportional to its squared distance to the closest center chosen so far. Afterward, Lloyd's iterative algorithm is run. The $k$-means++ algorithm is known to return a $\Theta(\log k)$-approximate solution in expectation. In their seminal work, Arthur and Vassilvitskii [SODA 2007] asked about the guarantees of the following \emph{greedy} variant: in every step, we sample $\ell$ candidate centers instead of one and then pick the one that minimizes the new cost. This is also how $k$-means++ is implemented in, e.g., the popular Scikit-learn library [Pedregosa et al.; JMLR 2011]. We present nearly matching lower and upper bounds for greedy $k$-means++: we prove that it is an $O(\ell^3 \log^3 k)$-approximation algorithm. On the other hand, we prove a lower bound of $\Omega(\ell^3 \log^3 k / \log^2(\ell\log k))$. Previously, only an $\Omega(\ell \log k)$ lower bound was known [Bhattacharya, Eube, Röglin, Schmidt; ESA 2020], and there was no known upper bound.
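The seeding rule described above can be made concrete as follows. This is a minimal sketch of the greedy $k$-means++ seeding step only (Lloyd's subsequent iterations are omitted); the function name and the `ell` parameter are illustrative.

```python
import random

def greedy_kmeanspp(points, k, ell, rng=random):
    """Greedy k-means++ seeding: at each step, draw `ell` candidates with
    probability proportional to squared distance to the nearest chosen
    center, then keep the candidate minimizing the resulting k-means cost.
    """
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [rng.choice(points)]  # first center: uniform at random
    # d2[i] = squared distance of points[i] to its closest center so far
    d2 = [sqdist(p, centers[0]) for p in points]
    for _ in range(k - 1):
        candidates = rng.choices(points, weights=d2, k=ell)
        best, best_cost = None, float("inf")
        for c in candidates:
            # cost of the clustering if c were added as the next center
            cost = sum(min(d, sqdist(p, c)) for p, d in zip(points, d2))
            if cost < best_cost:
                best, best_cost = c, cost
        centers.append(best)
        d2 = [min(d, sqdist(p, best)) for p, d in zip(points, d2)]
    return centers
```

Setting `ell = 1` recovers plain $k$-means++ seeding; the paper's bounds quantify how much the greedy inner loop can help or hurt as `ell` grows.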
{"title":"A Nearly Tight Analysis of Greedy k-means++","authors":"C. Grunau, Ahmet Alper Özüdoğru, Václav Rozhoň, Jakub Tětek","doi":"10.48550/arXiv.2207.07949","DOIUrl":"https://doi.org/10.48550/arXiv.2207.07949","journal":"Proceedings of the ... Annual ACM-SIAM Symposium on Discrete Algorithms","pages":"1012-1070","publicationDate":"2022-07-16"}
Pub Date: 2022-07-16 · DOI: 10.48550/arXiv.2207.07809
Siu-Wing Cheng, Haoqiang Huang
We present new approximation results on curve simplification and clustering under the Fréchet distance. Let $T = \{\tau_i : i \in [n]\}$ be a set of polygonal curves in $\mathbb{R}^d$ with $m$ vertices each, and let $l$ be any integer in $[m]$. We study a generalized curve simplification problem: given error bounds $\delta_i > 0$ for $i \in [n]$, find a curve $\sigma$ of at most $l$ vertices such that $d_F(\sigma,\tau_i) \le \delta_i$ for all $i \in [n]$. We present an algorithm that returns either a null output or a curve $\sigma$ of at most $l$ vertices such that $d_F(\sigma,\tau_i) \le \delta_i + \epsilon\delta_{\max}$ for all $i \in [n]$, where $\delta_{\max} = \max_{i \in [n]} \delta_i$. If the output is null, there is no curve of at most $l$ vertices within Fréchet distance $\delta_i$ of $\tau_i$ for all $i \in [n]$. The running time is $\tilde{O}\bigl(n^{O(l)}\, m^{O(l^2)}\, (dl/\epsilon)^{O(dl)}\bigr)$. This algorithm yields the first polynomial-time bicriteria approximation scheme to simplify a curve $\tau$ to another curve $\sigma$, where the vertices of $\sigma$ can be anywhere in $\mathbb{R}^d$, so that $d_F(\sigma,\tau) \le (1+\epsilon)\delta$ and $|\sigma| \le (1+\alpha)\min\{|c| : d_F(c,\tau) \le \delta\}$ for any given $\delta > 0$ and any fixed $\alpha, \epsilon \in (0,1)$. The running time is $\tilde{O}\bigl(m^{O(1/\alpha)}\, (d/(\alpha\epsilon))^{O(d/\alpha)}\bigr)$. By combining our technique with previous results in the literature, we obtain an approximation algorithm for $(k,l)$-median clustering: given $T$, it computes a set $\Sigma$ of $k$ curves, each of $l$ vertices, such that $\sum_{i \in [n]} \min_{\sigma \in \Sigma} d_F(\sigma,\tau_i)$ is within a factor $1+\epsilon$ of the optimum with probability at least $1-\mu$, for any given $\mu, \epsilon \in (0,1)$. The running time is $\tilde{O}\bigl(n\, m^{O(kl^2)}\, \mu^{-O(kl)}\, (dkl/\epsilon)^{O((dkl/\epsilon)\log(1/\mu))}\bigr)$.
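For intuition about the distance $d_F$ appearing throughout, its discrete counterpart (which considers only the curves' vertices) can be computed by the standard Eiter-Mannila dynamic program in $O(mn)$ time. The sketch below computes this discrete variant, which is an approximation of the continuous Fréchet distance used in the results above.

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between polygonal curves P and Q,
    given as lists of vertices (tuples of coordinates).

    c(i, j) is the cheapest "leash length" needed for two walkers to
    traverse P[:i+1] and Q[:j+1], advancing one vertex at a time.
    """
    def d(i, j):
        return math.dist(P[i], Q[j])

    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 and j == 0:
            return d(0, 0)
        if i == 0:                       # only Q's walker can advance
            return max(c(0, j - 1), d(0, j))
        if j == 0:                       # only P's walker can advance
            return max(c(i - 1, 0), d(i, 0))
        # either walker (or both) advances; keep the best predecessor
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d(i, j))

    return c(len(P) - 1, len(Q) - 1)
```

For two parallel segments sampled at matching vertices, the value is simply their separation, e.g. `discrete_frechet([(0,0),(1,0)], [(0,1),(1,1)])` equals `1.0`.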
{"title":"Curve Simplification and Clustering under Fréchet Distance","authors":"Siu-Wing Cheng, Haoqiang Huang","doi":"10.48550/arXiv.2207.07809","DOIUrl":"https://doi.org/10.48550/arXiv.2207.07809","journal":"Proceedings of the ... Annual ACM-SIAM Symposium on Discrete Algorithms","pages":"1414-1432","publicationDate":"2022-07-16"}