Faster and Deterministic Subtrajectory Clustering

arXiv - CS - Computational Geometry, Pub Date: 2024-02-20, DOI: arxiv-2402.13117

Ivor van der Hoog, Thijs van der Horst, Tim Ophelders

Abstract

Given a trajectory $T$ and a distance $\Delta$, we wish to find a set $C$ of curves of complexity at most $\ell$, such that we can cover $T$ with subcurves that are each within Fréchet distance $\Delta$ of at least one curve in $C$. We call $C$ an $(\ell,\Delta)$-clustering and aim to find an $(\ell,\Delta)$-clustering of minimum cardinality. This problem was introduced by Akitaya et al. (2021) and shown to be NP-complete. The main focus has therefore been on bicriterial approximation algorithms, which allow the clustering to be an $(\ell, \Theta(\Delta))$-clustering of roughly optimal size. We present algorithms that construct $(\ell,4\Delta)$-clusterings of size $\mathcal{O}(k \log n)$, where $k$ is the size of the optimal $(\ell, \Delta)$-clustering. For the discrete Fréchet distance, we use $\mathcal{O}(n \ell \log n)$ space and $\mathcal{O}(k n^2 \log^3 n)$ deterministic worst-case time. For the continuous Fréchet distance, we use $\mathcal{O}(n^2 \log n)$ space and $\mathcal{O}(k n^3 \log^3 n)$ time. Our algorithms significantly improve upon the clustering quality (improving the approximation factor in $\Delta$) and size (whenever $\ell \in \Omega(\log n)$). We offer deterministic running times comparable to known expected bounds. Additionally, in the continuous setting, we give a near-linear improvement in the space usage. When compared only to deterministic results, we offer a near-linear speedup and a near-quadratic improvement in the space usage. When we may restrict ourselves to clusters in which all subtrajectories are vertex-to-vertex subcurves, we obtain even better results under the continuous Fréchet distance: our algorithm then runs in near-quadratic time and uses space that is near-linear in $n \ell$.
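To make the problem definition concrete, the following is a minimal sketch of a checker for the discrete setting. It is not the algorithm of the paper: it uses the textbook quadratic dynamic program for the discrete Fréchet distance and then verifies that a proposed set of center curves and subcurves forms an $(\ell,\Delta)$-clustering. The function names `discrete_frechet` and `is_clustering`, the representation of subcurves as inclusive vertex index ranges, and the choice to check coverage at the vertex level are all assumptions made for illustration.

```python
from math import dist, inf


def discrete_frechet(P, Q):
    """Discrete Fréchet distance between point sequences P and Q.

    Textbook O(|P| * |Q|) dynamic program; illustration only.
    """
    n, m = len(P), len(Q)
    # D[i][j] = discrete Fréchet distance between P[:i+1] and Q[:j+1]
    D = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                D[i][j] = d
                continue
            # Cheapest way to reach (i, j): advance on P, on Q, or on both.
            reach = min(
                D[i - 1][j] if i > 0 else inf,
                D[i][j - 1] if j > 0 else inf,
                D[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            D[i][j] = max(d, reach)
    return D[n - 1][m - 1]


def is_clustering(T, centers, subcurves, ell, delta):
    """Check a candidate (ell, delta)-clustering of trajectory T.

    centers:   list of curves (the set C), each a list of points
    subcurves: list of (a, b) inclusive vertex index ranges of T
               (hypothetical representation, vertex-to-vertex only)
    Assumption: T counts as covered when every vertex lies in some range.
    """
    # Every center curve must have complexity at most ell.
    if any(len(c) > ell for c in centers):
        return False
    covered = [False] * len(T)
    for a, b in subcurves:
        piece = T[a:b + 1]
        # Each subcurve must be within delta of at least one center.
        if not any(discrete_frechet(piece, c) <= delta for c in centers):
            return False
        for i in range(a, b + 1):
            covered[i] = True
    return all(covered)


T = [(0, 0), (1, 0), (2, 0), (3, 0)]
centers = [[(0, 0), (1, 0)]]
print(is_clustering(T, centers, [(0, 1), (2, 3)], ell=2, delta=2.0))
```

The checker only validates a given solution. Finding a clustering of minimum cardinality is the NP-complete problem the abstract describes, and the paper's approximation algorithms are not reproduced here.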
