Pub Date: 2024-09-16 DOI: 10.1007/s00224-024-10194-8
Giulia Bernardini, Esteban Gabory, Solon P. Pissis, Leen Stougie, Michelle Sweering, Wiktor Zuba
An elastic-degenerate (ED) string is a sequence of n finite sets of strings of total length N, introduced to represent a set of related DNA sequences, also known as a pangenome. The ED string matching (EDSM) problem consists in reporting all occurrences of a pattern of length m in an ED text. The EDSM problem has recently received some attention from the combinatorial pattern matching community, culminating in an \(\tilde{\mathcal{O}}(nm^{\omega-1})+\mathcal{O}(N)\)-time algorithm [Bernardini et al., SIAM J. Comput. 2022], where \(\omega\) denotes the matrix multiplication exponent and the \(\tilde{\mathcal{O}}(\cdot)\) notation suppresses polylog factors. In the k-EDSM problem, the approximate version of EDSM, we are asked to report all pattern occurrences with at most k errors. k-EDSM can be solved in \(\mathcal{O}(k^2mG+kN)\) time, under edit distance, or \(\mathcal{O}(kmG+kN)\) time, under Hamming distance, where G denotes the total number of strings in the ED text [Bernardini et al., Theor. Comput. Sci. 2020]. Unfortunately, G is only bounded by N, and so even for \(k=1\), the existing algorithms run in \(\Omega(mN)\) time in the worst case. In this paper we make progress in this direction. We show that 1-EDSM can be solved in \(\mathcal{O}((nm^2 + N)\log m)\) or \(\mathcal{O}(nm^3 + N)\) time under edit distance. For the decision version of the problem, we present a faster \(\mathcal{O}(nm^2\sqrt{\log m} + N\log\log m)\)-time algorithm. We also show that 1-EDSM can be solved in \(\mathcal{O}(nm^2 + N\log m)\) time under Hamming distance. Our algorithms for edit distance rely on non-trivial reductions from 1-EDSM to special instances of classic computational geometry problems (2d rectangle stabbing or 2d range emptiness), which we show how to solve efficiently. In order to obtain an even faster algorithm for Hamming distance, we rely on employing and adapting the k-errata trees for indexing with errors [Cole et al., STOC 2004].
This is an extended version of a paper presented at LATIN 2022.
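To make the objects in this abstract concrete, here is a minimal brute-force sketch in Python. It is not the algorithm of the paper: it expands every string the ED text can spell (exponential time) and tests each window against the pattern under Hamming distance. The function names, the list-of-lists representation, and the choice to count the empty string as length 0 in N are assumptions made for illustration.

```python
from itertools import product

def ed_sizes(ed_text):
    """Return (n, N, G): number of sets, total string length, number of strings."""
    n = len(ed_text)
    G = sum(len(segment) for segment in ed_text)
    N = sum(len(s) for segment in ed_text for s in segment)
    return n, N, G

def within_hamming(a, b, k):
    """True if a and b have equal length and differ in at most k positions."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= k

def occurs_with_mismatches(ed_text, pattern, k=1):
    """Decision version, by brute force: does some expansion of the ED text
    contain a window within Hamming distance k of the pattern?"""
    m = len(pattern)
    for choice in product(*ed_text):          # one string picked from every set
        text = "".join(choice)
        for i in range(len(text) - m + 1):
            if within_hamming(text[i:i + m], pattern, k):
                return True
    return False

# Hypothetical ED text spelling ACGCA, ACTCA and ACCA.
ed_text = [["AC"], ["G", "T", ""], ["CA"]]
print(ed_sizes(ed_text))
print(occurs_with_mismatches(ed_text, "CCC", k=1))
```

The sketch only fixes the problem statement; the point of the paper is to avoid exactly this enumeration.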
Title: Elastic-Degenerate String Matching with 1 Error or Mismatch. Journal: Theory of Computing Systems.
Pub Date: 2024-09-10 DOI: 10.1007/s00224-024-10195-7
France Gheeraert, Giuseppe Romana, Manon Stipulanti
First studied by Kempa and Prezza in 2018 as the unifying idea behind text compression algorithms, string attractors have become a compelling object of theoretical research within the community of combinatorics on words. In this context, they have been studied for several families of finite and infinite words. In this paper, we focus on string attractors of prefixes of particular automatic infinite words (including the famous period-doubling and k-bonacci words) related to simple-Parry numbers. For a subfamily of these words, we describe string attractors of optimal size, while for the rest of them, we provide nearly optimal-size ones. Such a contribution is of particular interest, since in general finding smallest string attractors is NP-hard. This extends our previous work published in the international conference WORDS 2023.
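As a reminder of the definition, a set of positions is a string attractor of a word if every non-empty factor has at least one occurrence covering a position of the set. The following Python sketch checks this by brute force; the function name and the 0-based indexing are assumptions made here, and the check is only meant for short words.

```python
def is_string_attractor(word, positions):
    """Brute-force check that every non-empty factor of `word` has an
    occurrence covering at least one (0-based) position in `positions`."""
    marked = set(positions)
    n = len(word)
    for length in range(1, n + 1):
        covered = {}  # factor -> True once some occurrence crosses a marked position
        for i in range(n - length + 1):
            factor = word[i:i + length]
            hit = any(i <= p < i + length for p in marked)
            covered[factor] = covered.get(factor, False) or hit
        if not all(covered.values()):
            return False
    return True

# "0100" is the length-4 prefix of the period-doubling word (0 -> 01, 1 -> 00).
print(is_string_attractor("0100", {1, 2}))
print(is_string_attractor("0100", {1}))
```

Any attractor needs at least one position per distinct letter, so two positions are optimal for this prefix.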
Title: String Attractors of Some Simple-Parry Automatic Sequences. Journal: Theory of Computing Systems.
Pub Date: 2024-09-10 DOI: 10.1007/s00224-024-10192-w
Shaull Almagor, Omer Yizhaq
Jumping automata are finite automata that read their input in a non-consecutive manner, disregarding the order of the letters in the word. We introduce and study jumping automata over infinite words. Unlike the setting of finite words, which has been well studied, for infinite words it is not clear how words can be reordered. To this end, we consider three semantics: automata that read the infinite word in some order so that no letter is overlooked, automata that can permute the word in windows of a given size k, and automata that can permute the word in windows of an existentially-quantified bound. We study expressiveness, closure properties and algorithmic properties of these models.
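For the finite-word setting that the abstract takes as its starting point, jumping acceptance can be sketched as follows: a word is accepted if some reordering of its letters is accepted by the underlying automaton in the usual way. The Python code below is a brute-force illustration over a deterministic automaton; the transition-table encoding and all names are assumptions, and the infinite-word semantics studied in the paper are not modelled.

```python
from itertools import permutations

def dfa_accepts(delta, start, accepting, word):
    """Ordinary left-to-right run of a partial DFA given as {(state, letter): state}."""
    state = start
    for letter in word:
        state = delta.get((state, letter))
        if state is None:        # missing transition: reject
            return False
    return state in accepting

def jumping_accepts(delta, start, accepting, word):
    """Jumping semantics on a finite word: accept if SOME permutation is accepted."""
    return any(dfa_accepts(delta, start, accepting, p)
               for p in set(permutations(word)))

# Hypothetical automaton for (ab)*; read as a jumping automaton it accepts
# exactly the words with equally many a's and b's.
delta = {(0, "a"): 1, (1, "b"): 0}
print(jumping_accepts(delta, 0, {0}, "bbaa"))
print(jumping_accepts(delta, 0, {0}, "aab"))
```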
Title: Jumping Automata over Infinite Words. Journal: Theory of Computing Systems.
Pub Date: 2024-08-20 DOI: 10.1007/s00224-024-10193-9
Aleksi Saarela
It is known that the set of solutions of any constant-free three-variable word equation can be represented using parametric words, and the number of numerical parameters and the level of nesting in these parametric words is at most logarithmic with respect to the length of the equation. We show that this result can be significantly improved in the case of unbalanced equations, that is, equations where at least one variable has a different number of occurrences on the left-hand side and on the right-hand side. More specifically, it is sufficient to have two numerical parameters and one level of nesting in this case. We also discuss the possibility of proving a similar result for balanced equations in the future.
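The notions of solution and of balanced equation used above are easy to state in code. In this small Python sketch, an equation side is a string of variable names and a solution maps each variable to a word; the function names and the example equations are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def substitute(side, assignment):
    """Replace every variable on one side of the equation by its word."""
    return "".join(assignment[var] for var in side)

def is_solution(lhs, rhs, assignment):
    """A solution makes both sides spell the same word."""
    return substitute(lhs, assignment) == substitute(rhs, assignment)

def is_balanced(lhs, rhs):
    """Balanced: every variable occurs equally often on both sides."""
    return Counter(lhs) == Counter(rhs)

print(is_balanced("xyz", "zyx"))   # each variable occurs once per side
print(is_balanced("xyx", "zz"))    # x occurs twice on the left, never on the right
print(is_solution("xyz", "zyx", {"x": "a", "y": "b", "z": "aba"}))
```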
Title: On the Solution Sets of Three-Variable Word Equations. Journal: Theory of Computing Systems.
Pub Date: 2024-08-17 DOI: 10.1007/s00224-024-10189-5
Sabrina C. L. Ammann, Sebastian Stiller
A classical result by Myerson (Math. Oper. Res. 6(1), 58-73, 1981) gives a characterization of an optimal auction for any given distribution of valuations of the bidders. We consider the situation where the distribution is not explicitly given but can be observed in a sample of auction results from the same distribution. A seminal paper by Morgenstern and Roughgarden (Adv. Neural Inf. Process. Syst. 28, 2015) proposes to learn a near-optimal auction from the hypothesis class of t-level auctions. They prove a bound on the sample complexity, i.e., the function \(f(\varepsilon, \delta)\) of required samples to guarantee a certain level of precision \((1-\varepsilon)\) with a probability of at least \((1-\delta)\), for the general single-parameter case and a tighter bound for the very restricted matroid case. We show a new bound for the case of independence systems, which widely generalize matroids and contain several important combinatorial optimization problems. This bound of \(\tilde{O}\left(\nicefrac{H^2n^4}{\varepsilon^3}\right)\) falls neatly between those known for the general and the matroid case. The class of independence systems contains several well-known NP-hard problems such as knapsack. Therefore, the allocation itself might in practice be limited to \(\alpha\)-approximate solutions. In a second result we show that an approximation algorithm can be used without compromising the sample complexity. Also, the precision is affected only mildly, resulting in a factor of \(\alpha \cdot (1-\varepsilon)\).
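The remark that independence systems generalize matroids and include knapsack can be checked on a toy instance. The Python sketch below enumerates the feasible sets of a small knapsack, confirms that they are downward closed (an independence system), and shows that the matroid exchange property fails. The instance and function names are assumptions for illustration; nothing here touches the auction-learning part of the paper.

```python
from itertools import combinations

def feasible_sets(weights, capacity):
    """All item subsets whose total weight fits into the knapsack."""
    items = list(weights)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)
            if sum(weights[i] for i in c) <= capacity}

def is_independence_system(family):
    """Contains the empty set and is closed under removing single elements
    (which implies closure under taking arbitrary subsets)."""
    return frozenset() in family and all(s - {x} in family
                                         for s in family for x in s)

def has_exchange_property(family):
    """Matroid exchange: a smaller set can always be extended from a larger one."""
    return all(any(a | {x} in family for x in b - a)
               for a in family for b in family if len(b) > len(a))

# Hypothetical instance: {c} cannot be extended by a or b although {a, b} is feasible.
family = feasible_sets({"a": 2, "b": 2, "c": 3}, capacity=4)
print(is_independence_system(family), has_exchange_property(family))
```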
Title: Near-Optimal Auctions on Independence Systems. Journal: Theory of Computing Systems.
Pub Date: 2024-08-10 DOI: 10.1007/s00224-024-10188-6
Lubomíra Dvořáková, Zuzana Masáková, Edita Pelantová
We define a new class of ternary sequences that are 2-balanced. These sequences are obtained by colouring of Sturmian sequences. We show that the class contains sequences of any given letter frequencies. We provide an upper bound on factor and abelian complexity of these sequences. Using the interpretation by rectangle exchange transformation, we prove that for almost all triples of letter frequencies, the upper bound on factor and abelian complexity is reached. The bound on factor complexity is given using a number-theoretical function which we compute explicitly for a class of parameters.
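The balance property can be illustrated on finite words: a word is C-balanced if, for every letter, the number of occurrences in any two factors of the same length differs by at most C. The Python sketch below computes the smallest such C by brute force and applies it to a prefix of the Fibonacci word, a standard Sturmian example. The function names are assumptions, and the colouring construction of the paper is not reproduced.

```python
def balance_constant(word):
    """Smallest C such that the finite word is C-balanced (brute force)."""
    worst = 0
    n = len(word)
    for length in range(1, n + 1):
        for letter in set(word):
            counts = [word[i:i + length].count(letter)
                      for i in range(n - length + 1)]
            worst = max(worst, max(counts) - min(counts))
    return worst

def fibonacci_prefix(steps):
    """Iterate the Fibonacci morphism 0 -> 01, 1 -> 0 starting from "0"."""
    word = "0"
    for _ in range(steps):
        word = "".join("01" if c == "0" else "0" for c in word)
    return word

print(balance_constant(fibonacci_prefix(6)))  # Sturmian prefix
print(balance_constant("0011"))               # factors "00" and "11" differ by 2
```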
Title: 2-Balanced Sequences Coding Rectangle Exchange Transformation. Journal: Theory of Computing Systems.
Pub Date: 2024-08-01 DOI: 10.1007/s00224-024-10190-y
Valérie Berthé, Herman Goulet-Ouellet
This paper studies obstructions to preservation of return sets by episturmian morphisms. We show, by way of an explicit construction, that infinitely many obstructions exist. This generalizes and improves an earlier result about Sturmian morphisms.
Title: Obstructions to Return Preservation for Episturmian Morphisms. Journal: Theory of Computing Systems.
Pub Date: 2024-08-01 DOI: 10.1007/s00224-024-10187-7
Arseny M. Shur
We study the power of entropy compression in proving avoidance results in combinatorics on words. Namely, we analyze variants of a simple algorithm that transforms an input word into a word avoiding repetitions of prescribed type. This transformation can be made reversible by adding the log of the run of the algorithm to the output. Counting distinct logs, it is possible to conclude that a given repetition is avoidable over all sufficiently large alphabets. We introduce two methods of counting logs. Applying them to ordinary, undirected, and conjugate repetitions, we prove, in all cases, the results of type “\((1+\frac{1}{d})\)-powers are avoidable over \(d+O(1)\) letters”. These results are closer to the optimum than is usually expected from purely information-theoretic considerations. In the final part, we present experimental results obtained by the mentioned transformation algorithm in the extreme case of \((d+1)\)-ary words avoiding \((1+\frac{1}{d})^{+}\)-powers.
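The transformation and its log can be sketched for the simplest repetition type. The Python code below is a simplified stand-in that avoids squares rather than the fractional powers treated in the paper: it appends input letters one at a time and, whenever the current word ends in a square xx, erases the second copy of x and records its length. The log format and all names are assumptions; `recover_input` shows why the output together with the log determines the input.

```python
def square_suffix_half(word):
    """Length of x if `word` ends with a square xx, else 0."""
    for half in range(1, len(word) // 2 + 1):
        if word[-half:] == word[-2 * half:-half]:
            return half
    return 0

def avoid_squares(letters):
    """Append letters; erase the second half of any square suffix and log it."""
    word, log = [], []
    for letter in letters:
        word.append(letter)
        half = square_suffix_half(word)
        if half:
            del word[-half:]
        log.append(half)          # 0 means nothing was erased at this step
    return "".join(word), log

def recover_input(word, log):
    """Invert avoid_squares from its output word and log."""
    word, letters = list(word), []
    for half in reversed(log):
        if half:
            word.extend(word[-half:])   # restore the erased copy of x
        letters.append(word.pop())      # the last letter was the input letter
    return "".join(reversed(letters))

def is_square_free(word):
    return all(square_suffix_half(word[:end]) == 0
               for end in range(1, len(word) + 1))

print(avoid_squares("abcabc"))
```

Since distinct inputs yield distinct (output, log) pairs, counting the possible logs bounds the number of inputs, which is the counting step the abstract refers to.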
Title: Non-Constructive Upper Bounds for Repetition Thresholds. Journal: Theory of Computing Systems.
Pub Date: 2024-07-12 DOI: 10.1007/s00224-024-10184-w
Hagit Attiya, Arie Fouren, Jeremy Ko
The amortized step complexity of an implementation measures its performance as a whole, rather than the performance of individual operations. Specifically, the amortized step complexity of an implementation is the average number of steps performed by invoked operations, in the worst case, taken over all possible executions. The point contention of an execution, denoted by \(\dot{c}\), measures the maximal number of processes simultaneously active in the execution. Ruppert (2016) showed that the amortized step complexity of known lock-free implementations for many shared data structures includes an additive factor linear in the point contention \(\dot{c}\). This paper shows that there is no lock-free implementation with \(o(\min\{\dot{c}, \sqrt{\log\log n}\})\) amortized RMR complexity of queues, stacks or heaps from reads, writes, comparison primitives (such as compare&swap) and LL/SC, where n is the total number of the processes in the system. In addition, the paper shows an \(\Omega(\min\{\dot{c}, \log\log n\})\) lower bound on the amortized step complexity for shared linked lists, skip lists, search trees and other pointer-based data structures. These lower bounds mean that the additive factor linear in \(\dot{c}\) is inherent for these implementations, provided that the point contention is small compared to the number of processes in the system (i.e. \(\dot{c}\in O(\sqrt{\log\log n})\) or \(\dot{c}\in O(\log\log n)\)).
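The two quantities defined above can be computed for a single recorded execution. In the Python sketch below, each process is assumed to be active during one closed time interval, and the step counts per operation are given as a list; both representations and the function names are assumptions. Note that the amortized complexity in the abstract is the worst case of this average over all executions, which the sketch does not attempt.

```python
def point_contention(intervals):
    """Maximum number of simultaneously active processes, where each process
    is active during one closed interval (start, end)."""
    events = []
    for start, end in intervals:
        events.append((start, 0))   # 0 sorts before 1: at ties, starts come first
        events.append((end, 1))
    events.sort()
    active = worst = 0
    for _, kind in events:
        if kind == 0:
            active += 1
            worst = max(worst, active)
        else:
            active -= 1
    return worst

def average_steps(steps_per_operation):
    """Average number of steps per invoked operation in one execution."""
    return sum(steps_per_operation) / len(steps_per_operation)

print(point_contention([(0, 3), (1, 4), (5, 6)]))
print(average_steps([1, 3, 2]))
```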
Title: Lower Bounds on the Amortized Time Complexity of Shared Objects. Journal: Theory of Computing Systems.
Pub Date: 2024-07-03 DOI: 10.1007/s00224-024-10174-y
Jiehua Chen, Hendrik Molter, Manuel Sorge, Ondřej Suchý
Motivated by the recent rapid growth of research for algorithms to cluster multi-layer and temporal graphs, we study extensions of the classical Cluster Editing problem. In Multi-Layer Cluster Editing we receive a set of graphs on the same vertex set, called layers, and aim to transform all layers into cluster graphs (disjoint unions of cliques) that differ only slightly. More specifically, we want to mark at most d vertices and to transform each layer into a cluster graph using at most k edge additions or deletions per layer so that, if we remove the marked vertices, we obtain the same cluster graph in all layers. In Temporal Cluster Editing we receive a sequence of layers and we want to transform each layer into a cluster graph so that consecutive layers differ only slightly. That is, we want to transform each layer into a cluster graph with at most k edge additions or deletions and to mark a distinct set of d vertices in each layer so that each two consecutive layers are the same after removing the vertices marked in the first of the two layers. We study the combinatorial structure of the two problems via their parameterized complexity with respect to the parameters d and k, among others. Despite the similar definition, the two problems behave quite differently: In particular, Multi-Layer Cluster Editing is fixed-parameter tractable with running time \(k^{O(k + d)} s^{O(1)}\) for inputs of size s, whereas Temporal Cluster Editing is \(\textsf{W[1]}\)-hard with respect to k even if \(d = 3\).
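The definition of Multi-Layer Cluster Editing can be turned into a small verifier. The Python sketch below checks a proposed solution rather than finding one: every edited layer must be a cluster graph within k edge changes of its original, at most d vertices may be marked, and all edited layers must coincide once the marked vertices are removed. The data layout (edge lists of vertex pairs, no self-loops) and the function names are assumptions for illustration.

```python
from itertools import combinations

def edge_set(edges):
    """Normalise an iterable of vertex pairs to a set of frozensets."""
    return {frozenset(e) for e in edges}

def is_cluster_graph(vertices, edges):
    """Disjoint union of cliques: any two neighbours of a vertex are adjacent."""
    adj = {v: set() for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)
    return all(w in adj[u]
               for v in vertices
               for u, w in combinations(adj[v], 2))

def check_multilayer_solution(vertices, layers, edited, marked, k, d):
    """Verify a candidate solution (edited layers plus marked vertices)."""
    if len(marked) > d or len(layers) != len(edited):
        return False
    kept = set(vertices) - set(marked)
    restricted = []
    for original, new in zip(layers, edited):
        original, new = edge_set(original), edge_set(new)
        if len(original ^ new) > k or not is_cluster_graph(vertices, new):
            return False
        restricted.append({e for e in new if e <= kept})
    return all(r == restricted[0] for r in restricted)

# Hypothetical instance: one edge addition in layer 1, one deletion in layer 2.
vertices = [1, 2, 3, 4]
layers = [[(1, 2), (2, 3)], [(1, 2), (1, 3), (2, 3), (3, 4)]]
edited = [[(1, 2), (2, 3), (1, 3)], [(1, 2), (1, 3), (2, 3)]]
print(check_multilayer_solution(vertices, layers, edited, set(), k=1, d=0))
```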
Title: Cluster Editing for Multi-Layer and Temporal Graphs. Journal: Theory of Computing Systems.