Pub Date: 2025-03-25, DOI: 10.1007/s00236-025-00486-y
Andrew Bloch-Hansen, Roberto Solis-Oba
In the thief orienteering problem, an agent called a thief carries a knapsack of capacity W and has a time limit T to collect a set of items of total weight at most W and maximum profit along a simple path in a weighted graph \(G = (V, E)\) from a start vertex s to an end vertex t. There is a set I of items, each with weight \(w_{i}\) and profit \(p_{i}\), distributed among the vertices of \(V \setminus \{s,t\}\). The time needed by the thief to travel an edge depends on the length of the edge and the weight of the items in the knapsack at the moment when the edge is traversed. There is a polynomial-time approximation scheme for a relaxed version of the thief orienteering problem on directed acyclic graphs that produces solutions that use time at most \(T(1 + \epsilon)\) for any constant \(\epsilon > 0\). We give a polynomial-time algorithm for transforming instances of the problem on 2-terminal series–parallel graphs into equivalent instances of the thief orienteering problem on directed acyclic graphs, thereby yielding a polynomial-time approximation scheme for the relaxed version of the thief orienteering problem on this graph class.
Title: The thief orienteering problem on 2-terminal series–parallel graphs. Acta Informatica 62(2).
Pub Date: 2025-03-22, DOI: 10.1007/s00236-025-00485-z
Angelo Monti, Blerina Sinaimeri
A graph \(G=(V,E)\) is a star-k-pairwise compatibility graph (star-k-PCG) if there exist a weight function \(w: V \rightarrow \mathbb{R}^+\) and k mutually exclusive intervals \(I_1, I_2, \ldots, I_k\) such that there is an edge \(uv \in E\) if and only if \(w(u)+w(v) \in \bigcup_i I_i\). These graphs are related to two important classes of graphs: pairwise compatibility graphs (PCGs) and multithreshold graphs. It is known that for any graph G there exists a k such that G is a star-k-PCG. Thus, for a given graph G, it is interesting to know the minimum k such that G is a star-k-PCG. We define this minimum k as the star number of the graph, denoted by \(\gamma(G)\). Here we investigate the star number of simple graph classes, such as graphs of small size, caterpillars, cycles and grids. Specifically, we determine the exact value of \(\gamma(G)\) for all graphs with at most 7 vertices. By doing so we show that there are only 4 smallest graphs with star number 2, each with exactly 5 vertices, and only 3 smallest graphs with star number 3, each with exactly 7 vertices. Next, we provide a construction showing that the star number of caterpillars is one. Moreover, we show that the star number of cycles and two-dimensional grid graphs is 2 and that the star number of 4-dimensional grids is at least 3. Finally, we conclude with numerous open problems.
Title: On star-k-PCGs: exploring class boundaries for small k values. Acta Informatica 62(2). Open access PDF: https://link.springer.com/content/pdf/10.1007/s00236-025-00485-z.pdf
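The star-k-PCG definition can be read as a recipe for generating a graph from a witness. The following minimal sketch builds the edge set from a weight function and a list of intervals. The function name `star_k_pcg_edges`, the choice of closed intervals, and the example witness are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations

def star_k_pcg_edges(weights, intervals):
    """Return the edges uv with w(u) + w(v) inside some interval.

    weights:   dict mapping each vertex to a positive real weight
    intervals: list of (low, high) pairs, treated here as closed intervals
               (an assumption of this sketch)
    """
    edges = set()
    for u, v in combinations(sorted(weights), 2):
        total = weights[u] + weights[v]
        if any(low <= total <= high for low, high in intervals):
            edges.add((u, v))
    return edges

# Hypothetical witness for the star K_{1,3}: a heavy centre "c" and light leaves.
weights = {"c": 10.0, "x": 1.0, "y": 1.0, "z": 1.0}
intervals = [(11.0, 11.0)]  # a single interval, so k = 1
edges = star_k_pcg_edges(weights, intervals)
```

With this witness only the centre-leaf sums fall inside the interval, so the generated graph is the star itself, which agrees with the statement that caterpillars have star number one.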
Pub Date: 2025-02-26, DOI: 10.1007/s00236-025-00482-2
Sounaka Mishra
For \(t \ge 3\), \(K_{1, t}\) is called a t-claw. A graph \(G=(V, E)\) is t-claw free if it does not contain a t-claw as a vertex-induced subgraph. In the minimum t-claw deletion problem (Min-t-Claw-Del), given a graph \(G=(V, E)\), it is required to find a vertex set S of minimum size such that \(G[V \setminus S]\) is t-claw free. In a split graph, the vertex set is partitioned into two sets such that one forms a clique and the other forms an independent set. Every t-claw in a split graph has a center vertex in the clique partition. This observation motivates us to consider the minimum one-sided bipartite t-claw deletion problem (Min-t-OSBCD). Given a bipartite graph \(G=(A \cup B, E)\), Min-t-OSBCD asks for a vertex set S of minimum size such that \(G[(A \cup B) \setminus S]\) has no t-claw with the center vertex in A. A primal-dual algorithm approximates Min-t-OSBCD within a factor of t. We prove that it is \(\textsf{UGC}\)-hard to approximate within a factor better than t. We also prove it is approximable within a factor of 2 for dense bipartite graphs. By using these results on Min-t-OSBCD, we prove that Min-t-Claw-Del is \(\textsf{UGC}\)-hard to approximate within a factor better than t for split graphs. We also consider their complementary maximization problems and prove that they are \(\textsf{APX}\)-complete.
Title: On minimum t-claw deletion in split graphs. Acta Informatica 62(1).
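To make the definitions concrete, here is a small brute-force sketch that tests whether a graph contains an induced t-claw and whether a candidate set S is a feasible solution of Min-t-Claw-Del. It only illustrates the definitions: it runs in exponential time, it is not the primal-dual algorithm mentioned in the abstract, and the names `has_t_claw` and `is_t_claw_deletion_set` are hypothetical.

```python
from itertools import combinations

def has_t_claw(adj, t):
    """True if some vertex has t pairwise non-adjacent neighbours,
    i.e. the graph contains K_{1,t} as a vertex-induced subgraph.

    adj: dict mapping each vertex to the set of its neighbours.
    """
    for centre, neighbours in adj.items():
        for leaves in combinations(sorted(neighbours), t):
            if all(b not in adj[a] for a, b in combinations(leaves, 2)):
                return True
    return False

def is_t_claw_deletion_set(adj, S, t):
    """True if deleting the vertices in S leaves a t-claw free graph."""
    S = set(S)
    rest = {v: nbrs - S for v, nbrs in adj.items() if v not in S}
    return not has_t_claw(rest, t)

star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}      # K_{1,3} itself
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}       # 3-claw free
```

In the star, deleting any single vertex destroys the only 3-claw, so every singleton is a feasible (and optimal) deletion set.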
Pub Date: 2025-02-18, DOI: 10.1007/s00236-025-00480-4
Philippe Schnoebelen, Isa Vialard
The piecewise complexity h(u) of a word is the minimal length of subwords needed to exactly characterise u. Its piecewise minimality index \(\rho(u)\) is the smallest length k such that u is minimal among its order-k class \([u]_k\) in Simon’s congruence. We initiate a study of these two descriptive complexity measures. Among other results, we provide efficient algorithms for computing h(u) and \(\rho(u)\) for a given word u.
Title: On the piecewise complexity of words. Acta Informatica 62(1).
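Simon’s congruence of order k relates two words when they have the same (scattered) subwords of length at most k. The sketch below checks that relation by enumerating subwords directly. It is a naive illustration of the congruence only, not the efficient algorithms for h(u) and \(\rho(u)\) announced in the abstract, and the function names are hypothetical.

```python
def subwords_up_to(u, k):
    """All scattered subwords (subsequences) of u of length at most k."""
    found = {""}
    for ch in u:
        # Extend every subword of the prefix read so far by the new letter.
        found |= {w + ch for w in found if len(w) < k}
    return found

def simon_equivalent(u, v, k):
    """True if u and v have the same subwords of length at most k."""
    return subwords_up_to(u, k) == subwords_up_to(v, k)
```

For example, "ab" and "ba" use the same letters, so they are equivalent at order 1, but the subword "ab" separates them at order 2.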
Pub Date: 2025-02-13, DOI: 10.1007/s00236-025-00481-3
Gonzalo Navarro, Francisco Olivares, Cristian Urbina
It was recently proved that any straight-line program (SLP) generating a given string can be transformed in linear time into an equivalent balanced SLP of the same asymptotic size. We generalize this proof to a general class of grammars we call generalized SLPs (GSLPs), which allow rules of the form \(A \rightarrow x\) where x is any Turing-complete representation (of size |x|) of a sequence of symbols (potentially much longer than |x|). We then specialize GSLPs to so-called Iterated SLPs (ISLPs), which allow rules of the form \(A \rightarrow \Pi_{i=k_1}^{k_2} B_1^{i^{c_1}} \cdots B_t^{i^{c_t}}\) of size \(\mathcal{O}(t)\). We prove that ISLPs break, for some text families, the measure \(\delta\) based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness. Further, ISLPs can extract any substring of length \(\lambda\) from the represented text \(T[1 \mathinner{.\,.} n]\) in time \(\mathcal{O}(\lambda + \log^2 n \log\log n)\). This is the first compressed representation for repetitive texts breaking \(\delta\) while, at the same time, supporting direct access to arbitrary text symbols in polylogarithmic time. We also show how to compute some substring queries, like range minima and next/previous smaller value, in time \(\mathcal{O}(\log^2 n \log\log n)\). Finally, we further specialize the grammars to run-length SLPs (RLSLPs), which restrict the rules allowed by ISLPs to the form \(A \rightarrow B^t\). Apart from inheriting all the previous results with the term \(\log^2 n \log\log n\) reduced to the near-optimal \(\log n\), we show that RLSLPs can exploit balancedness to efficiently compute a wide class of substring queries we call "composable", i.e., \(f(X \cdot Y)\) can be obtained from f(X) and f(Y). As an example, we show how to compute Karp-Rabin fingerprints of text substrings in \(\mathcal{O}(\log n)\) time. While the results on RLSLPs were already known, ours are much simpler and require little precomputation time and extra data associated with the grammar.
Title: Generalized straight-line programs. Acta Informatica 62(1).
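Karp-Rabin fingerprints are a standard example of a composable function: the fingerprint of a concatenation follows from the fingerprints and lengths of the two parts. The sketch below shows that composition rule and how a run-length rule \(A \rightarrow B^t\) can be handled with repeated squaring. The modulus `P`, the base `BASE`, and all function names are assumptions of this sketch; it does not reproduce the paper's data structure or its \(\mathcal{O}(\log n)\) query algorithm.

```python
P = (1 << 61) - 1   # a Mersenne prime modulus (illustrative choice)
BASE = 256          # must exceed every byte value

def fingerprint(text):
    """Fingerprint of a string, returned together with its length in bytes."""
    h = 0
    data = text.encode()
    for byte in data:
        h = (h * BASE + byte) % P
    return (h, len(data))

def concat(fx, fy):
    """Fingerprint of X . Y from the fingerprints of X and Y (composability)."""
    hx, lx = fx
    hy, ly = fy
    return ((hx * pow(BASE, ly, P) + hy) % P, lx + ly)

def power(fx, t):
    """Fingerprint of X^t using O(log t) concatenations (rule A -> B^t)."""
    result = (0, 0)          # fingerprint of the empty string
    while t > 0:
        if t & 1:
            result = concat(result, fx)
        fx = concat(fx, fx)  # X^(2^i) -> X^(2^(i+1))
        t >>= 1
    return result
```

Because every intermediate value is a power of the same word X, the order in which `power` concatenates them does not matter, which is what makes the repeated-squaring shortcut valid here.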
Pub Date: 2025-02-04, DOI: 10.1007/s00236-025-00478-y
Hiroto Fujimaru, Yuto Nakashima, Shunsuke Inenaga
Compact directed acyclic word graphs (CDAWGs) (Blumer et al. in J ACM 34(3):578–595, 1987) are a fundamental data structure on strings with applications in text pattern searching, data compression, and pattern discovery. Intuitively, the CDAWG of a string T is obtained by merging isomorphic subtrees of the suffix tree (Weiner, in: Proceedings of the 14th annual symposium on switching and automata theory, pp 1–11, 1973) of the same string T; thus, CDAWGs are a compact indexing structure. In this paper, we investigate the sensitivity of CDAWGs when a single character edit operation (insertion, deletion, or substitution) is performed at the left-end of the input string T; namely, we are interested in the worst-case increase in the size of the CDAWG after a left-end edit operation. We prove that if \(\textsf{e}\) is the number of edges of the CDAWG for string T, then the number of new edges added to the CDAWG after a left-end edit operation on T does not exceed \(\textsf{e}\). Further, we present a matching lower bound on the sensitivity of CDAWGs for left-end insertions, and almost matching lower bounds for left-end deletions and substitutions. We then generalize our lower-bound instance for left-end insertions to leftward online construction of the CDAWG, and show that it requires \(\Omega(n^2)\) time for some string of length n.
Title: Tight bounds for the sensitivity of CDAWGs with left-end edits. Acta Informatica 62(1).
Pub Date: 2025-02-04, DOI: 10.1007/s00236-025-00479-x
Bhisham Dev Verma, Rameshwar Pratap
Locality-sensitive hashing (LSH) is a fundamental algorithmic toolkit used by data scientists for approximate nearest neighbour search, and it has been used extensively in many large-scale data processing applications such as near-duplicate detection, nearest-neighbour search, clustering, etc. In this work, we propose faster and space-efficient locality-sensitive hash functions for Euclidean distance and cosine similarity for tensor data. Typically, the naive approach for obtaining LSH for tensor data involves first reshaping the tensor into a vector, followed by applying existing LSH methods for vector data. However, this approach becomes impractical for higher-order tensors because the size of the reshaped vector becomes exponential in the order of the tensor. Consequently, the size of LSH’s parameters increases exponentially. To address this problem, we suggest two methods each for Euclidean distance and cosine similarity, namely CP-E2LSH and TT-E2LSH, and CP-SRP and TT-SRP, respectively, building on CP and tensor train (TT) decomposition techniques. Our approaches are space-efficient and can be efficiently applied to low-rank CP or TT tensors. We provide a rigorous theoretical analysis of the correctness and efficacy of our proposals.
Title: Improving LSH via tensorized random projection. Acta Informatica 62(1).
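The space saving rests on a standard identity: when both the data tensor and the projection tensor are kept in CP form, their inner product is a sum of products of small per-mode dot products, so neither tensor has to be reshaped into an exponentially long vector. The sketch below checks that identity against a dense reference computation and derives one sign-random-projection bit from it. The Gaussian factors, the absence of any scaling, and all names are assumptions of this sketch; the abstract does not specify the actual CP-SRP construction.

```python
import itertools
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cp_inner(X, G):
    """Inner product of two CP-format tensors.

    Each tensor is a list of rank-1 terms; a term is a list holding one
    factor vector per mode. No dense tensor is ever materialised.
    """
    return sum(
        math.prod(dot(xf, gf) for xf, gf in zip(x_term, g_term))
        for x_term in X
        for g_term in G
    )

def cp_entry(T, index):
    """Single entry of the dense tensor that T represents."""
    return sum(math.prod(f[i] for f, i in zip(term, index)) for term in T)

def dense_inner(X, G, dims):
    """Reference computation that visits every entry (exponential in the order)."""
    return sum(cp_entry(X, idx) * cp_entry(G, idx)
               for idx in itertools.product(*(range(d) for d in dims)))

def random_cp(dims, rank, rng):
    """Hypothetical CP tensor with independent standard Gaussian factors."""
    return [[[rng.gauss(0.0, 1.0) for _ in range(d)] for d in dims]
            for _ in range(rank)]

def srp_bit(X, G):
    """One sign-random-projection bit: the sign of the inner product <X, G>."""
    return 1 if cp_inner(X, G) >= 0 else 0

rng = random.Random(0)
dims = (3, 4, 2)
X = random_cp(dims, rank=2, rng=rng)   # data tensor in CP form
G = random_cp(dims, rank=3, rng=rng)   # projection tensor in CP form
```

Here `cp_inner` touches only the factor vectors, whose total size grows linearly with the tensor order, whereas `dense_inner` enumerates every entry.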
Pub Date: 2025-01-28, DOI: 10.1007/s00236-024-00475-7
Markus Chimani, Max Ilsen
We introduce and discuss the Minimum Capacity-Preserving Subgraph (MCPS) problem: given a directed graph with edge capacities \(\textit{cap}\) and a retention ratio \(\alpha \in (0,1)\), find the smallest subgraph that, for each pair of vertices (u, v), preserves at least a fraction \(\alpha\) of a maximum u-v-flow’s value. This problem originates from the practical setting of reducing the power consumption in a computer network: it models turning off as many links as possible, while retaining the ability to transmit at least \(\alpha\) times the traffic compared to the original network. First we prove that MCPS is NP-hard already on a restricted set of directed acyclic graphs (DAGs) with unit edge capacities. Our reduction also shows that a closely related problem (which only considers the arguably most complicated core of the problem in the objective function) is NP-hard to approximate within a sublogarithmic factor already on DAGs. In terms of positive results, we present two algorithms that solve MCPS optimally on directed series-parallel graphs (DSPs): a simple linear-time algorithm for the special case of unit edge capacities and a cubic-time dynamic programming algorithm for the general case of non-uniform edge capacities. Further, we introduce the family of laminar series-parallel graphs (LSPs), a generalization of DSPs that also includes cyclic and very dense graphs. Their properties allow us to solve MCPS on LSPs by employing our DSP-algorithms as subroutines. In addition, we give a separate quadratic-time algorithm for MCPS on LSPs with unit edge capacities that also yields straightforward quadratic-time algorithms for several related problems such as Minimum Equivalent Digraph and Directed Hamiltonian Cycle on LSPs.
Title: Directed capacity-preserving subgraphs: hardness and exact polynomial algorithms. Acta Informatica 62(1). Open access PDF: https://link.springer.com/content/pdf/10.1007/s00236-024-00475-7.pdf
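The feasibility condition of MCPS can be checked directly: for every ordered pair of vertices, the maximum flow in the kept subgraph must be at least \(\alpha\) times the maximum flow in the original graph. The sketch below does this with a plain Edmonds-Karp maximum-flow routine on an adjacency matrix. It only verifies a candidate subgraph; it is not one of the paper's optimisation algorithms, and the names `max_flow` and `preserves_capacity` are hypothetical.

```python
from collections import deque

def max_flow(n, edges, s, t):
    """Edmonds-Karp maximum s-t flow; edges are (u, v, capacity) triples."""
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow
        # Find the bottleneck capacity along the path, then augment.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

def preserves_capacity(n, edges, kept, alpha):
    """True if the subgraph `kept` retains an alpha fraction of every max flow."""
    for s in range(n):
        for t in range(n):
            if s != t and max_flow(n, kept, s, t) < alpha * max_flow(n, edges, s, t):
                return False
    return True

# Small unit-capacity DAG: two routes from vertex 0 to vertex 2.
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1)]
kept = [(0, 1, 1), (1, 2, 1)]   # the direct link 0 -> 2 is switched off
```

In this instance the kept subgraph halves the 0-to-2 flow from 2 to 1, so it is feasible for a retention ratio of 0.5 but not for 0.75.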
Let G be a connected graph. An edge \(e=xy \in E(G)\) is monitored by a vertex v if \(d_G(v, y) \ne d_{G-e}(v, y)\) or \(d_G(v, x) \ne d_{G-e}(v, x)\). A set M of vertices of a graph G is a distance-edge-monitoring (DEM for short) set if every edge e of G is monitored by some vertex of M. A DEM set X for a graph G is called a fault-tolerant DEM set if \(X \setminus \{v\}\) is also a DEM set for each v in X. Denote by \(\operatorname{dem}(G)\) and \(\operatorname{Fdem}(G)\) the smallest sizes of a DEM set and a fault-tolerant DEM set, respectively. In this paper, we first study the relation between \(\operatorname{Fdem}(G)\) and \(\operatorname{dem}(G)\) for a graph G. Next, we show that \(2 \le \operatorname{Fdem}(G) \le n\) for any graph G of order n. Furthermore, the extremal graphs attaining the lower and upper bounds are characterized. In the end, the exact values for some networks are given. Furthermore, it is shown that for \(2 \le s < t \le n\), there exists a graph G of order n such that \(\operatorname{dem}(G)=s\) and \(\operatorname{Fdem}(G)=t\).
Title: Fault-tolerance in distance-edge-monitoring sets. Authors: Chenxu Yang, Yaping Mao, Ralf Klasing, Gang Yang, Yuzhi Xiao, Xiaoyan Zhang. Pub Date: 2024-12-29, DOI: 10.1007/s00236-024-00476-6. Acta Informatica 62(1).
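The monitoring condition can be tested with two breadth-first searches per vertex and edge: one in G and one in G with the edge removed. The sketch below implements the definitions of a DEM set and a fault-tolerant DEM set literally. It is a brute-force illustration with hypothetical function names, and an unreachable vertex is treated as being at infinite distance (represented by a missing dictionary entry).

```python
from collections import deque

def distances(adj, source, removed=None):
    """BFS distances from source, optionally ignoring one undirected edge."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if removed is not None and {u, v} == set(removed):
                continue
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def monitors(adj, v, edge):
    """True if removing edge = (x, y) changes the distance from v to x or to y."""
    before = distances(adj, v)
    after = distances(adj, v, removed=edge)
    return any(before.get(z) != after.get(z) for z in edge)

def edges_of(adj):
    return {tuple(sorted((u, v))) for u in adj for v in adj[u]}

def is_dem_set(adj, M):
    return all(any(monitors(adj, v, e) for v in M) for e in edges_of(adj))

def is_fault_tolerant_dem_set(adj, X):
    X = set(X)
    return is_dem_set(adj, X) and all(is_dem_set(adj, X - {v}) for v in X)

path = {0: {1}, 1: {0, 2}, 2: {1}}          # path on three vertices
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

On the path, the single vertex 0 monitors both edges, but a one-vertex set can never be fault-tolerant because removing its only vertex leaves nothing to monitor the edges, which matches the lower bound \(2 \le \operatorname{Fdem}(G)\).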