Pub Date: 2024-11-21 DOI: 10.1007/s00236-024-00467-7
Lucas P. Ramos, Felipe A. Louza, Guilherme P. Telles
DNA technologies have evolved significantly in recent years, enabling the sequencing of a large number of genomes in a short time. Nevertheless, the underlying problem of assembling sequence fragments is computationally hard, and many technical factors and limitations complicate obtaining the complete sequence of a genome. Many genomes are left in a draft state, in which each chromosome is represented by a set of sequences with partial information on their relative order. Recently, some approaches have been proposed to compare draft genomes by comparing paths in de Bruijn graphs, which are constructed by many practical genome assemblers. In this article we describe in more detail a method, called \(\texttt{gcBB}\), for comparing genomes represented as succinct colored de Bruijn graphs directly and without resorting to sequence alignments; it evaluates the entropy and expectation measures based on the Burrows-Wheeler Similarity Distribution. We also introduce an improved version of \(\texttt{gcBB}\), called \(\texttt{multi-gcBB}\), that improves the time and space performance considerably through the selection of different data structures. We have compared phylogenies of 12 Drosophila species obtained by other methods to those obtained with \(\texttt{gcBB}\), achieving promising results.
Title: Comparative genomics with succinct colored de Bruijn graphs. Acta Informatica 62(1).
Given a text and a pattern over an alphabet, the classic exact matching problem searches for all occurrences of the pattern in the text. Unlike exact matching, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their exact values. In this paper, we propose efficient algorithms for the OPPM problem using the “duel-and-sweep” paradigm. For a pattern of length m and a text of length n, our serial algorithm runs in \(O(n + m\log m)\) time, and our parallel algorithm runs in \(O(\log^2 m)\) time and \(O(n \log^2 m)\) work, with \(O(\log m)\) time and \(O(m \log m)\) work for pattern preprocessing, on the Priority Concurrent Read Concurrent Write Parallel Random-Access Machine (P-CRCW PRAM).
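To make the matching criterion concrete, the following is a minimal sketch of a naive OPPM check in Python. It is not the duel-and-sweep algorithm of the paper: it simply compares the rank pattern of every text window with that of the pattern, which costs roughly O(n · m log m) time. The function names `rank_signature` and `oppm_naive` are hypothetical and chosen only for illustration.

```python
def rank_signature(seq):
    """Replace each element by its rank among the distinct values of seq.

    Two sequences of equal length are order-isomorphic (same relative
    order, ties included) exactly when their signatures are equal.
    """
    ranks = {value: rank for rank, value in enumerate(sorted(set(seq)))}
    return tuple(ranks[value] for value in seq)


def oppm_naive(text, pattern):
    """Return all start positions where a text window is order-isomorphic
    to the pattern (naive baseline, assumed here for illustration only)."""
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    # The pattern "low, high, middle" occurs at positions 0 and 2.
    print(oppm_naive([10, 20, 15, 30, 25, 40], [1, 3, 2]))
```

The values in the pattern never have to appear in the text; only the shape "lowest, highest, middle" is compared.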
Title: Serial and parallel algorithms for order-preserving pattern matching based on the duel-and-sweep paradigm. Authors: Davaajav Jargalsaikhan, Diptarama Hendrian, Yohei Ueki, Ryo Yoshinaka, Ayumi Shinohara. Acta Informatica 61(4):415–444. Pub Date: 2024-08-31 DOI: 10.1007/s00236-024-00464-w
Pub Date: 2024-08-23 DOI: 10.1007/s00236-024-00465-9
Shunsuke Inenaga
The linear-size suffix tries (LSTries) (Crochemore et al. in Theor Comput Sci 638:171–178, 2016) are a version of suffix trees in which the edge labels are single characters, yet are able to perform pattern matching queries in optimal time. Instead of explicitly storing the input text, LSTries have some extra non-branching internal nodes called type-2 nodes. The extended techniques are then used in the linear-size compact directed acyclic word graphs (LCDAWGs) (Takagi et al., in: SPIRE 2017, pp. 304–316, 2017), which can be stored with \(O(\textsf{el}(T)+\textsf{er}(T))\) space (i.e. without the text), where \(\textsf{el}(T)\) and \(\textsf{er}(T)\) are the numbers of left- and right-extensions of the maximal repeats in the input text string T, respectively. In this paper, we present simpler alternatives to the aforementioned indexing structures, called the simplified LSTries (simLSTries) and the simplified LCDAWGs (simLCDAWGs), in which most of the type-2 nodes are removed. In particular, our simLCDAWGs require only \(O(\textsf{er}(T))\) space and work in a weaker model of computation (i.e. the pointer machine model). This contrasts with the \(O(\textsf{er}(T))\)-space CDAWG representation of Belazzougui and Cunial (in: Proceedings of the 24th international symposium on string processing and information retrieval, pp. 161–175, 2017), which works on the word RAM model.
Title: Linear-size suffix tries and linear-size CDAWGs simplified and improved. Acta Informatica 61(4):445–468.
Pub Date: 2024-08-15 DOI: 10.1007/s00236-024-00463-x
Koustav De, Harshil Mittal, Palash Dey, Neeldhara Misra
The Kemeny method is one of the popular tools for rank aggregation. However, computing an optimal Kemeny ranking is \(\textsf{NP}\)-hard. Consequently, the computational task of finding a Kemeny ranking has been studied under the lens of parameterized complexity with respect to many parameters. We study the parameterized complexity of the problem of computing all distinct Kemeny rankings. We consider the target Kemeny score, number of candidates, average distance of input rankings, maximum range of any candidate, and unanimity width as our parameters. For all these parameters, we already have \(\textsf{FPT}\) algorithms. We find that any desired number of Kemeny rankings can also be found without a substantial increase in running time. We also present \(\textsf{FPT}\) approximation algorithms for Kemeny rank aggregation with respect to these parameters.
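As an illustration of what "all distinct Kemeny rankings" means, here is a brute-force sketch in Python. It is not one of the FPT algorithms from the paper: it enumerates every permutation of the candidates, so it is only usable for a handful of candidates. It assumes every vote is a complete ranking of the same candidate set; the names `kendall_tau` and `all_kemeny_rankings` are hypothetical.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Number of candidate pairs that the two rankings order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def all_kemeny_rankings(votes):
    """Return (optimal Kemeny score, list of every ranking attaining it).

    Exhaustive search over all permutations; exponential in the number
    of candidates and meant only to illustrate the definition.
    """
    candidates = sorted(votes[0])
    best, winners = None, []
    for perm in permutations(candidates):
        score = sum(kendall_tau(perm, vote) for vote in votes)
        if best is None or score < best:
            best, winners = score, [perm]
        elif score == best:
            winners.append(perm)
    return best, winners


if __name__ == "__main__":
    # A Condorcet cycle: three rankings tie for the optimal score.
    votes = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
    print(all_kemeny_rankings(votes))
```

The cyclic profile in the usage example shows why the optimum need not be unique: each of the three input rankings is itself a Kemeny ranking.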
Title: Parameterized aspects of distinct Kemeny rank aggregation. Acta Informatica 61(4):401–414.
Pub Date: 2024-08-10 DOI: 10.1007/s00236-024-00462-y
Pamela Fleischmann, Lukas Haschke, Tim Löck, Dirk Nowotka
Word-representable graphs were introduced in 2008 by Kitaev and Pyatkin in the context of semigroup theory. A graph is called word-representable if there exists a word with the graph’s nodes as letters such that two letters alternate in the word iff there is an edge between them in the graph. To date, numerous works have investigated the word-representability of graphs, but mostly from the graph perspective. In this work, we change the perspective to the words, i.e., we take classes of words and investigate the represented graphs. Our first subject of interest is the conjugates of words: we determine exactly which graphs are represented if we rotate the word. Afterwards, we look at k-local words introduced by Day et al. (FSTTCS LIPIcs, 2017) in order to gain more insights into this class of words. Here, we investigate especially which graphs are represented by 1-local words. Lastly, we prove that the language of all words representing a graph is regular. We were also able to characterise k-representable graphs, solving an open problem.
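The definition can be read directly as a procedure that maps a word to the graph it represents. The Python sketch below does this, under the usual convention that two letters alternate when deleting all other letters leaves a word with no two equal adjacent letters. The helper names `alternate`, `represented_graph`, and `conjugates` are hypothetical, and the code illustrates the definition only, not the characterisations proved in the paper.

```python
from itertools import combinations


def alternate(word, x, y):
    """True if x and y alternate in word, i.e. the word restricted to
    {x, y} never contains the same letter twice in a row."""
    restricted = [c for c in word if c in (x, y)]
    return all(a != b for a, b in zip(restricted, restricted[1:]))


def represented_graph(word):
    """Edge set of the graph represented by word: one edge per pair of
    distinct letters that alternate."""
    letters = sorted(set(word))
    return {(x, y) for x, y in combinations(letters, 2) if alternate(word, x, y)}


def conjugates(word):
    """All rotations of word, starting with word itself."""
    return [word[i:] + word[:i] for i in range(len(word))]


if __name__ == "__main__":
    # Rotating a word can change the represented graph.
    for w in conjugates("abcba"):
        print(w, sorted(represented_graph(w)))
```

Running the loop over the conjugates of a small word is a quick way to observe how rotation affects the represented graph.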
Title: Word-representable graphs from a word’s perspective. Acta Informatica 61(4):383–400. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s00236-024-00462-y.pdf
Pub Date: 2024-08-03 DOI: 10.1007/s00236-024-00461-z
R. Mahendra Kumar, N. Sadagopan
A bipartite graph G(X, Y) is called a star-convex bipartite graph with convexity on X if there is an associated star T(X, F), such that for each vertex in Y, its neighborhood in X induces a subtree in T. A graph G is said to be a split graph if G can be partitioned into a clique (K) and an independent set (I). The objective of this study is twofold: (i) to strengthen the complexity results presented in Chen et al. (J Comb Optim 32(1):95–110, 2016) for the Hamiltonian cycle (HCYCLE), the Hamiltonian path (HPATH), and the Domination (DS) problems on star-convex bipartite graphs; (ii) to reinforce the results of Müller (Discret Math 156(1–3):291–298, 1996) for HCYCLE and HPATH on split graphs by introducing a convex ordering on one of the partitions (K or I). As part of our fine-grained analysis study with the diameter being the parameter, we first show that the diameter of star-convex bipartite graphs is at most six. Next, we observe that the reduction instances of Chen et al. (J Comb Optim 32(1):95–110, 2016) are star-convex bipartite graphs with diameter at most 4, and hence HCYCLE and HPATH are NP-complete on star-convex bipartite graphs with diameter at most 4. We strengthen this result and establish the following results on star-convex bipartite graphs: (i) HCYCLE is NP-complete for diameter 3, and polynomial-time solvable for diameters 2, 5, and 6 (a transformation in complexity: P to NPC to P); (ii) HPATH is polynomial-time solvable for diameter 2, and NP-complete otherwise (a dichotomy). Further, with convexity being the parameter, for split graphs with convexity on K (resp. I), we show that HCYCLE and HPATH are NP-complete on star-convex (resp. comb) split graphs with convexity on K (resp. I). Further, we show that HCYCLE is NP-complete on \(K_{1,r}\)-free star-convex split graphs with convexity on I, \(r\ge 6\). On the positive side, we show that for \(K_{1,5}\)-free star-convex split graphs with convexity on I, HCYCLE is polynomial-time solvable. Thus, we establish a dichotomy for HCYCLE on star-convex split graphs with convexity on I. We further show that the dominating set problem (DS) and its variants (resp. Connected, Total, Outer-Connected, and Dominating biclique) are NP-complete on star-convex bipartite graphs with diameter 3 (resp. diameter 5, and diameter 6). On the parameterized complexity front, we prove that the parameterized version of the domination problem and its variants, with the parameter being the solution size, is not fixed-parameter tractable for star-convex bipartite graphs with diameter 3 (resp. diameter 5, and diameter 6), whereas it is fixed-parameter tractable when the parameter is the number of leaves in the associated star. Further, we show that for star-convex bipartite graphs with diameters 5, and 6, the domin
Title: A closer look at Hamiltonicity and domination through the lens of diameter and convexity. Acta Informatica 61(4):357–382.
Pub Date: 2024-07-30 DOI: 10.1007/s00236-024-00460-0
Burkay Sucu, Ebru Aydin Gol
Parametric timed automata (PTA) extend timed automata (TA) with parameters instead of fixed timing constraints, providing the flexibility to accommodate uncertainties during the design phase. Once a parametric model is obtained, the next step is finding the optimal parameters such that the resulting TA satisfies the specifications. This paper introduces a new algorithm for determining parameters from safety specifications for PTA with bounded integer parameters and no nested cycles. The algorithm searches for unsafe paths through a depth-first search and generates parameter constraints. In particular, the realizability of simple and cyclic paths is encoded via mixed integer linear programming and non-linear programming problems. Then, the parameter constraints rendering the path unrealizable are derived via quantifier elimination. The constraints accumulated through the depth-first search guarantee that a parameter valuation satisfying these constraints solves the synthesis problem. The results are illustrated over benchmarks.
Title: Cycle encoding-based parameter synthesis for timed automata safety. Acta Informatica 61(4):333–356.
Pub Date: 2024-07-20 DOI: 10.1007/s00236-024-00459-7
Wenfeng Lai, Adiesha Liyanage, Binhai Zhu, Peng Zou
Motivated by computing duplication patterns in sequences, a new problem called the longest letter-duplicated subsequence (LLDS) is proposed. Given a sequence S of length n, a letter-duplicated subsequence is a subsequence of S in the form of \(x_1^{d_1}x_2^{d_2}\ldots x_k^{d_k}\) with \(x_i\in \Sigma\), \(x_j\ne x_{j+1}\) and \(d_i\ge 2\) for all i in [k] and j in \([k-1]\). A linear time algorithm for computing a longest letter-duplicated subsequence (LLDS) of S can be easily obtained. In this paper, we focus on two variants of this problem: (1) the ‘all-appearance’ version, i.e., all letters in \(\Sigma\) must appear in the solution, and (2) the weighted version. For the former, we obtain dichotomous results: We prove that, when each letter appears in S at least 4 times, the problem and a relaxed version on feasibility testing (FT) are both NP-hard. The reduction is from \((3^+,1,2^-)\)-SAT, where all 3-clauses (i.e., containing 3 literals) are monotone (i.e., containing only positive literals) and all 2-clauses contain only negative literals. We then show that when each letter appears in S at most 3 times, the problem admits an O(n) time algorithm. Finally, we consider the weighted version, where the weight of a block \(x_i^{d_i}\ (d_i\ge 2)\) could be any positive function which might not grow with \(d_i\). We give a non-trivial \(O(n^2)\) time dynamic programming algorithm for this version, i.e., computing an LD-subsequence of S whose weight is maximized.
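For the basic, unweighted LLDS problem, a short dynamic program makes the definition tangible. The Python sketch below is a simple quadratic-time formulation chosen for clarity; it is neither the linear-time algorithm mentioned in the abstract nor the weighted or all-appearance algorithms of the paper. It relies on one observation: two neighbouring blocks of the same letter can be merged into a single block, so the condition \(x_j\ne x_{j+1}\) does not change the optimal length. The function name `llds_length` is hypothetical.

```python
def llds_length(s):
    """Length of a longest letter-duplicated subsequence of s.

    best[i] is the optimum for the prefix s[:i]. Either s[i-1] is
    skipped, or it ends the last block; that block then takes every
    occurrence of the same letter inside some window s[j:i] and must
    contain at least two letters.
    """
    n = len(s)
    best = [0] * (n + 1)
    for i in range(1, n + 1):
        best[i] = best[i - 1]  # option 1: do not use s[i-1]
        letter, count = s[i - 1], 0
        for j in range(i - 1, -1, -1):  # option 2: last block lies in s[j:i]
            if s[j] == letter:
                count += 1
                if count >= 2:
                    best[i] = max(best[i], best[j] + count)
    return best[n]


if __name__ == "__main__":
    for example in ["aabbcc", "abab", "aabab", "abcabc"]:
        print(example, llds_length(example))
```

For instance, in "aabab" the blocks "aa" and "bb" give a solution of length 4, which beats the single block "aaa" of length 3.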
Title: The longest letter-duplicated subsequence and related problems. Acta Informatica 61(3):315–329. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s00236-024-00459-7.pdf
Pub Date: 2024-05-13 DOI: 10.1007/s00236-024-00457-9
Wided Ghardallou, Hessamaldin Mohammadi, Richard C. Linger, Mark Pleszkoch, JiMeng Loh, Ali Mili
Invariant relations are used to analyze while loops; while their primary application is to derive the function of a loop, they can also be used to derive loop invariants, weakest preconditions, strongest postconditions, sufficient conditions of correctness, necessary conditions of correctness, and termination conditions of loops. In this paper we present two generic invariant relations that capture the semantics of loops whose loop body applies affine transformations on numeric variables.
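To give a feel for what an affine loop is, the Python sketch below iterates the simplest case, a single variable updated by x := a·x + b with integer a and b, and compares the result with the standard closed form for k iterations. This is only a textbook illustration of why affine updates admit closed-form descriptions; it does not reproduce the two invariant relations presented in the paper, and the function names are hypothetical.

```python
def iterate_affine(a, b, x, k):
    """Run the loop body x := a*x + b exactly k times."""
    for _ in range(k):
        x = a * x + b
    return x


def closed_form(a, b, x, k):
    """Value of x after k iterations, without running the loop.

    For a == 1 the update is a plain increment; otherwise the geometric
    sum 1 + a + ... + a**(k-1) equals (a**k - 1) / (a - 1), which is an
    exact integer for integer a, so integer division is safe.
    """
    if a == 1:
        return x + k * b
    return a**k * x + b * (a**k - 1) // (a - 1)


if __name__ == "__main__":
    # Both computations agree for every tested combination.
    for a in (-2, -1, 0, 1, 2, 3):
        for k in range(6):
            assert iterate_affine(a, 5, 7, k) == closed_form(a, 5, 7, k)
    print(iterate_affine(2, 1, 0, 5), closed_form(2, 1, 0, 5))
```

A relation between the state before and after an arbitrary number of iterations, such as the closed form here, is the kind of information from which loop functions and termination conditions can be derived.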
Title: Invariant relations for affine loops. Acta Informatica 61(3):261–314. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s00236-024-00457-9.pdf