Parameterized Algorithms for Matrix Completion With Radius Constraints
Pub Date: 2020-02-03 | DOI: 10.4230/LIPIcs.CPM.2020.20
Tomohiro Koana, Vincent Froese, R. Niedermeier
Considering matrices with missing entries, we study NP-hard matrix completion problems in which the completed matrix must have limited (local) radius. In the pure radius version, the goal is to fill in the missing entries so that there exists a 'center string' whose Hamming distance to every matrix row is as small as possible; in stringology, this problem is also known as Closest String with Wildcards. In the local radius version, the center string must additionally be one of the rows of the completed matrix. Hermelin and Rozenberg [CPM 2014, TCS 2016] performed parameterized complexity studies for Closest String with Wildcards. We answer one of their open questions, fix a bug in one of their fixed-parameter tractability results, and improve several running time upper bounds. For the local radius case, we reveal a computational complexity dichotomy. Overall, our results indicate that the local radius variant, although also NP-hard, often admits faster (fixed-parameter) algorithms.
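To make the radius objective concrete, here is a minimal Python sketch (not the paper's fixed-parameter algorithm) that brute-forces the smallest achievable radius of a wildcard matrix over a tiny binary alphabet. It relies on the observation that, in the pure radius version, a missing entry can always be filled to match the center and therefore contributes no distance; the '*' wildcard convention and the toy instance are illustrative assumptions.

from itertools import product

def radius_for_center(rows, center):
    # Max Hamming distance from the center to any row; a missing entry ('*')
    # can always be filled to match the center, so it contributes 0.
    return max(sum(1 for a, b in zip(row, center) if a != '*' and a != b)
               for row in rows)

def min_radius(rows, alphabet="01"):
    # Brute force over all candidate centers (exponential; toy instances only).
    length = len(rows[0])
    return min(radius_for_center(rows, "".join(cand))
               for cand in product(alphabet, repeat=length))

rows = ["0*1", "1*1", "*01"]
print(min_radius(rows))  # 1: no completion admits radius 0, but radius 1 is achievable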
{"title":"Parameterized Algorithms for Matrix Completion With Radius Constraints","authors":"Tomohiro Koana, Vincent Froese, R. Niedermeier","doi":"10.4230/LIPIcs.CPM.2020.20","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2020.20","url":null,"abstract":"Considering matrices with missing entries, we study NP-hard matrix completion problems where the resulting completed matrix shall have limited (local) radius. In the pure radius version, this means that the goal is to fill in the entries such that there exists a 'center string' which has Hamming distance to all matrix rows as small as possible. In stringology, this problem is also known as Closest String with Wildcards. In the local radius version, the requested center string must be one of the rows of the completed matrix. Hermelin and Rozenberg [CPM 2014, TCS 2016] performed parameterized complexity studies for Closest String with Wildcards. We answer one of their open questions, fix a bug concerning a fixed-parameter tractability result in their work, and improve some upper running time bounds. For the local radius case, we reveal a computational complexity dichotomy. In general, our results indicate that, although being NP-hard as well, this variant often allows for faster (fixed-parameter) algorithms.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116214210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chaining with overlaps revisited
Pub Date: 2020-01-19 | DOI: 10.4230/LIPIcs.CPM.2020.25
V. Mäkinen, Kristoffer Sahlin
Chaining algorithms aim to form a semi-global alignment of two sequences from a set of anchoring local alignments given as input. Depending on the optimization criteria and the exact definition of a chain, there are several $O(n \log n)$-time algorithms that solve this problem optimally, where $n$ is the number of input anchors. In this paper, we focus on a formulation that allows the anchors to overlap in a chain. This formulation was studied by Shibuya and Kurochkin (WABI 2003), but their algorithm comes with no proof of correctness. We revisit and modify their algorithm to consider a strict definition of the precedence relation on anchors, adding the derivation needed to establish the correctness of the resulting algorithm, which runs in $O(n \log^2 n)$ time on anchors formed by exact matches. With the more relaxed precedence relation considered by Shibuya and Kurochkin, or when anchors are non-nested (such as matches of uniform length, i.e., $k$-mers), the algorithm takes $O(n \log n)$ time. We also establish a connection between chaining with overlaps and the widely studied longest common subsequence (LCS) problem.
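As an illustration of the LCS connection mentioned above, the following Python sketch treats every single-character match as an anchor and computes the longest chain that is strictly increasing in both coordinates; under this simplified, quadratic-time formulation the optimal chain length equals the LCS length. This is a toy cross-check, not the $O(n \log^2 n)$ chaining algorithm of the paper, and the example strings are arbitrary.

def lcs_via_anchor_chaining(a, b):
    # Single-character matches (i, j) with a[i] == b[j] serve as anchors; the
    # longest chain that is strictly increasing in both coordinates has the
    # same length as LCS(a, b).  Quadratic toy DP over anchors.
    anchors = sorted((i, j) for i in range(len(a)) for j in range(len(b)) if a[i] == b[j])
    best = [1] * len(anchors)
    for k, (i, j) in enumerate(anchors):
        for l in range(k):
            pi, pj = anchors[l]
            if pi < i and pj < j:
                best[k] = max(best[k], best[l] + 1)
    return max(best, default=0)

def lcs_dp(a, b):
    # Textbook LCS dynamic program, used here as a cross-check.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a)):
        for j in range(len(b)):
            dp[i + 1][j + 1] = dp[i][j] + 1 if a[i] == b[j] else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

print(lcs_via_anchor_chaining("chaining", "aligning"), lcs_dp("chaining", "aligning"))  # 6 6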
{"title":"Chaining with overlaps revisited","authors":"V. Mäkinen, Kristoffer Sahlin","doi":"10.4230/LIPIcs.CPM.2020.25","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2020.25","url":null,"abstract":"Chaining algorithms aim to form a semi-global alignment of two sequences based on a set of anchoring local alignments as input. Depending on the optimization criteria and the exact definition of a chain, there are several $O(n log n)$ time algorithms to solve this problem optimally, where $n$ is the number of input anchors. \u0000In this paper, we focus on a formulation allowing the anchors to overlap in a chain. This formulation was studied by Shibuya and Kurochin (WABI 2003), but their algorithm comes with no proof of correctness. We revisit and modify their algorithm to consider a strict definition of precedence relation on anchors, adding the required derivation to convince on the correctness of the resulting algorithm that runs in $O(n log^2 n)$ time on anchors formed by exact matches. With the more relaxed definition of precedence relation considered by Shibuya and Kurochin or when anchors are non-nested such as matches of uniform length ($k$-mers), the algorithm takes $O(n log n)$ time. \u0000We also establish a connection between chaining with overlaps to the widely studied longest common subsequence (LCS) problem.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133846537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maximal Common Subsequence Algorithms
Pub Date: 2019-11-01 | DOI: 10.4230/LIPIcs.CPM.2018.1
Y. Sakai
A common subsequence of two strings is maximal if inserting any character at any position of the subsequence no longer yields a common subsequence of the two strings. The present article proposes a (sub)linearithmic-time, linear-space algorithm for finding a maximal common subsequence of two strings, and also a linear-time algorithm for determining whether a given common subsequence of two strings is maximal.
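The following Python sketch illustrates the definition with a naive maximality check (try every single-character insertion); it is not the linear-time algorithm of the paper. The example strings are chosen so that a maximal common subsequence can be strictly shorter than a longest one.

def is_subsequence(s, t):
    # True iff s is a subsequence of t (standard consuming-iterator idiom).
    it = iter(t)
    return all(c in it for c in s)

def is_maximal_common_subsequence(s, x, y):
    # Naive check: s is a common subsequence of x and y, and no single-character
    # insertion into s keeps it one.  Brute force, for illustration only.
    if not (is_subsequence(s, x) and is_subsequence(s, y)):
        return False
    alphabet = set(x) & set(y)
    for i in range(len(s) + 1):
        for c in alphabet:
            t = s[:i] + c + s[i:]
            if is_subsequence(t, x) and is_subsequence(t, y):
                return False
    return True

print(is_maximal_common_subsequence("c", "abc", "cab"))  # True: maximal, yet shorter than the LCS "ab"
print(is_maximal_common_subsequence("a", "abc", "cab"))  # False: can be extended to "ab"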
{"title":"Maximal Common Subsequence Algorithms","authors":"Y. Sakai","doi":"10.4230/LIPIcs.CPM.2018.1","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2018.1","url":null,"abstract":"A common subsequence of two strings is maximal, if inserting any character into the subsequence can no longer yield a common subsequence of the two strings. The present article proposes a (sub)linearithmic-time, linear-space algorithm for finding a maximal common subsequence of two strings and also proposes a linear-time algorithm for determining if a common subsequence of two strings is maximal.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126152151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Size of Overlapping Lempel-Ziv and Lyndon Factorizations
Pub Date: 2019-06-01 | DOI: 10.4230/LIPIcs.CPM.2019.29
Y. Urabe, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda
Lempel-Ziv (LZ) factorization and Lyndon factorization are well-known factorizations of strings. Recently, Kärkkäinen et al. [STACS 2017] studied the relation between the sizes of the two factorizations and showed that the size of the Lyndon factorization is always smaller than twice the size of the non-overlapping LZ factorization. In this paper, we consider a similar question for the overlapping version of the LZ factorization. Since the size of the overlapping LZ factorization is always at most the size of the non-overlapping LZ factorization and can, in fact, be an O(log n) factor smaller, it is not immediately clear whether a bound similar to the previous one holds. Nevertheless, we prove that the size of the Lyndon factorization is always smaller than four times the size of the overlapping LZ factorization.
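For reference, the Lyndon side of the comparison can be computed in linear time with Duval's algorithm; a short Python version follows (the LZ factorizations are not shown, and the example string is arbitrary).

def lyndon_factorization(s):
    # Duval's algorithm: factor s into a lexicographically non-increasing
    # sequence of Lyndon words in O(|s|) time.
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors

print(lyndon_factorization("banana"))  # ['b', 'an', 'an', 'a']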
{"title":"On the Size of Overlapping Lempel-Ziv and Lyndon Factorizations","authors":"Y. Urabe, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda","doi":"10.4230/LIPIcs.CPM.2019.29","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.29","url":null,"abstract":"Lempel-Ziv (LZ) factorization and Lyndon factorization are well-known factorizations of strings. Recently, Karkkainen et al. studied the relation between the sizes of the two factorizations, and showed that the size of the Lyndon factorization is always smaller than twice the size of the non-overlapping LZ factorization [STACS 2017]. In this paper, we consider a similar problem for the overlapping version of the LZ factorization. Since the size of the overlapping LZ factorization is always smaller than the size of the non-overlapping LZ factorization and, in fact, can even be an O(log n) factor smaller, it is not immediately clear whether a similar bound as in previous work would hold. Nevertheless, in this paper, we prove that the size of the Lyndon factorization is always smaller than four times the size of the overlapping LZ factorization.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125742855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hamming Distance Completeness
Pub Date: 2019-06-01 | DOI: 10.4230/LIPIcs.CPM.2019.14
K. Labib, P. Uznański, D. Wolleb-Graf
We show that, for any binary integer function diamond that is piecewise polynomial, (+, diamond) vector products are equivalent under one-to-polylog reductions to the computation of the Hamming distance. Examples include the dominance and l_{2p+1} distances for constant p. Our results imply equivalence (up to polylog factors) between the complexity of computing All Pairs Hamming Distance, All Pairs l_{2p+1} Distance, and Dominance Matrix Product, as well as equivalence between Hamming Distance Pattern Matching, l_{2p+1} Pattern Matching, and Less-Than Pattern Matching. The resulting algorithms for l_{2p+1} Pattern Matching and All Pairs l_{2p+1} Distance, for 2p+1 = 3, 5, 7, ..., are likely to be optimal, given the lack of progress on improving upper bounds for Hamming distance over the past 30 years. While reductions between selected pairs of such products have been presented before, our work is the first to generalize them to a general class of functions, showing that a wide class of "intermediate"-complexity problems are in fact equivalent.
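The objects being related are (+, diamond) vector products. The Python sketch below merely spells out three instances of the definition: Hamming, dominance (one common convention, counting coordinates where x[i] < y[i]), and an l_3 "product" taken here as the sum of cubed coordinate differences without the final root. The vectors are arbitrary illustrative data; nothing about the reductions themselves is modeled.

def vector_product(x, y, diamond):
    # (+, diamond) product of two equal-length integer vectors: sum_i diamond(x[i], y[i]).
    return sum(diamond(a, b) for a, b in zip(x, y))

hamming   = lambda a, b: int(a != b)       # Hamming distance
dominance = lambda a, b: int(a < b)        # dominance ("less-than") count
l3        = lambda a, b: abs(a - b) ** 3   # l_3 "product": sum of cubed differences

x, y = [3, 1, 4, 1], [2, 1, 5, 9]
print(vector_product(x, y, hamming), vector_product(x, y, dominance), vector_product(x, y, l3))  # 3 2 514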
{"title":"Hamming Distance Completeness","authors":"K. Labib, P. Uznański, D. Wolleb-Graf","doi":"10.4230/LIPIcs.CPM.2019.14","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.14","url":null,"abstract":"We show, given a binary integer function diamond that is piecewise polynomial, that (+,diamond) vector products are equivalent under one-to-polylog reductions to the computation of the Hamming distance. Examples include the dominance and l_{2p+1} distances for constant p. Our results imply equivalence (up to polylog factors) between the complexity of computing All Pairs Hamming Distance, All Pairs l_{2p+1} Distance and Dominance Matrix Product, and equivalence between Hamming Distance Pattern Matching, l_{2p+1} Pattern Matching and Less-Than Pattern Matching. The resulting algorithms for l_{2p+1} Pattern Matching and All Pairs l_{2p+1}, for 2p+1 = 3,5,7,... are likely to be optimal, given lack of progress in improving upper bounds for Hamming distance in the past 30 years. While reductions between selected pairs of products were presented in the past, our work is the first to generalize them to a general class of functions, showing that a wide class of \"intermediate\" complexity problems are in fact equivalent.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127205517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Indexing the Bijective BWT
Pub Date: 2019-06-01 | DOI: 10.4230/LIPIcs.CPM.2019.17
H. Bannai, Juha Kärkkäinen, D. Köppl, Marcin Piatkowski
The Burrows-Wheeler transform (BWT) is a permutation whose applications are prevalent in data compression and text indexing. The bijective BWT is a bijective variant of it that has not yet been studied for text indexing applications. We fill this gap by proposing a self-index built on the bijective BWT. The self-index applies the backward search technique of the FM-index to find a pattern P with O(|P| lg |P|) backward search steps.
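For orientation, here is a Python sketch of the classic FM-index backward search on the ordinary (non-bijective) BWT, which is the technique the proposed self-index adapts; the bijective variant and the O(|P| lg |P|) bound are not reproduced here. Rank queries are done by naive scanning rather than with succinct data structures, and a unique '$' sentinel smaller than all text characters is assumed.

def bwt(text):
    # Burrows-Wheeler transform via sorted rotations; '$' is a unique sentinel.
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(r[-1] for r in rotations)

def backward_search(bwt_str, pattern):
    # Count occurrences of pattern with the classic FM-index backward search.
    # C[c] = number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in sorted(set(bwt_str)):
        C[c] = total
        total += bwt_str.count(c)
    rank = lambda c, i: bwt_str[:i].count(c)  # occurrences of c in bwt_str[:i]
    lo, hi = 0, len(bwt_str)
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo, hi = C[c] + rank(c, lo), C[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo

print(backward_search(bwt("mississippi"), "issi"))  # 2 occurrences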
{"title":"Indexing the Bijective BWT","authors":"H. Bannai, Juha Kärkkäinen, D. Köppl, Marcin Piatkowski","doi":"10.4230/LIPIcs.CPM.2019.17","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.17","url":null,"abstract":"The Burrows-Wheeler transform (BWT) is a permutation whose applications are prevalent in data compression and text indexing. The bijective BWT is a bijective variant of it that has not yet been studied for text indexing applications. We fill this gap by proposing a self-index built on the bijective BWT . The self-index applies the backward search technique of the FM-index to find a pattern P with O(|P| lg|P|) backward search steps.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123124845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Invertible Transform for Efficient String Matching in Labeled Digraphs
Pub Date: 2019-05-09 | DOI: 10.4230/LIPIcs.CPM.2021.20
Abhinav Nellore, Austin Nguyen, Reid F. Thompson
Let $G = (V, E)$ be a digraph where each vertex is unlabeled, each edge is labeled by a character in some alphabet $\Omega$, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of $G$ into a weakly connected digraph $G' = (V', E')$ that enables solving the decision problem of whether there is a walk in $G$ matching an arbitrarily long query string $q$ in time linear in $|q|$ and independent of $|E|$ and $|V|$. We show $G$ is uniquely determined by $G'$ when for every $v_\ell \in V$, there is some distinct string $s_\ell$ on $\Omega$ such that $v_\ell$ is the origin of a closed walk in $G$ matching $s_\ell$, and no other walk in $G$ matches $s_\ell$ unless it starts and ends at $v_\ell$. We then exploit this invertibility condition to strategically alter any $G$ so its transform $G'$ enables retrieval of all $t$ terminal vertices of walks in the unaltered $G$ matching $q$ in $O(|q| + t \log |V|)$ time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.
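A toy Python sketch of the powerset (subset) construction described above: it materializes the transform's transition function explicitly (exponential in $|V|$ in the worst case, so only for tiny digraphs) and then answers "is there a walk matching $q$" with one transition per query character. The vertex and edge data are illustrative assumptions; the paper's invertibility condition and the retrieval of terminal vertices are not modeled.

from collections import defaultdict
from itertools import chain, combinations

def build_transform(vertices, edges):
    # Tabulate, for every subset S of vertices and every character c, the set
    # of vertices reachable from S by one edge labeled c.
    by_char = defaultdict(lambda: defaultdict(set))
    for u, v, c in edges:                        # edge u -> v labeled c
        by_char[c][u].add(v)
    all_subsets = chain.from_iterable(combinations(sorted(vertices), r)
                                      for r in range(len(vertices) + 1))
    delta = {}
    for S in map(frozenset, all_subsets):
        for c in by_char:
            delta[(S, c)] = frozenset(v for u in S for v in by_char[c][u])
    return delta

def matches(delta, vertices, query):
    # Is there a walk in the digraph whose edge labels spell out query?
    S = frozenset(vertices)                      # a walk may start at any vertex
    for c in query:
        S = delta.get((S, c), frozenset())
        if not S:
            return False
    return True

V = {1, 2, 3}
E = [(1, 2, 'a'), (2, 3, 'b'), (3, 1, 'a'), (2, 2, 'b')]
delta = build_transform(V, E)
print(matches(delta, V, "abba"), matches(delta, V, "aaa"))  # True False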
{"title":"An Invertible Transform for Efficient String Matching in Labeled Digraphs","authors":"Abhinav Nellore, Austin Nguyen, Reid F. Thompson","doi":"10.4230/LIPIcs.CPM.2021.20","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2021.20","url":null,"abstract":"Let $G = (V, E)$ be a digraph where each vertex is unlabeled, each edge is labeled by a character in some alphabet $Omega$, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of $G$ into a weakly connected digraph $G' = (V', E')$ that enables solving the decision problem of whether there is a walk in $G$ matching an arbitrarily long query string $q$ in time linear in $|q|$ and independent of $|E|$ and $|V|$. We show $G$ is uniquely determined by $G'$ when for every $v_ell in V$, there is some distinct string $s_ell$ on $Omega$ such that $v_ell$ is the origin of a closed walk in $G$ matching $s_ell$, and no other walk in $G$ matches $s_ell$ unless it starts and ends at $v_ell$. We then exploit this invertibility condition to strategically alter any $G$ so its transform $G'$ enables retrieval of all $t$ terminal vertices of walks in the unaltered $G$ matching $q$ in $O(|q| + t log |V|)$ time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115293386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conversion from RLBWT to LZ77
Pub Date: 2019-02-14 | DOI: 10.4230/LIPIcs.CPM.2019.9
T. Nishimoto, Yasuo Tabei
Converting one compressed format of a string into another compressed format without explicit decompression is one of the central research topics in string processing. We discuss the problem of converting the run-length Burrows-Wheeler Transform (RLBWT) of a string into the Lempel-Ziv 77 (LZ77) phrases of the reversed string. The first result, Policriti and Prezza's conversion algorithm [Algorithmica 2018], runs in $O(n \log r)$ time and $O(r)$ working space, where $n$ is the length of the string, $r$ is the number of runs in the RLBWT, and $z$ is the number of LZ77 phrases. A more recent result, Kempa's conversion algorithm [SODA 2019], runs in $O(n / \log n + r \log^{9} n + z \log^{9} n)$ time and $O(n / \log_{\sigma} n + r \log^{8} n)$ working space, where $\sigma$ is the alphabet size of the RLBWT. In this paper, we present a new conversion algorithm obtained by improving Policriti and Prezza's algorithm, which relies on general-purpose dynamic data structures. We argue that these dynamic data structures can be replaced, and we present new data structures for faster conversion. With the new data structures, our conversion algorithm runs in $O(n \min\{\log\log n, \sqrt{\frac{\log r}{\log\log r}}\})$ time and $O(r)$ working space.
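To fix the target format, here is a naive quadratic-time Python sketch of a self-referencing LZ77 factorization computed directly from the plain string. The whole point of the conversion algorithms above is to produce such phrases from the RLBWT without ever decompressing, which this sketch does not attempt; the phrase encoding (source position, length) and the example string are illustrative choices.

def lz77_factorize(s):
    # Each phrase is the longest prefix of the remaining suffix that also occurs
    # starting at an earlier position (the occurrence may overlap the phrase),
    # or a single fresh character.  Quadratic brute force, illustration only.
    phrases, i = [], 0
    while i < len(s):
        best_len, best_src = 0, -1
        for j in range(i):                        # candidate earlier start
            l = 0
            while i + l < len(s) and s[j + l] == s[i + l]:
                l += 1
            if l > best_len:
                best_len, best_src = l, j
        if best_len == 0:
            phrases.append((None, s[i]))          # literal phrase
            i += 1
        else:
            phrases.append((best_src, best_len))  # (source position, length)
            i += best_len
    return phrases

print(lz77_factorize("abababb"))  # [(None, 'a'), (None, 'b'), (0, 4), (1, 1)]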
{"title":"Conversion from RLBWT to LZ77","authors":"T. Nishimoto, Yasuo Tabei","doi":"10.4230/LIPIcs.CPM.2019.9","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.9","url":null,"abstract":"Converting a compressed format of a string into another compressed format without an explicit decompression is one of the central research topics in string processing. We discuss the problem of converting the run-length Burrows-Wheeler Transform (RLBWT) of a string to Lempel-Ziv 77 (LZ77) phrases of the reversed string. The first results with Policriti and Prezza's conversion algorithm [Algorithmica 2018] were $O(n log r)$ time and $O(r)$ working space for length of the string $n$, number of runs $r$ in the RLBWT, and number of LZ77 phrases $z$. Recent results with Kempa's conversion algorithm [SODA 2019] are $O(n / log n + r log^{9} n + z log^{9} n)$ time and $O(n / log_{sigma} n + r log^{8} n)$ working space for the alphabet size $sigma$ of the RLBWT. In this paper, we present a new conversion algorithm by improving Policriti and Prezza's conversion algorithm where dynamic data structures for general purpose are used. We argue that these dynamic data structures can be replaced and present new data structures for faster conversion. The time and working space of our conversion algorithm with new data structures are $O(n min { log log n, sqrt{frac{log r}{loglog r}} })$ and $O(r)$, respectively.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121668881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Faster queries for longest substring palindrome after block edit
Pub Date: 2019-01-30 | DOI: 10.4230/LIPICS.CPM.2019.27
Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda
Palindromes are important objects in strings that have been extensively studied from combinatorial, algorithmic, and bioinformatics points of view. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string T in O(n) time, where n is the length of T. In this paper, we consider the problem of finding an LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing and answers the length of the LSPals in O(l + log log n) time after a substring of T is replaced by a string of arbitrary length l. This outperforms the query algorithm proposed in our previous work [CPM 2018], which uses O(l + log n) time for each query.
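As a baseline for what is being queried, the following Python sketch recomputes the longest substring palindrome from scratch by center expansion in O(n^2) time after an edit; the contribution of the paper is answering such queries in O(l + log log n) time without recomputation. The example string and the edit are illustrative.

def longest_palindromic_substring(s):
    # Center expansion over the 2n-1 possible palindrome centers.
    best = ""
    for center in range(2 * len(s) - 1):
        l, r = center // 2, (center + 1) // 2
        while l >= 0 and r < len(s) and s[l] == s[r]:
            l, r = l - 1, r + 1
        if r - l - 1 > len(best):
            best = s[l + 1:r]
    return best

s = "abacabadaba"
print(longest_palindromic_substring(s))                    # abacaba
print(longest_palindromic_substring(s[:4] + "x" + s[5:]))  # aba (after replacing s[4], the LSPal shrinks)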
{"title":"Faster queries for longest substring palindrome after block edit","authors":"Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda","doi":"10.4230/LIPICS.CPM.2019.27","DOIUrl":"https://doi.org/10.4230/LIPICS.CPM.2019.27","url":null,"abstract":"Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of views. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in O(n) time, where n is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(l + log log n) time, after a substring in T is replaced by a string of arbitrary length l. This outperforms the query algorithm proposed in our previous work [CPM 2018] that uses O(l + log n) time for each query.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115143511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computing runs on a trie
Pub Date: 2019-01-30 | DOI: 10.4230/LIPIcs.CPM.2019.23
Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda
A maximal repetition, or run, in a string is a periodically maximal substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, in other words, a path on a rooted edge-labeled tree in which one endpoint must be an ancestor of the other. For a trie with $n$ edges, we show that the number of runs is less than $n$. We also give an $O(n\sqrt{\log n}\log\log n)$-time and $O(n)$-space algorithm for counting all runs and finding the shallower endpoint of each run, and an $O(n\sqrt{\log n}\log^2\log n)$-time and $O(n)$-space algorithm for finding both endpoints of all runs.
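To illustrate the definition of a run, the following cubic-time Python sketch enumerates all runs of a plain string directly from the definition (smallest period at most half the length, not extendable left or right with that period); the trie generalization and the near-linear algorithms are the subject of the paper, and are not attempted here.

def smallest_period(s):
    # Smallest p with s[i] == s[i + p] for all valid i (p = len(s) always works).
    n = len(s)
    return next(p for p in range(1, n + 1)
                if all(s[i] == s[i + p] for i in range(n - p)))

def runs(s):
    # All maximal repetitions (runs) as triples (i, j, p): the substring s[i:j]
    # has smallest period p with 2p <= j - i, and extending it by one character
    # on either side breaks period p.  Brute force, illustration only.
    n, found = len(s), []
    for i in range(n):
        for j in range(i + 2, n + 1):
            p = smallest_period(s[i:j])
            if 2 * p > j - i:
                continue
            left_maximal = i == 0 or s[i - 1] != s[i - 1 + p]
            right_maximal = j == n or s[j] != s[j - p]
            if left_maximal and right_maximal:
                found.append((i, j, p))
    return found

print(runs("mississippi"))  # [(1, 8, 3), (2, 4, 1), (5, 7, 1), (8, 10, 1)]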
{"title":"Computing runs on a trie","authors":"Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda","doi":"10.4230/LIPIcs.CPM.2019.23","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.23","url":null,"abstract":"A maximal repetition, or run, in a string, is a periodically maximal substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, or in other words, on a rooted edge-labeled tree where the endpoints of the path must be a descendant/ancestor of the other. For a trie with $n$ edges, we show that the number of runs is less than $n$. We also show an $O(nsqrt{log n}log log n)$ time and $O(n)$ space algorithm for counting and finding the shallower endpoint of all runs. We further show an $O(nsqrt{log n}log^2log n)$ time and $O(n)$ space algorithm for finding both endpoints of all runs.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"-1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123698923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}