
Workshop on Analytic Algorithmics and Combinatorics: Latest Articles

The Spanning Trees Formulas in a Class of Double Fixed-Step Loop Networks
Pub Date : 2009-01-03 DOI: 10.1137/1.9781611972993.3
T. Atajan, N. Otsuka, Xuerong Yong
A double fixed-step loop network, Cp,q, is a digraph on n vertices 0, 1, 2, ..., n − 1 in which each vertex i (0 < i ≤ n − 1) has exactly two outgoing arcs, to vertices i + p and i + q (mod n). In this paper, we first derive a formula expressing the elementary symmetric polynomials as polynomials in sums of powers. Using it, for any positive integers p, q, n with p < q < n, we obtain an explicit formula for the number of spanning trees in a class of double fixed-step loop networks with constant or nonconstant jumps. We also find two classes of networks that share the same number of spanning trees and, finally, prove that the number of spanning trees can be approximated by a formula based on the mth order Fibonacci numbers. In some special cases, our results yield the formulas obtained in [15], [19], [20]. Compared with previous work, the advantage is that, for any jumps p, q, the number of spanning trees can be calculated directly, without establishing the recurrence relation of order 2q−1.
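For small instances, the quantity the abstract counts can be cross-checked by brute force. The sketch below is not the paper's explicit formula. It applies the directed matrix-tree theorem (a determinant of the out-degree Laplacian with one row and column removed) and assumes every vertex, including vertex 0, has arcs to i + p and i + q (mod n). The function names are illustrative only.

```python
from fractions import Fraction


def _determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def count_spanning_trees(n, p, q):
    """Count spanning trees rooted at vertex 0 in the digraph i -> i+p, i -> i+q (mod n)."""
    # Laplacian L = D_out - A, built arc by arc.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for step in (p, q):
            j = (i + step) % n
            lap[i][i] += 1
            lap[i][j] -= 1
    # Matrix-tree theorem: delete the root's row and column, take the determinant.
    minor = [row[1:] for row in lap[1:]]
    return int(_determinant(minor))


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, count_spanning_trees(n, 1, 2))
```

Because the digraph is vertex-transitive, the choice of root does not change the count. The determinant costs cubic time in n, which is exactly the kind of work a closed formula avoids.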
Citations: 1
Mathematics and Computer Science Serving/Impacting Bioinformatics
Pub Date : 2009-01-03 DOI: 10.1137/1.9781611972993.5
G. Gonnet
Since the early days of bioinformatics, it has been clear that mathematics in general, and computer science in particular, have a lot to contribute to the field. Bioinformatics has made substantial progress in the last 20 years, using tools from computer science, mathematics and statistics. As will be seen, there is plenty of work for the algorithms community.
Citations: 0
Maximum Likelihood Analysis of Heapsort
Pub Date : 2009-01-03 DOI: 10.1137/1.9781611972993.7
Ulrich Laube, M. Nebel
We present a new approach to the average-case analysis of algorithms that supports a non-uniform distribution of the inputs and is based on the maximum likelihood training of stochastic grammars. The approach is exemplified by an analysis of the average running time of heapsort. All but one step of our analysis can be automated on top of a computer-algebra system. Thus our new approach considerably eases the effort required for an average-case analysis while allowing for the consideration of realistic input distributions with unknown distribution functions.
Citations: 0
Average-case Analysis of Moves in Quick Select
Pub Date : 2009-01-03 DOI: 10.1137/1.9781611972993.6
H. Mahmoud
We investigate the average number of moves made by Quick Select (a variant of Quick Sort for finding order statistics) to find an element with a randomly selected rank. This kind of grand average provides smoothing over all individual cases of a specific fixed order statistic. The variance of the number of moves involves intricate dependencies, and we only give reasonably tight bounds.
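The grand average described above can be explored empirically. The following is a minimal simulation sketch, not the paper's analysis. It assumes a Lomuto-style partition with the last element as pivot and counts each swap of two distinct positions as one move. Both choices are assumptions, since the abstract does not fix the partitioning scheme or the exact definition of a move. The function names are hypothetical.

```python
import random


def quickselect_moves(values, rank):
    """Return (element of the given 0-based rank, number of swaps performed).

    Assumes a non-empty input with distinct elements and 0 <= rank < len(values).
    """
    a = list(values)
    lo, hi = 0, len(a) - 1
    moves = 0
    while lo < hi:
        pivot = a[hi]
        store = lo
        # Lomuto partition: move elements smaller than the pivot to the front.
        for i in range(lo, hi):
            if a[i] < pivot:
                if i != store:
                    a[i], a[store] = a[store], a[i]
                    moves += 1
                store += 1
        if store != hi:
            a[store], a[hi] = a[hi], a[store]
            moves += 1
        if rank == store:
            return a[store], moves
        if rank < store:
            hi = store - 1
        else:
            lo = store + 1
    return a[lo], moves


def grand_average_moves(n, trials, seed=0):
    """Average the move count over random permutations and uniformly random ranks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += quickselect_moves(perm, rng.randrange(n))[1]
    return total / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, grand_average_moves(n, trials=200))
```

Averaging over a random rank, rather than fixing one order statistic, is what makes this a grand average in the sense of the abstract.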
Citations: 5
Balanced And/Or Trees and Linear Threshold Functions
Pub Date : 2009-01-03 DOI: 10.1137/1.9781611972993.8
Hervé Fournier, Danièle Gardy, Antoine Genitrini
We consider random balanced Boolean formulas, built on the two connectives AND and OR and a fixed number of variables. The probability distribution induced on Boolean functions is shown to have a limit as the depth of these formulas grows to infinity. By investigating how this limiting distribution depends on the two underlying probability distributions, over the connectives and over the Boolean variables, we prove that its support consists of linear threshold functions, and give the speed of convergence towards this limiting distribution.
Citations: 9
Pursuit and Evasion from a Distance: Algorithms and Bounds
Pub Date : 2009-01-03 DOI: 10.1137/1.9781611972993.1
A. Bonato, E. Chiniforooshan
Cops and Robber is a pursuit and evasion game played on graphs that has received much attention. We consider an extension of Cops and Robber, distance k Cops and Robber, where the cops win if they are at distance at most k from the robber in G. The cop number of a graph G is the minimum number of cops needed to capture the robber in G. The distance k analogue of the cop number, written c_k(G), equals the minimum number of cops needed to win at a given distance k. We supply a classification result for graphs with bounded c_k(G) values and develop an O(n^(2s+3)) algorithm for determining if c_k(G) ≤ s. In the case k = 0, our algorithm is faster than previously known algorithms. Upper and lower bounds are found for c_k(G) in terms of the order of G. We prove that [EQUATION] where c_k(n) is the maximum of c_k(G) over all n-node connected graphs.
Citations: 19
Approximating L1-distances Between Mixture Distributions Using Random Projections
Pub Date : 2008-04-07 DOI: 10.1137/1.9781611972993.11
Satyaki Mahalanabis, Daniel Stefankovic
We consider the problem of computing L1-distances between every pair of probability densities from a given family, a problem motivated by density estimation [15]. We point out that the technique of Cauchy random projections [10] in this context turns into stochastic integrals with respect to Cauchy motion. For piecewise-linear densities these integrals can be sampled from if one can sample from the stochastic integral of the function x → (1, x). We give an explicit density function for this stochastic integral and present an efficient (exact) sampling algorithm. As a consequence we obtain an efficient algorithm to approximate the L1-distances with a small relative error. For piecewise-polynomial densities we show how to approximately sample from the distributions resulting from the stochastic integrals. This also results in an efficient algorithm to approximate the L1-distances, although our inability to get exact samples worsens the dependence on the parameters.
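For orientation, the basic Cauchy random projection idea in its discrete form can be sketched as follows. This is not the stochastic-integral construction of the paper. It treats two densities as finite vectors and relies on the fact that a linear combination of independent standard Cauchy variables with coefficients d_i is Cauchy with scale equal to the sum of |d_i|, so the median of the absolute projections estimates the L1-distance. The function name and parameters are illustrative.

```python
import math
import random
import statistics


def cauchy_l1_estimate(x, y, projections=2001, seed=0):
    """Estimate sum(|x_i - y_i|) from random Cauchy projections of x - y."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(x, y)]
    sketches = []
    for _ in range(projections):
        total = 0.0
        for d in diffs:
            # Inverse-CDF sampling of a standard Cauchy variable.
            cauchy = math.tan(math.pi * (rng.random() - 0.5))
            total += cauchy * d
        sketches.append(abs(total))
    # The median of |Cauchy(0, s)| equals the scale s, here the L1-distance.
    return statistics.median(sketches)


if __name__ == "__main__":
    p = [0.1, 0.2, 0.3, 0.4]
    q = [0.4, 0.3, 0.2, 0.1]
    print("exact   :", sum(abs(a - b) for a, b in zip(p, q)))
    print("estimate:", cauchy_l1_estimate(p, q))
```

In practice each vector would be projected once with a shared random matrix and the sketches compared pairwise; projecting the difference directly, as done here, is equivalent by linearity and keeps the example short.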
Citations: 3
Markovian Embeddings of General Random Strings
Pub Date : 2008-01-19 DOI: 10.1137/1.9781611972986.2
M. Lladser
Let A be a finite set and X a sequence of A-valued random variables. We do not assume any particular correlation structure between these random variables; in particular, X may be a non-Markovian sequence. An adapted embedding of X is a sequence of the form R(X_1), R(X_1, X_2), R(X_1, X_2, X_3), etc., where R is a transformation defined over finite length sequences. In this extended abstract we characterize a wide class of adapted embeddings of X that result in a first-order homogeneous Markov chain. We show that any transformation R has a unique coarsest refinement R' in this class such that R'(X_1), R'(X_1, X_2), R'(X_1, X_2, X_3), etc., is Markovian. (By refinement we mean that R'(u) = R'(v) implies R(u) = R(v), and by coarsest refinement we mean that R' is a deterministic function of any other refinement of R in our class of transformations.) We propose a specific embedding that we denote as RX which is particularly amenable for analyzing the occurrence of patterns described by regular expressions in X. A toy example of a non-Markovian sequence of 0's and 1's is analyzed thoroughly: discrete asymptotic distributions are established for the number of occurrences of a certain regular pattern in X_1, ..., X_n as n → ∞, whereas a Gaussian asymptotic distribution is shown to apply for another regular pattern.
Citations: 6
On the Convergence of Upper Bound Techniques for the Average Length of Longest Common Subsequences
Pub Date : 2008-01-19 DOI: 10.1137/1.9781611972986.1
G. S. Lueker
It has long been known [2] that the average length of the longest common subsequence of two random strings of length n over an alphabet of size k is asymptotic to γ_k n for some constant γ_k depending on k. The value of these constants remains unknown, and a number of papers have proved upper and lower bounds on them. In particular, in [6] we used a modification of methods of [3, 4] for determining lower and upper bounds on γ_k, combined with large computer computations, to obtain improved bounds on γ_2. The method of [6] involved a parameter h; empirically, increasing h increased the computation time but gave better upper bounds. Here we show, for arbitrary k, a sufficient condition for a parameterized method to produce a sequence of upper bounds approaching the true value of γ_k, and show that a generalization of the method of [6] meets this condition for all k ≥ 2. While [3, 4] do not explicitly discuss how to parameterize their method, which is based on a concept they call domination, to trade off the tightness of the bound vs. the amount of computation, we discuss a very natural parameterization of their method; for the case of alphabet size k = 2 we conjecture but do not prove that it also meets the sufficient condition and hence also yields a sequence of bounds that converges to the correct value of γ_2. For k > 2, it does not meet our sufficient condition. Thus we leave open the question of whether some method based on the undominated collations of [3, 4] gives bounds converging to the correct value for any k ≥ 2.
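To get a feel for the constant being bounded, the ratio of the expected LCS length to n can be estimated by simulation. The sketch below is a plain Monte Carlo experiment with the textbook dynamic program, not the bounding technique of the paper, and it proves nothing about the constant. Since the expected LCS length is superadditive in n, the finite-n ratio approaches the limit from below, so the printed values should be read as rough underestimates. The function names are illustrative.

```python
import random


def lcs_length(s, t):
    """Length of a longest common subsequence, using O(len(t)) memory."""
    prev = [0] * (len(t) + 1)
    for ch in s:
        cur = [0]
        for j, other in enumerate(t, start=1):
            if ch == other:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def estimate_lcs_ratio(k, n, trials, seed=0):
    """Average LCS length divided by n for random strings over a k-letter alphabet."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        s = [rng.randrange(k) for _ in range(n)]
        t = [rng.randrange(k) for _ in range(n)]
        total += lcs_length(s, t)
    return total / (trials * n)


if __name__ == "__main__":
    for n in (50, 200, 400):
        print(n, estimate_lcs_ratio(k=2, n=n, trials=20))
```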
Citations: 0
Nearly Tight Bounds on the Encoding Length of the Burrows-Wheeler Transform
Pub Date : 2008-01-19 DOI: 10.1137/1.9781611972986.3
Ankur Gupta, R. Grossi, J. Vitter
In this paper, we present a nearly tight analysis of the encoding length of the Burrows-Wheeler Transform (BWT) that is motivated by the text indexing setting. For a text T of n symbols drawn from an alphabet Σ, our encoding scheme achieves bounds in terms of the hth-order empirical entropy H_h of the text, and takes linear time for encoding and decoding. We also describe a lower bound on the encoding length of the BWT that constructs an infinite (non-trivial) class of texts that are among the hardest to compress using the BWT. We then show that our upper bound encoding length is nearly tight with this lower bound for the class of texts we described. In designing our BWT encoding and its lower bound, we also address the t-subset problem; here, the goal is to store a subset of t items drawn from a universe [1..n] using just lg C(n, t) + O(1) bits of space, where C(n, t) is the binomial coefficient. A number of solutions to this basic problem are known; however, encoding or decoding usually requires either O(t) operations on large integers [Knu05, Rus05] or O(n) operations. We provide a novel approach to reduce the encoding/decoding time to just O(t) operations on small integers (of size O(lg n) bits), without increasing the space required.
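The t-subset problem has a classical solution, the combinatorial number system, which maps each t-subset to its rank among all subsets of the same size. The sketch below illustrates that baseline, which works on arbitrarily large integers. It is not the small-integer scheme proposed in the paper, and the decoder uses a simple linear scan rather than an optimized search. The function names are hypothetical.

```python
from math import comb


def encode_subset(subset):
    """Rank of a set of distinct integers from [1..n] among subsets of equal size.

    The rank lies in [0, C(n, t)), so it fits in ceil(lg C(n, t)) bits.
    """
    rank = 0
    for i, item in enumerate(sorted(subset), start=1):
        # Shift to 0-based values; the i-th smallest contributes C(value, i).
        rank += comb(item - 1, i)
    return rank


def decode_subset(rank, t):
    """Inverse of encode_subset for subsets of size t."""
    items = []
    for i in range(t, 0, -1):
        # Greedily pick the largest 0-based value c with C(c, i) <= rank.
        c = i - 1
        while comb(c + 1, i) <= rank:
            c += 1
        rank -= comb(c, i)
        items.append(c + 1)
    return sorted(items)


if __name__ == "__main__":
    code = encode_subset({2, 5, 9})
    print(code, decode_subset(code, 3))
```

Every arithmetic step here manipulates numbers as large as C(n, t), which is the cost on large integers that the abstract contrasts with operations on O(lg n)-bit words.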
Citations: 4