Bulletin of the Society of Sea Water Science, Japan: Latest Articles

Reduction of Supercooling of Heavy Water with Silver Iodide

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.11457/SWSJ.72.1_41

Shoko Naruke, K. Fujiwara, T. Sugo, S. Kawai-Noma, D. Umeno, Kyoichi Saito

Citations: 0

Speeding up Dualization in the Fredman-Khachiyan Algorithm B

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.4230/LIPIcs.SEA.2018.6

N. Sedaghat, Tamon Stephen, L. Chindelevitch

The problem of computing the dual of a monotone Boolean function f is a fundamental problem in theoretical computer science with numerous applications. The related problem of duality testing (given two monotone Boolean functions f and g, declare that they are dual or provide a certificate that shows they are not) has a complexity that is not yet known. However, two quasi-polynomial time algorithms for it, often referred to as FK-A and FK-B, were proposed by Fredman and Khachiyan in 1996, with the latter having a better complexity guarantee. These can be naturally used as a subroutine in computing the dual of f. In this paper, we investigate this use of the FK-B algorithm for the computation of the dual of a monotone Boolean function, and present practical improvements to its performance. First, we show how FK-B can be modified to produce multiple certificates (Boolean vectors on which the functions defined by the original f and the current dual g do not provide outputs consistent with duality). Second, we show how the number of redundancy tests, one of the more costly and time-consuming steps of FK-B, can be substantially reduced in this context. Lastly, we describe a simple memoization technique that avoids the solution of multiple identical subproblems. We test our approach on a number of inputs coming from computational biology as well as combinatorics. These modifications provide a substantial speed-up, as much as an order of magnitude, for FK-B dualization relative to a naive implementation. Although other methods may end up being faster in practice, our work paves the way for a principled optimization process for the generation of monotone Boolean functions and their duals from an oracle.

2012 ACM Subject Classification: Computing methodologies → Boolean algebra algorithms
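To make the notion of a duality certificate concrete, here is a minimal brute-force sketch. It is not FK-B and makes no attempt at quasi-polynomial running time; it simply enumerates all 2^n Boolean vectors. The representation (a monotone function as a list of sets of variable indices, read as a DNF) and the names `eval_dnf` and `find_certificates` are assumptions made for this illustration.

```python
from itertools import product


def eval_dnf(terms, x):
    """Evaluate a monotone DNF: true if some term has all of its variables set in x."""
    return any(all(x[i] for i in term) for term in terms)


def find_certificates(f_terms, g_terms, n):
    """Return every x in {0,1}^n that violates duality.

    f and g are dual exactly when f(x) != g(complement of x) for all x,
    so any x with f(x) == g(complement of x) is a certificate of non-duality.
    """
    certificates = []
    for x in product((0, 1), repeat=n):
        x_bar = tuple(1 - v for v in x)
        if eval_dnf(f_terms, x) == eval_dnf(g_terms, x_bar):
            certificates.append(x)
    return certificates


# f = (x0 AND x1) OR (x1 AND x2); its dual is x1 OR (x0 AND x2).
f = [{0, 1}, {1, 2}]
g_complete = [{1}, {0, 2}]
g_partial = [{1}]  # a dual that is still missing the term {0, 2}

no_certs = find_certificates(f, g_complete, 3)
certs = find_certificates(f, g_partial, 3)
```

With the complete dual no certificate exists; with the partial dual the returned vectors show where a term is still missing, which is the role multiple certificates play when a dual is built up incrementally.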

Citations: 3

Monovalent Selective Cation-exchange Membranes Prepared from PVA-based Block Copolymers

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.11457/SWSJ.72.6_338

Saeko Harada, Yuriko Kakihana, Mitsuru Higa

Citations: 1

High-performance Liquid Chromatographic Evaluation of Effect of Metal Ion on Amyloid Fibril Formation

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.11457/SWSJ.72.6_336

K. Nagashima, H. Minamisawa, T. Nakagama, K. Saitoh, Hiromichi Asamoto

Citations: 0

Real-Time Traffic Assignment Using Fast Queries in Customizable Contraction Hierarchies

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.4230/LIPIcs.SEA.2018.27

V. Buchhold, P. Sanders, D. Wagner

Given an urban road network and a set of origin-destination (OD) pairs, the traffic assignment problem asks for the traffic flow on each road segment. A common solution employs a feasible-direction method, where the direction-finding step requires many shortest-path computations. In this paper, we significantly accelerate the computation of flow patterns, enabling interactive transportation and urban planning applications. We achieve this by revisiting and carefully engineering known speedup techniques for shortest paths, and combining them with customizable contraction hierarchies. In particular, our accelerated elimination tree search is more than an order of magnitude faster for local queries than the original algorithm, and our centralized search speeds up batched point-to-point shortest paths by a factor of up to 6. These optimizations are independent of traffic assignment and can be generally used for (batched) point-to-point queries. In contrast to prior work, our evaluation uses real-world data for all parts of the problem. On a metropolitan area encompassing more than 2.7 million inhabitants, we reduce the flow-pattern computation for a typical two-hour morning peak from 76.5 to 10.5 seconds on one core, and 4.3 seconds on four cores. This represents a speedup of 18 over the state of the art, and three orders of magnitude over the Dijkstra-based baseline.
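The direction-finding step mentioned in the abstract is commonly an all-or-nothing assignment: each OD pair sends its whole demand along a current shortest path. The sketch below illustrates that step with plain Dijkstra searches, i.e. the kind of baseline the paper accelerates, not its customizable-contraction-hierarchy queries. The graph encoding and the names `shortest_path` and `all_or_nothing` are assumptions made for this example.

```python
import heapq
from collections import defaultdict


def shortest_path(graph, source, target):
    """Dijkstra's algorithm. graph maps node -> list of (neighbour, travel_time).

    Returns the list of nodes on a shortest path, or None if target is unreachable.
    """
    dist = {source: 0.0}
    parent = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return path[::-1]


def all_or_nothing(graph, od_demands):
    """Load each OD pair's full demand onto one shortest path; return flow per road segment."""
    flow = defaultdict(float)
    for (origin, destination), demand in od_demands.items():
        path = shortest_path(graph, origin, destination)
        if path is None:
            continue
        for u, v in zip(path, path[1:]):
            flow[(u, v)] += demand
    return dict(flow)


graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 1.0)], "c": []}
flows = all_or_nothing(graph, {("a", "c"): 100.0, ("b", "c"): 50.0})
```

In a full feasible-direction method this step is repeated many times with travel times updated from the current flows, which is why the per-query cost of the shortest-path routine dominates.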

Citations: 13

Engineering Motif Search for Large Motifs

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.4230/LIPIcs.SEA.2018.28

P. Kaski, Juho Lauri, Suhas Thejaswi

Given a vertex-colored graph H and a multiset M of colors as input, the graph motif problem asks us to decide whether H has a connected induced subgraph whose multiset of colors agrees with M. The graph motif problem is NP-complete but known to admit randomized algorithms based on constrained multilinear sieving over GF($2^b$) that run in time $O(2^k k^2 m \cdot M(2^b))$ and with a false-negative probability of at most $k/2^{b-1}$ for a connected m-edge input and a motif of size k. On modern CPU microarchitectures such algorithms have practical edge-linear scalability to inputs with billions of edges for small motif sizes, as demonstrated by Björklund, Kaski, Kowalik, and Lauri [ALENEX'15]. This scalability to large graphs prompts the dual question whether it is possible to scale to large motif sizes. We present a vertex-localized variant of the constrained multilinear sieve that enables us to obtain, in time $O(2^k k^2 m \cdot M(2^b))$ and for every vertex simultaneously, whether the vertex participates in at least one match with the motif, with a per-vertex probability of at most $k/2^{b-1}$ for a false negative. Furthermore, the algorithm is easily vector-parallelizable for up to $2^k$ threads, and parallelizable for up to $2^k n$ threads, where n is the number of vertices in H. Here $M(2^b)$ is the time complexity to multiply in GF($2^b$). We demonstrate with an open-source implementation that our variant of constrained multilinear sieving can be engineered for vector-parallel microarchitectures to yield hardware utilization that is bound by the available memory bandwidth. Our main engineering contributions are (a) a version of the recurrence for tightly labeled arborescences that can be executed as a sequence of memory-and-arithmetic coalescent parallel workloads on multiple GPUs, and (b) a bit-sliced low-level implementation for arithmetic in characteristic 2 to support (a).
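For readers unfamiliar with the problem statement, the following sketch decides the graph motif problem by exhaustive search over vertex subsets. It is deliberately naive (exponential in the graph size, not just in the motif size) and is unrelated to the multilinear sieve the paper engineers; the adjacency-dict encoding and the names `is_connected` and `has_motif` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations


def is_connected(vertices, adj):
    """Check that the subgraph induced by a non-empty vertex set is connected (DFS)."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def has_motif(adj, color, motif):
    """Decide whether some connected induced subgraph has exactly the motif's multiset of colors."""
    k = len(motif)
    if k == 0:
        return True  # convention chosen for this sketch: the empty motif always matches
    target = Counter(motif)
    for subset in combinations(adj, k):
        if Counter(color[v] for v in subset) == target and is_connected(subset, adj):
            return True
    return False


# Path 0 - 1 - 2 - 3 with colors red, green, red, blue.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
color = {0: "red", 1: "green", 2: "red", 3: "blue"}

match_rgr = has_motif(adj, color, ["red", "green", "red"])
match_rr = has_motif(adj, color, ["red", "red"])
```

The second query fails because the two red vertices are not adjacent, so no connected induced subgraph consists of exactly two red vertices.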

Citations: 5

A Computational Investigation on the Strength of Dantzig-Wolfe Reformulations

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.4230/LIPIcs.SEA.2018.11

Michael Bastubbe, M. Lübbecke, Jonas T. Witt

In Dantzig-Wolfe reformulation of an integer program one convexifies a subset of the constraints, leading to potentially stronger dual bounds from the respective linear programming relaxation. As the subset can be chosen arbitrarily, this includes the trivial cases of convexifying no and all constraints, resulting in a weakest and strongest reformulation, respectively. Our computational study aims at a better understanding of what happens in between these extremes. For a collection of integer programs with few constraints we compute, optimally solve, and evaluate the relaxations of all possible (exponentially many) Dantzig-Wolfe reformulations (with mild extensions to larger models from the MIPLIBs). We observe that only a tiny number of different dual bounds actually occur and that only a few inclusion-wise minimal representatives exist for each. This aligns with considerably different impacts of individual constraints on strengthening the relaxation, some of which have almost no influence. In contrast, types of constraints that are convexified in textbook reformulations have a larger effect. We relate our experiments to what could be called a hierarchy of Dantzig-Wolfe reformulations.
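As a reminder of what "convexifying a subset of the constraints" means, the textbook form of the reformulation for a minimization problem can be written as follows. The symbols (A, b for the constraints kept in the master problem, D, d for the convexified ones, and x_p for the points of X) are notation assumed here and do not come from the abstract; X is assumed bounded so that its convex hull is spanned by finitely many points.

```latex
\begin{align*}
z_{\mathrm{IP}} &= \min\bigl\{\, c^{\top}x : Ax \ge b,\; x \in X \,\bigr\},
  \qquad X = \bigl\{\, x \in \mathbb{Z}^{n}_{+} : Dx \ge d \,\bigr\},\\
z_{\mathrm{DW}} &= \min\Bigl\{\, \sum_{p \in P} (c^{\top}x_p)\,\lambda_p \;:\;
  \sum_{p \in P} (Ax_p)\,\lambda_p \ge b,\;
  \sum_{p \in P} \lambda_p = 1,\;
  \lambda \ge 0 \,\Bigr\},\\
z_{\mathrm{LP}} &\le z_{\mathrm{DW}} \le z_{\mathrm{IP}}.
\end{align*}
```

Here z_LP denotes the bound of the ordinary linear programming relaxation. Moving a constraint from Ax ≥ b into the set X can only tighten z_DW, which is why each choice of convexified subset yields its own dual bound between the two extremes.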

Citations: 3

Isomorphism Test for Digraphs with Weighted Edges

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.4230/LIPIcs.SEA.2018.30

A. Piperno

Colour refinement is at the heart of all the most efficient graph isomorphism software packages. In this paper we present a method for extending the applicability of refinement algorithms to directed graphs with weighted edges. We use Traces as the reference software, but the proposed solution is easily transferable to any other refinement-based graph isomorphism tool in the literature. We substantiate the claim that the performance of the original algorithm remains substantially unchanged by showing experiments for some classes of benchmark graphs.
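A generic colour refinement loop that takes edge directions and weights into account can be sketched as below. This is one straightforward way to do it (fold the weight and direction into each vertex's neighbourhood signature) and is not claimed to be the method the paper proposes or what Traces does internally; the function name `colour_refinement` and the edge-list encoding are assumptions made for this example.

```python
def colour_refinement(n, edges, initial=None):
    """Refine a vertex colouring of a weighted digraph until it is stable.

    n       -- number of vertices, labelled 0 .. n-1
    edges   -- list of (u, v, weight) for a directed edge u -> v
    initial -- optional list of starting colours (defaults to one colour for all)
    Returns a list of integer colours, one per vertex.
    """
    colour = list(initial) if initial is not None else [0] * n
    out_nb = [[] for _ in range(n)]
    in_nb = [[] for _ in range(n)]
    for u, v, w in edges:
        out_nb[u].append((v, w))
        in_nb[v].append((u, w))

    while True:
        # Signature: own colour plus sorted multisets of (weight, neighbour colour),
        # kept separately for outgoing and incoming edges.
        signature = [
            (
                colour[v],
                tuple(sorted((w, colour[u]) for u, w in out_nb[v])),
                tuple(sorted((w, colour[u]) for u, w in in_nb[v])),
            )
            for v in range(n)
        ]
        palette = {sig: i for i, sig in enumerate(sorted(set(signature)))}
        new_colour = [palette[sig] for sig in signature]
        # The old colour is part of the signature, so classes can only split;
        # an unchanged class count therefore means the partition is stable.
        if len(set(new_colour)) == len(set(colour)):
            return new_colour
        colour = new_colour


# A directed 3-cycle: uniform weights leave all vertices equivalent,
# while one heavier edge separates all three.
uniform = colour_refinement(3, [(0, 1, 1), (1, 2, 1), (2, 0, 1)])
skewed = colour_refinement(3, [(0, 1, 1), (1, 2, 1), (2, 0, 5)])
```

Two graphs whose stable colourings have different class-size profiles cannot be isomorphic; equal profiles are only a necessary condition, which is why isomorphism tools combine refinement with search.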

Citations: 7

Fast Spherical Drawing of Triangulations: An Experimental Study of Graph Drawing Tools

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.4230/LIPIcs.SEA.2018.24

L. C. Aleardi, Gaspard Denis, Éric Fusy

We consider the problem of computing a spherical crossing-free geodesic drawing of a planar graph: this problem, as well as the closely related spherical parameterization problem, has attracted a lot of attention in the last two decades both in theory and in practice, motivated by a number of applications ranging from texture mapping to mesh remeshing and morphing. Our main concern is to design and implement a linear time algorithm for the computation of spherical drawings provided with theoretical guarantees. While not being aesthetically pleasing, our method is extremely fast and can be used as an initial placer for spherical iterative methods and spring embedders. We provide an experimental comparison with initial placers based on planar Tutte parameterization. Finally we explore the use of spherical drawings as initial layouts for (Euclidean) spring embedders: experimental evidence shows that this greatly helps to untangle the layout and to reach better local minima.

Citations: 0

Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VCF Files On-line

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2018-01-01 DOI: 10.4230/LIPIcs.SEA.2018.16

S. Pissis, Ahmad Retha

An elastic-degenerate string is a sequence of n sets of strings of total length N. It has been introduced to represent multiple sequence alignments of closely-related sequences in a compact form. For a standard pattern of length m, pattern matching in an elastic-degenerate text can be solved on-line in time $O(nm^2 + N)$ with pre-processing time and space $O(m)$ (Grossi et al., CPM 2017). A fast bit-vector algorithm requiring time $O(N \cdot \lceil m/w \rceil)$ with pre-processing time and space $O(m \cdot \lceil m/w \rceil)$, where w is the size of the computer word, was also presented. In this paper we consider the same problem for a set of patterns of total length M. A straightforward generalization of the existing bit-vector algorithm would require time $O(N \cdot \lceil M/w \rceil)$ with pre-processing time and space $O(M \cdot \lceil M/w \rceil)$, which is prohibitive in practice. We present a new on-line $O(N \cdot \lceil M/w \rceil)$-time algorithm with pre-processing time and space $O(M)$. We present experimental results using both synthetic and real data demonstrating the performance of the algorithm. We further demonstrate a real application of our algorithm in a pipeline for discovery and verification of minimal absent words (MAWs) in the human genome showing that a significant number of previously discovered MAWs are in fact false-positives when a population's variants are considered.

2012 ACM Subject Classification: Theory of computation → Pattern matching
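The sketch below shows what on-line matching of a single pattern against an elastic-degenerate text involves: the text is read one set of strings at a time, and only the pattern prefixes that are still "open" at the segment boundary are carried forward. It uses plain string comparisons rather than the bit-vector technique of the paper, handles one pattern rather than a dictionary, and assumes a non-empty pattern; the name `ed_match` and the list-of-sets encoding are assumptions made for this illustration.

```python
def ed_match(segments, pattern):
    """Report the indices of segments in which an occurrence of pattern ends.

    segments -- the elastic-degenerate text: a list of sets of strings
                (a set may contain the empty string, modelling a deletion)
    pattern  -- a non-empty string
    """
    m = len(pattern)
    active = set()  # prefix lengths j (0 < j < m) matched up to the previous segment boundary
    ends = []
    for idx, segment in enumerate(segments):
        next_active = set()
        found = False
        for s in segment:
            # Case 1: the whole pattern lies inside this one string.
            if pattern in s:
                found = True
            # Case 2: extend partial matches carried over from earlier segments.
            for j in active:
                rest = pattern[j:]
                if len(s) < len(rest):
                    if rest.startswith(s):
                        next_active.add(j + len(s))  # still incomplete
                elif s.startswith(rest):
                    found = True
            # Case 3: a suffix of s opens a new partial match (a proper prefix of pattern).
            for j in range(1, min(len(s), m - 1) + 1):
                if s.endswith(pattern[:j]):
                    next_active.add(j)
        if found:
            ends.append(idx)
        active = next_active
    return ends


# The text below encodes the three sequences ACGTA, ACTTA and ACTA.
text = [{"AC"}, {"G", "T", ""}, {"TA"}]
hits_ctt = ed_match(text, "CTT")
hits_cta = ed_match(text, "CTA")
hits_ga = ed_match(text, "GA")
```

The pattern CTA is found only because the middle segment may be empty, which is the "elastic" part of the representation; GA is absent from every encoded sequence and is correctly rejected.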

Citations: 13
