Bulletin of the Society of Sea Water Science, Japan: Latest Articles

Algorithms for new types of fair stable matchings

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-29 DOI: 10.4230/LIPIcs.SEA.2020.20

Frances Cooper, D. Manlove

We study the problem of finding "fair" stable matchings in the Stable Marriage problem with Incomplete lists (SMI). For an instance $I$ of SMI there may be many stable matchings, providing significantly different outcomes for the sets of men and women. We introduce two new notions of fairness in SMI. Firstly, a regret-equal stable matching minimises the difference in ranks of a worst-off man and a worst-off woman, among all stable matchings. Secondly, a min-regret sum stable matching minimises the sum of ranks of a worst-off man and a worst-off woman, among all stable matchings. We present two new efficient algorithms to find stable matchings of these types. Firstly, the Regret-Equal Degree Iteration Algorithm finds a regret-equal stable matching in $O(d_0 nm)$ time, where $d_0$ is the absolute difference in ranks between a worst-off man and a worst-off woman in the man-optimal stable matching, $n$ is the number of men or women, and $m$ is the total length of all preference lists. Secondly, the Min-Regret Sum Algorithm finds a min-regret sum stable matching in $O(d_s m)$ time, where $d_s$ is the difference in the ranks between a worst-off man in each of the woman-optimal and man-optimal stable matchings. Experiments to compare several types of fair optimal stable matchings were conducted and show that the Regret-Equal Degree Iteration Algorithm produces matchings that are competitive with respect to other fairness objectives. On the other hand, existing types of "fair" stable matchings did not provide as close an approximation to regret-equal stable matchings.
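The two fairness measures defined in the abstract can be made concrete with a small sketch. The code below is not the Regret-Equal Degree Iteration Algorithm or the Min-Regret Sum Algorithm. It only computes the man-optimal stable matching with the classical Gale-Shapley procedure and then evaluates the regret of each side, from which the regret-equality score and the regret sum follow. The function names and the toy instance are illustrative assumptions.

```python
def gale_shapley(men_prefs, women_prefs):
    """Man-optimal stable matching; preference lists may be incomplete (SMI)."""
    # w_rank[w][m] is the 1-based rank of man m in woman w's list.
    w_rank = {w: {m: i + 1 for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}  # next woman each man will propose to
    fiance = {}                           # woman -> man currently engaged to her
    free = list(men_prefs)
    while free:
        m = free.pop()
        prefs = men_prefs[m]
        while next_idx[m] < len(prefs):
            w = prefs[next_idx[m]]
            next_idx[m] += 1
            if m not in w_rank.get(w, {}):
                continue                  # w does not find m acceptable
            current = fiance.get(w)
            if current is None:
                fiance[w] = m
                break
            if w_rank[w][m] < w_rank[w][current]:
                fiance[w] = m             # w trades up; her old partner is free again
                free.append(current)
                break
        # A man who exhausts his list stays unmatched.
    return {m: w for w, m in fiance.items()}


def regrets(matching, men_prefs, women_prefs):
    """Rank of the partner of a worst-off man and of a worst-off woman."""
    man_regret = max((men_prefs[m].index(w) + 1 for m, w in matching.items()), default=0)
    woman_regret = max((women_prefs[w].index(m) + 1 for m, w in matching.items()), default=0)
    return man_regret, woman_regret


# Toy instance (hypothetical): each man gets his first choice,
# each woman gets her second choice.
men_prefs = {"m1": ["w1", "w2"], "m2": ["w2", "w1"]}
women_prefs = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}

matching = gale_shapley(men_prefs, women_prefs)
man_r, woman_r = regrets(matching, men_prefs, women_prefs)
regret_equality_score = abs(man_r - woman_r)  # minimised by a regret-equal matching
regret_sum = man_r + woman_r                  # minimised by a min-regret sum matching
```

In this instance the man-optimal matching has regret 1 for the men and 2 for the women, so the quantity $d_0$ from the abstract would be 1.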

Citations: 4

Path Query Data Structures in Practice

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-28 DOI: 10.4230/LIPIcs.SEA.2020.27

Meng He, Serikzhan Kazi

We perform experimental studies on data structures that answer path median, path counting, and path reporting queries in weighted trees. These query problems generalize the well-known range median query problem in arrays, as well as the $2d$ orthogonal range counting and reporting problems in planar point sets, to tree structured data. We propose practical realizations of the latest theoretical results on path queries. Our data structures, which use tree extraction, heavy-path decomposition and wavelet trees, are implemented in both succinct and pointer-based form. Our succinct data structures are further specialized to be plain or entropy-compressed. Through experiments on large sets, we show that succinct data structures for path queries may present a viable alternative to standard pointer-based realizations, in practical scenarios. Compared to naïve approaches that compute the answer by explicit traversal of the query path, our succinct data structures are several times faster in path median queries and perform comparably in path counting and path reporting queries, while being several times more space-efficient. Plain pointer-based realizations of our data structures, requiring a few times more space than the naïve ones, yield up to $100$-times speed-up over them.
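To make the three query types concrete, here is a minimal sketch of the kind of baseline the abstract calls naïve: it answers each query by explicitly walking the query path. It is not the succinct or pointer-based structure studied in the paper. The array-based tree representation (`parent`, `depth`, `weight`), the choice of the lower median, and the toy tree are assumptions made for illustration.

```python
def path_nodes(parent, depth, u, v):
    """Nodes on the tree path from u to v, found by climbing towards the root."""
    left, right = [], []
    while u != v:
        if depth[u] >= depth[v]:
            left.append(u)
            u = parent[u]
        else:
            right.append(v)
            v = parent[v]
    return left + [u] + right[::-1]   # u is now the lowest common ancestor


def path_median(parent, depth, weight, u, v):
    """Lower median of the node weights on the path from u to v."""
    ws = sorted(weight[x] for x in path_nodes(parent, depth, u, v))
    return ws[(len(ws) - 1) // 2]


def path_report(parent, depth, weight, u, v, lo, hi):
    """Nodes on the path whose weight lies in [lo, hi]."""
    return [x for x in path_nodes(parent, depth, u, v) if lo <= weight[x] <= hi]


def path_count(parent, depth, weight, u, v, lo, hi):
    """Number of nodes on the path whose weight lies in [lo, hi]."""
    return len(path_report(parent, depth, weight, u, v, lo, hi))


# Toy tree (hypothetical): node 0 is the root with children 1 and 2;
# node 1 has children 3 and 4.
parent = [-1, 0, 0, 1, 1]
depth = [0, 1, 1, 2, 2]
weight = [5, 1, 9, 7, 3]

nodes = path_nodes(parent, depth, 3, 2)
median = path_median(parent, depth, weight, 3, 2)
count = path_count(parent, depth, weight, 3, 2, 4, 8)
```

Every query costs time proportional to the path length (plus sorting for the median), which is the cost the data structures in the paper are designed to avoid.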

Citations: 1

High-Quality Hierarchical Process Mapping

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-20 DOI: 10.4230/LIPIcs.SEA.2020.4

Marcelo Fonseca Faraj, Alexander van der Grinten, Henning Meyerhenke, J. Träff, Christian Schulz

Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation when processing graphs on a parallel computer. When the topology of a distributed system is known, an important task is to map the blocks of the partition onto the processors such that the overall communication cost is reduced. We present novel multilevel algorithms that integrate graph partitioning and process mapping. Important ingredients of our algorithm include fast label propagation, more localized local search, initial partitioning, as well as a compressed data structure to compute processor distances without storing a distance matrix. Experiments indicate that our algorithms speed up the overall mapping process and, due to the integrated multilevel approach, also find much better solutions in practice. For example, one configuration of our algorithm yields better solutions than the previous state-of-the-art in terms of mapping quality while being a factor of 62 faster. Compared to the currently fastest iterated multilevel mapping algorithm Scotch, we obtain 16% better solutions while investing slightly more running time.
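The idea of computing processor distances without a distance matrix can be sketched under a simplifying assumption: the machine is a regular hierarchy (for example cores per processor, processors per node) and the distance between two processing elements depends only on the lowest level they share. The abstract does not describe the authors' compressed data structure, so the hierarchy description `sizes` and `dists`, the function names, and the toy communication graph below are all hypothetical.

```python
def pe_distance(i, j, sizes, dists):
    """Distance between processing elements i and j in a regular hierarchy.

    sizes[l] is the number of level-l groups inside one level-(l+1) group,
    dists[l] is the distance between two PEs whose lowest shared group is at
    level l+1. No quadratic-size distance matrix is stored.
    """
    if i == j:
        return 0
    for size, dist in zip(sizes, dists):
        i //= size
        j //= size
        if i == j:                 # first level at which both PEs share a group
            return dist
    raise ValueError("PE index outside the described hierarchy")


def communication_cost(edges, mapping, sizes, dists):
    """Sum over edges of communication volume times processor distance."""
    return sum(w * pe_distance(mapping[u], mapping[v], sizes, dists)
               for u, v, w in edges)


# Toy system (hypothetical): 2 cores per processor, 2 processors per node,
# distance 1 inside a processor and 10 between processors.
sizes = [2, 2]
dists = [1, 10]

# Communication graph between 4 blocks: (block u, block v, volume).
edges = [(0, 1, 3), (1, 2, 1), (2, 3, 3)]

good = communication_cost(edges, [0, 1, 2, 3], sizes, dists)  # heavy edges stay local
bad = communication_cost(edges, [0, 2, 1, 3], sizes, dists)   # heavy edges cross processors
```

The two mappings illustrate the objective: placing heavily communicating blocks on nearby processing elements lowers the total cost.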

Citations: 7

Fast and Stable Repartitioning of Road Networks

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-01 DOI: 10.4230/LIPIcs.SEA.2020.26

V. Buchhold, D. Delling, D. Schieferdecker, Michael Wegner

We study the problem of graph partitioning for evolving road networks. While the road network of the world is mostly stable, small updates happen on a relatively frequent basis, as can be observed with the OpenStreetMap project (http://www.openstreetmap.org). For various reasons, professional applications demand that the graph partition stay roughly the same over time, and that changes be limited to areas where graph updates occur. In this work, we define the problem, present algorithms to satisfy the stability needs, and evaluate our techniques on continental-sized road networks. Besides the stability gains, we show that, when the changes are small and local, running our novel techniques is an order of magnitude faster than running graph partitioning from scratch.

2012 ACM Subject Classification: Mathematics of computing → Graph algorithms; Theory of computation → Dynamic graph algorithms

Citations: 2

Engineering Fused Lasso Solvers on Trees

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-01 DOI: 10.4230/LIPIcs.SEA.2020.23

Elias Kuthe, S. Rahmann

The graph fused lasso optimization problem seeks, for a given input signal $y = (y_i)$ on nodes $i \in V$ of a graph $G = (V, E)$, a reconstructed signal $x = (x_i)$ that is both element-wise close to $y$ in quadratic error and also has bounded total variation (sum of absolute differences across edges), thereby favoring regionally constant solutions. An important application is denoising of spatially correlated data, especially for medical images. Currently, fused lasso solvers for general graph input reduce the problem to an iteration over a series of “one-dimensional” problems (on paths or line graphs), which can be solved in linear time. Recently, a direct fused lasso algorithm for tree graphs has been presented, but no implementation of it appears to be available. We here present a simplified exact algorithm and additionally a fast approximation scheme for trees, together with engineered implementations for both. We empirically evaluate their performance on different kinds of trees with distinct degree distributions (simulated trees; spanning trees of road networks, grid graphs of images, social networks). The exact algorithm is very efficient on trees with low node degrees, which covers many naturally arising graphs, while the approximation scheme can perform better on trees with several higher-degree nodes when limiting the desired accuracy to values that are useful in practice.

2012 ACM Subject Classification: Theory of computation → Mathematical optimization; Theory of computation → Dynamic programming; Mathematics of computing → Trees
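The trade-off between quadratic error and total variation can be written down as a single objective. The sketch below evaluates the commonly used penalized form, $\frac{1}{2}\sum_i (x_i - y_i)^2 + \lambda \sum_{(i,j) \in E} |x_i - x_j|$, for a candidate signal. The penalized form, the uniform weight `lam`, and the function name are assumptions, since the abstract does not state the exact formulation; the code only evaluates the objective and is not either of the solvers presented in the paper.

```python
def fused_lasso_objective(x, y, edges, lam):
    """Quadratic fit term plus lam times the total variation across edges."""
    fit = 0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    total_variation = sum(abs(x[i] - x[j]) for i, j in edges)
    return fit + lam * total_variation


# Toy signal on a path 0 - 1 - 2 (a path is a tree). Values are hypothetical.
y = [1.0, 1.2, 5.0]
edges = [(0, 1), (1, 2)]
lam = 0.5

unchanged = fused_lasso_objective(y, y, edges, lam)              # x = y, no fusing
fused = fused_lasso_objective([1.1, 1.1, 5.0], y, edges, lam)    # nodes 0 and 1 fused
```

Fusing the two nearly equal values into one regionally constant segment lowers the objective here, which is the behaviour the abstract describes as favoring regionally constant solutions.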

Citations: 1

Effect of Initial Assignment on Local Search Performance for Max Sat

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-01 DOI: 10.4230/LIPIcs.SEA.2020.8

D. Berend, Yochai Twitto

In this paper, we explore the correlation between the quality of initial assignments provided to local search heuristics and that of the corresponding final assignments. We restrict our attention to the Max r-Sat problem and to one of the leading local search heuristics – Configuration Checking Local Search (CCLS). We use a tailored version of the Method of Conditional Expectations (MOCE) to generate initial assignments of diverse quality. We show that the correlation in question is significant and long-lasting. Namely, even when we delve deeper into the local search, we are still in the shadow of the initial assignment. Thus, under practical time constraints, the quality of the initial assignment is crucial to the performance of local search heuristics. To demonstrate our point, we improve CCLS by combining it with MOCE. Instead of starting CCLS from random initial assignments, we start it from excellent initial assignments, provided by MOCE. Indeed, it turns out that this kind of initialization provides a significant improvement of this state-of-the-art solver. This improvement becomes more and more significant as the instance grows.

2012 ACM Subject Classification: Theory of computation → Theory of randomized search heuristics; Theory of computation → Stochastic approximation
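For readers unfamiliar with the Method of Conditional Expectations, the following is a minimal sketch of its textbook form for Max Sat, not the tailored version used in the paper and not CCLS. Variables are fixed one at a time to the value that maximises the expected number of satisfied clauses when the remaining variables are set uniformly at random. The DIMACS-style clause encoding (positive or negative integers), the function names, and the toy formula are assumptions; the expectation formula also assumes that no variable occurs twice in the same clause.

```python
def expected_satisfied(clauses, assignment):
    """Expected number of satisfied clauses if unassigned variables are random."""
    total = 0.0
    for clause in clauses:
        unassigned = 0
        satisfied = False
        for lit in clause:
            value = assignment.get(abs(lit))
            if value is None:
                unassigned += 1
            elif value == (lit > 0):
                satisfied = True
                break
        # An undecided clause fails only if all its unassigned literals are false.
        total += 1.0 if satisfied else 1.0 - 0.5 ** unassigned
    return total


def moce(clauses, num_vars):
    """Greedy derandomisation: never lets the conditional expectation decrease."""
    assignment = {}
    for var in range(1, num_vars + 1):
        assignment[var] = True
        if_true = expected_satisfied(clauses, assignment)
        assignment[var] = False
        if_false = expected_satisfied(clauses, assignment)
        assignment[var] = if_true >= if_false
    return assignment


def num_satisfied(clauses, assignment):
    return sum(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


# Toy formula (hypothetical): (x1 or x2) and (not x1 or x3)
# and (not x2 or not x3) and (x1 or not x3).
clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
initial = moce(clauses, 3)
```

The resulting assignment satisfies at least as many clauses as the expectation of a uniformly random assignment, which is why it makes a stronger starting point for local search than a random one.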

Citations: 2

Algorithm Engineering for High-Dimensional Similarity Search Problems (Invited Talk)

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-01 DOI: 10.4230/LIPIcs.SEA.2020.1

Martin Aumüller

Similarity search problems in high-dimensional data arise in many areas of computer science such as databases, image analysis, machine learning, and natural language processing. One of the most prominent problems is finding the $k$ nearest neighbors of a data point $q \in \mathbb{R}^d$ in a large set of data points $S \subset \mathbb{R}^d$, under some distance measure such as Euclidean distance. In contrast to lower dimensional settings, we do not know of worst-case efficient data structures for such search problems in high-dimensional data, i.e., data structures that are faster than a linear scan through the data set. However, there is a rich body of (often heuristic) approaches that solve nearest neighbor search problems much faster than such a scan on many real-world data sets. As a necessity, the term solve means that these approaches give approximate results that are close to the true $k$ nearest neighbors. In this talk, we survey recent approaches to nearest neighbor search and related problems. The talk consists of three parts: (1) What makes nearest neighbor search difficult? (2) How do current state-of-the-art algorithms work? (3) What are recent advances regarding similarity search on GPUs, in distributed settings, or in external memory?
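The linear scan that the abstract uses as the reference point is short enough to state directly. The sketch below is that exact baseline under Euclidean distance, not one of the approximate methods surveyed in the talk; the function name and the toy data are assumptions.

```python
import heapq
import math


def knn_linear_scan(points, q, k):
    """Exact k nearest neighbours of q by scanning every point once."""
    # heapq.nsmallest keeps only k candidates, so the scan needs O(k) extra
    # memory; the results come back ordered from nearest to farthest.
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, q))


# Toy data set (hypothetical) in two dimensions.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
neighbours = knn_linear_scan(points, (0.9, 0.9), 2)
```

The scan touches all points for every query, which is the cost that the heuristic and approximate approaches in the talk try to beat on real-world data.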

Citations: 0

Algorithm Engineering for Sorting and Searching, and All That (Invited Talk)

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-01 DOI: 10.4230/LIPIcs.SEA.2020.2

S. Edelkamp

We look at several proposals to engineer the set of fundamental searching and sorting algorithms. Aspects include improving the locality of disk and cache access, tuning efficiency by reducing the number of branch mispredictions, and reducing the leading factors hidden in the Big-Oh notation. These studies in algorithm engineering, in turn, lead to exciting new algorithm designs. On the practical side, we will establish that efficient sorting and searching algorithms are in tight collaboration, as sorting is used for finding duplicates in disk-based search, and heap structures designed for efficient graph search can be exploited in classical and adaptive sorting. We indicate the effects of engineered sorting and searching for combined task and motion planning.

2012 ACM Subject Classification: Theory of computation → Design and analysis of algorithms

Citations: 0

Finding Structurally and Temporally Similar Trajectories in Graphs

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-01 DOI: 10.4230/LIPIcs.SEA.2020.24

R. Grossi, Andrea Marino, Shima Moghtasedi

The analysis of similar motions in a network provides useful information for different applications like route recommendation. We are interested in algorithms to efficiently retrieve trajectories that are similar to a given query trajectory. For this task many studies have focused on extracting the geometrical information of trajectories. In this paper we investigate the properties of trajectories moving along the paths of a network. We provide a similarity function by making use of both the temporal aspect of trajectories and the structure of the underlying network. We propose an approximation technique that offers the top-k similar trajectories with respect to a query trajectory in an efficient way with acceptable precision. We investigate our method over real-world networks, and our experimental results show the effectiveness of the proposed method.

Citations: 3

Indexing Compressed Text: A Tale of Time and Space (Invited Talk)

Bulletin of the Society of Sea Water Science, Japan

Pub Date : 2020-01-01 DOI: 10.4230/LIPIcs.SEA.2020.3

N. Prezza

Text indexing is a classical algorithmic problem that has been studied for over four decades. The earliest optimal-time solution to the problem, the suffix tree [11], dates back to 1973 and requires up to two orders of magnitude more space than the text to be stored. In the year 2000, two breakthrough works [6, 3] showed that this space overhead is not necessary: both the index and the text can be stored in a space proportional to the text’s entropy. These contributions had an enormous impact in bioinformatics: nowadays, the two most widely-used DNA aligners employ compressed indexes [9, 8]. In recent years, it became apparent that entropy had reached its limits: modern datasets (for example, collections of thousands of human genomes) are extremely large but very repetitive and, by its very definition, entropy cannot compress repetitive texts [7]. To overcome this problem, a new generation of indexes based on dictionary compressors (for example, LZ77 and run-length BWT) emerged [7, 5, 1], together with generalizations of the indexing problem to labeled graphs [2, 10, 4]. This talk is a short and friendly survey of the landmarks of this fascinating path that took us from suffix trees to the most modern compressed indexes on labeled graphs.

2012 ACM Subject Classification: Theory of computation → Data compression; Theory of computation → Sorting and searching; Theory of computation → Pattern matching
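Why the run-length BWT suits repetitive texts can be seen on a tiny example. The sketch below builds the Burrows-Wheeler transform by sorting all rotations, which is only practical for short strings and is not how real indexes construct it, and then run-length encodes the result. The `$` terminator, the function names, and the sample strings are assumptions made for illustration.

```python
from itertools import groupby


def bwt(text, terminator="$"):
    """Burrows-Wheeler transform via sorted rotations (quadratic; demo only)."""
    # The terminator must not occur in text and must sort before all its symbols.
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s):
    """Collapse each maximal run of equal symbols into (symbol, length)."""
    return [(symbol, len(list(group))) for symbol, group in groupby(s)]


banana = bwt("banana")
banana_runs = run_length_encode(banana)

# A highly repetitive text: 100 symbols, yet its BWT has very few runs.
repetitive_runs = run_length_encode(bwt("ab" * 50))
```

The number of runs grows with the amount of distinct content rather than with the raw length, which is the property that indexes for repetitive collections exploit.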

Citations: 0
