
Journal of Experimental Algorithmics: Latest Publications

Algorithms and Data Structures for Hyperedge Queries
Q2 Mathematics | Pub Date: 2022-10-29 | DOI: 10.1145/3568421
Jules Bertrand, F. Dufossé, Somesh Singh, B. Uçar
We consider the problem of querying the existence of hyperedges in hypergraphs. More formally, given a hypergraph, we need to answer queries of the form: “Does the following set of vertices form a hyperedge in the given hypergraph?” Our aim is to set up data structures based on hashing to answer these queries as fast as possible. We propose an adaptation of a well-known perfect hashing approach for the problem at hand. We analyze the space and runtime complexity of the proposed approach and experimentally compare it with the state-of-the-art hashing-based solutions. Experiments demonstrate the efficiency of the proposed approach with respect to the state-of-the-art.
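As a concrete baseline for the query defined above, a vertex set can be looked up in an ordinary hash set of hyperedges; the following sketch is illustrative only and is not the perfect-hashing structure the paper proposes.

```python
# A minimal baseline sketch (not the paper's perfect-hashing scheme): store
# each hyperedge as a frozenset in a hash set, so a membership query costs
# one hash of the queried vertex set in expectation.
hyperedges = [{1, 2, 3}, {2, 4}, {1, 3, 5, 6}]
index = {frozenset(e) for e in hyperedges}

def is_hyperedge(vertices):
    """Does the given set of vertices form a hyperedge?"""
    return frozenset(vertices) in index

assert is_hyperedge([3, 2, 1])   # vertex order does not matter
assert not is_hyperedge([1, 2])  # a subset of a hyperedge is not a hyperedge
```

A perfect-hashing scheme such as the one studied in the paper improves on this by avoiding collisions and reducing space overhead.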
Citations: 1
On Computing the Diameter of (Weighted) Link Streams
Q2 Mathematics | Pub Date: 2022-10-28 | DOI: 10.1145/3569168
M. Calamai, P. Crescenzi, Andrea Marino
A weighted link stream is a pair (V, 𝔼) comprising V, the set of nodes, and 𝔼, the list of temporal edges (u,v,t,λ), where u,v are two nodes in V, t is the starting time of the temporal edge, and λ is its travel time. Using this model, different notions of diameter can be defined, referring to the following distances: earliest arrival time, latest departure time, fastest time, and shortest time. After proving that none of these diameters can be computed in time sub-quadratic in the number of temporal edges, we propose different algorithms (inspired by the approach used for computing the diameter of graphs) that allow us to compute, in practice very efficiently, the diameter of quite large real-world weighted link streams for several definitions of the diameter. In the case of the fastest time distance and of the shortest time distance, we introduce the notion of pivot-diameter to deal with the fact that temporal paths cannot be concatenated in general. The pivot-diameter is the diameter restricted to the set of pairs of nodes connected by a path that passes through a pivot (that is, a node at a given time instant). We prove that the problem of finding an optimal set of pivots, in terms of the number of pairs connected, is NP-hard, and we propose and experimentally evaluate several simple and fast heuristics for computing “good” pivot sets. All the proposed algorithms (for computing either the diameter or the pivot-diameter) very often require only a small number of single-source (or single-target) best path computations. We verify the effectiveness of our approaches by means of an extensive set of experiments on real-world link streams. We also experimentally show that the temporal version of the well-known 2-sweep technique for computing a lower bound on the diameter of a graph is quite effective in the case of weighted link streams, very often returning tight bounds.
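For the earliest arrival time distance, a single-source computation can be sketched as a one-pass scan over temporal edges sorted by departure time (assuming strictly positive travel times); this generic sketch is not the authors' diameter algorithms.

```python
def earliest_arrival(temporal_edges, source, t_start=0):
    """Earliest-arrival times from `source` in a weighted link stream.

    temporal_edges: list of (u, v, t, lam), meaning one can leave u at
    time t and arrive at v at time t + lam.  With strictly positive
    travel times, a single pass over edges sorted by departure time t
    suffices: an edge extends a temporal path only if the source was
    reached at u no later than its departure.
    """
    arrival = {source: t_start}
    for u, v, t, lam in sorted(temporal_edges, key=lambda e: e[2]):
        if arrival.get(u, float("inf")) <= t:
            arrival[v] = min(arrival.get(v, float("inf")), t + lam)
    return arrival

edges = [(0, 1, 1, 2), (1, 2, 4, 1), (0, 2, 2, 10)]
print(earliest_arrival(edges, 0))  # {0: 0, 1: 3, 2: 5}
```

The earliest-arrival diameter would then be the maximum such distance over all source nodes, which is why reducing the number of single-source computations matters.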
Citations: 4
Direction-optimizing Label Propagation Framework for Structure Detection in Graphs: Design, Implementation, and Experimental Analysis
Q2 Mathematics | Pub Date: 2022-10-27 | DOI: 10.1145/3564593
Xu T. Liu, A. Lumsdaine, M. Halappanavar, K. Barker, A. Gebremedhin
Label Propagation is not only a well-known machine learning algorithm for classification but also an effective method for discovering communities and connected components in networks. We propose a new Direction-optimizing Label Propagation Algorithm (DOLPA) framework that enhances the performance of the standard Label Propagation Algorithm (LPA), increases its scalability, and extends its versatility and application scope. As a central feature, the DOLPA framework relies on the use of frontiers and alternates between label push and label pull operations to attain high performance. It is formulated in such a way that the same basic algorithm can be used for finding either communities or connected components in graphs simply by changing the objective function used. Additionally, DOLPA has parameters for tuning the processing order of the vertices in a graph to reduce the number of edges visited and improve the quality of the solution obtained. We present the design and implementation of the enhanced algorithm as well as our shared-memory parallelization of it using OpenMP. We also present an extensive experimental evaluation of our implementations using the LFR benchmark and real-world networks drawn from various domains. Compared with an implementation of LPA for community detection available in widely used network analysis software, we achieve up to five times the F-Score while maintaining similar runtime for graphs with overlapping communities. We also compare DOLPA against an implementation of the Louvain method for community detection using the same LFR graphs and show that DOLPA achieves about three times the F-Score at just 10% of the runtime. For connected component decomposition, our algorithm achieves order-of-magnitude speedups over the basic LP-based algorithm on large-diameter graphs, up to 13.2× speedup over the Shiloach-Vishkin algorithm, and up to 1.6× speedup over Afforest on an Intel Xeon processor using 40 threads.
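The standard label propagation baseline that DOLPA enhances can be sketched for connected components as follows; this is a plain sequential pull-based variant, without DOLPA's frontiers or direction optimization, shown only to fix ideas.

```python
def connected_components_lp(n, edges):
    """Plain label-propagation baseline for connected components (not the
    DOLPA framework itself): every vertex repeatedly pulls the minimum
    label in its closed neighborhood until no label changes, so each
    component converges to the minimum vertex id it contains."""
    labels = list(range(n))
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    changed = True
    while changed:
        changed = False
        for u in range(n):
            best = min([labels[u]] + [labels[v] for v in adj[u]])
            if best < labels[u]:
                labels[u] = best
                changed = True
    return labels

print(connected_components_lp(5, [(0, 1), (1, 2), (3, 4)]))  # [0, 0, 0, 3, 3]
```

Swapping the objective (minimum label for components, most frequent neighbor label for communities) is the single change the framework exploits, as described above.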
Citations: 1
Toward an Understanding of Long-tailed Runtimes of SLS Algorithms
Q2 Mathematics | Pub Date: 2022-10-24 | DOI: 10.1145/3569170
Jan-Hendrik Lorenz, Florian Wörz
The satisfiability problem (SAT) is one of the most famous problems in computer science. Traditionally, its NP-completeness has been used to argue that SAT is intractable. However, there have been tremendous practical advances in recent years that allow modern SAT solvers to solve instances with millions of variables and clauses. A particularly successful paradigm in this context is stochastic local search (SLS). In most cases, there are different ways of formulating the underlying SAT problem. While it is known that the precise formulation of the problem has a significant impact on the runtime of solvers, finding a helpful formulation is generally non-trivial. The recently introduced GapSAT solver [Lorenz and Wörz 2020] demonstrated a successful way to improve the performance of an SLS solver on average by learning additional information that logically follows from the original problem. Still, there were also cases in which the performance slightly deteriorated. This justifies in-depth investigations into how learning logical implications affects the runtimes of SLS algorithms. In this work, we propose a method for generating logically equivalent problem formulations, generalizing the ideas of GapSAT. This method allows a rigorous mathematical study of the effect on the runtime of SLS SAT solvers. Initially, we conduct empirical investigations. If the modification process is treated as random, then Johnson SB distributions provide a perfect characterization of the hardness. Since the observed Johnson SB distributions approach lognormal distributions, our analysis also suggests that the hardness is long-tailed. As a second contribution, we theoretically prove that restarts are useful for long-tailed distributions. This implies that incorporating additional restarts can further refine all algorithms employing the above-mentioned modification technique. Since the empirical studies compellingly suggest that the runtime distributions follow Johnson SB distributions, we also investigate this property on a theoretical basis. We succeed in proving that the runtimes for the special case of Schöning's random walk algorithm [Schöning 2002] are approximately Johnson SB distributed.
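Schöning's random walk algorithm, the special case analyzed theoretically, can be sketched as follows; this is a plain sequential version for 3-SAT with an illustrative literal encoding and parameter choices, not the exact variant studied in the paper.

```python
import random

def schoening(clauses, n, max_tries=200, seed=1):
    """Sketch of Schöning's random-walk algorithm for 3-SAT.  Literals are
    nonzero ints over variables 1..n; -x means variable x is False.  Each
    try restarts from a fresh uniform random assignment and performs up to
    3n flips, each flipping the variable of a random literal drawn from a
    random unsatisfied clause."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        assign = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(3 * n):
            unsat = [c for c in clauses
                     if not any((lit > 0) == assign[abs(lit)] for lit in c)]
            if not unsat:
                return assign  # all clauses satisfied
            lit = rng.choice(rng.choice(unsat))
            assign[abs(lit)] = not assign[abs(lit)]
    return None

# (x1 ∨ x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ ¬x3) ∧ (x1 ∨ ¬x2 ∨ x3)
clauses = [(1, 2, 3), (-1, 2, -3), (1, -2, 3)]
sol = schoening(clauses, 3)
assert sol is not None
```

The restarts after 3n flips are exactly the mechanism whose benefit for long-tailed runtime distributions the article proves.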
Citations: 0
Incremental Updates of Generalized Hypertree Decompositions
Q2 Mathematics | Pub Date: 2022-09-21 | DOI: 10.1145/3578266
G. Gottlob, Matthias Lanzinger, David M Longo, Cem Okulmus
Structural decomposition methods, such as generalized hypertree decompositions, have been successfully used for solving constraint satisfaction problems (CSPs). As decompositions can be reused to solve CSPs with the same constraint scopes, investing resources in computing good decompositions is beneficial, even though the computation itself is hard. Unfortunately, current methods need to compute a completely new decomposition even if the scopes change only slightly. In this article, we take the first steps toward solving the problem of updating the decomposition of a CSP P so that it becomes a valid decomposition of a new CSP P' produced by some modification of P. Even though the problem is hard in theory, we propose and implement a framework for effectively updating generalized hypertree decompositions. The experimental evaluation of our algorithm strongly suggests practical applicability.
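One necessary condition that an update must restore can be sketched as a simple coverage check: every constraint scope of the modified CSP must still be contained in some bag of the decomposition. This illustrative check (the names and data layout are hypothetical) only locates where a decomposition breaks; the paper's framework goes much further and actually repairs it.

```python
def uncovered_scopes(bags, scopes):
    """Return the constraint scopes not contained in any bag of the
    decomposition.  After a CSP is modified, these are the scopes around
    which the decomposition needs updating; an empty result means this
    particular validity condition still holds."""
    return [s for s in scopes if not any(set(s) <= set(b) for b in bags)]

bags = [{"x", "y", "z"}, {"z", "w"}]
# Suppose a modification added the constraint scope ("x", "w"):
print(uncovered_scopes(bags, [("x", "y"), ("z", "w"), ("x", "w")]))
# [('x', 'w')]
```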
Citations: 0
A Data-dependent Approach for High Dimensional (Robust) Wasserstein Alignment
Q2 Mathematics | Pub Date: 2022-09-07 | DOI: 10.1145/3604910
Hu Ding, Wenjie Liu, Mingquan Ye
Many real-world problems can be formulated as the alignment of two geometric patterns. Previously, a great amount of research focused on the alignment of 2D or 3D patterns in the field of computer vision. Recently, the alignment problem in high dimensions has found several novel applications in practice. However, the research is still rather limited on the algorithmic side. To the best of our knowledge, most existing approaches are simple extensions of their counterparts for the 2D and 3D cases and often suffer from issues such as high computational complexity. In this paper, we propose an effective framework for compressing high-dimensional geometric patterns. Any existing alignment method can be applied to the compressed patterns, and the time complexity can be significantly reduced. Our idea is inspired by the observation that high-dimensional data often has a low intrinsic dimension. Our framework is a “data-dependent” approach whose complexity depends on the intrinsic dimension of the input data. Our experimental results reveal that running the alignment algorithm on compressed patterns achieves qualities similar to the results on the original patterns, while the runtimes (including the time cost of compression) are substantially lower.
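The compress-then-align pipeline can be illustrated with a generic random projection. Note the contrast: this sketch is data-oblivious, whereas the paper's framework is data-dependent (its complexity tracks the intrinsic dimension), so the code below only illustrates the overall shape of the approach, not the proposed method.

```python
import random

def random_projection(points, k, seed=0):
    """Illustrative compression step (not the paper's data-dependent
    method): a Johnson-Lindenstrauss-style Gaussian random projection
    from d dimensions down to k.  When the point set has low intrinsic
    dimension, a small k already preserves pairwise structure well, and
    any alignment method can then run on the k-dimensional patterns."""
    d = len(points[0])
    rng = random.Random(seed)
    proj = [[rng.gauss(0.0, 1.0 / k ** 0.5) for _ in range(d)]
            for _ in range(k)]
    return [[sum(row[i] * p[i] for i in range(d)) for row in proj]
            for p in points]

cloud = [[float(i + j) for j in range(50)] for i in range(100)]  # 50-D points
compressed = random_projection(cloud, 8)
print(len(compressed), len(compressed[0]))  # 100 8
```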
Citations: 0
O’Reach: Even Faster Reachability in Large Graphs
Q2 Mathematics | Pub Date: 2022-08-17 | DOI: 10.1145/3556540
Kathrin Hanauer, Christian Schulz, Jonathan Trummer
One of the most fundamental problems in computer science is the reachability problem: given a directed graph and two vertices s and t, can s reach t via a path? We revisit existing techniques and combine them with new approaches to support a large portion of reachability queries in constant time using a linear-sized reachability index. Our new algorithm O’Reach can easily be combined with previously developed solutions for the problem or run standalone. In a detailed experimental study, we compare a variety of algorithms with respect to their index-building and query times as well as their memory footprint on a diverse set of instances. Our experiments indicate that query performance often depends strongly not only on the type of graph but also on the result, i.e., reachable or unreachable. Furthermore, we show that previous algorithms are significantly sped up when combined with our new approach in almost all scenarios. Surprisingly, due to cache effects, a higher investment in space does not necessarily pay off: reachability queries can often be answered even faster than single memory accesses in a precomputed full reachability matrix.
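One cheap observation of the kind such indexes exploit can be sketched as follows: in a DAG, a vertex cannot reach another that precedes it in a topological order, which answers many negative queries in constant time before falling back to a search. This is an illustrative baseline, not the O’Reach index.

```python
from graphlib import TopologicalSorter

def build_rank(succ):
    """Rank each vertex of a DAG by a topological order.  `succ` maps a
    vertex to its successors; TopologicalSorter expects predecessors, so
    we invert the adjacency first."""
    preds = {u: set() for u in succ}
    for u, vs in succ.items():
        for v in vs:
            preds.setdefault(v, set()).add(u)
    return {v: i for i, v in enumerate(TopologicalSorter(preds).static_order())}

def reaches(succ, rank, s, t):
    """Answer "can s reach t?": constant-time negative answer when s comes
    after t in the topological order, otherwise a plain DFS fallback."""
    if rank[s] > rank[t]:
        return False
    stack, seen = [s], {s}
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False

dag = {0: [1, 2], 1: [3], 2: [3], 3: []}
rank = build_rank(dag)
assert reaches(dag, rank, 0, 3)
assert not reaches(dag, rank, 3, 0)  # pruned without any traversal
```

O’Reach combines many such observations so that most queries, positive and negative, never reach the fallback search.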
Citations: 1
Parallel Five-cycle Counting Algorithms
Q2 Mathematics | Pub Date: 2022-08-16 | DOI: 10.1145/3556541
Jessica Shi, Louisa Ruixue Huang, Julian Shun
Counting the frequency of subgraphs in large networks is a classic research question that reveals the underlying substructures of these networks for important applications. However, subgraph counting is a challenging problem, even for subgraph sizes as small as five, due to the combinatorial explosion in the number of possible occurrences. This article focuses on the five-cycle, which is an important special case of five-vertex subgraph counting and one of the most difficult to count efficiently. We design two new parallel five-cycle counting algorithms and prove that they are work-efficient and achieve polylogarithmic span. Both algorithms are based on computing low out-degree orientations, which enables the efficient computation of directed two-paths and three-paths, and the algorithms differ in the ways in which they use this orientation to eliminate double-counting. Additionally, we present new parallel algorithms for obtaining unbiased estimates of five-cycle counts using graph sparsification. We develop fast multicore implementations of the algorithms and propose a work-scheduling optimization to improve their performance. Our experiments on a variety of real-world graphs using a 36-core machine with two-way hyper-threading show that our best exact parallel algorithm achieves 10–46× self-relative speedup, outperforms our serial benchmarks by 10–32×, and outperforms the previous state-of-the-art serial algorithm by up to 818×. Our best approximate algorithm, for a reasonable probability parameter, achieves up to 20× self-relative speedup and is able to approximate five-cycle counts 9–189× faster than our best exact algorithm, with between 0.52% and 11.77% error.
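The low out-degree orientation that both algorithms build on can be sketched with a simple degree ordering: direct each edge from the endpoint that ranks lower by (degree, id) to the one that ranks higher, so high-degree vertices collect few out-neighbors. This is an illustrative heuristic; the paper's exact choice of ordering may differ.

```python
def degree_orientation(n, edges):
    """Orient each undirected edge from its lower-ranked endpoint to its
    higher-ranked endpoint under a (degree, id) ordering, yielding a DAG
    with small out-degrees on typical sparse graphs.  Directed two-paths
    and three-paths can then be enumerated over these short out-lists."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    order = sorted(range(n), key=lambda v: (deg[v], v))
    rank = {v: i for i, v in enumerate(order)}
    out = [[] for _ in range(n)]
    for u, v in edges:
        a, b = (u, v) if rank[u] < rank[v] else (v, u)
        out[a].append(b)
    return out

# Vertex 0 has degree 3 but receives all its edges; no out-list exceeds 2.
out = degree_orientation(4, [(0, 1), (0, 2), (0, 3), (1, 2)])
print(out)
```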
Citations: 0
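A brute-force reference counter makes the combinatorial explosion mentioned in the abstract concrete (illustrative only; the article's algorithms rely on low out-degree orientations rather than enumeration). Each simple five-cycle appears as 10 ordered vertex tuples (5 rotations × 2 directions), hence the division by 10.

```python
from itertools import combinations, permutations

def count_five_cycles(vertices, edges):
    """Brute-force five-cycle counter, feasible only for tiny graphs.
    Each simple five-cycle is found as 10 ordered tuples
    (5 rotations x 2 directions), so divide the ordered count by 10."""
    edge_set = {frozenset(e) for e in edges}
    ordered = sum(
        all(frozenset((t[i], t[(i + 1) % 5])) in edge_set for i in range(5))
        for t in permutations(vertices, 5)
    )
    return ordered // 10

c5 = [(i, (i + 1) % 5) for i in range(5)]  # the cycle graph C5
k5 = list(combinations(range(5), 2))       # the complete graph K5
print(count_five_cycles(range(5), c5))  # -> 1
print(count_five_cycles(range(5), k5))  # -> 12 (= 4!/2 Hamiltonian cycles)
```

Even on K5 this already scans 120 ordered tuples; on real networks the orientation-based algorithms in the article sidestep this blow-up entirely.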
Bounded-degree Plane Geometric Spanners in Practice 有界度平面几何扳手的实践
Q2 Mathematics Pub Date : 2022-05-06 DOI: 10.1145/3582497
Fred Anderson, Anirban Ghosh, Matthew Graham, Lucas Mougeot, David Wisnosky
The construction of bounded-degree plane geometric spanners has been a focus of interest since 2002 when Bose, Gudmundsson, and Smid proposed the first algorithm to construct such spanners. To date, eleven algorithms have been designed with various trade-offs in degree and stretch-factor. We have implemented these sophisticated spanner algorithms in C++ using the CGAL library and experimented with them using large synthetic and real-world pointsets. Our experiments have revealed their practical behavior and real-world efficacy. We share the implementations via GitHub for broader uses and future research. We design and engineer EstimateStretchFactor, a simple practical algorithm that can estimate stretch-factors (obtaining lower bounds on the exact stretch-factors) of geometric spanners – a challenging problem for which no practical algorithm is known yet. In our experiments with bounded-degree plane geometric spanners, we found that EstimateStretchFactor estimated stretch-factors almost precisely. Further, it gave linear runtime performance in practice for the pointset distributions considered in this work, making it much faster than the naive Dijkstra-based algorithm for calculating stretch-factors.
Citations: 1
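The "naive Dijkstra-based algorithm" for exact stretch-factors mentioned in the abstract can be sketched as follows (an illustrative baseline, not the authors' EstimateStretchFactor): weight each spanner edge by its Euclidean length, run Dijkstra from every vertex, and take the maximum over all pairs of graph distance divided by Euclidean distance.

```python
import heapq
import math

def stretch_factor(points, edges):
    """Exact stretch factor via one Dijkstra per source -- the naive
    baseline mentioned in the abstract, not EstimateStretchFactor.
    Edge weights are Euclidean lengths; the result is the maximum over
    all vertex pairs of graph distance / Euclidean distance."""
    n = len(points)
    euclid = lambda a, b: math.dist(points[a], points[b])
    adj = [[] for _ in range(n)]
    for u, v in edges:
        w = euclid(u, v)
        adj[u].append((v, w))
        adj[v].append((u, w))
    worst = 1.0
    for s in range(n):
        dist = [math.inf] * n
        dist[s] = 0.0
        heap = [(0.0, s)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > dist[u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if du + w < dist[v]:
                    dist[v] = du + w
                    heapq.heappush(heap, (dist[v], v))
        for t in range(n):
            if t != s:
                worst = max(worst, dist[t] / euclid(s, t))
    return worst

# Unit square with only its boundary edges: opposite corners force a
# detour of length 2 over Euclidean distance sqrt(2).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
boundary = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(stretch_factor(square, boundary))  # ≈ 1.41421 (sqrt(2))
```

This runs in O(n·(m + n log n)) time, which is exactly the cost the linear-time estimator in the article avoids.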
Minimum Partition into Plane Subgraphs: The CG:SHOP Challenge 2022 最小分割到平面子图:CG:SHOP挑战2022
Q2 Mathematics Pub Date : 2022-03-14 DOI: 10.1145/3604907
S. Fekete, Phillip Keldenich, Dominik Krupke, S. Schirra
We give an overview of the 2022 Computational Geometry Challenge targeting the problem Minimum Partition into Plane Subsets, which consists of partitioning a given set of line segments into a minimum number of non-crossing subsets.
Citations: 6
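The Challenge problem described above can be sketched with a simple first-fit greedy (far weaker than the engineered solutions the overview surveys; the crossing predicate here treats shared endpoints and collinear overlaps as non-crossing, which may differ from the official rules): place each segment into the first subset none of whose members it crosses.

```python
def orient(o, a, b):
    """Sign of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def segments_cross(s, t):
    """True iff the two segments properly cross (interiors intersect).
    Shared endpoints and collinear overlaps count as non-crossing in
    this simplified predicate."""
    p, q = s
    r, u = t
    d1, d2 = orient(p, q, r), orient(p, q, u)
    d3, d4 = orient(r, u, p), orient(r, u, q)
    if 0 in (d1, d2, d3, d4):
        return False
    return (d1 > 0) != (d2 > 0) and (d3 > 0) != (d4 > 0)

def first_fit_partition(segments):
    """Greedy first-fit: put each segment into the first subset none of
    whose members it crosses; open a new subset otherwise."""
    groups = []
    for s in segments:
        for g in groups:
            if not any(segments_cross(s, t) for t in g):
                g.append(s)
                break
        else:
            groups.append([s])
    return groups

# Two crossing diagonals plus one far-away segment: two subsets suffice.
segs = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((3, 0), (4, 0))]
print(len(first_fit_partition(segs)))  # -> 2
```

Minimizing the number of subsets is equivalent to coloring the segment-intersection graph, which is why the competitive Challenge entries revolve around graph-coloring heuristics rather than plain first-fit.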