Approximately Counting Subgraphs in Data Streams
Hendrik Fichtenberger, Pan Peng
Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
Published 2022-03-27. DOI: 10.1145/3517804.3524145 (https://doi.org/10.1145/3517804.3524145)
Citations: 1
Abstract
Estimating the number of subgraphs in data streams is a fundamental problem that has received great attention in the past decade. In this paper, we give improved streaming algorithms for approximately counting the number of occurrences of an arbitrary subgraph $H$, denoted $\#H$, when the input graph $G$ is represented as a stream of $m$ edges. To obtain our algorithms, we provide a generic transformation that converts constant-round sublinear-time graph algorithms in the query access model to constant-pass sublinear-space graph streaming algorithms. Using this transformation, we obtain the following results.

• We give a 3-pass turnstile streaming algorithm for $(1 \pm \varepsilon)$-approximating $\#H$ in $\tilde{O}\left(m^{\rho(H)} / (\varepsilon^2 \cdot \#H)\right)$ space, where $\rho(H)$ is the fractional edge-cover number of $H$. This improves upon and generalizes a result of McGregor et al. [PODS 2016], who gave a 3-pass insertion-only streaming algorithm for $(1 \pm \varepsilon)$-approximating the number $\#T$ of triangles in $\tilde{O}\left(m^{3/2} / (\varepsilon^2 \cdot \#T)\right)$ space if the algorithm is given additional oracle access to the degrees.

• We provide a constant-pass streaming algorithm for $(1 \pm \varepsilon)$-approximating $\#K_r$ in $\tilde{O}\left(m \lambda^{r-2} / (\varepsilon^2 \cdot \#K_r)\right)$ space for any $r \geq 3$, in a graph $G$ with degeneracy $\lambda$, where $K_r$ is a clique on $r$ vertices. This resolves a conjecture by Bera and Seshadhri [PODS 2020].

More generally, our reduction relates the adaptivity of a query algorithm to the pass complexity of a corresponding streaming algorithm, and it is applicable to all algorithms in standard sublinear-time graph query models, e.g., the (augmented) general model.
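To make the two space bounds concrete, the following Python sketch compares them for triangles on a small low-degeneracy graph. It is only an illustration and not the paper's streaming algorithm: it holds the whole graph in memory and computes exact quantities offline. The helper names (`degeneracy`, `count_triangles`, `generic_space`, `degeneracy_space`) are hypothetical, the sketch relies on the standard fact that the fractional edge-cover number of a clique is $\rho(K_r) = r/2$, and it ignores the constants and polylogarithmic factors hidden by $\tilde{O}$.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degeneracy(edges):
    """Degeneracy via repeated removal of a minimum-degree vertex.

    This simple version is quadratic in the number of vertices,
    which is fine for a small illustration.
    """
    adj = {v: set(nb) for v, nb in build_adjacency(edges).items()}
    lam = 0
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))
        lam = max(lam, len(adj[v]))
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return lam


def count_triangles(edges):
    """Exact triangle count: each triangle is seen once per edge."""
    adj = build_adjacency(edges)
    unique_edges = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    total = sum(len(adj[u] & adj[v]) for u, v in unique_edges)
    return total // 3


def generic_space(m, count, r, eps):
    """m^{rho(K_r)} / (eps^2 * #K_r) with rho(K_r) = r / 2 (assumed fact)."""
    return m ** (r / 2) / (eps ** 2 * count)


def degeneracy_space(m, lam, count, r, eps):
    """m * lambda^{r-2} / (eps^2 * #K_r)."""
    return m * lam ** (r - 2) / (eps ** 2 * count)


# Fan graph: hub 0 joined to every vertex of the path 1, 2, ..., n.
n = 50
fan = [(0, i) for i in range(1, n + 1)] + [(i, i + 1) for i in range(1, n)]

m = len(fan)
lam = degeneracy(fan)
triangles = count_triangles(fan)
eps = 0.1

print("m =", m, "degeneracy =", lam, "#T =", triangles)
print("generic bound   :", generic_space(m, triangles, 3, eps))
print("degeneracy bound:", degeneracy_space(m, lam, triangles, 3, eps))
```

On graphs like this fan, where the degeneracy is much smaller than $\sqrt{m}$, the degeneracy-based expression is the smaller of the two. On dense graphs such as a single clique the two expressions are of comparable size.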