Authors: Zhiguo Jiang, Hanhua Chen, Hai Jin
Journal: Proc. VLDB Endow., pp. 1386-1398
Published: 2023-02-01
DOI: https://doi.org/10.14778/3583140.3583154
Citations: 0
Auxo: A Scalable and Efficient Graph Stream Summarization Structure
A graph stream is a continuous stream of edges that forms a huge, fast-evolving graph. The vast volume and high update speed of a graph stream impose stringent requirements on the data management structure, including sublinear space cost, computation-efficient operations, and scalability. Existing designs summarize a graph stream with a hash-based compressed matrix and represent each edge by its fingerprint, which achieves practical storage for a graph stream whose data volume has a known upper bound. However, they cannot support graph streams that grow dynamically beyond that bound.
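To make the idea of a hash-based compressed matrix with edge fingerprints concrete, here is a minimal Python sketch. It is a simplified illustration, not the structure used by any specific existing design: the class name `FingerprintMatrix`, the parameters `width` and `fp_bits`, and the choice of BLAKE2b as the hash are all assumptions made for this example.

```python
import hashlib


class FingerprintMatrix:
    """Toy width x width compressed matrix (illustrative assumption, not a published design).

    Each cell holds at most one edge, stored as [source fingerprint,
    destination fingerprint, accumulated weight].
    """

    def __init__(self, width=64, fp_bits=8):
        self.width = width
        self.fp_bits = fp_bits
        self.cells = [[None] * width for _ in range(width)]

    def _split(self, node):
        # Hash the node id, then split the hash into a matrix address
        # and a short fingerprint.
        digest = hashlib.blake2b(str(node).encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        fingerprint = h & ((1 << self.fp_bits) - 1)
        address = (h >> self.fp_bits) % self.width
        return address, fingerprint

    def insert(self, src, dst, weight=1):
        row, fp_s = self._split(src)
        col, fp_d = self._split(dst)
        cell = self.cells[row][col]
        if cell is None:
            self.cells[row][col] = [fp_s, fp_d, weight]
            return True
        if cell[0] == fp_s and cell[1] == fp_d:
            cell[2] += weight
            return True
        # The cell is occupied by a different edge; a fixed-size matrix
        # has nowhere else to put the new one.
        return False

    def query(self, src, dst):
        row, fp_s = self._split(src)
        col, fp_d = self._split(dst)
        cell = self.cells[row][col]
        if cell is not None and cell[0] == fp_s and cell[1] == fp_d:
            return cell[2]
        return 0


m = FingerprintMatrix(width=16, fp_bits=8)
m.insert("a", "b")
m.insert("a", "b", 2)
print(m.query("a", "b"))  # accumulated weight of edge a -> b
```

The `False` return from `insert` is the point of the sketch: once cells fill up, a matrix of fixed size can only reject or misplace new edges, which is why a known upper bound on data volume matters for this kind of design.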
In this paper, we propose Auxo, a scalable structure for space- and time-efficient summarization of dynamic graph streams. Auxo is built on a novel prefix embedded tree (PET), which combines binary logarithmic search with the embedding of common binary prefixes to provide an efficient and scalable tree structure. PET reduces the item insert/query time from $O(|E|)$ to $O(\log |E|)$ and reduces the total storage cost by a factor of $\log |E|$, where $|E|$ is the size of the edge set in the graph stream. To further improve the memory utilization of PET during scaling, we propose a proportional PET structure that extends a higher level in a proportionally incremental style. We conduct comprehensive experiments on large-scale real-world datasets to evaluate the design. Results show that Auxo reduces insert and query time by one to two orders of magnitude compared to state-of-the-art designs, while scaling its structure efficiently and economically with an average memory utilization of over 80%.
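The abstract does not spell out how PET is laid out, so the following Python sketch only illustrates the general idea behind embedding binary prefixes in a tree: the leading bits of a fingerprint choose the path, so a node at depth d needs to store only the remaining bits. The names `PrefixNode`, `PrefixTree`, `fp_bits`, and `capacity` are hypothetical, and the overflow policy (descend when a node is full) is an assumption for illustration, not the authors' algorithm.

```python
class PrefixNode:
    """Bucket for fingerprint suffixes; the path to the node encodes the prefix."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.suffixes = {}            # suffix bits -> accumulated weight
        self.children = [None, None]  # child 0 / child 1, chosen by the next prefix bit


class PrefixTree:
    """Generic prefix-routed binary tree (illustrative, not the paper's PET)."""

    def __init__(self, fp_bits=16, capacity=4):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.fp_bits = fp_bits
        self.capacity = capacity
        self.root = PrefixNode(capacity)

    def _suffix(self, fingerprint, depth):
        # At depth d the first d bits are implied by the path, so keep the rest.
        return fingerprint & ((1 << (self.fp_bits - depth)) - 1)

    def insert(self, fingerprint, weight=1):
        """Store the fingerprint and return the depth at which it landed."""
        node, depth = self.root, 0
        while True:
            suffix = self._suffix(fingerprint, depth)
            if suffix in node.suffixes or len(node.suffixes) < node.capacity:
                node.suffixes[suffix] = node.suffixes.get(suffix, 0) + weight
                return depth
            # Node is full: the next prefix bit selects the child to descend into.
            bit = (fingerprint >> (self.fp_bits - depth - 1)) & 1
            if node.children[bit] is None:
                node.children[bit] = PrefixNode(self.capacity)
            node = node.children[bit]
            depth += 1

    def query(self, fingerprint):
        node, depth = self.root, 0
        while node is not None:
            suffix = self._suffix(fingerprint, depth)
            if suffix in node.suffixes:
                return node.suffixes[suffix]
            if depth == self.fp_bits:
                return 0
            bit = (fingerprint >> (self.fp_bits - depth - 1)) & 1
            node = node.children[bit]
            depth += 1
        return 0


tree = PrefixTree(fp_bits=4, capacity=1)
for fp in (0b1010, 0b1100, 0b0011, 0b1111):
    print(format(fp, "04b"), "stored at depth", tree.insert(fp))
```

In this sketch a lookup touches one node per level, so its cost follows the tree height rather than the number of stored items, and deeper nodes hold shorter suffixes. Both properties are meant only to give intuition for the logarithmic time and storage savings claimed above.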