GTX: A Transactional Graph Data System For HTAP Workloads

arXiv - CS - Databases Pub Date : 2024-05-02 DOI:arxiv-2405.01448

Libin Zhou, Walid Aref



Abstract

Processing, managing, and analyzing dynamic graphs is a cornerstone of many application domains, including fraud detection, recommendation systems, and graph neural network training. This demo presents GTX, a latch-free, write-optimized transactional graph data system that supports high-throughput read-write transactions while maintaining competitive graph analytics performance. GTX combines a latch-free graph store with a transaction and concurrency control protocol designed for dynamic power-law graphs. It uses atomic operations to eliminate latches, introduces a delta-based multi-version storage, and employs a hybrid transaction commit protocol to reduce interference between concurrent operations. To further improve throughput, we design a delta-chains index that supports efficient edge lookups. GTX manages concurrency control at the delta-chain level and adapts its concurrency to the workload. Real-world graph accesses and updates exhibit temporal locality and hotspots. Unlike other transactional graph systems, which suffer significant performance degradation under such workloads, GTX is the only system that can adapt to temporal locality and hotspots in graph updates and maintain million-transactions-per-second throughput. GTX is prototyped as a graph library and is evaluated with a graph library evaluation tool on real and synthetic datasets.
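To make the idea of delta-based multi-version storage with a delta-chains index more concrete, the following is a minimal, single-threaded sketch in Python. It is not GTX's implementation: the names `Delta`, `VertexEdgeBlock`, `append`, and `edge_exists`, the modulo hash, and the timestamp scheme are all assumptions chosen for illustration. The sketch also assumes that deltas are appended in increasing commit-timestamp order, and it omits the atomic operations a latch-free system would use to update chain heads.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Delta:
    dst: int             # destination vertex of the edge
    op: str              # "insert" or "delete"
    commit_ts: int       # timestamp at which this version became visible
    prev: Optional[int]  # index of the previous delta in the same chain


class VertexEdgeBlock:
    """Append-only edge deltas for one source vertex (hypothetical layout)."""

    def __init__(self, num_chains: int = 4) -> None:
        self.deltas: List[Delta] = []
        # heads[i] is the index of the newest delta in chain i, or None.
        self.heads: List[Optional[int]] = [None] * num_chains

    def _chain(self, dst: int) -> int:
        # Assumed hash: destination id modulo the number of chains.
        return dst % len(self.heads)

    def append(self, dst: int, op: str, commit_ts: int) -> None:
        # Link the new delta to the current head of its chain, then
        # make it the new head. A latch-free system would do this
        # head update with an atomic compare-and-swap instead.
        c = self._chain(dst)
        self.deltas.append(Delta(dst, op, commit_ts, self.heads[c]))
        self.heads[c] = len(self.deltas) - 1

    def edge_exists(self, dst: int, read_ts: int) -> bool:
        # Walk only the chain that can contain dst, newest first, and
        # stop at the first version visible at read_ts.
        i = self.heads[self._chain(dst)]
        while i is not None:
            d = self.deltas[i]
            if d.dst == dst and d.commit_ts <= read_ts:
                return d.op == "insert"
            i = d.prev
        return False


block = VertexEdgeBlock()
block.append(7, "insert", 10)
block.append(3, "insert", 11)
block.append(7, "delete", 12)
print(block.edge_exists(7, 11), block.edge_exists(7, 12))
```

In this sketch, a reader at timestamp 11 still sees the edge to vertex 7, while a reader at timestamp 12 sees it as deleted. A lookup touches only the deltas in one chain rather than every delta of the vertex, which is the kind of saving an index over delta chains is meant to provide.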


