Recursive Query Evaluation in a Column DBMS to Analyze Large Graphs

C. Ordonez, Achyuth Gurram, N. Rai
{"title":"Recursive Query Evaluation in a Column DBMS to Analyze Large Graphs","authors":"C. Ordonez, Achyuth Gurram, N. Rai","doi":"10.1145/2666158.2666177","DOIUrl":null,"url":null,"abstract":"Graphs represent a major challenge on big data analytics, for which there are many systems and prototypes, most of them not based on relational database management systems (DBMSs). Graph problems require substantially different algorithms compared to other analytical techniques (i.e., cubes, statistical models, machine learning) and they are especially important in the analysis of social networks and the Internet. On the other hand, recursive queries are a fundamental query mechanism to analyze graphs in a DBMS, but they can be slow with large graphs. Column DBMSs are a novel kind of faster database systems, but with significantly different storage and retrieval mechanisms compared to traditional row DBMSs. Thus we study the pros and cons of optimizing recursive queries on a column DBMS. Specifically, we study two inter-related graph problems: transitive closure and adjacency matrix multiplication, together with their respective optimization of queries combining recursive joins and recursive aggregations. An experimental evaluation with large graphs compares query optimization in a column DBMS and a row DBMS. We analyze performance tradeoffs with graphs having significantly different size, shape and connectivity. Our benchmark results prove column DBMSs are much faster than row DBMSs to analyze graphs, especially as graphs get larger and denser.","PeriodicalId":335396,"journal":{"name":"International Workshop on Data Warehousing and OLAP","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Workshop on Data Warehousing and OLAP","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2666158.2666177","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Graphs represent a major challenge on big data analytics, for which there are many systems and prototypes, most of them not based on relational database management systems (DBMSs). Graph problems require substantially different algorithms compared to other analytical techniques (i.e., cubes, statistical models, machine learning) and they are especially important in the analysis of social networks and the Internet. On the other hand, recursive queries are a fundamental query mechanism to analyze graphs in a DBMS, but they can be slow with large graphs. Column DBMSs are a novel kind of faster database systems, but with significantly different storage and retrieval mechanisms compared to traditional row DBMSs. Thus we study the pros and cons of optimizing recursive queries on a column DBMS. Specifically, we study two inter-related graph problems: transitive closure and adjacency matrix multiplication, together with their respective optimization of queries combining recursive joins and recursive aggregations. An experimental evaluation with large graphs compares query optimization in a column DBMS and a row DBMS. We analyze performance tradeoffs with graphs having significantly different size, shape and connectivity. Our benchmark results prove column DBMSs are much faster than row DBMSs to analyze graphs, especially as graphs get larger and denser.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
递归查询评估在列DBMS分析大图
图代表了大数据分析的主要挑战,有许多系统和原型,其中大多数不是基于关系数据库管理系统(dbms)。与其他分析技术(如立方体、统计模型、机器学习)相比,图问题需要截然不同的算法,它们在分析社交网络和互联网时尤为重要。另一方面,递归查询是分析DBMS中的图的基本查询机制,但是对于大型图,它们可能很慢。列dbms是一种新型的更快的数据库系统,但是与传统的行dbms相比,它的存储和检索机制有很大不同。因此,我们研究了在列DBMS上优化递归查询的利弊。具体来说,我们研究了两个相互关联的图问题:传递闭包和邻接矩阵乘法,以及它们各自结合递归连接和递归聚合的查询优化。对大型图的实验评估比较了列DBMS和行DBMS中的查询优化。我们分析具有显著不同大小、形状和连接的图的性能权衡。我们的基准测试结果证明,在分析图时,列dbms比行dbms要快得多,尤其是当图变得更大更密集时。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
An Advanced Data Warehouse for Integrating Large Sets of GPS Data Optimization of Data-intensive Flows: Is it Needed? Is it Solved? A Framework for User-Centered Declarative ETL What can Emerging Hardware do for your DBMS Buffer? A Semantic Model for Movement Data Warehouses
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1