Recursive Query Evaluation in a Column DBMS to Analyze Large Graphs
C. Ordonez, Achyuth Gurram, N. Rai
International Workshop on Data Warehousing and OLAP, 2014-11-07
DOI: 10.1145/2666158.2666177 (https://doi.org/10.1145/2666158.2666177)
Citation count: 4
Abstract
Graphs represent a major challenge in big data analytics, for which there are many systems and prototypes, most of them not based on relational database management systems (DBMSs). Graph problems require substantially different algorithms from other analytical techniques (e.g., cubes, statistical models, machine learning), and they are especially important in the analysis of social networks and the Internet. Recursive queries are a fundamental mechanism for analyzing graphs in a DBMS, but they can be slow on large graphs. Column DBMSs are a newer, faster class of database system, with storage and retrieval mechanisms that differ significantly from those of traditional row DBMSs. We therefore study the pros and cons of optimizing recursive queries on a column DBMS. Specifically, we study two interrelated graph problems, transitive closure and adjacency matrix multiplication, together with the optimization of the corresponding queries, which combine recursive joins and recursive aggregations. An experimental evaluation with large graphs compares query optimization in a column DBMS and a row DBMS. We analyze performance tradeoffs on graphs of significantly different size, shape, and connectivity. Our benchmark results show that column DBMSs are much faster than row DBMSs at analyzing graphs, especially as graphs get larger and denser.
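To make the two graph problems concrete, the following is a minimal sketch of how transitive closure and path counting (the entries of a power of the adjacency matrix) can be phrased as recursive queries. It is not the paper's query formulation or its optimizations. The edge table `E(i, j)`, the recursive table `R(d, i, j)`, the depth bound, and both function names are assumptions made for illustration, and it runs on SQLite (a row store) only to show the query semantics, not the column-versus-row performance comparison.

```python
import sqlite3


def _load_edges(edges):
    """Create an in-memory database with a hypothetical edge table E(i, j)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER)")
    conn.executemany("INSERT INTO E VALUES (?, ?)", edges)
    return conn


def transitive_closure(edges, max_depth):
    """Return the set of (i, j) pairs joined by a path of 1..max_depth edges."""
    conn = _load_edges(edges)
    rows = conn.execute(
        """
        WITH RECURSIVE R(d, i, j) AS (
            SELECT 1, i, j FROM E              -- base step: the edges themselves
            UNION                              -- UNION drops duplicate (d, i, j) rows
            SELECT R.d + 1, R.i, E.j           -- recursive join: extend each path by one edge
            FROM R JOIN E ON R.j = E.i
            WHERE R.d < ?                      -- depth bound guarantees termination on cycles
        )
        SELECT DISTINCT i, j FROM R
        """,
        (max_depth,),
    ).fetchall()
    conn.close()
    return set(rows)


def path_counts(edges, k):
    """Return {(i, j): number of paths with exactly k edges}, i.e. entries of E^k."""
    conn = _load_edges(edges)
    rows = conn.execute(
        """
        WITH RECURSIVE R(d, i, j) AS (
            SELECT 1, i, j FROM E
            UNION ALL                          -- keep duplicates: one row per distinct path
            SELECT R.d + 1, R.i, E.j
            FROM R JOIN E ON R.j = E.i
            WHERE R.d < ?
        )
        SELECT i, j, COUNT(*) FROM R WHERE d = ? GROUP BY i, j
        """,
        (k, k),
    ).fetchall()
    conn.close()
    return {(i, j): c for i, j, c in rows}


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (3, 4), (1, 3)]
    print(sorted(transitive_closure(sample, 3)))
    print(path_counts(sample, 2))
```

The sketch aggregates outside the recursion because SQLite does not allow aggregate functions in the recursive part of a common table expression. Materializing one row per path is simple but grows quickly on dense graphs, which is the kind of cost that aggregating at each recursive step is meant to avoid.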