TriAL-QL: Distributed Processing of Navigational Queries

Proceedings of the 18th International Workshop on Web and Databases Pub Date : 2015-05-31 DOI:10.1145/2767109.2767115

Martin Przyjaciel-Zablocki, A. Schätzle, Adriano Lange


Citations: 6

Abstract




Navigational queries are among the most natural query patterns for RDF data, yet most existing RDF query languages, including SPARQL 1.1 and its derivatives, fail to cover all the varieties inherent to its triple-based model. As a consequence, the development of more expressive RDF languages is of general interest. TriAL* [14] is an expressive algebra that subsumes many previous approaches while adding novel features that are not expressible in most other RDF query languages based on the standard graph model. However, its algebraic notation is inappropriate for practical usage, and it is not supported by any existing RDF triple store. In this paper, we propose TriAL-QL, a language for TriAL* that is easy to write and grasp and that preserves its compositional algebraic structure. We present an implementation based on Impala, a massively parallel SQL query engine on Hadoop, using an optimized semi-naive evaluation for the recursive fragments of TriAL*. This way, we support both data-intensive ETL-like workloads and exploratory ad hoc queries. To demonstrate the scalability and expressiveness of our approach, we conducted experiments on generated social networks with up to 1.8 billion triples and compared different execution strategies to a Hive-based solution.
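
The semi-naive evaluation mentioned in the abstract can be illustrated with a small, single-machine sketch. This is not the authors' Impala-based implementation and does not use TriAL-QL syntax; it only shows the general idea of semi-naive evaluation for a recursive reachability query over triples. The function name, the `knows` predicate, and the sample data are all hypothetical.

```python
def transitive_closure_semi_naive(edges):
    """Return the transitive closure of a set of (source, target) pairs.

    Semi-naive evaluation: each round joins only the pairs discovered in
    the previous round (the delta) with the base edges, instead of
    re-joining the whole intermediate result.
    """
    # Index the base edges by source node for fast lookups in the join.
    by_source = {}
    for source, target in edges:
        by_source.setdefault(source, set()).add(target)

    closure = set(edges)  # all pairs found so far
    delta = set(edges)    # pairs found in the most recent round

    while delta:
        new_pairs = set()
        for source, middle in delta:
            for target in by_source.get(middle, ()):
                pair = (source, target)
                if pair not in closure:
                    new_pairs.add(pair)
        closure |= new_pairs
        delta = new_pairs  # only new pairs feed the next round

    return closure


# Hypothetical social-network triples (subject, predicate, object).
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
]

# Select the edges of one predicate, then compute reachability over them.
knows = {(s, o) for s, p, o in triples if p == "knows"}
reachable = transitive_closure_semi_naive(knows)
```

The loop stops as soon as a round produces no new pairs, which is what keeps each iteration proportional to the newly derived data rather than to the full intermediate result.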
