Massively parallel XML twig filtering using dynamic programming on FPGAs

2011 IEEE 27th International Conference on Data Engineering. Pub Date: 2011-04-11. DOI: 10.1109/ICDE.2011.5767899

R. Moussalli, Mariam Salloum, W. Najjar, V. Tsotras


Citations: 33

Abstract

In recent years, XML-based Publish-Subscribe Systems have become popular due to the increased demand for timely event notification. Users (or subscribers) pose complex profiles on the structure and content of the published messages. If a profile matches the message, the message is forwarded to the interested subscriber. As the amount of published content continues to grow, current software-based systems will not scale. We thus propose a novel architecture to exploit parallelism of twig matching on FPGAs. This approach yields up to three orders of magnitude higher throughput when compared to conventional approaches bound by the sequential aspect of software computing. This paper presents a novel method for performing unordered holistic twig matching on FPGAs without any false positives, and whose throughput is independent of the complexity of the user queries or the characteristics of the input XML stream. Furthermore, we present an experimental comparison of different granularities of twig matching, namely path-based (root-to-leaf) and pair-based (parent-child or ancestor-descendant). We provide comprehensive experiments that compare the throughput, area utilization and the accuracy of matching (percent of false positives) of our holistic, path-based and pair-based FPGA approaches.
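The difference between the three matching granularities named in the abstract can be illustrated with a small software sketch. This is not the paper's FPGA architecture or its dynamic programming formulation; it is a naive recursive check written only to show why decomposing a twig into root-to-leaf paths or into parent-child pairs can report false positives, while holistic matching does not. The twig encoding, the function names, and the sample documents are all assumptions made for this illustration, and only parent-child edges are modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical twig encoding: (tag, [child twigs]), parent-child edges only.
# This twig corresponds to the XPath-like profile a[b/d][c].
TWIG = ("a", [("b", [("d", [])]), ("c", [])])

def holistic_at(node, twig):
    """True if the whole twig is rooted at this node (child order ignored)."""
    tag, kids = twig
    if node.tag != tag:
        return False
    return all(any(holistic_at(ch, k) for ch in node) for k in kids)

def holistic(root, twig):
    """Holistic matching: every branch must hang off the same nodes."""
    return any(holistic_at(n, twig) for n in root.iter())

def paths(twig, prefix=()):
    """Decompose the twig into its root-to-leaf tag paths."""
    tag, kids = twig
    prefix = prefix + (tag,)
    if not kids:
        return [prefix]
    return [p for k in kids for p in paths(k, prefix)]

def path_at(node, path):
    """True if the tag path starts at this node and follows child edges."""
    if node.tag != path[0]:
        return False
    if len(path) == 1:
        return True
    return any(path_at(ch, path[1:]) for ch in node)

def path_based(root, twig):
    """Each path is checked independently, so branches may bind to different nodes."""
    return all(any(path_at(n, p) for n in root.iter()) for p in paths(twig))

def pairs(twig):
    """Decompose the twig into (parent tag, child tag) pairs."""
    tag, kids = twig
    out = []
    for k in kids:
        out.append((tag, k[0]))
        out.extend(pairs(k))
    return out

def pair_based(root, twig):
    """Each pair is checked independently anywhere in the document."""
    return all(
        any(n.tag == p and any(ch.tag == c for ch in n) for n in root.iter())
        for p, c in pairs(twig)
    )

def classify(xml_text, twig=TWIG):
    """Return (holistic, path_based, pair_based) verdicts for one document."""
    root = ET.fromstring(xml_text)
    return holistic(root, twig), path_based(root, twig), pair_based(root, twig)

DOC_MATCH = "<r><a><b><d/></b><c/></a></r>"               # true match
DOC_SPLIT = "<r><a><b><d/></b></a><a><c/></a></r>"        # branches under different a nodes
DOC_LOOSE = "<r><a><b/><c/></a><b><d/></b></r>"           # b/d lies outside a

for doc in (DOC_MATCH, DOC_SPLIT, DOC_LOOSE):
    print(doc, classify(doc))
```

In this sketch, `DOC_SPLIT` satisfies both paths `a/b/d` and `a/c`, but under two different `a` elements, so the path-based and pair-based checks accept it while the holistic check rejects it. `DOC_LOOSE` fails the path-based check yet still satisfies every individual pair, which is the kind of accuracy gap between granularities that the abstract says the experiments measure.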

