GPU acceleration of regular expression matching for large datasets: exploring the implementation space

Xiaodong Yu, M. Becchi
{"title":"大型数据集正则表达式匹配的GPU加速:探索实现空间","authors":"Xiaodong Yu, M. Becchi","doi":"10.1145/2482767.2482791","DOIUrl":null,"url":null,"abstract":"Regular expression matching is a central task in several networking (and search) applications and has been accelerated on a variety of parallel architectures, including general purpose multi-core processors, network processors, field programmable gate arrays, and ASIC- and TCAM-based systems. All of these solutions are based on finite automata (either in deterministic or non-deterministic form) and mostly focus on effective memory representations for such automata. More recently, a handful of proposals have exploited the parallelism intrinsic in regular expression matching (i.e., coarse-grained packet-level parallelism and fine-grained data structure parallelism) to propose efficient regex-matching designs for GPUs. However, most GPU solutions aim at achieving good performance on small datasets, which are far less complex and problematic than those used in real-world applications.\n In this work, we provide a more comprehensive study of regular expression matching on GPUs. To this end, we consider datasets of practical size and complexity and explore advantages and limitations of different automata representations and of various GPU implementation techniques. Our goal is not to show optimal speedup on specific datasets, but to highlight advantages and disadvantages of the GPU hardware in supporting state-of-the-art automata representations and encoding schemes, approaches that have been broadly adopted on other parallel memory-based platforms.","PeriodicalId":430420,"journal":{"name":"ACM International Conference on Computing Frontiers","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"61","resultStr":"{\"title\":\"GPU acceleration of regular expression matching for large datasets: exploring the implementation space\",\"authors\":\"Xiaodong Yu, M. Becchi\",\"doi\":\"10.1145/2482767.2482791\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Regular expression matching is a central task in several networking (and search) applications and has been accelerated on a variety of parallel architectures, including general purpose multi-core processors, network processors, field programmable gate arrays, and ASIC- and TCAM-based systems. All of these solutions are based on finite automata (either in deterministic or non-deterministic form) and mostly focus on effective memory representations for such automata. More recently, a handful of proposals have exploited the parallelism intrinsic in regular expression matching (i.e., coarse-grained packet-level parallelism and fine-grained data structure parallelism) to propose efficient regex-matching designs for GPUs. However, most GPU solutions aim at achieving good performance on small datasets, which are far less complex and problematic than those used in real-world applications.\\n In this work, we provide a more comprehensive study of regular expression matching on GPUs. To this end, we consider datasets of practical size and complexity and explore advantages and limitations of different automata representations and of various GPU implementation techniques. 
Our goal is not to show optimal speedup on specific datasets, but to highlight advantages and disadvantages of the GPU hardware in supporting state-of-the-art automata representations and encoding schemes, approaches that have been broadly adopted on other parallel memory-based platforms.\",\"PeriodicalId\":430420,\"journal\":{\"name\":\"ACM International Conference on Computing Frontiers\",\"volume\":\"33 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-05-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"61\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM International Conference on Computing Frontiers\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2482767.2482791\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM International Conference on Computing Frontiers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2482767.2482791","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 61

Abstract

Regular expression matching is a central task in several networking (and search) applications and has been accelerated on a variety of parallel architectures, including general purpose multi-core processors, network processors, field programmable gate arrays, and ASIC- and TCAM-based systems. All of these solutions are based on finite automata (either in deterministic or non-deterministic form) and mostly focus on effective memory representations for such automata. More recently, a handful of proposals have exploited the parallelism intrinsic in regular expression matching (i.e., coarse-grained packet-level parallelism and fine-grained data structure parallelism) to propose efficient regex-matching designs for GPUs. However, most GPU solutions aim at achieving good performance on small datasets, which are far less complex and problematic than those used in real-world applications.

In this work, we provide a more comprehensive study of regular expression matching on GPUs. To this end, we consider datasets of practical size and complexity and explore advantages and limitations of different automata representations and of various GPU implementation techniques. Our goal is not to show optimal speedup on specific datasets, but to highlight advantages and disadvantages of the GPU hardware in supporting state-of-the-art automata representations and encoding schemes, approaches that have been broadly adopted on other parallel memory-based platforms.
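The abstract refers to DFA-based matching and coarse-grained packet-level parallelism. Purely as an illustrative sketch, not code from the paper: the kernel name, buffer layout, and dense 256-entries-per-state transition table below are assumptions chosen to show the baseline idea of one GPU thread scanning one input stream against a DFA stored in device memory.

```cuda
// Sketch of a baseline DFA-matching kernel (packet-level parallelism).
// Assumed layout: a dense |states| x 256 transition table, an accepting-state
// flag per state, and input streams concatenated into one buffer.
#include <cstdint>
#include <cuda_runtime.h>

__global__ void dfa_match(const uint32_t *transitions, // [num_states * 256]
                          const uint8_t  *accepting,   // [num_states], 1 = accepting
                          const uint8_t  *input,       // concatenated input streams
                          const uint32_t *offsets,     // start offset of each stream
                          const uint32_t *lengths,     // length of each stream
                          uint8_t        *matched,     // [num_streams] output flags
                          int num_streams)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= num_streams) return;

    const uint8_t *data = input + offsets[s];
    uint32_t state = 0;                        // state 0 taken as the initial state
    uint8_t hit = 0;
    for (uint32_t i = 0; i < lengths[s]; ++i) {
        state = transitions[state * 256u + data[i]];  // one table lookup per byte
        hit |= accepting[state];               // remember any accepting state visited
    }
    matched[s] = hit;
}

// Host-side launch, assuming the buffers were already copied to the device:
// dfa_match<<<(num_streams + 255) / 256, 256>>>(d_trans, d_acc, d_in,
//                                               d_off, d_len, d_out, num_streams);
```

A dense table like this is trivial to traverse but requires |states| x 256 entries, which grows quickly for large, complex rule sets; that memory pressure is precisely why the alternative automata representations and encoding schemes the paper studies matter for datasets of practical size.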