Accelerating Regular-Expression Matching on FPGAs with High-Level Synthesis

International Workshop on OpenCL | Pub Date: 2021-04-27 | DOI: 10.1145/3456669.3456716

Devon Callanan, Luke Kljucaric, A. George


Citations: 2

Abstract

The importance of security infrastructures for high-throughput networks has grown rapidly as a result of expanding internet traffic and increasingly high-bandwidth connections. Intrusion-detection systems (IDSs) such as SNORT rely upon rule sets designed to alert system administrators to malicious packets. Such deep-packet inspection, which depends upon regular-expression searches, can be accelerated on programmable-logic (PL) architectures using non-deterministic finite automata (NFAs). Prior designs have relied upon register-transfer level (RTL) design descriptions and achieved efficient resource utilization through fine-grained optimizations. Recent advances by field-programmable gate array (FPGA) vendors have led to more powerful compiler toolchains for OpenCL that allow rapid development on PL architectures while generating designs that are competitive in performance. The goal of this research is to evaluate performance differences between a custom, OpenCL-based acceleration architecture for regular expressions and comparable RTL designs. The simplicity of the application, which requires only basic hardware building blocks, adds to the novelty of the comparison. In contrast to RTL-based solutions, which show frequency degradation with bandwidth scaling, our approach maintains stable and high operating frequencies at the cost of resource usage. By scaling input bandwidth with multi-character transformations, throughput in excess of 17 Gbps can be achieved on Intel’s Arria 10 Programmable Acceleration Card, outperforming similar RTL designs reported in the literature.
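The NFA-based matching and bandwidth scaling described in the abstract can be illustrated with a small software sketch. This is not the authors' OpenCL architecture: the pattern `ab+c`, the transition table, and every function name below are hypothetical, chosen only to show how all active NFA states advance in parallel on each input character, and how consuming several characters per cycle raises throughput at a fixed clock frequency. The clock and width values in the usage note are likewise illustrative and are not figures from the paper.

```python
# Hand-built NFA for the illustrative pattern "ab+c" (an assumption, not from the paper).
# States: 0 = start, 1 = seen 'a', 2 = seen 'ab+', 3 = accept.
TRANSITIONS = [
    (0, "a", 1),
    (1, "b", 2),
    (2, "b", 2),
    (2, "c", 3),
]
START, ACCEPT = 0, 3


def step(active, ch):
    """Advance every active state at once on one character.

    `active` is a bit vector with one bit per NFA state, similar in spirit
    to one flip-flop per state in a hardware NFA.
    """
    nxt = 1 << START  # keep the start state active so a match can begin anywhere
    for src, sym, dst in TRANSITIONS:
        if (active >> src) & 1 and sym == ch:
            nxt |= 1 << dst
    return nxt


def find_matches(text):
    """Return the indices at which a match of the pattern ends."""
    active = 1 << START
    ends = []
    for i, ch in enumerate(text):
        active = step(active, ch)
        if (active >> ACCEPT) & 1:
            ends.append(i)
    return ends


def find_matches_wide(text, width):
    """Consume `width` characters per simulated cycle.

    Here the wide step is simply a composition of single-character steps;
    a hardware design would instead transform the automaton so that the
    whole group is handled in one clock cycle.
    """
    active = 1 << START
    ends = []
    for base in range(0, len(text), width):
        for offset, ch in enumerate(text[base:base + width]):
            active = step(active, ch)
            if (active >> ACCEPT) & 1:
                ends.append(base + offset)
    return ends


def throughput_gbps(freq_mhz, chars_per_cycle):
    """Throughput = clock frequency x characters per cycle x 8 bits per character."""
    return freq_mhz * 1e6 * chars_per_cycle * 8 / 1e9
```

For example, `find_matches("xabbc-abc")` reports matches ending at indices 4 and 8, and `find_matches_wide` returns the same result for any width. With a hypothetical 250 MHz clock and 4 characters per cycle, `throughput_gbps(250, 4)` evaluates to 8.0 Gbps, which shows why widening the input helps only as long as the clock frequency does not degrade.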

