Design and implementation of the PLUG architecture for programmable and efficient network lookups

Amit Kumar, Lorenzo De Carli, Sung Jin Kim, M. Kruijf, K. Sankaralingam, Cristian Estan, S. Jha
Published in: 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010-09-11
DOI: 10.1145/1854273.1854316 (https://doi.org/10.1145/1854273.1854316)
Citations: 9

Abstract

This paper proposes a new architecture called Pipelined LookUp Grid (PLUG) that performs data-structure lookups in network processing. PLUGs are programmable and achieve power efficiency through simplicity. We draw upon the insight that data-structure lookups have a natural structure that can be statically determined and exploited. The PLUG execution model transforms data-structure lookups into pipelined stages of computation and associates small code blocks with data. The PLUG architecture is tiled, with each tile consisting predominantly of SRAMs, a lightweight no-buffering router, and an array of lightweight computation cores. Because the execution model follows a principle of fixed delays, the architecture is contention-free and completely statically scheduled, and thus achieves high energy efficiency. The architecture enables rapid deployment of new network protocols and generalizes as a data-structure accelerator. This paper describes the PLUG architecture and its compiler, and evaluates our RTL prototype PLUG chip synthesized with a 55 nm technology library. We evaluate six diverse high-end network processing workloads, including IPv4, IPv6, and Ethernet forwarding. We show that in a 55 nm technology, a 16-tile PLUG occupies 58 mm², provides 4 MB of on-chip storage, and sustains a clock frequency of 1 GHz. This translates to 1 billion lookups per second, a latency of 18 ns to 219 ns, and average power of less than 1 watt.
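
To make the idea of pipelined lookup stages with fixed delays more concrete, the following Python sketch models a longest-prefix-match trie whose levels are stored in separate per-stage memories, each paired with a small code block. It is only an illustration under stated assumptions and not the authors' design: the 8-bit stride, the restriction to prefix lengths that are multiples of 8, and every name (`StagedTrie`, `stage_step`, `run_pipeline`) are hypothetical. Every lookup visits all stages, so its latency is fixed, and issuing one lookup per cycle means no two lookups ever occupy the same stage in the same cycle.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

STRIDE = 8       # bits consumed per stage (assumption for this sketch)
NUM_STAGES = 4   # a 32-bit IPv4 address split into four 8-bit chunks


@dataclass
class Entry:
    next_hop: Optional[str] = None  # result if a prefix ends at this entry
    child: Optional[int] = None     # node index in the next stage's memory


class StagedTrie:
    """Trie whose levels live in separate per-stage memories (hypothetical)."""

    def __init__(self) -> None:
        # mem[s] is the memory of stage s; stage 0 starts with one root node.
        self.mem: List[List[Dict[int, Entry]]] = (
            [[{}]] + [[] for _ in range(NUM_STAGES - 1)]
        )

    def insert(self, prefix: str, length: int, next_hop: str) -> None:
        # Simplification: only prefix lengths 8, 16, 24, and 32 are supported.
        assert length % STRIDE == 0 and 0 < length <= 32
        chunks = [int(p) for p in prefix.split(".")][: length // STRIDE]
        node = 0
        for stage, chunk in enumerate(chunks):
            entry = self.mem[stage][node].setdefault(chunk, Entry())
            if stage == len(chunks) - 1:
                entry.next_hop = next_hop
            else:
                if entry.child is None:
                    self.mem[stage + 1].append({})
                    entry.child = len(self.mem[stage + 1]) - 1
                node = entry.child

    def stage_step(self, stage: int, chunk: int, node: Optional[int],
                   best: Optional[str]) -> Tuple[Optional[int], Optional[str]]:
        """Code block of one stage: one memory read and one state update."""
        if node is None:
            return None, best  # lookup already resolved; forward the result
        entry = self.mem[stage][node].get(chunk)
        if entry is None:
            return None, best
        return entry.child, entry.next_hop or best

    def lookup(self, addr: str) -> Optional[str]:
        chunks = [int(p) for p in addr.split(".")]
        node: Optional[int] = 0
        best: Optional[str] = None
        for stage in range(NUM_STAGES):  # always all stages: fixed delay
            node, best = self.stage_step(stage, chunks[stage], node, best)
        return best


@dataclass
class Job:
    addr: str
    chunks: List[int]
    issued: int
    stage: int = 0
    node: Optional[int] = 0
    best: Optional[str] = None


def run_pipeline(trie: StagedTrie,
                 addrs: List[str]) -> List[Tuple[str, Optional[str], int]]:
    """Issue one lookup per cycle; return (addr, result, latency_in_cycles)."""
    pending = list(addrs)
    in_flight: List[Job] = []
    done: List[Tuple[str, Optional[str], int]] = []
    cycle = 0
    while pending or in_flight:
        if pending:
            addr = pending.pop(0)
            in_flight.append(Job(addr, [int(p) for p in addr.split(".")], cycle))
        # Jobs issued in different cycles are always in different stages.
        assert len({job.stage for job in in_flight}) == len(in_flight)
        for job in in_flight:
            job.node, job.best = trie.stage_step(
                job.stage, job.chunks[job.stage], job.node, job.best)
            job.stage += 1
        cycle += 1
        for job in in_flight:
            if job.stage == NUM_STAGES:
                done.append((job.addr, job.best, cycle - job.issued))
        in_flight = [job for job in in_flight if job.stage < NUM_STAGES]
    return done
```

In this toy model every lookup completes exactly `NUM_STAGES` cycles after it is issued, regardless of where in the trie it resolves. If a real pipeline likewise accepts one lookup per cycle, a 1 GHz clock corresponds to the 1 billion lookups per second quoted in the abstract, and the 18 ns to 219 ns latencies would correspond to roughly 18 to 219 cycles in flight; that reading is an inference from the abstract's figures, not a statement from the paper.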