减少流水线转发引擎的动态功耗

2009 IEEE International Conference on Computer Design Pub Date : 2009-10-04 DOI:10.1109/ICCD.2009.5413163

Weirong Jiang, V. Prasanna

{"title":"减少流水线转发引擎的动态功耗","authors":"Weirong Jiang, V. Prasanna","doi":"10.1109/ICCD.2009.5413163","DOIUrl":null,"url":null,"abstract":"Power consumption has become a limiting factor in next-generation routers. IP forwarding engines dominate the overall power dissipation in a router. Although SRAM-based pipeline architectures have recently been developed as a promising alternative to power-hungry TCAM-based solutions for high-throughput IP forwarding, it remains a challenge to achieve low power. This paper proposes several novel architecture-specific techniques to reduce the dynamic power consumption in SRAM-based pipelined IP forwarding engines. First, the pipeline architecture itself is built as an inherent cache, exploiting the data locality in Internet traffic. The number of memory accesses which contribute to the majority of power consumption, is thus reduced. No external cache is needed. Second, instead of using a global clock, different pipeline stages are driven by separate clocks. The local clocking scheme is carefully designed to exploit the traffic rate variation and improve the caching performance. Third, a fine-grained memory enabling scheme is developed to eliminate unnecessary memory accesses, while preserving the packet order. Simulation experiments using real-life traces show that our solutions can achieve up to 15-fold reduction in dynamic power dissipation, over the baseline pipeline architecture that does not employ the proposed schemes. FPGA implementation results show that our design sustains 40 Gbps throughput for minimum size (40 bytes) packets while consuming a small amount of logic resources.","PeriodicalId":256908,"journal":{"name":"2009 IEEE International Conference on Computer Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Reducing dynamic power dissipation in pipelined forwarding engines\",\"authors\":\"Weirong Jiang, V. Prasanna\",\"doi\":\"10.1109/ICCD.2009.5413163\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Power consumption has become a limiting factor in next-generation routers. IP forwarding engines dominate the overall power dissipation in a router. Although SRAM-based pipeline architectures have recently been developed as a promising alternative to power-hungry TCAM-based solutions for high-throughput IP forwarding, it remains a challenge to achieve low power. This paper proposes several novel architecture-specific techniques to reduce the dynamic power consumption in SRAM-based pipelined IP forwarding engines. First, the pipeline architecture itself is built as an inherent cache, exploiting the data locality in Internet traffic. The number of memory accesses which contribute to the majority of power consumption, is thus reduced. No external cache is needed. Second, instead of using a global clock, different pipeline stages are driven by separate clocks. The local clocking scheme is carefully designed to exploit the traffic rate variation and improve the caching performance. Third, a fine-grained memory enabling scheme is developed to eliminate unnecessary memory accesses, while preserving the packet order. Simulation experiments using real-life traces show that our solutions can achieve up to 15-fold reduction in dynamic power dissipation, over the baseline pipeline architecture that does not employ the proposed schemes. FPGA implementation results show that our design sustains 40 Gbps throughput for minimum size (40 bytes) packets while consuming a small amount of logic resources.\",\"PeriodicalId\":256908,\"journal\":{\"name\":\"2009 IEEE International Conference on Computer Design\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-10-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 IEEE International Conference on Computer Design\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCD.2009.5413163\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE International Conference on Computer Design","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD.2009.5413163","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 13

摘要

功耗已经成为下一代路由器的一个限制因素。IP转发引擎在路由器的整体功耗中占主导地位。尽管基于sram的管道架构最近被开发为一种有前途的替代方案，用于高吞吐量IP转发，但实现低功耗仍然是一个挑战。为了降低基于sram的流水线IP转发引擎的动态功耗，本文提出了几种新的特定于体系结构的技术。首先，管道架构本身被构建为一个固有的缓存，利用互联网流量中的数据局部性。因此，内存访问的数量减少了，而内存访问是造成大部分功耗的原因。不需要外部缓存。其次，不同的流水线阶段由不同的时钟驱动，而不是使用全局时钟。本地时钟方案是精心设计的，以利用流量速率的变化，提高缓存性能。第三，开发了一种细粒度的内存启用方案，以消除不必要的内存访问，同时保持数据包的顺序。使用真实轨迹的仿真实验表明，我们的解决方案可以在不采用所提出方案的基线管道架构上实现高达15倍的动态功耗降低。FPGA实现结果表明，我们的设计在消耗少量逻辑资源的同时，保持最小尺寸(40字节)数据包的40 Gbps吞吐量。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Reducing dynamic power dissipation in pipelined forwarding engines

Power consumption has become a limiting factor in next-generation routers. IP forwarding engines dominate the overall power dissipation in a router. Although SRAM-based pipeline architectures have recently been developed as a promising alternative to power-hungry TCAM-based solutions for high-throughput IP forwarding, it remains a challenge to achieve low power. This paper proposes several novel architecture-specific techniques to reduce the dynamic power consumption in SRAM-based pipelined IP forwarding engines. First, the pipeline architecture itself is built as an inherent cache, exploiting the data locality in Internet traffic. The number of memory accesses which contribute to the majority of power consumption, is thus reduced. No external cache is needed. Second, instead of using a global clock, different pipeline stages are driven by separate clocks. The local clocking scheme is carefully designed to exploit the traffic rate variation and improve the caching performance. Third, a fine-grained memory enabling scheme is developed to eliminate unnecessary memory accesses, while preserving the packet order. Simulation experiments using real-life traces show that our solutions can achieve up to 15-fold reduction in dynamic power dissipation, over the baseline pipeline architecture that does not employ the proposed schemes. FPGA implementation results show that our design sustains 40 Gbps throughput for minimum size (40 bytes) packets while consuming a small amount of logic resources.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助