宽度感知的细粒度动态供应门控:一种低功耗数据路径和存储器的设计方法

L. Wang, Somnath Paul, S. Bhunia
{"title":"宽度感知的细粒度动态供应门控:一种低功耗数据路径和存储器的设计方法","authors":"L. Wang, Somnath Paul, S. Bhunia","doi":"10.1109/VLSID.2012.94","DOIUrl":null,"url":null,"abstract":"With increasing contribution of leakage in total active power, run-time leakage control techniques are becoming extremely important. Supply gating provides an effective, low-overhead and technology scalable approach for active leakage reduction through the well-known \"stacking effect\". However, conventional supply gating approaches are typically coarse-grained in both space and time - i.e. are applied to large data path or memory blocks when an entire logic/memory block is idle for sufficiently long period. They suffer from limited applicability at run time. On the other hand, fine-grained supply gating is constrained primarily by the large wake-up delay and wake-up power overhead. In this paper, we propose a novel fine-grained width-aware dynamic supply gating (WADSG) approach to reduce both active leakage and redundant switching power in data path and embedded memory (e.g. L1/L2 cache). The approach exploits the abundance of narrow-width (NW) operands in general-purpose and embedded applications to \"supply-gate\" unused parts of integer execution units and memory blocks while they are in use. We introduce a novel levelized gating strategy to virtually eliminate the wake-up delay overhead. We employ the proposed WADSG approach to a super scalar processor. To reduce the wake-up power we use a width aware instruction issue policy. In case of L1 and L2 cache, we store the width information per \"ways\" of associative cache and supply-gate the most significant bits of the NW ways. We also propose a width-aware block allocation and replacement policy to maximize the number of NW ways. Simulation results for 45nm technology with Spec2k benchmarks show major savings (34.5%) in total processor power (considering both switching and active leakage power) with no performance impact. As a by-product, the proposed scheme also improves the thermal profile of both data path and memory.","PeriodicalId":405021,"journal":{"name":"2012 25th International Conference on VLSI Design","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Width-Aware Fine-Grained Dynamic Supply Gating: A Design Methodology for Low-Power Datapath and Memory\",\"authors\":\"L. Wang, Somnath Paul, S. Bhunia\",\"doi\":\"10.1109/VLSID.2012.94\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With increasing contribution of leakage in total active power, run-time leakage control techniques are becoming extremely important. Supply gating provides an effective, low-overhead and technology scalable approach for active leakage reduction through the well-known \\\"stacking effect\\\". However, conventional supply gating approaches are typically coarse-grained in both space and time - i.e. are applied to large data path or memory blocks when an entire logic/memory block is idle for sufficiently long period. They suffer from limited applicability at run time. On the other hand, fine-grained supply gating is constrained primarily by the large wake-up delay and wake-up power overhead. In this paper, we propose a novel fine-grained width-aware dynamic supply gating (WADSG) approach to reduce both active leakage and redundant switching power in data path and embedded memory (e.g. L1/L2 cache). The approach exploits the abundance of narrow-width (NW) operands in general-purpose and embedded applications to \\\"supply-gate\\\" unused parts of integer execution units and memory blocks while they are in use. We introduce a novel levelized gating strategy to virtually eliminate the wake-up delay overhead. We employ the proposed WADSG approach to a super scalar processor. To reduce the wake-up power we use a width aware instruction issue policy. In case of L1 and L2 cache, we store the width information per \\\"ways\\\" of associative cache and supply-gate the most significant bits of the NW ways. We also propose a width-aware block allocation and replacement policy to maximize the number of NW ways. Simulation results for 45nm technology with Spec2k benchmarks show major savings (34.5%) in total processor power (considering both switching and active leakage power) with no performance impact. As a by-product, the proposed scheme also improves the thermal profile of both data path and memory.\",\"PeriodicalId\":405021,\"journal\":{\"name\":\"2012 25th International Conference on VLSI Design\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-01-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 25th International Conference on VLSI Design\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VLSID.2012.94\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 25th International Conference on VLSI Design","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VLSID.2012.94","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

随着泄漏在总有功功率中所占的比重越来越大,运行时泄漏控制技术显得尤为重要。电源门控通过众所周知的“堆叠效应”,为主动减少泄漏提供了一种有效、低开销和技术可扩展的方法。然而,传统的供应门控方法通常在空间和时间上都是粗粒度的——也就是说,当整个逻辑/内存块空闲足够长的时间时,应用于大数据路径或内存块。它们在运行时的适用性有限。另一方面,细粒度供应门控主要受到较大唤醒延迟和唤醒功率开销的限制。在本文中,我们提出了一种新颖的细粒度宽度感知动态电源门控(WADSG)方法,以减少数据路径和嵌入式存储器(例如L1/L2缓存)中的有源泄漏和冗余开关功率。该方法利用通用和嵌入式应用程序中大量的窄宽度(NW)操作数,在整数执行单元和内存块正在使用时“供应门”未使用的部分。我们引入了一种新的平化门控策略来消除唤醒延迟开销。我们将所提出的WADSG方法应用于一个超大标量处理器。为了减少唤醒功率,我们使用了宽度感知指令发布策略。在L1和L2缓存的情况下,我们根据关联缓存的“方式”存储宽度信息,并提供NW方式的最高有效位。我们还提出了一个宽度感知的块分配和替换策略,以最大限度地增加NW方式的数量。采用Spec2k基准测试的45nm技术的仿真结果显示,在没有性能影响的情况下,总处理器功耗(考虑开关和有源泄漏功率)大幅节省(34.5%)。作为副产品,该方案还改善了数据路径和存储器的热分布。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Width-Aware Fine-Grained Dynamic Supply Gating: A Design Methodology for Low-Power Datapath and Memory
With increasing contribution of leakage in total active power, run-time leakage control techniques are becoming extremely important. Supply gating provides an effective, low-overhead and technology scalable approach for active leakage reduction through the well-known "stacking effect". However, conventional supply gating approaches are typically coarse-grained in both space and time - i.e. are applied to large data path or memory blocks when an entire logic/memory block is idle for sufficiently long period. They suffer from limited applicability at run time. On the other hand, fine-grained supply gating is constrained primarily by the large wake-up delay and wake-up power overhead. In this paper, we propose a novel fine-grained width-aware dynamic supply gating (WADSG) approach to reduce both active leakage and redundant switching power in data path and embedded memory (e.g. L1/L2 cache). The approach exploits the abundance of narrow-width (NW) operands in general-purpose and embedded applications to "supply-gate" unused parts of integer execution units and memory blocks while they are in use. We introduce a novel levelized gating strategy to virtually eliminate the wake-up delay overhead. We employ the proposed WADSG approach to a super scalar processor. To reduce the wake-up power we use a width aware instruction issue policy. In case of L1 and L2 cache, we store the width information per "ways" of associative cache and supply-gate the most significant bits of the NW ways. We also propose a width-aware block allocation and replacement policy to maximize the number of NW ways. Simulation results for 45nm technology with Spec2k benchmarks show major savings (34.5%) in total processor power (considering both switching and active leakage power) with no performance impact. As a by-product, the proposed scheme also improves the thermal profile of both data path and memory.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Two Graph Based Circuit Simulator for PDE-Electrical Analogy Tutorial T8A: Designing Silicon-Photonic Communication Networks for Manycore Systems Efficient Online RTL Debugging Methodology for Logic Emulation Systems Low-Overhead Maximum Power Point Tracking for Micro-Scale Solar Energy Harvesting Systems A Framework for TSV Serialization-aware Synthesis of Application Specific 3D Networks-on-Chip
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1