Design of Ultra-Low Power Scalable-Throughput Many-Core DSP Applications

Meeta Srivastav, M. Ehteshamuddin, K. Stegner, L. Nazhandali
ACM Trans. Design Autom. Electr. Syst., pp. 34:1-34:21, published 2015-06-24. DOI: https://doi.org/10.1145/2720018
Citations: 1

Abstract

We propose a system-level solution for designing process-variation-aware (PVA), scalable-throughput many-core systems for energy-constrained applications. Our methodology uses voltage scaling to improve energy efficiency and compensates for the resulting loss in throughput by exploiting the parallelism present in many DSP designs. We demonstrate that this hybrid method consumes 6.27% to 28.15% less power than simple dynamic voltage scaling across different workload environments. We present the design details of a prototype chip fabricated in a 90nm technology node, along with measurement results. The chip consists of 8 homogeneous FIR cores that can run from near-threshold to nominal voltages. In our population of 20 chips, we observe 7% variation in speed among the cores at nominal voltage (0.9V) and 26% at near-threshold voltage (0.55V). We also observe 54% variation in the power consumption of the cores. For any desired throughput, the optimum number of cores and their optimum operating voltage are chosen based on the speed and power characteristics of the cores on the chip. We also analyze how the energy efficiency of such systems changes with ambient temperature.
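The selection step described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. It is an exhaustive search under assumptions made here for illustration: all active cores share one supply voltage, each core runs at its own measured maximum speed, the workload splits perfectly across cores, and idle cores draw no power. The function name `pick_cores`, the tables `SPEED` and `POWER`, their units, and every number in them are hypothetical, and the example uses 4 cores rather than 8 to stay short.

```python
from itertools import combinations

# Hypothetical per-core characterization, keyed by supply voltage in mV.
# SPEED[v][i]: throughput of core i at voltage v (arbitrary units).
# POWER[v][i]: power of core i at voltage v (arbitrary integer units).
# These numbers are invented for illustration, not measured data.
SPEED = {
    550: [10, 12, 9, 11],
    900: [50, 52, 49, 51],
}
POWER = {
    550: [100, 150, 90, 120],
    900: [1200, 1400, 1100, 1300],
}


def pick_cores(speed, power, target):
    """Return (total_power, voltage, core_indices) with the lowest total
    power whose summed throughput meets `target`, or None if no
    combination of cores and voltage can reach it."""
    best = None
    for v in speed:
        n = len(speed[v])
        # Try every non-empty subset of cores at this voltage.
        for k in range(1, n + 1):
            for cores in combinations(range(n), k):
                throughput = sum(speed[v][i] for i in cores)
                if throughput < target:
                    continue
                total = sum(power[v][i] for i in cores)
                if best is None or total < best[0]:
                    best = (total, v, cores)
    return best
```

With these made-up tables, a low target such as `pick_cores(SPEED, POWER, 30)` is met most cheaply by three cores at 550 mV, while a target of 100 can only be reached at 900 mV. The search also prefers the lower-power cores within a voltage, which is the sense in which the choice depends on per-core variation. Exhaustive search is practical here because an 8-core chip has only 255 non-empty core subsets per voltage.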