Power and performance aware memory-controller voting mechanism

M. Vratonjic, H. Singh, G. Kumar, R. Mohamed, Ashish Bajaj, Ken Gainey
Published in: 2018 19th International Symposium on Quality Electronic Design (ISQED), 2018-03-13
DOI: 10.1109/ISQED.2018.8357276 (https://doi.org/10.1109/ISQED.2018.8357276)
Citations: 1

Abstract

Modern Systems-on-Chip (SoCs) integrate a graphics unit (GPU) with many application processor cores (CPUs), communication cores (modem, WiFi) and device interfaces (USB, HDMI) on a single die. The primary memory system is fast becoming a major performance bottleneck as more and more of these units share this critical resource. An Integrated-Memory-Controller (IMC) is responsible for buffering and servicing memory requests from the CPU cores, the GPU and other processing blocks that require DDR memory access. Previous work [2] focused on appropriately prioritizing memory requests and increasing IMC/DDR memory frequency to improve system performance, which came at the expense of higher power consumption. Recent work has addressed this problem with a demand-based approach: the IMC is made aware of the application characteristics, and its frequency is then scaled according to the memory access demand [1]. This leads to lower IMC and DDR frequencies and thus lower power. The work presented here shows that, instead of lowering the frequency, greater total system power savings can be achieved by increasing the IMC frequency at the beginning of a use case that has moderate GPU utilization. The primary motivation behind this approach is that it allows the GPU, with its inherent ability to execute a larger number of parallel threads, to access memory faster and therefore complete its processing portion of the execution pipeline faster. This, in turn, relaxes the timing requirements imposed on the CPU pipeline portion and consecutive cycles, thus saving total system power. This paper presents an algorithm for this technique, along with silicon results on an SoC implemented in an industrial 28nm process.
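The abstract does not spell out the algorithm, so the following is only a minimal sketch of the two policies it contrasts, not the authors' implementation. It assumes that each client votes with a requested bandwidth, that the demand-based policy picks the lowest operating point covering the summed votes, and that the power- and performance-aware policy steps one operating point higher at the start of a use case with moderate GPU utilization. The frequency table, the bytes-per-cycle figure, the utilization band and all names (`Vote`, `demand_based_freq`, `power_perf_aware_freq`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence, Tuple

# Hypothetical IMC operating points in MHz (not taken from the paper).
IMC_FREQS_MHZ: Tuple[int, ...] = (200, 400, 800, 1066)


@dataclass(frozen=True)
class Vote:
    """One client's memory-bandwidth request (illustrative structure)."""
    client: str              # e.g. "cpu", "gpu", "modem"
    bandwidth_mb_s: float    # requested bandwidth in MB/s


def demand_based_freq(votes: Iterable[Vote],
                      bytes_per_cycle: float = 8.0,
                      freqs: Sequence[int] = IMC_FREQS_MHZ) -> int:
    """Lowest operating point that covers the aggregate requested bandwidth.

    MB/s divided by bytes per cycle gives the required clock in MHz.
    """
    needed_mhz = sum(v.bandwidth_mb_s for v in votes) / bytes_per_cycle
    for f in freqs:
        if f >= needed_mhz:
            return f
    return freqs[-1]  # demand exceeds the table: clamp to the highest point


def power_perf_aware_freq(votes: Iterable[Vote],
                          gpu_util: float,
                          at_use_case_start: bool,
                          moderate_band: Tuple[float, float] = (0.3, 0.7),
                          freqs: Sequence[int] = IMC_FREQS_MHZ) -> int:
    """Demand-based choice, raised one step when a use case begins with
    moderate GPU utilization (the band is an assumed threshold)."""
    base = demand_based_freq(votes, freqs=freqs)
    low, high = moderate_band
    if at_use_case_start and low <= gpu_util <= high:
        idx = freqs.index(base)
        return freqs[min(idx + 1, len(freqs) - 1)]
    return base


if __name__ == "__main__":
    votes = [Vote("cpu", 1000.0), Vote("gpu", 1500.0)]
    print(demand_based_freq(votes))
    print(power_perf_aware_freq(votes, gpu_util=0.5, at_use_case_start=True))
```

Whether such a boost actually lowers total system power depends on how much the faster GPU phase relaxes the CPU's timing requirements, which is what the paper's silicon results evaluate; the sketch only models the frequency decision.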