Robust prediction of critical temperatures in multi-core chips with limited sensory data

S. Ankireddi
DOI: 10.1109/STHERM.2011.5767203 (https://doi.org/10.1109/STHERM.2011.5767203)
Published in: 2011 27th Annual IEEE Semiconductor Thermal Measurement and Management Symposium
Publication date: 2011-03-20
Citations: 1

Abstract

Current generations of high-performance microprocessors feature multiple cores and micro-cores, each supporting multiple threads implemented in hardware. Such designs routinely contain billions of transistors, and chip layout teams are frequently hard pressed to place and route all the functional blocks and sub-blocks that go into the design. An additional complexity arises because system engineers would like each micro-core's temperature to be monitored for silicon reliability and system performance reasons, which translates into a requirement that each core preferably be outfitted with a thermal sensor that is routed out to the external world. Since die real estate is already at a premium and sensor macros can often be large, CPU design teams frequently shy away from placing and routing one sensor per micro-core. The practical implication is that there is no means to monitor how hot any given micro-core is getting during field operation, which can compound risk significantly from the standpoints of silicon reliability (GoX, TDDB), chip electrical performance (timing, clock skew, jitter) and system performance (real-time benchmarks, field performance, data coherency, etc.). In this study, a multi-core processor chip with a wide range of core-to-core power variability is considered. A finite number of sensor locations, which are known to be thermally sub-optimal, are assumed to be available for placement and routing. Using sensory data from these “poor” locations and an offline training algorithm, temperatures of all key core locations are determined using a causal, linear least-squares error basis. The resulting formulation is tested for prediction integrity using a large-sample Monte Carlo analysis, and the temperature predictions are found to be robust. The technique developed is general enough to be applied across any microprocessor product family. The study concludes with suggested techniques to maintain prediction robustness in the presence of measurement errors, diode part-to-part variation and other inaccuracies. The approach proposed here can circumvent the limitations on placing and routing multiple diodes in real-estate-constrained multi-core microprocessor and ASIC applications.
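
The abstract does not give the paper's actual formulation, so the following is only a minimal sketch of the general idea it names: fit a linear least-squares map from a few sensor readings to all core temperatures offline, then check it with a Monte Carlo run. Everything specific here is an assumption made for illustration: the thermal resistance matrices `R_CORE` and `R_SENSOR`, the ambient temperature, the power range, the two-sensor/four-core layout, the sensor noise level, and all function names.

```python
import random

AMBIENT = 45.0  # assumed ambient temperature in degC (hypothetical)
# Hypothetical thermal resistances (degC per watt): rows are temperature
# locations, columns are the four core power sources.
R_CORE = [[0.9, 0.3, 0.2, 0.1], [0.3, 0.9, 0.1, 0.2],
          [0.2, 0.1, 0.9, 0.3], [0.1, 0.2, 0.3, 0.9]]
R_SENSOR = [[0.40, 0.35, 0.15, 0.10], [0.10, 0.15, 0.35, 0.40]]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit(X, y):
    """Least-squares weights (intercept last) via the normal equations."""
    Z = [x + [1.0] for x in X]
    k = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(A, b)

def predict(w, x):
    """Apply the fitted weights to one vector of sensor readings."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

def sample(rng, noise=0.0):
    """Draw random core powers; return (sensor readings, true core temps)."""
    powers = [rng.uniform(1.0, 10.0) for _ in range(4)]
    cores = [AMBIENT + sum(r * p for r, p in zip(row, powers))
             for row in R_CORE]
    sensors = [AMBIENT + sum(r * p for r, p in zip(row, powers))
               + rng.gauss(0.0, noise) for row in R_SENSOR]
    return sensors, cores

def run_demo(seed=0, n_train=500, n_trials=2000, noise=0.2):
    """Train offline on noise-free data, then Monte Carlo test with noise.

    Returns (RMS error of the fitted predictor, RMS error of a baseline
    that always predicts each core's mean training temperature).
    """
    rng = random.Random(seed)
    train = [sample(rng) for _ in range(n_train)]
    X = [s for s, _ in train]
    n_cores = len(R_CORE)
    weights = [fit(X, [c[k] for _, c in train]) for k in range(n_cores)]
    means = [sum(c[k] for _, c in train) / n_train for k in range(n_cores)]
    err_model = err_base = 0.0
    for _ in range(n_trials):
        sensors, cores = sample(rng, noise)
        for k, true_t in enumerate(cores):
            err_model += (predict(weights[k], sensors) - true_t) ** 2
            err_base += (means[k] - true_t) ** 2
    n = n_trials * n_cores
    return (err_model / n) ** 0.5, (err_base / n) ** 0.5

if __name__ == "__main__":
    model_rms, baseline_rms = run_demo()
    print(f"fitted predictor RMS error: {model_rms:.2f} degC")
    print(f"mean-only baseline RMS error: {baseline_rms:.2f} degC")
```

With fewer sensors than independent power sources, the fitted map cannot be exact; the sketch only shows that it can still beat a sensor-free baseline under these made-up parameters. The normal equations are used to keep the example dependency-free; a production fit would more likely use a QR- or SVD-based solver.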