Simulation sampling with live-points

T. Wenisch, Roland E. Wunderlich, B. Falsafi, J. Hoe
{"title":"Simulation sampling with live-points","authors":"T. Wenisch, Roland E. Wunderlich, B. Falsafi, J. Hoe","doi":"10.1109/ISPASS.2006.1620785","DOIUrl":null,"url":null,"abstract":"Current simulation-sampling techniques construct accurate model state for each measurement by continuously warming large microarchitectural structures (e.g., caches and the branch predictor) while functionally simulating the billions of instructions between measurements. This approach, called functional warming, is the main performance bottleneck of simulation sampling and requires hours of runtime while the detailed simulation of the sample requires only minutes. Existing simulators can avoid functional simulation by jumping directly to particular instruction stream locations with architectural state checkpoints. To replace functional warming, these checkpoints must additionally provide microarchitectural model state that is accurate and reusable across experiments while meeting tight storage constraints. In this paper, we present a simulation-sampling framework that replaces functional warming with live-points without sacrificing accuracy. A live-point stores the bare minimum of functionally-warmed state for accurate simulation of a limited execution window while placing minimal restrictions on microarchitectural configuration. Live-points can be processed in random rather than program order, allowing simulation results and their statistical confidence to be reported while simulations are in progress. Our framework matches the accuracy of prior simulation-sampling techniques (i.e., /spl plusmn/3% error with 99.7% confidence), while estimating the performance of an 8-way out-of-order superscalar processor running SPEC CPU2000 in 91 seconds per benchmark, on average, using a 12 GB live-point library.","PeriodicalId":369192,"journal":{"name":"2006 IEEE International Symposium on Performance Analysis of Systems and Software","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"58","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE International Symposium on Performance Analysis of Systems and Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2006.1620785","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 58

Abstract

Current simulation-sampling techniques construct accurate model state for each measurement by continuously warming large microarchitectural structures (e.g., caches and the branch predictor) while functionally simulating the billions of instructions between measurements. This approach, called functional warming, is the main performance bottleneck of simulation sampling and requires hours of runtime while the detailed simulation of the sample requires only minutes. Existing simulators can avoid functional simulation by jumping directly to particular instruction stream locations with architectural state checkpoints. To replace functional warming, these checkpoints must additionally provide microarchitectural model state that is accurate and reusable across experiments while meeting tight storage constraints. In this paper, we present a simulation-sampling framework that replaces functional warming with live-points without sacrificing accuracy. A live-point stores the bare minimum of functionally-warmed state for accurate simulation of a limited execution window while placing minimal restrictions on microarchitectural configuration. Live-points can be processed in random rather than program order, allowing simulation results and their statistical confidence to be reported while simulations are in progress. Our framework matches the accuracy of prior simulation-sampling techniques (i.e., /spl plusmn/3% error with 99.7% confidence), while estimating the performance of an 8-way out-of-order superscalar processor running SPEC CPU2000 in 91 seconds per benchmark, on average, using a 12 GB live-point library.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用活点模拟采样
当前的模拟采样技术通过不断加热大型微架构结构(例如缓存和分支预测器)来为每次测量构建精确的模型状态,同时在测量之间功能模拟数十亿条指令。这种方法称为功能升温,是模拟采样的主要性能瓶颈,需要数小时的运行时间,而样本的详细模拟只需要几分钟。现有的模拟器可以通过直接跳转到具有体系结构状态检查点的特定指令流位置来避免功能模拟。为了取代功能性升温,这些检查点必须额外提供精确的微架构模型状态,并在满足严格的存储约束的情况下跨实验可重用。在本文中,我们提出了一个模拟采样框架,在不牺牲精度的情况下,用活点代替功能变暖。活动点存储最小的功能预热状态,用于精确模拟有限的执行窗口,同时对微架构配置施加最小的限制。实时点可以随机处理,而不是按程序顺序处理,允许在模拟进行时报告模拟结果及其统计置信度。我们的框架与之前的模拟采样技术的准确性相匹配(即/spl + usmn/3%的误差,99.7%的置信度),同时使用12 GB的活点库估计运行SPEC CPU2000的8路无序标量处理器在每个基准测试平均91秒内的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Accelerating architectural exploration using canonical instruction segments Simulation sampling with live-points Characterizing the branch misprediction penalty Friendly fire: understanding the effects of multiprocessor prefetches Evaluating the efficacy of statistical simulation for design space exploration
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1