更快的缓存合并功能升温

Gustaf Borgström, C. Rohner, D. Black-Schaffer
{"title":"更快的缓存合并功能升温","authors":"Gustaf Borgström, C. Rohner, D. Black-Schaffer","doi":"10.1145/3579170.3579256","DOIUrl":null,"url":null,"abstract":"Smarts-like sampled hardware simulation techniques achieve good accuracy by simulating many small portions of an application in detail. However, while this reduces the simulation time, it results in extensive cache warming times, as each of the many simulation points requires warming the whole memory hierarchy. Adaptive Cache Warming reduces this time by iteratively increasing warming to achieve sufficient accuracy. Unfortunately, each increases requires that the previous warming be redone, nearly doubling the total warming. We address re-warming by developing a technique to merge the cache states from the previous and additional warming iterations. We demonstrate our merging approach on multi-level LRU cache hierarchy and evaluate and address the introduced errors. Our experiments show that Cache Merging delivers an average speedup of 1.44 ×, 1.84 ×, and 1.87 × for 128kB, 2MB, and 8MB L2 caches, respectively, (vs. a 2 × theoretical maximum speedup) with 95-percentile absolute IPC errors of only 0.029, 0.015, and 0.006, respectively. These results demonstrate that Cache Merging yields significantly higher simulation speed with minimal losses.","PeriodicalId":153341,"journal":{"name":"Proceedings of the DroneSE and RAPIDO: System Engineering for constrained embedded systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Faster Functional Warming with Cache Merging\",\"authors\":\"Gustaf Borgström, C. Rohner, D. Black-Schaffer\",\"doi\":\"10.1145/3579170.3579256\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Smarts-like sampled hardware simulation techniques achieve good accuracy by simulating many small portions of an application in detail. However, while this reduces the simulation time, it results in extensive cache warming times, as each of the many simulation points requires warming the whole memory hierarchy. Adaptive Cache Warming reduces this time by iteratively increasing warming to achieve sufficient accuracy. Unfortunately, each increases requires that the previous warming be redone, nearly doubling the total warming. We address re-warming by developing a technique to merge the cache states from the previous and additional warming iterations. We demonstrate our merging approach on multi-level LRU cache hierarchy and evaluate and address the introduced errors. Our experiments show that Cache Merging delivers an average speedup of 1.44 ×, 1.84 ×, and 1.87 × for 128kB, 2MB, and 8MB L2 caches, respectively, (vs. a 2 × theoretical maximum speedup) with 95-percentile absolute IPC errors of only 0.029, 0.015, and 0.006, respectively. These results demonstrate that Cache Merging yields significantly higher simulation speed with minimal losses.\",\"PeriodicalId\":153341,\"journal\":{\"name\":\"Proceedings of the DroneSE and RAPIDO: System Engineering for constrained embedded systems\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the DroneSE and RAPIDO: System Engineering for constrained embedded systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3579170.3579256\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the DroneSE and RAPIDO: System Engineering for constrained embedded systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3579170.3579256","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

类似智能的采样硬件仿真技术通过详细模拟应用程序的许多小部分来实现良好的准确性。然而,虽然这减少了模拟时间,但它导致了大量的缓存升温时间,因为许多模拟点中的每一个都需要升温整个内存层次结构。自适应缓存升温通过迭代增加升温来达到足够的精度,从而减少了这一时间。不幸的是,每次升温都需要之前的变暖重新开始,几乎使总变暖增加一倍。我们通过开发一种技术来合并来自先前和额外的升温迭代的缓存状态,从而解决重新升温问题。我们在多级LRU缓存层次结构上演示了我们的合并方法,并评估和解决了引入的错误。我们的实验表明,对于128kB、2MB和8MB的L2缓存,Cache合并分别提供了1.44倍、1.84倍和1.87倍的平均加速(相对于2倍的理论最大加速),95%的绝对IPC误差分别只有0.029、0.015和0.006。这些结果表明,缓存合并以最小的损失显著提高了模拟速度。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Faster Functional Warming with Cache Merging
Smarts-like sampled hardware simulation techniques achieve good accuracy by simulating many small portions of an application in detail. However, while this reduces the simulation time, it results in extensive cache warming times, as each of the many simulation points requires warming the whole memory hierarchy. Adaptive Cache Warming reduces this time by iteratively increasing warming to achieve sufficient accuracy. Unfortunately, each increases requires that the previous warming be redone, nearly doubling the total warming. We address re-warming by developing a technique to merge the cache states from the previous and additional warming iterations. We demonstrate our merging approach on multi-level LRU cache hierarchy and evaluate and address the introduced errors. Our experiments show that Cache Merging delivers an average speedup of 1.44 ×, 1.84 ×, and 1.87 × for 128kB, 2MB, and 8MB L2 caches, respectively, (vs. a 2 × theoretical maximum speedup) with 95-percentile absolute IPC errors of only 0.029, 0.015, and 0.006, respectively. These results demonstrate that Cache Merging yields significantly higher simulation speed with minimal losses.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Automatic DRAM Subsystem Configuration with irace Datapath Optimization for Embedded Signal Processing Architectures utilizing Design Space Exploration Faster Functional Warming with Cache Merging Towards a European Network of Enabling Technologies for Drones Towards an Ontological Methodology for Dynamic Dependability Management of Unmanned Aerial Vehicles
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1