STM: Cloning the spatial and temporal memory access behavior

Amro Awad, Yan Solihin
Published in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
Publication date: 2014-06-19
DOI: 10.1109/HPCA.2014.6835935 (https://doi.org/10.1109/HPCA.2014.6835935)
Citations: 27

Abstract

Computer architects need a deep understanding of clients' workloads in order to design and tune the architecture. Unfortunately, many important clients will not share their software with computer architects due to its proprietary or confidential nature. One technique to mitigate this problem is producing synthetic traces (clones) that replicate the behavior of the original workloads. However, today there is no universal cloning technique that can capture the arbitrary memory access behavior of applications. Existing techniques capture only temporal, but not spatial, locality. To study memory hierarchy organization beyond caches, such as prefetchers and translation lookaside buffers (TLBs), capturing only temporal locality is insufficient. In this paper, we propose a new memory access behavior cloning technique that captures both temporal and spatial locality. We abbreviate our scheme as Spatio-Temporal Memory (STM) cloning. We propose a new profiling method and statistics that capture stride patterns and transition probabilities. We show how the new statistics enable accurate clone generation that allows clones to be used in place of the original benchmarks for studying the L1/L2/TLB miss rates as we vary the L1 cache, L1 prefetcher, L2 cache, TLB, and page size configurations.
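The abstract does not spell out the exact statistics STM collects, so the following is only an illustrative sketch of the general idea of "stride patterns and transition probabilities", not the authors' method. It assumes a first-order model in which each stride (the difference between consecutive addresses) predicts the next one. The function names `profile_strides`, `transition_probabilities`, and `generate_clone` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def profile_strides(addresses):
    """Return the stride sequence and counts of stride-to-stride transitions."""
    strides = [b - a for a, b in zip(addresses, addresses[1:])]
    transitions = defaultdict(Counter)
    for prev, nxt in zip(strides, strides[1:]):
        transitions[prev][nxt] += 1
    return strides, transitions


def transition_probabilities(transitions):
    """Normalize raw transition counts into per-stride probability tables."""
    probs = {}
    for prev, counter in transitions.items():
        total = sum(counter.values())
        probs[prev] = {nxt: n / total for nxt, n in counter.items()}
    return probs


def generate_clone(start, first_stride, probs, length, seed=0):
    """Emit a synthetic address trace by sampling strides from the tables."""
    rng = random.Random(seed)
    trace = [start]
    stride = first_stride
    while len(trace) < length:
        trace.append(trace[-1] + stride)
        table = probs.get(stride)
        if not table:
            # Assumption: a stride with no recorded successor simply repeats.
            continue
        choices, weights = zip(*table.items())
        stride = rng.choices(choices, weights=weights)[0]
    return trace


# Toy trace (made up for illustration): two passes over four 64-byte blocks.
original = [0, 64, 128, 192, 0, 64, 128, 192]
strides, counts = profile_strides(original)
probs = transition_probabilities(counts)
clone = generate_clone(original[0], strides[0], probs, len(original))
```

The clone contains only synthetic addresses derived from the profile, which is the property that lets a client share it without exposing the original software. A model this simple ignores temporal reuse distance and longer stride histories, both of which a realistic cloning scheme would likely need.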