Nexus:分布式共享缓存中复制的新方法

Po-An Tsai, Nathan Beckmann, Daniel Sánchez
{"title":"Nexus:分布式共享缓存中复制的新方法","authors":"Po-An Tsai, Nathan Beckmann, Daniel Sánchez","doi":"10.1109/PACT.2017.42","DOIUrl":null,"url":null,"abstract":"Last-level caches are increasingly distributed, consisting of many small banks. To perform well, most accesses must be served by banks near requesting cores. An attractive approach is to replicate read-only data so that a copy is available nearby. But replication introduces a delicate tradeoff between capacity and latency: too little replication forces cores to access faraway banks, while too much replication wastes cache space and causes excessive off-chip misses. Workloads vary widely in their desired amount of replication, demanding an adaptive approach. Prior adaptive replication techniques only replicate data in each tile's local bank, so they focus on selecting which data to replicate. Unfortunately, data that is not replicated still incurs a full network traversal, limiting the performance of these techniques.We argue that a better strategy is to let cores share replicas and that adaptive schemes should focus on selecting how much to replicate (i.e., how many replicas to have across the chip). This idea fully exploits the latency-capacity tradeoff, achieving qualitatively higher performance than prior adaptive replication techniques. It can be applied to many prior cache organizations, and we demonstrate it on two: Nexus-R extends R-NUCA, and Nexus-J extends Jigsaw. We evaluate Nexus on HPC and server workloads running on a 144-core chip, where it outperforms prior adaptive replication schemes and improves performance by up to 90% and by 23% on average across all workloads sensitive to replication.","PeriodicalId":438103,"journal":{"name":"2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Nexus: A New Approach to Replication in Distributed Shared Caches\",\"authors\":\"Po-An Tsai, Nathan Beckmann, Daniel Sánchez\",\"doi\":\"10.1109/PACT.2017.42\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Last-level caches are increasingly distributed, consisting of many small banks. To perform well, most accesses must be served by banks near requesting cores. An attractive approach is to replicate read-only data so that a copy is available nearby. But replication introduces a delicate tradeoff between capacity and latency: too little replication forces cores to access faraway banks, while too much replication wastes cache space and causes excessive off-chip misses. Workloads vary widely in their desired amount of replication, demanding an adaptive approach. Prior adaptive replication techniques only replicate data in each tile's local bank, so they focus on selecting which data to replicate. Unfortunately, data that is not replicated still incurs a full network traversal, limiting the performance of these techniques.We argue that a better strategy is to let cores share replicas and that adaptive schemes should focus on selecting how much to replicate (i.e., how many replicas to have across the chip). This idea fully exploits the latency-capacity tradeoff, achieving qualitatively higher performance than prior adaptive replication techniques. It can be applied to many prior cache organizations, and we demonstrate it on two: Nexus-R extends R-NUCA, and Nexus-J extends Jigsaw. We evaluate Nexus on HPC and server workloads running on a 144-core chip, where it outperforms prior adaptive replication schemes and improves performance by up to 90% and by 23% on average across all workloads sensitive to replication.\",\"PeriodicalId\":438103,\"journal\":{\"name\":\"2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT)\",\"volume\":\"97 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PACT.2017.42\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PACT.2017.42","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

摘要

最后一级缓存越来越分散,由许多小银行组成。为了表现良好,大多数访问必须由靠近请求核心的银行提供服务。一种有吸引力的方法是复制只读数据,以便在附近有副本可用。但是复制在容量和延迟之间引入了一个微妙的权衡:太少的复制迫使核心访问遥远的银行,而太多的复制浪费缓存空间并导致过多的芯片外丢失。工作负载所需的复制量差异很大,因此需要一种自适应方法。以前的自适应复制技术只复制每个块的本地库中的数据,因此它们关注于选择要复制的数据。不幸的是,未复制的数据仍然需要遍历整个网络,从而限制了这些技术的性能。我们认为,更好的策略是让核心共享副本,而自适应方案应侧重于选择复制的数量(即,在芯片上有多少副本)。这个想法充分利用了延迟与容量之间的权衡,实现了比以前的自适应复制技术在质量上更高的性能。它可以应用于许多先前的缓存组织,我们在两个例子中进行了演示:Nexus-R扩展了R-NUCA, Nexus-J扩展了Jigsaw。我们对运行在144核芯片上的HPC和服务器工作负载上的Nexus进行了评估,它优于先前的自适应复制方案,在所有对复制敏感的工作负载上,性能提高了高达90%,平均提高了23%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Nexus: A New Approach to Replication in Distributed Shared Caches
Last-level caches are increasingly distributed, consisting of many small banks. To perform well, most accesses must be served by banks near requesting cores. An attractive approach is to replicate read-only data so that a copy is available nearby. But replication introduces a delicate tradeoff between capacity and latency: too little replication forces cores to access faraway banks, while too much replication wastes cache space and causes excessive off-chip misses. Workloads vary widely in their desired amount of replication, demanding an adaptive approach. Prior adaptive replication techniques only replicate data in each tile's local bank, so they focus on selecting which data to replicate. Unfortunately, data that is not replicated still incurs a full network traversal, limiting the performance of these techniques.We argue that a better strategy is to let cores share replicas and that adaptive schemes should focus on selecting how much to replicate (i.e., how many replicas to have across the chip). This idea fully exploits the latency-capacity tradeoff, achieving qualitatively higher performance than prior adaptive replication techniques. It can be applied to many prior cache organizations, and we demonstrate it on two: Nexus-R extends R-NUCA, and Nexus-J extends Jigsaw. We evaluate Nexus on HPC and server workloads running on a 144-core chip, where it outperforms prior adaptive replication schemes and improves performance by up to 90% and by 23% on average across all workloads sensitive to replication.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
POSTER: Exploiting Approximations for Energy/Quality Tradeoffs in Service-Based Applications End-to-End Deep Learning of Optimization Heuristics Large Scale Data Clustering Using Memristive k-Median Computation DrMP: Mixed Precision-Aware DRAM for High Performance Approximate and Precise Computing POSTER: Improving Datacenter Efficiency Through Partitioning-Aware Scheduling
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1