Seizing the Bandwidth Scaling of On-Package Interconnect in a Post-Moore's Law World

Grigory Chirkov, D. Wentzlaff
Published in: Proceedings of the 37th International Conference on Supercomputing
Publication date: 2023-06-21
DOI: 10.1145/3577193.3593702 (https://doi.org/10.1145/3577193.3593702)
Citations: 2

Abstract

The slowing and forecasted end of Moore's Law has forced designers to look beyond simply adding transistors and to employ other unused resources as a way to increase chip performance. At the same time, inter-die interconnect technologies have made a huge leap forward in recent years, dramatically increasing the available bandwidth. While the end of Moore's Law will inevitably slow the performance advances of single-die setups, interconnect technologies will likely continue to scale. We envision a future where designers must find ways to convert interconnect utilization into better system performance. As an example of such a feature, we present Meduza, a write-update coherence protocol for future chiplet systems. Meduza extends previous write-update protocols to systems with multi-level cache hierarchies. Compared to the MESIF coherence protocol on a chiplet-based system, Meduza improves execution speed in our benchmark suite by 19%. Moreover, Meduza promises even greater advantages in future systems. This work shows that exploiting excess interconnect bandwidth offers significant potential for additional performance in modern and future chiplet systems.
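To make the trade-off between write-update and write-invalidate coherence concrete, the sketch below models a single cache line shared by a few cores and counts read misses and interconnect messages under each policy. It is a generic textbook-style toy, not Meduza and not taken from the paper: the function name `simulate`, the trace format, and the one-message-per-event cost model are all assumptions made for illustration, and the model ignores message sizes, cache levels, and latency.

```python
def simulate(trace, policy):
    """Toy model of ONE cache line shared by several cores (illustrative only).

    trace  : list of (core_id, op) pairs, op is "R" (read) or "W" (write)
    policy : "invalidate" or "update"
    Returns (read_misses, messages).

    Assumptions (hypothetical cost model, not from the paper):
      - every fetch, invalidation, or update costs exactly one message;
      - a writer obtains a valid copy without a separate fetch message.
    """
    if policy not in ("invalidate", "update"):
        raise ValueError(f"unknown policy: {policy}")

    sharers = set()   # cores that currently hold a valid copy of the line
    read_misses = 0
    messages = 0

    for core, op in trace:
        if op == "R":
            if core not in sharers:
                read_misses += 1
                messages += 1          # fetch the line over the interconnect
                sharers.add(core)
        elif op == "W":
            others = sharers - {core}
            messages += len(others)    # one invalidation OR one update per other sharer
            if policy == "invalidate":
                sharers = {core}       # other copies become stale
            else:
                sharers.add(core)      # other copies are refreshed and stay valid
        else:
            raise ValueError(f"unknown op: {op}")

    return read_misses, messages


# Producer-consumer sharing: core 0 writes, cores 1 and 2 read, four rounds.
producer_consumer = [(0, "W"), (1, "R"), (2, "R")] * 4

# Write burst: core 1 reads once, core 0 writes five times, core 1 reads again.
write_burst = [(1, "R")] + [(0, "W")] * 5 + [(1, "R")]

if __name__ == "__main__":
    for name, trace in [("producer_consumer", producer_consumer),
                        ("write_burst", write_burst)]:
        for policy in ("invalidate", "update"):
            misses, msgs = simulate(trace, policy)
            print(f"{name:18s} {policy:10s} misses={misses} messages={msgs}")
```

In this toy model, the update policy removes repeated read misses in the producer-consumer trace, while in the write-burst trace it avoids one miss at the cost of more messages. That second case is the kind of exchange the abstract describes: spending interconnect bandwidth to gain performance.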