Comparison of hardware and software cache coherence schemes

S. Adve, Vikram S. Adve, M. Hill, M. Vernon
{"title":"Comparison of hardware and software cache coherence schemes","authors":"S. Adve, Vikram S. Adve, M. Hill, M. Vernon","doi":"10.1145/115953.115982","DOIUrl":null,"url":null,"abstract":"We use mean value analysis models to compare representative hardware and software cache coherence schemes for a large-scale shared-memory system. Our goal is to identify the workloads for which either of the schemes is significantly better. Our methodology improves upon previous analytical studies and complements previous simulation studies by developing a common high-level workload model that is used to derive separate sets of lowlevel workload parameters for the two schemes. This approach allows an equitable comparison of the two schemes for a specific workload. is attractive because the overhead of detecting stale data is transferred from runtime to compile time, and the design complexity is transferred from hardware to software. However. software schemes may perform poorly because compile-time analysis may need IO be conservative, leading to unnecessary cache misses and main memory updates. In this paper, we use approximate Mean Value Analysis [U881 to compare the performance of a representative software scheme with a directory-based hardware scheme on a large-scale shared-memory system. In a previous study comparing the performance of hardware and software coherence, Cheong and VeidenOur resuIi, show that software schemes are haum used a parallelizing compiler to implement three difable (in terms of processor efficiency) IO hardware schemes ferent Software coherence schemes [Che90]. For selccted for a wide class of programs. The only cases for which subroutines Of Seven programs, they show that the hit ratio software schemes ,,erform sienificmtlv worse than of their most sophisticated software scheme (version con, ~~~ ~~~~~~ r~","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"64 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"91","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/115953.115982","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 91

Abstract

We use mean value analysis models to compare representative hardware and software cache coherence schemes for a large-scale shared-memory system. Our goal is to identify the workloads for which either of the schemes is significantly better. Our methodology improves upon previous analytical studies and complements previous simulation studies by developing a common high-level workload model that is used to derive separate sets of lowlevel workload parameters for the two schemes. This approach allows an equitable comparison of the two schemes for a specific workload. is attractive because the overhead of detecting stale data is transferred from runtime to compile time, and the design complexity is transferred from hardware to software. However. software schemes may perform poorly because compile-time analysis may need IO be conservative, leading to unnecessary cache misses and main memory updates. In this paper, we use approximate Mean Value Analysis [U881 to compare the performance of a representative software scheme with a directory-based hardware scheme on a large-scale shared-memory system. In a previous study comparing the performance of hardware and software coherence, Cheong and VeidenOur resuIi, show that software schemes are haum used a parallelizing compiler to implement three difable (in terms of processor efficiency) IO hardware schemes ferent Software coherence schemes [Che90]. For selccted for a wide class of programs. The only cases for which subroutines Of Seven programs, they show that the hit ratio software schemes ,,erform sienificmtlv worse than of their most sophisticated software scheme (version con, ~~~ ~~~~~~ r~
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
硬件和软件缓存一致性方案的比较
我们使用均值分析模型来比较大型共享内存系统中具有代表性的硬件和软件缓存一致性方案。我们的目标是确定哪一种方案明显更好的工作负载。我们的方法改进了以前的分析研究,并通过开发一个通用的高级工作负载模型来补充以前的模拟研究,该模型用于为两个方案派生单独的低级工作负载参数集。这种方法可以公平地比较特定工作量的两种办法。之所以有吸引力,是因为检测陈旧数据的开销从运行时转移到了编译时,并且设计复杂性从硬件转移到了软件。然而。软件方案可能表现不佳,因为编译时分析可能需要IO保守,导致不必要的缓存丢失和主存更新。在本文中,我们使用近似均值分析[U881]来比较大型共享内存系统上具有代表性的软件方案与基于目录的硬件方案的性能。在之前的一项比较硬件和软件一致性性能的研究中,Cheong和VeidenOur的研究结果表明,软件方案可以使用并行编译器来实现三种不同的IO硬件方案(就处理器效率而言)和不同的软件一致性方案[Che90]。可供广泛的课程选择。在七个程序的子程序中,他们显示命中率软件方案的唯一情况下,比他们最复杂的软件方案(版本con, ~~~ ~~~~~~ r~)表现得更糟糕
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
The effect on RISC performance of register set size and structure versus code generation strategy GT-EP: a novel high-performance real-time architecture Performance prediction and tuning on a multiprocessor High performance interprocessor communication through optical wavelength division multiple access channels An empirical study of the CRAY Y-MP processor using the PERFECT club benchmarks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1