Comparison of hardware and software cache coherence schemes
Authors: S. Adve, Vikram S. Adve, M. Hill, M. Vernon
Published in: [1991] Proceedings. The 18th Annual International Symposium on Computer Architecture
Publication date: 1991-04-01
DOI: 10.1145/115953.115982 (https://doi.org/10.1145/115953.115982)
Citations: 91
Abstract
We use mean value analysis models to compare representative hardware and software cache coherence schemes for a large-scale shared-memory system. Our goal is to identify the workloads for which either of the schemes is significantly better. Our methodology improves upon previous analytical studies and complements previous simulation studies by developing a common high-level workload model that is used to derive separate sets of low-level workload parameters for the two schemes. This approach allows an equitable comparison of the two schemes for a specific workload. Our results show that software schemes are comparable (in terms of processor efficiency) to hardware schemes for a wide class of programs. The only cases for which software schemes perform significantly worse than ...
Software cache coherence is attractive because the overhead of detecting stale data is transferred from runtime to compile time, and the design complexity is transferred from hardware to software. However, software schemes may perform poorly because compile-time analysis may need to be conservative, leading to unnecessary cache misses and main memory updates. In this paper, we use approximate Mean Value Analysis [U88] to compare the performance of a representative software scheme with a directory-based hardware scheme on a large-scale shared-memory system.
In a previous study comparing the performance of hardware and software coherence, Cheong and Veidenbaum used a parallelizing compiler to implement three different software coherence schemes [Che90]. For selected subroutines of seven programs, they show that the hit ratio of their most sophisticated software scheme (version con...