Comparative evaluation of latency reducing and tolerating techniques

Anoop Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, W. Weber
{"title":"减少和容忍延迟技术的比较评价","authors":"Anoop Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, W. Weber","doi":"10.1145/115953.115978","DOIUrl":null,"url":null,"abstract":"Techniques that can cope with the large latency of memory accesses are essential for achieving high processor utilization in large-scale shared-memory multiprocessors. In this paper, we consider four architectural techniques that address the latency problem: (i) hardware coherent caches, (ii) relaxed memory consistency, (iii) softwareconuolled prefetching, and (iv) multiple-context suppon. We some studies of benefits of the individual techniques have been done, no Study evaluates all of the techniques within a consistent framework. This paper attempts to remedy this by providing a comprehensive evaluation of the benefits of the four techniques, both individually and in combinations, using a consistent set of architectural assumptions. The results in this paper have been obtained using detailed simulations of a large-scale shared-memory multiprocessor. Our results show that caches and relaxed consistency UNformly improve performance. The improvements due to prefetching and multiple contexts are sizeable, but are much more applicationdependent. Combinations of the various techniques generally amin better performance than each one on its own. Overall, we show that using suitahle combinations of the techniques, performance can be improved by 4 to 7 dmes","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"148 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"229","resultStr":"{\"title\":\"Comparative evaluation of latency reducing and tolerating techniques\",\"authors\":\"Anoop Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, W. Weber\",\"doi\":\"10.1145/115953.115978\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Techniques that can cope with the large latency of memory accesses are essential for achieving high processor utilization in large-scale shared-memory multiprocessors. In this paper, we consider four architectural techniques that address the latency problem: (i) hardware coherent caches, (ii) relaxed memory consistency, (iii) softwareconuolled prefetching, and (iv) multiple-context suppon. We some studies of benefits of the individual techniques have been done, no Study evaluates all of the techniques within a consistent framework. This paper attempts to remedy this by providing a comprehensive evaluation of the benefits of the four techniques, both individually and in combinations, using a consistent set of architectural assumptions. The results in this paper have been obtained using detailed simulations of a large-scale shared-memory multiprocessor. Our results show that caches and relaxed consistency UNformly improve performance. The improvements due to prefetching and multiple contexts are sizeable, but are much more applicationdependent. Combinations of the various techniques generally amin better performance than each one on its own. Overall, we show that using suitahle combinations of the techniques, performance can be improved by 4 to 7 dmes\",\"PeriodicalId\":187095,\"journal\":{\"name\":\"[1991] Proceedings. 
The 18th Annual International Symposium on Computer Architecture\",\"volume\":\"148 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1991-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"229\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/115953.115978\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/115953.115978","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 229

Abstract

Techniques that can cope with the large latency of memory accesses are essential for achieving high processor utilization in large-scale shared-memory multiprocessors. In this paper, we consider four architectural techniques that address the latency problem: (i) hardware coherent caches, (ii) relaxed memory consistency, (iii) software-controlled prefetching, and (iv) multiple-context support. While some studies of the benefits of the individual techniques have been done, no study evaluates all of the techniques within a consistent framework. This paper attempts to remedy this by providing a comprehensive evaluation of the benefits of the four techniques, both individually and in combination, using a consistent set of architectural assumptions. The results in this paper have been obtained using detailed simulations of a large-scale shared-memory multiprocessor. Our results show that caches and relaxed consistency uniformly improve performance. The improvements due to prefetching and multiple contexts are sizeable, but are much more application-dependent. Combinations of the various techniques generally attain better performance than each one on its own. Overall, we show that using suitable combinations of the techniques, performance can be improved by 4 to 7 times.
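As a concrete illustration of technique (iii), the sketch below shows what software-controlled prefetching looks like at the source level. It is only an illustrative example, not code from the paper: the paper evaluates prefetching through detailed simulation of a shared-memory multiprocessor. The sketch assumes the GCC/Clang __builtin_prefetch intrinsic, and PREFETCH_AHEAD is a hypothetical tuning parameter.

/* Minimal sketch of software-controlled prefetching.
 * Illustrative only; PREFETCH_AHEAD is a hypothetical tuning distance. */
#include <stddef.h>

#define PREFETCH_AHEAD 16  /* elements to fetch ahead of the current index */

double sum_with_prefetch(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Issue a non-binding prefetch for data needed in a later iteration,
         * overlapping its memory latency with the current computation. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], /*rw=*/0, /*locality=*/1);
        sum += a[i];
    }
    return sum;
}

Because the prefetch is non-binding (it only warms the cache and does not change program semantics), issuing it a fixed distance ahead of use lets memory latency overlap with useful computation, which is the latency-tolerance effect the paper measures.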