Anoop Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, W. Weber
{"title":"减少和容忍延迟技术的比较评价","authors":"Anoop Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, W. Weber","doi":"10.1145/115953.115978","DOIUrl":null,"url":null,"abstract":"Techniques that can cope with the large latency of memory accesses are essential for achieving high processor utilization in large-scale shared-memory multiprocessors. In this paper, we consider four architectural techniques that address the latency problem: (i) hardware coherent caches, (ii) relaxed memory consistency, (iii) softwareconuolled prefetching, and (iv) multiple-context suppon. We some studies of benefits of the individual techniques have been done, no Study evaluates all of the techniques within a consistent framework. This paper attempts to remedy this by providing a comprehensive evaluation of the benefits of the four techniques, both individually and in combinations, using a consistent set of architectural assumptions. The results in this paper have been obtained using detailed simulations of a large-scale shared-memory multiprocessor. Our results show that caches and relaxed consistency UNformly improve performance. The improvements due to prefetching and multiple contexts are sizeable, but are much more applicationdependent. Combinations of the various techniques generally amin better performance than each one on its own. Overall, we show that using suitahle combinations of the techniques, performance can be improved by 4 to 7 dmes","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"148 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"229","resultStr":"{\"title\":\"Comparative evaluation of latency reducing and tolerating techniques\",\"authors\":\"Anoop Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, W. Weber\",\"doi\":\"10.1145/115953.115978\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Techniques that can cope with the large latency of memory accesses are essential for achieving high processor utilization in large-scale shared-memory multiprocessors. In this paper, we consider four architectural techniques that address the latency problem: (i) hardware coherent caches, (ii) relaxed memory consistency, (iii) softwareconuolled prefetching, and (iv) multiple-context suppon. We some studies of benefits of the individual techniques have been done, no Study evaluates all of the techniques within a consistent framework. This paper attempts to remedy this by providing a comprehensive evaluation of the benefits of the four techniques, both individually and in combinations, using a consistent set of architectural assumptions. The results in this paper have been obtained using detailed simulations of a large-scale shared-memory multiprocessor. Our results show that caches and relaxed consistency UNformly improve performance. The improvements due to prefetching and multiple contexts are sizeable, but are much more applicationdependent. Combinations of the various techniques generally amin better performance than each one on its own. Overall, we show that using suitahle combinations of the techniques, performance can be improved by 4 to 7 dmes\",\"PeriodicalId\":187095,\"journal\":{\"name\":\"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture\",\"volume\":\"148 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1991-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"229\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/115953.115978\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/115953.115978","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Comparative evaluation of latency reducing and tolerating techniques
Techniques that can cope with the large latency of memory accesses are essential for achieving high processor utilization in large-scale shared-memory multiprocessors. In this paper, we consider four architectural techniques that address the latency problem: (i) hardware coherent caches, (ii) relaxed memory consistency, (iii) softwareconuolled prefetching, and (iv) multiple-context suppon. We some studies of benefits of the individual techniques have been done, no Study evaluates all of the techniques within a consistent framework. This paper attempts to remedy this by providing a comprehensive evaluation of the benefits of the four techniques, both individually and in combinations, using a consistent set of architectural assumptions. The results in this paper have been obtained using detailed simulations of a large-scale shared-memory multiprocessor. Our results show that caches and relaxed consistency UNformly improve performance. The improvements due to prefetching and multiple contexts are sizeable, but are much more applicationdependent. Combinations of the various techniques generally amin better performance than each one on its own. Overall, we show that using suitahle combinations of the techniques, performance can be improved by 4 to 7 dmes