{"title":"使用外部流特性改进步幅预取","authors":"H. Al-Sukhni, James Holt, D. Connors","doi":"10.1109/ISPASS.2006.1620801","DOIUrl":null,"url":null,"abstract":"Stride-based prefetching mechanisms exploit regular streams of memory accesses to hide memory latency. While these mechanisms are effective, they can be improved by studying the properties of regular streams. As evidence of this, the establishment of metrics to quantify intrinsic characteristics of regular streams has been shown to enable software-based code optimizations. In this paper we extend previously identified regular stream metrics to quantify extrinsic characteristics of regular streams, and show how these new metrics can be employed to improve the efficiency of stride prefetching. The extrinsic metrics we introduce are stream affinity and stream density. Stream affinity enables prefetching for short streams that were previously ignored by stride prefetching mechanisms. Stream density enables a prioritization mechanism that dynamically selects amongst available streams in favor of those that promise more miss coverage, and provides thrashing control amongst several coexisting streams. Finally, we show that using intrinsic and extrinsic stream metrics in combination allows a novel hardware technique for controlling prefetch ahead distance (PAD) which dynamically adjusts the prefetch launch time to better enable timely prefetches while minimizing cache pollution. For a representative set of SPEC2K traces, our techniques consistently outperform our implementation of the closest previously reported stride-based prefetching technique.","PeriodicalId":369192,"journal":{"name":"2006 IEEE International Symposium on Performance Analysis of Systems and Software","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Improved stride prefetching using extrinsic stream characteristics\",\"authors\":\"H. Al-Sukhni, James Holt, D. Connors\",\"doi\":\"10.1109/ISPASS.2006.1620801\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Stride-based prefetching mechanisms exploit regular streams of memory accesses to hide memory latency. While these mechanisms are effective, they can be improved by studying the properties of regular streams. As evidence of this, the establishment of metrics to quantify intrinsic characteristics of regular streams has been shown to enable software-based code optimizations. In this paper we extend previously identified regular stream metrics to quantify extrinsic characteristics of regular streams, and show how these new metrics can be employed to improve the efficiency of stride prefetching. The extrinsic metrics we introduce are stream affinity and stream density. Stream affinity enables prefetching for short streams that were previously ignored by stride prefetching mechanisms. Stream density enables a prioritization mechanism that dynamically selects amongst available streams in favor of those that promise more miss coverage, and provides thrashing control amongst several coexisting streams. Finally, we show that using intrinsic and extrinsic stream metrics in combination allows a novel hardware technique for controlling prefetch ahead distance (PAD) which dynamically adjusts the prefetch launch time to better enable timely prefetches while minimizing cache pollution. For a representative set of SPEC2K traces, our techniques consistently outperform our implementation of the closest previously reported stride-based prefetching technique.\",\"PeriodicalId\":369192,\"journal\":{\"name\":\"2006 IEEE International Symposium on Performance Analysis of Systems and Software\",\"volume\":\"31 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-03-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2006 IEEE International Symposium on Performance Analysis of Systems and Software\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISPASS.2006.1620801\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE International Symposium on Performance Analysis of Systems and Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2006.1620801","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Improved stride prefetching using extrinsic stream characteristics
Stride-based prefetching mechanisms exploit regular streams of memory accesses to hide memory latency. While these mechanisms are effective, they can be improved by studying the properties of regular streams. As evidence of this, the establishment of metrics to quantify intrinsic characteristics of regular streams has been shown to enable software-based code optimizations. In this paper we extend previously identified regular stream metrics to quantify extrinsic characteristics of regular streams, and show how these new metrics can be employed to improve the efficiency of stride prefetching. The extrinsic metrics we introduce are stream affinity and stream density. Stream affinity enables prefetching for short streams that were previously ignored by stride prefetching mechanisms. Stream density enables a prioritization mechanism that dynamically selects amongst available streams in favor of those that promise more miss coverage, and provides thrashing control amongst several coexisting streams. Finally, we show that using intrinsic and extrinsic stream metrics in combination allows a novel hardware technique for controlling prefetch ahead distance (PAD) which dynamically adjusts the prefetch launch time to better enable timely prefetches while minimizing cache pollution. For a representative set of SPEC2K traces, our techniques consistently outperform our implementation of the closest previously reported stride-based prefetching technique.