Prefetch throttling and data pinning for improving performance of shared caches

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
Pub Date: 2008-11-15
DOI: 10.1145/1413370.1413430

O. Ozturk, S. Son, M. Kandemir, Mustafa Karaköy


Citations: 2

Abstract

In this paper, we (i) quantify the impact of compiler-directed I/O prefetching on shared caches at I/O nodes, showing experimentally that while I/O prefetching brings some benefits, its effectiveness drops significantly as the number of clients (compute nodes) increases; (ii) identify inter-client misses caused by harmful I/O prefetches as one of the main sources of this performance reduction; and (iii) propose and experimentally evaluate prefetch throttling and data pinning schemes to improve the performance of I/O prefetching. Prefetch throttling prevents one or more clients from issuing further prefetches if those prefetches are predicted to be harmful, i.e., to evict from the memory cache useful data accessed by other clients. Data pinning, on the other hand, makes selected data blocks immune to harmful prefetches by pinning them in the memory cache. We show that the two schemes can be applied in isolation or in combination, at either a coarse or a fine granularity. Our experiments with four disk-intensive applications reveal that, with 8 clients, the two optimizations improve performance by 9.7% on average over standard compiler-directed I/O prefetching and by 15.1% on average over the no-prefetch case.
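To make the two ideas concrete, the following is a minimal Python sketch of a shared cache that supports both throttling and pinning. It is not the authors' implementation: the `SharedCache` class, the LRU replacement policy, and the `harm_threshold` knob are all assumptions introduced for illustration. In particular, the paper throttles prefetches that are *predicted* to be harmful, whereas this sketch uses a simpler stand-in and counts prefetches that were *observed* to be harmful (a block evicted by one client's prefetch is later missed by another client).

```python
from collections import OrderedDict


class SharedCache:
    """Toy LRU cache shared by several clients at one I/O node (illustrative only)."""

    def __init__(self, capacity, harm_threshold=2):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> last client to use it, in LRU order
        self.pinned = set()          # blocks that prefetches may not evict
        self.evicted_by = {}         # evicted block -> client whose prefetch evicted it
        self.harmful = {}            # client -> number of observed harmful prefetches
        self.harm_threshold = harm_threshold  # hypothetical throttling knob

    def throttled(self, client):
        """A client is throttled once it has caused enough harmful prefetches."""
        return self.harmful.get(client, 0) >= self.harm_threshold

    def pin(self, block):
        """Data pinning: protect a cached block from prefetch-induced eviction."""
        if block in self.blocks:
            self.pinned.add(block)

    def _victim(self, for_prefetch):
        # Iterate from least to most recently used; prefetches skip pinned blocks.
        for block in self.blocks:
            if not (for_prefetch and block in self.pinned):
                return block
        return None

    def _insert(self, block, client, for_prefetch):
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        if len(self.blocks) >= self.capacity:
            victim = self._victim(for_prefetch)
            if victim is None:
                return False  # every block is pinned; drop the prefetch
            owner = self.blocks.pop(victim)
            self.pinned.discard(victim)
            if for_prefetch and owner != client:
                # Remember who evicted another client's data.
                self.evicted_by[victim] = client
        self.evicted_by.pop(block, None)
        self.blocks[block] = client
        return True

    def prefetch(self, client, block):
        """Prefetch throttling: refuse prefetches from throttled clients."""
        if self.throttled(client):
            return False
        return self._insert(block, client, for_prefetch=True)

    def access(self, client, block):
        """Demand access; returns True on a hit, False on a miss."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            self.blocks[block] = client
            return True
        culprit = self.evicted_by.get(block)
        if culprit is not None and culprit != client:
            # Inter-client miss caused by an earlier prefetch: count it as harmful.
            self.harmful[culprit] = self.harmful.get(culprit, 0) + 1
        self._insert(block, client, for_prefetch=False)
        return False


# Usage: client "B" evicts client "A"'s block with a prefetch and gets throttled.
cache = SharedCache(capacity=2, harm_threshold=1)
cache.access("A", 1)
cache.access("A", 2)
cache.prefetch("B", 10)   # evicts block 1, which "A" still needs
cache.access("A", 1)      # inter-client miss, charged to "B"
b_is_throttled = cache.throttled("B")
```

In this sketch throttling is applied per client and pinning per block, which loosely corresponds to the fine-granularity end of the design space mentioned in the abstract; a coarser variant could throttle or pin in larger units.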

