Analysis of memory hierarchy performance of block data layout

Proceedings International Conference on Parallel Processing. Pub Date: 2002-08-18. DOI: 10.1109/ICPP.2002.1040857

Neungsoo Park, Bo Hong, V. Prasanna


Citations: 36

Abstract

Recently, several experimental studies have been conducted on block data layout as a data transformation technique used in conjunction with tiling to improve cache performance. We provide a theoretical analysis for the TLB and cache performance of block data layout. For standard matrix access patterns, we derive an asymptotic lower bound on the number of TLB misses for any data layout and show that block data layout achieves this bound. We show that block data layout improves TLB misses by a factor of O(B) compared with conventional data layouts, where B is the block size of block data layout. This reduction contributes to the improvement in memory hierarchy performance. Using our TLB and cache analysis, we also discuss the impact of block size on the overall memory hierarchy performance. These results are validated through simulations and experiments on state-of-the-art platforms.
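The effect of block data layout on TLB behavior can be illustrated with a small sketch. This is not the paper's analysis or its experimental setup: the function names, the matrix size `n`, the block edge `b`, and the page size `page_elems` are all hypothetical choices made for illustration, and counting distinct pages touched is only a rough proxy for TLB misses. The sketch assumes that `b` divides `n`, that blocks are stored contiguously in row-major block order, and that one block fills exactly one page.

```python
def row_major_offset(i, j, n):
    """Offset of element (i, j) in a conventional row-major n x n array."""
    return i * n + j


def block_layout_offset(i, j, n, b):
    """Offset of element (i, j) when the n x n matrix is stored as
    contiguous b x b blocks (blocks in row-major order, elements inside
    each block in row-major order). Assumes b divides n."""
    blocks_per_row = n // b
    block_id = (i // b) * blocks_per_row + (j // b)
    return block_id * b * b + (i % b) * b + (j % b)


def pages_touched(offsets, page_elems):
    """Number of distinct pages covered by a sequence of element offsets."""
    return len({off // page_elems for off in offsets})


# Hypothetical sizes: a 1024 x 1024 matrix, 32 x 32 blocks, and pages
# that hold 1024 elements, so each block occupies exactly one page.
n, b, page_elems = 1024, 32, 1024

# Walk down column 0, a standard access pattern with stride n in row-major.
col_row_major = [row_major_offset(i, 0, n) for i in range(n)]
col_block = [block_layout_offset(i, 0, n, b) for i in range(n)]

pages_row_major = pages_touched(col_row_major, page_elems)
pages_block = pages_touched(col_block, page_elems)

print("row-major pages:", pages_row_major)
print("block layout pages:", pages_block)
```

With these particular sizes the column walk touches 1024 pages in row-major order and 32 pages in block layout, a ratio equal to the block edge `b`. That ratio depends on the chosen sizes and is meant only to make the O(B) claim in the abstract concrete, not to reproduce the paper's bound.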

