Optimizing data locality using array tiling
W. Ding, Yuanrui Zhang, Jun Liu, M. Kandemir
2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 142-149, 2011-11-07
DOI: 10.1109/ICCAD.2011.6105318 (https://doi.org/10.1109/ICCAD.2011.6105318)
Data transformation is one of the key optimizations for maximizing cache locality. Traditional data transformation strategies employ linear data layouts, e.g., row-major or column-major, for multidimensional arrays. Although a linear layout matches the linear memory space well in most cases, it can optimize only the self-spatial locality of individual references. In this work, we propose a novel data layout transformation framework that determines a tiled layout for each array in an application program. A tiled layout can exploit the group-spatial locality among different references and improve cache line utilization. In our strategy, the data elements accessed by different references in one loop iteration are placed into a tile and fetched into the same cache line at runtime. This helps minimize conflict misses in caches. We evaluated our data layout transformation framework using 30 benchmarks on a commercial multicore machine. The experimental results show that our approach outperforms state-of-the-art data transformation strategies and works well with large core counts.
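To make the contrast between a linear and a tiled layout concrete, the following is a minimal sketch of a generic blocked index mapping. It is not the framework proposed in the paper. The function names, the 2x4 tile shape, the 1024-column array, and the 8-elements-per-cache-line figure are all illustrative assumptions. The sketch takes two references of the form A[i][j] and A[i+1][j], as might appear in one loop iteration, and checks which cache line each falls into under both layouts.

```python
def row_major_offset(i, j, n_cols):
    """Linear (row-major) element offset of A[i][j]."""
    return i * n_cols + j


def tiled_offset(i, j, n_cols, tile_h, tile_w):
    """Element offset of A[i][j] when the array is stored tile by tile.

    Hypothetical helper: assumes n_cols is a multiple of tile_w and the
    number of rows is a multiple of tile_h.
    """
    tiles_per_row = n_cols // tile_w
    tile_id = (i // tile_h) * tiles_per_row + (j // tile_w)
    within_tile = (i % tile_h) * tile_w + (j % tile_w)
    return tile_id * (tile_h * tile_w) + within_tile


def cache_line(offset, elems_per_line):
    """Index of the cache line that holds the element at this offset."""
    return offset // elems_per_line


# Illustrative parameters (assumptions, not taken from the paper).
N_COLS = 1024
ELEMS_PER_LINE = 8        # e.g. a 64-byte line holding 8-byte elements
TILE_H, TILE_W = 2, 4     # one tile = 8 elements = one cache line

# Two references touched in the same loop iteration: A[i][j] and A[i+1][j].
i, j = 10, 5
refs = [(i, j), (i + 1, j)]

row_major_lines = {
    cache_line(row_major_offset(r, c, N_COLS), ELEMS_PER_LINE) for r, c in refs
}
tiled_lines = {
    cache_line(tiled_offset(r, c, N_COLS, TILE_H, TILE_W), ELEMS_PER_LINE)
    for r, c in refs
}

print("cache lines touched (row-major):", len(row_major_lines))
print("cache lines touched (tiled):    ", len(tiled_lines))
```

With these parameters the two references land in two different cache lines under the row-major layout and in a single line under the tiled layout. The benefit here depends on `i` being aligned to the tile height: for an odd `i`, A[i][j] and A[i+1][j] fall into different tiles. Selecting a tile shape and alignment that fit the program's actual reference pattern is the kind of decision a layout framework has to make, and this sketch does not attempt it.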