A study towards optimal data layout for GPU computing
E. Zhang, Han Li, Xipeng Shen
Workshop on Memory System Performance and Correctness, 2012-06-16
DOI: 10.1145/2247684.2247699 (https://doi.org/10.1145/2247684.2247699)
Citations: 7
Abstract
The performance of Graphics Processing Units (GPUs) is sensitive to irregular memory references. A recent study shows the promise of eliminating irregular references through runtime thread-data remapping. However, how to efficiently determine the optimal mapping remains an open question. This paper presents an initial exploration of that question, focusing on data layout optimization. It describes three algorithms that compute or approximate optimal data layouts for GPUs. These algorithms exhibit a spectrum of tradeoffs among space cost, time cost, and the quality of the resulting data layouts.
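
To make the idea of data layout remapping concrete, the following is a minimal sketch and not one of the paper's three algorithms. It assumes a simplified cost model in which each warp of 32 threads pays one memory transaction per distinct 32-element segment it touches. The access pattern, the constants, and the helper names `transactions` and `relayout` are all hypothetical, chosen only for illustration.

```python
WARP_SIZE = 32      # threads per warp (assumed)
SEGMENT_ELEMS = 32  # elements per memory segment (assumed cost-model parameter)


def transactions(indices, warp_size=WARP_SIZE, segment=SEGMENT_ELEMS):
    """Count distinct memory segments touched, summed over all warps."""
    total = 0
    for start in range(0, len(indices), warp_size):
        warp = indices[start:start + warp_size]
        total += len({i // segment for i in warp})
    return total


def relayout(data, indices):
    """Copy data so that thread t finds its element at position t.

    This is the naive layout: it duplicates any element read by more
    than one thread, trading extra space for regular accesses.
    """
    new_data = [data[i] for i in indices]
    new_indices = list(range(len(indices)))
    return new_data, new_indices


# Hypothetical irregular pattern: thread t reads element (t * 37) % n.
n = 1024
data = [float(i) for i in range(n)]
idx = [(t * 37) % n for t in range(n)]

before = transactions(idx)
new_data, new_idx = relayout(data, idx)
after = transactions(new_idx)

print("transactions before relayout:", before)
print("transactions after relayout: ", after)
```

Under this model the relaid-out array needs exactly one transaction per warp, at the price of copying every referenced element. That copy is the kind of space and time cost the abstract says must be weighed against layout quality.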