E^2MC: Entropy Encoding Based Memory Compression for GPUs

S. Lal, J. Lucas, B. Juurlink
{"title":"E^2MC: Entropy Encoding Based Memory Compression for GPUs","authors":"S. Lal, J. Lucas, B. Juurlink","doi":"10.1109/IPDPS.2017.101","DOIUrl":null,"url":null,"abstract":"Modern Graphics Processing Units (GPUs) provide much higher off-chip memory bandwidth than CPUs, but many GPU applications are still limited by memory bandwidth. Unfortunately, off-chip memory bandwidth is growing slower than the number of cores and has become a performance bottleneck. Thus, optimizations of effective memory bandwidth play a significant role for scaling the performance of GPUs. Memory compression is a promising approach for improving memory bandwidth which can translate into higher performance and energy efficiency. However, compression is not free and its challenges need to be addressed, otherwise the benefits of compression may be offset by its overhead. We propose an entropy encoding based memory compression (E2MC) technique for GPUs, which is based on the well-known Huffman encoding. We study the feasibility of entropy encoding for GPUs and show that it achieves higher compression ratios than state-of-the-art GPU compression techniques. Furthermore, we address the key challenges of probability estimation, choosing an appropriate symbol length for encoding, and decompression with low latency. The average compression ratio of E2MC is 53% higher than the state of the art. This translates into an average speedup of 20% compared to no compression and 8% higher compared to the state of the art. Energy consumption and energy-delayproduct are reduced by 13% and 27%, respectively. Moreover, the compression ratio achieved by E2MC is close to the optimal compression ratio given by Shannon’s source coding theorem.","PeriodicalId":209524,"journal":{"name":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2017.101","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 20

Abstract

Modern Graphics Processing Units (GPUs) provide much higher off-chip memory bandwidth than CPUs, but many GPU applications are still limited by memory bandwidth. Unfortunately, off-chip memory bandwidth is growing more slowly than the number of cores and has become a performance bottleneck. Thus, optimizing effective memory bandwidth plays a significant role in scaling GPU performance. Memory compression is a promising approach for improving effective memory bandwidth, which can translate into higher performance and energy efficiency. However, compression is not free, and its challenges need to be addressed; otherwise the benefits of compression may be offset by its overhead. We propose an entropy encoding based memory compression (E^2MC) technique for GPUs, which builds on the well-known Huffman encoding. We study the feasibility of entropy encoding for GPUs and show that it achieves higher compression ratios than state-of-the-art GPU compression techniques. Furthermore, we address the key challenges of probability estimation, choosing an appropriate symbol length for encoding, and decompression with low latency. The average compression ratio of E^2MC is 53% higher than the state of the art. This translates into an average speedup of 20% compared to no compression and 8% compared to the state of the art. Energy consumption and energy-delay product are reduced by 13% and 27%, respectively. Moreover, the compression ratio achieved by E^2MC is close to the optimal compression ratio given by Shannon's source coding theorem.
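The abstract names Huffman encoding and Shannon's source coding bound but does not show how they relate. As a minimal illustrative sketch (not the paper's implementation; the 16-bit symbol length, the example block contents, and all function names are assumptions chosen for this example), the Python below splits a memory block into fixed-length symbols, estimates symbol probabilities by counting within the block, builds a Huffman code, and compares the encoded size against the Shannon lower bound of H(X) = -sum p(x) log2 p(x) bits per symbol.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(freqs):
    """Return the Huffman code length (in bits) for each symbol in freqs."""
    # Heap entries: (weight, tie_breaker, {symbol: code_length_so_far}).
    # The tie_breaker keeps the dicts from ever being compared directly.
    heap = [(f, i, {sym: 0}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Degenerate block with a single distinct symbol: use a 1-bit code.
        return {next(iter(freqs)): 1}
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: length + 1 for s, length in {**d1, **d2}.items()}
        tie += 1
        heapq.heappush(heap, (f1 + f2, tie, merged))
    return heap[0][2]


def compress_block(block, symbol_bits=16):
    """Estimate the Huffman-encoded size of one memory block, in bits."""
    assert symbol_bits % 8 == 0, "keep symbols byte-aligned for simplicity"
    step = symbol_bits // 8
    symbols = [bytes(block[i:i + step]) for i in range(0, len(block), step)]
    freqs = Counter(symbols)  # per-block probability estimate (simplification)
    lengths = huffman_code_lengths(freqs)
    encoded_bits = sum(lengths[s] for s in symbols)
    # Shannon's source coding theorem: average code length >= entropy H(X).
    total = len(symbols)
    entropy_bits = -sum((f / total) * math.log2(f / total) for f in freqs.values())
    return encoded_bits, entropy_bits * total


if __name__ == "__main__":
    # Hypothetical 128-byte block with the value redundancy typical of GPU data
    # (repeated small integers / similar floats); the contents are made up.
    block = bytes([0x00, 0x3F] * 32 + [0x00, 0x40] * 24 + [0xCD, 0xAB] * 8)
    encoded, bound = compress_block(block, symbol_bits=16)
    original = len(block) * 8
    print(f"original: {original} bits, Huffman: {encoded} bits, "
          f"entropy bound: {bound:.1f} bits, ratio: {original / encoded:.2f}x")
```

Note that this sketch sidesteps the challenges the paper addresses: it estimates probabilities per block rather than offline, and it ignores the cost of storing the code table and of decompressing with low latency in hardware.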