Dual Pattern Compression Using Data-Preprocessing for Large-Scale GPU Architectures

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Pub Date: 2019-05-20. DOI: 10.1109/IPDPS.2019.00076

Kyung Hoon Kim, Priyank Devpura, Abhishek Nayyar, Andrew Doolittle, K. H. Yum, Eun Jung Kim


Citations: 1

Abstract

Graphics Processing Units (GPUs) have been widely adopted for diverse general-purpose applications because of their massive degree of parallelism. Demand for large-scale GPUs that process large volumes of data at high throughput has been rising rapidly. However, designing a bandwidth-efficient network for large-scale GPUs is challenging. Compression is a practical remedy that effectively increases network bandwidth by reducing the size of the data transferred. We propose a new, simple compression mechanism, Dual Pattern Compression (DPC), that compresses only two patterns with very low latency. The simplicity of compression and decompression is achieved through data remapping and data-type-aware preprocessing, which exploits bit-level data redundancy. The data type is detected at runtime. We demonstrate that our compression scheme effectively mitigates network congestion in a large-scale GPU. It improves IPC by 33% on average (up to 126%) across various benchmarks, with average space savings ratios of 61% in integer, 46% (up to 72%) in floating-point, and 23% (up to 57%) in character-type benchmarks.
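The abstract does not say which two patterns DPC compresses or how the remapping is done, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the remapping is a bit-plane transposition of a block of 32-bit words and that the two compressible patterns are an all-zero plane and an all-one plane. The function names and the 2-bit per-plane tag are hypothetical. Space savings is taken here as 1 minus the ratio of compressed size to original size.

```python
from typing import List

WORD_BITS = 32  # assumed word width; the abstract does not fix one


def bit_planes(words: List[int]) -> List[int]:
    """Remap a block of 32-bit words into 32 bit-planes.

    Plane i collects bit i of every word, so bits that are identical
    across the block end up together in one uniform plane.
    """
    planes = []
    for bit in range(WORD_BITS):
        plane = 0
        for idx, word in enumerate(words):
            plane |= (((word & 0xFFFFFFFF) >> bit) & 1) << idx
        planes.append(plane)
    return planes


def compressed_bits(words: List[int]) -> int:
    """Size in bits after encoding each plane with a hypothetical 2-bit tag.

    A plane matching one of the two assumed patterns (all zeros or all
    ones) costs only the tag; any other plane costs the tag plus its
    raw bits.
    """
    n = len(words)
    all_ones = (1 << n) - 1
    total = 0
    for plane in bit_planes(words):
        if plane == 0 or plane == all_ones:
            total += 2          # tag only
        else:
            total += 2 + n      # tag + uncompressed plane
    return total


def space_savings(words: List[int]) -> float:
    """Space savings ratio: 1 - compressed size / original size."""
    original = WORD_BITS * len(words)
    return 1 - compressed_bits(words) / original


if __name__ == "__main__":
    block = [100, 101, 102, 103, 104, 105, 106, 107]  # small, similar integers
    print(compressed_bits(block), space_savings(block))
```

For the sample block of eight nearby integers, only the four lowest bit-planes vary across words; the other 28 planes are uniform, which is why such a simple two-pattern check can already remove most of the bits in this toy case.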
