包泵:克服gpgpu片上互连中的网络瓶颈*

2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-24 DOI:10.1145/3195970.3196087

Xianwei Cheng, Yang Zhao, Hui Zhao, Yuan Xie

{"title":"包泵:克服gpgpu片上互连中的网络瓶颈*","authors":"Xianwei Cheng, Yang Zhao, Hui Zhao, Yuan Xie","doi":"10.1145/3195970.3196087","DOIUrl":null,"url":null,"abstract":"In order to fully exploit GPGPU's parallel processing power, on-chip interconnects need to provide bandwidth efficient data communication. GPGPUs exhibit a many-to-few-to-many traffic pattern which makes the memory controller connected routers the network bottleneck. Inefficient design of conventional routers causes long queues of packets blocked at memory controllers and thus greatly constrained the network bandwidth. In this work, we employ heterogeneous design techniques and propose a novel decoupled architecture for routers connected with memory controllers. To further improve performance, we propose techniques called Injection Virtual Circuit and Memory-aware Adaptive Routing. We show that our scheme can effectively eliminate NoC bottleneck and improve performance by 78% on average.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"14 1","pages":"1-6"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Packet Pump: Overcoming Network Bottleneck in On-Chip Interconnects for GPGPUs*\",\"authors\":\"Xianwei Cheng, Yang Zhao, Hui Zhao, Yuan Xie\",\"doi\":\"10.1145/3195970.3196087\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In order to fully exploit GPGPU's parallel processing power, on-chip interconnects need to provide bandwidth efficient data communication. GPGPUs exhibit a many-to-few-to-many traffic pattern which makes the memory controller connected routers the network bottleneck. Inefficient design of conventional routers causes long queues of packets blocked at memory controllers and thus greatly constrained the network bandwidth. In this work, we employ heterogeneous design techniques and propose a novel decoupled architecture for routers connected with memory controllers. To further improve performance, we propose techniques called Injection Virtual Circuit and Memory-aware Adaptive Routing. We show that our scheme can effectively eliminate NoC bottleneck and improve performance by 78% on average.\",\"PeriodicalId\":6491,\"journal\":{\"name\":\"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)\",\"volume\":\"14 1\",\"pages\":\"1-6\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-06-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3195970.3196087\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3195970.3196087","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

为了充分利用GPGPU的并行处理能力，片上互连需要提供带宽高效的数据通信。gpgpu表现出多对少对多的流量模式，使得内存控制器连接的路由器成为网络瓶颈。传统路由器的低效设计导致数据包长队列阻塞在内存控制器上，从而极大地限制了网络带宽。在这项工作中，我们采用异构设计技术，并提出了一种新的与内存控制器连接的路由器解耦架构。为了进一步提高性能，我们提出了注入虚拟电路和内存感知自适应路由技术。结果表明，该方案可以有效地消除NoC瓶颈，性能平均提高78%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Packet Pump: Overcoming Network Bottleneck in On-Chip Interconnects for GPGPUs*

In order to fully exploit GPGPU's parallel processing power, on-chip interconnects need to provide bandwidth efficient data communication. GPGPUs exhibit a many-to-few-to-many traffic pattern which makes the memory controller connected routers the network bottleneck. Inefficient design of conventional routers causes long queues of packets blocked at memory controllers and thus greatly constrained the network bandwidth. In this work, we employ heterogeneous design techniques and propose a novel decoupled architecture for routers connected with memory controllers. To further improve performance, we propose techniques called Injection Virtual Circuit and Memory-aware Adaptive Routing. We show that our scheme can effectively eliminate NoC bottleneck and improve performance by 78% on average.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)

自引率

0.00%

发文量