WIR: Warp Instruction Reuse to Minimize Repeated Computations in GPUs

Keunsoo Kim, W. Ro
{"title":"扭曲指令重用以最小化gpu中的重复计算","authors":"Keunsoo Kim, W. Ro","doi":"10.1109/HPCA.2018.00041","DOIUrl":null,"url":null,"abstract":"Warp instructions with an identical arithmetic operation on same input values produce the identical computation results. This paper proposes warp instruction reuse to allow such repeated warp instructions to reuse previous computation results instead of actually executing the instructions. Bypassing register reading, functional unit, and register writing operations improves energy efficiency. This reuse technique is especially beneficial for GPUs since a GPU warp register is usually as wide as thousands of bits. In addition, we propose warp register reuse which allows identical warp register values to share a single physical register through register renaming. The register reuse technique enables to see if different logical warp registers have an identical value by only looking at their physical warp register IDs. Based on this observation, warp register reuse helps to perform all necessary operations for warp instruction reuse with register IDs, which is substantially more efficient than directly manipulating register values. Performance evaluation shows that 20.5% SM energy and 10.7% GPU energy can be saved by allowing 18.7% of warp instructions to reuse prior results.","PeriodicalId":154694,"journal":{"name":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"WIR: Warp Instruction Reuse to Minimize Repeated Computations in GPUs\",\"authors\":\"Keunsoo Kim, W. Ro\",\"doi\":\"10.1109/HPCA.2018.00041\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Warp instructions with an identical arithmetic operation on same input values produce the identical computation results. This paper proposes warp instruction reuse to allow such repeated warp instructions to reuse previous computation results instead of actually executing the instructions. Bypassing register reading, functional unit, and register writing operations improves energy efficiency. This reuse technique is especially beneficial for GPUs since a GPU warp register is usually as wide as thousands of bits. In addition, we propose warp register reuse which allows identical warp register values to share a single physical register through register renaming. The register reuse technique enables to see if different logical warp registers have an identical value by only looking at their physical warp register IDs. Based on this observation, warp register reuse helps to perform all necessary operations for warp instruction reuse with register IDs, which is substantially more efficient than directly manipulating register values. 
Performance evaluation shows that 20.5% SM energy and 10.7% GPU energy can be saved by allowing 18.7% of warp instructions to reuse prior results.\",\"PeriodicalId\":154694,\"journal\":{\"name\":\"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-03-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCA.2018.00041\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2018.00041","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 14

Abstract

Warp instructions that perform the same arithmetic operation on the same input values produce identical computation results. This paper proposes warp instruction reuse, which allows such repeated warp instructions to reuse previous computation results instead of actually executing the instructions. Bypassing the register read, functional unit, and register write operations improves energy efficiency. This reuse technique is especially beneficial for GPUs, since a GPU warp register is typically thousands of bits wide. In addition, we propose warp register reuse, which allows identical warp register values to share a single physical register through register renaming. With register reuse, whether different logical warp registers hold an identical value can be determined simply by comparing their physical warp register IDs. Based on this observation, warp register reuse allows all operations needed for warp instruction reuse to be performed on register IDs, which is substantially more efficient than directly manipulating register values. Performance evaluation shows that 20.5% of SM energy and 10.7% of GPU energy can be saved by allowing 18.7% of warp instructions to reuse prior results.
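To make the mechanism concrete, the sketch below models the two ideas from the abstract at warp granularity: a renamed register file in which identical warp-wide values share one physical register (warp register reuse), and a reuse buffer keyed by the opcode and the source operands' physical register IDs (warp instruction reuse). This is a minimal illustrative simulation, not the paper's hardware design; the names WarpRegisterFile, ReuseBuffer, and issue are hypothetical.

```python
class WarpRegisterFile:
    """Physical warp registers with value-based renaming (warp register reuse).

    Logical warp registers that hold identical warp-wide values map to the
    same physical register, so value equality can be tested by comparing
    physical register IDs instead of thousand-bit register contents.
    """

    def __init__(self):
        self.logical_to_physical = {}  # logical reg name -> physical reg ID
        self.value_to_physical = {}    # warp-wide value (tuple) -> physical reg ID
        self.physical_values = []      # physical reg ID -> warp-wide value

    def write(self, logical_reg, lane_values):
        value = tuple(lane_values)     # one element per thread in the warp
        phys = self.value_to_physical.get(value)
        if phys is None:               # first occurrence of this value: allocate
            phys = len(self.physical_values)
            self.physical_values.append(value)
            self.value_to_physical[value] = phys
        self.logical_to_physical[logical_reg] = phys
        return phys

    def alias(self, logical_reg, phys_id):
        """Point a logical register at an existing physical register (no data movement)."""
        self.logical_to_physical[logical_reg] = phys_id

    def physical_id(self, logical_reg):
        return self.logical_to_physical[logical_reg]

    def read(self, logical_reg):
        return self.physical_values[self.physical_id(logical_reg)]


class ReuseBuffer:
    """Warp instruction reuse: cache results keyed by (opcode, source physical IDs)."""

    def __init__(self):
        self.table = {}                # (opcode, src phys IDs) -> dest phys ID

    def lookup(self, opcode, src_phys_ids):
        return self.table.get((opcode, tuple(src_phys_ids)))

    def insert(self, opcode, src_phys_ids, dst_phys_id):
        self.table[(opcode, tuple(src_phys_ids))] = dst_phys_id


def issue(opcode, dst, srcs, regfile, reuse, lane_op):
    """Issue one warp instruction, reusing a prior result when possible."""
    src_ids = [regfile.physical_id(r) for r in srcs]
    hit = reuse.lookup(opcode, src_ids)
    if hit is not None:
        # Reuse path: skip register read, functional unit, and register write.
        regfile.alias(dst, hit)
        return "reused"
    # Miss: execute the operation across every lane of the warp.
    operands = [regfile.read(r) for r in srcs]
    result = [lane_op(opcode, *lanes) for lanes in zip(*operands)]
    dst_id = regfile.write(dst, result)
    reuse.insert(opcode, src_ids, dst_id)
    return "executed"


if __name__ == "__main__":
    lane_op = lambda op, a, b: a + b if op == "ADD" else a * b
    rf, rb = WarpRegisterFile(), ReuseBuffer()
    rf.write("r1", [1, 2, 3, 4])
    rf.write("r2", [5, 6, 7, 8])
    rf.write("r3", [1, 2, 3, 4])                              # same value as r1 -> same physical ID
    print(issue("ADD", "r4", ["r1", "r2"], rf, rb, lane_op))  # "executed"
    print(issue("ADD", "r5", ["r3", "r2"], rf, rb, lane_op))  # "reused"
```

Because equal values map to the same physical ID, the reuse lookup never compares the wide register contents themselves; a hit simply remaps the destination logical register to the cached physical register, skipping the register read, the functional unit, and the register write.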