SIMIL: SIMple Issue Logic for GPUs

IF 1.9 4区 计算机科学 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Microprocessors and Microsystems Pub Date : 2024-10-09 DOI:10.1016/j.micpro.2024.105105
{"title":"SIMIL: SIMple Issue Logic for GPUs","authors":"","doi":"10.1016/j.micpro.2024.105105","DOIUrl":null,"url":null,"abstract":"<div><div>GPU architectures have become popular for executing general-purpose programs. In particular, they are some of the most efficient architectures for machine learning applications which are among the most trendy and demanding applications nowadays.</div><div>This paper presents SIMIL (SIMple Issue Logic for GPUs), an architectural modification to the issue stage that replaces scoreboards with a Dependence Matrix to track dependencies among instructions and avoid data hazards. We show that a Dependence Matrix is more effective in the presence of repetitive use of source operands, which is common in many applications. Besides, a Dependence Matrix with minor extensions can also support a simplistic out-of-order issue. Evaluations on an NVIDIA Tesla V100-like GPU show that SIMIL provides a speed-up of up to 2.39 in some machine learning programs and 1.31 on average for various benchmarks, while it reduces energy consumption by 12.81%, with only 1.5% area overhead. We also show that SIMIL outperforms a recently proposed approach for out-of-order issue that uses register renaming.</div></div>","PeriodicalId":49815,"journal":{"name":"Microprocessors and Microsystems","volume":null,"pages":null},"PeriodicalIF":1.9000,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Microprocessors and Microsystems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141933124001005","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

GPU architectures have become popular for executing general-purpose programs. In particular, they are some of the most efficient architectures for machine learning applications which are among the most trendy and demanding applications nowadays.
This paper presents SIMIL (SIMple Issue Logic for GPUs), an architectural modification to the issue stage that replaces scoreboards with a Dependence Matrix to track dependencies among instructions and avoid data hazards. We show that a Dependence Matrix is more effective in the presence of repetitive use of source operands, which is common in many applications. Besides, a Dependence Matrix with minor extensions can also support a simplistic out-of-order issue. Evaluations on an NVIDIA Tesla V100-like GPU show that SIMIL provides a speed-up of up to 2.39 in some machine learning programs and 1.31 on average for various benchmarks, while it reduces energy consumption by 12.81%, with only 1.5% area overhead. We also show that SIMIL outperforms a recently proposed approach for out-of-order issue that uses register renaming.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SIMIL:用于 GPU 的简单问题逻辑
GPU 架构已成为执行通用程序的流行架构。本文介绍了 SIMIL(SIMple Issue Logic for GPUs,GPU 的简单问题逻辑),它是对问题阶段的架构修改,用依赖矩阵取代记分板,以跟踪指令之间的依赖关系并避免数据危险。我们的研究表明,在重复使用源操作数的情况下,依赖矩阵更为有效,而这在许多应用中都很常见。此外,稍加扩展的依赖性矩阵还能支持简单的失序问题。在英伟达™(NVIDIA®)Tesla V100 类 GPU 上进行的评估表明,SIMIL 在某些机器学习程序中的速度提升了 2.39 倍,在各种基准测试中的平均速度提升了 1.31 倍,能耗降低了 12.81%,而面积开销仅为 1.5%。我们还表明,SIMIL 的性能优于最近提出的一种使用寄存器重命名来处理失序问题的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Microprocessors and Microsystems
Microprocessors and Microsystems 工程技术-工程:电子与电气
CiteScore
6.90
自引率
3.80%
发文量
204
审稿时长
172 days
期刊介绍: Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects related to embedded systems hardware. This includes different embedded system hardware platforms ranging from custom hardware via reconfigurable systems and application specific processors to general purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC) and multi-processor systems on a chip (MPSoC), as well as, their memory and communication methods and structures, such as network-on-chip (NoC). Design automation of such systems including methodologies, techniques, flows and tools for their design, as well as, novel designs of hardware components fall within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central in this journal. While software is not in the main focus of this journal, methods of hardware/software co-design, as well as, application restructuring and mapping to embedded hardware platforms, that consider interplay between software and hardware components with emphasis on hardware, are also in the journal scope.
期刊最新文献
Algorithms for scheduling CNNs on multicore MCUs at the neuron and layer levels Low-cost constant time signed digit selection for most significant bit first multiplication Retraction notice to “A Hybrid Semantic Similarity Measurement for Geospatial Entities” [Microprocessors and Microsystems 80 (2021) 103526] A hardware architecture for single and multiple ellipse detection using genetic algorithms and high-level synthesis tools SIMIL: SIMple Issue Logic for GPUs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1