Nyami: a synthesizable GPU architectural model for general-purpose and graphics-specific workloads

Jeffrey T. Bush, Philip Dexter, Timothy N. Miller, A. Carpenter
{"title":"Nyami: a synthesizable GPU architectural model for general-purpose and graphics-specific workloads","authors":"Jeffrey T. Bush, Philip Dexter, Timothy N. Miller, A. Carpenter","doi":"10.1109/ISPASS.2015.7095803","DOIUrl":null,"url":null,"abstract":"Graphics processing units (GPUs) continue to grow in popularity for general-purpose, highly parallel, high-throughput systems. This has forced GPU vendors to increase their focus on general purpose workloads, sometimes at the expense of the graphics-specific workloads. Using GPUs for general-purpose computation is a departure from the driving forces behind programmable GPUs that were focused on a narrow subset of graphics rendering operations. Rather than focus on purely graphics-related or general-purpose use, we have designed and modeled an architecture that optimizes for both simultaneously to efficiently handle all GPU workloads. In this paper, we present Nyami, a co-optimized GPU architecture and simulation model with an open-source implementation written in Verilog. This approach allows us to more easily explore the GPU design space in a synthesizable, cycle-precise, modular environment. An instruction-precise functional simulator is provided for co-simulation and verification. Overall, we assume a GPU may be used as a general-purpose GPU (GPGPU) or a graphics engine and account for this in the architecture's construction and in the options and modules selectable for synthesis and simulation. To demonstrate Nyami's viability as a GPU research platform, we exploit its flexibility and modularity to explore the impact of a set of architectural decisions. These include sensitivity to cache size and associativity, barrel and switch-on-stall multithreaded instruction scheduling, and software vs. hardware implementations of rasterization. Through these experiments, we gain insight into commonly accepted GPU architecture decisions, adapt the architecture accordingly, and give examples of the intended use as a GPU research tool.","PeriodicalId":189378,"journal":{"name":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2015.7095803","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 18

Abstract

Graphics processing units (GPUs) continue to grow in popularity for general-purpose, highly parallel, high-throughput systems. This has forced GPU vendors to increase their focus on general purpose workloads, sometimes at the expense of the graphics-specific workloads. Using GPUs for general-purpose computation is a departure from the driving forces behind programmable GPUs that were focused on a narrow subset of graphics rendering operations. Rather than focus on purely graphics-related or general-purpose use, we have designed and modeled an architecture that optimizes for both simultaneously to efficiently handle all GPU workloads. In this paper, we present Nyami, a co-optimized GPU architecture and simulation model with an open-source implementation written in Verilog. This approach allows us to more easily explore the GPU design space in a synthesizable, cycle-precise, modular environment. An instruction-precise functional simulator is provided for co-simulation and verification. Overall, we assume a GPU may be used as a general-purpose GPU (GPGPU) or a graphics engine and account for this in the architecture's construction and in the options and modules selectable for synthesis and simulation. To demonstrate Nyami's viability as a GPU research platform, we exploit its flexibility and modularity to explore the impact of a set of architectural decisions. These include sensitivity to cache size and associativity, barrel and switch-on-stall multithreaded instruction scheduling, and software vs. hardware implementations of rasterization. Through these experiments, we gain insight into commonly accepted GPU architecture decisions, adapt the architecture accordingly, and give examples of the intended use as a GPU research tool.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
图形处理单元(gpu)在通用、高度并行、高吞吐量的系统中越来越受欢迎。这迫使GPU供应商增加对通用工作负载的关注,有时以牺牲特定于图形的工作负载为代价。使用gpu进行通用计算是对可编程gpu背后驱动力的背离,可编程gpu专注于图形渲染操作的狭窄子集。而不是专注于纯粹的图形相关或通用用途,我们已经设计和建模了一个架构,同时优化两者,以有效地处理所有GPU工作负载。在本文中,我们提出了Nyami,一个协同优化的GPU架构和仿真模型,并使用Verilog编写了一个开源实现。这种方法使我们能够在可合成的、周期精确的、模块化的环境中更容易地探索GPU设计空间。提供了一个指令精确的功能模拟器,用于联合仿真和验证。总体而言,我们假设GPU可以用作通用GPU (GPGPU)或图形引擎,并在架构的构建以及可用于合成和模拟的选项和模块中考虑到这一点。为了证明Nyami作为GPU研究平台的可行性,我们利用其灵活性和模块化来探索一组架构决策的影响。其中包括对缓存大小和关联性的敏感性,桶式和开关式暂停多线程指令调度,以及栅格化的软件与硬件实现。通过这些实验,我们深入了解了普遍接受的GPU架构决策,相应地调整了架构,并给出了作为GPU研究工具的预期用途的示例。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Graph Processing Platforms at Scale: Practices and Experiences Self-monitoring overhead of the Linux perf_ event performance counter interface Analyzing communication models for distributed thread-collaborative processors in terms of energy and time A full-system approach to analyze the impact of next-generation mobile flash storage Graph-matching-based simulation-region selection for multiple binaries
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1