Exploring AMD GPU scheduling details by experimenting with “worst practices”

IF 1.4 · Zone 4 (Computer Science) · Q3 (COMPUTER SCIENCE, THEORY & METHODS) · Real-Time Systems · Pub Date: 2022-03-23 · DOI: 10.1007/s11241-022-09381-y
Nathan Otterness, James H. Anderson

Abstract

Graphics processing units (GPUs) have been the target of a significant body of recent real-time research, but research is often hampered by the “black box” nature of GPU hardware and software. Now that one GPU manufacturer, AMD, has embraced an open-source software stack, one may expect an increased amount of real-time research to use AMD GPUs. Reality, however, is more complicated. Without understanding where internal details may differ, researchers have no basis for assuming that observations made using NVIDIA GPUs will continue to hold for AMD GPUs. Additionally, the openness of AMD’s software does not mean that their scheduling behavior is obvious, especially due to sparse, scattered documentation. In this paper, we gather the disparate pieces of documentation into a single coherent source that provides an end-to-end description of how compute work is scheduled on AMD GPUs. In doing so, we start with a concrete demonstration of how incorrect management triggers extreme worst-case behavior in shared AMD GPUs. Subsequently, we explain the internal scheduling rules for AMD GPUs, how they led to the “worst practices,” and how to correctly manage some of the most performance-critical factors in AMD GPU sharing.
