DymGPU: Dynamic Memory Management for Sharing GPUs in Virtualized Clouds

Younghun Park, Minwoo Gu, Sun-Mi Yoo, Youngjae Kim, Sungyong Park
Published in: 2018 IEEE 3rd International Workshops on Foundations and Applications of Self* Systems (FAS*W)
Publication date: 2018-09-01
DOI: 10.1109/FAS-W.2018.00025 (https://doi.org/10.1109/FAS-W.2018.00025)
Citations: 3

Abstract

gVirt is a full GPU virtualization technique for Intel's integrated GPUs that alleviates the problems of other GPU virtualization techniques such as API remoting and direct pass-through. The original gVirt is known to have an inherent scalability limitation on the number of simultaneous virtual machines (VMs). gScale solved this problem by allowing each VM to share a global graphics memory space and by copying the entries of a private graphics translation table (GTT) to the physical GTT on each GPU context switch. However, gScale still suffers from a large overhead when copying entries between the private GTT and the physical GTT, and the overhead grows when the global graphics memory spaces allocated to the VMs overlap. In this paper, we identify the copy overhead caused by GPU context switches as the major performance bottleneck and propose a dynamic memory management scheme, called DymGPU, that provides two memory allocation algorithms: a size-based algorithm and a utilization-based algorithm. The size-based algorithm allocates memory space based on the memory size required by each VM, whereas the utilization-based algorithm considers the GPU utilization of each VM when allocating memory space. DymGPU is also dynamic in the sense that the global graphics memory space used by each VM is rearranged at runtime by periodically checking for idle VMs and the GPU utilization of each runnable VM. We have implemented the proposed approach in gVirt and confirmed that it reduces GPU context switch time by up to 53% and improves the overall performance of various GPU applications by up to 39%.
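To make the overlap argument concrete, the following is a minimal toy model, not the DymGPU implementation. It assumes that each VM occupies one contiguous range of entries in the shared global graphics memory space and that the copy work on a context switch is proportional to the number of a VM's entries that another VM's range also covers. The names `overlap_entries` and `total_copy_cost`, the VM labels, and the range sizes are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Half-open range [start, end) of GTT entries assigned to a VM (assumed model).
Range = Tuple[int, int]


def overlap_entries(layout: Dict[str, Range], vm: str) -> int:
    """Count entries in vm's range that at least one other VM's range also covers."""
    start, end = layout[vm]
    shared = set()
    for other, (o_start, o_end) in layout.items():
        if other == vm:
            continue
        lo, hi = max(start, o_start), min(end, o_end)
        if lo < hi:
            shared.update(range(lo, hi))
    return len(shared)


def total_copy_cost(layout: Dict[str, Range]) -> int:
    """Sum the shared entries over all VMs as a rough proxy for copy work."""
    return sum(overlap_entries(layout, vm) for vm in layout)


# Three hypothetical VMs placed at the same base address: heavy overlap.
stacked = {"vm1": (0, 400), "vm2": (0, 300), "vm3": (0, 300)}
# The same VMs placed side by side: no overlap.
spread = {"vm1": (0, 400), "vm2": (400, 700), "vm3": (700, 1000)}

print(total_copy_cost(stacked))  # 900
print(total_copy_cost(spread))   # 0
```

Under this assumed cost model, a placement that keeps the ranges disjoint removes the copy work entirely, which is the intuition behind rearranging each VM's share of the global graphics memory space at runtime. How DymGPU weighs required memory size or GPU utilization when choosing a placement is not specified in the abstract and is not modeled here.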