Performance characterization and cache-aware core scheduling in a virtualized multi-core server under 10GbE

2009 IEEE International Symposium on Workload Characterization (IISWC). Pub Date: 2009-10-04. DOI: 10.1109/IISWC.2009.5306784

Danhua Guo, Guangdeng Liao, L. Bhuyan


Citations: 19

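Abstract

Virtual Machine (VM) technology is experiencing a resurgence of interest now that ubiquitous multi-core processors have become the de facto configuration of modern web servers. Multi-core servers potentially provide sufficient physical resources to realize the benefits of VMs, including performance isolation, manageability and scalability. However, the network performance of virtualized multi-core servers falls short of expectations, so it is important to understand where the overhead comes from. In this paper, we evaluate the network performance of a virtualized multi-core server using a TCP streaming microbenchmark (Iperf) and SPECweb2005. We first motivate our research by presenting the performance gap between native and virtualized environments. We then break down the overhead from an architectural viewpoint and show that the cache topology greatly influences performance. We also profile the Virtual Machine Monitor (VMM) at the function level to illustrate that functions in the current version of the Xen scheduler are the major contributors to the poor utilization of the cache topology. Consequently, we implement a static onloading scheme that separates interrupt handling from application processes and executes them on cores with cache affinity. Based on the observed benefits, we modify the Xen scheduler to migrate virtual CPUs dynamically to exploit the cache topology. Our results show that VM performance improves by an average of 12% for Iperf and 15% for SPECweb2005.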
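The idea of placing interrupt handling and application processing on cores with cache affinity can be illustrated with a minimal sketch. This is not the authors' implementation and not Xen code: the core-to-cache map, the function names and the fallback policy are all assumptions made up for illustration. Given the core that handles NIC interrupts, the sketch prefers an idle core that shares a cache with it.

```python
from typing import Dict, List, Optional, Set

# Hypothetical topology (assumed for illustration, not taken from the paper):
# physical core id -> id of the cache that core shares with its neighbour.
CACHE_OF_CORE: Dict[int, int] = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}


def cores_sharing_cache(core: int, topology: Dict[int, int]) -> List[int]:
    """Return the other cores that share a cache with `core`, in ascending order."""
    cache = topology[core]
    return sorted(c for c, g in topology.items() if g == cache and c != core)


def place_vcpu(irq_core: int, busy: Set[int], topology: Dict[int, int]) -> Optional[int]:
    """Pick a core for the application vCPU (hypothetical policy).

    Prefer an idle core that shares a cache with the interrupt-handling core.
    Otherwise fall back to the lowest-numbered idle core other than irq_core.
    Return None if no core is idle.
    """
    for candidate in cores_sharing_cache(irq_core, topology):
        if candidate not in busy:
            return candidate
    idle = sorted(c for c in topology if c not in busy and c != irq_core)
    return idle[0] if idle else None
```

With interrupts handled on core 2 and no other core busy, `place_vcpu(2, set(), CACHE_OF_CORE)` selects core 3, the cache sibling of core 2 in this assumed layout. A static scheme would fix that choice once, while a dynamic scheduler would re-evaluate it as the set of busy cores changes.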


