Shrinking the hypervisor one subsystem at a time: a userspace packet switch for virtual machines
Julian Stecklina

Efficient and secure networking between virtual machines is crucial at a time when a large share of the services on the Internet and in private datacenters run in virtual machines. To achieve this efficiency, virtualization solutions such as Qemu/KVM move toward a monolithic system architecture in which all performance-critical functionality is implemented directly in the hypervisor in privileged mode. This creates an attack surface in the hypervisor that compromised VMs can use to take over the virtual machine host and all VMs running on it.

We show that it is possible to implement an efficient network switch for virtual machines as an unprivileged userspace component running in the host system, including the driver for the upstream network adapter. Our network switch relies on functionality already present in the KVM hypervisor and requires no changes to Linux (the host operating system) or to the guest.

Our userspace implementation compares favorably to the existing in-kernel implementation with respect to throughput and latency. We reduced per-packet overhead by using a run-to-completion model and are able to outperform the unmodified system for VM-to-VM traffic by a large margin when packet rates are high.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576202
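As an illustration of the run-to-completion model the abstract credits for the reduced per-packet overhead, here is a minimal sketch in C. All structures here (the rings, the MAC table) are invented for illustration and stand in for the paper's actual switch, which also contains the userspace NIC driver:

```c
/* Toy run-to-completion switching loop; all names are illustrative. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NPORTS  4
#define RING_SZ 256   /* power of two, so masking implements wraparound */

struct pkt  { uint8_t dst_mac[6]; uint16_t len; uint8_t data[1500]; };
struct ring { struct pkt slots[RING_SZ]; unsigned head, tail; };
struct port { struct ring rx, tx; uint8_t mac[6]; };

static struct port ports[NPORTS];

static int ring_empty(struct ring *r) { return r->head == r->tail; }

static void ring_push(struct ring *r, const struct pkt *p)
{
    if (r->tail - r->head < RING_SZ)          /* drop if the ring is full */
        r->slots[r->tail++ & (RING_SZ - 1)] = *p;
}

/* Destination lookup: which port owns this MAC? (-1 = unknown) */
static int lookup_port(const uint8_t *mac)
{
    for (int i = 0; i < NPORTS; i++)
        if (memcmp(ports[i].mac, mac, 6) == 0)
            return i;
    return -1;
}

/* One polling pass. Run-to-completion: each packet is carried from its
 * RX ring all the way to the destination TX ring before the next packet
 * is touched -- no intermediate queues, no per-packet context switch. */
static void switch_poll(void)
{
    for (int i = 0; i < NPORTS; i++) {
        while (!ring_empty(&ports[i].rx)) {
            struct pkt *p =
                &ports[i].rx.slots[ports[i].rx.head++ & (RING_SZ - 1)];
            int out = lookup_port(p->dst_mac);
            if (out >= 0 && out != i)
                ring_push(&ports[out].tx, p);
        }
    }
}

int main(void)
{
    /* Give port 1 a MAC and inject a frame for it on port 0. */
    memcpy(ports[1].mac, "\x02\x00\x00\x00\x00\x01", 6);
    struct pkt p = { .len = 64 };
    memcpy(p.dst_mac, ports[1].mac, 6);
    ring_push(&ports[0].rx, &p);

    switch_poll();
    printf("port 1 tx queue depth: %u\n", ports[1].tx.tail - ports[1].tx.head);
    return 0;
}
```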
Efficient memory virtualization for Cross-ISA system mode emulation
Chao-Rui Chang, Jan-Jan Wu, W. Hsu, Pangfeng Liu, P. Yew

Cross-ISA system-mode emulation has many important applications. For example, it helps computer architects and OS developers trace and debug kernel execution flow efficiently by emulating a slower platform (such as ARM) on a more powerful platform (such as an x86 machine). Cross-ISA system-mode emulation also enables workload consolidation in data centers with platforms of different instruction-set architectures (ISAs). However, system-mode emulation incurs substantial slowdowns. One major overhead in system-mode emulation is the multi-level memory address translation that maps guest virtual addresses to host physical addresses. Shadow page tables (SPT) have been used to reduce such overheads, but primarily for same-ISA virtualization. In this paper we propose a novel approach called embedded shadow page tables (ESPT). ESPT embeds a shadow page table into the address space of a cross-ISA dynamic binary translator (DBT) and uses the hardware memory management unit in the CPU to translate memory addresses, instead of the software translation used in current DBT emulators such as QEMU. We also use the larger address space of modern 64-bit CPUs to accommodate our DBT emulator so that it does not interfere with the guest operating system. We incorporate our new scheme into QEMU, a popular, retargetable cross-ISA system emulator. SPEC CINT2006 benchmark results indicate that our technique achieves an average speedup of 1.51 times in system mode when emulating ARM on x86, and a 1.59 times speedup when emulating IA32 on x86_64.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576201
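To see what ESPT removes, the following toy C sketch shows the per-access software translation that a conventional DBT emulator performs on every guest load and store (a QEMU-style software TLB). The structures and the slow path are simplified inventions, not QEMU's actual code:

```c
/* Toy "softmmu": the per-access software lookup that ESPT replaces
 * with a hardware MMU walk of an embedded shadow page table. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define TLB_BITS  8
#define TLB_SIZE  (1u << TLB_BITS)

struct tlb_entry {
    uint64_t tag;        /* guest virtual page number + 1 (0 = invalid) */
    uint8_t *host_page;  /* host virtual address backing that page */
};

static struct tlb_entry soft_tlb[TLB_SIZE];

/* Miss path: a real emulator walks the guest page table here; this toy
 * simply backs each guest page with a fresh host page on first touch. */
static uint8_t *softmmu_refill(uint64_t vpage)
{
    struct tlb_entry *e = &soft_tlb[vpage & (TLB_SIZE - 1)];
    e->tag = vpage + 1;
    e->host_page = calloc(1, PAGE_SIZE);
    return e->host_page;
}

/* Every emulated guest load/store pays this lookup in software; with
 * ESPT the same translation is done in hardware by the host MMU. */
static uint8_t *softmmu_translate(uint64_t gva)
{
    uint64_t vpage = gva >> PAGE_BITS;
    struct tlb_entry *e = &soft_tlb[vpage & (TLB_SIZE - 1)];
    uint8_t *page = (e->tag == vpage + 1) ? e->host_page
                                          : softmmu_refill(vpage);
    return page + (gva & (PAGE_SIZE - 1));
}

int main(void)
{
    *softmmu_translate(0x7fff1234) = 42;             /* guest store */
    printf("%d\n", *softmmu_translate(0x7fff1234));  /* guest load: 42 */
    return 0;
}
```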
Tesseract: reconciling guest I/O and hypervisor swapping in a VM
K. Arya, Y. Baskakov, Alex Garthwaite

Double-paging is an often-cited, if unsubstantiated, problem in multi-level scheduling of memory between virtual machines (VMs) and the hypervisor. This problem occurs when both a virtualized guest and the hypervisor overcommit their respective physical address spaces. When the guest pages out memory previously swapped out by the hypervisor, it initiates an expensive sequence of steps that causes the contents to be read in from the hypervisor swap file only to be written out again, significantly lengthening the time to complete the guest I/O request. As a result, performance rapidly drops.

We present Tesseract, a system that directly and transparently addresses the double-paging problem. Tesseract tracks when guest and hypervisor I/O operations are redundant and modifies these I/Os to create indirections to existing disk blocks containing the page contents. Although our focus is on reconciling I/Os between the guest disks and hypervisor swap, our technique is general and can reconcile, or deduplicate, I/Os for guest pages read or written by the VM.

Deduplication of disk blocks for file contents accessed in a common manner is well understood. One challenge our approach faces is that the locality of guest I/Os (reflecting the guest's notion of disk layout) often differs from that of the blocks in the hypervisor swap. This loss of locality through indirection results in significant performance loss on subsequent guest reads. We propose two alternatives for recovering this lost locality, each based on the idea of asynchronously reorganizing the indirected blocks in persistent storage.

We evaluate our system and show that it can significantly reduce the costs of double-paging. We focus our experiments on a synthetic benchmark designed to highlight its effects. In our experiments we observe that Tesseract can improve our benchmark's throughput by as much as 200% when using traditional disks and by as much as 30% when using SSDs. At the same time, worst-case application responsiveness can be improved by a factor of 5.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576198
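A minimal sketch of the indirection idea, with invented names and a flat table standing in for Tesseract's real metadata:

```c
/* Toy double-paging indirection: if the hypervisor already swapped a
 * page to disk, a later guest page-out of that page becomes a pointer
 * to the existing swap block instead of a swap-in plus a new write. */
#include <stdint.h>
#include <stdio.h>

#define NPAGES  1024
#define NBLOCKS 1024
#define NONE    UINT32_MAX

static uint32_t swap_block_of_page[NPAGES];  /* hypervisor swap location */
static uint32_t indirection[NBLOCKS];        /* guest block -> swap block */

static void init(void)
{
    for (int i = 0; i < NPAGES; i++)  swap_block_of_page[i] = NONE;
    for (int i = 0; i < NBLOCKS; i++) indirection[i] = NONE;
}

/* Hypervisor swapped guest page 'gpage' out to swap block 'sblk'. */
static void hv_swap_out(uint32_t gpage, uint32_t sblk)
{
    swap_block_of_page[gpage] = sblk;
}

/* Guest pages 'gpage' out to its virtual-disk block 'gblk'. */
static void guest_page_out(uint32_t gpage, uint32_t gblk)
{
    if (swap_block_of_page[gpage] != NONE) {
        /* Double-paging case: the bytes are already on disk. Record an
         * indirection; no swap-in, no duplicate write. */
        indirection[gblk] = swap_block_of_page[gpage];
        printf("block %u -> swap block %u (indirected)\n",
               gblk, indirection[gblk]);
    } else {
        printf("block %u written normally\n", gblk);  /* real I/O elided */
    }
}

int main(void)
{
    init();
    hv_swap_out(7, 300);    /* hypervisor swaps page 7 out first */
    guest_page_out(7, 42);  /* guest then pages it out: indirect */
    guest_page_out(8, 43);  /* resident page: normal write */
    return 0;
}
```

The sketch omits the paper's hard part: these indirections scatter the guest's blocks across the swap area, which is why Tesseract asynchronously reorganizes the indirected blocks to recover locality.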
The case for the three R's of systems research: repeatability, reproducibility and rigor
J. Vitek

Computer systems research spans sub-disciplines that include embedded systems, programming languages, networking, and operating systems. In this talk my contention is that a number of structural factors inhibit quality systems research. Symptoms of the problem include unrepeatable and unreproduced results, as well as results that are either devoid of meaning or that measure the wrong thing. I will illustrate the impact of these issues on our research output with examples from the development and empirical evaluation of the Schism real-time garbage collection algorithm that is shipped with the FijiVM -- a Java virtual machine for embedded and mobile devices. I will argue that our field should foster repetition of results, independent reproduction, and rigorous evaluation. I will outline some baby steps taken by several computer science conferences. In particular, I will focus on the introduction of Artifact Evaluation Committees (AECs) at ECOOP, OOPSLA, PLDI, and soon POPL. The goal of the AECs is to encourage authors to package the software artifacts they used to support the claims made in their paper and to submit these artifacts for evaluation. AECs were carefully designed to provide positive feedback to authors who take the time to create repeatable research.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576216
Virtual asymmetric multiprocessor for interactive performance of consolidated desktops
Hwanju Kim, Sangwook P. Kim, Jinkyu Jeong, Joonwon Lee

This paper presents virtual asymmetric multiprocessor, a new scheme for virtual desktop scheduling on multi-core processors that targets user-interactive performance. The proposed scheme enables virtual CPUs to be dynamically performance-asymmetric based on their hosted workloads. To enhance user experience on consolidated desktops, our scheme provides interactive workloads with fast virtual CPUs, which have more computing power than those hosting background workloads in the same virtual machine. To this end, we devise a hypervisor extension that transparently classifies background tasks from potentially interactive workloads. In addition, we introduce a guest extension that manipulates the scheduling policy of an operating system in favor of our hypervisor-level scheme so that interactive performance can be further improved. Our evaluation shows that the proposed scheme significantly improves interactive performance of application launch, Web browsing, and video playback applications when CPU-intensive workloads highly disturb the interactive workloads.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576199
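The core mechanism can be pictured as skewing per-vCPU scheduler shares. The sketch below assumes, purely for illustration, a per-vCPU weight knob and a boolean classifier; the paper's hypervisor-level classification of interactive work is considerably more careful than this flag:

```c
/* Toy virtual asymmetric multiprocessor: identical vCPUs are made
 * performance-asymmetric by skewing their scheduling weights. */
#include <stdio.h>

#define NVCPUS      4
#define FAST_WEIGHT 4   /* share for vCPUs hosting interactive tasks */
#define SLOW_WEIGHT 1   /* share for vCPUs hosting background work */

struct vcpu { int hosts_interactive; int weight; };

static void rebalance(struct vcpu *v, int n)
{
    for (int i = 0; i < n; i++)
        v[i].weight = v[i].hosts_interactive ? FAST_WEIGHT : SLOW_WEIGHT;
}

int main(void)
{
    /* vCPU 0 hosts an interactive task; the rest run background work. */
    struct vcpu vm[NVCPUS] = { { 1, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 } };
    rebalance(vm, NVCPUS);
    for (int i = 0; i < NVCPUS; i++)
        printf("vcpu%d weight %d\n", i, vm[i].weight);
    return 0;
}
```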
MuscalietJS: rethinking layered dynamic web runtimes
Behnam Robatmili, Calin Cascaval, Mehrdad Reshadi, Madhukar N. Kedlaya, Seth Fowler, Vrajesh Bhavsar, Michael Weber, B. Hardekopf

Layered JavaScript engines, in which the JavaScript runtime is built on top of another managed runtime, provide better extensibility and portability compared to traditional monolithic engines. In this paper, we revisit the design of layered JavaScript engines and propose a layered architecture, called MuscalietJS, that splits the responsibilities of a JavaScript engine between a high-level, JavaScript-specific component and a low-level, language-agnostic .NET VM. To make up for the performance loss due to layering, we propose a two-pronged approach: high-level JavaScript optimizations and exploitation of low-level VM features that produce very efficient code for hot functions. We demonstrate the validity of the MuscalietJS design through a comprehensive evaluation using both the SunSpider benchmarks and a set of web workloads. We demonstrate that our approach outperforms other layered engines such as IronJS and Rhino while providing extensibility, adaptability, and portability.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576211
Ginseng: market-driven memory allocation
Orna Agmon Ben-Yehuda, Eyal Posener, Muli Ben-Yehuda, A. Schuster, Ahuva Mu'alem

Physical memory is the scarcest resource in today's cloud computing platforms. Cloud providers would like to maximize their clients' satisfaction by renting precious physical memory to those clients who value it the most. But real-world cloud clients are selfish: they will only tell their providers the truth about how much they value memory when it is in their own best interest to do so. How can real-world cloud providers allocate memory efficiently to those (selfish) clients who value it the most?

We present Ginseng, the first market-driven cloud system that allocates memory efficiently to selfish cloud clients. Ginseng incentivizes selfish clients to bid their true value for the memory they need when they need it. Ginseng continuously collects client bids, finds an efficient memory allocation, and re-allocates physical memory to the clients that value it the most. Ginseng achieves a 6.2x--15.8x improvement (83%--100% of the optimum) in aggregate client satisfaction when compared with state-of-the-art approaches for cloud memory allocation.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576197
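The following toy C program illustrates only the outcome the abstract describes (memory goes to whoever declares the highest value for it) with a greedy pass over bids; Ginseng's actual auction is more subtle, since it must also make truthful bidding each selfish client's best strategy:

```c
/* Toy greedy allocation of a fixed memory pool to the highest bids.
 * Clients, bids, and pool size are made-up example data. */
#include <stdio.h>
#include <stdlib.h>

struct bid { const char *client; double value_per_mb; int mb_wanted; };

/* Sort bids by declared value per MB, highest first. */
static int by_value_desc(const void *a, const void *b)
{
    double d = ((const struct bid *)b)->value_per_mb -
               ((const struct bid *)a)->value_per_mb;
    return (d > 0) - (d < 0);
}

int main(void)
{
    struct bid bids[] = {
        { "vm-a", 0.9, 512 },
        { "vm-b", 1.4, 1024 },
        { "vm-c", 0.3, 2048 },
    };
    int n = sizeof bids / sizeof bids[0];
    int free_mb = 2048;   /* memory available for rent */

    qsort(bids, n, sizeof bids[0], by_value_desc);
    for (int i = 0; i < n && free_mb > 0; i++) {
        int grant = bids[i].mb_wanted < free_mb ? bids[i].mb_wanted : free_mb;
        free_mb -= grant;
        printf("%s: granted %d MB\n", bids[i].client, grant);
    }
    return 0;
}
```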
Friendly barriers: efficient work-stealing with return barriers
Vivek Kumar, S. Blackburn, D. Grove

This paper addresses the problem of efficiently supporting parallelism within a managed runtime. A popular approach for exploiting software parallelism on parallel hardware is task parallelism, where the programmer explicitly identifies potential parallelism and the runtime then schedules the work. Work-stealing is a promising scheduling strategy that a runtime may use to keep otherwise idle hardware busy while relieving overloaded hardware of its burden. However, work-stealing comes with substantial overheads. Recent work identified the sequential overheads of work-stealing, those that occur even when no stealing takes place, as a significant source of overhead, and was able to reduce them to just 15%.

In this work, we turn to dynamic overheads, those that occur each time a steal takes place. We show that the dynamic overhead is dominated by introspection of the victim's stack when a steal takes place. We exploit the idea of a low-overhead return barrier to reduce the dynamic overhead by approximately half, resulting in total performance improvements of as much as 20%. Because, unlike prior work, we attack the overheads that are directly due to stealing, and therefore the overheads that grow as parallelism grows, we improve the scalability of work-stealing applications. This result is complementary to recent work addressing the sequential overheads of work-stealing. This work therefore substantially relieves work-stealing of the increasing pressure due to increasing intra-node hardware parallelism.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576207
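The following toy simulates the return-barrier idea on an explicit frame array (no real stack is touched): a thief records how far it has scanned and installs a hook in the youngest scanned frame, so later steals inspect only newer frames. The granularity and bookkeeping here are simplified inventions, not the paper's design:

```c
/* Toy return barrier: swap a frame's saved return address for a hook
 * so the runtime learns when the victim returns past the scan frontier. */
#include <stdio.h>

typedef void (*retaddr_t)(void);
struct frame { retaddr_t ret; };

#define DEPTH 16
static struct frame stack[DEPTH];
static int top;                 /* index of the youngest frame */
static int scanned_upto = -1;   /* frames 0..scanned_upto already scanned */

static void real_return(void) { }

static void barrier_hit(void)
{
    /* The victim returned past the frontier: the cached scan is stale.
     * (A real runtime would adjust the frontier, not reset it.) */
    printf("return barrier hit\n");
    scanned_upto = -1;
}

/* Thief-side scan: only frames pushed since the last scan need to be
 * inspected, which is the dynamic overhead the paper halves. */
static void steal_scan(void)
{
    for (int i = scanned_upto + 1; i <= top; i++) {
        /* look for stealable work in stack[i] (elided) */
    }
    printf("scanned frames %d..%d\n", scanned_upto + 1, top);
    scanned_upto = top;
    stack[top].ret = barrier_hit;   /* install the return barrier */
}

int main(void)
{
    for (int i = 0; i < DEPTH; i++)
        stack[i].ret = real_return;
    top = 5;

    steal_scan();                    /* full scan: frames 0..5 */
    stack[++top].ret = real_return;  /* victim pushes a frame */
    steal_scan();                    /* incremental scan: frame 6 only */
    stack[top--].ret();              /* victim pops: barrier fires */
    return 0;
}
```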
Real-time deep virtual machine introspection and its applications
Jennia Hizver, T. Chiueh

Virtual Machine Introspection (VMI) provides the ability to monitor virtual machines (VMs) in an agentless fashion by gathering VM execution states from the hypervisor and analyzing those states to extract information about a running operating system (OS) without installing an agent inside the VM. VMI's main challenge lies in the difficulty of converting low-level byte string values into high-level semantic states of the monitored VM's OS. In this work, we tackle this challenge by developing a real-time kernel data structure monitoring (RTKDSM) system that leverages the rich OS analysis capabilities of Volatility, an open source computer forensics framework, to significantly simplify and automate analysis of VM execution states. The RTKDSM system is designed as an extensible software framework that is meant to be extended to perform application-specific VM state analysis. In addition, the RTKDSM system is able to perform real-time monitoring of any changes made to the extracted OS states of guest VMs. This real-time monitoring capability is especially important for VMI-based security applications. To minimize the performance overhead associated with real-time kernel data structure monitoring, the RTKDSM system incorporates several optimizations whose effectiveness is reported in this paper.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576196
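The semantic gap such systems bridge can be pictured with a toy: given only a raw byte snapshot of guest memory plus a known structure layout, recover a process list. The layout below is a made-up illustration, not any real kernel's; systems like RTKDSM obtain real layouts from OS profiles, as Volatility does:

```c
/* Toy VMI: reinterpret a raw memory snapshot as a linked process list
 * using an assumed (hypothetical) structure layout. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SNAP_SIZE 4096

/* Hypothetical guest "task struct" layout inside the snapshot. */
struct guest_task {
    uint32_t next;      /* snapshot offset of next task, 0 = end of list */
    char     comm[16];  /* process name */
};

static uint8_t snapshot[SNAP_SIZE];   /* stands in for guest RAM */

static void list_processes(uint32_t head_off)
{
    for (uint32_t off = head_off; off != 0; ) {
        struct guest_task t;
        memcpy(&t, snapshot + off, sizeof t);   /* reinterpret raw bytes */
        printf("process: %.16s\n", t.comm);
        off = t.next;
    }
}

int main(void)
{
    /* Fabricate a two-entry process list at offsets 0x100 and 0x200. */
    struct guest_task a = { 0x200, "init" }, b = { 0, "sshd" };
    memcpy(snapshot + 0x100, &a, sizeof a);
    memcpy(snapshot + 0x200, &b, sizeof b);
    list_processes(0x100);
    return 0;
}
```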
Preemptable ticket spinlocks: improving consolidated performance in the cloud
Jiannan Ouyang, J. Lange

When executing inside a virtual machine environment, OS-level synchronization primitives face significant challenges due to the scheduling behavior of the underlying virtual machine monitor. Operations that are assured to last only a short amount of time on real hardware can take considerably longer when running virtualized. This change in assumptions has a significant impact when an OS is executing inside a critical region that is protected by a spinlock. The interaction between OS-level spinlocks and VMM scheduling is known as the Lock Holder Preemption problem and has a significant impact on overall VM performance. However, with the use of ticket locks instead of generic spinlocks, virtual environments must also contend with waiters being preempted before they are able to acquire the lock. This has the effect of blocking access to a lock even if the lock itself is available. We identify this scenario as the Lock Waiter Preemption problem. In order to solve both problems we introduce Preemptable Ticket spinlocks, a new locking primitive that is designed to enable a VM to always make forward progress by relaxing the ordering guarantees offered by ticket locks. We show that the use of Preemptable Ticket spinlocks improves VM performance by 5.32X on average when running on a non-paravirtual VMM, and by 7.91X when running on a VMM that supports a paravirtual locking interface, when executing a set of microbenchmarks as well as a realistic e-commerce benchmark.
International Conference on Virtual Execution Environments, 2013. doi:10.1145/2451512.2451549
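For reference, here is the classic ticket spinlock the paper starts from, in C11 atomics. Strict FIFO order is exactly what hurts under preemption: if the waiter holding the next ticket is descheduled, every later waiter spins behind it. The comment marks where the preemptable variant would relax this ordering; the paper's actual algorithm is not reproduced here:

```c
/* Classic ticket spinlock in C11 atomics. */
#include <stdatomic.h>
#include <stdio.h>

struct ticket_lock {
    atomic_uint next;     /* next ticket to hand out */
    atomic_uint serving;  /* ticket currently allowed in */
};

static void tl_lock(struct ticket_lock *l)
{
    unsigned me = atomic_fetch_add(&l->next, 1);    /* take a ticket */
    while (atomic_load(&l->serving) != me) {
        /* spin -- a preemptable ticket lock would bound this wait and,
         * on timeout, let 'serving' skip ahead past ticket 'me' so the
         * lock is not held hostage by a preempted waiter */
    }
}

static void tl_unlock(struct ticket_lock *l)
{
    atomic_fetch_add(&l->serving, 1);   /* admit the next ticket */
}

int main(void)
{
    struct ticket_lock l = { 0, 0 };
    tl_lock(&l);
    puts("in critical section");
    tl_unlock(&l);
    return 0;
}
```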