Memory access instrumentation is fundamental to many applications, such as software transactional memory systems, profiling tools, and race detectors. We examine the problem of efficiently instrumenting memory accesses in x86 machine code to support software transactional memory and profiling. We aim to automatically instrument all shared memory accesses in critical sections of x86 binaries, while achieving overhead close to that obtained when performing manual instrumentation at the source code level.

The two primary options in building such an instrumentation system are static and dynamic binary rewriting: the former instruments binaries at link time, before execution, while the latter instruments binaries at runtime. Static binary rewriting offers extremely low overhead but is hampered by the limits of static analysis. Dynamic binary rewriting can use runtime information but typically incurs higher overhead. This paper proposes an alternative: hybrid binary rewriting. Hybrid binary rewriting is built around a persistent instrumentation cache (PIC), an on-disk file that is associated with a binary and contains instrumented code from it. It supports two execution modes: active and passive. In active mode, a dynamic binary rewriting engine (PIN) intercepts execution and generates instrumentation into the PIC; this mode can take full advantage of runtime information. Later, passive execution can be used, where instrumented code is executed directly out of the PIC. This allows us to attain overheads similar to those incurred with static binary rewriting.

This instrumentation methodology enables a variety of static and dynamic techniques to be applied. For example, in passive mode, execution proceeds directly from the original executable, save for regions that require instrumentation. This has allowed us to build a low-overhead transactional memory profiler. We also demonstrate how the combination of static and dynamic techniques can eliminate instrumentation for accesses to thread-private locations.
{"title":"Hybrid binary rewriting for memory access instrumentation","authors":"Amitabha Roy, S. Hand, T. Harris","doi":"10.1145/1952682.1952711","DOIUrl":"https://doi.org/10.1145/1952682.1952711","url":null,"abstract":"Memory access instrumentation is fundamental to many applications such as software transactional memory systems, profiling tools and race detectors. We examine the problem of efficiently instrumenting memory accesses in x86 machine code to support software transactional memory and profiling. We aim to automatically instrument all shared memory accesses in critical sections of x86 binaries, while achieving overhead close to that obtained when performing manual instrumentation at the source code level.\u0000 The two primary options in building such an instrumentation system are static and dynamic binary rewriting: the former instruments binaries at link time before execution, while the latter binary rewriting instruments binaries at runtime. Static binary rewriting offers extremely low overhead but is hampered by the limits of static analysis. Dynamic binary rewriting is able to use runtime information but typically incurs higher overhead. This paper proposes an alternative: hybrid binary rewriting. Hybrid binary rewriting is built around the idea of a persistent instrumentation cache (PIC) that is associated with a binary and contains instrumented code from it. It supports two execution modes when using instrumentation: active and passive modes. In the active execution mode, a dynamic binary rewriting engine (PIN) is used to intercept execution, and generate instrumentation into the PIC, which is an on-disk file. This execution mode can take full advantage of runtime information. Later, passive execution can be used where instrumented code is executed out of the PIC. This allows us to attain overheads similar to those incurred with static binary rewriting.\u0000 This instrumentation methodology enables a variety of static and dynamic techniques to be applied. For example, in passive mode, execution occurs directly from the original executable save for regions that require instrumentation. This has allowed us to build a low-overhead transactional memory profiler. We also demonstrate how we can use the combination of static and dynamic techniques to eliminate instrumentation for accesses to locations that are thread-private.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"31 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131573668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The emerging open cloud computing model will give users great freedom to dynamically migrate virtualized computing services to, from, and between clouds over the wide area. While this freedom brings many potential benefits, the running services must be minimally disrupted by the migration. Unfortunately, current solutions for wide-area migration incur too much disruption, because they significantly slow down storage I/O operations during migration. The resulting increase in service latency could be very costly to a business. This paper presents a novel storage migration scheduling algorithm that greatly improves storage I/O performance during wide-area migration. Our algorithm is unique in that it considers each individual virtual machine's storage I/O workload characteristics, such as temporal locality, spatial locality, and popularity, to compute an efficient data transfer schedule. Using a fully implemented system on KVM and a trace-driven framework, we show that our algorithm provides large performance benefits across a wide range of popular virtual machine workloads.
{"title":"Workload-aware live storage migration for clouds","authors":"Jie Zheng, T. Ng, K. Sripanidkulchai","doi":"10.1145/1952682.1952700","DOIUrl":"https://doi.org/10.1145/1952682.1952700","url":null,"abstract":"The emerging open cloud computing model will provide users with great freedom to dynamically migrate virtualized computing services to, from, and between clouds over the wide-area. While this freedom leads to many potential benefits, the running services must be minimally disrupted by the migration. Unfortunately, current solutions for wide-area migration incur too much disruption as they will significantly slow down storage I/O operations during migration. The resulting increase in service latency could be very costly to a business. This paper presents a novel storage migration scheduling algorithm that can greatly improve storage I/O performance during wide-area migration. Our algorithm is unique in that it considers individual virtual machine's storage I/O workload such as temporal locality, spatial locality and popularity characteristics to compute an efficient data transfer schedule. Using a fully implemented system on KVM and a trace-driven framework, we show that our algorithm provides large performance benefits across a wide range of popular virtual machine workloads.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131670277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As the virtualization trend moves toward "client virtualization," wireless virtualization remains a technology gap that has not been addressed satisfactorily. Today's approaches were developed mainly for wired networks and are not suitable for virtualizing a wireless network interface, due to fundamental differences between wireless and wired LAN devices that we elaborate in this paper. We propose a wireless LAN virtualization approach named virtual WiFi that addresses this gap. With our solution, full wireless LAN functionality is supported inside virtual machines; each virtual machine can establish its own connection with self-supplied credentials; and multiple separate wireless LAN connections are supported through one physical wireless LAN network interface. We designed and implemented a prototype of virtual WiFi and conducted a detailed performance study. Our results show that, with conventional virtualization overhead mitigation mechanisms, our approach supports fully functional wireless LAN inside VMs and achieves close to native wireless LAN performance with moderately increased CPU utilization.
{"title":"Virtual WiFi: bring virtualization from wired to wireless","authors":"Lei Xia, Sanjay Kumar, Xue Yang, P. Gopalakrishnan, York Liu, Sebastian Schoenberg, Xingang Guo","doi":"10.1145/1952682.1952706","DOIUrl":"https://doi.org/10.1145/1952682.1952706","url":null,"abstract":"As virtualization trend is moving towards \"client virtualization\", wireless virtualization remains to be one of the technology gaps that haven't been addressed satisfactorily. Today's approaches are mainly developed for wired network, and are not suitable for virtualizing wireless network interface due to the fundamental differences between wireless and wired LAN devices that we will elaborate in this paper. We propose a wireless LAN virtualization approach named virtual WiFi that addresses the technology gap. With our proposed solution, the full wireless LAN functionalities are supported inside virtual machines; each virtual machine can establish its own connection with self-supplied credentials; and multiple separate wireless LAN connections are supported through one physical wireless LAN network interface. We designed and implemented a prototype for our proposed virtual WiFi approach, and conducted detailed performance study. Our results show that with conventional virtualization overhead mitigation mechanisms, our proposed approach can support fully functional wireless functions inside VM, and achieve close to native performance of Wireless LAN with moderately increased CPU utilization.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133408723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As virtualization becomes a key technique for supporting cloud computing, much effort has been made to reduce virtualization overhead so that a virtualized system can match native performance. One major source of overhead is memory, or page table, virtualization. Conventional virtual machines rely on a shadow mechanism to manage page tables: a shadow page table maintained by the VMM (Virtual Machine Monitor) maps virtual addresses to machine addresses, while the guest maintains its own virtual-to-physical page table. This shadow mechanism results in expensive VM exits whenever a page fault requires synchronization between the two page tables. To avoid this cost, both Intel and AMD provide hardware assists, EPT (extended page table) and NPT (nested page table), to facilitate address translation. With these hardware assists, the MMU (Memory Management Unit) walks an ordinary guest page table that translates virtual addresses to guest physical addresses; the extended page table provided by EPT then translates guest physical addresses to host physical (machine) addresses. NPT works in a similar style. With EPT or NPT, a guest page fault can be handled by the guest itself without triggering VM exits. However, the hardware assists have a disadvantage compared to the conventional shadow mechanism: the page walk yields more memory accesses and thus longer latency. Our experimental results show that neither hardware-assisted paging (HAP) nor shadow paging (SP) is a definite winner. Although in over half of the cases there is no noticeable gap between the two mechanisms, a performance gap of up to 34% exists for a few benchmarks. We propose a dynamic switching mechanism that monitors TLB misses and guest page faults on the fly, and dynamically switches between the two paging modes. Our experiments show that this new mechanism can match and sometimes even beat the better of HAP and SP.
{"title":"Selective hardware/software memory virtualization","authors":"Xiaolin Wang, J. Zang, Zhenlin Wang, Yingwei Luo, Xiaoming Li","doi":"10.1145/1952682.1952710","DOIUrl":"https://doi.org/10.1145/1952682.1952710","url":null,"abstract":"As virtualization becomes a key technique for supporting cloud computing, much effort has been made to reduce virtualization overhead, so a virtualized system can match its native performance. One major overhead is due to memory or page table virtualization. Conventional virtual machines rely on a shadow mechanism to manage page tables, where a shadow page table maintained by the VMM (Virtual Machine Monitor) maps virtual addresses to machine addresses while a guest maintains its own virtual to physical page table. This shadow mechanism will result in expensive VM exits whenever there is a page fault that requires synchronization between the two page tables. To avoid this cost, both Intel and AMD provide hardware assists, EPT (extended page table) and NPT (nested page table), to facilitate address translation. With the hardware assists, the MMU (Memory Management Unit) maintains an ordinary guest page table that translates virtual addresses to guest physical addresses. In addition, the extended page table as provided by EPT translates from guest physical addresses to host physical or machine addresses. NPT works in a similar style. With EPT or NPT, a guest page fault can be handled by the guest itself without triggering VM exits. However, the hardware assists do have their disadvantage compared to the conventional shadow mechanism -- the page walk yields more memory accesses and thus longer latency. Our experimental results show that neither hardware-assisted paging (HAP) nor shadow paging (SP) can be a definite winner. Despite the fact that in over half of the cases, there is no noticeable gap between the two mechanisms, an up to 34% performance gap exists for a few benchmarks. We propose a dynamic switching mechanism that monitors TLB misses and guest page faults on the fly, and dynam-ically switches between the two paging modes. Our experiments show that this new mechanism can match and, sometimes, even beat the better performance of HAP and SP.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123538454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Profilers based on hardware performance counters are indispensable for performance debugging of complex software systems. All modern processors feature hardware performance counters, but current virtual machine monitors (VMMs) do not properly expose them to the guest operating systems. Existing profiling tools require privileged access to the VMM to profile the guest and are only available for VMMs based on paravirtualization. Diagnosing performance problems of software running in a virtualized environment is therefore quite difficult.

This paper describes how to extend VMMs to support performance profiling. We present two types of profiling in a virtualized environment: guest-wide profiling and system-wide profiling. Guest-wide profiling shows the runtime behavior of a guest. The profiler runs in the guest and does not require privileged access to the VMM. System-wide profiling exposes the runtime behavior of both the VMM and any number of guests. It requires profilers both in the VMM and in those guests.

Not every VMM has the right architecture to support both types of profiling. We determine the requirements for each of them, and explore the possibilities for their implementation in virtual machines using hardware assistance, paravirtualization, and binary translation.

We implement both guest-wide and system-wide profiling for a VMM based on the x86 hardware virtualization extensions and system-wide profiling for a VMM based on binary translation. We demonstrate that these profilers provide good accuracy with only limited overhead.
{"title":"Performance profiling of virtual machines","authors":"Jiaqing Du, Nipun Sehrawat, W. Zwaenepoel","doi":"10.1145/1952682.1952686","DOIUrl":"https://doi.org/10.1145/1952682.1952686","url":null,"abstract":"Profilers based on hardware performance counters are indispensable for performance debugging of complex software systems. All modern processors feature hardware performance counters, but current virtual machine monitors (VMMs) do not properly expose them to the guest operating systems. Existing profiling tools require privileged access to the VMM to profile the guest and are only available for VMMs based on paravirtualization. Diagnosing performance problems of software running in a virtualized environment is therefore quite difficult.\u0000 This paper describes how to extend VMMs to support performance profiling. We present two types of profiling in a virtualized environment: guest-wide profiling and system-wide profiling. Guest-wide profiling shows the runtime behavior of a guest. The profiler runs in the guest and does not require privileged access to the VMM. System-wide profiling exposes the runtime behavior of both the VMM and any number of guests. It requires profilers both in the VMM and in those guests.\u0000 Not every VMM has the right architecture to support both types of profiling. We determine the requirements for each of them, and explore the possibilities for their implementation in virtual machines using hardware assistance, paravirtualization, and binary translation.\u0000 We implement both guest-wide and system-wide profiling for a VMM based on the x86 hardware virtualization extensions and system-wide profiling for a VMM based on binary translation. We demonstrate that these profilers provide good accuracy with only limited overhead.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127376944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Symbiotic virtualization is a new approach to system virtualization in which a guest OS targets the native hardware interface as in full system virtualization, but also optionally exposes a software interface that can be used by a VMM, if present, to increase performance and functionality. Neither the VMM nor the OS needs to support the symbiotic virtualization interface to function together, but if both do, both benefit. We describe the design and implementation of the SymCall symbiotic virtualization interface in our publicly available Palacios VMM for modern x86 machines. SymCall makes it possible for Palacios to make clean synchronous upcalls into a symbiotic guest, much like system calls. One use of symcalls is to allow synchronous collection of semantically rich guest data during exit handling in order to enable new VMM features. We describe the implementation of SwapBypass, a VMM service based on SymCall that reconsiders swap decisions made by a symbiotic Linux guest. Finally, we present a detailed performance evaluation of both SwapBypass and SymCall.
{"title":"SymCall: symbiotic virtualization through VMM-to-guest upcalls","authors":"J. Lange, P. Dinda","doi":"10.1145/1952682.1952707","DOIUrl":"https://doi.org/10.1145/1952682.1952707","url":null,"abstract":"Symbiotic virtualization is a new approach to system virtualization in which a guest OS targets the native hardware interface as in full system virtualization, but also optionally exposes a software interface that can be used by a VMM, if present, to increase performance and functionality. Neither the VMM nor the OS needs to support the symbiotic virtualization interface to function together, but if both do, both benefit. We describe the design and implementation of the SymCall symbiotic virtualization interface in our publicly available Palacios VMM for modern x86 machines. SymCall makes it possible for Palacios to make clean synchronous upcalls into a symbiotic guest, much like system calls. One use of symcalls is to allow synchronous collection of semantically rich guest data during exit handling in order to enable new VMM features. We describe the implementation of SwapBypass, a VMM service based on SymCall that reconsiders swap decisions made by a symbiotic Linux guest. Finally, we present a detailed performance evaluation of both SwapBypass and SymCall.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117227848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Since their invention over 40 years ago, virtual machines have been used to virtualize one or more von Neumann processors and their associated peripherals. System virtual machines provide the illusion that the user has their own instance of a physical machine with a given instruction set architecture (ISA). Process virtual machines provide the illusion of running on a synthetic architecture independent of the underlying ISA, generally for the purpose of supporting a high-level language.

To continue the historical trend of exponential increase in computational power in the face of limits on clock frequency scaling, we must find ways to harness the inherent parallelism of billions of transistors. I contend that multi-core chips are a fatally flawed approach; instead, maximum performance will be achieved by using heterogeneous chips and systems that combine customized and customizable computational substrates that achieve very high performance by closely matching the computational and communications structures of the application at hand.

Such chips might look like a mashup of a conventional multicore, a GPU, an FPGA, some ASICs, and a DSP. But programming them with current technologies would be nightmarishly complex, portability would be lost, and innovation between chip generations would be severely limited.

The answer (of course) is virtualization, at both the device level and the language level.

In this talk I will illustrate some challenges and potential solutions in the context of IBM's Liquid Metal project, in which we are designing a new high-level language (Lime) and compiler/runtime technology to virtualize the underlying computational devices by providing a uniform semantic model.

I will also discuss problems (and opportunities) that this raises at the operating system and data center levels, particularly with computational elements like FPGAs for which "context switching" is currently either extremely expensive or simply impossible.
{"title":"Virtualization in the age of heterogeneous machines","authors":"D. F. Bacon","doi":"10.1145/1952682.1952684","DOIUrl":"https://doi.org/10.1145/1952682.1952684","url":null,"abstract":"Since their invention over 40 years ago, virtual machines have been used to virtualize one or more von Neumann processors and their associated peripherals. System virtual machines provide the illusion that the user has their own instance of a physical machine with a given instruction set architecture (ISA). Process virtual machines provide the illusion of running on a synthetic architecture independent of the underlying ISA, generally for the purpose of supporting a high-level language.\u0000 To continue the historical trend of exponential increase in computational power in the face of limits on clock frequency scaling, we must find ways to harness the inherent parallelism of billions of transistors. I contend that multi-core chips are a fatally flawed approach - instead, maximum performance will be achieved by using heterogeneous chips and systems that combine customized and customizable computational substrates that achieve very high performance by closely matching the computational and communications structures of the application at hand.\u0000 Such chips might look like a mashup of a conventional multicore, a GPU, an FPGA, some ASICs, and a DSP. But programming them with current technologies would be nightmarishly complex, portability would be lost, and innovation between chip generations would be severely limited.\u0000 The answer (of course) is virtualization, and at both the device level and the language level.\u0000 In this talk I will illustrate some challenges and potential solutions in the context of IBM's Liquid Metal project, in which we are designing a new high-level language (Lime) and compiler/runtime technology to virtualize the underlying computational devices by providing a uniform semantic model.\u0000 I will also discuss problems (and opportunities) that this raises at the operating system and data center levels, particularly with computational elements like FPGAs for which \"context switching\" is currently either extremely expensive or simply impossible.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131789227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Virtualization has the potential to dramatically increase the usability and reliability of high performance computing (HPC) systems. However, this potential will remain unrealized unless overheads can be minimized. This is particularly challenging on large scale machines that run carefully crafted HPC OSes supporting tightly-coupled, parallel applications. In this paper, we show how careful use of hardware and VMM features enables the virtualization of a large-scale HPC system, specifically a Cray XT4 machine, with ≤5% overhead on key HPC applications, microbenchmarks, and guests at scales of up to 4096 nodes. We describe three techniques essential for achieving such low overhead: passthrough I/O, workload-sensitive selection of paging mechanisms, and carefully controlled preemption. These techniques are forms of symbiotic virtualization, an approach on which we elaborate.
{"title":"Minimal-overhead virtualization of a large scale supercomputer","authors":"J. Lange, K. Pedretti, P. Dinda, P. Bridges, C. Bae, Philip Soltero, A. Merritt","doi":"10.1145/1952682.1952705","DOIUrl":"https://doi.org/10.1145/1952682.1952705","url":null,"abstract":"Virtualization has the potential to dramatically increase the usability and reliability of high performance computing (HPC) systems. However, this potential will remain unrealized unless overheads can be minimized. This is particularly challenging on large scale machines that run carefully crafted HPC OSes supporting tightly-coupled, parallel applications. In this paper, we show how careful use of hardware and VMM features enables the virtualization of a large-scale HPC system, specifically a Cray XT4 machine, with < = 5% overhead on key HPC applications, microbenchmarks, and guests at scales of up to 4096 nodes. We describe three techniques essential for achieving such low overhead: passthrough I/O, workload-sensitive selection of paging mechanisms, and carefully controlled preemption. These techniques are forms of symbiotic virtualization, an approach on which we elaborate.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130772443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Checkpointing, i.e., recording the volatile state of a virtual machine (VM) running as a guest in a virtual machine monitor (VMM) for later restoration, includes storing the memory available to the VM. Typically, a full image of the VM's memory along with processor and device states are recorded. With guest memory sizes of up to several gigabytes, the size of the checkpoint images becomes more and more of a concern.

In this work we present a technique for fast and space-efficient checkpointing of virtual machines. In contrast to existing methods, our technique eliminates redundant data and stores only a subset of the VM's memory pages. Our technique transparently tracks I/O operations of the guest to external storage and maintains a list of memory pages whose contents are duplicated on non-volatile storage. At a checkpoint, these pages are excluded from the checkpoint image.

We have implemented the proposed technique for paravirtualized as well as fully-virtualized guests in the Xen VMM. Our experiments with a paravirtualized guest (Linux) and two fully-virtualized guests (Linux, Windows) show a significant reduction in the size of the checkpoint image as well as the time required to complete the checkpoint. Compared to the current Xen implementation, we achieve, on average, an 81% reduction in the stored data and a 74% reduction in the time required to take a checkpoint for the paravirtualized Linux guest. In a fully-virtualized environment running Windows and Linux guests, we achieve a 64% reduction of the image size along with a 62% reduction in checkpointing time.
{"title":"Fast and space-efficient virtual machine checkpointing","authors":"Eunbyung Park, Bernhard Egger, Jaejin Lee","doi":"10.1145/1952682.1952694","DOIUrl":"https://doi.org/10.1145/1952682.1952694","url":null,"abstract":"Checkpointing, i.e., recording the volatile state of a virtual machine (VM) running as a guest in a virtual machine monitor (VMM) for later restoration, includes storing the memory available to the VM. Typically, a full image of the VM's memory along with processor and device states are recorded. With guest memory sizes of up to several gigabytes, the size of the checkpoint images becomes more and more of a concern.\u0000 In this work we present a technique for fast and space-efficient checkpointing of virtual machines. In contrast to existing methods, our technique eliminates redundant data and stores only a subset of the VM's memory pages. Our technique transparently tracks I/O operations of the guest to external storage and maintains a list of memory pages whose contents are duplicated on non-volatile storage. At a checkpoint, these pages are excluded from the checkpoint image.\u0000 We have implemented the proposed technique for paravirtualized as well as fully-virtualized guests in the Xen VMM. Our experiments with a paravirtualized guest (Linux) and two fullyvirtualized guests (Linux, Windows) show a significant reduction in the size of the checkpoint image as well as the time required to complete the checkpoint. Compared to the current Xen implementation, we achieve, on average, an 81% reduction in the stored data and a 74% reduction in the time required to take a checkpoint for the paravirtualized Linux guest. In a fully-virtualized environment runningWindows and Linux guests, we achieve a 64% reduction of the image size along with a 62% reduction in checkpointing time.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115573709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A basic requirement of a secure computer system is that it be up to date with regard to software security patches. Unfortunately, Infrastructure as a Service (IaaS) clouds make this difficult. They leverage virtualization, which provides functionality that causes traditional security patch update systems to fail. In addition, the diversity of operating systems and the distributed nature of administration in the cloud compound the problem of identifying unpatched machines.

In this work, we propose P2, a hypervisor-based patch audit solution. P2 audits VMs and detects the execution of unpatched binary and non-binary files in an accurate, continuous, and OS-agnostic manner. Two key innovations make P2 possible. First, P2 uses efficient information flow tracking to identify the use of unpatched non-binary files in a vulnerable way. We performed a patch survey and discovered that 64% of files modified by security updates do not contain binary code, making the audit of non-binary files crucial. Second, P2 implements a novel algorithm that identifies binaries in mid-execution, to allow handling of VMs resumed from a checkpoint or migrated into the cloud. We have implemented a prototype of P2, and our experiments show that it accurately reports the execution of unpatched code while imposing a performance overhead of 4%.
{"title":"Patch auditing in infrastructure as a service clouds","authors":"Lionel Litty, D. Lie","doi":"10.1145/1952682.1952702","DOIUrl":"https://doi.org/10.1145/1952682.1952702","url":null,"abstract":"A basic requirement of a secure computer system is that it be up to date with regard to software security patches. Unfortunately, Infrastructure as a Service (IaaS) clouds make this difficult. They leverage virtualization, which provides functionality that causes traditional security patch update systems to fail. In addition, the diversity of operating systems and the distributed nature of administration in the cloud compound the problem of identifying unpatched machines.\u0000 In this work, we propose P2, a hypervisor-based patch audit solution. P2 audits VMs and detects the execution of unpatched binary and non-binary files in an accurate, continuous and OSagnostic manner. Two key innovations make P2 possible. First, P2 uses efficient information flow tracking to identify the use of unpatched non-binary files in a vulnerable way.We performed a patch survey and discover that 64% of files modified by security updates do not contain binary code, making the audit of non-binary files crucial. Second, P2 implements a novel algorithm that identifies binaries in mid-execution to allow handling of VMs resumed from a checkpoint or migrated into the cloud. We have implemented a prototype of P2 and and our experiments show that it accurately reports the execution of unpatched code while imposing performance overhead of 4%.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129833488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}