Shrinking the hypervisor one subsystem at a time: a userspace packet switch for virtual machines
Julian Stecklina

Efficient and secure networking between virtual machines is crucial at a time when a large share of the services on the Internet and in private datacenters run in virtual machines. To achieve this efficiency, virtualization solutions such as Qemu/KVM move toward a monolithic system architecture in which all performance-critical functionality is implemented directly in the hypervisor in privileged mode. This creates an attack surface in the hypervisor that compromised VMs can use to take over the virtual machine host and all VMs running on it.

We show that it is possible to implement an efficient network switch for virtual machines as an unprivileged userspace component running in the host system, including the driver for the upstream network adapter. Our network switch relies on functionality already present in the KVM hypervisor and requires no changes to Linux (the host operating system) or to the guest.

Our userspace implementation compares favorably to the existing in-kernel implementation with respect to throughput and latency. We reduced per-packet overhead by using a run-to-completion model and are able to outperform the unmodified system for VM-to-VM traffic by a large margin when packet rates are high.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576202
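As an illustration of the run-to-completion model the abstract credits for the reduced per-packet overhead, here is a minimal sketch in C. All structures here (the rings, the MAC table) are invented for illustration and stand in for the paper's actual switch, which also contains the userspace NIC driver:

```c
/* Toy run-to-completion switching loop; all names are illustrative. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NPORTS  4
#define RING_SZ 256   /* power of two, so masking implements wraparound */

struct pkt  { uint8_t dst_mac[6]; uint16_t len; uint8_t data[1500]; };
struct ring { struct pkt slots[RING_SZ]; unsigned head, tail; };
struct port { struct ring rx, tx; uint8_t mac[6]; };

static struct port ports[NPORTS];

static int ring_empty(struct ring *r) { return r->head == r->tail; }

static void ring_push(struct ring *r, const struct pkt *p)
{
    if (r->tail - r->head < RING_SZ)          /* drop if the ring is full */
        r->slots[r->tail++ & (RING_SZ - 1)] = *p;
}

/* Destination lookup: which port owns this MAC? (-1 = unknown) */
static int lookup_port(const uint8_t *mac)
{
    for (int i = 0; i < NPORTS; i++)
        if (memcmp(ports[i].mac, mac, 6) == 0)
            return i;
    return -1;
}

/* One polling pass. Run-to-completion: each packet is carried from its
 * RX ring all the way to the destination TX ring before the next packet
 * is touched -- no intermediate queues, no per-packet context switch. */
static void switch_poll(void)
{
    for (int i = 0; i < NPORTS; i++) {
        while (!ring_empty(&ports[i].rx)) {
            struct pkt *p =
                &ports[i].rx.slots[ports[i].rx.head++ & (RING_SZ - 1)];
            int out = lookup_port(p->dst_mac);
            if (out >= 0 && out != i)
                ring_push(&ports[out].tx, p);
        }
    }
}

int main(void)
{
    /* Give port 1 a MAC and inject a frame for it on port 0. */
    memcpy(ports[1].mac, "\x02\x00\x00\x00\x00\x01", 6);
    struct pkt p = { .len = 64 };
    memcpy(p.dst_mac, ports[1].mac, 6);
    ring_push(&ports[0].rx, &p);

    switch_poll();
    printf("port 1 tx queue depth: %u\n", ports[1].tx.tail - ports[1].tx.head);
    return 0;
}
```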
Efficient memory virtualization for Cross-ISA system mode emulation
Chao-Rui Chang, Jan-Jan Wu, W. Hsu, Pangfeng Liu, P. Yew

Cross-ISA system-mode emulation has many important applications. For example, it helps computer architects and OS developers trace and debug kernel execution flow efficiently by emulating a slower platform (such as ARM) on a more powerful platform (such as an x86 machine). Cross-ISA system-mode emulation also enables workload consolidation in data centers with platforms of different instruction-set architectures (ISAs). However, system-mode emulation incurs substantial slowdowns. One major overhead in system-mode emulation is the multi-level memory address translation that maps guest virtual addresses to host physical addresses. Shadow page tables (SPT) have been used to reduce such overheads, but primarily for same-ISA virtualization. In this paper we propose a novel approach called embedded shadow page tables (ESPT). ESPT embeds a shadow page table into the address space of a cross-ISA dynamic binary translator (DBT) and uses the hardware memory management unit in the CPU to translate memory addresses, instead of the software translation used in current DBT emulators such as QEMU. We also use the larger address space of modern 64-bit CPUs to accommodate our DBT emulator so that it does not interfere with the guest operating system. We incorporate our new scheme into QEMU, a popular, retargetable cross-ISA system emulator. SPEC CINT2006 benchmark results indicate that our technique achieves an average speedup of 1.51 times in system mode when emulating ARM on x86, and a 1.59 times speedup when emulating IA32 on x86_64.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576201
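To see what ESPT removes, the following toy C sketch shows the per-access software translation that a conventional DBT emulator performs on every guest load and store (a QEMU-style software TLB). The structures and the slow path are simplified inventions, not QEMU's actual code:

```c
/* Toy "softmmu": the per-access software lookup that ESPT replaces
 * with a hardware MMU walk of an embedded shadow page table. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define TLB_BITS  8
#define TLB_SIZE  (1u << TLB_BITS)

struct tlb_entry {
    uint64_t tag;        /* guest virtual page number + 1 (0 = invalid) */
    uint8_t *host_page;  /* host virtual address backing that page */
};

static struct tlb_entry soft_tlb[TLB_SIZE];

/* Miss path: a real emulator walks the guest page table here; this toy
 * simply backs each guest page with a fresh host page on first touch. */
static uint8_t *softmmu_refill(uint64_t vpage)
{
    struct tlb_entry *e = &soft_tlb[vpage & (TLB_SIZE - 1)];
    e->tag = vpage + 1;
    e->host_page = calloc(1, PAGE_SIZE);
    return e->host_page;
}

/* Every emulated guest load/store pays this lookup in software; with
 * ESPT the same translation is done in hardware by the host MMU. */
static uint8_t *softmmu_translate(uint64_t gva)
{
    uint64_t vpage = gva >> PAGE_BITS;
    struct tlb_entry *e = &soft_tlb[vpage & (TLB_SIZE - 1)];
    uint8_t *page = (e->tag == vpage + 1) ? e->host_page
                                          : softmmu_refill(vpage);
    return page + (gva & (PAGE_SIZE - 1));
}

int main(void)
{
    *softmmu_translate(0x7fff1234) = 42;             /* guest store */
    printf("%d\n", *softmmu_translate(0x7fff1234));  /* guest load: 42 */
    return 0;
}
```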
Tesseract: reconciling guest I/O and hypervisor swapping in a VM
K. Arya, Y. Baskakov, Alex Garthwaite

Double-paging is an often-cited, if unsubstantiated, problem in multi-level scheduling of memory between virtual machines (VMs) and the hypervisor. This problem occurs when both a virtualized guest and the hypervisor overcommit their respective physical address spaces. When the guest pages out memory previously swapped out by the hypervisor, it initiates an expensive sequence of steps that causes the contents to be read in from the hypervisor swap file only to be written out again, significantly lengthening the time to complete the guest I/O request. As a result, performance rapidly drops.

We present Tesseract, a system that directly and transparently addresses the double-paging problem. Tesseract tracks when guest and hypervisor I/O operations are redundant and modifies these I/Os to create indirections to existing disk blocks containing the page contents. Although our focus is on reconciling I/Os between the guest disks and hypervisor swap, our technique is general and can reconcile, or deduplicate, I/Os for guest pages read or written by the VM.

Deduplication of disk blocks for file contents accessed in a common manner is well understood. One challenge our approach faces is that the locality of guest I/Os (reflecting the guest's notion of disk layout) often differs from that of the blocks in the hypervisor swap. This loss of locality through indirection results in significant performance loss on subsequent guest reads. We propose two alternatives for recovering this lost locality, each based on the idea of asynchronously reorganizing the indirected blocks in persistent storage.

We evaluate our system and show that it can significantly reduce the costs of double-paging. We focus our experiments on a synthetic benchmark designed to highlight its effects. In our experiments we observe that Tesseract can improve our benchmark's throughput by as much as 200% when using traditional disks and by as much as 30% when using SSDs. At the same time, worst-case application responsiveness can be improved by a factor of 5.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576198
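A minimal sketch of the indirection idea, with invented names and a flat table standing in for Tesseract's real metadata:

```c
/* Toy double-paging indirection: if the hypervisor already swapped a
 * page to disk, a later guest page-out of that page becomes a pointer
 * to the existing swap block instead of a swap-in plus a new write. */
#include <stdint.h>
#include <stdio.h>

#define NPAGES  1024
#define NBLOCKS 1024
#define NONE    UINT32_MAX

static uint32_t swap_block_of_page[NPAGES];  /* hypervisor swap location */
static uint32_t indirection[NBLOCKS];        /* guest block -> swap block */

static void init(void)
{
    for (int i = 0; i < NPAGES; i++)  swap_block_of_page[i] = NONE;
    for (int i = 0; i < NBLOCKS; i++) indirection[i] = NONE;
}

/* Hypervisor swapped guest page 'gpage' out to swap block 'sblk'. */
static void hv_swap_out(uint32_t gpage, uint32_t sblk)
{
    swap_block_of_page[gpage] = sblk;
}

/* Guest pages 'gpage' out to its virtual-disk block 'gblk'. */
static void guest_page_out(uint32_t gpage, uint32_t gblk)
{
    if (swap_block_of_page[gpage] != NONE) {
        /* Double-paging case: the bytes are already on disk. Record an
         * indirection; no swap-in, no duplicate write. */
        indirection[gblk] = swap_block_of_page[gpage];
        printf("block %u -> swap block %u (indirected)\n",
               gblk, indirection[gblk]);
    } else {
        printf("block %u written normally\n", gblk);  /* real I/O elided */
    }
}

int main(void)
{
    init();
    hv_swap_out(7, 300);    /* hypervisor swaps page 7 out first */
    guest_page_out(7, 42);  /* guest then pages it out: indirect */
    guest_page_out(8, 43);  /* resident page: normal write */
    return 0;
}
```

The sketch omits the paper's hard part: these indirections scatter the guest's blocks across the swap area, which is why Tesseract asynchronously reorganizes the indirected blocks to recover locality.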
The case for the three R's of systems research: repeatability, reproducibility and rigor
J. Vitek

Computer systems research spans sub-disciplines that include embedded systems, programming languages, networking, and operating systems. In this talk my contention is that a number of structural factors inhibit quality systems research. Symptoms of the problem include unrepeatable and unreproduced results, as well as results that are either devoid of meaning or that measure the wrong thing. I will illustrate the impact of these issues on our research output with examples from the development and empirical evaluation of the Schism real-time garbage collection algorithm that is shipped with the FijiVM -- a Java virtual machine for embedded and mobile devices. I will argue that our field should foster repetition of results, independent reproduction, and rigorous evaluation. I will outline some baby steps taken by several computer science conferences. In particular, I will focus on the introduction of Artifact Evaluation Committees (AECs) at ECOOP, OOPSLA, PLDI, and soon POPL. The goal of the AECs is to encourage authors to package the software artifacts they used to support the claims made in their paper and to submit these artifacts for evaluation. AECs were carefully designed to provide positive feedback to authors who take the time to create repeatable research.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576216
Virtual asymmetric multiprocessor for interactive performance of consolidated desktops
Hwanju Kim, Sangwook P. Kim, Jinkyu Jeong, Joonwon Lee

This paper presents virtual asymmetric multiprocessor, a new scheme for virtual desktop scheduling on multi-core processors that targets user-interactive performance. The proposed scheme enables virtual CPUs to be dynamically performance-asymmetric based on their hosted workloads. To enhance user experience on consolidated desktops, our scheme provides interactive workloads with fast virtual CPUs, which have more computing power than those hosting background workloads in the same virtual machine. To this end, we devise a hypervisor extension that transparently classifies background tasks from potentially interactive workloads. In addition, we introduce a guest extension that manipulates the scheduling policy of an operating system in favor of our hypervisor-level scheme so that interactive performance can be further improved. Our evaluation shows that the proposed scheme significantly improves interactive performance of application launch, Web browsing, and video playback applications when CPU-intensive workloads highly disturb the interactive workloads.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576199
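The core mechanism can be pictured as skewing per-vCPU scheduler shares. The sketch below assumes, purely for illustration, a per-vCPU weight knob and a boolean classifier; the paper's hypervisor-level classification of interactive work is considerably more careful than this flag:

```c
/* Toy virtual asymmetric multiprocessor: identical vCPUs are made
 * performance-asymmetric by skewing their scheduling weights. */
#include <stdio.h>

#define NVCPUS      4
#define FAST_WEIGHT 4   /* share for vCPUs hosting interactive tasks */
#define SLOW_WEIGHT 1   /* share for vCPUs hosting background work */

struct vcpu { int hosts_interactive; int weight; };

static void rebalance(struct vcpu *v, int n)
{
    for (int i = 0; i < n; i++)
        v[i].weight = v[i].hosts_interactive ? FAST_WEIGHT : SLOW_WEIGHT;
}

int main(void)
{
    /* vCPU 0 hosts an interactive task; the rest run background work. */
    struct vcpu vm[NVCPUS] = { { 1, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 } };
    rebalance(vm, NVCPUS);
    for (int i = 0; i < NVCPUS; i++)
        printf("vcpu%d weight %d\n", i, vm[i].weight);
    return 0;
}
```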
MuscalietJS: rethinking layered dynamic web runtimes
Behnam Robatmili, Calin Cascaval, Mehrdad Reshadi, Madhukar N. Kedlaya, Seth Fowler, Vrajesh Bhavsar, Michael Weber, B. Hardekopf

Layered JavaScript engines, in which the JavaScript runtime is built on top of another managed runtime, provide better extensibility and portability compared to traditional monolithic engines. In this paper, we revisit the design of layered JavaScript engines and propose a layered architecture, called MuscalietJS, that splits the responsibilities of a JavaScript engine between a high-level, JavaScript-specific component and a low-level, language-agnostic .NET VM. To make up for the performance loss due to layering, we propose a two-pronged approach: high-level JavaScript optimizations and exploitation of low-level VM features that produce very efficient code for hot functions. We demonstrate the validity of the MuscalietJS design through a comprehensive evaluation using both the SunSpider benchmarks and a set of web workloads. We demonstrate that our approach outperforms other layered engines such as IronJS and Rhino while providing extensibility, adaptability, and portability.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576211
Ginseng: market-driven memory allocation
Orna Agmon Ben-Yehuda, Eyal Posener, Muli Ben-Yehuda, A. Schuster, Ahuva Mu'alem

Physical memory is the scarcest resource in today's cloud computing platforms. Cloud providers would like to maximize their clients' satisfaction by renting precious physical memory to those clients who value it the most. But real-world cloud clients are selfish: they will only tell their providers the truth about how much they value memory when it is in their own best interest to do so. How can real-world cloud providers allocate memory efficiently to those (selfish) clients who value it the most?

We present Ginseng, the first market-driven cloud system that allocates memory efficiently to selfish cloud clients. Ginseng incentivizes selfish clients to bid their true value for the memory they need when they need it. Ginseng continuously collects client bids, finds an efficient memory allocation, and re-allocates physical memory to the clients that value it the most. Ginseng achieves a 6.2x--15.8x improvement (83%--100% of the optimum) in aggregate client satisfaction when compared with state-of-the-art approaches for cloud memory allocation.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576197
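The following toy C program illustrates only the outcome the abstract describes (memory goes to whoever declares the highest value for it) with a greedy pass over bids; Ginseng's actual auction is more subtle, since it must also make truthful bidding each selfish client's best strategy:

```c
/* Toy greedy allocation of a fixed memory pool to the highest bids.
 * Clients, bids, and pool size are made-up example data. */
#include <stdio.h>
#include <stdlib.h>

struct bid { const char *client; double value_per_mb; int mb_wanted; };

/* Sort bids by declared value per MB, highest first. */
static int by_value_desc(const void *a, const void *b)
{
    double d = ((const struct bid *)b)->value_per_mb -
               ((const struct bid *)a)->value_per_mb;
    return (d > 0) - (d < 0);
}

int main(void)
{
    struct bid bids[] = {
        { "vm-a", 0.9, 512 },
        { "vm-b", 1.4, 1024 },
        { "vm-c", 0.3, 2048 },
    };
    int n = sizeof bids / sizeof bids[0];
    int free_mb = 2048;   /* memory available for rent */

    qsort(bids, n, sizeof bids[0], by_value_desc);
    for (int i = 0; i < n && free_mb > 0; i++) {
        int grant = bids[i].mb_wanted < free_mb ? bids[i].mb_wanted : free_mb;
        free_mb -= grant;
        printf("%s: granted %d MB\n", bids[i].client, grant);
    }
    return 0;
}
```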
Friendly barriers: efficient work-stealing with return barriers
Vivek Kumar, S. Blackburn, D. Grove

This paper addresses the problem of efficiently supporting parallelism within a managed runtime. A popular approach for exploiting software parallelism on parallel hardware is task parallelism, where the programmer explicitly identifies potential parallelism and the runtime then schedules the work. Work-stealing is a promising scheduling strategy that a runtime may use to keep otherwise idle hardware busy while relieving overloaded hardware of its burden. However, work-stealing comes with substantial overheads. Recent work identified the sequential overheads of work-stealing, those that occur even when no stealing takes place, as a significant source of overhead, and was able to reduce them to just 15%.

In this work, we turn to dynamic overheads, those that occur each time a steal takes place. We show that the dynamic overhead is dominated by introspection of the victim's stack when a steal takes place. We exploit the idea of a low-overhead return barrier to reduce the dynamic overhead by approximately half, resulting in total performance improvements of as much as 20%. Because, unlike prior work, we attack the overheads that are directly due to stealing, and therefore the overheads that grow as parallelism grows, we improve the scalability of work-stealing applications. This result is complementary to recent work addressing the sequential overheads of work-stealing. This work therefore substantially relieves work-stealing of the increasing pressure due to increasing intra-node hardware parallelism.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576207
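The following toy simulates the return-barrier idea on an explicit frame array (no real stack is touched): a thief records how far it has scanned and installs a hook in the youngest scanned frame, so later steals inspect only newer frames. The granularity and bookkeeping here are simplified inventions, not the paper's design:

```c
/* Toy return barrier: swap a frame's saved return address for a hook
 * so the runtime learns when the victim returns past the scan frontier. */
#include <stdio.h>

typedef void (*retaddr_t)(void);
struct frame { retaddr_t ret; };

#define DEPTH 16
static struct frame stack[DEPTH];
static int top;                 /* index of the youngest frame */
static int scanned_upto = -1;   /* frames 0..scanned_upto already scanned */

static void real_return(void) { }

static void barrier_hit(void)
{
    /* The victim returned past the frontier: the cached scan is stale.
     * (A real runtime would adjust the frontier, not reset it.) */
    printf("return barrier hit\n");
    scanned_upto = -1;
}

/* Thief-side scan: only frames pushed since the last scan need to be
 * inspected, which is the dynamic overhead the paper halves. */
static void steal_scan(void)
{
    for (int i = scanned_upto + 1; i <= top; i++) {
        /* look for stealable work in stack[i] (elided) */
    }
    printf("scanned frames %d..%d\n", scanned_upto + 1, top);
    scanned_upto = top;
    stack[top].ret = barrier_hit;   /* install the return barrier */
}

int main(void)
{
    for (int i = 0; i < DEPTH; i++)
        stack[i].ret = real_return;
    top = 5;

    steal_scan();                    /* full scan: frames 0..5 */
    stack[++top].ret = real_return;  /* victim pushes a frame */
    steal_scan();                    /* incremental scan: frame 6 only */
    stack[top--].ret();              /* victim pops: barrier fires */
    return 0;
}
```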
Real-time deep virtual machine introspection and its applications
Jennia Hizver, T. Chiueh

Virtual Machine Introspection (VMI) provides the ability to monitor virtual machines (VMs) in an agentless fashion by gathering VM execution states from the hypervisor and analyzing those states to extract information about a running operating system (OS) without installing an agent inside the VM. VMI's main challenge lies in the difficulty of converting low-level byte string values into high-level semantic states of the monitored VM's OS. In this work, we tackle this challenge by developing a real-time kernel data structure monitoring (RTKDSM) system that leverages the rich OS analysis capabilities of Volatility, an open source computer forensics framework, to significantly simplify and automate analysis of VM execution states. The RTKDSM system is designed as an extensible software framework that is meant to be extended to perform application-specific VM state analysis. In addition, the RTKDSM system is able to perform real-time monitoring of any changes made to the extracted OS states of guest VMs. This real-time monitoring capability is especially important for VMI-based security applications. To minimize the performance overhead associated with real-time kernel data structure monitoring, the RTKDSM system incorporates several optimizations whose effectiveness is reported in this paper.
International Conference on Virtual Execution Environments, 2014. doi:10.1145/2576195.2576196
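The semantic gap such systems bridge can be pictured with a toy: given only a raw byte snapshot of guest memory plus a known structure layout, recover a process list. The layout below is a made-up illustration, not any real kernel's; systems like RTKDSM obtain real layouts from OS profiles, as Volatility does:

```c
/* Toy VMI: reinterpret a raw memory snapshot as a linked process list
 * using an assumed (hypothetical) structure layout. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SNAP_SIZE 4096

/* Hypothetical guest "task struct" layout inside the snapshot. */
struct guest_task {
    uint32_t next;      /* snapshot offset of next task, 0 = end of list */
    char     comm[16];  /* process name */
};

static uint8_t snapshot[SNAP_SIZE];   /* stands in for guest RAM */

static void list_processes(uint32_t head_off)
{
    for (uint32_t off = head_off; off != 0; ) {
        struct guest_task t;
        memcpy(&t, snapshot + off, sizeof t);   /* reinterpret raw bytes */
        printf("process: %.16s\n", t.comm);
        off = t.next;
    }
}

int main(void)
{
    /* Fabricate a two-entry process list at offsets 0x100 and 0x200. */
    struct guest_task a = { 0x200, "init" }, b = { 0, "sshd" };
    memcpy(snapshot + 0x100, &a, sizeof a);
    memcpy(snapshot + 0x200, &b, sizeof b);
    list_processes(0x100);
    return 0;
}
```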
Preemptable ticket spinlocks: improving consolidated performance in the cloud
Jiannan Ouyang, J. Lange

When executing inside a virtual machine environment, OS-level synchronization primitives face significant challenges due to the scheduling behavior of the underlying virtual machine monitor. Operations that are assured to last only a short amount of time on real hardware can take considerably longer when running virtualized. This change in assumptions has a significant impact when an OS is executing inside a critical region that is protected by a spinlock. The interaction between OS-level spinlocks and VMM scheduling is known as the Lock Holder Preemption problem and has a significant impact on overall VM performance. However, with the use of ticket locks instead of generic spinlocks, virtual environments must also contend with waiters being preempted before they are able to acquire the lock. This has the effect of blocking access to a lock even if the lock itself is available. We identify this scenario as the Lock Waiter Preemption problem. In order to solve both problems we introduce Preemptable Ticket spinlocks, a new locking primitive that is designed to enable a VM to always make forward progress by relaxing the ordering guarantees offered by ticket locks. We show that the use of Preemptable Ticket spinlocks improves VM performance by 5.32X on average when running on a non-paravirtual VMM, and by 7.91X when running on a VMM that supports a paravirtual locking interface, when executing a set of microbenchmarks as well as a realistic e-commerce benchmark.
International Conference on Virtual Execution Environments, 2013. doi:10.1145/2451512.2451549
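For reference, here is the classic ticket spinlock the paper starts from, in C11 atomics. Strict FIFO order is exactly what hurts under preemption: if the waiter holding the next ticket is descheduled, every later waiter spins behind it. The comment marks where the preemptable variant would relax this ordering; the paper's actual algorithm is not reproduced here:

```c
/* Classic ticket spinlock in C11 atomics. */
#include <stdatomic.h>
#include <stdio.h>

struct ticket_lock {
    atomic_uint next;     /* next ticket to hand out */
    atomic_uint serving;  /* ticket currently allowed in */
};

static void tl_lock(struct ticket_lock *l)
{
    unsigned me = atomic_fetch_add(&l->next, 1);    /* take a ticket */
    while (atomic_load(&l->serving) != me) {
        /* spin -- a preemptable ticket lock would bound this wait and,
         * on timeout, let 'serving' skip ahead past ticket 'me' so the
         * lock is not held hostage by a preempted waiter */
    }
}

static void tl_unlock(struct ticket_lock *l)
{
    atomic_fetch_add(&l->serving, 1);   /* admit the next ticket */
}

int main(void)
{
    struct ticket_lock l = { 0, 0 };
    tl_lock(&l);
    puts("in critical section");
    tl_unlock(&l);
    return 0;
}
```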