Jinho Hwang, Ahsen J. Uppal, Timothy Wood, Howie Huang
Data center servers are typically overprovisioned, leaving spare memory and CPU capacity idle to handle unpredictable workload bursts by the virtual machines running on them. While this allows for fast hotspot mitigation, it is also wasteful. Unfortunately, making use of spare capacity without impacting active applications is particularly difficult for memory since it typically must be allocated in coarse chunks over long timescales. In this work we propose re- purposing the poorly utilized memory in a data center to store a volatile data store that is managed by the hypervisor. We present two uses for our Mortar framework: as a cache for prefetching disk blocks, and as an application-level distributed cache that follows the memcached protocol. Both prototypes use the framework to ask the hypervisor to store useful, but recoverable data within its free memory pool. This allows the hypervisor to control eviction policies and prioritize access to the cache. We demonstrate the benefits of our prototypes using realistic web applications and disk benchmarks, as well as memory traces gathered from live servers in our university's IT department. By expanding and contracting the data store size based on the free memory available, Mortar improves average response time of a web application by up to 35% compared to a fixed size memcached deployment, and improves overall video streaming performance by 45% through prefetching.
{"title":"Mortar: filling the gaps in data center memory","authors":"Jinho Hwang, Ahsen J. Uppal, Timothy Wood, Howie Huang","doi":"10.1145/2576195.2576203","DOIUrl":"https://doi.org/10.1145/2576195.2576203","url":null,"abstract":"Data center servers are typically overprovisioned, leaving spare memory and CPU capacity idle to handle unpredictable workload bursts by the virtual machines running on them. While this allows for fast hotspot mitigation, it is also wasteful. Unfortunately, making use of spare capacity without impacting active applications is particularly difficult for memory since it typically must be allocated in coarse chunks over long timescales. In this work we propose re- purposing the poorly utilized memory in a data center to store a volatile data store that is managed by the hypervisor. We present two uses for our Mortar framework: as a cache for prefetching disk blocks, and as an application-level distributed cache that follows the memcached protocol. Both prototypes use the framework to ask the hypervisor to store useful, but recoverable data within its free memory pool. This allows the hypervisor to control eviction policies and prioritize access to the cache. We demonstrate the benefits of our prototypes using realistic web applications and disk benchmarks, as well as memory traces gathered from live servers in our university's IT department. By expanding and contracting the data store size based on the free memory available, Mortar improves average response time of a web application by up to 35% compared to a fixed size memcached deployment, and improves overall video streaming performance by 45% through prefetching.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125981324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michihiro Horie, Kazunori Ogata, K. Kawachiya, Tamiya Onodera
To increase the memory efficiency in physical servers is a significant concern for increasing the number of virtual machines (VM) in them. When similar web application service runs in each guest VM, many string data with the same values are created in every guest VMs. These duplications of string data are redundant from the viewpoint of memory efficiency in the host OS. This paper proposes two approaches to reduce the duplication in Java string in a single Java VM (JVM) and across JVMs. The first approach is to share string objects cross JVMs by using a read-only memory-mapped file. The other approach is to selectively unify string objects created at runtime in the web applications. This paper evaluates our approach by using the Apache DayTrader and the DaCapo benchmark suite. Our prototype implementation chieved 7% to 12% reduction in the total size of the objects allocated over the lifetime of the programs. In addition, we observed the performance of DayTrader was maintained even under a situation of high density guest VMs in a KVM host machine.
{"title":"String deduplication for Java-based middleware in virtualized environments","authors":"Michihiro Horie, Kazunori Ogata, K. Kawachiya, Tamiya Onodera","doi":"10.1145/2576195.2576210","DOIUrl":"https://doi.org/10.1145/2576195.2576210","url":null,"abstract":"To increase the memory efficiency in physical servers is a significant concern for increasing the number of virtual machines (VM) in them. When similar web application service runs in each guest VM, many string data with the same values are created in every guest VMs. These duplications of string data are redundant from the viewpoint of memory efficiency in the host OS. This paper proposes two approaches to reduce the duplication in Java string in a single Java VM (JVM) and across JVMs. The first approach is to share string objects cross JVMs by using a read-only memory-mapped file. The other approach is to selectively unify string objects created at runtime in the web applications. This paper evaluates our approach by using the Apache DayTrader and the DaCapo benchmark suite. Our prototype implementation chieved 7% to 12% reduction in the total size of the objects allocated over the lifetime of the programs. In addition, we observed the performance of DayTrader was maintained even under a situation of high density guest VMs in a KVM host machine.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132361558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic languages have been gaining popularity to the point that their performance is starting to matter. The effort required to develop a production-quality, high-performance runtime is, however, staggering and the expertise required to do so is often out of reach of the community maintaining a particular language. Many domain specific languages remain stuck with naive implementations, as they are easy to write and simple to maintain for domain scientists. In this paper, we try to see how far one can push a naive implementation while remaining portable and not requiring expertise in compilers and runtime systems. We choose the R language, a dynamic language used in statistics, as the target of our experiment and adopt the simplest possible implementation strategy, one based on evaluation of abstract syntax trees. We build our interpreter on top of a Java virtual machine and use only facilities available to all Java programmers. We compare our results to other implementations of R.
{"title":"A fast abstract syntax tree interpreter for R","authors":"T. Kalibera, Petr Maj, Floréal Morandat, J. Vitek","doi":"10.1145/2576195.2576205","DOIUrl":"https://doi.org/10.1145/2576195.2576205","url":null,"abstract":"Dynamic languages have been gaining popularity to the point that their performance is starting to matter. The effort required to develop a production-quality, high-performance runtime is, however, staggering and the expertise required to do so is often out of reach of the community maintaining a particular language. Many domain specific languages remain stuck with naive implementations, as they are easy to write and simple to maintain for domain scientists. In this paper, we try to see how far one can push a naive implementation while remaining portable and not requiring expertise in compilers and runtime systems. We choose the R language, a dynamic language used in statistics, as the target of our experiment and adopt the simplest possible implementation strategy, one based on evaluation of abstract syntax trees. We build our interpreter on top of a Java virtual machine and use only facilities available to all Java programmers. We compare our results to other implementations of R.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116964568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Virtual machine introspection (VMI) allows users to debug software that executes within a virtual machine. To support rich, whole-system analyses, a VMI tool must inspect and control systems at multiple levels of the software stack. Traditional debuggers enable inspection and control, but they limit users to treating a whole system as just one kind of target: e.g., just a kernel, or just a process, but not both. We created Stackdb, a debugging library with VMI support that allows one to monitor and control a whole system through multiple, coordinated targets. A target corresponds to a particular level of the system's software stack; multiple targets allow a user to observe a VM guest at several levels of abstraction simultaneously. For example, with Stackdb, one can observe a PHP script running in a Linux process in a Xen VM via three coordinated targets at the language, process, and kernel levels. Within Stackdb, higher-level targets are components that utilize lower-level targets; a key contribution of Stackdb is its API that supports multi-level and flexible "stacks" of targets. This paper describes the challenges we faced in creating Stackdb, presents the solutions we devised, and evaluates Stackdb through its application to a security-focused, whole-system case study.
{"title":"Composable multi-level debugging with Stackdb","authors":"David Johnson, Mike Hibler, E. Eide","doi":"10.1145/2576195.2576212","DOIUrl":"https://doi.org/10.1145/2576195.2576212","url":null,"abstract":"Virtual machine introspection (VMI) allows users to debug software that executes within a virtual machine. To support rich, whole-system analyses, a VMI tool must inspect and control systems at multiple levels of the software stack. Traditional debuggers enable inspection and control, but they limit users to treating a whole system as just one kind of target: e.g., just a kernel, or just a process, but not both.\u0000 We created Stackdb, a debugging library with VMI support that allows one to monitor and control a whole system through multiple, coordinated targets. A target corresponds to a particular level of the system's software stack; multiple targets allow a user to observe a VM guest at several levels of abstraction simultaneously. For example, with Stackdb, one can observe a PHP script running in a Linux process in a Xen VM via three coordinated targets at the language, process, and kernel levels. Within Stackdb, higher-level targets are components that utilize lower-level targets; a key contribution of Stackdb is its API that supports multi-level and flexible \"stacks\" of targets. This paper describes the challenges we faced in creating Stackdb, presents the solutions we devised, and evaluates Stackdb through its application to a security-focused, whole-system case study.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"238 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124624609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi-Hong Lyu, Ding-Yong Hong, Tai-Yi Wu, Jan-Jan Wu, W. Hsu, Pangfeng Liu, P. Yew
Dynamic Binary Instrumentation (DBI) is a core technology for building debugging and profiling tools for application executables. Most state-of-the-art DBI systems have focused on the same instruction set architecture (ISA) where the guest binary and the host binary have the same ISA. It is uncommon to have a cross-ISA DBI system, such as a system that instruments ARM executables to run on x86 machines. We believe cross-ISA DBI systems are increasingly more important, since ARM executables could be more productively analyzed on x86 based machines such as commonly available PCs and servers. In this paper, we present DBILL, a cross-ISA and re- targetable dynamic binary instrumentation framework that builds on both QEMU and LLVM. The DBILL framework enables LLVM-based static instrumentation tools to become DBI ready, and deployable to different target architectures. Using address sanitizer and memory sanitizer as implementation examples, we show DBILL is an efficient, versatile and easy to use cross-ISA retargetable DBI framework.
{"title":"DBILL: an efficient and retargetable dynamic binary instrumentation framework using llvm backend","authors":"Yi-Hong Lyu, Ding-Yong Hong, Tai-Yi Wu, Jan-Jan Wu, W. Hsu, Pangfeng Liu, P. Yew","doi":"10.1145/2576195.2576213","DOIUrl":"https://doi.org/10.1145/2576195.2576213","url":null,"abstract":"Dynamic Binary Instrumentation (DBI) is a core technology for building debugging and profiling tools for application executables. Most state-of-the-art DBI systems have focused on the same instruction set architecture (ISA) where the guest binary and the host binary have the same ISA. It is uncommon to have a cross-ISA DBI system, such as a system that instruments ARM executables to run on x86 machines. We believe cross-ISA DBI systems are increasingly more important, since ARM executables could be more productively analyzed on x86 based machines such as commonly available PCs and servers. In this paper, we present DBILL, a cross-ISA and re- targetable dynamic binary instrumentation framework that builds on both QEMU and LLVM. The DBILL framework enables LLVM-based static instrumentation tools to become DBI ready, and deployable to different target architectures. Using address sanitizer and memory sanitizer as implementation examples, we show DBILL is an efficient, versatile and easy to use cross-ISA retargetable DBI framework.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121520834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-tier applications are widely deployed in today's virtualized cloud computing environments. At the same time, management operations in these virtualized environments, such as load balancing, hardware maintenance, workload consolidation, etc., often make use of live virtual machine (VM) migration to control the placement of VMs. Although existing solutions are able to migrate a single VM efficiently, little attention has been devoted to migrating related VMs in multi-tier applications. Ignoring the relatedness of VMs during migration can lead to serious application performance degradation. This paper formulates the multi-tier application migration problem, and presents a new communication-impact-driven coordinated approach, as well as a system called COMMA that realizes this approach. Through extensive testbed experiments, numerical analyses, and a demonstration of COMMA on Amazon EC2, we show that this approach is highly effective in minimizing migration's impact on multi-tier applications' performance.
{"title":"COMMA: coordinating the migration of multi-tier applications","authors":"Jie Zheng, T. Ng, K. Sripanidkulchai, Zhaolei Liu","doi":"10.1145/2576195.2576200","DOIUrl":"https://doi.org/10.1145/2576195.2576200","url":null,"abstract":"Multi-tier applications are widely deployed in today's virtualized cloud computing environments. At the same time, management operations in these virtualized environments, such as load balancing, hardware maintenance, workload consolidation, etc., often make use of live virtual machine (VM) migration to control the placement of VMs. Although existing solutions are able to migrate a single VM efficiently, little attention has been devoted to migrating related VMs in multi-tier applications. Ignoring the relatedness of VMs during migration can lead to serious application performance degradation. This paper formulates the multi-tier application migration problem, and presents a new communication-impact-driven coordinated approach, as well as a system called COMMA that realizes this approach. Through extensive testbed experiments, numerical analyses, and a demonstration of COMMA on Amazon EC2, we show that this approach is highly effective in minimizing migration's impact on multi-tier applications' performance.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132368583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Madhukar N. Kedlaya, Behnam Robatmili, Calin Cascaval, B. Hardekopf
We are interested in implementing dynamic language runtimes on top of language-level virtual machines. Type specialization is a critical optimization for dynamic language runtimes: generic code that handles any type of data is replaced with specialized code for particular types observed during execution. However, types can change, and the runtime must recover whenever unexpected types are encountered. The state-of-the-art recovery mechanism is called deoptimization. Deoptimization is a well-known technique for dynamic language runtimes implemented in low-level languages like C. However, no dynamic language runtime implemented on top of a virtual machine such as the Common Language Runtime (CLR) or the Java Virtual Machine (JVM) uses deoptimization, because the implementation thereof used in low-level languages is not possible. In this paper we propose a novel technique that enables deoptimization for dynamic language runtimes implemented on top of typed, stack-based virtual machines. Our technique does not require any changes to the underlying virtual machine. We implement our proposed technique in a JavaScript language implementation, MCJS, running on top of the Mono runtime (CLR). We evaluate our implementation against the current state-of-the-art recovery mechanism for virtual machine-based runtimes, as implemented both in MCJS and in IronJS. We show that deoptimization provides significant performance benefits, even for runtimes running on top of a virtual machine.
{"title":"Deoptimization for dynamic language JITs on typed, stack-based virtual machines","authors":"Madhukar N. Kedlaya, Behnam Robatmili, Calin Cascaval, B. Hardekopf","doi":"10.1145/2576195.2576209","DOIUrl":"https://doi.org/10.1145/2576195.2576209","url":null,"abstract":"We are interested in implementing dynamic language runtimes on top of language-level virtual machines. Type specialization is a critical optimization for dynamic language runtimes: generic code that handles any type of data is replaced with specialized code for particular types observed during execution. However, types can change, and the runtime must recover whenever unexpected types are encountered. The state-of-the-art recovery mechanism is called deoptimization. Deoptimization is a well-known technique for dynamic language runtimes implemented in low-level languages like C. However, no dynamic language runtime implemented on top of a virtual machine such as the Common Language Runtime (CLR) or the Java Virtual Machine (JVM) uses deoptimization, because the implementation thereof used in low-level languages is not possible.\u0000 In this paper we propose a novel technique that enables deoptimization for dynamic language runtimes implemented on top of typed, stack-based virtual machines. Our technique does not require any changes to the underlying virtual machine. We implement our proposed technique in a JavaScript language implementation, MCJS, running on top of the Mono runtime (CLR). We evaluate our implementation against the current state-of-the-art recovery mechanism for virtual machine-based runtimes, as implemented both in MCJS and in IronJS. We show that deoptimization provides significant performance benefits, even for runtimes running on top of a virtual machine.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125087830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Microsoft Research Drawbridge Project began with a simple question: Is it possible to achieve the benefits of hardware virtual machines without the overheads? Following that question, we have built a line of exploratory prototypes. These prototypes range from an ARM-based phone that runs x86 Windows binaries to new forms of secure computation. In this talk, I'll briefly describe our various prototypes and the evidence we have accumulated that our first question can be answered in the affirmative.
{"title":"Experiences in the land of virtual abstractions","authors":"G. Hunt","doi":"10.1145/2674025.2576215","DOIUrl":"https://doi.org/10.1145/2674025.2576215","url":null,"abstract":"The Microsoft Research Drawbridge Project began with a simple question: Is it possible to achieve the benefits of hardware virtual machines without the overheads? Following that question, we have built a line of exploratory prototypes. These prototypes range from an ARM-based phone that runs x86 Windows binaries to new forms of secure computation. In this talk, I'll briefly describe our various prototypes and the evidence we have accumulated that our first question can be answered in the affirmative.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130564880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Limited main memory size is considered as one of the major bottlenecks in virtualization environments. Content-Based Page Sharing (CBPS) is an efficient memory deduplication technique to reduce server memory requirements, in which pages with same content are detected and shared into a single copy. As the widely used implementation of CBPS, Kernel Samepage Merging (KSM) maintains the whole memory pages into two global comparison trees (a stable tree and an unstable tree). To detect page sharing opportunities, each tracked page needs to be compared with pages already in these two large global trees. However since the vast majority of compared pages have different content with it, that will induce massive futility comparisons and thus heavy overhead. In this paper, we propose a lightweight page Classification-based Memory Deduplication approach named CMD to reduce futile page comparison overhead meanwhile to detect page sharing opportunities efficiently. The main innovation of CMD is that pages are grouped into different classifications based on page access characteristics. Pages with similar access characteristics are suggested to have higher possibility with same content, thus they are grouped into the same classification. In CMD, the large global comparison trees are divided into multiple small trees with dedicated local ones in each page classification. Page comparisons are performed just in the same classification, and pages from different classifications are never compared (since they probably result in futile comparisons). The experimental results show that CMD can efficiently reduce page comparisons (by about 68.5%) meanwhile detect nearly the same (by more than 98%) or even more page sharing opportunities.
{"title":"CMD: classification-based memory deduplication through page access characteristics","authors":"Licheng Chen, Zhipeng Wei, Zehan Cui, Mingyu Chen, Haiyang Pan, Yungang Bao","doi":"10.1145/2576195.2576204","DOIUrl":"https://doi.org/10.1145/2576195.2576204","url":null,"abstract":"Limited main memory size is considered as one of the major bottlenecks in virtualization environments. Content-Based Page Sharing (CBPS) is an efficient memory deduplication technique to reduce server memory requirements, in which pages with same content are detected and shared into a single copy. As the widely used implementation of CBPS, Kernel Samepage Merging (KSM) maintains the whole memory pages into two global comparison trees (a stable tree and an unstable tree). To detect page sharing opportunities, each tracked page needs to be compared with pages already in these two large global trees. However since the vast majority of compared pages have different content with it, that will induce massive futility comparisons and thus heavy overhead.\u0000 In this paper, we propose a lightweight page Classification-based Memory Deduplication approach named CMD to reduce futile page comparison overhead meanwhile to detect page sharing opportunities efficiently. The main innovation of CMD is that pages are grouped into different classifications based on page access characteristics. Pages with similar access characteristics are suggested to have higher possibility with same content, thus they are grouped into the same classification. In CMD, the large global comparison trees are divided into multiple small trees with dedicated local ones in each page classification. Page comparisons are performed just in the same classification, and pages from different classifications are never compared (since they probably result in futile comparisons). The experimental results show that CMD can efficiently reduce page comparisons (by about 68.5%) meanwhile detect nearly the same (by more than 98%) or even more page sharing opportunities.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131792627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Program instrumentation techniques form the basis of many recent software security defenses, including defenses against common exploits and security policy enforcement. As compared to source-code instrumentation, binary instrumentation is easier to use and more broadly applicable due to the ready availability of binary code. Two key features needed for security instrumentations are (a) it should be applied to all application code, including code contained in various system and application libraries, and (b) it should be non-bypassable. So far, dynamic binary instrumentation (DBI) techniques have provided these features, whereas static binary instrumentation (SBI) techniques have lacked them. These features, combined with ease of use, have made DBI the de facto choice for security instrumentations. However, DBI techniques can incur high overheads in several common usage scenarios, such as application startups, system-calls, and many real-world applications. We therefore develop a new platform for secure static binary instrumentation (PSI) that overcomes these drawbacks of DBI techniques, while retaining the security, robustness and ease-of-use features. We illustrate the versatility of PSI by developing several instrumentation applications: basic block counting, shadow stack defense against control-flow hijack and return-oriented programming attacks, and system call and library policy enforcement. While being competitive with the best DBI tools on CPU-intensive SPEC 2006 benchmark, PSI provides an order of magnitude reduction in overheads on a collection of real-world applications.
{"title":"A platform for secure static binary instrumentation","authors":"Mingwei Zhang, Rui Qiao, N. Hasabnis, R. Sekar","doi":"10.1145/2576195.2576208","DOIUrl":"https://doi.org/10.1145/2576195.2576208","url":null,"abstract":"Program instrumentation techniques form the basis of many recent software security defenses, including defenses against common exploits and security policy enforcement. As compared to source-code instrumentation, binary instrumentation is easier to use and more broadly applicable due to the ready availability of binary code. Two key features needed for security instrumentations are (a) it should be applied to all application code, including code contained in various system and application libraries, and (b) it should be non-bypassable. So far, dynamic binary instrumentation (DBI) techniques have provided these features, whereas static binary instrumentation (SBI) techniques have lacked them. These features, combined with ease of use, have made DBI the de facto choice for security instrumentations. However, DBI techniques can incur high overheads in several common usage scenarios, such as application startups, system-calls, and many real-world applications. We therefore develop a new platform for secure static binary instrumentation (PSI) that overcomes these drawbacks of DBI techniques, while retaining the security, robustness and ease-of-use features. We illustrate the versatility of PSI by developing several instrumentation applications: basic block counting, shadow stack defense against control-flow hijack and return-oriented programming attacks, and system call and library policy enforcement. While being competitive with the best DBI tools on CPU-intensive SPEC 2006 benchmark, PSI provides an order of magnitude reduction in overheads on a collection of real-world applications.","PeriodicalId":202844,"journal":{"name":"International Conference on Virtual Execution Environments","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133494003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}