RepFrame: An Efficient and Transparent Framework for Dynamic Program Analysis
Heming Cui, Rui Gu, Cheng Liu, Junfeng Yang
DOI: 10.1145/2797022.2797033

Dynamic program analysis frameworks greatly improve software quality because they enable a wide range of powerful analysis tools (e.g., for reliability, profiling, and logging) at runtime. However, because existing frameworks run only one actual execution of each software application, that execution is fully or partially coupled with an analysis tool in order to transfer execution states (e.g., accessed memory and thread interleavings) to the tool, which easily causes a prohibitive slowdown. To reduce the portion of execution states that must be transferred, many frameworks require significant tailoring of the analysis tools as well as of the frameworks themselves. These frameworks therefore sacrifice transparency to analysis tools and allow only one type of tool to run within one execution. This paper presents RepFrame, an efficient and transparent framework that fully decouples execution and analysis by constructing multiple equivalent executions. To do so, RepFrame leverages a recent fault-tolerance technique, transparent state machine replication, which runs the same software application on a set of machines (or replicas) and ensures that all replicas see the same sequence of inputs and process these inputs with the same efficient thread interleavings, automatically. In addition, this paper discusses potential directions in which RepFrame can further strengthen existing analyses. Evaluation shows that RepFrame can easily run two asynchronous analysis tools together and has reasonable overhead.
Enforcing Privacy Policies with Meta-Code
H. Johansen, Eleanor Birrell, R. V. Renesse, F. Schneider, Magnus Stenhaug, D. Johansen
DOI: 10.1145/2797022.2797040

This paper proposes a mechanism for expressing and enforcing security policies for shared data. Security policies are expressed as stateful meta-code operations; meta-code can express a broad class of policies, including access-based policies, use-based policies, obligations, and sticky policies with declassification. The meta-code is interposed in the filesystem access path to ensure policy compliance. The generality and feasibility of our approach are demonstrated using a sports analytics prototype system.
Scalability in the Clouds!: A Myth or Reality?
Sanidhya Kashyap, Changwoo Min, Taesoo Kim
DOI: 10.1145/2797022.2797037

With the increasing demand for big-data processing and faster in-memory databases, cloud providers are gearing toward large virtualized instances rather than horizontal scalability. However, our experiments reveal that such instances in popular cloud services (e.g., 32 vCPUs with 208 GB of memory, as offered by Google Compute Engine) do not achieve the desired scalability with increasing core count, even with a simple, embarrassingly parallel job (e.g., a kernel compile). More seriously, the internal synchronization scheme (e.g., the paravirtualized ticket spinlock) of a virtualized instance on a machine with a higher core count (e.g., 80 cores) dramatically degrades its overall performance. Our finding differs from the previously well-known lock-contention scalability problem and occurs because of the sophisticated optimization techniques implemented in the hypervisor; we call it the sleepy-spinlock anomaly. To solve this problem, we design and implement oticket, a variant of the paravirtualized ticket spinlock that effectively scales virtualized instances in both undersubscribed and oversubscribed environments.
Samsara: Efficient Deterministic Replay with Hardware Virtualization Extensions
S. Ren, Chunqi Li, L. Tan, Zhen Xiao
DOI: 10.1145/2797022.2797028

Deterministic replay, which provides the ability to travel backward in time and reconstruct the past execution flow of a multi-processor system, has many prominent applications, including cyclic debugging, intrusion detection, malware analysis, and fault tolerance. Previous software-only schemes cannot take advantage of modern hardware support for replay and suffer from excessive performance overhead. They also produce huge logs due to the inherent drawbacks of the point-to-point logging approach they use. In this paper, we propose a novel approach, called Samsara, which uses hardware-assisted virtualization (HAV) extensions to achieve an efficient software-based replay system. Unlike previous software-only schemes that record dependences between individual instructions, we record processors' execution as a series of chunks. By leveraging HAV extensions, we avoid the large number of memory-access detections that are a major source of overhead in previous work and instead perform a single extended page table (EPT) traversal at the end of each chunk. We have implemented and evaluated our system on KVM with Intel's Haswell processor. Evaluation results show that our system incurs less than 3X overhead compared to native execution with two processors, whereas the overhead of other state-of-the-art work is well above 10X. Our system improves recording performance dramatically, with a log size even smaller than that of hardware-based schemes.
A Name Is Not A Name: The Implementation Of A Cloud Storage System
Vinh Tao, Vianney Rancurel, João Neto
DOI: 10.1145/2797022.2797034

Automatic resolution of conflicting updates in cloud storage services has been well studied; however, correctly implementing such resolution in real-world systems remains challenging. In this paper, we present the challenges we experienced when implementing our cloud storage system. They include (1) detecting the intended object of an update when that object has been automatically changed by conflict resolution, and (2) producing no divergent intermediate results when resolving conflicting updates from more than two replicas. We present our solution, which uses the conflict-resolution mechanism itself to redirect an update to its intended object, and which uses Conflict-Free Replicated Data Types (CRDTs) for a "clean" implementation of conflict resolution without divergent intermediate results.
SELF: Improving the Memory-Sharing Opportunity using Virtual-Machine Self-Hints in Virtualized Systems
Yeji Nam, Dongwook Lee, Y. Eom
DOI: 10.1145/2797022.2797038

Currently, most data centers consolidate servers using virtualization technology to reduce power consumption and the servers' environmental "footprint." In these virtualized systems, page sharing has been widely studied and adopted to increase the degree of consolidation, thereby increasing the opportunity for power savings. Traditional page-sharing schemes miss the sharing opportunities of short-lived pages, and they degrade performance by exhaustively and sequentially scanning memory contents. In this paper, we introduce SELF, an enhanced page-sharing scheme for virtualized systems. We mitigate the semantic gap between guest and host by adding a self-hint module to each virtual machine (VM); by exploiting the VMs' hints, the host can preferentially compare pages with a high sharing probability. Through quantitative experiments, we verified that SELF obtains new sharing opportunities by specifying directories with a high sharing probability, at low overhead.
Eliminating State Entanglement with Checkpoint-based Virtualization of Mobile OS Services
Kevin Boos, A. A. Sani, Lin Zhong
DOI: 10.1145/2797022.2797041

Mobile operating systems have adopted a service model in which applications access system functionality by interacting with various OS Services running in separate processes. These interactions cause application-specific states to be spread across many service processes, a problem we identify as state entanglement. State entanglement presents significant challenges to a wide variety of computing goals: fault isolation, fault tolerance, application migration, live update, and application speculation. We propose CORSA, a novel virtualization solution that uses a lightweight checkpoint/restore mechanism to virtualize OS Services on a per-application basis. This cleanly encapsulates a single application's service-side states into a private virtual service instance, eliminating state entanglement and enabling the above goals. We present empirical evidence that our ongoing implementation of CORSA on Android is feasible with low overhead, even in the worst case of high-frequency service interactions.
Containing the Hype
Kavita Agarwal, Bhushan Jain, Donald E. Porter
DOI: 10.1145/2797022.2797029

Containers, or OS-based virtualization, have seen a recent resurgence in deployment. The term "container" is nearly synonymous with "lightweight virtualization," despite a remarkable dearth of careful measurements supporting this notion. This paper contributes comparative measurements and analysis of containers and hardware virtual machines where the functionality of the two technologies intersects. The paper focuses on two issues important to cloud computing: density (guests per physical host) and start-up latency (for responding to load spikes). We conclude that overall density depends heavily on the most demanded resource. In many dimensions there are no significant differences, and in other dimensions VMs have significantly higher overheads. A particular contribution is the first detailed analysis of the biggest difference---memory footprint---and of opportunities to significantly reduce this overhead.

Why Is HTTP Adaptive Streaming So Hard?
Sangwook Bae, Dahyun Jang, KyoungSoo Park
DOI: 10.1145/2797022.2797031

HTTP adaptive streaming is increasingly popular in video delivery, mainly because HTTP allows easy deployment and simplifies content delivery, while chunk-based delivery enables dynamic adaptation of video quality to varying network bandwidth. However, we find that the very nature of chunk delivery over HTTP causes some fundamental problems for efficient bandwidth utilization. In this work, we investigate why it is so hard to adapt to varying bandwidth with HTTP adaptive streaming. First, we find that the choice of chunk duration greatly affects the bandwidth-adaptation logic. Second, we observe that the disparity between a chunk's advertised quality and its real encoding rate confuses the client-side adaptation logic. Third, the dependence on TCP/HTTP leads to suboptimal bandwidth utilization and makes it challenging to adapt to rapidly changing bandwidth. We show evidence of these problems in controlled experiments with popular HTTP adaptive streaming schemes and lay out future requirements for robust bandwidth adaptation in video streaming.
{"title":"Proceedings of the 6th Asia-Pacific Workshop on Systems","authors":"K. Kono, Takahiro Shinagawa","doi":"10.1145/2797022","DOIUrl":"https://doi.org/10.1145/2797022","url":null,"abstract":"","PeriodicalId":125617,"journal":{"name":"Proceedings of the 6th Asia-Pacific Workshop on Systems","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115328599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}