
Proceedings of the 2017 Symposium on Cloud Computing: Latest Publications

PBSE: a robust path-based speculative execution for degraded-network tail tolerance in data-parallel frameworks
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3131622
Riza O. Suminto, Cesar A. Stuardo, Alexandra Clark, Huan Ke, Tanakorn Leesatapornwongsa, Bo Fu, D. Kurniawan, V. Martin, Maheswara Rao G. Uma, Haryadi S. Gunawi
We reveal loopholes of Speculative Execution (SE) implementations under a unique fault model: node-level network throughput degradation. This problem appears in many data-parallel frameworks such as Hadoop MapReduce and Spark. To address this, we present PBSE, a robust, path-based speculative execution that employs three key ingredients: path progress, path diversity, and path-straggler detection and speculation. We show how PBSE is superior to other approaches such as cloning and aggressive speculation under the aforementioned fault model. PBSE is a general solution, applicable to many data-parallel frameworks such as Hadoop/HDFS+QFS, Spark and Flume.
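To illustrate the path-based idea from the abstract, the Python sketch below flags data-transfer paths whose progress lags far behind their peers and speculates only when a diverse alternative path exists. All names, progress scores, and thresholds are illustrative assumptions, not PBSE's actual algorithm.

```python
from statistics import median

def find_path_stragglers(path_progress, slow_ratio=0.5):
    """Flag data-transfer paths whose progress lags far behind their peers.
    path_progress maps a (task, source_node) path to a progress score in [0, 1]."""
    if len(path_progress) < 2:
        return []                      # no peer paths to compare against
    med = median(path_progress.values())
    return [p for p, prog in path_progress.items() if prog < slow_ratio * med]

def should_speculate(path, stragglers, healthy_nodes):
    """Speculate a task only if its path is a straggler and a different node
    (path diversity) is available to serve the speculative copy."""
    _task, source_node = path
    return path in stragglers and any(n != source_node for n in healthy_nodes)

# The path through node "dn3" is degraded relative to its peers.
progress = {("map_1", "dn1"): 0.90, ("map_2", "dn2"): 0.85, ("map_3", "dn3"): 0.10}
stragglers = find_path_stragglers(progress)
print(should_speculate(("map_3", "dn3"), stragglers, ["dn1", "dn2", "dn4"]))  # True
```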
Citations: 20
A robust partitioning scheme for ad-hoc query workloads
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3131613
Anil Shanbhag, Alekh Jindal, S. Madden, Jorge-Arnulfo Quiané-Ruiz, Aaron J. Elmore
Data partitioning is crucial to improving query performance, and several workload-based partitioning techniques have been proposed in the database literature. However, many modern analytic applications involve ad-hoc or exploratory analysis where users do not have a representative query workload a priori. Static workload-based data partitioning techniques are therefore not suitable for such settings. In this paper, we propose Amoeba, a distributed storage system that uses adaptive multi-attribute data partitioning to efficiently support ad-hoc as well as recurring queries. Amoeba requires zero set-up and tuning effort, allowing analysts to get the benefits of partitioning without requiring an upfront query workload. The key idea is to build and maintain a partitioning tree on top of the dataset. The partitioning tree allows us to answer queries with predicates by reading a subset of the data. The initial partitioning tree is created without requiring an upfront query workload, and Amoeba adapts it over time by incrementally repartitioning subtrees based on user queries. A prototype of Amoeba running on top of Apache Spark improves query performance by up to 7x over full scans and up to 2x over range-based partitioning techniques on TPC-H as well as a real-world workload.
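As a rough illustration of the partitioning-tree idea, the sketch below (Python, with hypothetical attribute names and layout) prunes a range-predicate query down to the subset of partitions that can contain matching rows; Amoeba's actual tree construction and adaptation are more involved.

```python
class Node:
    """Internal node of a (hypothetical) multi-attribute partitioning tree:
    rows with row[attr] <= cutoff go left, the rest go right."""
    def __init__(self, attr, cutoff, left, right):
        self.attr, self.cutoff, self.left, self.right = attr, cutoff, left, right

class Leaf:
    """Leaf pointing at one stored partition (e.g. one file or block)."""
    def __init__(self, partition_id):
        self.partition_id = partition_id

def partitions_for(node, attr, lo, hi):
    """Return only the partitions that can contain rows with lo <= attr <= hi,
    so a selective query reads a subset of the data instead of a full scan."""
    if isinstance(node, Leaf):
        return [node.partition_id]
    if node.attr != attr:              # the predicate does not constrain this split
        return partitions_for(node.left, attr, lo, hi) + partitions_for(node.right, attr, lo, hi)
    parts = []
    if lo <= node.cutoff:
        parts += partitions_for(node.left, attr, lo, hi)
    if hi > node.cutoff:
        parts += partitions_for(node.right, attr, lo, hi)
    return parts

# A tiny tree: first split on "age", then the left subtree splits on "salary".
tree = Node("age", 30,
            Node("salary", 50_000, Leaf("p0"), Leaf("p1")),
            Leaf("p2"))
print(partitions_for(tree, "age", 40, 60))   # ['p2']  only one partition is scanned
```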
Citations: 36
mBalloon: enabling elastic memory management for big data processing
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3132565
Wei Chen, Aidi Pi, J. Rao, Xiaobo Zhou
Big Data processing often suffers from significant memory pressure, resulting in excessive garbage collection (GC) and out-of-memory (OOM) errors that harm system performance and reliability. Therefore, users tend to give applications an excessive heap size to avoid job failure, causing low cluster utilization. In this paper, we demonstrate that lightweight virtualization, such as OS containers, opens up opportunities to address memory pressure: 1) tasks running in a container can be set to a large heap size to avoid OOM errors without worrying about thrashing the host machine; 2) tasks that are under memory pressure and incur significant GC activity can be temporarily "suspended" by depriving the hosting container of resources, and can be "resumed" later when other tasks complete and release their resources. We propose and develop mBalloon, an elastic memory manager that leverages containers to flexibly and precisely control the memory usage of big data tasks. Applications running with mBalloon can survive memory pressure, incur less GC overhead, and help improve cluster utilization.
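The "suspend under memory pressure, resume when resources free up" idea can be sketched with the Linux cgroup v1 freezer, as below; the paths, watermarks, and victim-selection rule are assumptions for illustration and are not mBalloon's implementation.

```python
import os

FREEZER_ROOT = "/sys/fs/cgroup/freezer"   # cgroup v1 freezer hierarchy (assumed mounted)

def set_frozen(container, frozen):
    """Suspend or resume a container by writing its cgroup freezer state.
    Requires root and a freezer cgroup named after the container; error
    handling and cgroup v2 support are omitted in this sketch."""
    state = "FROZEN" if frozen else "THAWED"
    with open(os.path.join(FREEZER_ROOT, container, "freezer.state"), "w") as f:
        f.write(state)

def rebalance(running, suspended, free_mem_mb, heap_mb, low_mb=512, high_mb=2048):
    """Toy policy: when host free memory drops below low_mb, suspend the running
    task container with the largest heap; when it rises above high_mb, resume
    the longest-suspended container."""
    if free_mem_mb < low_mb and running:
        victim = max(running, key=lambda c: heap_mb[c])
        set_frozen(victim, True)
        running.remove(victim)
        suspended.append(victim)
    elif free_mem_mb > high_mb and suspended:
        revived = suspended.pop(0)
        set_frozen(revived, False)
        running.append(revived)
```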
Citations: 0
A policy-based system for dynamic scaling of virtual machine memory reservations
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3127491
R. Smith, S. Rixner
To maximize the effectiveness of modern virtualization systems, resources must be allocated fairly and efficiently amongst virtual machines (VMs). However, current policies for allocating memory are relatively static. As a result, system-wide memory utilization is often sub-optimal, leading to unnecessary paging and performance degradation. To better utilize the large-scale memory resources of modern machines, the virtualization system must allow virtual machines to expand beyond their initial memory reservations, while still fairly supporting concurrent virtual machines. This paper presents a system for dynamically allocating memory amongst virtual machines at runtime, as well as an evaluation of six allocation policies implemented within the system. The system allows guest VMs to expand and contract according to their changing demands by uniquely improving and integrating mechanisms such as memory ballooning, memory hotplug, and hypervisor paging. Furthermore, the system provides fairness by guaranteeing each guest a minimum reservation, charging for rentals beyond this minimum, and enforcing timely reclamation of memory.
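A toy allocation policy in the spirit of the abstract, shown below in Python: every VM keeps its guaranteed minimum reservation, and spare host memory is handed out in proportion to demand beyond that minimum. It is not one of the six policies evaluated in the paper.

```python
def allocate_memory(host_mb, vms):
    """vms maps a VM name to {"min": guaranteed MB, "demand": current demand MB}.
    Each VM gets its minimum; spare host memory is shared in proportion to the
    demand above the minimum, capped so no VM receives more than it asked for."""
    alloc = {name: v["min"] for name, v in vms.items()}
    spare = host_mb - sum(alloc.values())
    extra = {name: max(0, v["demand"] - v["min"]) for name, v in vms.items()}
    total_extra = sum(extra.values())
    if spare > 0 and total_extra > 0:
        for name in vms:
            share = spare * extra[name] // total_extra
            alloc[name] += min(extra[name], share)
    return alloc

print(allocate_memory(8192, {"vm1": {"min": 1024, "demand": 4096},
                             "vm2": {"min": 1024, "demand": 1024}}))
# {'vm1': 4096, 'vm2': 1024}  vm1's burst is satisfied, vm2 keeps its minimum
```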
Citations: 3
WorkloadCompactor: reducing datacenter cost while providing tail latency SLO guarantees
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3132245
T. Zhu, M. Kozuch, Mor Harchol-Balter
Service providers want to reduce datacenter costs by consolidating workloads onto fewer servers. At the same time, customers have performance goals, such as meeting tail latency Service Level Objectives (SLOs). Consolidating workloads while meeting tail latency goals is challenging, especially since workloads in production environments are often bursty. To limit the congestion when consolidating workloads, customers and service providers often agree upon rate limits. Ideally, rate limits are chosen to maximize the number of workloads that can be co-located while meeting each workload's SLO. In reality, neither the service provider nor customer knows how to choose rate limits. Customers end up selecting rate limits on their own in some ad hoc fashion, and service providers are left to optimize given the chosen rate limits. This paper describes WorkloadCompactor, a new system that uses workload traces to automatically choose rate limits simultaneously with selecting onto which server to place workloads. Our system meets customer tail latency SLOs while minimizing datacenter resource costs. Our experiments show that by optimizing the choice of rate limits, WorkloadCompactor reduces the number of required servers by 30--60% as compared to state-of-the-art approaches.
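The two coupled decisions named in the abstract, choosing a rate limit per workload and placing workloads onto servers, can be sketched as below (Python; token-bucket sizing from a trace plus first-fit packing). This is only a simplified stand-in for WorkloadCompactor's optimization, not its actual algorithm.

```python
def bucket_depth(trace, rate):
    """Smallest token-bucket depth that never delays this trace at the given
    rate limit: the largest backlog that builds up when the trace is drained
    at `rate`. trace is work per time unit (e.g. IOs per second)."""
    backlog, depth = 0.0, 0.0
    for arrivals in trace:
        backlog = max(0.0, backlog + arrivals - rate)
        depth = max(depth, backlog)
    return depth

def first_fit(rate_limits, server_capacity):
    """Place workloads (name -> chosen rate limit) onto servers so that the sum
    of rate limits on each server stays within its capacity."""
    servers = []
    for name, rate in sorted(rate_limits.items(), key=lambda kv: -kv[1]):
        for srv in servers:
            if srv["used"] + rate <= server_capacity:
                srv["used"] += rate
                srv["workloads"].append(name)
                break
        else:
            servers.append({"used": rate, "workloads": [name]})
    return servers

print(bucket_depth([10, 50, 10, 10], rate=20))                    # 30.0 tokens of burst
print(first_fit({"w1": 60, "w2": 50, "w3": 40}, server_capacity=100))
```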
Citations: 35
KVS: high-efficiency kernel-level virtual switch
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3131615
Heungsik Choi, Gyeongsik Yang, Kyungwoon Lee, C. Yoo
In clouds, the virtual switch (vSwitch) is in charge of packet forwarding between virtual machines (VMs). However, kernel-based vSwitches show throughput degradation under intensive packet processing; this becomes a bottleneck for the network performance of clouds. DPDK-based vSwitch (DPDK vSwitch) [1] has been developed to resolve the performance problem. Although it exhibits high throughput, DPDK vSwitch has two weak points. First, it consumes excessive memory. DPDK vSwitch uses huge pages to reduce the number of memory operations, and this design causes high memory consumption even when the traffic is low. According to [2], memory determines the available number of VMs per single physical server. Thus, saving memory decreases the capital expenditure of clouds. Second, security is another concern of DPDK vSwitch, because its data plane is exposed to user space through shared memory [3]. Therefore, the isolation of packets across VMs cannot be guaranteed. To overcome the excessive memory use and the security concern, we propose a new kernel-level vSwitch (KVS) based on Linux. KVS neither uses huge pages nor bypasses the kernel stack. Instead, KVS applies the following key ideas to enhance the throughput.
Citations: 1
DAIET
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3132018
Amedeo Sapio, I. Abdelaziz, Marco Canini, Panos Kalnis
Many data center applications nowadays rely on distributed computation models like MapReduce and Bulk Synchronous Parallel (BSP) for data-intensive computation at scale [4]. These models scale by leveraging the partition/aggregate pattern, where data and computations are distributed across many worker servers, each performing part of the computation. A communication phase is needed each time workers need to synchronize the computation and, finally, to produce the final output. In these applications, the network communication costs can be one of the dominant scalability bottlenecks, especially in the case of multi-stage or iterative computations [1]. The advent of flexible networking hardware and expressive data plane programming languages has produced networks that are deeply programmable [2]. This creates the opportunity to co-design distributed systems with their network layer, which can offer substantial performance benefits. A possible use of this emerging technology is to execute the logic traditionally associated with the application layer within the network itself. Given that in the above-mentioned applications the intermediate results are necessarily exchanged through the network, it is desirable to offload to it part of the aggregation task so as to reduce the traffic and lessen the work of the servers. However, these programmable networking devices typically have very stringent constraints on the number and type of operations that can be performed at line rate. Moreover, packet processing at high speed requires very fast memory, such as TCAM or SRAM, which is expensive and usually available only in small capacities.
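The offload argued for above, aggregating intermediate results as they cross the network, amounts to running a combiner on the data path. The Python sketch below only illustrates the data reduction, not the switch-level (e.g. P4) implementation or its memory constraints.

```python
from collections import defaultdict

def in_network_aggregate(packets):
    """Combine partial results "inside the network" instead of forwarding every
    packet: key-value pairs with the same key are summed, so the aggregation
    servers receive one record per key. A Python stand-in for logic that a
    system like DAIET would run on a programmable switch."""
    agg = defaultdict(int)
    for key, value in packets:
        agg[key] += value
    return dict(agg)

# Four packets from two mappers collapse to two records for the reducer.
packets = [("word:a", 3), ("word:b", 1), ("word:a", 2), ("word:b", 4)]
print(in_network_aggregate(packets))  # {'word:a': 5, 'word:b': 5}
```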
Citations: 1
HotSpot: automated server hopping in cloud spot markets
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3132017
Supreeth Shastri, David E. Irwin
Cloud spot markets offer virtual machines (VMs) for a dynamic price that is much lower than the fixed price of on-demand VMs. In exchange, spot VMs expose applications to multiple forms of risk, including price risk, or the risk that a VM's price will increase relative to others. Since spot prices vary continuously across hundreds of different types of VMs, flexible applications can mitigate price risk by moving to the VM that currently offers the lowest cost. To enable this flexibility, we present HotSpot, a resource container that "hops" VMs---by dynamically selecting and self-migrating to new VMs---as spot prices change. HotSpot containers define a migration policy that lowers cost by determining when to hop VMs based on the transaction costs (from vacating a VM early and briefly double paying for it) and benefits (the expected cost savings). As a side effect of migrating to minimize cost, HotSpot is also able to reduce the number of revocations without degrading performance. HotSpot is simple and transparent: since it operates at the systems-level on each host VM, users need only run an HotSpot-enabled VM image to use it. We implement a HotSpot prototype on EC2, and evaluate it using job traces from a production Google cluster. We then compare HotSpot to using on-demand VMs and spot VMs (with and without fault-tolerance) in EC2, and show that it is able to lower cost and reduce the number of revocations without degrading performance.
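The core of the migration policy described above is a cost-benefit test. A minimal sketch, with illustrative prices and billing assumptions rather than HotSpot's exact accounting:

```python
def should_hop(current_price, best_price, remaining_hours,
               migration_hours, partial_hour_owed):
    """Hop to the cheaper spot VM only if the expected savings over the job's
    remaining runtime outweigh the transaction cost of moving: paying for both
    VMs during migration plus the already-billed fraction of the current hour.
    Prices are in $/hour; all inputs are illustrative."""
    savings = (current_price - best_price) * remaining_hours
    transaction_cost = ((current_price + best_price) * migration_hours
                        + current_price * partial_hour_owed)
    return savings > transaction_cost

print(should_hop(current_price=0.30, best_price=0.10, remaining_hours=10,
                 migration_hours=0.1, partial_hour_owed=0.5))  # True
```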
Citations: 58
Search lookaside buffer: efficient caching for index data structures
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3127483
Xingbo Wu, Fan Ni, Song Jiang
With the ever-increasing DRAM capacity in commodity computers, applications tend to store large amounts of data in main memory for fast access. Accordingly, efficient traversal of index structures to locate requested data becomes crucial to their performance. The index data structures grow so large that only a fraction of them can be cached in the CPU cache. The CPU cache can leverage access locality to keep the most frequently used part of an index in it for fast access. However, traversing the index to reach the target data during a search can result in significant false temporal and spatial locality, which leaves CPU cache space substantially underutilized. In this paper we show that even for highly skewed accesses the index traversal incurs excessive cache misses, leading to suboptimal data access performance. To address the issue, we introduce the Search Lookaside Buffer (SLB) to selectively cache only the search results, instead of the index itself. SLB can be easily integrated with any index data structure to increase utilization of the limited CPU cache resource and improve the throughput of search requests on a large data set. We integrate SLB with various index data structures and applications. Experiments show that SLB can improve the throughput of the index data structures by up to an order of magnitude. Experiments with real-world key-value traces also show up to 73% throughput improvement on a hash table.
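A minimal sketch of the idea, caching search results in front of an arbitrary index rather than caching the index itself, is shown below in Python; the plain LRU eviction and the dict-based "index" are stand-ins for SLB's actual cache-resident design.

```python
from collections import OrderedDict

class SearchLookasideBuffer:
    """Small LRU cache of (key -> search result) placed in front of an index.
    A hit skips the index traversal entirely; a miss traverses the index and
    caches the result."""
    def __init__(self, index_lookup, capacity=1024):
        self.index_lookup = index_lookup      # the real index's search function
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)       # refresh LRU position
            return self.cache[key]
        result = self.index_lookup(key)       # full index traversal on a miss
        self.cache[key] = result
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used
        return result

# Wrap a (stand-in) index: repeated hot keys no longer pay the traversal cost.
index = {i: f"value-{i}" for i in range(100_000)}
slb = SearchLookasideBuffer(index.__getitem__, capacity=4)
print(slb.get(42), slb.get(42))  # the second call is served from the SLB
```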
Citations: 4
AidOps: a data-driven provisioning of high-availability services in cloud
Pub Date : 2017-09-24 DOI: 10.1145/3127479.3129250
D. Lugones, Jordi Arjona Aroca, Yue Jin, A. Sala, V. Hilt
The virtualization of services with high-availability requirements calls for revisiting traditional operation and provisioning processes. Providers are realizing services in software on virtual machines instead of using dedicated appliances, so that service capacity can be dynamically adjusted to changing demands. Cloud orchestration systems control the number of service instances deployed to make sure each service has enough capacity to meet incoming workloads. However, determining the suitable build-out of a service is challenging, as it takes time to install new instances, and excessive re-configurations (i.e., scaling in/out) can lead to decreased stability. In this paper we present AidOps, a cloud orchestration system that leverages machine learning and domain-specific knowledge to predict the traffic demand, optimizing service performance and cost. AidOps does not require a conservative provisioning of services to cover the worst-case demand, and it significantly reduces operational costs while still fulfilling service quality expectations. We have evaluated our framework with real traffic using an enterprise application and a communication service in a private cloud. Our results show up to 4X improvement in service performance indicators compared to existing orchestration systems. AidOps achieves up to 99.985% availability levels while reducing operational costs by at least 20%.
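The predict-then-provision loop the abstract describes can be sketched as below; the moving-average-plus-trend forecast and the headroom factor are assumptions standing in for AidOps' learned, domain-specific predictor.

```python
import math

def instances_needed(history, per_instance_capacity, headroom=1.2, window=6):
    """Provision service instances from a short-term demand forecast instead of
    worst-case peaks: forecast the next interval as a moving average of recent
    demand plus its latest trend, add a small headroom factor, and size the
    build-out accordingly."""
    recent = history[-window:]
    forecast = sum(recent) / len(recent)
    if len(recent) >= 2:
        forecast += recent[-1] - recent[-2]          # naive trend correction
    return max(1, math.ceil(headroom * forecast / per_instance_capacity))

demand = [120, 150, 180, 240, 300, 380]              # requests/s over recent intervals
print(instances_needed(demand, per_instance_capacity=100))  # 4: scale out ahead of growth
```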
Citations: 8