
Proceedings of the 2017 Symposium on Cloud Computing: latest publications

Latency reduction and load balancing in coded storage systems
Pub Date: 2017-09-24 DOI: 10.1145/3127479.3131623
Yaochen Hu, Yushi Wang, Bang Liu, Di Niu, Cheng Huang
Erasure coding has been used in storage systems to enhance data durability at lower storage overhead. However, these systems suffer from long access-latency tails because they lack flexible load-balancing mechanisms and launch degraded reads only passively, when the original storage node of the requested data becomes a hotspot. We provide a new perspective on load balancing in coded storage systems by proactively and intelligently launching degraded reads, and we propose a variety of schemes that make optimal decisions either per request or statistically across requests. Experiments on a 98-machine cluster, based on request traces of 12 million objects collected from Windows Azure Storage (WAS), show that our schemes can reduce the median latency by 44.7% and the 95th-percentile tail latency by 77.8% in coded storage systems.
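The per-request decision described above can be pictured as a simple cost comparison between a normal read at the (possibly hot) primary node and a proactively launched degraded read. The sketch below is a minimal illustration under assumed cost models; the queue-length estimate, decoding penalty, and function names are inventions for exposition, not the paper's actual schemes.

```python
# Illustrative sketch of a per-request proactive degraded-read decision.
# Cost model is an assumption: a normal read waits behind the primary's
# queue; a degraded read is as slow as the slowest of the k nodes it
# contacts, plus a reconstruction cost.

def estimate_latency(queue_len, service_time=1.0):
    """Crude latency estimate: queued work plus one service time."""
    return (queue_len + 1) * service_time

def choose_read_plan(primary, recovery_nodes, k, queue_lens, decode_cost=0.5):
    """Return the node set to read from for one request.

    primary        -- node holding the requested block
    recovery_nodes -- nodes holding other blocks of the same stripe
    k              -- blocks needed to reconstruct the requested block
    queue_lens     -- dict: node -> outstanding requests
    """
    normal = estimate_latency(queue_lens[primary])
    # Pick the k least-loaded recovery nodes for the degraded read.
    chosen = sorted(recovery_nodes, key=lambda n: queue_lens[n])[:k]
    degraded = max(estimate_latency(queue_lens[n]) for n in chosen) + decode_cost
    return [primary] if normal <= degraded else chosen

queues = {"A": 9, "B": 1, "C": 2, "D": 1, "E": 3}
print(choose_read_plan("A", ["B", "C", "D", "E"], k=3, queue_lens=queues))
```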
Citations: 21
Batch spot market for data analytics cloud providers
Pub Date: 2017-09-24 DOI: 10.1145/3127479.3134348
S. Costache, Tommaso Madonia, A. Tantawi, M. Steinder
Hosting data analytics services is challenging, as their workload is often composed of on-line jobs (e.g., interactive or streaming), which require fast on-demand provisioning, and batch jobs. As workload demand fluctuations lead to varying idle capacity, efficient resource management is difficult, particularly given differing provider objectives, e.g., utilization or revenue.
Citations: 0
Fragola: low-latency transactions in distributed data stores
Pub Date: 2017-09-24 DOI: 10.1145/3127479.3132686
Yonatan Gottesman, Aran Bergman, Edward Bortnikov, Eshcar Hillel, I. Keidar, Ohad Shacham
As transaction processing services begin to be used in new application domains, low transaction latency becomes an important consideration. Motivated by such use cases, we developed Fragola, a highly scalable, low-latency, high-throughput transaction processing engine for Apache HBase. Like other modern transaction managers, Fragola provides a variant of generalized snapshot isolation (SI), which scales better than traditional serializability implementations.
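Snapshot isolation itself can be summarized in a few lines. The toy, single-node checker below is a generic SI sketch with a first-committer-wins rule; the class and method names are invented for illustration and say nothing about Fragola's actual HBase-based design.

```python
# Toy sketch of snapshot isolation (SI) with a central timestamp oracle.
# Invented for illustration; Fragola's distributed design differs.

class SIStore:
    def __init__(self):
        self.ts = 0                 # logical clock / timestamp oracle
        self.versions = {}          # key -> list of (commit_ts, value)

    def begin(self):
        self.ts += 1
        return {"start_ts": self.ts, "writes": {}}

    def read(self, txn, key):
        # Read the latest version committed at or before the snapshot.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= txn["start_ts"]:
                return value
        return None

    def write(self, txn, key, value):
        txn["writes"][key] = value  # buffered until commit

    def commit(self, txn):
        # First-committer-wins: abort on a write-write conflict with any
        # transaction that committed after our snapshot was taken.
        for key in txn["writes"]:
            for commit_ts, _ in self.versions.get(key, []):
                if commit_ts > txn["start_ts"]:
                    return False    # conflict -> abort
        self.ts += 1
        for key, value in txn["writes"].items():
            self.versions.setdefault(key, []).append((self.ts, value))
        return True

store = SIStore()
t1, t2 = store.begin(), store.begin()
store.write(t1, "x", 1)
store.write(t2, "x", 2)
print(store.commit(t1), store.commit(t2))  # True False
```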
Citations: 0
DLSH: a distribution-aware LSH scheme for approximate nearest neighbor query in cloud computing
Pub Date: 2017-09-24 DOI: 10.1145/3127479.3127485
Yuanyuan Sun, Yu Hua, Xue Liu, Shunde Cao, Pengfei Zuo
Cloud computing needs to process and analyze massive high-dimensional data in real time. Approximate queries in cloud computing systems can provide timely results with acceptable accuracy, thus avoiding the consumption of large amounts of resources. Locality Sensitive Hashing (LSH) maintains data locality and supports approximate queries. However, because it chooses hash functions at random, LSH must use many hash functions to guarantee query accuracy, and the extra computation and storage overheads degrade its real-world performance. To reduce these overheads and deliver high performance, we propose a distribution-aware scheme, called DLSH, that offers a cost-effective approximate nearest neighbor query service for cloud computing. The idea of DLSH is to use the principal components of the data distribution as the projection vectors of the hash functions in LSH, then quantify the weight of each hash function and adjust the interval value in each hash table. We further refine the queried result set based on hit frequency to significantly decrease the time overhead of distance computation. Extensive experiments on a large-scale cloud computing testbed demonstrate significant improvements across multiple system performance metrics. We have released the source code of DLSH for public use.
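The core idea, using the principal components of the data distribution as LSH projection vectors instead of random directions, can be sketched with numpy as follows. The bucket width and other parameters are illustrative assumptions; the authors' released source code is the authoritative implementation.

```python
# Minimal sketch of distribution-aware LSH: project points onto the top
# principal components of the dataset rather than random directions, then
# quantize each projection into buckets of width w.
import numpy as np

def fit_dlsh(data, num_hashes, w=1.0):
    mean = data.mean(axis=0)
    centered = data - mean
    # Principal components via SVD; rows of vt are the directions of
    # maximal variance in the data distribution.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projections = vt[:num_hashes]     # one hash function per component
    return mean, projections, w

def hash_point(point, mean, projections, w):
    # Each hash value is the bucket index of the point's projection.
    return tuple(np.floor((projections @ (point - mean)) / w).astype(int))

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 16))
mean, proj, w = fit_dlsh(data, num_hashes=4)
print(hash_point(data[0], mean, proj, w))
```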
Citations: 3
Prism: a proxy architecture for datacenter networks
Pub Date: 2017-09-24 DOI: 10.1145/3127479.3127480
Yutaro Hayakawa, L. Eggert, Michio Honda, Douglas J. Santry
In datacenters, workload throughput is often constrained by the attachment bandwidth of proxy servers, despite the much higher aggregate bandwidth of backend servers. We introduce a novel architecture that addresses this problem by combining programmable network switches with a controller that together act as a network "Prism" that can transparently redirect individual client transactions to different backend servers. Unlike traditional proxy approaches, with Prism, transaction payload data is exchanged directly between clients and backend servers, which eliminates the proxy bottleneck. Because the controller only handles transactional metadata, it should scale to much higher transaction rates than traditional proxies. An experimental evaluation with a prototype implementation demonstrates correctness of operation, improved bandwidth utilization and low packet transformation overheads even in software.
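The division of labor is easiest to see in code: the controller touches only per-transaction metadata and installs forwarding state, while payload bytes flow directly between client and backend. The sketch below is a toy model with an invented rule table and a trivial round-robin policy, not Prism's actual switch programming.

```python
# Toy sketch of a Prism-style control/data split: the controller sees only
# transaction metadata, picks a backend, and installs a switch rule so the
# payload flows client <-> backend directly, bypassing the controller.
import itertools

class Controller:
    def __init__(self, backends):
        self.rr = itertools.cycle(backends)   # trivial selection policy
        self.switch_rules = {}                # (client, txn_id) -> backend

    def on_request_metadata(self, client, txn_id):
        backend = next(self.rr)
        # "Install" a rule in the programmable switch: subsequent packets
        # of this transaction are rewritten to the chosen backend, so
        # payload bytes never traverse the controller.
        self.switch_rules[(client, txn_id)] = backend
        return backend

    def on_transaction_done(self, client, txn_id):
        self.switch_rules.pop((client, txn_id), None)  # reclaim rule space

ctrl = Controller(["backend-1", "backend-2", "backend-3"])
print(ctrl.on_request_metadata("client-A", txn_id=1))
print(ctrl.on_request_metadata("client-B", txn_id=7))
```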
Citations: 4
SLO-aware colocation of data center tasks based on instantaneous processor requirements
Pub Date: 2017-09-05 DOI: 10.1145/3127479.3132244
P. Janus, K. Rządca
In a cloud data center, a single physical machine simultaneously executes dozens of highly heterogeneous tasks. Such colocation results in more efficient utilization of machines, but when tasks' requirements exceed the available resources, some tasks may be throttled or preempted. We analyze version 2.1 of the Google cluster trace, which records short-term (1 second) task CPU usage. Contrary to the assumptions of many theoretical studies, we demonstrate that the empirical distributions do not follow any single distribution. However, high percentiles of the total processor usage (summed over at least 10 tasks) can be reasonably estimated by a Gaussian distribution. We use this result in a probabilistic fit test for standard bin-packing algorithms, called the Gaussian Percentile Approximation (GPA). To check whether a new task will fit on a machine, GPA checks whether the percentile of the resulting distribution corresponding to the requested service level objective (SLO) is still below the machine's capacity. In our simulation experiments, GPA resulted in colocations exceeding the machines' capacity with a frequency similar to the requested SLO.
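Once total usage is modeled as Gaussian, the GPA test reduces to a one-line percentile check. A minimal sketch, assuming independent task demands so that variances add (an illustrative simplification, not necessarily the paper's exact model):

```python
# Minimal sketch of the Gaussian Percentile Approximation (GPA) fit test:
# model the machine's total CPU usage as Gaussian and admit a new task only
# if the SLO percentile of the resulting distribution stays under capacity.
from statistics import NormalDist
from math import sqrt

def gpa_fits(task_means, task_vars, new_mean, new_var, capacity, slo=0.95):
    mu = sum(task_means) + new_mean
    sigma = sqrt(sum(task_vars) + new_var)   # assumes independent demands
    z = NormalDist().inv_cdf(slo)            # e.g. ~1.645 for a 95% SLO
    return mu + z * sigma <= capacity

# Nine colocated tasks plus one candidate on a 16-core machine.
means = [1.2] * 9
variances = [0.16] * 9
print(gpa_fits(means, variances, new_mean=1.2, new_var=0.16, capacity=16))
```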
Citations: 24
Sparkle: optimizing Spark for large memory machines and analytics
Pub Date: 2017-08-18 DOI: 10.1145/3127479.3134762
Mijung Kim, Jun Yu Li, Haris Volos, M. Marwah, A. Ulanov, K. Keeton, Joseph A. Tucek, L. Cherkasova, Le Xu, Pradeep R. Fernando
Given the growing availability of affordable scale-up servers, our goal is to bring the performance benefits of in-memory processing on scale-up servers to an increasingly common class of data analytics applications that process small to medium size datasets (up to a few hundred GBs), which can easily fit in the memory of a typical scale-up server. To achieve this goal, we leverage Spark, an existing memory-centric data analytics framework with widespread adoption among data scientists. Bringing Spark's data analytics capabilities to a scale-up system requires rethinking the original design assumptions, which, although effective for a scale-out system, are a poor match for a scale-up system, resulting in unnecessary communication and memory inefficiencies.
Citations: 15
How good are machine learning clouds for binary classification with good features?: extended abstract
Pub Date: 2017-07-29 DOI: 10.1145/3127479.3132570
Hantian Zhang, L. Zeng, Wentao Wu, Ce Zhang
In spite of recent advances in machine learning research, modern machine learning systems are still far from easy to use, at least from the perspective of business users or even of scientists without a computer science background. Recently, there has been a trend toward pushing machine learning onto the cloud as a "service," a.k.a. machine learning clouds. By putting a set of machine learning primitives on the cloud, these services significantly raise the level of abstraction for machine learning. For example, with Amazon Machine Learning, users only need to upload the dataset and specify the type of task (classification or regression). The cloud then trains machine learning models without any user intervention.
Citations: 9
Mithril: mining sporadic associations for cache prefetching
Pub Date: 2017-05-21 DOI: 10.1145/3127479.3131210
Juncheng Yang, Reza Karimi, Trausti Sæmundsson, Avani Wildani, Ymir Vigfusson
The growing pressure on cloud application scalability has accentuated storage performance as a critical bottleneck. Although cache replacement algorithms have been extensively studied, cache prefetching - reducing latency by retrieving items before they are actually requested - remains an underexplored area. In particular, existing approaches to history-based prefetching provide too little benefit to real systems for the resources they cost. We propose Mithril, a prefetching layer that efficiently exploits historical patterns in cache request associations. Mithril is inspired by sporadic association rule mining and relies only on the timestamps of requests. Through an evaluation of 135 block-storage traces, we show that Mithril is effective, giving an average 55% hit-ratio increase over LRU and Probability Graph, and a 36% hit-ratio gain over Amp, at reasonable cost. Finally, we demonstrate that the improvement comes from Mithril's ability to capture mid-frequency blocks.
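The mining step can be approximated in a few lines: blocks that repeatedly appear close together in the request stream become prefetch rules. The window size and support threshold below are illustrative knobs, not the paper's tuned values.

```python
# Sketch of timestamp-based association mining for prefetching, in the
# spirit of Mithril: block pairs that co-occur within a short window of
# the request stream, often enough to clear a support threshold, become
# prefetch rules.
from collections import defaultdict

def mine_rules(trace, window=4, min_support=2):
    """trace: list of block ids in request order. Returns A -> {B, ...}."""
    pair_counts = defaultdict(int)
    for i, a in enumerate(trace):
        for b in trace[i + 1 : i + 1 + window]:
            if a != b:
                pair_counts[(a, b)] += 1
    rules = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_support:     # sporadic but recurring association
            rules[a].add(b)
    return rules

trace = [1, 7, 2, 9, 1, 7, 3, 1, 7, 2]
rules = mine_rules(trace)
print(sorted(rules[1]))  # blocks to prefetch when block 1 is requested
```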
Citations: 17
Occupy the cloud: distributed computing for the 99%
Pub Date: 2017-02-13 DOI: 10.1145/3127479.3128601
Eric Jonas, Qifan Pu, S. Venkataraman, I. Stoica, B. Recht
Distributed computing remains inaccessible to a large number of users, in spite of many open source platforms and extensive commercial offerings. While distributed computation frameworks have moved beyond a simple map-reduce model, many users still struggle with complex cluster management and configuration tools, even for running simple embarrassingly parallel jobs. We argue that stateless functions represent a viable platform for these users, eliminating cluster management overhead and fulfilling the promise of elasticity. Furthermore, using our prototype implementation, PyWren, we show that this model is general enough to implement a number of distributed computing models, such as BSP, efficiently. Extrapolating from recent trends in network bandwidth and the advent of disaggregated storage, we suggest that stateless functions are a natural fit for data processing in future computing environments.
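The appeal of stateless functions is easiest to see with a PyWren-style parallel map, sketched below. The executor calls follow the shape shown in the paper, though the exact API may differ across PyWren versions.

```python
# Sketch of the embarrassingly-parallel map that the paper argues
# serverless platforms handle well, using a PyWren-style API: each
# invocation runs in its own short-lived cloud function, with no cluster
# to provision or manage.
import pywren

def my_function(x):
    # Any pure, stateless computation.
    return x * x

pwex = pywren.default_executor()
futures = pwex.map(my_function, range(10))
print([f.result() for f in futures])
```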
Citations: 449