
2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID): Latest Publications

Addressing the Challenges of Executing a Massive Computational Cluster in the Cloud
Brandon Posey, Christopher Gropp, Boyd Wilson, Boyd McGeachie, S. Padhi, Alexander Herzog, A. Apon
A major limitation for time-to-science can be the lack of available computing resources. Depending on the capacity of resources, executing an application suite with hundreds of thousands of jobs can take weeks when resources are in high demand. We describe how we dynamically provision a large-scale high-performance computing cluster of more than one million cores utilizing Amazon Web Services (AWS). We discuss the trade-offs, challenges, and solutions associated with creating such a large-scale cluster with commercial cloud resources. We utilize our large-scale cluster to study a parameter-sweep workflow composed of message-passing parallel topic modeling jobs on multiple datasets. At peak, we achieve a simultaneous core count of 1,119,196 vCPUs across nearly 50,000 instances, and are able to execute almost half a million jobs within two hours utilizing AWS Spot Instances in a single AWS region. Our solutions to the challenges and trade-offs have broad application to the lifecycle management of similar clusters on other commercial clouds.
{"title":"Addressing the Challenges of Executing a Massive Computational Cluster in the Cloud","authors":"Brandon Posey, Christopher Gropp, Boyd Wilson, Boyd McGeachie, S. Padhi, Alexander Herzog, A. Apon","doi":"10.1109/CCGRID.2018.00040","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00040","url":null,"abstract":"A major limitation for time-to-science can be the lack of available computing resources. Depending on the capacity of resources, executing an application suite with hundreds of thousands of jobs can take weeks when resources are in high demand. We describe how we dynamically provision a large scale high performance computing cluster of more than one million cores utilizing Amazon Web Services (AWS). We discuss the trade-offs, challenges, and solutions associated with creating such a large scale cluster with commercial cloud resources. We utilize our large scale cluster to study a parameter sweep workflow composed of message-passing parallel topic modeling jobs on multiple datasets. At peak, we achieve a simultaneous core count of 1,119,196 vCPUs across nearly 50,000 instances, and are able to execute almost half a million jobs within two hours utilizing AWS Spot Instances in a single AWS region. Our solutions to the challenges and trade-offs have broad application to the lifecycle management of similar clusters on other commercial clouds.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127410631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
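The provisioning mechanic at the heart of the paper, requesting large blocks of Spot capacity on demand and pointing them at a parameter-sweep workload, can be illustrated with a short boto3 sketch. The AMI, instance type, price ceiling, and counts below are illustrative placeholders, not values from the paper.

```python
# Minimal sketch: request AWS Spot capacity for a parameter-sweep batch.
# All concrete values (AMI, instance type, price ceiling, counts) are
# illustrative placeholders, not the configuration used in the paper.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def request_spot_workers(count, max_price="0.50"):
    """Ask for `count` one-time Spot instances; return the request IDs."""
    resp = ec2.request_spot_instances(
        SpotPrice=max_price,                      # ceiling per instance-hour
        InstanceCount=count,
        Type="one-time",
        LaunchSpecification={
            "ImageId": "ami-0123456789abcdef0",   # hypothetical worker AMI
            "InstanceType": "c5.18xlarge",        # 72 vCPUs per instance
            "KeyName": "sweep-key",               # hypothetical key pair
        },
    )
    return [r["SpotInstanceRequestId"] for r in resp["SpotInstanceRequests"]]

# At 72 vCPUs per instance, a million cores would need roughly 14,000
# instances; the paper instead spread requests over many instance types and
# availability zones, reaching nearly 50,000 instances at peak.
if __name__ == "__main__":
    print(request_spot_workers(10))
```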
Towards Massive Consolidation in Data Centers with SEaMLESS
A. Segalini, Dino Lopez Pacheco, Quentin Jacquemart
In Data Centers (DCs), an abundance of virtual machines (VMs) remain idle due to network services awaiting incoming connections, or due to established but idling sessions. These VMs waste RAM, the scarcest resource in DCs, as they lock their allocated memory. In this paper, we introduce SEaMLESS, a solution designed to (i) transform fully-fledged idle VMs into lightweight and resourceless virtual network functions (VNFs), and then (ii) reduce the memory allocated to those idle VMs. By replacing idle VMs with VNFs, SEaMLESS provides fast VM restoration upon user-activity detection, thereby introducing limited impact on the Quality of Experience (QoE). Our results show that SEaMLESS can consolidate hundreds of VMs as VNFs onto one single machine. SEaMLESS is thus able to release the majority of the memory allocated to idle VMs. This freed memory can then be reassigned to new VMs, or lead to massive consolidation, enabling a better utilization of DC resources.
{"title":"Towards Massive Consolidation in Data Centers with SEaMLESS","authors":"A. Segalini, Dino Lopez Pacheco, Quentin Jacquemart","doi":"10.1109/CCGRID.2018.00038","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00038","url":null,"abstract":"In Data Centers (DCs), an abundance of virtual machines (VMs) remain idle due to network services awaiting for incoming connections, or due to established-and-idling sessions. These VMs lead to wastage of RAM – the scarcest resource in DCs – as they lock their allocated memory. In this paper, we introduce SEaMLESS, a solution designed to (i) transform fully-fledged idle VMs into lightweight and resourceless virtual network functions (VNFs), then (ii) reduces the allocated memory to those idle VMs. By replacing idle VMs with VNFs, SEaMLESS provides fast VM restoration upon user activity detection, thereby introducing limited impact on the Quality of Experience (QoE). Our results show that SEaMLESS can consolidate hundreds of VMs as VNFs onto one single machine. SEaMLESS is thus able to release the majority of the memory allocated to idle VMs. This freed memory can then be reassigned to new VMs, or lead to massive consolidation, to enable a better utilization of DC resources.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131989413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
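The central idea, a lightweight stand-in that holds an idle VM's listening socket and triggers restoration only when a client actually connects, can be sketched in a few lines of Python. The `restore_vm` hook is a hypothetical placeholder for the real VM-resume machinery.

```python
# Minimal sketch of the SEaMLESS idea: a lightweight proxy holds an idle
# VM's listening port and wakes the VM only when a client actually connects.
# `restore_vm` is a hypothetical hook standing in for the real resume path.
import socket

def restore_vm(vm_id):
    # Placeholder: a real implementation would resume the suspended VM
    # (e.g., from a memory snapshot) and wait until its service is back.
    print(f"restoring {vm_id} ...")

def hold_port(vm_id, port):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", port))
    srv.listen(16)              # queue connections while the VM wakes up
    conn, _ = srv.accept()      # user activity detected
    restore_vm(vm_id)           # bring the fully-fledged VM back
    conn.close()                # let the client retry against the restored VM
    srv.close()                 # release the port to the restored VM

if __name__ == "__main__":
    hold_port("vm-42", 8080)
```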
SuperCell: Adaptive Software-Defined Storage for Cloud Storage Workloads
K. Uehara, Yu Xiang, Y. Chen, M. Hiltunen, Kaustubh R. Joshi, R. Schlichting
The explosive growth of data due to the increasing adoption of cloud technologies in the enterprise has created a strong demand for more flexible, cost-effective, and scalable storage solutions. Many storage systems, however, are not well matched to the workloads they service due to the difficulty of configuring the storage system optimally a priori with only approximate knowledge of the workload characteristics. This paper shows how cloud-based orchestration can be leveraged to create flexible storage solutions that use continuous adaptation to tailor themselves to their target application workloads, and in doing so, provide superior performance, cost, and scalability over traditional fixed designs. To demonstrate this approach, we have built "SuperCell," a Ceph-based distributed storage solution with a recommendation engine for the storage configuration. SuperCell provides storage operators with real-time recommendations on how to reconfigure the storage system to optimize its performance, cost, and efficiency, based on statistical storage modeling and data analysis of the actual workload. Using real cloud storage workloads, we experimentally demonstrate that SuperCell reduces the cost of storage systems by up to 48% while meeting the service-level agreement (SLA) 99% of the time, a level that no static design meets for these workloads.
{"title":"SuperCell: Adaptive Software-Defined Storage for Cloud Storage Workloads","authors":"K. Uehara, Yu Xiang, Y. Chen, M. Hiltunen, Kaustubh R. Joshi, R. Schlichting","doi":"10.1109/CCGRID.2018.00025","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00025","url":null,"abstract":"The explosive growth of data due to the increasing adoption of cloud technologies in the enterprise has created a strong demand for more flexible, cost-effective, and scalable storage solutions. Many storage systems, however, are not well matched to the workloads they service due to the difficulty of configuring the storage system optimally a priori with only approximate knowledge of the workload characteristics. This paper shows how cloud-based orchestration can be leveraged to create flexible storage solutions that use continuous adaptation to tailor themselves to their target application workloads, and in doing so, provide superior performance, cost, and scalability over traditional fixed designs. To demonstrate this approach, we have built \"SuperCell,\" a Ceph-based distributed storage solution with a recommendation engine for the storage configuration. SuperCell provides storage operators with real-time recommendations on how to reconfigure the storage system to optimize its performance, cost, and efficiency based on statistical storage modeling and data analysis of the actual workload. Using real cloud storage workloads, we experimentally demonstrate that SuperCell reduces the cost of storage systems by up to 48%, while meeting service level agreement (SLA) 99% of the time, a level that any static design fails to meet for the workloads.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131582213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
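A toy version of the recommendation loop, choosing the cheapest candidate configuration whose modeled latency still meets the SLA under the observed workload, conveys the approach. The candidate list and the latency model below are invented for illustration, not taken from the paper.

```python
# Toy SuperCell-style recommender: among candidate storage configurations,
# pick the cheapest whose predicted latency meets the SLA for the observed
# workload. Candidates and the latency model are illustrative only.

CANDIDATES = [
    # (name, monthly cost in $, served IOPS capacity)
    ("3-node-hdd", 300, 2_000),
    ("3-node-ssd", 900, 20_000),
    ("6-node-ssd", 1800, 45_000),
]

def predicted_p99_latency_ms(observed_iops, capacity):
    # Crude open-queue approximation: latency blows up near saturation.
    util = observed_iops / capacity
    return float("inf") if util >= 1.0 else 5.0 / (1.0 - util)

def recommend(observed_iops, sla_p99_ms=50.0):
    feasible = [
        (cost, name) for name, cost, cap in CANDIDATES
        if predicted_p99_latency_ms(observed_iops, cap) <= sla_p99_ms
    ]
    return min(feasible)[1] if feasible else None

print(recommend(15_000))   # -> "3-node-ssd" under these made-up numbers
```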
An Empirical Evaluation of Allgatherv on Multi-GPU Systems
Thomas B. Rolinger, T. Simon, Christopher D. Krieger
Applications for deep learning and big data analytics have compute and memory requirements that exceed the limits of a single GPU. However, effectively scaling out an application to multiple GPUs is challenging due to the complexities of communication between the GPUs, particularly for collective communication with irregular message sizes. In this work, we provide a performance evaluation of the Allgatherv routine on multi-GPU systems, focusing on GPU network topology and the communication library used. We present results from the OSU micro-benchmarks as well as a case study of sparse tensor factorization, an application that uses Allgatherv with highly irregular message sizes. We extend our existing tensor factorization tool to run on systems with different node counts and varying numbers of GPUs per node. We then evaluate the communication performance of our tool when using traditional MPI, CUDA-aware MVAPICH, and NCCL across a suite of real-world data sets on three different systems: a 16-node cluster with one GPU per node, NVIDIA's DGX-1 with 8 GPUs, and Cray's CS-Storm with 16 GPUs. Our results show that irregularity in the tensor data sets produces trends that contradict those in the OSU micro-benchmarks, as well as trends that are absent from the benchmarks.
{"title":"An Empirical Evaluation of Allgatherv on Multi-GPU Systems","authors":"Thomas B. Rolinger, T. Simon, Christopher D. Krieger","doi":"10.1109/CCGRID.2018.00027","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00027","url":null,"abstract":"Applications for deep learning and big data analytics have compute and memory requirements that exceed the limits of a single GPU. However, effectively scaling out an application to multiple GPUs is challenging due to the complexities of communication between the GPUs, particularly for collective communication with irregular message sizes. In this work, we provide a performance evaluation of the Allgatherv routine on multi-GPU systems, focusing on GPU network topology and the communication library used. We present results from the OSU-micro benchmark as well as conduct a case study for sparse tensor factorization, one application that uses Allgatherv with highly irregular message sizes. We extend our existing tensor factorization tool to run on systems with different node counts and varying number of GPUs per node. We then evaluate the communication performance of our tool when using traditional MPI, CUDA-aware MVAPICH and NCCL across a suite of real-world data sets on three different systems: a 16-node cluster with one GPU per node, NVIDIA's DGX-1 with 8 GPUs and Cray's CS-Storm with 16 GPUs. Our results show that irregularity in the tensor data sets produce trends that contradict those in the OSU micro-benchmark, as well as trends that are absent from the benchmark.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"213 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134572638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
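The collective under study gathers a different-sized contribution from every rank into every rank's receive buffer. A minimal rendering of an irregular Allgatherv, launched under `mpiexec`, looks like this; the paper benchmarks C-level MPI, CUDA-aware MVAPICH, and NCCL, and mpi4py is used here only to keep the sketch short.

```python
# Minimal irregular Allgatherv with mpi4py: each rank contributes a
# different number of doubles, as in the sparse-tensor workloads studied.
# Run with e.g.: mpiexec -n 4 python allgatherv_demo.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Irregular message sizes: rank r sends r+1 elements.
counts = np.array([r + 1 for r in range(size)])
displs = np.concatenate(([0], np.cumsum(counts)[:-1]))

sendbuf = np.full(counts[rank], rank, dtype=np.float64)
recvbuf = np.empty(counts.sum(), dtype=np.float64)

# Every rank receives every other rank's variable-sized contribution.
comm.Allgatherv(sendbuf, [recvbuf, counts, displs, MPI.DOUBLE])

if rank == 0:
    print(recvbuf)   # [0., 1., 1., 2., 2., 2., ...]
```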
RideMatcher: Peer-to-Peer Matching of Passengers for Efficient Ridesharing
N. V. Bozdog, M. Makkes, A. V. Halteren, H. Bal
The daily home-office commute of millions of people in crowded cities degrades air quality, lengthens travel times, and adds noise pollution. This is especially problematic in western cities, where cars and taxis carrying daily commuters have low occupancy. To reduce these issues, authorities often encourage commuters to share their rides, also known as carpooling or ridesharing. To increase ridesharing usage, it is essential that commuters are matched efficiently. In this paper we present RideMatcher, a novel peer-to-peer system for matching car rides based on their routes and travel times. Unlike other ridesharing systems, RideMatcher is completely decentralized, which makes it possible to deploy it on distributed infrastructures using fog and edge computing. Despite being decentralized, our system is able to efficiently match ridesharing users in near real-time. Our evaluation on a dataset of 34,837 real taxi trips from New York shows that RideMatcher is able to reduce the number of taxi trips by up to 65%, the distance traveled by taxi cabs by up to 64%, and the cost of the trips by up to 66%.
{"title":"RideMatcher: Peer-to-Peer Matching of Passengers for Efficient Ridesharing","authors":"N. V. Bozdog, M. Makkes, A. V. Halteren, H. Bal","doi":"10.1109/CCGRID.2018.00041","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00041","url":null,"abstract":"The daily home-office commute of millions of people in crowded cities puts a strain on air quality, traveling time and noise pollution. This is especially problematic in western cities, where cars and taxis have low occupancy with daily commuters. To reduce these issues, authorities often encourage commuters to share their rides, also known as carpooling or ridesharing. To increase the ridesharing usage it is essential that commuters are efficiently matched. In this paper we present RideMatcher, a novel peer-to-peer system for matching car rides based on their routes and travel times. Unlike other ridesharing systems, RideMatcher is completely decentralized, which makes it possible to deploy it on distributed infrastructures, using fog and edge computing. Despite being decentralized, our system is able to efficiently match ridesharing users in near real-time. Our evaluations performed on a dataset with 34,837 real taxi trips from New York show that RideMatcher is able to reduce the number of taxi trips by up to 65%, the distance traveled by taxi cabs by up to 64%, and the cost of the trips by up to 66%.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114329401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
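A toy version of the matching predicate, treating two requests as compatible when their departure times are close and their routes share a large enough stretch, conveys the idea. Geometry is flattened to one-dimensional corridors for illustration; in RideMatcher this kind of predicate would be evaluated in a decentralized fashion between peers.

```python
# Toy ride-matching sketch: two requests match when departure times are
# within a tolerance and the shared stretch of the trip is large enough.
# Routes are reduced to 1-D corridors purely for illustration.

def overlap_fraction(a_start, a_end, b_start, b_end):
    """Fraction of trip A covered by the stretch it shares with trip B."""
    shared = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    return shared / (a_end - a_start)

def compatible(ride_a, ride_b, max_wait_min=10, min_overlap=0.8):
    close_in_time = abs(ride_a["depart"] - ride_b["depart"]) <= max_wait_min
    enough_overlap = overlap_fraction(
        ride_a["start_km"], ride_a["end_km"],
        ride_b["start_km"], ride_b["end_km"]) >= min_overlap
    return close_in_time and enough_overlap

a = {"depart": 480, "start_km": 0.0, "end_km": 10.0}   # 8:00, 10 km trip
b = {"depart": 485, "start_km": 1.0, "end_km": 12.0}   # 8:05, overlapping
print(compatible(a, b))   # True: 9/10 of A's trip is shared with B
```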
Improving Energy Efficiency of Database Clusters Through Prefetching and Caching
Yi Zhou, Shubbhi Taneja, Mohammed I. Alghamdi, X. Qin
The goal of this study is to optimize the energy efficiency of database clusters through prefetching and caching strategies. We design a workload-skewness scheme to collectively manage a set of hot and cold nodes in a database cluster system. The prefetching mechanism fetches popular data tables to the hot nodes while keeping unpopular data on cold nodes. We leverage a power management module to aggressively switch cold nodes into a low-power mode to conserve energy. We construct a prefetching model and an energy-saving model to govern the power management module in database clusters. The energy-efficient prefetching and caching mechanism is conducive to cutting back the number of power-state transitions, thereby offering high energy efficiency. We systematically evaluate our energy-conservation technique in the process of managing, fetching, and storing data on clusters supporting database applications. Our experimental results show that our prefetching/caching solution significantly improves the energy efficiency of the existing PostgreSQL system.
{"title":"Improving Energy Efficiency of Database Clusters Through Prefetching and Caching","authors":"Yi Zhou, Shubbhi Taneja, Mohammed I. Alghamdi, X. Qin","doi":"10.1109/CCGRID.2018.00065","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00065","url":null,"abstract":"The goal of this study is to optimize energy efficiency of database clusters through prefetching and caching strategies. We design a workload-skewness scheme to collectively manage a set of hot and cold nodes in a database cluster system. The prefetching mechanism fetches popular data tables to the hot nodes while keeping unpopular data in cold nodes. We leverage a power management module to aggressively turn cold nodes in the low-power mode to conserve energy consumption. We construct a prefetching model and an energy-saving model to govern the power management module in database lusters. The energy-efficient prefetching and caching mechanism is conducive to cutting back the number of power-state transitions, thereby offering high energy efficiency. We systematically evaluate energy conservation technique in the process of managing, fetching, and storing data on clusters supporting database applications. Our experimental results show that our prefetching/caching solution significantly improves energy efficiency of the existing PostgreSQL system.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114874299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
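The workload-skewness placement reduces to ranking tables by popularity and pinning the top fraction to hot nodes, leaving the rest on nodes the power manager may switch to a low-power mode. The threshold and access counts below are illustrative, not from the paper.

```python
# Sketch of a workload-skewness placement rule: tables whose access
# frequency ranks in the top fraction are prefetched to hot nodes; the
# rest stay on cold nodes that the power manager may power down.
# The hot fraction and the access counts are illustrative.

def place_tables(access_counts, hot_fraction=0.2):
    """Return (hot, cold) table sets: top `hot_fraction` by popularity."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    n_hot = max(1, int(len(ranked) * hot_fraction))
    return set(ranked[:n_hot]), set(ranked[n_hot:])

counts = {"orders": 9500, "users": 7200, "logs": 800,
          "audit": 120, "archive": 15}
hot, cold = place_tables(counts)
print("prefetch to hot nodes:", hot)     # {'orders'} at a 20% hot fraction
print("cold nodes may power down:", cold)
```

Because the placement is recomputed only periodically, nodes change power state rarely, which matches the paper's stated aim of cutting back the number of power-state transitions.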
Modeling Operational Fairness of Hybrid Cloud Brokerage
Sreekrishnan Venkateswaran, S. Sarkar
Cloud service brokerage is an emerging technology that attempts to simplify the consumption and operation of hybrid clouds. Today's cloud brokers attempt to insulate consumers from the vagaries of multiple clouds. To achieve this insulation, the modern cloud broker needs to disguise itself as the end provider to consumers by creating and operating a virtual data center construct that we call a "meta-cloud", which is assembled on top of a set of participating supplier clouds. It is crucial for such a cloud broker to be considered a trusted partner both by cloud consumers and by the underpinning cloud suppliers. A fundamental tenet of brokerage trust is vendor neutrality. On the one hand, cloud consumers will be comfortable if a cloud broker guarantees that they will not be led through a preferred path. On the other hand, cloud suppliers would be more interested in partnering with a cloud broker who promises a fair apportioning of client provisioning requests. Because consumer and supplier trust in a meta-cloud broker stems from the assumption that it is agnostic to the supplier clouds, there is a need for a test strategy that verifies the fairness of cloud brokerage. In this paper, we propose a calculus of fairness that defines the rules to determine the operational behavior of a cloud broker. The calculus uses temporal logic to model the fact that fairness is a trait that has to be ascertained over time; it is not a characteristic that can be judged at a per-request fulfillment level. Using our temporal calculus of fairness as the basis, we propose an algorithm to determine the fairness of a broker probabilistically, based on its observed request-apportioning policies. Our model of cloud broker fairness also factors in inter-provider variables such as cost divergence and capacity variance. We empirically validate our approach by constructing a meta-cloud from AWS, Azure, and IBM, in addition to leveraging a cloud simulator. Our industrial engagements with large enterprises also validate the need for such cloud brokerage with verifiable fairness.
{"title":"Modeling Operational Fairness of Hybrid Cloud Brokerage","authors":"Sreekrishnan Venkateswaran, S. Sarkar","doi":"10.1109/CCGRID.2018.00083","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00083","url":null,"abstract":"Cloud service brokerage is an emerging technology that attempts to simplify the consumption and operation of hybrid clouds. Today's cloud brokers attempt to insulate consumers from the vagaries of multiple clouds. To achieve the insulation, the modern cloud broker needs to disguise itself as the end-provider to consumers by creating and operating a virtual data center construct that we call a \"meta-cloud\", which is assembled on top of a set of participating supplier clouds. It is crucial for such a cloud broker to be considered a trusted partner both by cloud consumers and by the underpinning cloud suppliers. A fundamental tenet of brokerage trust is vendor neutrality. On the one hand, cloud consumers will be comfortable if a cloud broker guarantees that they will not be led through a preferred path. And on the other hand, cloud suppliers would be more interested in partnering with a cloud broker who promises a fair apportioning of client provisioning requests. Because consumer and supplier trust on a meta-cloud broker stems from the assumption of being agnostic to supplier clouds, there is a need for a test strategy that verifies the fairness of cloud brokerage. In this paper, we propose a calculus of fairness that defines the rules to determine the operational behavior of a cloud broker. The calculus uses temporal logic to model the fact that fairness is a trait that has to be ascertained over time; it is not a characteristic that can be judged at a per-request fulfillment level. Using our temporal calculus of fairness as the basis, we propose an algorithm to determine the fairness of a broker probabilistically, based on its observed request apportioning policies. Our model for the fairness of cloud broker behavior also factors in inter-provider variables such as cost divergence and capacity variance. We empirically validate our approach by constructing a meta-cloud from AWS, Azure and IBM, in addition to leveraging a cloud simulator. Our industrial engagements with large enterprises also validate the need for such cloud brokerage with verifiable fairness.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129260733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
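The paper's point that fairness is a temporal property, judged over a horizon of apportioning decisions rather than per request, can be made concrete with a windowed check of each supplier's observed share. This is a simplified stand-in for the paper's temporal calculus; weighting by cost divergence and capacity variance is omitted.

```python
# Sketch of a windowed fairness check: a broker passes if, over every
# sliding window of apportioning decisions, each supplier's observed
# share stays within `tol` of the equal share. Traces shorter than one
# window pass trivially. Cost/capacity weighting is omitted here.
from collections import Counter, deque

def fair_over_window(decisions, suppliers, window=100, tol=0.10):
    fair_share = 1.0 / len(suppliers)
    recent = deque(maxlen=window)
    for d in decisions:
        recent.append(d)
        if len(recent) == window:
            shares = Counter(recent)
            for s in suppliers:
                if abs(shares[s] / window - fair_share) > tol:
                    return False   # sustained deviation: judged unfair
    return True

# A round-robin broker over three clouds passes the check.
suppliers = ["aws", "azure", "ibm"]
trace = [suppliers[i % 3] for i in range(300)]
print(fair_over_window(trace, suppliers))   # True
```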
Achieving Performance Balance Among Spark Frameworks with Two-Level Schedulers
Aleksandra Kuzmanovska, H. V. D. Bogert, R. H. Mak, D. Epema
When multiple data-processing frameworks with time-varying workloads are simultaneously present in a single cluster or data center, an apparent goal is to have them experience equal performance, expressed in whatever performance metrics are applicable. In modern data-center environments, Two-Level Schedulers (TLSs) that leave the scheduling of individual jobs to the schedulers within the data-processing frameworks are typically used for managing the resources of data-processing frameworks. Two such TLSs with opposite designs are Mesos and Koala-F. Mesos employs fine-grained resource allocation and aims at Dominant Resource Fairness (DRF) among framework instances by offering resources to them for the duration of a single task. In contrast, Koala-F aims at performance fairness among framework instances by employing dynamic coarse-grained allocation of sets of complete nodes based on performance feedback from individual instances. The goal of this paper is to explore the trade-offs between these two TLS designs when trying to achieve performance balance among frameworks. We select Apache Spark as a representative data-processing framework, and perform experiments on a modest-sized cluster using jobs chosen from commonly used data-processing benchmarks. Our results reveal that achieving performance balance among framework instances is a challenge for both TLS designs, despite their opposite design choices. Moreover, we exhibit design flaws in the DRF allocation policy that prevent Mesos from achieving performance balance. Finally, to remedy these flaws, we propose a feedback controller for Mesos that dynamically adapts framework weights, as used in Weighted DRF (W-DRF), based on their performance.
{"title":"Achieving Performance Balance Among Spark Frameworks with Two-Level Schedulers","authors":"Aleksandra Kuzmanovska, H. V. D. Bogert, R. H. Mak, D. Epema","doi":"10.1109/CCGRID.2018.00028","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00028","url":null,"abstract":"When multiple data-processing frameworks with time-varying workloads are simultaneously present in a single cluster or data-center, an apparent goal is to have them experience equal performance, expressed in whatever performance metrics are applicable. In modern data-center environments, Two-Level Schedulers (TLSs) that leave the scheduling of individual jobs to the schedulers within the data-processing frameworks are typically used for managing the resources of data-processing frameworks. Two such TLSs with opposite designs are Mesos and Koala-F. Mesos employs fine-grained resource allocation and aims at Dominant Resource Fairness (DRF) among framework instances by offering resources to them for the duration of a single task. In contrast, Koala-F aims at performance fairness among framework instances by employing dynamic coarse-grained resource allocation of sets of complete nodes based on performance feedback from individual instances. The goal of this paper is to explore the trade-offs between these two TLS designs when trying to achieve performance balance among frameworks. We select Apache Spark as a representative of data-processing frameworks, and perform experiments on a modest-sized cluster, using jobs chosen from commonly used data-processing benchmarks. Our results reveal that achieving performance balance among framework instances is a challenge for both TLS designs, despite their opposite design choices. Moreover, we exhibit design flaws in the DRF allocation policy that prevent Mesos from achieving performance balance. Finally, to remedy these flaws, we propose a feedback controller for Mesos that dynamically adapts framework weights, as used in Weighted DRF (W-DRF), based on their performance.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116725946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
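Dominant Resource Fairness, the Mesos-side policy in this comparison, repeatedly offers one task's worth of resources to the framework whose dominant share is currently smallest. The sketch below is a textbook rendering of the original DRF formulation, not the paper's code; capacities and demands are illustrative.

```python
# Textbook Dominant Resource Fairness (DRF) allocator: repeatedly grant
# one task's worth of resources to the framework whose dominant share
# (largest fraction of any one resource it holds) is currently smallest.

def drf_allocate(capacity, demands, rounds=20):
    shares = {f: 0.0 for f in demands}      # dominant share per framework
    used = {r: 0.0 for r in capacity}
    tasks = {f: 0 for f in demands}
    for _ in range(rounds):
        f = min(shares, key=shares.get)     # most underserved framework
        need = demands[f]
        if any(used[r] + need[r] > capacity[r] for r in capacity):
            break                           # cluster saturated
        for r in capacity:
            used[r] += need[r]
        tasks[f] += 1
        shares[f] = max(need[r] * tasks[f] / capacity[r] for r in capacity)
    return tasks

capacity = {"cpu": 9, "mem": 18}
demands = {"A": {"cpu": 1, "mem": 4},       # memory-dominant framework
           "B": {"cpu": 3, "mem": 1}}       # CPU-dominant framework
print(drf_allocate(capacity, demands))      # {'A': 3, 'B': 2}
```

With these demands the allocator reproduces the classic DRF outcome, three tasks for the memory-dominant framework and two for the CPU-dominant one, equalizing their dominant shares at 2/3.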
Evaluation of Highly Available Cloud Streaming Systems for Performance and Price
Dung Nguyen, André Luckow, Edward B. Duffy, Ken E. Kennedy, A. Apon
This paper presents a systematic evaluation of Amazon Kinesis and Apache Kafka for meeting highly demanding application requirements. Results show that Kinesis and Kafka can provide high reliability, performance and scalability. Cost and performance trade-offs of Kinesis and Kafka are presented for a variety of application data rates, resource utilization, and resource configurations.
{"title":"Evaluation of Highly Available Cloud Streaming Systems for Performance and Price","authors":"Dung Nguyen, André Luckow, Edward B. Duffy, Ken E. Kennedy, A. Apon","doi":"10.1109/CCGRID.2018.00056","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00056","url":null,"abstract":"This paper presents a systematic evaluation of Amazon Kinesis and Apache Kafka for meeting highly demanding application requirements. Results show that Kinesis and Kafka can provide high reliability, performance and scalability. Cost and performance trade-offs of Kinesis and Kafka are presented for a variety of application data rates, resource utilization, and resource configurations.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121050171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
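A minimal producer-side probe of the kind such evaluations rely on, pushing fixed-size records for a fixed interval and reporting throughput, looks like this with the kafka-python client. The broker address, topic, and record size are placeholders, and the paper does not specify its client stack.

```python
# Minimal producer-throughput probe for a streaming-system evaluation:
# push fixed-size records for a while and report records per second.
# Broker address and topic are placeholders; requires kafka-python.
import time
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

def measure_throughput(topic="bench", record_bytes=1024, seconds=10):
    payload = b"x" * record_bytes
    sent, deadline = 0, time.time() + seconds
    while time.time() < deadline:
        producer.send(topic, payload)   # asynchronous send
        sent += 1
    producer.flush()                    # wait for in-flight records
    return sent / seconds

print(f"{measure_throughput():,.0f} records/s")
```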
SHAD: The Scalable High-Performance Algorithms and Data-Structures Library
Vito Giovanni Castellana, Marco Minutoli
The unprecedented amount of data that needs to be processed in emerging data analytics applications poses novel challenges to industry and academia. Scalability and high performance become more than desirable features because, due to the scale and the nature of the problems, they draw the line between what is achievable and what is infeasible. In this paper, we propose SHAD, the Scalable High-performance Algorithms and Data-structures library. SHAD adopts a modular design that confines low-level details and promotes reuse. SHAD's core is built on an Abstract Runtime Interface which enhances portability and identifies the minimal set of features of the underlying system required by the framework. The core library includes common data-structures such as Array, Vector, Map, and Set. These are designed to accommodate significant amounts of data that can be accessed in massively parallel environments, and to serve as building blocks for SHAD extensions, i.e., higher-level software libraries. We have validated and evaluated our design with a performance and scalability study of the core components of the library. We have validated the design flexibility by proposing a Graph Library as an example of a SHAD extension, which implements two different graph data-structures; we evaluate their performance with a set of graph applications. Experimental results show that the approach is promising in terms of both performance and scalability. On a distributed system with 320 cores, SHAD Arrays are able to sustain a throughput of 65 billion operations per second, while SHAD Maps sustain 1 billion operations per second. Algorithms implemented using the Graph Library exhibit performance and scalability comparable to a custom solution, but with less development effort.
{"title":"SHAD: The Scalable High-Performance Algorithms and Data-Structures Library","authors":"Vito Giovanni Castellana, Marco Minutoli","doi":"10.1109/CCGRID.2018.00071","DOIUrl":"https://doi.org/10.1109/CCGRID.2018.00071","url":null,"abstract":"The unprecedented amount of data that needs to be processed in emerging data analytics applications poses novel challenges to industry and academia. Scalability and high performance become more than a desirable feature because, due to the scale and the nature of the problems, they draw the line between what is achievable and what is unfeasible. In this paper, we propose SHAD, the Scalable High-performance Algorithms and Data-structures library. SHAD adopts a modular design that confines low level details and promotes reuse. SHAD's core is built on an Abstract Runtime Interface which enhances portability and identifies the minimal set of features of the underlying system required by the framework. The core library includes common data-structures such as: Array, Vector, Map and Set. These are designed to accommodate significant amount of data which can be accessed in massively parallel environments, and used as building blocks for SHAD extensions, i.e. higher level software libraries. We have validated and evaluated our design with a performance and scalability study of the core components of the library. We have validated the design flexibility by proposing a Graph Library as an example of SHAD extension, which implements two different graph data-structures; we evaluate their performance with a set of graph applications. Experimental results show that the approach is promising in terms of both performance and scalability. On a distributed system with 320 cores, SHAD Arrays are able to sustain a throughput of 65 billion operations per second, while SHAD Maps sustain 1 billion of operations per second. Algorithms implemented using the Graph Library exhibit performance and scalability comparable to a custom solution, but with smaller development effort.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124036661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
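SHAD's core abstraction, containers whose index space is partitioned across localities behind an abstract runtime interface, can be caricatured in plain Python. The in-process "localities" below merely stand in for SHAD's C++ runtime backends; this is an illustration of the idea, not SHAD's actual API.

```python
# Caricature of SHAD's partitioned-array idea: a global index space split
# across localities, with every element access routed to the owning
# partition. In SHAD the routing goes through its Abstract Runtime
# Interface; here plain in-process lists stand in for localities.

class PartitionedArray:
    def __init__(self, size, n_localities):
        self.size = size
        self.chunk = -(-size // n_localities)          # ceiling division
        # Each "locality" owns one contiguous chunk of the index space.
        self.parts = [[0] * min(self.chunk, size - i * self.chunk)
                      for i in range(n_localities)]

    def _locate(self, i):
        return self.parts[i // self.chunk], i % self.chunk

    def __setitem__(self, i, v):
        part, off = self._locate(i)                    # route to owner
        part[off] = v

    def __getitem__(self, i):
        part, off = self._locate(i)
        return part[off]

a = PartitionedArray(10, n_localities=3)   # chunks of 4, 4, and 2 elements
a[7] = 99                                  # lands in the second locality
print(a[7], len(a.parts))                  # 99 3
```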