Ashish Pandey, P. Calyam, S. Debroy, Songjie Wang, Mauro Lemus Alarcon
The unprecedented growth in edge resources (e.g., scientific instruments, edge servers, sensors) and related data sources has caused a data deluge in scientific application communities. Data processing increasingly relies on machine learning algorithms to cope with the heterogeneity, scale, and velocity of the data. At the same time, there is an abundance of low-cost computation resources that can be used for edge-cloud collaborative computing, viz. "volunteer edge-cloud (VEC) computing". However, a lack of trust in edge resources in terms of performance, agility, cost, and security (PACS) factors is proving to be a barrier to wider adoption of VEC. In this paper, we propose a novel "VECTrust" model to support trusted resource allocation algorithms in VEC computing environments for scientific data-intensive workflows. VECTrust features a two-stage probabilistic model that defines the trust of VEC computing cluster resources by considering trustworthiness in metrics relevant to the PACS factors. We evaluate VECTrust's ability to provide dynamic resource allocation based on PACS factors while enhancing edge-cloud trust in a VEC computing testbed. Further, we show that VECTrust creates a uniform and robust probability distribution of salient PACS-related metrics across diverse bioinformatics workflow executions over batches of workflows.
{"title":"VECTrust","authors":"Ashish Pandey, P. Calyam, S. Debroy, Songjie Wang, Mauro Lemus Alarcon","doi":"10.1145/3468737.3494099","DOIUrl":"https://doi.org/10.1145/3468737.3494099","url":null,"abstract":"The unprecedented growth in edge resources (e.g., scientific instruments, edge servers, sensors) and related data sources has caused a data deluge in scientific application communities. The data processing is increasingly relying on algorithms that utilize machine learning to cope with the heterogeneity, scale, and velocity of the data. At the same time, there is an abundance of low-cost computation resources that can be used for edge-cloud collaborative computing viz., \"volunteer edge-cloud (VEC) computing\". However, lack of trust in terms of performance, agility, cost, and security (PACS) factors in edge resources is proving to be a barrier for wider adoption of VEC. In this paper, we propose a novel \"VECTrust\" model for support of trusted resource allocation algorithms in VEC computing environments for scientific data-intensive workflows. Our VECTrust features a two-stage probabilistic model that defines trust of VEC computing cluster resources by considering trustworthiness in metrics relevant to PACS factors. We evaluate our VECTrust model's ability to provide dynamic resource allocation based on PACS factors, while also enhancing edge-cloud trust in a VEC computing testbed. Further, we show that VECTrust is able to create a uniform and robust probability distribution of salient PACS factor related metrics within diverse bioinformatics workflows execution over batches of workflows.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125338145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The increasing use of cloud computing for parallel workloads involves, among many problems, resource wastage. When an application does not fully utilize the provisioned resources, the end-of-the-month bill is unnecessarily inflated, mainly due to the user's inexperience and naïve behavior. Many studies have attempted to solve this problem by searching for the optimal VM flavor for specific applications with specific inputs. However, most of these solutions require knowledge about the application or require executing the application on multiple VM flavors. In this work, we propose four new heuristics that recommend cost-effective VMs for parallel workloads based solely on the vCPU-utilization rate of the currently executing VM flavor. We evaluate them in two scenarios and show that the core-heuristic is capable of recommending VM flavors that have minimal impact on performance and reduce application cost, on average, by 1.5x (3.0x) in high (low) vCPU-utilization scenarios.
{"title":"Leveraging vCPU-utilization rates to select cost-efficient VMs for parallel workloads","authors":"William F. C. Tavares, M. M. Assis, E. Borin","doi":"10.1145/3468737.3494095","DOIUrl":"https://doi.org/10.1145/3468737.3494095","url":null,"abstract":"The increasing use of cloud computing for parallel workloads involves, among many problems, resources wastage. When the application does not fully utilize the provisioned resource, the end-of-the-month bill is unnecessarily increased. This is mainly caused by the user's inexperience and naïve behavior. Many studies have attempted to solve this problem by searching for the optimal VM flavor for specific applications with specific inputs. However, most of these solutions require knowledge about the application or require the application's execution on multiple VM flavors. In this work, we propose four new heuristics that recommend cost-effective VMs for parallel workloads based solely on the vCPU-utilization rate of the currently executing VM flavor. We also evaluate them on two scenarios and show that the core-heuristic is capable of recommending VM flavors that have minimal impact on performance and reduce the applications cost, on average, by 1.5x (3.0x) on high (low) vCPU-utilization rate scenarios.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116982927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lukas Harzenetter, Uwe Breitenbücher, Ghareeb Falazi, F. Leymann, Adrian Wersching
In recent years, many deployment automation technologies have been developed to automatically deploy cloud applications. Most of these technologies employ declarative deployment models that describe the deployment of a cloud application by modeling its components, their configurations, and the relations between them. However, while modeling the deployment of cloud applications declaratively is intuitive, declarative deployment models quickly become complex, as they often contain detailed information about the application's components and their configurations. As a result, considerable technical expertise is typically required to understand the semantics of a declarative deployment model, i.e., what gets deployed and how the components behave. In this paper, we present an approach that automatically detects design patterns in declarative deployment models. This eases understanding the semantics of deployment models, as only the abstract, high-level semantics of the detected patterns must be known instead of technical details about components, relations, and configurations. We demonstrate an open-source implementation based on the Topology and Orchestration Specification for Cloud Applications (TOSCA) and the graphical open-source modeling tool Winery. In addition, we present a detailed case study showing how our approach can be applied in practice using the presented prototype.
{"title":"Automated detection of design patterns in declarative deployment models","authors":"Lukas Harzenetter, Uwe Breitenbücher, Ghareeb Falazi, F. Leymann, Adrian Wersching","doi":"10.1145/3468737.3494085","DOIUrl":"https://doi.org/10.1145/3468737.3494085","url":null,"abstract":"In recent years, many different deployment automation technologies have been developed to automatically deploy cloud applications. Most of these technologies employ declarative deployment models to describe the deployment of a cloud application by modeling its components, their configurations as well as the relations between them. However, while modeling the deployment of cloud applications declaratively is intuitive, declarative deployment models quickly become complex as they often contain detailed information about the application's components and their configurations. As a result, immense technical expertise is typically required to understand the semantics of a declarative deployment model, i. e., what gets deployed and how the components behave. In this paper, we present an approach that automatically detects design patterns in declarative deployment models. This eases understanding the semantics of deployment models as only the abstract and high-level semantics of the detected patterns must be known instead of technical details about components, relations, and configurations. We demonstrate an open-source implementation based on the Topology and Orchestration Specification for Cloud Applications (TOSCA) and the graphical open-source modeling tool Winery. In addition, we present a detailed case study showing how our approach can be applied in practice using the presented prototype.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133482362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Function-as-a-Service (FaaS) is an emerging model based on serverless cloud computing technology. It builds on the microservice architecture, where developers implement specific functionality and deploy it to a cloud provider, where it executes independently in its own containerised environment. In this paper, we present a software composition approach that orchestrates FaaS offerings from various cloud providers to fulfil the requirements of an application. Our solution integrates a hierarchical planner and a constraint satisfaction solver. Specifically, we discuss the planning method, the constraint satisfaction solver, and the coordination of the selected functions during execution. We also present an experiment in which our approach is tested using functions in the cloud.
{"title":"Multi-cloud serverless function composition","authors":"J. Quenum, Jonas Josua","doi":"10.1145/3468737.3494090","DOIUrl":"https://doi.org/10.1145/3468737.3494090","url":null,"abstract":"Function-as-a-service (FaaS) is an emerging model based on serverless cloud computing technology. It builds on the microservice architecture, where developers implement specific functionality, deploy it to a cloud provider to be executed independently in its own containerised environment. In this paper, we present a software composition approach that orchestrates FaaS from various cloud providers to fulfil the requirements of an application. Our solution integrates a hierarchical planner and a constraint satisfaction solver. Specifically, we discuss the planning method, constraint satisfaction solver, and the coordination of selected functions during the execution. We also present an experiment where our approach is tested using functions in the cloud.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117304578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data races are notorious concurrency bugs that can cause severe problems, including random crashes and corrupted execution results. However, existing data race detection tools remain difficult to use: it takes significant effort to install, configure, and properly operate a tool, and a single tool often cannot find all the bugs in a program. Requiring users to run multiple tools is often impractical and unproductive because of differences in tool interfaces and report formats. In this paper, we present a cloud-based, service-oriented design and implementation of a race detection service (RDS) to detect data races in parallel programs. RDS integrates multiple data race detection tools into a single cloud-based service via a REST API. It defines a standard JSON format to represent data race detection results, which facilitates producing user-friendly reports, aggregating the output of multiple tools, and processing results with other tools. RDS also defines a set of policies for aggregating the outputs of multiple tools. RDS significantly simplifies the workflow of using data race detection tools and improves report quality and the productivity of performing race detection on parallel programs. Our evaluation shows that RDS delivers more accurate results with much less user effort than the traditional way of using the individual tools. Using four selected tools and DataRaceBench, RDS improves the adjusted F-1 score by 8.8% and 12.6% over the best tool and the average of the tools, respectively. For the NAS Parallel Benchmarks, RDS improves the adjusted accuracy by 35% compared to the average of the tools. Our work demonstrates a new approach to composing software tools for parallel computing via a service-oriented architecture; the same approach and framework can be used to create metaservices for compilers, performance tools, auto-tuning tools, and so on.
{"title":"RDS","authors":"Yaying Shi, Anjia Wang, Yonghong Yan, C. Liao","doi":"10.1145/3468737.3494089","DOIUrl":"https://doi.org/10.1145/3468737.3494089","url":null,"abstract":"Data races are notorious concurrency bugs which can cause severe problems, including random crashes and corrupted execution results. However, existing data race detection tools are still challenging for users to use. It takes a significant amount of effort for users to install, configure and properly use a tool. A single tool often cannot find all the bugs in a program. Requiring users to use multiple tools is often impracticable and not productive because of the differences in tool interfaces and report formats. In this paper, we present a cloud-based, service-oriented design and implementation of a race detection service (RDS)1 to detect data races in parallel programs. RDS integrates multiple data race detection tools into a single cloud-based service via a REST API. It defines a standard JSON format to represent data race detection results, facilitating producing user-friendly reports, aggregating output of multiple tools, as well as being easily processed by other tools. RDS also defines a set of policies for aggregating outputs from multiple tools. RDS significantly simplifies the workflow of using data race detection tools and improves the report quality and productivity of performing race detection for parallel programs. Our evaluation shows that RDS can deliver more accurate results with much less effort from users, when compared with the traditional way of using any individual tools. Using four selected tools and DataRaceBench, RDS improves the Adjusted F-1 scores by 8.8% and 12.6% over the best and the average scores, respectively. For the NAS Parallel Benchmark, RDS improves 35% of the adjusted accuracy compared to the average of the tools. Our work studies a new approach of composing software tools for parallel computing via a service-oriented architecture. The same approach and framework can be used to create metaservice for compilers, performance tools, auto-tuning tools, and so on.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115840790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fedor Smirnov, Chris Engelhardt, Jakob Mittelberger, Behnaz Pourmohseni, T. Fahringer
This paper provides a first presentation of Apollo, an orchestration framework for serverless function compositions distributed across the cloud-edge continuum. Apollo has a modular design that enables a fine-grained decomposition of the runtime orchestration (scheduling, data transmission, etc.) of applications, so that each of the numerous orchestration decisions can be optimized separately, fully exploiting the potential for the optimization of performance and costs. Apollo features (a) a flexible model of the application and the available resources and (b) a decentralized orchestration scheme carried out by independent agents. This flexible structure enables distributing not only the processing but also the orchestration process itself across a large number of resources, each running an independent Apollo instance. In combination with the ability to execute parts of the application directly on the host of each Apollo instance, this unleashes a significant potential for cost and performance optimization by leveraging data locality. Apollo's efficiency and its potential for application performance improvement are demonstrated in a series of experiments---for both synthetic and real function compositions---where Apollo's capability for flexible distribution of tasks between local containers and serverless functions enables a significant application speedup (up to 20X).
{"title":"Apollo: towards an efficient distributed orchestration of serverless function compositions in the cloud-edge continuum","authors":"Fedor Smirnov, Chris Engelhardt, Jakob Mittelberger, Behnaz Pourmohseni, T. Fahringer","doi":"10.1145/3468737.3494103","DOIUrl":"https://doi.org/10.1145/3468737.3494103","url":null,"abstract":"This paper provides a first presentation of Apollo, an orchestration framework for serverless function compositions distributed across the cloud-edge continuum. Apollo has a modular design that enables a fine-grained decomposition of the runtime orchestration (scheduling, data transmission, etc.) of applications, so that each of the numerous orchestration decisions can be optimized separately, fully exploiting the potential for the optimization of performance and costs. Apollo features (a) a flexible model of the application and the available resources and (b) a decentralized orchestration scheme carried out by independent agents. This flexible structure enables distributing not only the processing but also the orchestration process itself across a large number of resources, each running an independent Apollo instance. In combination with the ability to execute parts of the application directly on the host of each Apollo instance, this unleashes a significant potential for cost and performance optimization by leveraging data locality. Apollo's efficiency and its potential for application performance improvement are demonstrated in a series of experiments---for both synthetic and real function compositions---where Apollo's capability for flexible distribution of tasks between local containers and serverless functions enables a significant application speedup (up to 20X).","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125499029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. H. Mortazavi, Hossein Shafieirad, M. Bahnasy, A. Munir, Yuanhui Cheng, Anudeep Das, Y. Ganjali
Resource optimization algorithms in the cloud are ever more data-driven, and decision-making has become reliant on more and more data flowing from different cloud components. Applications and the network control layer, on the other hand, mainly operate in isolation without direct communication. Recently, tighter integration between the network and applications has been advocated to benefit both, but the information exchange has mostly been limited to flow-level information. We argue that, in the realm of datacenter networks, sharing additional information such as function processing times and deployment data for planning jobs and tasks can yield major optimization benefits for the network. In this study, we present Accord, a network-application integration solution that achieves holistic network-application management. We propose a protocol as an API between the network and the application, and we build a system that uses processing and networking data from the application to perform network scheduling and routing optimizations. We demonstrate that, for a sample distributed learning application, an Accord-enhanced solution that uses application processing information can reduce Job Completion Time (JCT) by up to 27.8%. In addition, we show how Accord can improve routing decisions through a reinforcement learning algorithm that outperforms first shortest path first by 13%.
{"title":"Accord","authors":"S. H. Mortazavi, Hossein Shafieirad, M. Bahnasy, A. Munir, Yuanhui Cheng, Anudeep Das, Y. Ganjali","doi":"10.1145/3468737.3494102","DOIUrl":"https://doi.org/10.1145/3468737.3494102","url":null,"abstract":"Resource optimization algorithms in the cloud are ever more data-driven and decision-making has become reliant on more and more data flowing from different cloud components. Applications and the network control layer on the other hand mainly operate in isolation without direct communication. Recently, increased integration between the network and application has been advocated to benefit both the application and the network but the information exchange has mostly been limited to flow level information. We argue that in the realm of datacenter networks, sharing additional information such as the function processing times and deployment data for planning jobs and tasks can result in major optimization benefits for the network. In this study we present Accord as a Network Application Integration solution to achieve a holistic network-application management solution. We propose a protocol as an API between the network and application then we build a system that uses the processing and networking data from the application to perform network scheduling and routing optimizations. We demonstrate that for a sample distributed learning application, an Accord enhanced solution that uses the application processing information can yield up to 27.8% reduction in Job Completion Time (JCT). In addition, we show how Accord can yield better results for routing decisions through a reinforcement learning algorithm that outperforms first shortest path first by %13.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115184543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Containers are widely used to process big data in clouds. To prevent information leakage from containers, applications can protect sensitive information using enclaves provided by Intel SGX. The memory of an enclave is encrypted by the CPU using its internal keys. However, the execution of SGX applications cannot continue after the container running them is migrated, because enclave memory cannot be correctly decrypted at the destination host. This paper proposes MigSGX, which enables the continuous execution of SGX applications after container migration. Since the state of an enclave cannot be directly accessed from the outside, MigSGX securely invokes each enclave and makes it dump and load its own state. At dump time, each enclave re-encrypts its state using a CPU-independent key to protect sensitive information. For space and time efficiency, MigSGX saves and restores the large amount of enclave memory in a pipelined manner. We have implemented MigSGX in the Intel SGX SDK and CRIU and show that pipelining improves migration performance by up to 52% while reducing the memory necessary for migration to only 0.15%.
{"title":"MigSGX","authors":"K. Nakashima, Kenichi Kourai","doi":"10.1145/3468737.3494088","DOIUrl":"https://doi.org/10.1145/3468737.3494088","url":null,"abstract":"Recently, containers are widely used to process big data in clouds. To prevent information leakage from containers, applications in containers can protect sensitive information using enclaves provided by Intel SGX. The memory of enclaves is encrypted by a CPU using its internal keys. However, the execution of SGX applications cannot be continued after the container running those applications is migrated. This is because enclave memory cannot be correctly decrypted at the destination host. This paper proposes MigSGX for enabling the continuous execution of SGX applications after container migration. Since the states of enclaves cannot be directly accessed from the outside, MigSGX securely invokes each enclave and makes it dump and load its state. Atthe dump time, each enclave re-encrypts its state using a CPU-independent key to protect sensitive information. For space- and time-efficiency, MigSGX saves and restores a large amount of enclave memory in a pipelined manner. We have implemented MigSGX in the Intel SGX SDK and CRIU and showed that pipelining could improve migration performance by up to 52%. The memory necessary for migration was reduced only to 0.15%.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126084181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scalable stream processing systems (SPSs) often require external storage systems for long-term storage of non-ephemeral state. Such state cannot be accommodated in the internal stores of SPSs, which are mainly geared toward fault tolerance of streaming jobs, lack externally visible APIs, and dispose of their state at the end of such jobs. Recent research has pointed to scalable in-memory key-value stores (KVSs) as an efficient solution for managing external state. While such data stores have been interconnected with scalable streaming systems, they are currently managed independently, missing opportunities for optimization, such as exploiting locality between stream partitions and table shards and coordinating elasticity actions. Both processing and data management systems are typically designed for scalability; however, coordination between them poses a significant challenge. In this work, we describe Amoeba, a system that dynamically adapts data-partitioning schemes and/or task or data placement across systems to eliminate unnecessary network communication between nodes. Our evaluation using state-of-the-art systems, such as the Flink SPS and the Redis KVS, demonstrates a 2.6x performance improvement when aligning SPS tasks with KVS shards in AWS deployments of up to 64 nodes.
{"title":"Amoeba: aligning stream processing operators with externally-managed state","authors":"Antonis Papaioannou, K. Magoutis","doi":"10.1145/3468737.3494096","DOIUrl":"https://doi.org/10.1145/3468737.3494096","url":null,"abstract":"Scalable stream processing systems (SPS) often require external storage systems for long-term storage of non-emphemeral state. Such state cannot be accommodated in the internal stores of SPSes that are mainly geared for fault tolerance of streaming jobs, lack externally visible APIs, and their state is disposed of at the end of such jobs. Recent research have pointed to scalable in-memory key-value stores (KVS) as an efficient solution to manage external state. While such data stores have been interconnected with scalable streaming systems, they are currently managed independently, missing opportunities for optimizations, such as exploiting locality between stream partitions and table shards, as well as coordinating elasticity actions. Both processing and data management systems are typically designed for scalability, however coordination between them poses a significant challenge. In this work we describe Amoeba, a system that dynamically adapts data-partitioning schemes and/or task or data placement across systems to eliminate unnecessary network communication across nodes. Our evaluation using state-of-the art systems, such as the Flink SPS and Redis KVS, demonstrated 2.6x performance improvement when aligning SPS tasks with KVS shards in AWS deployments of up to 64 nodes.","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125172884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anshul Jindal, Julian Frielinghaus, Mohak Chadha, M. Gerndt
{"title":"Courier","authors":"Anshul Jindal, Julian Frielinghaus, Mohak Chadha, M. Gerndt","doi":"10.1163/2352-0272_emho_sim_022957","DOIUrl":"https://doi.org/10.1163/2352-0272_emho_sim_022957","url":null,"abstract":"","PeriodicalId":254382,"journal":{"name":"Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129296276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}