
Latest Publications in IEEE Cloud Computing

Performance Profiling of Load Balancing Algorithms in a Cloud Architecture
Q1 Computer Science Pub Date: 2021-10-01 DOI: 10.1109/IEEECloudSummit52029.2021.00020
Samirah Salifu, Nathan Turlington, Michael Galloway
Load balancing is the process of distributing job requests among servers. There are many load balancing algorithms, such as round robin, weighted round robin, and least connections, that can be used to process bioinformatics workloads. Cloud computing can be combined with bioinformatics to create a BioCloud program. A dynamic load balancing algorithm needs to be developed to properly distribute bioinformatics jobs. In this experiment, the FastQC job was used for all test cases. Four algorithms were designed to distribute this single job type: system load, %CPU, free RAM, and round robin. The results show that the FastQC job is CPU intensive but not RAM intensive. The most efficient algorithm was %CPU.
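The four dispatch policies named in the abstract can be sketched as one selection function. The node statistics, field names, and policy labels below are illustrative assumptions, not taken from the paper:

```python
import itertools

# Hypothetical per-node statistics; field names are illustrative only.
nodes = [
    {"name": "n1", "system_load": 1.8, "cpu_pct": 35.0, "free_ram_mb": 2048},
    {"name": "n2", "system_load": 0.4, "cpu_pct": 80.0, "free_ram_mb": 4096},
    {"name": "n3", "system_load": 0.9, "cpu_pct": 10.0, "free_ram_mb": 512},
]

_rr = itertools.cycle(range(len(nodes)))  # round-robin cursor

def pick(policy):
    """Choose a node for the next job under one of the four policies."""
    if policy == "round_robin":
        return nodes[next(_rr)]
    if policy == "system_load":   # lowest load average wins
        return min(nodes, key=lambda n: n["system_load"])
    if policy == "cpu_pct":       # lowest CPU utilization wins
        return min(nodes, key=lambda n: n["cpu_pct"])
    if policy == "free_ram":      # most free memory wins
        return max(nodes, key=lambda n: n["free_ram_mb"])
    raise ValueError(policy)
```

For a CPU-bound job like FastQC, the `cpu_pct` policy steers each job toward the node with the most spare CPU, which is consistent with it being the most efficient algorithm in the experiment.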
Pages: 77–82
Citations: 0
Sovereign Data Exchange in Cloud-Connected IoT using International Data Spaces
Q1 Computer Science Pub Date: 2021-10-01 DOI: 10.1109/IEEECloudSummit52029.2021.00010
Haydar Qarawlus, Malte Hellmeier, Johannes Pieperbeck, Ronja Quensel, Steffen Biehs, Marc Peschke
Data sovereignty is gaining importance as the frequency and sensitivity of data exchange between companies and nations increase. Existing approaches to sovereign data exchange in business ecosystems, like the International Data Spaces (IDS) initiative, neglect hardware resource restrictions. Therefore, we examine real-time sovereign data exchange on cloud-connected Internet of Things (IoT) devices. Two lightweight communication schemes, based on request/response and publish/subscribe, are proposed and implemented following the IDS guidelines. For evaluation, we use a simulated test-bed based on an Automated Guided Vehicle (AGV) use case. We examine the results based on exchanged IDS messages and CPU usage on the consumer side, represented by the AGVs as IoT devices. The results show that publish/subscribe benefits longer operation times by allowing devices to enter a low-power mode, while request/response performs better with limited CPU resources or short operations.
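The two interaction patterns the paper compares can be contrasted with a toy in-process broker. Class and method names here are illustrative assumptions, not the IDS connector API:

```python
class Broker:
    """Toy broker contrasting publish/subscribe with request/response."""

    def __init__(self):
        self._subs = {}      # topic -> list of subscriber callbacks
        self._handlers = {}  # endpoint -> request handler

    # publish/subscribe: producers push; consumers do no polling work
    def subscribe(self, topic, callback):
        self._subs.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        for cb in self._subs.get(topic, []):
            cb(message)

    # request/response: the consumer pays CPU only when it actively asks
    def register(self, endpoint, handler):
        self._handlers[endpoint] = handler

    def request(self, endpoint, payload):
        return self._handlers[endpoint](payload)

broker = Broker()
received = []
broker.subscribe("agv/position", received.append)
broker.publish("agv/position", {"x": 1, "y": 2})

broker.register("agv/status", lambda p: {"ok": True, "id": p["id"]})
reply = broker.request("agv/status", {"id": 7})
```

The trade-off the paper measures falls out of the shapes above: a subscriber is passive between messages (so the device can sleep), while a requester incurs work per call but nothing in between.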
Pages: 13–18
Citations: 1
When Edge Computing Meets Compact Data Structures
Q1 Computer Science Pub Date: 2021-10-01 DOI: 10.1109/IEEECloudSummit52029.2021.00013
Zheng Li, Diego Seco, José Fuentes-Sepúlveda
Edge computing enables data processing and storage closer to where the data are created. Given the largely distributed compute environment and the significantly dispersed data distribution, there is increasing demand for data sharing and collaborative processing on the edge. Since data shuffling can dominate the overall execution time of collaborative processing jobs, and considering the limited power supply and bandwidth in edge environments, it is crucial and valuable to reduce the communication overhead across edge devices. Compared with data compression, compact data structures (CDS) seem more suitable in this case, because they allow data to be queried, navigated, and manipulated directly in compact form. However, the relevant work on applying CDS to edge computing generally focuses on the intuitive benefit of reduced data size, while few discussions of the challenges are given, not to mention empirical investigations into real-world edge use cases. This research highlights the challenges, opportunities, and potential scenarios of CDS implementation in edge computing. Driven by the use case of shuffling-intensive data analytics, we propose a three-layer architecture for CDS-aided data processing and particularly study the feasibility and efficiency of the CDS layer.
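A minimal instance of the CDS idea is a bit vector with rank support, which answers "how many 1-bits precede position i" directly on the compact form without decompressing. This sketch uses a single level of precomputed block ranks; production CDS libraries use more elaborate block/superblock layouts:

```python
class RankBitVector:
    """Bit vector answering rank1(i) = number of 1s in bits[0:i]
    in O(block) time using precomputed block boundary ranks."""
    BLOCK = 8

    def __init__(self, bits):
        self.bits = bits
        # cumulative count of 1s at each block boundary
        self.block_rank = [0]
        for i in range(0, len(bits), self.BLOCK):
            self.block_rank.append(
                self.block_rank[-1] + sum(bits[i:i + self.BLOCK]))

    def rank1(self, i):
        b, r = divmod(i, self.BLOCK)
        start = b * self.BLOCK
        return self.block_rank[b] + sum(self.bits[start:start + r])
```

Operations like this let shuffled data be queried in place on an edge device, instead of being decompressed first.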
Pages: 29–34
Citations: 1
Preproduction Deploys: Cloud-Native Integration Testing
Q1 Computer Science Pub Date: 2021-10-01 DOI: 10.1109/IEEECloudSummit52029.2021.00015
J. Carroll, Pankaj Anand, David Guo
The microservice architecture for cloud-based systems is extended to not only require each loosely coupled component to be independently deployable, but also to provide independent routing for each component. This supports canary deployments, green/blue deployments, and rollback. Both ad hoc and system integration test traffic can be directed to components before they are released to production traffic. Front-end code is included in this architecture by using server-side rendering of JS bundles. Environments for integration testing are created with preproduction deploys side by side with production deploys, using appropriate levels of isolation. After a successful integration test run, preproduction components are known to work with production precisely as it is. For isolation, test traffic uses staging databases that are copied daily from the production databases, omitting sensitive data. Safety and security concerns are dealt with in a targeted fashion, not monolithically.
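The independent routing of test traffic can be sketched as a header-based dispatch in front of each component. The header name, request shape, and version labels below are hypothetical, not from the paper:

```python
def route(request, deployments):
    """Send integration-test traffic to the preproduction version of a
    component (when one exists) and everything else to production."""
    headers = request.get("headers", {})
    if headers.get("x-test-traffic") == "1":
        # fall back to production if no preproduction deploy is live
        return deployments.get("preproduction", deployments["production"])
    return deployments["production"]

deployments = {"production": "cart-v1", "preproduction": "cart-v2"}
```

Because the same rule applies per component, a single test request can exercise a preproduction component while all its dependencies serve from production, which is exactly what makes the side-by-side environments composable.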
Pages: 41–48
Citations: 1
Managing Big Data Stream Pipelines Using Graphical Service Mesh Tools
Q1 Computer Science Pub Date: 2021-10-01 DOI: 10.1109/IEEECloudSummit52029.2021.00014
M. Faizan, C. Prehofer
Current big data frameworks like Apache Flink and Spark enable efficient processing of large-scale streaming data in a distributed setup. For the management of such data pipelines and the computing resources, we propose a combination of a graphical tool for pipeline management, Apache StreamPipes, and container management tools like Kubernetes. For evaluation, we implemented a use case with data preprocessing, vehicle power consumption, and driving behavior services in StreamPipes. We discuss the capabilities of StreamPipes in managing and executing complex stream processing pipelines and also evaluate the possible integration of container and service mesh tools (i.e., Istio) with StreamPipes. Furthermore, we implemented and evaluated a service management layer in our system design to provide extended features. In particular, we evaluated the delay when such a complex pipeline is restarted, e.g. for updates or reconfiguration.
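The shape of the evaluated use case, preprocessing followed by a power-consumption stage over a record stream, can be sketched as composable stages. The unit conversion and power formula are illustrative stand-ins, not the authors' StreamPipes services:

```python
def preprocess(record):
    """Drop malformed samples and normalize units (km/h -> m/s)."""
    if record.get("speed_kmh") is None:
        return None  # discard the record
    return {"speed_ms": record["speed_kmh"] / 3.6,
            "throttle": record.get("throttle", 0.0)}

def power_estimate(record):
    """Crude power proxy: speed times throttle (formula is illustrative)."""
    record["power_kw"] = round(record["speed_ms"] * record["throttle"], 2)
    return record

def run_pipeline(stages, stream):
    """Push each record through the stages; a None result drops the record."""
    out = []
    for rec in stream:
        for stage in stages:
            rec = stage(rec)
            if rec is None:
                break
        else:
            out.append(rec)
    return out
```

Tools like StreamPipes express this same stage graph visually, while Kubernetes and a service mesh handle where each stage runs and how stages find each other.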
Pages: 35–40
Citations: 0
Image Disguising for Protecting Data and Model Confidentiality in Outsourced Deep Learning
Q1 Computer Science Pub Date: 2021-09-01 DOI: 10.1109/CLOUD53861.2021.00020
Sagar Sharma, A. Alam, Keke Chen
Large training data and expensive model tweaking are common features of deep learning development for images. As a result, data owners often utilize cloud resources or machine learning service providers for developing large-scale complex models. This practice, however, raises serious privacy concerns. Existing solutions are either too expensive to be practical, or do not sufficiently protect the confidentiality of data and model. In this paper, we aim to achieve a better trade-off among the level of protection for outsourced DNN model training, the expenses, and the utility of data, using novel image disguising mechanisms. We design a suite of image disguising methods that are efficient to implement and then analyze them to understand multiple levels of tradeoffs between data utility and protection of confidentiality. The experimental evaluation shows the surprising ability of DNN modeling methods in discovering patterns in disguised images and the flexibility of these image disguising mechanisms in achieving different levels of resilience to attacks.
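One simple instance of an image-disguising mechanism is a keyed block permutation: the image is cut into fixed-size pixel blocks that are secretly shuffled under a seed known only to the data owner. This sketch illustrates the general idea only, not the paper's exact transforms:

```python
import random

def disguise(image, block, seed):
    """Permute fixed-size pixel blocks of an image (a list of rows of
    pixel values); the shuffle seed acts as the data owner's key."""
    h, w = len(image), len(image[0])
    blocks = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            blocks.append([row[c:c + block] for row in image[r:r + block]])
    order = list(range(len(blocks)))
    random.Random(seed).shuffle(order)  # keyed, reproducible permutation
    # reassemble the blocks in permuted order
    per_row = w // block
    out = [[0] * w for _ in range(h)]
    for i, j in enumerate(order):
        br, bc = divmod(i, per_row)
        for dr, rowvals in enumerate(blocks[j]):
            out[br * block + dr][bc * block: bc * block + block] = rowvals
    return out
```

The disguised image preserves all local block content, which is what lets a DNN still find patterns in it, while the global layout is hidden from the cloud provider.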
Pages: 71–77
Citations: 7
Para: Harvesting CPU time fragments in Big Data Analytics
Q1 Computer Science Pub Date: 2021-09-01 DOI: 10.1109/CLOUD53861.2021.00081
Yuzhao Wang, Hongliang Qu, Junqing Yu, Zhibin Yu
Modern data analytics typically run tasks on statically reserved resources (e.g., CPU and memory), which encourages over-provisioning to guarantee Quality of Service (QoS), leading to a large number of resource time fragments. As a result, the resources of a data analytics cluster are severely under-utilized. Workload co-location on shared resources has been studied extensively, but existing approaches are unaware of the sizes of resource time fragments, making it hard to improve resource utilization and guarantee QoS at the same time. In this paper, we propose Para, an event-driven scheduling mechanism that harvests CPU time fragments in co-located big data analytic workloads. Para introduces three techniques: 1) identifying the Idle CPU Time Window (ICTW) associated with each CPU core by capturing task-switch events; 2) designing a runtime communication mechanism between each task execution of a workload and the underlying resource management system; 3) designing a pull-based scheduler that schedules one workload to run in the ICTW of another. We implement Para based on Apache Mesos and Spark. The experimental results show that Para improves CPU utilization by 44% and 30% on average relative to the original Mesos and the enhanced Mesos under Spark's dynamic mode (MSDM), respectively. Moreover, Para increases the averaged task throughput of Mesos and MSDM by 4.8x and 1.7x, respectively, while guaranteeing the execution time of the primary applications.
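The first of Para's techniques, deriving Idle CPU Time Windows from a per-core task-switch trace, can be sketched as follows. The event layout (timestamp, incoming task, with `"idle"` marking the idle task) is an assumed simplification of a real kernel trace:

```python
def idle_windows(events, end_time):
    """Return (start, end) Idle CPU Time Windows for one core, given
    task-switch events as (timestamp, incoming_task) tuples."""
    windows, idle_since = [], None
    for ts, task in sorted(events):
        if task == "idle" and idle_since is None:
            idle_since = ts          # core just went idle
        elif task != "idle" and idle_since is not None:
            windows.append((idle_since, ts))  # idle window closed
            idle_since = None
    if idle_since is not None:
        windows.append((idle_since, end_time))  # still idle at trace end
    return windows
```

Given these windows, a pull-based scheduler can place a secondary workload's tasks inside them, which is the harvesting step the paper describes.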
Pages: 625–636
Citations: 1
A system for proactive risk assessment of application changes in cloud operations
Q1 Computer Science Pub Date: 2021-09-01 DOI: 10.1109/CLOUD53861.2021.00025
Raghav Batta, L. Shwartz, M. Nidd, A. Azad, H. Kumar
Change is one of the biggest contributors to service outages. With more enterprises migrating their applications to the cloud and using automated build and deployment, the volume and rate of changes have significantly increased. Furthermore, microservice-based architectures have reduced the turnaround time for changes and increased the dependency between services. All of the above make it impossible for Site Reliability Engineers (SREs) to use traditional methods of manual risk assessment for changes. In order to mitigate change-induced service failures and ensure continuous improvement for cloud-native services, it is critical to have an automated system for assessing the risk of change deployments. In this paper, we present an AI-based system for proactively assessing the risk associated with deployment of application changes in cloud operations. The risk assessment is accompanied by actionable risk explainability. We discuss the usage of this system in two primary scenarios: automated and manual deployment. In the automated deployment scenario, our approach is able to alert SREs to 70% of problematic changes by blocking only 1.5% of total changes and recommending human intervention. In the manual deployment scenario, our approach recommends that SREs perform extra due diligence on 2.8% of total changes to capture 84% of problematic changes.
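The gating policy implied by the reported numbers, automatically blocking roughly the riskiest 1.5% of changes for human review, can be sketched as a quantile cut over model risk scores. The scores, field names, and threshold policy below are illustrative; the paper's model internals are not shown:

```python
def gate_changes(changes, block_quantile=0.985):
    """Split changes into (auto_deploy, needs_review) by risk score,
    sending roughly the riskiest 1.5% to human review."""
    ranked = sorted(changes, key=lambda c: c["risk"])
    cut = int(len(ranked) * block_quantile)
    return ranked[:cut], ranked[cut:]

# Synthetic changes with monotonically increasing risk scores.
changes = [{"id": i, "risk": i / 200} for i in range(200)]
auto, review = gate_changes(changes)
```

In practice the threshold would be tuned on historical incidents so that the small blocked fraction covers most change-induced failures, which is the trade-off the abstract quantifies.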
Pages: 112–123
Citations: 2
Energy-Aware Learning Agent (EALA) for Disaggregated Cloud Scheduling
Q1 Computer Science Pub Date: 2021-09-01 DOI: 10.1109/CLOUD53861.2021.00075
Nicholas Nordlund, V. Vassiliadis, Michele Gazzetti, D. Syrivelis, L. Tassiulas
Cloud data centers require enormous amounts of energy to run their clusters of computers. There are huge financial and environmental incentives for cloud service providers to increase their energy efficiency without causing significant negative impacts on their customers' quality of experience. Increasing resource utilization reduces energy consumption by consolidating workloads on fewer machines and allows cloud service providers to turn off inactive devices. While traditional architectures only allow virtual machines (VMs) to use the memory and CPU resources of a single device, VMs in a disaggregated cloud can utilize the small residual capacities of multiple separate devices. Separating VM resources across multiple devices, however, leads to severe fragmentation that eventually negates any positive impact disaggregation has on utilization. To address the fragmentation problem, we present a method of ensuring a cloud operates using the minimal number of devices over time. We introduce an Energy-Aware Learning Agent (EALA) that uses reinforcement learning to guarantee that the system meets minimal quality-of-service requirements and saves energy without the need for VM migration. We evaluate EALA guiding the decisions of Best-Fit against vanilla Best-Fit using the Google cluster trace. We show that EALA improves utilization by 2% and reduces the number of times that compute nodes switch on and off by 11% compared to vanilla Best-Fit.
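The consolidation objective EALA refines can be illustrated with the Best-Fit baseline itself, which packs VMs onto as few active nodes as possible so inactive ones can stay powered off. This is plain Best-Fit over a single capacity dimension, not the reinforcement-learning agent; node names and capacities are illustrative:

```python
def best_fit(vms, nodes):
    """Place each (name, demand) VM on the node with the tightest
    remaining fit, mutating the nodes' free-capacity map."""
    placement = {}
    for name, demand in vms:
        candidates = [n for n in nodes if nodes[n] >= demand]
        if not candidates:
            raise RuntimeError("no capacity for %s" % name)
        # tightest remaining fit wins; this naturally keeps filling
        # partially used nodes before touching empty ones
        target = min(candidates, key=lambda n: nodes[n] - demand)
        nodes[target] -= demand
        placement[name] = target
    return placement

nodes = {"h1": 8, "h2": 8}
placement = best_fit([("vm1", 4), ("vm2", 3), ("vm3", 1)], nodes)
active = {placement[v] for v in placement}
```

Here all three VMs land on `h1`, leaving `h2` untouched and eligible to power off; EALA's contribution is learning when deviating from this greedy choice better preserves that property over time.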
Nordlund, N., Vassiliadis, V., Gazzetti, M., Syrivelis, D., Tassiulas, L., "Energy-Aware Learning Agent (EALA) for Disaggregated Cloud Scheduling," IEEE Cloud Computing, vol. 17, no. 1, pp. 578-583, Sep. 2021. DOI: 10.1109/CLOUD53861.2021.00075
Citations: 3
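The baseline the EALA paper compares against, vanilla Best-Fit, can be sketched as a placement rule that puts each VM on the active node that would be left with the smallest residual capacity. The sketch below is illustrative only, not the authors' implementation; the `Node` class, the `best_fit` function, and the CPU-plus-RAM residual metric are assumptions made for demonstration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    cpu: float  # free CPU cores on this device
    mem: float  # free RAM (GiB) on this device
    vms: list = field(default_factory=list)

def best_fit(nodes, cpu, mem):
    """Place a (cpu, mem) VM on the feasible node with the least residual capacity.

    Returns the chosen node, or None when no active node can host the VM
    in full -- the case where a scheduler would switch on a new device.
    """
    feasible = [n for n in nodes if n.cpu >= cpu and n.mem >= mem]
    if not feasible:
        return None
    # Best-Fit heuristic: minimize the capacity left over after placement,
    # which consolidates load and keeps other nodes free to power down.
    node = min(feasible, key=lambda n: (n.cpu - cpu) + (n.mem - mem))
    node.cpu -= cpu
    node.mem -= mem
    node.vms.append((cpu, mem))
    return node
```

In the paper's setup, EALA does not replace this heuristic; it learns when to accept or defer Best-Fit's placement decisions so that nodes toggle on and off less often.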
Konveyor Move2Kube: Automated Replatforming of Applications to Kubernetes
Q1 Computer Science Pub Date : 2021-09-01 DOI: 10.1109/CLOUD53861.2021.00093
P. V. Seshadri, Harikrishnan Balagopal, Pablo Loyola, Akash Nayak, Chander Govindarajan, Mudit Verma, Ashok Pon Kumar, Amith Singhee
We present Move2Kube, a replatforming framework that automates the transformation of the deployment specification and development pipeline of an application from a non-Kubernetes platform to a Kubernetes-based one, minimizing changes to the application's functional implementation and architecture. Our contributions include: (1) a standardized intermediate representation to which diverse application deployment artifacts can be translated, and (2) an extension framework for adding support for new source platforms and target artifacts, while allowing customization per organizational standards. We provide initial evidence of its effectiveness in terms of effort reduction, and highlight the current research challenges and future lines of work. Move2Kube is being developed as an open source community project and is available at https://move2kube.konveyor.io/
Seshadri, P. V., Balagopal, H., Loyola, P., Nayak, A., Govindarajan, C., Verma, M., Kumar, A. P., Singhee, A., "Konveyor Move2Kube: Automated Replatforming of Applications to Kubernetes," IEEE Cloud Computing, vol. 37, no. 1, pp. 717-719, Sep. 2021. DOI: 10.1109/CLOUD53861.2021.00093
Citations: 1
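As a rough illustration of the kind of transformation Move2Kube automates, the sketch below maps a docker-compose-style service description to a Kubernetes Deployment object. This is a hypothetical, minimal stand-in: Move2Kube's actual intermediate representation and extension framework are far richer, and `compose_to_deployment` is an invented name, not part of its API.

```python
def compose_to_deployment(name, service):
    """Translate a docker-compose-style service dict into a Kubernetes
    Deployment dict (apps/v1). Illustrative only; real replatforming also
    handles volumes, env vars, Services, Ingress, and CI/CD pipelines."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": service.get("replicas", 1),
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": service["image"],
                        "ports": [
                            {"containerPort": p}
                            for p in service.get("ports", [])
                        ],
                    }]
                },
            },
        },
    }
```

The value of a shared intermediate representation is that many such source formats (Cloud Foundry manifests, compose files, plain Dockerfiles) can feed one set of target generators like this.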