Performance Profiling of Load Balancing Algorithms in a Cloud Architecture
Samirah Salifu, Nathan Turlington, Michael Galloway
Pub Date: 2021-10-01 | DOI: 10.1109/IEEECloudSummit52029.2021.00020 | IEEE Cloud Computing, pp. 77-82
Abstract: Load balancing is the process of distributing job requests among servers. Many load balancing algorithms, such as round robin, weighted round robin, and least connections, can be used to distribute bioinformatics workloads. Cloud computing can be combined with bioinformatics to create a BioCloud platform, and a dynamic load balancing algorithm is needed to distribute bioinformatics jobs effectively. In this experiment, the FastQC job was used for all test cases. Four algorithms were designed to distribute this one type of job: system load, %CPU, free RAM, and round robin. The results show that the FastQC job is CPU intensive but not RAM intensive, and that %CPU was the most efficient algorithm.
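The four dispatch policies named in the abstract can be sketched as selection rules over per-server metrics. This is a minimal illustration, not the paper's implementation; the server names and metric values below are hypothetical.

```python
# Sketch of the four dispatch policies: system load, %CPU, free RAM, round robin.
# Metrics are illustrative stand-ins, not the paper's test-bed measurements.
from itertools import cycle

servers = [
    {"name": "node1", "system_load": 1.8, "cpu_pct": 72.0, "free_ram_mb": 4096},
    {"name": "node2", "system_load": 0.6, "cpu_pct": 35.0, "free_ram_mb": 1024},
    {"name": "node3", "system_load": 1.1, "cpu_pct": 55.0, "free_ram_mb": 2048},
]

def pick_by_cpu(pool):
    """%CPU policy: send the job to the least CPU-busy server."""
    return min(pool, key=lambda s: s["cpu_pct"])

def pick_by_load(pool):
    """System-load policy: lowest load average wins."""
    return min(pool, key=lambda s: s["system_load"])

def pick_by_free_ram(pool):
    """Free-RAM policy: the server with the most free memory wins."""
    return max(pool, key=lambda s: s["free_ram_mb"])

_rr = cycle(servers)
def pick_round_robin():
    """Round robin: rotate through servers regardless of load."""
    return next(_rr)
```

For a CPU-bound job like FastQC, the %CPU rule reacts directly to the bottleneck resource, which is consistent with it being the most efficient policy in the study.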
Sovereign Data Exchange in Cloud-Connected IoT using International Data Spaces
Haydar Qarawlus, Malte Hellmeier, Johannes Pieperbeck, Ronja Quensel, Steffen Biehs, Marc Peschke
Pub Date: 2021-10-01 | DOI: 10.1109/IEEECloudSummit52029.2021.00010 | IEEE Cloud Computing, pp. 13-18
Abstract: Data sovereignty is gaining importance as the frequency and sensitivity of data exchange between companies and nations increase. Existing approaches to sovereign data exchange in business ecosystems, such as the International Data Spaces (IDS) initiative, neglect hardware resource restrictions. We therefore examine real-time sovereign data exchange for cloud-connected Internet of Things (IoT) devices. Two lightweight communication schemes, based on request/response and publish/subscribe, are proposed and implemented following the IDS guidelines. For evaluation, we use a simulated test bed based on an Automated Guided Vehicle (AGV) use case, examining the exchanged IDS messages and the CPU usage on the consumer side, represented by the AGVs as IoT devices. The results show that the publish/subscribe variant benefits longer operation times by allowing devices to enter low-power mode, while request/response performs better with limited CPU resources or short operations.
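The trade-off between the two schemes can be illustrated with a minimal in-memory sketch: under publish/subscribe the producer pushes and an idle consumer can sleep until woken, while under request/response the consumer spends CPU only when it actively asks. This is not the IDS connector protocol; the class, topic, and endpoint names are invented for illustration.

```python
# Toy broker contrasting publish/subscribe and request/response (illustrative,
# not the IDS messaging stack).
class Broker:
    def __init__(self):
        self.subs = {}      # topic -> list of subscriber callbacks
        self.handlers = {}  # endpoint -> request handler

    # --- publish/subscribe: producer pushes, consumers wait passively ---
    def subscribe(self, topic, callback):
        self.subs.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        for cb in self.subs.get(topic, []):
            cb(message)

    # --- request/response: consumer polls only when it needs data ---
    def register(self, endpoint, handler):
        self.handlers[endpoint] = handler

    def request(self, endpoint, payload):
        return self.handlers[endpoint](payload)
```

An AGV-style consumer would `subscribe` once and then idle (or enter low-power mode) between messages, whereas the request/response path issues a `request` per interaction, which is cheaper for short, infrequent operations.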
When Edge Computing Meets Compact Data Structures
Zheng Li, Diego Seco, José Fuentes-Sepúlveda
Pub Date: 2021-10-01 | DOI: 10.1109/IEEECloudSummit52029.2021.00013 | IEEE Cloud Computing, pp. 29-34
Abstract: Edge computing enables data processing and storage closer to where data are created. Given the largely distributed compute environment and the widely dispersed data, demand is growing for data sharing and collaborative processing on the edge. Since data shuffling can dominate the overall execution time of collaborative processing jobs, and given the limited power supply and bandwidth in edge environments, reducing communication overhead across edge devices is crucial. Compared with data compression, compact data structures (CDS) seem better suited to this setting because they allow data to be queried, navigated, and manipulated directly in compact form. However, existing work on applying CDS to edge computing generally focuses on the intuitive benefit of reduced data size; the challenges are rarely discussed, let alone investigated empirically in real-world edge use cases. This research highlights the challenges, opportunities, and potential scenarios of implementing CDS in edge computing. Driven by the use case of shuffling-intensive data analytics, we propose a three-layer architecture for CDS-aided data processing and study the feasibility and efficiency of the CDS layer in particular. We expect this research to foster joint research efforts on CDS-aided edge data analytics and to make a wider practical impact.
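A small example of what "querying data directly in compact form" means: a bit vector augmented with precomputed block counts answers rank queries (how many 1-bits precede a position) without expanding the data. This is a textbook CDS sketch with an assumed block size, not the CDS layer from the paper.

```python
# Bit vector with block-sampled popcounts: a classic compact data structure
# supporting rank queries without decompression (block size is illustrative).
class RankBitVector:
    BLOCK = 8

    def __init__(self, bits):
        self.bits = bits
        # cumulative count of 1-bits at each block boundary
        self.blocks = [0]
        for i in range(0, len(bits), self.BLOCK):
            self.blocks.append(self.blocks[-1] + sum(bits[i:i + self.BLOCK]))

    def rank1(self, i):
        """Number of 1-bits in bits[0:i], using the sampled counts plus a
        short in-block scan."""
        b, r = divmod(i, self.BLOCK)
        return self.blocks[b] + sum(self.bits[b * self.BLOCK: b * self.BLOCK + r])
```

In a shuffling-intensive job, structures like this let a receiving edge device answer membership and counting queries on the transferred data as-is, instead of paying decompression cost before every query.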
Preproduction Deploys: Cloud-Native Integration Testing
J. Carroll, Pankaj Anand, David Guo
Pub Date: 2021-10-01 | DOI: 10.1109/IEEECloudSummit52029.2021.00015 | IEEE Cloud Computing, pp. 41-48
Abstract: The microservice architecture for cloud-based systems is extended so that each loosely coupled component is not only independently deployable but also independently routable. This supports canary deployments, blue/green deployments, and roll-back. Both ad hoc and system integration test traffic can be directed to components before they are released to production traffic. Front-end code is included in this architecture by using server-side rendering of JavaScript bundles. Environments for integration testing are created by placing preproduction deploys side by side with production deploys at appropriate levels of isolation. After a successful integration test run, preproduction components are known to work with production exactly as it is. For isolation, test traffic uses staging databases that are copied daily from the production databases, omitting sensitive data. Safety and security concerns are dealt with in a targeted fashion, not monolithically. This architecture scales well with organization size, is more effective for integration testing, and is better aligned with agile business practices than traditional approaches.
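Per-component routing of test traffic can be sketched as a header-based dispatch rule: requests tagged as test traffic reach the preproduction deploy of a component, while everything else reaches production. The header name, component name, and URLs below are hypothetical, not taken from the paper.

```python
# Sketch of independent per-component routing: test-tagged traffic goes to the
# preproduction deploy, all other traffic to production. Names are hypothetical.
ROUTES = {
    "checkout": {
        "prod": "http://checkout-prod:8080",
        "preprod": "http://checkout-preprod:8080",
    },
}

def route(component, headers):
    """Resolve the upstream for a request based on a test-traffic header."""
    target = "preprod" if headers.get("X-Test-Traffic") == "1" else "prod"
    return ROUTES[component][target]
```

Because only the routing layer distinguishes the two deploys, an integration test run exercises the preproduction component against the real production neighbors, which is what makes a green test run meaningful.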
Managing Big Data Stream Pipelines Using Graphical Service Mesh Tools
M. Faizan, C. Prehofer
Pub Date: 2021-10-01 | DOI: 10.1109/IEEECloudSummit52029.2021.00014 | IEEE Cloud Computing, pp. 35-40
Abstract: Current big data frameworks such as Apache Flink and Spark enable efficient processing of large-scale streaming data in a distributed setup. To manage such data pipelines and their computing resources, we propose combining a graphical pipeline management tool, Apache StreamPipes, with container management tools such as Kubernetes. For evaluation, we implemented a use case in StreamPipes with data preprocessing, vehicle power consumption, and driving behavior services. We discuss the capabilities of StreamPipes in managing and executing complex stream processing pipelines and evaluate the possible integration of container and service mesh tools (e.g., Istio) with StreamPipes. Furthermore, we implemented and evaluated a service management layer in our system design to provide extended features. In particular, we evaluated the delay when such a complex pipeline is restarted, e.g., for updates or reconfiguration.
Image Disguising for Protecting Data and Model Confidentiality in Outsourced Deep Learning
Sagar Sharma, A. Alam, Keke Chen
Pub Date: 2021-09-01 | DOI: 10.1109/CLOUD53861.2021.00020 | IEEE Cloud Computing, pp. 71-77
Abstract: Large training datasets and expensive model tweaking are common features of deep learning development for images. As a result, data owners often use cloud resources or machine learning service providers to develop large-scale complex models. This practice, however, raises serious privacy concerns. Existing solutions are either too expensive to be practical or do not sufficiently protect the confidentiality of the data and the model. In this paper, we aim for a better trade-off among the level of protection for outsourced DNN model training, the expense, and the utility of the data, using novel image disguising mechanisms. We design a suite of image disguising methods that are efficient to implement, and we analyze the multiple levels of trade-off between data utility and confidentiality protection. The experimental evaluation shows the surprising ability of DNN modeling methods to discover patterns in disguised images, and the flexibility of these mechanisms in achieving different levels of resilience to attacks.
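One family of disguising mechanisms can be illustrated as a key-driven permutation of image blocks: the cloud trains on scrambled pixels, and only the key holder can restore the original layout. This sketch uses a simple row-block shuffle with an assumed block size and key; the paper's actual suite of transforms is richer than this.

```python
# Illustrative image disguising: permute fixed-size row blocks of a 2D image
# with a secret key. A sketch of the idea, not the paper's exact transforms.
import random

def disguise(image, block, key):
    """Split the image into row blocks of `block` rows and permute them."""
    chunks = [image[i:i + block] for i in range(0, len(image), block)]
    order = list(range(len(chunks)))
    random.Random(key).shuffle(order)          # key-determined permutation
    scrambled = [row for idx in order for row in chunks[idx]]
    return scrambled, order

def undisguise(scrambled, block, order):
    """Invert the permutation: only possible with the key-derived order."""
    chunks = [scrambled[i:i + block] for i in range(0, len(scrambled), block)]
    original = [None] * len(chunks)
    for new_pos, old_pos in enumerate(order):
        original[old_pos] = chunks[new_pos]
    return [row for chunk in original for row in chunk]
```

The interesting empirical point in the abstract is that DNNs can still learn class-discriminative patterns from such scrambled inputs, so utility survives a transform that defeats human inspection.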
Para: Harvesting CPU time fragments in Big Data Analytics
Yuzhao Wang, Hongliang Qu, Junqing Yu, Zhibin Yu
Pub Date: 2021-09-01 | DOI: 10.1109/CLOUD53861.2021.00081 | IEEE Cloud Computing, pp. 625-636
Abstract: Modern data analytics typically runs tasks on statically reserved resources (e.g., CPU and memory), which encourages over-provisioning to guarantee Quality of Service (QoS) and leaves a large number of resource time fragments. As a result, the resources of a data analytics cluster are severely under-utilized. Workload co-location on shared resources has been studied extensively, but existing approaches are unaware of the sizes of these fragments, making it hard to improve resource utilization and guarantee QoS at the same time. In this paper, we propose Para, an event-driven scheduling mechanism that harvests the CPU time fragments in co-located big data analytics workloads. Para introduces three techniques: 1) identifying the Idle CPU Time Window (ICTW) associated with each CPU core by capturing task-switch events; 2) a runtime communication mechanism between each task execution of a workload and the underlying resource management system; 3) a pull-based scheduler that schedules one workload to run in the ICTWs of another. We implement Para on Apache Mesos and Spark. The experimental results show that Para improves CPU utilization by 44% and 30% on average relative to the original Mesos and to enhanced Mesos under Spark's dynamic mode (MSDM), respectively. Moreover, Para increases the average task throughput of Mesos and MSDM by 4.8x and 1.7x, respectively, while guaranteeing the execution time of the primary applications.
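The pull-based idea can be sketched as greedily placing secondary tasks into the primary workload's idle windows. Para detects these windows from kernel task-switch events; this toy version takes the ICTWs as given, with invented window and task durations.

```python
# Toy version of scheduling secondary tasks into Idle CPU Time Windows.
# Windows are (start, length); tasks are (name, duration). Illustrative only:
# the real system discovers ICTWs at runtime from task-switch events.
def schedule_into_ictws(ictws, tasks):
    """Greedily pack the longest tasks first into each idle window."""
    placement = []  # (task name, scheduled start time)
    remaining = sorted(tasks, key=lambda t: t[1], reverse=True)
    for start, length in ictws:
        free = length
        for task in remaining[:]:
            name, duration = task
            if duration <= free:
                placement.append((name, start + (length - free)))
                free -= duration
                remaining.remove(task)
    return placement, remaining
```

Tasks that fit no window stay queued rather than preempting the primary workload, which is how the QoS guarantee of the primary applications is preserved in spirit.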
A system for proactive risk assessment of application changes in cloud operations
Raghav Batta, L. Shwartz, M. Nidd, A. Azad, H. Kumar
Pub Date: 2021-09-01 | DOI: 10.1109/CLOUD53861.2021.00025 | IEEE Cloud Computing, pp. 112-123
Abstract: Change is one of the biggest contributors to service outages. As more enterprises migrate their applications to the cloud and adopt automated build and deployment, the volume and rate of changes have increased significantly. Furthermore, microservice-based architectures have reduced the turnaround time for changes and increased the dependencies between services. All of this makes it impossible for Site Reliability Engineers (SREs) to use traditional methods of manual risk assessment for changes. To mitigate change-induced service failures and ensure continuous improvement of cloud-native services, an automated system for assessing the risk of change deployments is critical. In this paper, we present an AI-based system for proactively assessing the risk associated with deploying application changes in cloud operations. The risk assessment is accompanied by actionable risk explainability. We discuss the use of this system in two primary scenarios, automated and manual deployment. In the automated deployment scenario, our approach alerts SREs to 70% of problematic changes while blocking only 1.5% of total changes and recommending human intervention. In the manual deployment scenario, our approach recommends that SREs perform extra due diligence on 2.8% of total changes, capturing 84% of problematic changes.
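The two operating points above (block 1.5% of changes, or flag 2.8% for due diligence) amount to choosing a risk-score threshold for a target review rate. A hedged sketch with illustrative scores follows; the actual system learns risk from change and incident data, which this sketch does not model.

```python
# Sketch of threshold selection for a target block/review rate over risk
# scores. Scores here are illustrative, not model outputs from the paper.
def threshold_for_block_rate(scores, block_fraction):
    """Return the risk score above which roughly `block_fraction` of
    changes fall (i.e., the cutoff for the riskiest fraction)."""
    ranked = sorted(scores, reverse=True)
    k = max(1, round(len(ranked) * block_fraction))
    return ranked[k - 1]

def flag_changes(changes, threshold):
    """Select the changes whose risk meets or exceeds the cutoff."""
    return [c for c in changes if c["risk"] >= threshold]
```

Raising the fraction trades more interrupted deployments for higher recall of problematic changes, which is exactly the automated-vs-manual trade-off the abstract quantifies.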
Energy-Aware Learning Agent (EALA) for Disaggregated Cloud Scheduling
Nicholas Nordlund, V. Vassiliadis, Michele Gazzetti, D. Syrivelis, L. Tassiulas
Pub Date: 2021-09-01 | DOI: 10.1109/CLOUD53861.2021.00075 | IEEE Cloud Computing, pp. 578-583
Abstract: Cloud data centers require enormous amounts of energy to run their clusters. Cloud service providers have huge financial and environmental incentives to increase energy efficiency without significantly degrading their customers' quality of experience. Increasing resource utilization reduces energy consumption by consolidating workloads onto fewer machines, allowing providers to turn off inactive devices. While traditional architectures only allow virtual machines (VMs) to use the memory and CPU resources of a single device, VMs in a disaggregated cloud can utilize the small residual capacities of multiple separate devices. However, spreading VM resources across multiple devices leads to severe fragmentation that eventually negates any positive impact of disaggregation on utilization. To address the fragmentation problem, we present a method for ensuring that a cloud operates with the minimal number of devices over time. We introduce an Energy-Aware Learning Agent (EALA) that uses reinforcement learning to guarantee that the system meets minimal quality-of-service requirements and saves energy without requiring VM migration. We evaluate EALA guiding the decisions of Best-Fit against vanilla Best-Fit using the Google cluster trace, and show that EALA improves utilization by 2% and reduces the number of times compute nodes switch on and off by 11% compared to vanilla Best-Fit.
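The vanilla Best-Fit baseline that EALA guides can be sketched as classic bin packing: place each VM on the powered-on node with the tightest remaining capacity that still fits, and power on a new node only when nothing fits. Capacities below are illustrative, and this single-dimension sketch ignores the memory/CPU split and the RL guidance layer.

```python
# Vanilla Best-Fit placement sketch (one resource dimension, illustrative).
# EALA's contribution in the paper is guiding decisions like these to reduce
# node power cycling; that learning layer is not modeled here.
def best_fit(vm_sizes, node_capacity):
    """Place VMs one by one; return residual capacity of each powered-on node."""
    nodes = []  # residual capacity per powered-on node
    for vm in vm_sizes:
        candidates = [i for i, free in enumerate(nodes) if free >= vm]
        if candidates:
            tightest = min(candidates, key=lambda i: nodes[i])  # tightest fit
            nodes[tightest] -= vm
        else:
            nodes.append(node_capacity - vm)  # power on a new node
    return nodes
```

The fragmentation problem in the abstract shows up here as many small nonzero residuals: each is too small for a whole VM on one node, which is exactly the capacity that disaggregation, at the cost of fragmentation, lets VMs stitch together.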
Konveyor Move2Kube: Automated Replatforming of Applications to Kubernetes
P. V. Seshadri, Harikrishnan Balagopal, Pablo Loyola, Akash Nayak, Chander Govindarajan, Mudit Verma, Ashok Pon Kumar, Amith Singhee
Pub Date: 2021-09-01 | DOI: 10.1109/CLOUD53861.2021.00093 | IEEE Cloud Computing, pp. 717-719
Abstract: We present Move2Kube, a replatforming framework that automates the transformation of an application's deployment specification and development pipeline from a non-Kubernetes platform to a Kubernetes-based one, minimizing changes to the application's functional implementation and architecture. Our contributions include: (1) a standardized intermediate representation into which diverse application deployment artifacts can be translated, and (2) an extension framework for adding support for new source platforms and target artifacts while allowing customization per organizational standards. We provide initial evidence of its effectiveness in reducing effort, and we highlight current research challenges and future lines of work. Move2Kube is being developed as an open-source community project and is available at https://move2kube.konveyor.io/
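The intermediate-representation idea can be sketched as a two-step translation: parse a source-platform spec into a neutral structure, then emit a Kubernetes manifest from it. The IR fields and the simplified Cloud Foundry-style input below are hypothetical illustrations, not Move2Kube's actual schema or output.

```python
# Two-step replatforming sketch: source spec -> neutral IR -> Kubernetes
# manifest. Field names are hypothetical, not Move2Kube's schema.
def to_ir(cf_manifest):
    """Translate a simplified Cloud Foundry-style app spec into a neutral IR."""
    return {
        "name": cf_manifest["name"],
        "image": cf_manifest.get("docker", {}).get("image", ""),
        "replicas": cf_manifest.get("instances", 1),
    }

def to_k8s_deployment(ir):
    """Emit a minimal (incomplete) Kubernetes Deployment from the IR."""
    return (
        "apiVersion: apps/v1\n"
        "kind: Deployment\n"
        f"metadata:\n  name: {ir['name']}\n"
        f"spec:\n  replicas: {ir['replicas']}\n"
        "  template:\n    spec:\n      containers:\n"
        f"      - name: {ir['name']}\n        image: {ir['image']}\n"
    )
```

Decoupling the two steps is what makes the extension framework possible: a new source platform only needs a translator into the IR, and a new target artifact only needs a generator out of it.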