
Latest publications in IEEE Cloud Computing

Different in different ways: A network-analysis approach to voice and prosody in Autism Spectrum Disorder.
Q1 Computer Science Pub Date : 2024-01-01 Epub Date: 2023-04-25 DOI: 10.1080/15475441.2023.2196528
Ethan Weed, Riccardo Fusaroli, Elizabeth Simmons, Inge-Marie Eigsti

The current study investigated whether the difficulty in finding group differences in prosody between speakers with autism spectrum disorder (ASD) and neurotypical (NT) speakers might be explained by identifying different acoustic profiles of speakers which, while still perceived as atypical, might be characterized by different acoustic qualities. We modelled the speech from a selection of speakers (N = 26), with and without ASD, as a network of nodes defined by acoustic features. We used a community-detection algorithm to identify clusters of speakers who were acoustically similar and compared these clusters with atypicality ratings by naïve and expert human raters. Results identified three clusters: one primarily composed of speakers with ASD, one of mostly NT speakers, and one comprising an even mixture of ASD and NT speakers. The human raters were highly reliable at distinguishing speakers with and without ASD, regardless of which cluster the speaker was in. These results suggest that community-detection methods using a network approach may complement commonly employed human ratings to improve our understanding of the intonation profiles in ASD.
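To make the clustering step concrete, here is a minimal Python sketch (not the authors' code) that builds a speaker-similarity network from hypothetical acoustic features and runs a standard community-detection algorithm; the feature set, similarity threshold, and random data are illustrative assumptions.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
# Hypothetical acoustic profiles: rows are speakers, columns are features
# (e.g., pitch mean, pitch variability, speech rate, pause duration).
features = rng.normal(size=(26, 4))

# Speaker-similarity network: nodes are speakers, edges connect pairs whose
# acoustic profiles correlate above an arbitrary threshold.
similarity = np.corrcoef(features)
G = nx.Graph()
G.add_nodes_from(range(features.shape[0]))
for i in range(features.shape[0]):
    for j in range(i + 1, features.shape[0]):
        if similarity[i, j] > 0.3:          # hypothetical cut-off
            G.add_edge(i, j, weight=similarity[i, j])

# Community detection yields clusters of acoustically similar speakers, which
# could then be compared against diagnostic labels and human atypicality ratings.
for k, community in enumerate(greedy_modularity_communities(G, weight="weight")):
    print(f"cluster {k}: speakers {sorted(community)}")
```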

Citations: 0
MicroLens: A Performance Analysis Framework for Microservices Using Hidden Metrics With BPF
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00043
Marcelo Amaral, Tatsuhiro Chiba, Scott Trent, Takeshi Yoshimura, Sunyanan Choochotkaew
Determining the root cause of performance regression for microservices is challenging. Cascading performance effects along the microservice topology hide the source of the problem. Additionally, the lack of knowledge about application phases can potentially lead to false-positive critical-service detection. Service resource utilization is an imperfect proxy for application performance, potentially leading to false positives. Therefore, in this work, we propose a new performance testing framework that leverages hidden Berkeley Packet Filter (BPF) kernel metrics to locate root causes of performance regression. The framework applies a systematic multi-level approach to analyze microservice performance without intrusive code instrumentation. First, the framework constructs an attributed graph from microservice requests, scores the services to identify the critical paths, and ranks the low-level metrics to highlight the root cause of performance regression. Through judiciously designed experiments, we evaluated the metric-collection overhead, showing less than 18% additional latency when the application runs across hosts and 9% within the same host. Depending on the application, no overhead is observed at all, whereas the state-of-the-art approach introduced up to 1060% more latency. The microservice benchmark evaluation shows that MicroLens can successfully identify the set of root causes and that the causes vary when the application runs on different infrastructures.
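As a rough illustration of the workflow described above (attributed graph, critical-path scoring, metric ranking), the following Python sketch uses invented service names, latencies, and metric traces; it is not MicroLens, and the "BPF-derived" counters are hypothetical stand-ins.

```python
import networkx as nx
import numpy as np

# Attributed call graph of microservices; node attributes hold per-service
# metrics (here only a latency sample, standing in for BPF-derived counters).
G = nx.DiGraph()
G.add_edges_from([("frontend", "cart"), ("frontend", "catalog"), ("cart", "db")])
latency_ms = {"frontend": 120.0, "cart": 95.0, "catalog": 15.0, "db": 80.0}
nx.set_node_attributes(G, latency_ms, "latency_ms")

# Score each service by the latency accumulated along its slowest downstream
# path -- a simple stand-in for critical-path identification.
def path_cost(node):
    downstream = [path_cost(s) for s in G.successors(node)]
    return G.nodes[node]["latency_ms"] + (max(downstream) if downstream else 0.0)

critical_path_root = max(G.nodes, key=path_cost)
print("service heading the critical path:", critical_path_root)

# Rank hypothetical low-level metrics by how strongly they track end-to-end latency.
metrics = {"blk_io_wait": [1, 5, 9, 12], "sched_delay": [2, 2, 3, 3], "tcp_retrans": [0, 0, 1, 0]}
e2e_latency = [100, 140, 190, 230]
ranked = sorted(metrics, key=lambda m: abs(np.corrcoef(metrics[m], e2e_latency)[0, 1]), reverse=True)
print("suspect metrics, most to least correlated:", ranked)
```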
Citations: 5
QoS aware FaaS for Heterogeneous Edge-Cloud continuum
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00023
R. Sheshadri K., J. Lakshmi
Function as a Service (FaaS) is one of the widely used serverless computing service offerings to build and deploy applications on the Cloud. The platform is popular for its "pay-as-you-go" billing model, microservice-based design, event-driven executions, and autonomous scaling. Although it has firm roots in Cloud computing service offerings, it is also being actively explored in the Edge computing layer. The efficient resource management of FaaS is attractive to Edge computing because of its limited resources. Existing literature on Edge-Cloud FaaS platforms orchestrates compute workloads based on factors such as data locality, resource availability, network costs, and bandwidth. However, state-of-the-art platforms lack a comprehensive way to address the challenges of managing heterogeneous resources in the FaaS platform. Specifying resources in a heterogeneous setting and the lack of Quality of Service (QoS)-driven resource provisioning exacerbate the problems of resource selection and function deployment in FaaS platforms with a heterogeneous resource pool. To address these gaps, the current work presents a novel heterogeneous FaaS platform that deduces function resource specifications using Machine Learning (ML) methods, performs smart function placement on Edge/Cloud based on a user-specified QoS requirement, and exploits data locality by caching appropriate data for function executions. Experimental results based on real-world workloads of a video surveillance application show that the proposed platform brings efficient resource utilization and cost savings at the Cloud by reducing resource usage by up to 30%, while improving the performance of function executions by up to 25% at Edge and Cloud.
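The sketch below illustrates, under stated assumptions, the two ingredients the abstract mentions: learning a function's memory requirement from past executions and choosing an Edge or Cloud placement against a latency QoS target. All numbers, the regression model, and the placement rule are hypothetical simplifications, not the paper's actual method.

```python
from sklearn.linear_model import LinearRegression

# (1) Deduce a memory specification from past runs: fit peak memory usage
# against input size (both hypothetical) and predict for a new input.
input_sizes_mb = [[1], [5], [10], [20]]
peak_memory_mb = [130, 180, 260, 420]
model = LinearRegression().fit(input_sizes_mb, peak_memory_mb)
suggested_mb = model.predict([[15]])[0]
print(f"suggested memory spec for a 15 MB input: ~{suggested_mb:.0f} MB")

# (2) QoS-driven placement: prefer the Edge when the data already lives there
# and the estimated latency stays within the user-specified budget.
qos_latency_ms = 200
estimated_latency_ms = {"edge": 150, "cloud": 90}   # hypothetical estimates
data_cached_on_edge = True
placement = ("edge" if data_cached_on_edge and estimated_latency_ms["edge"] <= qos_latency_ms
             else "cloud")
print("placement decision:", placement)
```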
Citations: 2
A Guided Approach Towards Complex Chaos Selection, Prioritisation and Injection
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00025
Ojaswa Sharma, Mudit Verma, Saumya Bhadauria, P. Jayachandran
Though Chaos Engineering is a popular method for testing reliability and performance assurance, available tools can only inject random or manually curated faults into a target system. Given the vast array of faults that can be injected, it is crucial to a) intelligently pick the faults that can have tangible effects, b) increase the test coverage, and c) reduce the overall time needed to assess the reliability of a system under adverse conditions. To that end, we propose to learn from past major outages and to use genetic-algorithm-based meta-heuristics to evolve complex fault injections.
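A toy genetic algorithm along these lines might look like the following Python sketch; the fault catalogue, impact scores, and fitness function are invented for illustration and are not the authors' design.

```python
import random

FAULTS = ["kill-pod", "cpu-stress", "net-delay", "disk-fill", "dns-fail", "mem-leak"]
IMPACT = {"kill-pod": 5, "cpu-stress": 2, "net-delay": 4, "disk-fill": 3, "dns-fail": 4, "mem-leak": 1}
BUDGET = 3  # at most this many faults per experiment

def fitness(chromosome):
    # A chromosome is a bit vector selecting which faults to inject.
    chosen = [f for f, bit in zip(FAULTS, chromosome) if bit]
    if len(chosen) > BUDGET:
        return 0  # over budget: infeasible experiment
    return sum(IMPACT[f] for f in chosen)

def crossover(a, b):
    cut = random.randrange(1, len(FAULTS))
    return a[:cut] + b[cut:]

def mutate(chromosome, rate=0.1):
    return [bit ^ (random.random() < rate) for bit in chromosome]

random.seed(1)
population = [[random.randint(0, 1) for _ in FAULTS] for _ in range(20)]
for _ in range(30):  # generations
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    children = [mutate(crossover(random.choice(parents), random.choice(parents))) for _ in range(10)]
    population = parents + children

best = max(population, key=fitness)
print("faults to inject:", [f for f, bit in zip(FAULTS, best) if bit])
```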
Citations: 0
A Case For Cross-Domain Observability to Debug Performance Issues in Microservices
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00045
R. K., Praveen Tammana, Pravein G. Kannan, Priyanka Naik
Many applications deployed in the cloud are refactored into small components called microservices that are deployed as containers in a Kubernetes environment. Such applications run on a cluster of physical servers connected via the datacenter network. In such deployments, resources such as compute, memory, and network are shared, and hence some microservices (culprits) can misbehave and consume more resources. This interference among applications hosted on the same node leads to performance issues (e.g., high latency, packet loss) in other microservices (victims), followed by delayed or low-quality responses. Given the highly distributed and transient nature of the workloads, it is extremely challenging to debug performance issues, especially given the nature of existing monitoring tools, which collect traces and analyze them at individual points (network, host, etc.) in a disaggregated manner. In this paper, we make the case for a cross-domain (network & host) monitoring and debugging framework that provides the end-to-end observability needed to debug performance issues of applications and to pinpoint the root cause, whether it lies on the sender host, the receiver host, or the network. We present the design and provide preliminary implementation details using eBPF (extended Berkeley Packet Filter) to elucidate the feasibility of the system.
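To illustrate what cross-domain attribution could look like in principle, the sketch below splits a request's end-to-end delay into sender-host, network, and receiver-host shares from per-domain timestamps (which, in a real system, could be captured by probes at the application and NIC boundaries); the function name, timestamps, and latency budget are hypothetical.

```python
# Attribute a slow request to one domain from four per-request timestamps (ms):
# when the application sent it, when it left the sender's NIC, when it arrived
# at the receiver's NIC, and when the receiving application processed it.
def attribute_delay(t_send_app, t_send_nic, t_recv_nic, t_recv_app, budget_ms=50):
    sender_ms = t_send_nic - t_send_app      # time spent on the sending host
    network_ms = t_recv_nic - t_send_nic     # time on the wire / fabric
    receiver_ms = t_recv_app - t_recv_nic    # time spent on the receiving host
    total_ms = t_recv_app - t_send_app
    if total_ms <= budget_ms:
        return "within budget", total_ms
    domains = {"sender-host": sender_ms, "network": network_ms, "receiver-host": receiver_ms}
    return max(domains, key=domains.get), total_ms

# Fabricated example: most of the delay accumulates on the receiving host.
culprit, total = attribute_delay(t_send_app=0.0, t_send_nic=4.0, t_recv_nic=9.0, t_recv_app=78.0)
print(f"total {total:.0f} ms, dominant domain: {culprit}")
```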
Citations: 2
Data Leakage Free ABAC Policy Construction in Multi-Cloud Collaboration
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00054
J. C. John, Arobinda Gupta, S. Sural
With an increase in the diversity and complexity of requirements from organizations for cloud computing, there is a growing need for integrating the services of multiple cloud providers. In such multi-cloud systems, data leakage is considered a major security concern, caused by illegitimate actions of malicious users often acting in collusion. The possibility of data leakage in such environments is characterized by the number of interoperations as well as the trustworthiness of users on the collaborating clouds. In this paper, we address the problem of secure multi-cloud collaboration from an Attribute-based Access Control (ABAC) policy management perspective. In particular, we define a problem that aims to formulate ABAC policy rules for establishing a high degree of inter-cloud access while eliminating potential paths for data leakage. A data-leakage-free ABAC policy generation algorithm is proposed that first determines the likelihood of data leakage and then attempts to maximize inter-cloud collaborations. Experimental results on several large data sets show the efficacy of the proposed approach.
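The following Python sketch illustrates the general idea of checking a candidate inter-cloud rule against leakage paths by reachability over an access graph; the graph model, entity names, and leak test are simplified assumptions, not the paper's algorithm.

```python
import networkx as nx

SUBJECTS = {"cloudA:alice", "cloudB:bob"}

# Access graph: an edge object -> subject grants the subject read access;
# an edge subject -> object means the subject can write (re-share) into it.
G = nx.DiGraph()
G.add_edge("cloudA:payroll", "cloudA:alice")          # alice may read payroll
G.add_edge("cloudA:alice", "cloudB:shared-bucket")    # alice may write the shared bucket

def leaks(graph, sensitive_obj, authorized):
    """True if a subject outside `authorized` can transitively reach the data."""
    reachable = nx.descendants(graph, sensitive_obj)
    return bool((reachable & SUBJECTS) - authorized)

# Candidate inter-cloud rule: let bob read the shared bucket. Accept it only
# if it does not open a leakage path from the sensitive object.
candidate = ("cloudB:shared-bucket", "cloudB:bob")
G.add_edge(*candidate)
if leaks(G, "cloudA:payroll", authorized={"cloudA:alice"}):
    G.remove_edge(*candidate)
    print("candidate rule rejected: it would leak cloudA:payroll to cloudB:bob")
```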
Citations: 0
Radio: Reconciling Disk I/O Interference in a Para-virtualized Cloud
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00034
Guangwen Yang, Liana Wane, W. Xue
As more virtual machines (VMs) are consolidated in the cloud system, interference among VMs sharing underlying resources may occur more frequently than ever. In particular, certain VMs' disk I/O performance is impacted, seriously compromising the related cloud services. Existing interference analysis approaches cannot guarantee desired results due to 1) the lack of effective techniques for characterizing disk I/O interference and 2) the considerable runtime overhead of determining interference and the related culprits. To overcome these barriers, we present Radio, an end-to-end analysis tool for disk I/O interference diagnostics in a para-virtualized cloud. Radio quantifies the dynamic changes in I/O strength across virtual CPUs (vCPUs), constructs a performance repository to efficiently identify VMs' abnormal behaviors, and then exploits interference heat maps and non-constant correlation approaches to infer the culprits of interference. With Radio deployed at the National Supercomputing Center in Wuxi for more than 10 months, we demonstrate its effectiveness in real-world use cases on a cloud system with more than 300 VMs deployed. Radio can effectively analyze interference issues within 20 seconds, incurring only 0.2% extra CPU overhead on the host machine. With this achievement, Radio has successfully assisted system administrators in reducing the daily incidence of interference from more than 65% to less than 10% and in improving the overall disk throughput of the cloud system by more than 27.5%.
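As a simplified illustration of correlating per-VM I/O strength with a victim VM's latency to single out a culprit, consider the following Python sketch with synthetic traces; the window count, metrics, and plain correlation test are assumptions and do not reproduce Radio's heat-map or non-constant-correlation techniques.

```python
import numpy as np

rng = np.random.default_rng(7)
windows = 60
io_strength = {                        # hypothetical MB/s per monitoring window
    "vm-a": rng.normal(20, 2, windows),
    "vm-b": rng.normal(5, 1, windows),
    "vm-c": rng.normal(5, 1, windows),
}
io_strength["vm-a"][30:40] += 80       # vm-a bursts in windows 30-40
victim_latency = rng.normal(3, 0.3, windows)
victim_latency[30:40] += 15            # the victim VM slows down at the same time

# Correlate each VM's I/O strength with the victim's latency; the VM with the
# strongest positive correlation is the most likely interference culprit.
scores = {vm: np.corrcoef(series, victim_latency)[0, 1] for vm, series in io_strength.items()}
culprit = max(scores, key=scores.get)
print(f"likely interference culprit: {culprit} (corr={scores[culprit]:.2f})")
```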
Citations: 0
AutoDECK: Automated Declarative Performance Evaluation and Tuning Framework on Kubernetes
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00053
Sunyanan Choochotkaew, Tatsuhiro Chiba, Scott Trent, Takeshi Yoshimura, Marcelo Amaral
Containerization and application variety bring many challenges in automating evaluations for performance tuning and for comparison among infrastructure choices. Due to the tightly coupled design of benchmarks and evaluation tools, present automated tools on Kubernetes are limited to trivial microbenchmarks and cannot be extended to complex cloud-native architectures such as microservices and serverless, which are usually managed by customized operators that set up workload dependencies. In this paper, we propose AutoDECK, a performance evaluation framework operating in a fully declarative manner. The proposed framework automates configuring, deploying, evaluating, summarizing, and visualizing the benchmarking workload. It seamlessly integrates mature Kubernetes-native systems and extends multiple functionalities such as tracking the image-build pipeline and auto-tuning. We present five use cases of evaluation and analysis with various kinds of benchmarks, including microbenchmarks and HPC/AI benchmarks. The evaluation results can also differentiate characteristics such as resource usage behavior and parallelism effectiveness between different clusters. Furthermore, the results demonstrate the benefit of integrating an auto-tuning feature into the proposed framework, as shown by the 10% change in transferred memory bytes in the Sysbench benchmark.
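To give a flavor of what a declarative benchmark description could look like, the sketch below renders a hypothetical custom-resource document from Python; the API group, kind, and field names are invented and do not reflect AutoDECK's actual schema.

```python
import yaml  # PyYAML, used only to render the document

# Hypothetical declarative benchmark spec: the run, the tuning search space,
# and the metrics to report are all described as data rather than scripts.
benchmark = {
    "apiVersion": "example.io/v1alpha1",      # invented API group
    "kind": "BenchmarkRun",                   # invented kind
    "metadata": {"name": "sysbench-memory"},
    "spec": {
        "image": {"repository": "example/sysbench", "tag": "latest"},
        "workload": {"tool": "sysbench", "args": ["memory", "run"]},
        "iterations": 5,
        "tuning": {"parameters": [{"name": "threads", "values": [1, 2, 4, 8]}]},
        "report": {"metrics": ["transferred_MiB", "ops_per_sec"]},
    },
}

# In a cluster this document would be applied as a custom resource
# (e.g. `kubectl apply -f benchmark.yaml`); here we just render it.
print(yaml.safe_dump(benchmark, sort_keys=False))
```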
Citations: 1
Symposium on Convergence of CLOUD & HPC
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/cloud55607.2022.00015
{"title":"Symposium on Convergence of CLOUD & HPC","authors":"","doi":"10.1109/cloud55607.2022.00015","DOIUrl":"https://doi.org/10.1109/cloud55607.2022.00015","url":null,"abstract":"","PeriodicalId":54281,"journal":{"name":"IEEE Cloud Computing","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76225265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SLAM: SLO-Aware Memory Optimization for Serverless Applications
Q1 Computer Science Pub Date : 2022-07-01 DOI: 10.1109/CLOUD55607.2022.00019
Gor Safaryan, Anshul Jindal, Mohak Chadha, M. Gerndt
Serverless computing paradigm has become more ingrained into the industry, as it offers a cheap alternative for application development and deployment. This new paradigm has also created new kinds of problems for the developer, who needs to tune memory configurations for balancing cost and performance. Many researchers have addressed the issue of minimizing cost and meeting Service Level Objective (SLO) requirements for a single FaaS function, but there has been a gap for solving the same problem for an application consisting of many FaaS functions, creating complex application workflows.In this work, we designed a tool called SLAM to address the issue. SLAM uses distributed tracing to detect the relationship among the FaaS functions within a serverless application. By modeling each of them, it estimates the execution time for the application at different memory configurations. Using these estimations, SLAM determines the optimal memory configuration for the given serverless application based on the specified SLO requirements and user-specified objectives (minimum cost or minimum execution time). We demonstrate the functionality of SLAM on AWS Lambda by testing on four applications. Our results show that the suggested memory configurations guarantee that more than 95% of requests are completed within the predefined SLOs.
Citations: 7