
ACM Transactions on Autonomous and Adaptive Systems: Latest Articles

Self-Adaptation in Industry: A Survey
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-05-28 DOI: https://dl.acm.org/doi/10.1145/3589227
Danny Weyns, Ilias Gerostathopoulos, Nadeem Abbas, Jesper Andersson, Stefan Biffl, Premek Brada, Tomas Bures, Amleto Di Salle, Matthias Galster, Patricia Lago, Grace Lewis, Marin Litoiu, Angelika Musil, Juergen Musil, Panos Patros, Patrizio Pelliccione

Computing systems form the backbone of many areas in our society, from manufacturing to traffic control, healthcare, and financial systems. When software plays a vital role in the design, construction, and operation, these systems are referred to as software-intensive systems. Self-adaptation equips a software-intensive system with a feedback loop that either automates tasks that otherwise need to be performed by human operators or deals with uncertain conditions. Such feedback loops have found their way into a variety of practical applications; typical examples are an elastic cloud to adapt computing resources and automated server management to respond quickly to business needs. To gain insight into the motivations for applying self-adaptation in practice, the problems solved using self-adaptation and how these problems are solved, and the difficulties and risks that industry faces in adopting self-adaptation, we performed a large-scale survey. We received 184 valid responses from practitioners spread over 21 countries. Based on the analysis of the survey data, we provide an empirically grounded overview of the state of the practice in the application of self-adaptation. From that, we derive insights for researchers to check their current research against industrial needs, and for practitioners to compare their current practice in applying self-adaptation. These insights also provide opportunities for applying self-adaptation in practice and pave the way for future industry-research collaborations.

Citations: 0
Hierarchical Auto-Scaling Policies for Data Stream Processing on Heterogeneous Resources
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-05-16 DOI: https://dl.acm.org/doi/10.1145/3597435
Gabriele Russo Russo, Valeria Cardellini, Francesco Lo Presti

Data Stream Processing (DSP) applications analyze data flows in near real-time by means of operators, which process and transform incoming data. Operators handle high data rates by running parallel replicas across multiple processors and hosts. To guarantee consistent performance without wasting resources in the face of variable workloads, auto-scaling techniques have been studied to adapt operator parallelism at run-time. However, most of the effort has been spent under the assumption of homogeneous computing infrastructures, neglecting the complexity of modern environments.

We consider the problem of deciding both how many operator replicas should be executed and which types of computing nodes should be acquired. We devise heterogeneity-aware policies by means of a two-layered hierarchy of controllers. While application-level components steer the adaptation process for whole applications, aiming to guarantee user-specified requirements, lower-layer components control auto-scaling of single operators. We tackle the fundamental challenge of performance and workload uncertainty, exploiting Bayesian optimization and reinforcement learning to devise policies. The evaluation shows that our approach is able to meet users’ requirements in terms of response time and adaptation overhead, while minimizing the cost due to resource usage, outperforming state-of-the-art baselines. We also demonstrate how partial model information is exploited to reduce training time for learning-based controllers.
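
The two decisions named in this abstract (how many replicas, and on which node type) can be illustrated with a deliberately simple static calculation. This is only a sketch under stated assumptions: the node catalogue `NODE_TYPES`, its capacities and prices, and the function `cheapest_deployment` are hypothetical, and the rule below is a plain cost comparison, not the Bayesian-optimization or reinforcement-learning policies the authors describe.

```python
import math
from typing import Dict, Tuple

# Hypothetical catalogue: node type -> (tuples/s one replica sustains, cost per replica).
NODE_TYPES: Dict[str, Tuple[float, float]] = {
    "small": (100.0, 1.0),
    "large": (350.0, 3.0),
}


def cheapest_deployment(
    input_rate: float, node_types: Dict[str, Tuple[float, float]]
) -> Tuple[str, int, float]:
    """Pick one node type and a replica count that sustain `input_rate` at minimum cost.

    Assumes all replicas of an operator run on the same node type and that
    per-replica capacity is known exactly, which a real system cannot assume.
    """
    if not node_types:
        raise ValueError("no node types given")
    best = None
    for name, (capacity, price) in sorted(node_types.items()):
        # Enough replicas to absorb the input rate, and never fewer than one.
        replicas = max(1, math.ceil(input_rate / capacity))
        cost = replicas * price
        if best is None or cost < best[2]:
            best = (name, replicas, cost)
    return best


if __name__ == "__main__":
    for rate in (120.0, 700.0):
        print(rate, cheapest_deployment(rate, NODE_TYPES))
```

Even this toy version shows why heterogeneity matters: the cheapest node type changes with the workload, so a policy tuned for one node type can waste money as the input rate varies.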

Citations: 0
GLDAP: Global Dynamic Action Persistence Adaptation for Deep Reinforcement Learning
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-04-03 DOI: 10.1145/3590154
Junbo Tong, Daming Shi, Yi Liu, Wenhui Fan
In the implementation of deep reinforcement learning (DRL), action persistence strategies are often adopted so that agents maintain their actions for a fixed or variable number of steps. The choice of the persistence duration for agent actions usually has notable effects on the performance of reinforcement learning algorithms. To address the research gap in global dynamic optimal action persistence and its application in multi-agent systems, we propose a novel framework: global dynamic action persistence (GLDAP), which achieves global action persistence adaptation for deep reinforcement learning. We introduce a closed-loop method that is used to learn the estimated value and the corresponding policy of each candidate action persistence. Our experiments show that GLDAP achieves an average performance improvement of 2.5% to 90.7% and 3 to 20 times higher sampling efficiency over several baselines across various single-agent and multi-agent domains. We also validate the ability of GLDAP to determine the optimal action persistence through multiple experiments.
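
To make the notion of action persistence concrete, the following minimal sketch runs an episode in which the policy is queried once and its action is then held for a fixed number of steps. The toy environment `CounterEnv`, the policy `toward_target`, and the function `rollout_with_persistence` are hypothetical illustrations. The sketch shows fixed persistence only; it is not the GLDAP method, which adapts the persistence dynamically.

```python
from typing import Callable, Tuple


class CounterEnv:
    """Toy 1-D environment (hypothetical): the state is an integer position.

    An action of +1 or -1 moves the position; the reward is -|position - target|.
    The episode ends after `horizon` steps.
    """

    def __init__(self, target: int = 5, horizon: int = 12) -> None:
        self.target = target
        self.horizon = horizon
        self.position = 0
        self.t = 0

    def reset(self) -> int:
        self.position = 0
        self.t = 0
        return self.position

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.position += action
        self.t += 1
        reward = -abs(self.position - self.target)
        return self.position, float(reward), self.t >= self.horizon


def toward_target(state: int) -> int:
    """Hand-written policy: move up while below 5, otherwise move down."""
    return 1 if state < 5 else -1


def rollout_with_persistence(
    env: CounterEnv, policy: Callable[[int], int], persistence: int
) -> Tuple[float, int]:
    """Run one episode, querying `policy` only every `persistence` steps.

    Returns (total_reward, number_of_policy_decisions).
    """
    if persistence < 1:
        raise ValueError("persistence must be >= 1")
    state = env.reset()
    total_reward, decisions, done = 0.0, 0, False
    while not done:
        action = policy(state)  # one decision ...
        decisions += 1
        for _ in range(persistence):  # ... held for `persistence` steps
            state, reward, done = env.step(action)
            total_reward += reward
            if done:
                break
    return total_reward, decisions


if __name__ == "__main__":
    for k in (1, 3):
        print(k, rollout_with_persistence(CounterEnv(), toward_target, k))
```

Longer persistence means fewer policy decisions per episode, at the price of coarser control; that trade-off is why the choice of duration affects performance.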
Citations: 0
On Understanding Context Modelling for Adaptive Authentication Systems
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-03-31 DOI: https://dl.acm.org/doi/10.1145/3582696
Anne Bumiller, Stéphanie Challita, Benoit Combemale, Olivier Barais, Nicolas Aillery, Gael Le Lan

In many situations, it is of interest for authentication systems to adapt to context (e.g., when the user’s behavior differs from the previous behavior). Hence, representing the context with appropriate and well-designed models is crucial. We provide a comprehensive overview and analysis of research work on Context Modelling for Adaptive Authentication systems (CM4AA). To this end, we pursue three goals based on the Systematic Mapping Study (SMS) and Systematic Literature Review (SLR) research methodologies. We first present an SMS to structure the research area of CM4AA (goal 1). We complement the SMS with an SLR to gather and synthesise evidence about context information and its modelling for adaptive authentication systems (goal 2). From the knowledge gained from goal 2, we determine the desired properties of the context information model and its use for adaptive authentication systems (goal 3). Motivated to find out how to model context information for adaptive authentication, we provide a structured survey of the literature to date on CM4AA and a classification of existing proposals according to several analysis metrics. We demonstrate the ability to capture a common set of contextual features that are relevant for adaptive authentication systems independent of the application domain. We emphasise that despite the possibility of a unified framework, no standard for CM4AA exists.

Citations: 0
Model-driven Cluster Resource Management for AI Workloads in Edge Clouds
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-03-27 DOI: https://dl.acm.org/doi/10.1145/3582080
Qianlin Liang, Walid A. Hanafy, Ahmed Ali-Eldin, Prashant Shenoy

Since emerging edge applications such as Internet of Things (IoT) analytics and augmented reality have tight latency constraints, hardware AI accelerators have been recently proposed to speed up deep neural network (DNN) inference run by these applications. Resource-constrained edge servers and accelerators tend to be multiplexed across multiple IoT applications, introducing the potential for performance interference between latency-sensitive workloads. In this article, we design analytic models to capture the performance of DNN inference workloads on shared edge accelerators, such as GPU and edgeTPU, under different multiplexing and concurrency behaviors. After validating our models using extensive experiments, we use them to design various cluster resource management algorithms to intelligently manage multiple applications on edge accelerators while respecting their latency constraints. We implement a prototype of our system in Kubernetes and show that our system can host 2.3× more DNN applications in heterogeneous multi-tenant edge clusters with no latency violations when compared to traditional knapsack hosting algorithms.

Citations: 0
Distributed Size-constrained Clustering Algorithm for Modular Robot-based Programmable Matter
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-03-27 DOI: https://dl.acm.org/doi/10.1145/3580282
Jad Bassil, Abdallah Makhoul, Benoît Piranda, Julien Bourgeois

Modular robots are defined as autonomous kinematic machines with variable morphology. They are composed of several thousand or even millions of modules that are able to coordinate to behave intelligently. Clustering the modules in modular robots has many benefits, including scalability, energy efficiency, reduced communication delay, and an improved self-reconfiguration process, which focuses on finding a sequence of reconfiguration actions to convert robots from an initial shape to a goal one. The main idea of clustering is to divide the modules in an initial shape into a number of groups based on the final goal shape to enhance the self-reconfiguration process by allowing clusters to reconfigure in parallel. In this work, we prove that the size-constrained clustering problem is NP-complete, and we propose a new tree-based size-constrained clustering algorithm called “SC-Clust.” To show the efficiency of our approach, we implement and demonstrate our algorithm in simulation on networks of up to 30,000 modules and on the Blinky Blocks hardware with up to 144 modules.
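
As a rough illustration of what a tree-based, size-constrained clustering can look like, the sketch below partitions a rooted tree into connected groups of at most `max_size` nodes using a greedy bottom-up pass. It is a centralized toy under stated assumptions: the function `cluster_tree` and the adjacency format are hypothetical, and this is not the distributed SC-Clust algorithm, whose details the abstract does not give.

```python
from typing import Dict, List


def cluster_tree(
    children: Dict[int, List[int]], root: int, max_size: int
) -> List[List[int]]:
    """Greedily split a rooted tree into connected clusters of at most `max_size` nodes.

    `children` maps each node to its list of child nodes. Each node absorbs the
    still-open group of a child while the size limit allows; otherwise the
    child's group is closed as a finished cluster. Recursive, so it is only
    suitable for trees of modest depth.
    """
    if max_size < 1:
        raise ValueError("max_size must be >= 1")
    clusters: List[List[int]] = []

    def visit(node: int) -> List[int]:
        pending = [node]  # open group that always contains `node`
        for child in children.get(node, []):
            group = visit(child)  # open group that always contains `child`
            if len(pending) + len(group) <= max_size:
                pending.extend(group)  # stays connected via the node-child edge
            else:
                clusters.append(group)  # close the child's group
        return pending

    clusters.append(visit(root))
    return clusters


if __name__ == "__main__":
    tree = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
    print(cluster_tree(tree, root=0, max_size=3))
```

The greedy pass guarantees the size bound and connectivity but not a minimum number of clusters, which is consistent with the problem being NP-complete in general.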

Citations: 0
A Genetic Programming-based Framework for Semi-automated Multi-agent Systems Engineering
IF 2.7 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-03-02 DOI: 10.1145/3584731
Nicola Mc Donnell, J. Duggan, E. Howley
With the rise of new technologies, such as Edge computing, Internet of Things, Smart Cities, and Smart Grids, there is a growing need for multi-agent systems (MAS) approaches. Designing multi-agent systems is challenging, and doing this in an automated way is even more so. To address this, we propose a new framework, Evolved Gossip Contracts (EGC). It builds on Gossip Contracts (GC), a decentralised cooperation protocol that is used as the communication mechanism to facilitate self-organisation in a cooperative MAS. GC has several methods that are implemented uniquely, depending on the goal the MAS aims to achieve. The EGC framework uses evolutionary computing to search for the best implementation of these methods. To evaluate EGC, it was used to solve a classical NP-hard optimisation problem, the Bin Packing Problem (BPP). The experimental results show that EGC successfully discovered a decentralised strategy to solve the BPP, which is better than two classical heuristics on test cases similar to those on which it was trained; the improvement is statistically significant. EGC is the first framework that leverages evolutionary computing to semi-automate the discovery of a communication protocol for a MAS that has been shown to be effective at solving an NP-hard problem.
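
For readers unfamiliar with the Bin Packing Problem, the sketch below shows two well-known centralized heuristics, First Fit and First Fit Decreasing. The abstract does not say which two classical heuristics EGC was compared against, so treating these as the baselines is an assumption; the function names are illustrative.

```python
from typing import List


def first_fit(items: List[int], capacity: int) -> List[List[int]]:
    """Place each item into the first bin with enough free space, opening bins as needed."""
    bins: List[List[int]] = []
    free: List[int] = []  # remaining capacity of each open bin
    for size in items:
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds bin capacity {capacity}")
        for i, room in enumerate(free):
            if size <= room:
                bins[i].append(size)
                free[i] -= size
                break
        else:  # no existing bin had room
            bins.append([size])
            free.append(capacity - size)
    return bins


def first_fit_decreasing(items: List[int], capacity: int) -> List[List[int]]:
    """First Fit applied to the items sorted from largest to smallest."""
    return first_fit(sorted(items, reverse=True), capacity)


if __name__ == "__main__":
    sizes = [2, 5, 4, 7, 1, 3, 8]
    print(len(first_fit(sizes, 10)), "bins with First Fit")
    print(len(first_fit_decreasing(sizes, 10)), "bins with First Fit Decreasing")
```

Both heuristics need a global view of all bins, which is exactly what a decentralised multi-agent strategy cannot rely on.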
Citations: 0
Enforcing Resilience in Cyber-physical Systems via Equilibrium Verification at Runtime
IF 2.7 Zone 4 Computer Science Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-02-16 DOI: 10.1145/3584364
Matteo Camilli, R. Mirandola, P. Scandurra
Cyber-physical systems often operate in dynamic environments where unexpected events should be managed while guaranteeing acceptable behavior. Providing comprehensive evidence of their dependability under change remains a major open challenge. In this article, we exploit the notion of equilibrium, that is, the ability of the system to maintain acceptable behavior within its multidimensional viability zone, and propose RUNE² (RUNtime Equilibrium verification and Enforcement), an approach that verifies the equilibrium condition at runtime and forces the system to stay in its viability zone. RUNE² includes (i) a system specification that takes into account the uncertainties related to partial knowledge and possible changes; (ii) the computation of the equilibrium condition to define the boundaries of the viability zone; (iii) a runtime equilibrium verification method that leverages Bayesian inference to reason about the ability of the system to remain viable; and (iv) a resilience enforcement mechanism that exploits the posterior knowledge to steer the execution of the system inside the viability zone. We demonstrate both the benefits and the costs of the proposed approach through an empirical evaluation using two case studies and 24 systems synthetically generated from pseudo-random models of increasing structural complexity.
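As a rough illustration of how Bayesian inference can support a runtime viability check, the sketch below keeps a Beta-Bernoulli posterior over the probability that the system stays inside its viability zone. It is not the method from the paper: the class `ViabilityMonitor`, the uniform prior, and the 0.9 threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ViabilityMonitor:
    """Beta-Bernoulli estimate of the chance that the system stays viable.

    Hypothetical helper for illustration; the prior and threshold are assumed.
    """
    alpha: float = 1.0      # prior pseudo-count of in-zone observations
    beta: float = 1.0       # prior pseudo-count of out-of-zone observations
    threshold: float = 0.9  # assumed minimum acceptable viability probability

    def observe(self, in_zone: bool) -> None:
        # Conjugate update: each observation increments one pseudo-count.
        if in_zone:
            self.alpha += 1
        else:
            self.beta += 1

    def posterior_mean(self) -> float:
        # Mean of Beta(alpha, beta).
        return self.alpha / (self.alpha + self.beta)

    def needs_enforcement(self) -> bool:
        # Signal that corrective action is needed when the estimate drops.
        return self.posterior_mean() < self.threshold


monitor = ViabilityMonitor()
for in_zone in [True, True, True, False, True]:
    monitor.observe(in_zone)
print(round(monitor.posterior_mean(), 3), monitor.needs_enforcement())
```

Using the posterior mean keeps the sketch dependency-free; a fuller check might compare a posterior credible bound against the threshold instead.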
ACM Transactions on Autonomous and Adaptive Systems, Volume 18, Issue 1, pp. 1-32.
Citations: 3
Enforcing Resilience in Cyber-physical Systems via Equilibrium Verification at Runtime
IF 2.7 Zone 4 Computer Science Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2023-02-16 DOI: https://dl.acm.org/doi/10.1145/3584364
Matteo Camilli, Raffaela Mirandola, Patrizia Scandurra

Cyber-Physical Systems often operate in dynamic environments where unexpected events should be managed while guaranteeing acceptable behavior. Providing comprehensive evidence of their dependability under change remains a major open challenge. In this paper, we exploit the notion of equilibrium, that is, the ability of the system to maintain acceptable behavior within its multidimensional viability zone, and we propose RUNE² (RUNtime Equilibrium verification and Enforcement), an approach that verifies the equilibrium condition at runtime and forces the system to stay in its viability zone. RUNE² includes (i) a system specification that takes into account the uncertainties related to partial knowledge and possible changes; (ii) the computation of the equilibrium condition to define the boundaries of the viability zone; (iii) a runtime equilibrium verification method that leverages Bayesian inference to reason about the ability of the system to remain viable; and (iv) a resilience enforcement mechanism that exploits the posterior knowledge to steer the execution of the system inside the viability zone. We demonstrate both the benefits and the costs of the proposed approach through an empirical evaluation using two selected case studies and an additional 24 systems synthetically generated from pseudo-random models of increasing structural complexity.

Citations: 0