Qora: Neural-Enhanced Interference-Aware Resource Provisioning for Serverless Computing

IF 6.4 | JCR Q1 (Automation & Control Systems) | CAS Tier 2 (Computer Science)
IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 10609-10624. Published: 2025-01-15. DOI: 10.1109/TASE.2025.3526197 (https://ieeexplore.ieee.org/document/10843098/)
Ruifeng Ma; Yufeng Zhan; Chuge Wu; Zicong Hong; Yasir Ali; Yuanqing Xia
Citations: 0

Abstract

Serverless computing is an emerging cloud paradigm that offers fine-grained resource sharing through serverless functions. However, this sharing can cause interference, leading to performance degradation and QoS violations. Existing white-box approaches to serverless resource provisioning often demand extensive expert knowledge, which is difficult to obtain given the complexity of interference sources. This paper proposes Qora, a neural-enhanced, interference-aware resource provisioning system for serverless computing. We model the resource provisioning of serverless functions as a novel combinatorial optimization problem in which the constraints on queries per second are derived from a neural-network performance model. By leveraging neural networks to model the nonlinear performance fluctuations under various interference sources, our approach better captures the real-world behavior of serverless functions. To solve the formulated problem efficiently, rather than adopting a commercial optimization solver such as Gurobi, we propose a two-stage VNS algorithm that searches the discrete variables more efficiently and supports sigmoid activations without introducing redundant discrete variables. Unlike pure machine-learning methods, which lack theoretical optimality guarantees, our approach is rigorously proven globally optimal based on optimization theory. We implement Qora on Kubernetes as a serverless system that automates resource provisioning. Experimental results demonstrate that Qora reduces the QoS violation rate by 98% while cutting resource costs by up to 35% compared with the state of the art.

Note to Practitioners: From the perspective of cloud service providers, this paper considers automatic resource provisioning for serverless functions. To improve hardware utilization, cloud providers tend to co-locate serverless functions on the same server. However, co-located functions compete for shared resources (memory bandwidth, L3 cache, etc.), which causes interference and leads to performance degradation and QoS violations. We use neural networks to build performance models of interference-prone serverless functions and formulate the resource allocation optimization problem with these models as constraints. Compared to white-box modeling methods, our neural-network models adapt to complex and variable interference. Compared to deep reinforcement learning methods, our combinatorial optimization approach offers stronger interpretability. To solve this optimization problem efficiently, we design the two-stage VNS algorithm. We implement Qora on Kubernetes as a serverless system that automatically allocates computing resources. Experiments with small-scale real clusters and large-scale simulations demonstrate the effectiveness of Qora.
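To make the core idea concrete, the following is a minimal, hypothetical sketch of what "variable neighborhood search over discrete allocations, with a neural performance model as the QPS constraint" can look like. It is not the paper's actual two-stage VNS or its trained model: `predicted_qps` is a toy sigmoid stand-in for the learned performance model, and the resource units, prices, and search parameters are invented for illustration.

```python
import math
import random

def predicted_qps(alloc, interference=0.3):
    # Stand-in for a trained neural performance model: a sigmoid of the
    # total allocated resources, dampened by an interference level.
    score = sum(alloc) - 4.0 * interference
    return 100.0 / (1.0 + math.exp(-score))

def cost(alloc, prices=(1.0, 0.5)):
    # Resource cost: weighted sum of the (cpu, memory) units.
    return sum(p * a for p, a in zip(prices, alloc))

def neighbors(alloc, k):
    # k-th neighborhood: randomly perturb k coordinates by +/- 1 unit,
    # keeping every allocation at least 1.
    out = []
    for _ in range(20):
        cand = list(alloc)
        for i in random.sample(range(len(cand)), k):
            cand[i] = max(1, cand[i] + random.choice((-1, 1)))
        out.append(tuple(cand))
    return out

def vns(qps_target=90.0, k_max=2, iters=200, seed=0):
    # Minimize cost subject to the surrogate-model constraint
    # predicted_qps(alloc) >= qps_target, via basic VNS.
    random.seed(seed)
    best = (8, 8)  # generous, feasible initial allocation
    for _ in range(iters):
        k = 1
        while k <= k_max:
            feasible = [c for c in neighbors(best, k)
                        if predicted_qps(c) >= qps_target]
            if feasible:
                cand = min(feasible, key=cost)
                if cost(cand) < cost(best):
                    best, k = cand, 1  # improvement: restart from k = 1
                    continue
            k += 1  # no improvement: widen the neighborhood
    return best
```

The sketch preserves the structural point made in the abstract: because the constraint is an arbitrary (here sigmoid-shaped) learned function evaluated directly, the search needs no linearization of the network into auxiliary discrete variables, which is what an MILP encoding for a commercial solver would require.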
Source Journal
IEEE Transactions on Automation Science and Engineering (Engineering Technology: Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles per year: 404
Review time: 3.0 months
Journal Introduction: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.