Michael Wawrzoniak, Rodrigo Bruno, Ana Klimovic, Gustavo Alonso
Serverless Function-as-a-Service (FaaS) platforms provide applications with resources that are highly elastic, quick to instantiate, accounted at fine granularity, and available without explicit runtime resource orchestration. This combination of core properties underpins the success and popularity of the serverless FaaS paradigm. However, these benefits are not available to most cloud applications because they are designed for networked virtual machine/container environments. Since such cloud applications cannot take advantage of the highly elastic resources of serverless platforms and require runtime orchestration systems to operate, they suffer from lower resource utilization, additional management complexity, and higher costs relative to their serverless FaaS counterparts. We propose Imaginary Machines, a new serverless model for cloud applications. This model (1) exposes the highly elastic resources of serverless platforms through the traditional network-of-hosts model that cloud applications expect, and (2) eliminates the need for explicit runtime orchestration by transparently managing application resources based on signals generated during application execution. With the Imaginary Machines model, unmodified cloud applications become serverless applications. While still based on the network-of-hosts model, they benefit from highly elastic resources and do not require runtime orchestration, just like their specialized serverless FaaS counterparts, promising increased resource utilization and reduced management costs.
{"title":"Imaginary Machines: A Serverless Model for Cloud Applications","authors":"Michael Wawrzoniak, Rodrigo Bruno, Ana Klimovic, Gustavo Alonso","doi":"arxiv-2407.00839","DOIUrl":"https://doi.org/arxiv-2407.00839","url":null,"abstract":"Serverless Function-as-a-Service (FaaS) platforms provide applications with\u0000resources that are highly elastic, quick to instantiate, accounted at fine\u0000granularity, and without the need for explicit runtime resource orchestration.\u0000This combination of the core properties underpins the success and popularity of\u0000the serverless FaaS paradigm. However, these benefits are not available to most\u0000cloud applications because they are designed for networked virtual\u0000machines/containers environments. Since such cloud applications cannot take\u0000advantage of the highly elastic resources of serverless and require run-time\u0000orchestration systems to operate, they suffer from lower resource utilization,\u0000additional management complexity, and costs relative to their FaaS serverless\u0000counterparts. We propose Imaginary Machines, a new serverless model for cloud applications.\u0000This model (1.) exposes the highly elastic resources of serverless platforms as\u0000the traditional network-of-hosts model that cloud applications expect, and (2.)\u0000it eliminates the need for explicit run-time orchestration by transparently\u0000managing application resources based on signals generated during cloud\u0000application executions. With the Imaginary Machines model, unmodified cloud\u0000applications become serverless applications. While still based on the\u0000network-of-host model, they benefit from the highly elastic resources and do\u0000not require runtime orchestration, just like their specialized serverless FaaS\u0000counterparts, promising increased resource utilization while reducing\u0000management costs.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael Wawrzoniak, Rodrigo Bruno, Ana Klimovic, Gustavo Alonso
Elasticity is a key property of cloud computing. However, elasticity is offered today at the granularity of virtual machines, which take tens of seconds to start. This is insufficient to react to load spikes and sudden failures in latency-sensitive applications, leading users to resort to expensive overprovisioning. Function-as-a-Service (FaaS) provides significantly higher elasticity than VMs, but comes coupled with an event-triggered programming model and a constrained execution environment that make it unsuitable for off-the-shelf applications. Previous work tries to overcome these obstacles but often requires re-architecting the applications. In this paper, we show how off-the-shelf applications can transparently benefit from ephemeral elasticity with FaaS. We built Boxer, an interposition layer spanning VMs and AWS Lambda that intercepts application execution and emulates the network-of-hosts environment applications expect when deployed in a conventional VM/container environment. The ephemeral elasticity of Boxer enables significant performance and cost savings for off-the-shelf applications, e.g., recovery times over 5x faster than EC2 instances and the ability to absorb load spikes comparably to overprovisioned EC2 VM instances.
{"title":"Boxer: FaaSt Ephemeral Elasticity for Off-the-Shelf Cloud Applications","authors":"Michael Wawrzoniak, Rodrigo Bruno, Ana Klimovic, Gustavo Alonso","doi":"arxiv-2407.00832","DOIUrl":"https://doi.org/arxiv-2407.00832","url":null,"abstract":"Elasticity is a key property of cloud computing. However, elasticity is\u0000offered today at the granularity of virtual machines, which take tens of\u0000seconds to start. This is insufficient to react to load spikes and sudden\u0000failures in latency sensitive applications, leading users to resort to\u0000expensive overprovisioning. Function-as-a-Service (FaaS) provides significantly\u0000higher elasticity than VMs, but comes coupled with an event-triggered\u0000programming model and a constrained execution environment that makes them\u0000unsuitable for off-the-shelf applications. Previous work tries to overcome\u0000these obstacles but often requires re-architecting the applications. In this\u0000paper, we show how off-the-shelf applications can transparently benefit from\u0000ephemeral elasticity with FaaS. We built Boxer, an interposition layer spanning\u0000VMs and AWS Lambda, that intercepts application execution and emulates the\u0000network-of-hosts environment that applications expect when deployed in a\u0000conventional VM/container environment. The ephemeral elasticity of Boxer\u0000enables significant performance and cost savings for off-the-shelf applications\u0000with, e.g., recovery times over 5x faster than EC2 instances and absorbing load\u0000spikes comparable to overprovisioned EC2 VM instances.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"213 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141518702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Service liquidity across edge-to-cloud or multi-cloud will serve as the cornerstone of the next generation of cloud computing systems (Cloud 2.0). Given that cloud-based services are predominantly containerized, an efficient and robust live container migration solution is required to accomplish service liquidity. In response to this growing requirement, in this research we leverage FastFreeze, a popular platform for process checkpoint/restore within a container, and promote it to a robust solution for end-to-end live migration of containerized services. In particular, we develop a new platform, called FastMig, that proactively controls the checkpoint/restore operations of FastFreeze, thereby allowing robust live migration of containerized services via standard HTTP interfaces. The proposed platform introduces post-checkpointing and pre-restoration operations to enhance migration robustness. Notably, the pre-restoration operation includes containerized service startup options, enabling warm restoration and reducing the migration downtime. In addition, we develop a method to make FastFreeze robust against failures that commonly happen during migration and even during the normal operation of a containerized service. Experimental results under real-world settings show that the migration downtime of a containerized service can be reduced by 30x compared to a deployment using the original FastFreeze. Moreover, we demonstrate that FastMig and the warm restoration method together can significantly mitigate the container startup overhead. Importantly, these improvements are achieved without any significant performance reduction and incur only a small resource usage overhead compared to bare (i.e., non-FastFreeze) containerized services.
{"title":"FastMig: Leveraging FastFreeze to Establish Robust Service Liquidity in Cloud 2.0","authors":"Sorawit Manatura, Thanawat Chanikaphon, Chantana Chantrapornchai, Mohsen Amini Salehi","doi":"arxiv-2407.00313","DOIUrl":"https://doi.org/arxiv-2407.00313","url":null,"abstract":"Service liquidity across edge-to-cloud or multi-cloud will serve as the\u0000cornerstone of the next generation of cloud computing systems (Cloud 2.0).\u0000Provided that cloud-based services are predominantly containerized, an\u0000efficient and robust live container migration solution is required to\u0000accomplish service liquidity. In a nod to this growing requirement, in this\u0000research, we leverage FastFreeze, a popular platform for process\u0000checkpoint/restore within a container, and promote it to be a robust solution\u0000for end-to-end live migration of containerized services. In particular, we\u0000develop a new platform, called FastMig that proactively controls the\u0000checkpoint/restore operations of FastFreeze, thereby, allowing for robust live\u0000migration of containerized services via standard HTTP interfaces. The proposed\u0000platform introduces post-checkpointing and pre-restoration operations to\u0000enhance migration robustness. Notably, the pre-restoration operation includes\u0000containerized service startup options, enabling warm restoration and reducing\u0000the migration downtime. In addition, we develop a method to make FastFreeze\u0000robust against failures that commonly happen during the migration and even\u0000during the normal operation of a containerized service. Experimental results\u0000under real-world settings show that the migration downtime of a containerized\u0000service can be reduced by 30X compared to the situation where the original\u0000FastFreeze was deployed for the migration. Moreover, we demonstrate that\u0000FastMig and warm restoration method together can significantly mitigate the\u0000container startup overhead. Importantly, these improvements are achieved\u0000without any significant performance reduction and only incurs a small resource\u0000usage overhead, compared to the bare (ie non-FastFreeze) containerized\u0000services.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141518703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
José L. Risco-Martín, David Atienza, J. Manuel Colmenar, Oscar Garnica
For the last thirty years, several Dynamic Memory Managers (DMMs) have been proposed, including first-fit, best-fit, segregated-fit and buddy systems. Since the performance, memory usage and energy consumption of each DMM differ, software engineers often face difficult choices in selecting the most suitable approach for their applications. This issue has special impact in the field of portable consumer embedded systems, which must execute a limited set of multimedia applications (e.g., 3D games, video players and signal processing software), demanding high performance and extensive memory usage at low energy consumption. Recently, we developed a novel methodology based on genetic programming to automatically design custom DMMs, optimizing performance, memory usage and energy consumption. However, although this process is automatic and faster than state-of-the-art optimizations, it demands intensive computation, resulting in a time-consuming process. Thus, parallel processing can be very useful to explore more solutions in the same time, as well as to implement new algorithms. In this paper we present a novel parallel evolutionary algorithm for DMM optimization in embedded systems, based on the Discrete Event System Specification (DEVS) formalism over a Service Oriented Architecture (SOA) framework. Parallelism significantly improves the performance of the sequential exploration algorithm. On the one hand, when the number of generations is the same in both approaches, our parallel optimization framework is able to reach a speed-up of 86.40x when compared with other state-of-the-art approaches. On the other hand, it improves the global quality (i.e., level of performance, low memory usage and low energy consumption) of the final DMM obtained by 36.36% with respect to two well-known general-purpose DMMs and two state-of-the-art optimization methodologies.
{"title":"A parallel evolutionary algorithm to optimize dynamic memory managers in embedded systems","authors":"José L. Risco-Martín, David Atienza, J. Manuel Colmenar, Oscar Garnica","doi":"arxiv-2407.09555","DOIUrl":"https://doi.org/arxiv-2407.09555","url":null,"abstract":"For the last thirty years, several Dynamic Memory Managers (DMMs) have been\u0000proposed. Such DMMs include first fit, best fit, segregated fit and buddy\u0000systems. Since the performance, memory usage and energy consumption of each DMM\u0000differs, software engineers often face difficult choices in selecting the most\u0000suitable approach for their applications. This issue has special impact in the\u0000field of portable consumer embedded systems, that must execute a limited amount\u0000of multimedia applications (e.g., 3D games, video players and signal processing\u0000software, etc.), demanding high performance and extensive memory usage at a low\u0000energy consumption. Recently, we have developed a novel methodology based on\u0000genetic programming to automatically design custom DMMs, optimizing\u0000performance, memory usage and energy consumption. However, although this\u0000process is automatic and faster than state-of-the-art optimizations, it demands\u0000intensive computation, resulting in a time consuming process. Thus, parallel\u0000processing can be very useful to enable to explore more solutions spending the\u0000same time, as well as to implement new algorithms. In this paper we present a\u0000novel parallel evolutionary algorithm for DMMs optimization in embedded\u0000systems, based on the Discrete Event Specification (DEVS) formalism over a\u0000Service Oriented Architecture (SOA) framework. Parallelism significantly\u0000improves the performance of the sequential exploration algorithm. On the one\u0000hand, when the number of generations are the same in both approaches, our\u0000parallel optimization framework is able to reach a speed-up of 86.40x when\u0000compared with other state-of-the-art approaches. On the other, it improves the\u0000global quality (i.e., level of performance, low memory usage and low energy\u0000consumption) of the final DMM obtained in a 36.36% with respect to two\u0000well-known general-purpose DMMs and two state-of-the-art optimization\u0000methodologies.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141718063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Till Smejkal, Robert Khasanov, Jeronimo Castrillon, Hermann Härtig
Energy efficiency has become a key concern in modern computing. Major processor vendors now offer heterogeneous architectures that combine powerful cores with energy-efficient ones, such as Intel P/E systems, Apple M1 chips, and Samsung's Exynos CPUs. However, apart from simple cost-based thread allocation strategies, today's OS schedulers do not fully exploit these systems' potential for adaptive energy-efficient computing. This is, in part, due to missing application-level interfaces for passing information about task-level energy consumption and application-level elasticity. This paper presents E-Mapper, a novel resource management approach integrated into Linux for improved execution on heterogeneous processors. In E-Mapper, we base resource allocation decisions on high-level application descriptions that users can attach to programs or that the system can learn automatically at runtime. Our approach supports various programming models including OpenMP, Intel TBB, and TensorFlow. Crucially, E-Mapper leverages this information to extend beyond existing thread-to-core allocation strategies by actively managing application configurations through a novel uniform application-resource manager interface. By doing so, E-Mapper achieves substantial improvements in both performance and energy efficiency, particularly in multi-application scenarios. On an Intel Raptor Lake and an Arm big.LITTLE system, E-Mapper reduces application execution time by 20% on average, with an average reduction in energy consumption of 34%. We argue that our solution marks a crucial step toward a generic approach for sustainable and efficient computing across different processor architectures.
{"title":"E-Mapper: Energy-Efficient Resource Allocation for Traditional Operating Systems on Heterogeneous Processors","authors":"Till Smejkal, Robert Khasanov, Jeronimo Castrillon, Hermann Härtig","doi":"arxiv-2406.18980","DOIUrl":"https://doi.org/arxiv-2406.18980","url":null,"abstract":"Energy efficiency has become a key concern in modern computing. Major\u0000processor vendors now offer heterogeneous architectures that combine powerful\u0000cores with energy-efficient ones, such as Intel P/E systems, Apple M1 chips,\u0000and Samsungs Exyno's CPUs. However, apart from simple cost-based thread\u0000allocation strategies, today's OS schedulers do not fully exploit these\u0000systems' potential for adaptive energy-efficient computing. This is, in part,\u0000due to missing application-level interfaces to pass information about\u0000task-level energy consumption and application-level elasticity. This paper\u0000presents E-Mapper, a novel resource management approach integrated into Linux\u0000for improved execution on heterogeneous processors. In E-Mapper, we base\u0000resource allocation decisions on high-level application descriptions that user\u0000can attach to programs or that the system can learn automatically at runtime.\u0000Our approach supports various programming models including OpenMP, Intel TBB,\u0000and TensorFlow. Crucially, E-Mapper leverages this information to extend beyond\u0000existing thread-to-core allocation strategies by actively managing application\u0000configurations through a novel uniform application-resource manager interface.\u0000By doing so, E-Mapper achieves substantial enhancements in both performance and\u0000energy efficiency, particularly in multi-application scenarios. On an Intel\u0000Raptor Lake and an Arm big.LITTLE system, E-Mapper reduces the application\u0000execution on average by 20 % with an average reduction in energy consumption of\u000034 %. We argue that our solution marks a crucial step toward creating a generic\u0000approach for sustainable and efficient computing across different processor\u0000architectures.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"161 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
End-users can get functions-as-a-service from serverless platforms, which promise lower hosting costs, high availability, fault tolerance, and dynamic flexibility for hosting individual functions known as microservices. Machine learning tools have proven reliably useful, and services built with them are in increasing demand at large scale. Serverless platforms are well suited to hosting such machine learning services for large-scale applications: they are known for their cost efficiency, fault tolerance, resource scaling, robust communication APIs, and global reach. However, these platforms were originally designed to host web services, and machine learning workloads differ from web services. Our study aims to understand how serverless platforms handle machine learning workloads. We examine machine learning performance on one such platform, Google Cloud Run, a GPU-less infrastructure that was not designed for machine learning application deployment.
{"title":"Evaluating Serverless Machine Learning Performance on Google Cloud Run","authors":"Prerana Khatiwada, Pranjal Dhakal","doi":"arxiv-2406.16250","DOIUrl":"https://doi.org/arxiv-2406.16250","url":null,"abstract":"End-users can get functions-as-a-service from serverless platforms, which\u0000promise lower hosting costs, high availability, fault tolerance, and dynamic\u0000flexibility for hosting individual functions known as microservices. Machine\u0000learning tools are seen to be reliably useful, and the services created using\u0000these tools are in increasing demand on a large scale. The serverless platforms\u0000are uniquely suited for hosting these machine learning services to be used for\u0000large-scale applications. These platforms are well known for their cost\u0000efficiency, fault tolerance, resource scaling, robust APIs for communication,\u0000and global reach. However, machine learning services are different from the\u0000web-services in that these serverless platforms were originally designed to\u0000host web services. We aimed to understand how these serverless platforms handle\u0000machine learning workloads with our study. We examine machine learning\u0000performance on one of the serverless platforms - Google Cloud Run, which is a\u0000GPU-less infrastructure that is not designed for machine learning application\u0000deployment.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
José L. Risco-Martín, J. Manuel Colmenar, David Atienza, J. Ignacio Hidalgo
For the last thirty years, a large variety of memory allocators have been proposed. Since the performance, memory usage and energy consumption of each memory allocator differ, software engineers often face difficult choices in selecting the most suitable approach for their applications. To this end, custom allocators are developed from scratch, which is a difficult and error-prone process. This issue has special impact in the field of portable consumer embedded systems, which must execute a limited set of multimedia applications demanding high performance and extensive memory usage at low energy consumption. This paper presents a flexible and efficient simulator to study Dynamic Memory Managers (DMMs), a composition of one or more memory allocators. This novel approach allows programmers to simulate custom and general DMMs, which can be composed without incurring any additional runtime overhead or additional programming cost. We show that this infrastructure simplifies DMM construction, mainly because the target application does not need to be recompiled every time a new DMM must be evaluated, and because we propose a structured method to search for and build DMMs in an object-oriented fashion. Within a search procedure, the system designer can choose the "best" allocator by simulation for a particular target application and embedded system. In our evaluation, we show that our scheme delivers better performance, lower memory usage and lower energy consumption than single memory allocators.
{"title":"Simulation of high-performance memory allocators","authors":"José L. Risco-Martín, J. Manuel Colmenar, David Atienza, J. Ignacio Hidalgo","doi":"arxiv-2406.15776","DOIUrl":"https://doi.org/arxiv-2406.15776","url":null,"abstract":"For the last thirty years, a large variety of memory allocators have been\u0000proposed. Since performance, memory usage and energy consumption of each memory\u0000allocator differs, software engineers often face difficult choices in selecting\u0000the most suitable approach for their applications. To this end, custom\u0000allocators are developed from scratch, which is a difficult and error-prone\u0000process. This issue has special impact in the field of portable consumer\u0000embedded systems, that must execute a limited amount of multimedia\u0000applications, demanding high performance and extensive memory usage at a low\u0000energy consumption. This paper presents a flexible and efficient simulator to\u0000study Dynamic Memory Managers (DMMs), a composition of one or more memory\u0000allocators. This novel approach allows programmers to simulate custom and\u0000general DMMs, which can be composed without incurring any additional runtime\u0000overhead or additional programming cost. We show that this infrastructure\u0000simplifies DMM construction, mainly because the target application does not\u0000need to be compiled every time a new DMM must be evaluated and because we\u0000propose a structured method to search and build DMMs in an object-oriented\u0000fashion. Within a search procedure, the system designer can choose the \"best\"\u0000allocator by simulation for a particular target application and embedded\u0000system. In our evaluation, we show that our scheme delivers better performance,\u0000less memory usage and less energy consumption than single memory allocators.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hayley LeBlanc, Nathan Taylor, James Bornholt, Vijay Chidambaram
This work introduces a new approach to building crash-safe file systems for persistent memory. We exploit the fact that Rust's typestate pattern allows compile-time enforcement of a specific order of operations. We introduce a novel crash-consistency mechanism, Synchronous Soft Updates, that reduces crash safety to enforcing ordering among updates to file-system metadata. We employ this approach to build SquirrelFS, a new file system with crash-consistency guarantees that are checked at compile time. SquirrelFS avoids the need for separate proofs, instead incorporating correctness guarantees into the typestate itself. Compiling SquirrelFS takes only tens of seconds; successful compilation indicates crash consistency, while an error provides a starting point for fixing the bug. We evaluate SquirrelFS against state-of-the-art file systems such as NOVA and WineFS, and find that SquirrelFS achieves similar or better performance on a wide range of benchmarks and applications.
{"title":"SquirrelFS: using the Rust compiler to check file-system crash consistency","authors":"Hayley LeBlanc, Nathan Taylor, James Bornholt, Vijay Chidambaram","doi":"arxiv-2406.09649","DOIUrl":"https://doi.org/arxiv-2406.09649","url":null,"abstract":"This work introduces a new approach to building crash-safe file systems for\u0000persistent memory. We exploit the fact that Rust's typestate pattern allows\u0000compile-time enforcement of a specific order of operations. We introduce a\u0000novel crash-consistency mechanism, Synchronous Soft Updates, that boils down\u0000crash safety to enforcing ordering among updates to file-system metadata. We\u0000employ this approach to build SquirrelFS, a new file system with\u0000crash-consistency guarantees that are checked at compile time. SquirrelFS\u0000avoids the need for separate proofs, instead incorporating correctness\u0000guarantees into the typestate itself. Compiling SquirrelFS only takes tens of\u0000seconds; successful compilation indicates crash consistency, while an error\u0000provides a starting point for fixing the bug. We evaluate SquirrelFS against\u0000state of the art file systems such as NOVA and WineFS, and find that SquirrelFS\u0000achieves similar or better performance on a wide range of benchmarks and\u0000applications.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"175 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alex Wollman (Dakota State University), John Hastings (Dakota State University)
Unikernels, an evolution of LibOSs, are emerging as a virtualization technology to rival those currently used by cloud providers. Unikernels combine the user and kernel space into one "uni"fied memory space and omit functionality that is not necessary for their application to run, thus drastically reducing the required resources. The removed functionality, however, is far-reaching and includes components that have become common security technologies, such as Address Space Layout Randomization (ASLR), Data Execution Prevention (DEP), and non-executable bits (NX bits). This raises questions about the real-world security of unikernels. This research presents a quantitative methodology using TF-IDF to analyze the focus of security discussions within the unikernel research literature. Based on a corpus of 33 unikernel-related papers spanning 2013-2023, our analysis found that Memory Protection Extensions and Data Execution Prevention were the least frequently occurring topics, while SGX was the most frequent topic. The findings quantify priorities and assumptions in unikernel security research, bringing to light potential risks from underexplored attack surfaces. The quantitative approach is broadly applicable for revealing trends and gaps in niche security domains.
{"title":"A Survey of Unikernel Security: Insights and Trends from a Quantitative Analysis","authors":"Alex WollmanDakota State University, John HastingsDakota State University","doi":"arxiv-2406.01872","DOIUrl":"https://doi.org/arxiv-2406.01872","url":null,"abstract":"Unikernels, an evolution of LibOSs, are emerging as a virtualization\u0000technology to rival those currently used by cloud providers. Unikernels combine\u0000the user and kernel space into one \"uni\"fied memory space and omit\u0000functionality that is not necessary for its application to run, thus\u0000drastically reducing the required resources. The removed functionality however\u0000is far-reaching and includes components that have become common security\u0000technologies such as Address Space Layout Randomization (ASLR), Data Execution\u0000Prevention (DEP), and Non-executable bits (NX bits). This raises questions\u0000about the real-world security of unikernels. This research presents a\u0000quantitative methodology using TF-IDF to analyze the focus of security\u0000discussions within unikernel research literature. Based on a corpus of 33\u0000unikernel-related papers spanning 2013-2023, our analysis found that Memory\u0000Protection Extensions and Data Execution Prevention were the least frequently\u0000occurring topics, while SGX was the most frequent topic. The findings quantify\u0000priorities and assumptions in unikernel security research, bringing to light\u0000potential risks from underexplored attack surfaces. The quantitative approach\u0000is broadly applicable for revealing trends and gaps in niche security domains.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141257918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FPGA accelerator devices have emerged as a powerful platform for implementing high-performance and scalable solutions in a wide range of industries, leveraging their reconfigurability and virtualization capabilities. Virtualization, in particular, offers several benefits including improved security through resource isolation and sharing, and SR-IOV is the main solution for enabling it on FPGAs. This paper introduces the SR-IOV Virtual Function Framework (SVFF), a solution that aims to simplify and enhance the management of Virtual Functions (VFs) on PCIe-attached FPGA devices in Linux and QEMU/KVM environments, addressing the lack of SR-IOV reconfiguration support on guests. The framework leverages the SR-IOV support in the Xilinx Queue-based Direct Memory Access (QDMA) to automate the creation, attachment, detachment, and reconfiguration of VFs to different Virtual Machines (VMs). A novel pause functionality for the VFIO device has been implemented in QEMU to enable the detachment of VFs from the host without detaching them from the guest, making VF reconfiguration transparent, and free of performance loss, for guests that already have a VF attached to them. The proposed solution offers the ability to automatically and seamlessly assign a set of VFs to different VMs and adjust the configuration on the fly. Thanks to the pause functionality, it also offers the ability to attach additional VFs to new VMs without affecting devices already attached to other VMs.
{"title":"SVFF: An Automated Framework for SR-IOV Virtual Function Management in FPGA Accelerated Virtualized Environments","authors":"Stefano Cirici, Michele Paolino, Daniel Raho","doi":"arxiv-2406.01225","DOIUrl":"https://doi.org/arxiv-2406.01225","url":null,"abstract":"FPGA accelerator devices have emerged as a powerful platform for implementing\u0000high-performance and scalable solutions in a wide range of industries,\u0000leveraging their reconfigurability and virtualization capabilities.\u0000Virtualization, in particular, offers several benefits including improved\u0000security by resource isolation and sharing, and SR-IOV is the main solution for\u0000enabling it on FPGAs. This paper introduces the SR-IOV Virtual Function Framework (SVFF), a\u0000solution that aims to simplify and enhance the management of Virtual Functions\u0000(VFs) on PCIe-attached FPGA devices in Linux and QEMU/KVM environments, solving\u0000the lack of SR-IOV re-configuration support on guests. The framework leverages\u0000the SR-IOV support in the Xilinx Queue-based Direct Memory Access (QDMA) to\u0000automate the creation, attachment, detachment, and reconfiguration of VFs to\u0000different Virtual Machines (VMs). A novel pause functionality for the VFIO\u0000device has been implemented in QEMU to enable the detachment of VFs from the\u0000host without detaching them from the guest, making reconfiguration of VFs\u0000transparent for guests that already have a VF attached to them without any\u0000performance loss. The proposed solution offers the ability to automatically and\u0000seamlessly assign a set of VFs to different VMs and adjust the configuration on\u0000the fly. Thanks to the pause functionality, it also offers the ability to\u0000attach additional VFs to new VMs without affecting devices already attached to\u0000other VMs.","PeriodicalId":501333,"journal":{"name":"arXiv - CS - Operating Systems","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141257833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}