NextGen-Malloc: Giving Memory Allocator Its Own Room in the House

Ruihao Li, Qinzhe Wu, K. Kavi, Gayatri Mehta, N. Yadwadkar, L. John
{"title":"NextGen-Malloc: Giving Memory Allocator Its Own Room in the House","authors":"Ruihao Li, Qinzhe Wu, K. Kavi, Gayatri Mehta, N. Yadwadkar, L. John","doi":"10.1145/3593856.3595911","DOIUrl":null,"url":null,"abstract":"Memory allocation and management have a significant impact on performance and energy of modern applications. We observe that performance can vary by as much as 72% in some applications based on which memory allocator is used. Many current allocators are multi-threaded to support concurrent allocation requests from different threads. However, such multi-threading comes at the cost of maintaining complex metadata that is tightly coupled and intertwined with user data. When memory management functions and other user programs run on the same core, the metadata used by management functions may pollute the processor caches and other resources. In this paper, we make a case for offloading memory allocation (and other similar management functions) from main processing cores to other processing units to boost performance, reduce energy consumption, and customize services to specific applications or application domains. To offload these multi-threaded fine-granularity functions, we propose to decouple the metadata of these functions from the rest of application data to reduce the overhead of inter-thread metadata synchronization. We draw attention to the following key questions to realize this opportunity: (a) What are the tradeoffs and challenges in offloading memory allocation to a dedicated core? (b) Should we use general-purpose cores or special-purpose cores for executing critical system management functions? (c) Can this methodology apply to heterogeneous systems (e.g., with GPUs, accelerators) and other service functions as well?","PeriodicalId":330470,"journal":{"name":"Proceedings of the 19th Workshop on Hot Topics in Operating Systems","volume":"114 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th Workshop on Hot Topics in Operating Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3593856.3595911","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Memory allocation and management have a significant impact on performance and energy of modern applications. We observe that performance can vary by as much as 72% in some applications based on which memory allocator is used. Many current allocators are multi-threaded to support concurrent allocation requests from different threads. However, such multi-threading comes at the cost of maintaining complex metadata that is tightly coupled and intertwined with user data. When memory management functions and other user programs run on the same core, the metadata used by management functions may pollute the processor caches and other resources. In this paper, we make a case for offloading memory allocation (and other similar management functions) from main processing cores to other processing units to boost performance, reduce energy consumption, and customize services to specific applications or application domains. To offload these multi-threaded fine-granularity functions, we propose to decouple the metadata of these functions from the rest of application data to reduce the overhead of inter-thread metadata synchronization. We draw attention to the following key questions to realize this opportunity: (a) What are the tradeoffs and challenges in offloading memory allocation to a dedicated core? (b) Should we use general-purpose cores or special-purpose cores for executing critical system management functions? (c) Can this methodology apply to heterogeneous systems (e.g., with GPUs, accelerators) and other service functions as well?
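The abstract's central proposal can be made concrete with a small sketch. The code below is a minimal illustration, not the paper's system: it assumes an ordinary POSIX multicore and emulates the dedicated allocation unit with a service thread fed by a request mailbox. The names offloaded_malloc, alloc_req_t, submit, and take are hypothetical and introduced only for this example. The point it demonstrates is the one the abstract argues for: when only the service thread ever touches allocator state, application threads' caches hold user data rather than allocator metadata.

```c
/*
 * A minimal sketch, not the paper's implementation: it emulates offloaded
 * allocation on a stock POSIX multicore by routing requests from application
 * threads to one dedicated service thread (standing in for a dedicated core).
 * Allocator bookkeeping is only ever touched by that thread, so application
 * cores' caches hold user data rather than allocator metadata.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define QUEUE_CAP 64

typedef struct {
    size_t       size;    /* requested size; 0 is used as a shutdown sentinel */
    void        *result;  /* filled in by the service thread */
    atomic_bool  done;    /* completion flag polled by the requester */
} alloc_req_t;

/* Simplified mailbox: mutex-protected ring of request pointers.
 * Capacity/overflow checks are omitted for brevity. */
static struct {
    alloc_req_t    *slots[QUEUE_CAP];
    unsigned        head, tail;
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
} mailbox = { .lock = PTHREAD_MUTEX_INITIALIZER,
              .nonempty = PTHREAD_COND_INITIALIZER };

static void submit(alloc_req_t *r)
{
    pthread_mutex_lock(&mailbox.lock);
    mailbox.slots[mailbox.tail++ % QUEUE_CAP] = r;
    pthread_cond_signal(&mailbox.nonempty);
    pthread_mutex_unlock(&mailbox.lock);
}

static alloc_req_t *take(void)
{
    pthread_mutex_lock(&mailbox.lock);
    while (mailbox.head == mailbox.tail)
        pthread_cond_wait(&mailbox.nonempty, &mailbox.lock);
    alloc_req_t *r = mailbox.slots[mailbox.head++ % QUEUE_CAP];
    pthread_mutex_unlock(&mailbox.lock);
    return r;
}

/* The "allocator core": in a real system this thread would be pinned to its
 * own core and run a purpose-built allocator; here plain malloc stands in. */
static void *allocator_service(void *arg)
{
    (void)arg;
    for (;;) {
        alloc_req_t *r = take();
        if (r->size == 0)            /* shutdown sentinel */
            break;
        r->result = malloc(r->size);
        atomic_store_explicit(&r->done, true, memory_order_release);
    }
    return NULL;
}

/* Application-side stub: submit a request and wait for the service thread. */
static void *offloaded_malloc(size_t size)
{
    alloc_req_t req = { .size = size, .result = NULL };
    atomic_init(&req.done, false);
    submit(&req);
    while (!atomic_load_explicit(&req.done, memory_order_acquire))
        ;                            /* busy-wait; a real system would block */
    return req.result;
}

int main(void)
{
    pthread_t svc;
    pthread_create(&svc, NULL, allocator_service, NULL);

    char *buf = offloaded_malloc(128);
    snprintf(buf, 128, "buffer served by the dedicated allocator thread");
    printf("%s\n", buf);
    free(buf);                       /* frees could be routed the same way */

    alloc_req_t stop = { .size = 0 };
    atomic_init(&stop.done, false);
    submit(&stop);                   /* tell the service thread to exit */
    pthread_join(svc, NULL);
    return 0;
}
```

In the paper's setting the service role would run on a dedicated or special-purpose core rather than a peer thread, and the busy-wait completion check would be replaced by a lighter-weight mechanism; the sketch only shows how separating the metadata path from the data path changes what each core's caches hold.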