SFMalloc: A Lock-Free and Mostly Synchronization-Free Dynamic Memory Allocator for Manycores

Sangmin Seo, Junghyun Kim, Jaejin Lee
{"title":"SFMalloc:一个无锁且无同步的多核动态内存分配器","authors":"Sangmin Seo, Junghyun Kim, Jaejin Lee","doi":"10.1109/PACT.2011.57","DOIUrl":null,"url":null,"abstract":"As parallel programming becomes the mainstream due to multicore processors, dynamic memory allocators used in C and C++ can suppress the performance of multi-threaded applications if they are not scalable. In this paper, we present a new dynamic memory allocator for multi-threaded applications. The allocator never uses any synchronization for common cases. It uses only lock-free synchronization mechanisms for uncommon cases. Each thread owns a private heap and handles memory requests on the heap. Our allocator is completely synchronization-free when a thread allocates a memory block and deal locates it by itself. Synchronization-free means that threads do not communicate with each other at all. On the other hand, if a thread allocates a block and another thread frees it, we use a lock-free stack to atomically add it to the owner thread's heap to avoid the memory blowup problem. Furthermore, our allocator exploits various memory block caching mechanisms to reduce the latency of memory management. Freed blocks or intermediate memory chunks are cached hierarchically in each thread's heap and they are used for future memory allocation. We compare the performance and scalability of our allocator to those of well-known existing multi-threaded memory allocators using eight benchmarks. Experimental results on a 48-core AMD system show that our approach achieves better performance than other allocators for all benchmarks and is highly scalable with a large number of threads.","PeriodicalId":106423,"journal":{"name":"2011 International Conference on Parallel Architectures and Compilation Techniques","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":"{\"title\":\"SFMalloc: A Lock-Free and Mostly Synchronization-Free Dynamic Memory Allocator for Manycores\",\"authors\":\"Sangmin Seo, Junghyun Kim, Jaejin Lee\",\"doi\":\"10.1109/PACT.2011.57\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As parallel programming becomes the mainstream due to multicore processors, dynamic memory allocators used in C and C++ can suppress the performance of multi-threaded applications if they are not scalable. In this paper, we present a new dynamic memory allocator for multi-threaded applications. The allocator never uses any synchronization for common cases. It uses only lock-free synchronization mechanisms for uncommon cases. Each thread owns a private heap and handles memory requests on the heap. Our allocator is completely synchronization-free when a thread allocates a memory block and deal locates it by itself. Synchronization-free means that threads do not communicate with each other at all. On the other hand, if a thread allocates a block and another thread frees it, we use a lock-free stack to atomically add it to the owner thread's heap to avoid the memory blowup problem. Furthermore, our allocator exploits various memory block caching mechanisms to reduce the latency of memory management. Freed blocks or intermediate memory chunks are cached hierarchically in each thread's heap and they are used for future memory allocation. We compare the performance and scalability of our allocator to those of well-known existing multi-threaded memory allocators using eight benchmarks. 
Experimental results on a 48-core AMD system show that our approach achieves better performance than other allocators for all benchmarks and is highly scalable with a large number of threads.\",\"PeriodicalId\":106423,\"journal\":{\"name\":\"2011 International Conference on Parallel Architectures and Compilation Techniques\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"32\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 International Conference on Parallel Architectures and Compilation Techniques\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PACT.2011.57\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Parallel Architectures and Compilation Techniques","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PACT.2011.57","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 32

Abstract

As parallel programming becomes mainstream due to multicore processors, dynamic memory allocators used in C and C++ can suppress the performance of multi-threaded applications if they are not scalable. In this paper, we present a new dynamic memory allocator for multi-threaded applications. The allocator never uses any synchronization for common cases. It uses only lock-free synchronization mechanisms for uncommon cases. Each thread owns a private heap and handles memory requests on the heap. Our allocator is completely synchronization-free when a thread allocates a memory block and deallocates it by itself. Synchronization-free means that threads do not communicate with each other at all. On the other hand, if a thread allocates a block and another thread frees it, we use a lock-free stack to atomically add it to the owner thread's heap to avoid the memory blowup problem. Furthermore, our allocator exploits various memory block caching mechanisms to reduce the latency of memory management. Freed blocks or intermediate memory chunks are cached hierarchically in each thread's heap and used for future memory allocation. We compare the performance and scalability of our allocator to those of well-known existing multi-threaded memory allocators using eight benchmarks. Experimental results on a 48-core AMD system show that our approach achieves better performance than the other allocators on all benchmarks and is highly scalable with a large number of threads.
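
The abstract describes two key mechanisms: a synchronization-free fast path in which each thread allocates and frees blocks on its own private heap, and a lock-free stack onto which other threads push blocks they free on behalf of the owner. The C++ sketch below illustrates how such a scheme can be structured; it is not the authors' implementation, and the names (BlockHeader, RemoteFreeStack, ThreadHeap) as well as the omission of size classes and chunk management are illustrative assumptions.

```cpp
#include <atomic>
#include <cstddef>

// Header assumed to live at the start of every free block so blocks can be
// linked into singly linked lists without any extra allocation.
struct BlockHeader {
    BlockHeader* next;
};

// Multi-producer / single-consumer lock-free stack: any thread may push a
// block it has freed remotely; only the owner thread ever drains it.
class RemoteFreeStack {
public:
    // Remote free: push the block with a CAS loop (Treiber-stack push).
    // Producers never pop individual nodes, so this push is ABA-safe.
    void push(BlockHeader* block) {
        BlockHeader* old_head = head_.load(std::memory_order_relaxed);
        do {
            block->next = old_head;
        } while (!head_.compare_exchange_weak(
            old_head, block,
            std::memory_order_release, std::memory_order_relaxed));
    }

    // Owner only: take the whole list in one atomic exchange so the blocks
    // can be moved into the private free list without further synchronization.
    BlockHeader* drain() {
        return head_.exchange(nullptr, std::memory_order_acquire);
    }

private:
    std::atomic<BlockHeader*> head_{nullptr};
};

// Per-thread heap: the common-case allocate/free path touches only
// thread-local state and therefore needs no atomic operations at all.
// (Size classes, chunk caching, and OS allocation are omitted.)
struct ThreadHeap {
    BlockHeader* local_free_list = nullptr;  // accessed only by the owner
    RemoteFreeStack remote_frees;            // pushed to by other threads

    void* allocate(std::size_t /*size*/) {
        if (local_free_list == nullptr) {
            // Slow path: reclaim blocks that other threads freed back to us.
            local_free_list = remote_frees.drain();
        }
        if (local_free_list == nullptr) {
            return nullptr;  // a real allocator would fetch a new chunk here
        }
        BlockHeader* b = local_free_list;
        local_free_list = b->next;
        return b;
    }

    // Owner thread frees its own block: plain pointer update, no atomics.
    void free_local(void* p) {
        auto* b = static_cast<BlockHeader*>(p);
        b->next = local_free_list;
        local_free_list = b;
    }
};
```

In this pattern the remote-free side uses a compare-and-swap loop, while the owner claims the entire list with a single atomic exchange; because producers only push and the single consumer takes everything at once, the ABA hazard of general lock-free stack pops does not arise, and the common local allocate/free path stays entirely free of atomic operations, consistent with the paper's "synchronization-free for common cases" claim.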