Scalable Data-structures with Hierarchical, Distributed Delegation

Yuxin Ren, Gabriel Parmer
Published in: Proceedings of the 20th International Middleware Conference, 2019-12-09
DOI: 10.1145/3361525.3361537 (https://doi.org/10.1145/3361525.3361537)
Citations: 3

Abstract

Scaling data-structures up to the increasing number of cores provided by modern systems is challenging. The quest for scalability is complicated by the non-uniform memory access (NUMA) of multi-socket machines, which often prohibits the effective use of data-structures that span memory localities. Conventional shared-memory data-structures, whether they use efficient non-blocking or lock-based implementations, inevitably suffer from cache-coherency overheads and from non-local memory accesses between sockets. Multi-socket systems are common in cloud hardware, and many products are pushing shared-memory systems to greater scales, which makes the ability to scale data-structures all the more pressing. In this paper, we present the Distributed, Delegated Parallel Sections (DPS) runtime system, which uses message-passing to move computation on portions of data-structures between memory localities, while leveraging efficient shared-memory implementations within each locality to harness parallelism. We show, through a series of data-structure scalability evaluations and through an adaptation of memcached, that DPS enables strong data-structure scalability. For memcached, DPS improves throughput by more than a factor of 3.1 and decreases tail latency by 23x.
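
The delegation idea in the abstract can be illustrated with a minimal sketch. This is not the DPS API or the authors' implementation: the names `Locality`, `DelegatedMap`, `execute`, and `owner` are hypothetical, Python threads stand in for cores, and a plain lock stands in for the efficient shared-memory data-structure used inside each locality. The sketch only shows the routing rule: an operation on a locally owned partition runs directly on shared memory, while an operation on a remotely owned partition is sent as a message and executed by a thread in the owning locality.

```python
import queue
import threading
from concurrent.futures import Future


class Locality:
    """Hypothetical stand-in for one memory locality (e.g. one NUMA socket).

    It owns one partition of a key-value map and a server thread that
    executes operations delegated from other localities.
    """

    def __init__(self, ident):
        self.ident = ident
        self.data = {}                # the partition owned by this locality
        self.lock = threading.Lock()  # shared-memory synchronization inside the locality
        self.inbox = queue.Queue()    # messages delegated from other localities
        self.server = threading.Thread(target=self._serve, daemon=True)
        self.server.start()

    def apply(self, op, key, value=None):
        """Run an operation on the local partition through shared memory."""
        with self.lock:
            if op == "set":
                self.data[key] = value
                return None
            if op == "get":
                return self.data.get(key)
            raise ValueError(f"unknown operation: {op}")

    def _serve(self):
        """Execute delegated operations until a None sentinel arrives."""
        while True:
            msg = self.inbox.get()
            if msg is None:
                break
            op, key, value, reply = msg
            try:
                reply.set_result(self.apply(op, key, value))
            except Exception as exc:
                reply.set_exception(exc)

    def stop(self):
        self.inbox.put(None)
        self.server.join()


class DelegatedMap:
    """Key-value map partitioned across localities (assumed hash partitioning)."""

    def __init__(self, num_localities):
        self.localities = [Locality(i) for i in range(num_localities)]

    def owner(self, key):
        return self.localities[hash(key) % len(self.localities)]

    def execute(self, caller_locality, op, key, value=None):
        target = self.owner(key)
        if target.ident == caller_locality:
            # Local partition: operate directly on shared memory.
            return target.apply(op, key, value)
        # Remote partition: move the computation to the data by message.
        reply = Future()
        target.inbox.put((op, key, value, reply))
        return reply.result()

    def close(self):
        for locality in self.localities:
            locality.stop()
```

A caller in locality 0 would write `m.execute(0, "set", key, value)`; whether the call runs locally or is delegated depends only on which locality owns `key`. The sketch blocks the caller until the reply arrives, which is the simplest choice for illustration and says nothing about how DPS schedules or batches its messages.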