Scalable Data-structures with Hierarchical, Distributed Delegation

Proceedings of the 20th International Middleware Conference. Pub Date: 2019-12-09. DOI: 10.1145/3361525.3361537

Yuxin Ren, Gabriel Parmer


Citations: 3

Abstract

Scaling data-structures up to the increasing number of cores provided by modern systems is challenging. The quest for scalability is complicated by the non-uniform memory access (NUMA) of multi-socket machines, which often prohibits the effective use of data-structures that span memory localities. Conventional shared-memory data-structures using efficient non-blocking or lock-based implementations inevitably suffer from cache-coherency overheads and non-local memory accesses between sockets. Multi-socket systems are common in cloud hardware, and many products are pushing shared-memory systems to greater scales, making the ability to scale data-structures all the more pressing. In this paper, we present the Distributed, Delegated Parallel Sections (DPS) runtime system, which uses message-passing to move computation on portions of data-structures between memory localities, while leveraging efficient shared-memory implementations within each locality to harness parallelism. We show through a series of data-structure scalability evaluations, and through an adaptation of memcached, that DPS enables strong data-structure scalability. DPS improves throughput by more than a factor of 3.1 and decreases tail latency by 23x for memcached.
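
The general delegation idea described in the abstract can be illustrated with a minimal sketch. It is not the DPS runtime or its API: the names `Locality` and `DelegatedMap`, the hash-based partitioning, and the use of Python threads and queues as stand-ins for NUMA sockets and message channels are all assumptions made for illustration. Each locality owns one partition of a key-value map, operations on a key are sent as messages to the owning locality, and several workers inside that locality share the partition through an ordinary lock-protected structure.

```python
import queue
import threading


class Locality:
    """Hypothetical stand-in for one memory locality (e.g. one NUMA socket)."""

    def __init__(self, n_workers=2):
        self.store = {}               # this locality's partition of the data
        self.lock = threading.Lock()  # stand-in for a shared-memory structure
        self.inbox = queue.Queue()    # message channel for delegated operations
        self.workers = [
            threading.Thread(target=self._serve, daemon=True)
            for _ in range(n_workers)
        ]
        for worker in self.workers:
            worker.start()

    def _serve(self):
        # Workers in the same locality pull delegated operations in parallel.
        while True:
            msg = self.inbox.get()
            if msg is None:           # shutdown signal
                break
            op, key, value, reply = msg
            with self.lock:
                if op == "put":
                    self.store[key] = value
                    result = None
                else:                 # any other op is treated as "get"
                    result = self.store.get(key)
            reply.put(result)         # send the result back to the caller

    def shutdown(self):
        for _ in self.workers:
            self.inbox.put(None)
        for worker in self.workers:
            worker.join()


class DelegatedMap:
    """Key-value map whose partitions are each owned by one locality."""

    def __init__(self, n_localities=2):
        self.localities = [Locality() for _ in range(n_localities)]

    def _owner(self, key):
        # Assumed partitioning scheme: hash the key to pick the owner.
        return self.localities[hash(key) % len(self.localities)]

    def _delegate(self, op, key, value=None):
        reply = queue.Queue(maxsize=1)
        self._owner(key).inbox.put((op, key, value, reply))
        return reply.get()            # block until the owner replies

    def put(self, key, value):
        self._delegate("put", key, value)

    def get(self, key):
        return self._delegate("get", key)

    def close(self):
        for locality in self.localities:
            locality.shutdown()
```

In this simplified version every operation goes through a message, even when the caller would sit in the owning locality. A real system would presumably let local callers access the partition directly and reserve messages for cross-locality operations.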

