SELCC: Coherent Caching over Compute-Limited Disaggregated Memory

arXiv - CS - Emerging Technologies. Pub Date: 2024-09-03. DOI: arxiv-2409.02088

Ruihong Wang, Jianguo Wang, Walid G. Aref



Abstract

Disaggregating memory from compute offers the opportunity to better utilize stranded memory in data centers. To save on round-trip communication between the disaggregated memory and the compute nodes, it is important to cache data in the compute nodes and to maintain cache coherence across them. However, the limited computing power of disaggregated memory servers makes it challenging to maintain coherence among multiple compute-side caches over disaggregated shared memory. This paper introduces SELCC, a Shared-Exclusive Latch Cache Coherence protocol that maintains cache coherence without imposing any computational burden on the remote memory side. SELCC builds on a one-sided shared-exclusive latch protocol by introducing lazy latch release and invalidation messages among the compute nodes, so that it guarantees both data access atomicity and cache coherence. SELCC minimizes communication round-trips by embedding the IDs of the current cache copy holders into RDMA latch words, and it prioritizes local concurrency control over global concurrency control. We instantiate the SELCC protocol on compute-side caches, forming an abstraction layer over disaggregated memory. This abstraction layer provides main-memory-like APIs to upper-level applications, thus enabling existing data structures and algorithms to function over disaggregated memory with minimal code changes. To demonstrate the usability of SELCC, we implement a B-tree and three transaction concurrency control algorithms over SELCC's APIs. Micro-benchmark results show that the SELCC protocol achieves better performance than RPC-based cache-coherence protocols. Additionally, YCSB and TPC-C benchmarks indicate that applications over SELCC can achieve performance comparable or superior to that of competitors over disaggregated memory.
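The idea of embedding holder IDs into the latch word can be illustrated with a small local simulation. This is only a sketch under stated assumptions, not the paper's implementation: the 64-bit layout (an 8-bit exclusive-holder field above a 56-bit shared-holder bitmap), the `LatchWord` class, and all function names are hypothetical, and a Python lock stands in for the atomicity that one-sided RDMA compare-and-swap would provide. What it shows is that a failed compare-and-swap already returns the IDs of the nodes that would need an invalidation message, with no extra round trip and no work on the memory side.

```python
import threading

# Assumed layout (hypothetical, not taken from the paper):
# high 8 bits = exclusive holder id + 1 (0 means none), low 56 bits = shared-holder bitmap.
MAX_NODES = 56
READER_MASK = (1 << MAX_NODES) - 1
WRITER_SHIFT = MAX_NODES


class LatchWord:
    """Local stand-in for a 64-bit latch word in remote memory."""

    def __init__(self):
        self._value = 0
        self._mutex = threading.Lock()  # emulates the atomicity of a one-sided CAS

    def compare_and_swap(self, expected, new):
        """Return the old value; the swap took effect iff old == expected."""
        with self._mutex:
            old = self._value
            if old == expected:
                self._value = new
            return old


def holders_of(word):
    """Decode (exclusive holder or None, list of shared holders) from a latch word."""
    writer = word >> WRITER_SHIFT
    bits = word & READER_MASK
    readers = [n for n in range(MAX_NODES) if (bits >> n) & 1]
    return (writer - 1 if writer else None), readers


def try_acquire_exclusive(latch, node_id):
    """One CAS attempt. On failure, return the nodes to send invalidations to."""
    old = latch.compare_and_swap(0, (node_id + 1) << WRITER_SHIFT)
    if old == 0:
        return True, []
    writer, readers = holders_of(old)
    return False, ([writer] if writer is not None else readers)


def try_acquire_shared(latch, node_id):
    """Set this node's bitmap bit unless an exclusive holder is present."""
    expected = 0
    while True:
        old = latch.compare_and_swap(expected, expected | (1 << node_id))
        if old == expected:
            return True, []
        writer, _ = holders_of(old)
        if writer is not None:
            return False, [writer]
        expected = old  # other readers joined or left; retry with the fresh word


def release_shared(latch, node_id):
    """Clear this node's bit, e.g. after receiving an invalidation message."""
    expected = 1 << node_id
    while True:
        old = latch.compare_and_swap(expected, expected & ~(1 << node_id))
        if old == expected:
            return
        expected = old
```

In this sketch, lazy release would correspond to a node calling `release_shared` only when another node asks it to, rather than immediately after each access; the messaging between compute nodes is left out.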


