Efficient virtual memory for big memory servers

Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, M. Hill, M. Swift
{"title":"Efficient virtual memory for big memory servers","authors":"Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, M. Hill, M. Swift","doi":"10.1145/2485922.2485943","DOIUrl":null,"url":null,"abstract":"Our analysis shows that many \"big-memory\" server workloads, such as databases, in-memory caches, and graph analytics, pay a high cost for page-based virtual memory. They consume as much as 10% of execution cycles on TLB misses, even using large pages. On the other hand, we find that these workloads use read-write permission on most pages, are provisioned not to swap, and rarely benefit from the full flexibility of page-based virtual memory. To remove the TLB miss overhead for big-memory workloads, we propose mapping part of a process's linear virtual address space with a direct segment, while page mapping the rest of the virtual address space. Direct segments use minimal hardware---base, limit and offset registers per core---to map contiguous virtual memory regions directly to contiguous physical memory. They eliminate the possibility of TLB misses for key data structures such as database buffer pools and in-memory key-value stores. Memory mapped by a direct segment may be converted back to paging when needed. We prototype direct-segment software support for x86-64 in Linux and emulate direct-segment hardware. For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%.","PeriodicalId":20555,"journal":{"name":"Proceedings of the 40th Annual International Symposium on Computer Architecture","volume":"3 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"319","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 40th Annual International Symposium on Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2485922.2485943","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 319

Abstract

Our analysis shows that many "big-memory" server workloads, such as databases, in-memory caches, and graph analytics, pay a high cost for page-based virtual memory. They consume as much as 10% of execution cycles on TLB misses, even using large pages. On the other hand, we find that these workloads use read-write permission on most pages, are provisioned not to swap, and rarely benefit from the full flexibility of page-based virtual memory. To remove the TLB miss overhead for big-memory workloads, we propose mapping part of a process's linear virtual address space with a direct segment, while page mapping the rest of the virtual address space. Direct segments use minimal hardware---base, limit and offset registers per core---to map contiguous virtual memory regions directly to contiguous physical memory. They eliminate the possibility of TLB misses for key data structures such as database buffer pools and in-memory key-value stores. Memory mapped by a direct segment may be converted back to paging when needed. We prototype direct-segment software support for x86-64 in Linux and emulate direct-segment hardware. For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%.
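As a rough illustration only (not code from the paper's prototype), the per-core direct-segment check described above can be sketched as follows. The field names `base`, `limit`, and `offset` are assumptions standing in for the BASE, LIMIT, and OFFSET registers the abstract mentions; an address outside the segment falls back to conventional paging.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-core direct-segment state (names assumed).
 * BASE/LIMIT bound one contiguous virtual region; OFFSET relocates it
 * into contiguous physical memory, so no TLB lookup is involved. */
struct direct_segment {
    uint64_t base;    /* first virtual address covered by the segment */
    uint64_t limit;   /* first virtual address beyond the segment     */
    uint64_t offset;  /* physical address = virtual address + offset  */
};

/* Translate a virtual address: it hits the direct segment when
 * base <= va < limit; otherwise the caller uses the paging path. */
static bool ds_translate(const struct direct_segment *ds,
                         uint64_t va, uint64_t *pa)
{
    if (va >= ds->base && va < ds->limit) {
        *pa = va + ds->offset;   /* direct mapping, no TLB miss possible */
        return true;
    }
    return false;                /* fall back to page-table translation  */
}
```

In this sketch the check is a pair of comparisons and an add, which matches the abstract's point that the hardware cost is minimal; anything not covered by the segment (e.g., code, stacks, memory later converted back to paging) is still translated through the TLB and page tables.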