Improving DRAM performance by parallelizing refreshes with accesses

2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA). Pub Date: 2014-06-19. DOI: 10.1109/HPCA.2014.6835946

K. Chang, Donghyuk Lee, Zeshan A. Chishti, Alaa R. Alameldeen, C. Wilkerson, Yoongu Kim, O. Mutlu


Citations: 208

Abstract

Modern DRAM cells are periodically refreshed to prevent data loss due to leakage. Commodity DDR (double data rate) DRAM refreshes cells at the rank level. This degrades performance significantly because it prevents an entire DRAM rank from serving memory requests while being refreshed. DRAM designed for mobile platforms, LPDDR (low power DDR) DRAM, supports an enhanced mode, called per-bank refresh, that refreshes cells at the bank level. This enables a bank to be accessed while another in the same rank is being refreshed, alleviating part of the negative performance impact of refreshes. Unfortunately, there are two shortcomings of per-bank refresh employed in today's systems. First, we observe that the per-bank refresh scheduling scheme does not exploit the full potential of overlapping refreshes with accesses across banks because it restricts the banks to be refreshed in a sequential round-robin order. Second, accesses to a bank that is being refreshed have to wait. To mitigate the negative performance impact of DRAM refresh, we propose two complementary mechanisms, DARP (Dynamic Access Refresh Parallelization) and SARP (Subarray Access Refresh Parallelization). The goal is to address the drawbacks of per-bank refresh by building more efficient techniques to parallelize refreshes and accesses within DRAM. First, instead of issuing per-bank refreshes in a round-robin order, as is done today, DARP issues per-bank refreshes to idle banks in an out-of-order manner. Furthermore, DARP proactively schedules refreshes during intervals when a batch of writes is draining to DRAM. Second, SARP exploits the existence of mostly-independent subarrays within a bank. With minor modifications to DRAM organization, it allows a bank to serve memory accesses to an idle subarray while another subarray is being refreshed. Extensive evaluations on a wide variety of workloads and systems show that our mechanisms improve system performance (and energy efficiency) compared to three state-of-the-art refresh policies, and the performance benefit increases as DRAM density increases.
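The difference between round-robin per-bank refresh and refreshing idle banks out of order can be illustrated with a small sketch. This is not the paper's algorithm: the function names, the `pending_requests` and `refreshes_owed` inputs, and the "most overdue idle bank first" tie-break are all assumptions made for illustration, and the sketch ignores refresh deadlines and the write-drain scheduling that the abstract also mentions.

```python
from typing import Optional, Sequence


def round_robin_pick(next_bank: int, num_banks: int) -> int:
    """Baseline: refresh banks in a fixed sequential order,
    regardless of whether the chosen bank has queued accesses."""
    return next_bank % num_banks


def out_of_order_pick(
    pending_requests: Sequence[int],
    refreshes_owed: Sequence[int],
) -> Optional[int]:
    """Pick an idle bank that still owes a refresh (hypothetical policy).

    pending_requests[b]: queued accesses to bank b (assumed input)
    refreshes_owed[b]:   refreshes bank b still needs (assumed input)
    Returns None when no idle bank owes a refresh.
    """
    candidates = [
        bank
        for bank, owed in enumerate(refreshes_owed)
        if owed > 0 and pending_requests[bank] == 0
    ]
    if not candidates:
        return None
    # Assumed tie-break: prefer the bank that owes the most refreshes.
    return max(candidates, key=lambda bank: refreshes_owed[bank])


if __name__ == "__main__":
    pending = [3, 0, 2, 0]  # banks 0 and 2 are busy
    owed = [1, 1, 1, 2]     # bank 3 is the most overdue
    print("round-robin refreshes bank", round_robin_pick(0, len(pending)))
    print("out-of-order refreshes bank", out_of_order_pick(pending, owed))
```

In this toy state the round-robin choice lands on a bank with queued accesses, which must then wait, while the out-of-order choice refreshes a bank that has nothing to serve.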

