Understanding the Dynamic Caches on Intel Processors: Methods and Applications

2014 12th IEEE International Conference on Embedded and Ubiquitous Computing. Pub Date: 2014-08-26. DOI: 10.1109/EUC.2014.18

Yi Zhang, Nan Guan, W. Yi


Citations: 9

Abstract

The design and implementation of caches on a given platform have a significant impact on many areas of computer system design. On chip multiprocessors (CMPs), new cache architectures have been proposed to meet rapidly increasing performance requirements. However, the cache architectures of commercial processors are usually not well documented. This makes it difficult to understand precisely how many processor components work, not only the cache itself but also related components such as the memory subsystem as a whole. This paper aims to disclose the working principles of the last-level cache of Intel Ivy Bridge processors. First, we identify the address translation logic of this cache. Second, we disclose its replacement policy. It is a dynamic insertion replacement policy, which differs substantially from the widely used LRU policy and its variants. Although this replacement policy has been proposed in the academic literature, our work is the first to show that it is actually used in commercial processors. To show the significance of our discovery, we design a methodology for generating controllable cache miss sequences on this cache and apply it to the design of a benchmark that models memory concurrency. Evaluations on physical machines show the effectiveness of the proposed method.
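The abstract does not spell out the replacement policy itself. As a rough illustration of why the insertion position can matter, the sketch below simulates a single cache set and compares classic LRU (new lines inserted at the most-recently-used position) with a variant that inserts new lines at the least-recently-used position. This is a toy model under stated assumptions, not the mechanism the paper measures on Ivy Bridge: the function name `simulate`, the parameter `insert_at_mru`, the 4-way set, and the cyclic access pattern are all hypothetical choices made for illustration.

```python
def simulate(accesses, ways, insert_at_mru):
    """Count misses for one cache set (toy model, not Intel's actual policy).

    lines[0] is the most-recently-used (MRU) position and
    lines[-1] is the least-recently-used (LRU) position.
    """
    lines = []
    misses = 0
    for addr in accesses:
        if addr in lines:
            # Hit: promote the line to the MRU position.
            lines.remove(addr)
            lines.insert(0, addr)
            continue
        misses += 1
        if len(lines) == ways:
            lines.pop()  # Evict the line in the LRU position.
        if insert_at_mru:
            lines.insert(0, addr)  # Classic LRU insertion.
        else:
            lines.append(addr)  # Insert at the LRU position instead.
    return misses


# Hypothetical workload: cycle over 5 blocks in a 4-way set, 10 times.
pattern = [block for _ in range(10) for block in range(5)]

lru_misses = simulate(pattern, ways=4, insert_at_mru=True)
lip_misses = simulate(pattern, ways=4, insert_at_mru=False)

print("MRU insertion (LRU) misses:", lru_misses)
print("LRU-position insertion misses:", lip_misses)
```

On this cyclic pattern, whose working set is one block larger than the set, MRU insertion evicts every block just before it is reused, while inserting at the LRU position lets part of the working set stay resident. A dynamic policy, as the term is used in the literature, chooses between such insertion behaviors at run time; how the Ivy Bridge cache does so is the subject of the paper and is not modeled here.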


