Comparing LLC-Memory Traffic between CPU and GPU Architectures
Mohammad Alaul Haque Monil, Seyong Lee, J. Vetter, A. Malony
2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA), November 2021
DOI: 10.1109/rsdha54838.2021.00007
Abstract
The cache hierarchy in modern CPUs and GPUs is becoming increasingly complex, which makes it difficult to understand the interaction between memory access patterns and the cache hierarchy. Moreover, the details of different cache policies are not publicly available. Therefore, the research community relies on observation to understand the relationship between memory access patterns and the cache hierarchy. Our previous studies delved into different microarchitectures of Intel CPUs. In this study, GPUs from NVIDIA and AMD are considered. Even though the execution models of CPUs and GPUs are distinct, this study attempts to correlate the behavior of the cache hierarchies of CPUs and GPUs. Using the knowledge gathered from studying Intel CPUs, the similarities and dissimilarities between CPUs and GPUs are identified. Through model evaluation, this study provides a proof of concept that the traffic between the last-level cache and memory can be predicted for sequential streaming and strided access patterns on GPUs.
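To illustrate the kind of first-order prediction the abstract refers to, the sketch below estimates LLC-to-memory read traffic for sequential streaming and strided access patterns from the number of distinct cache lines the pattern touches. This is a minimal sketch under simplifying assumptions (cold cache, one full line per miss); the function name, parameters, and default cache-line size are illustrative assumptions, not the authors' validated model.

```python
# Minimal first-order sketch (not the paper's validated model) of predicting
# traffic between the last-level cache and memory for sequential streaming and
# strided access patterns. Assumes a cold cache and that every miss fetches one
# full cache line; all names and default values are illustrative assumptions.

def estimated_llc_dram_traffic(n_elements: int,
                               stride_elements: int = 1,
                               element_bytes: int = 8,
                               line_bytes: int = 64) -> int:
    """Estimate bytes moved between the LLC and memory for one pass over an
    array read with a fixed stride (stride 1 = sequential streaming)."""
    stride_bytes = stride_elements * element_bytes
    if stride_bytes >= line_bytes:
        # Each access falls on a distinct cache line, so every access misses.
        lines_touched = n_elements
    else:
        # Consecutive accesses share lines; the pass sweeps the whole address range.
        span_bytes = (n_elements - 1) * stride_bytes + element_bytes
        lines_touched = -(-span_bytes // line_bytes)  # ceiling division
    return lines_touched * line_bytes


if __name__ == "__main__":
    # Sequential streaming read of 1M doubles: traffic is roughly the array size (8 MiB).
    print(estimated_llc_dram_traffic(1 << 20, stride_elements=1))
    # Stride-16 read of the same 1M elements: every access misses, so roughly 64 MiB.
    print(estimated_llc_dram_traffic(1 << 20, stride_elements=16))
```

The design point this sketch captures is the one the abstract relies on: for these simple patterns, predicted traffic depends only on how many distinct cache lines the access stream covers, not on the details of the (undisclosed) cache replacement policy.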