Distance-in-time versus distance-in-space

M. Kandemir, Xulong Tang, Hui Zhao, Jihyun Ryoo, M. Karakoy
{"title":"Distance-in-time versus distance-in-space","authors":"M. Kandemir, Xulong Tang, Hui Zhao, Jihyun Ryoo, M. Karakoy","doi":"10.1145/3453483.3454069","DOIUrl":null,"url":null,"abstract":"Cache behavior is one of the major factors that influence the performance of applications. Most of the existing compiler techniques that target cache memories focus exclusively on reducing data reuse distances in time (DIT). However, current manycore systems employ distributed on-chip caches that are connected using an on-chip network. As a result, a reused data element/block needs to travel over this on-chip network, and the distance to be traveled -- reuse distance in space (DIS) -- can be as influential in dictating application performance as reuse DIT. This paper represents the first attempt at defining a compiler framework that accommodates both DIT and DIS. Specifically, it first classifies data reuses into four groups: G1: (low DIT, low DIS), G2: (high DIT, low DIS), G3: (low DIT, high DIS), and G4: (high DIT, high DIS). Then, observing that reuses in G1 represent the ideal case and there is nothing much to be done in computations in G4, it proposes a \"reuse transfer\" strategy that transfers select reuses between G2 and G3, eventually, transforming each reuse to either G1 or G4. Finally, it evaluates the proposed strategy using a set of 10 multithreaded applications. The collected results reveal that the proposed strategy reduces parallel execution times of the tested applications between 19.3% and 33.3%.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"71 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3453483.3454069","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Cache behavior is one of the major factors that influence the performance of applications. Most of the existing compiler techniques that target cache memories focus exclusively on reducing data reuse distances in time (DIT). However, current manycore systems employ distributed on-chip caches that are connected using an on-chip network. As a result, a reused data element/block needs to travel over this on-chip network, and the distance to be traveled -- reuse distance in space (DIS) -- can be as influential in dictating application performance as reuse DIT. This paper represents the first attempt at defining a compiler framework that accommodates both DIT and DIS. Specifically, it first classifies data reuses into four groups: G1: (low DIT, low DIS), G2: (high DIT, low DIS), G3: (low DIT, high DIS), and G4: (high DIT, high DIS). Then, observing that reuses in G1 represent the ideal case and there is nothing much to be done in computations in G4, it proposes a "reuse transfer" strategy that transfers select reuses between G2 and G3, eventually, transforming each reuse to either G1 or G4. Finally, it evaluates the proposed strategy using a set of 10 multithreaded applications. The collected results reveal that the proposed strategy reduces parallel execution times of the tested applications between 19.3% and 33.3%.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
时间距离和空间距离
缓存行为是影响应用程序性能的主要因素之一。大多数针对缓存内存的现有编译器技术都只关注于减少数据重用的时间间隔(DIT)。然而,目前的多核系统采用分布式片上缓存,这些缓存使用片上网络连接。因此,被重用的数据元素/块需要通过这个片上网络传输,而传输的距离——空间中的重用距离(DIS)——在决定应用程序性能方面的影响可能与重用DIT一样大。本文首次尝试定义一个既包含DIT又包含DIS的编译器框架。具体来说,它首先将数据重用分为四组:G1:(低DIT,低DIS), G2:(高DIT,低DIS), G3:(低DIT,高DIS)和G4:(高DIT,高DIS)。然后,观察到G1中的重用代表了理想的情况,并且G4中的计算没有太多要做的事情,它提出了一种“重用传输”策略,在G2和G3之间传输选定的重用,最终将每个重用转换为G1或G4。最后,它使用一组10个多线程应用程序来评估所建议的策略。收集的结果表明,该策略将测试应用程序的并行执行时间减少了19.3%至33.3%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Learning to find naming issues with big code and small supervision Cyclic program synthesis Fluid: a framework for approximate concurrency via controlled dependency relaxation Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models Phased synthesis of divide and conquer programs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1