Program optimization for instruction caches

ASPLOS III. Pub Date: 1989-04-01. DOI: 10.1145/70082.68200
S. McFarling
Citations: 257

Abstract

This paper presents an optimization algorithm for reducing instruction cache misses. The algorithm uses profile information to reposition programs in memory so that a direct-mapped cache behaves much like an optimal cache with full associativity and full knowledge of the future. For best results, the cache should have a mechanism for excluding certain instructions designated by the compiler. This paper first presents a reduced form of the algorithm. This form is shown to produce an optimal miss rate for programs without conditionals and with a tree call graph, assuming basic blocks can be reordered at will. If conditionals are allowed, but there are no loops within conditionals, the algorithm does as well as an optimal cache for the worst-case execution of the program consistent with the profile information. Next, the algorithm is extended with heuristics for general programs. The effectiveness of these heuristics is demonstrated with empirical results for a set of 10 programs and various cache sizes. The improvement depends on cache size. For a 512-word cache, miss rates for a direct-mapped instruction cache are halved. For an 8K-word cache, miss rates fall by over 75%. Over a wide range of cache sizes, the algorithm is as effective as increasing the cache size by a factor of 3. For 512 words, the algorithm generates only 32% more misses than an optimal cache. Optimized programs on a direct-mapped cache have lower miss rates than unoptimized programs on set-associative caches of the same size.
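To make the central idea concrete, the following is a minimal sketch of why code placement matters for a direct-mapped cache. It is not the paper's algorithm: the block names, sizes, execution schedule, one-word cache lines, and 8-word cache are all assumptions chosen for illustration. It only shows that moving a rarely executed block out from between two frequently executed blocks can remove the conflict misses between them.

```python
def direct_mapped_misses(trace, cache_words):
    """Count misses in a direct-mapped cache with one-word lines (assumed)."""
    slots = {}  # cache index -> address currently stored there
    misses = 0
    for addr in trace:
        index = addr % cache_words  # direct mapping: one possible slot per address
        if slots.get(index) != addr:
            misses += 1
            slots[index] = addr
    return misses


def make_trace(layout, block_sizes, schedule):
    """Expand a sequence of executed block names into word addresses.

    layout maps each block name to its start address in memory.
    """
    trace = []
    for name in schedule:
        start = layout[name]
        trace.extend(range(start, start + block_sizes[name]))
    return trace


CACHE_WORDS = 8  # hypothetical cache size
SIZES = {"loop_body": 4, "cold_init": 4, "callee": 4}  # hypothetical blocks

# Hypothetical profile: cold_init runs once, then loop_body and callee
# alternate 100 times.
SCHEDULE = ["cold_init"] + ["loop_body", "callee"] * 100

# Source-order layout: callee starts at address 8, which maps to the same
# cache indices (0..3) as loop_body, so the two hot blocks evict each other.
naive_layout = {"loop_body": 0, "cold_init": 4, "callee": 8}

# Repositioned layout: the two hot blocks occupy disjoint cache indices;
# the cold block is the one that overlaps, and it only runs once.
repositioned_layout = {"loop_body": 0, "callee": 4, "cold_init": 8}

naive_misses = direct_mapped_misses(
    make_trace(naive_layout, SIZES, SCHEDULE), CACHE_WORDS)
repositioned_misses = direct_mapped_misses(
    make_trace(repositioned_layout, SIZES, SCHEDULE), CACHE_WORDS)

print(naive_misses)         # 804
print(repositioned_misses)  # 12
```

In this toy setup the repositioned layout incurs only the 12 compulsory misses, while the source-order layout misses on every access. A profile-guided placement algorithm of the kind the abstract describes would have to derive such a layout automatically from measured execution counts rather than from a hand-written schedule.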