Exploiting Partially Context-Sensitive Profiles to Improve Performance of Hot Code

ACM Transactions on Programming Languages and Systems (IF 1.5; CAS Zone 2, Computer Science; JCR Q3, Computer Science, Software Engineering). Pub Date: 2023-09-13. DOI: 10.1145/3612937
Maja Vukasovic, Aleksandar Prokopec

Abstract

Availability of profiling information is a major advantage of just-in-time (JIT) compilation. Profiles guide the compilation order and optimizations, thus substantially improving program performance. Ahead-of-time (AOT) compilation can also utilize profiles, obtained during separate profiling runs of the programs. Profiles can be context-sensitive, i.e., each profile entry is associated with a call-stack. To ease profile collection and reduce overheads, many systems collect partially context-sensitive profiles, which record only a call-stack suffix. Despite prior related work, partially context-sensitive profiles have the potential to further improve compiler optimizations. In this paper, we describe a novel technique that exploits partially context-sensitive profiles to determine which portions of code are hot, and compile them with additional compilation budget. This technique is applicable to most AOT compilers that can access partially context-sensitive profiles, and its goal is to improve program performance without significantly increasing code size. The technique relies on a new hot-code-detection algorithm to reconstruct hot regions based on the partial profiles. The compilation ordering and the inlining of the compiler are modified to exploit the information about the hot code. We formally describe the proposed algorithm and its heuristics, and then describe our implementation inside GraalVM Native Image, a state-of-the-art AOT compiler for Java. Evaluation of the proposed technique on 16 benchmarks from DaCapo, Scalabench and Renaissance suites shows a performance improvement between 22% and 40% on 4 benchmarks, and between 2.5% and 10% on 5 benchmarks. Code-size increase ranges from 0.8% to 9%, where 10 benchmarks exhibit an increase of less than 2.5%.
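To make the notion of a partially context-sensitive profile concrete, the following is a minimal sketch in Python. It is not the paper's hot-code-detection algorithm and not GraalVM Native Image's profile format; it only illustrates the abstract's statement that a partial profile records a call-stack suffix. The function name partial_profile, the suffix length k, and the sample call stacks are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence


def partial_profile(samples: Iterable[Sequence[str]], k: int) -> Counter:
    """Count samples keyed by the last k frames of each call stack.

    Each stack lists the outermost caller first and the profiled
    method last. Hypothetical helper, for illustration only.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    profile = Counter()
    for stack in samples:
        # Keep only the call-stack suffix; shorter stacks are kept whole.
        profile[tuple(stack[-k:])] += 1
    return profile


# Hypothetical samples from an imagined profiling run.
samples = [
    ("main", "parse", "readToken"),
    ("main", "parse", "readToken"),
    ("main", "lint", "parse", "readToken"),
    ("main", "render", "draw"),
]

# A suffix longer than every stack keeps the full calling context.
full = partial_profile(samples, k=10)
# A suffix of two frames merges contexts that differ further up the stack.
partial = partial_profile(samples, k=2)

for context, count in partial.most_common():
    print(" -> ".join(context), count)
```

With k=2, the two distinct full contexts that end in parse -> readToken collapse into a single entry, which is the information loss that any technique working from partial profiles has to cope with when reconstructing hot regions.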
Source journal
ACM Transactions on Programming Languages and Systems (Engineering and Technology: Computer Science, Software Engineering)
CiteScore: 3.10
Self-citation rate: 7.70%
Articles published: 28
Review time: >12 weeks
Journal description: ACM Transactions on Programming Languages and Systems (TOPLAS) is the premier journal for reporting recent research advances in the areas of programming languages, and systems to assist the task of programming. Papers can be either theoretical or experimental in style, but in either case, they must contain innovative and novel content that advances the state of the art of programming languages and systems. We also invite strictly experimental papers that compare existing approaches, as well as tutorial and survey papers. The scope of TOPLAS includes, but is not limited to, the following subjects: language design for sequential and parallel programming; programming language implementation; programming language semantics; compilers and interpreters; runtime systems for program execution; storage allocation and garbage collection; languages and methods for writing program specifications; languages and methods for secure and reliable programs; testing and verification of programs.