{"title":"Exploiting Partially Context-Sensitive Profiles to Improve Performance of Hot Code","authors":"Maja Vukasovic, Aleksandar Prokopec","doi":"10.1145/3612937","DOIUrl":null,"url":null,"abstract":"Availability of profiling information is a major advantage of just-in-time (JIT) compilation. Profiles guide the compilation order and optimizations, thus substantially improving program performance. Ahead-of-time (AOT) compilation can also utilize profiles, obtained during separate profiling runs of the programs. Profiles can be context-sensitive, i.e., each profile entry is associated with a call-stack. To ease profile collection and reduce overheads, many systems collect partially context-sensitive profiles, which record only a call-stack suffix. Despite prior related work, partially context-sensitive profiles have the potential to further improve compiler optimizations. In this paper, we describe a novel technique that exploits partially context-sensitive profiles to determine which portions of code are hot, and compile them with additional compilation budget. This technique is applicable to most AOT compilers that can access partially context-sensitive profiles, and its goal is to improve program performance without significantly increasing code size. The technique relies on a new hot-code-detection algorithm to reconstruct hot regions based on the partial profiles. The compilation ordering and the inlining of the compiler are modified to exploit the information about the hot code. We formally describe the proposed algorithm and its heuristics, and then describe our implementation inside GraalVM Native Image, a state-of-the-art AOT compiler for Java. Evaluation of the proposed technique on 16 benchmarks from DaCapo, Scalabench and Renaissance suites shows a performance improvement between \\(22\\% \\) and \\(40\\% \\) on 4 benchmarks, and between \\(2.5\\% \\) and \\(10\\% \\) on 5 benchmarks. Code-size increase ranges from \\(0.8-9\\% \\) , where 10 benchmarks exhibit an increase of less than \\(2.5\\% \\) .","PeriodicalId":50939,"journal":{"name":"ACM Transactions on Programming Languages and Systems","volume":"20 1","pages":"0"},"PeriodicalIF":1.5000,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Programming Languages and Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3612937","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
Availability of profiling information is a major advantage of just-in-time (JIT) compilation. Profiles guide the compilation order and optimizations, thus substantially improving program performance. Ahead-of-time (AOT) compilation can also utilize profiles, obtained during separate profiling runs of the programs. Profiles can be context-sensitive, i.e., each profile entry is associated with a call-stack. To ease profile collection and reduce overheads, many systems collect partially context-sensitive profiles, which record only a call-stack suffix. Despite prior related work, partially context-sensitive profiles have the potential to further improve compiler optimizations. In this paper, we describe a novel technique that exploits partially context-sensitive profiles to determine which portions of code are hot, and compile them with additional compilation budget. This technique is applicable to most AOT compilers that can access partially context-sensitive profiles, and its goal is to improve program performance without significantly increasing code size. The technique relies on a new hot-code-detection algorithm to reconstruct hot regions based on the partial profiles. The compilation ordering and the inlining of the compiler are modified to exploit the information about the hot code. We formally describe the proposed algorithm and its heuristics, and then describe our implementation inside GraalVM Native Image, a state-of-the-art AOT compiler for Java. Evaluation of the proposed technique on 16 benchmarks from DaCapo, Scalabench and Renaissance suites shows a performance improvement between \(22\% \) and \(40\% \) on 4 benchmarks, and between \(2.5\% \) and \(10\% \) on 5 benchmarks. Code-size increase ranges from \(0.8-9\% \) , where 10 benchmarks exhibit an increase of less than \(2.5\% \) .
期刊介绍:
ACM Transactions on Programming Languages and Systems (TOPLAS) is the premier journal for reporting recent research advances in the areas of programming languages, and systems to assist the task of programming. Papers can be either theoretical or experimental in style, but in either case, they must contain innovative and novel content that advances the state of the art of programming languages and systems. We also invite strictly experimental papers that compare existing approaches, as well as tutorial and survey papers. The scope of TOPLAS includes, but is not limited to, the following subjects:
language design for sequential and parallel programming
programming language implementation
programming language semantics
compilers and interpreters
runtime systems for program execution
storage allocation and garbage collection
languages and methods for writing program specifications
languages and methods for secure and reliable programs
testing and verification of programs