The influence of HPCToolkit and Score-P on hardware performance counters
Jan-Patrick Lehr, Christian Iwainsky, C. Bischof
Proceedings of the 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems, 2017-10-23
DOI: 10.1145/3141865.3141869
Citations: 1
Abstract
Performance measurement and analysis are commonly performed tasks for high-performance computing applications. Both sampling and instrumentation approaches to performance measurement can capture hardware performance counter (HWPC) metrics to assess the software's ability to use the functional units of the processor. Since the measurement software usually executes on the same processor, it necessarily competes with the target application for hardware resources. Consequently, the measurement system perturbs the target application, which often results in runtime overhead. While the runtime overhead of different measurement techniques has been studied previously, it has not been thoroughly examined to what extent HWPC values themselves are perturbed by the measurement process. In this paper, we investigate the influence of two widely used performance measurement systems, HPCToolkit (sampling) and Score-P (instrumentation), on HWPC. Our experiments on the SPEC CPU 2006 C/C++ benchmarks show that, while Score-P's default instrumentation can massively increase runtime, it does not always heavily perturb relevant HWPC. On the other hand, HPCToolkit shows no significant runtime overhead, but significantly influences some relevant HWPC. We conclude that for every performance experiment, sufficient baseline measurements are essential to identify the HWPC that remain valid indicators of performance for a given measurement technique. Thus, performance analysis tools need to offer easily accessible means to automate this baseline and validation functionality.
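The validation step the authors call for can be sketched as a simple comparison: run the application without any tool to record baseline HWPC values, run it again under the measurement tool, and keep only the counters whose deviation stays within a tolerance. This is a minimal illustration of that idea, not the paper's actual methodology; the PAPI-style counter names, example values, and the 10% threshold are illustrative assumptions.

```python
# Hypothetical baseline-validation sketch: flag the HWPC that a measurement
# tool perturbs too much to remain valid performance indicators.

def valid_counters(baseline, measured, tolerance=0.10):
    """Return the names of counters whose relative deviation from the
    baseline run stays within `tolerance` (0.10 == 10%)."""
    valid = []
    for counter, base_value in baseline.items():
        deviation = abs(measured[counter] - base_value) / base_value
        if deviation <= tolerance:
            valid.append(counter)
    return valid

# Illustrative values: the instruction count is barely perturbed by the
# tool, while the L2 cache-miss count is heavily inflated.
baseline = {"PAPI_TOT_INS": 1_000_000_000, "PAPI_L2_TCM": 5_000_000}
measured = {"PAPI_TOT_INS": 1_010_000_000, "PAPI_L2_TCM": 9_500_000}

print(valid_counters(baseline, measured))  # only PAPI_TOT_INS survives
```

In practice, "sufficient baseline measurements" means repeating both runs enough times that the tolerance can be set relative to the baseline's own run-to-run variance, rather than to a fixed percentage as in this sketch.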