A Rigorous Benchmarking and Performance Analysis Methodology for Python Workloads

Arthur Crapé, L. Eeckhout
Published in: 2020 IEEE International Symposium on Workload Characterization (IISWC), 2020-10-01
DOI: 10.1109/IISWC50251.2020.00017 (https://doi.org/10.1109/IISWC50251.2020.00017)
Citations: 2

Abstract

Computer architecture and computer systems research and development is heavily driven by benchmarking and performance analysis. It is thus of paramount importance that rigorous methodologies are used to draw correct conclusions and steer research and development in the right direction. While rigorous methodologies are widely used for native and managed programming language workloads, scripting language workloads are subject to ad hoc methodologies that lead to incorrect and misleading conclusions. In particular, we find incorrect public statements regarding different virtual machines for Python, the most popular scripting language. The incorrect conclusion is a result of using the geometric mean speedup and not making a distinction between start-up and steady-state performance. In this paper, we propose a statistically rigorous benchmarking and performance analysis methodology for Python workloads, which makes a distinction between start-up and steady-state performance and which summarizes average performance across a set of benchmarks using the harmonic mean speedup. We find that a rigorous methodology makes a difference in practice. In particular, we find that the PyPy JIT compiler outperforms the CPython interpreter by 1.76× for steady-state while being 2% slower for start-up, which refutes the statement on the PyPy website that ‘PyPy outperforms CPython by 4.4× on average’, a figure based on the geometric mean speedup that does not distinguish between start-up and steady-state. We use the proposed methodology to analyze Python workloads, which yields several interesting findings regarding PyPy versus CPython performance, start-up versus steady-state performance, the impact of a workload's input size, and Python workload execution characteristics at the microarchitecture level.
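To illustrate why the choice of mean matters when summarizing speedups, the following is a minimal sketch rather than the authors' code. The per-benchmark speedup values and the function names are hypothetical and chosen only for illustration. It assumes the standard definitions: the geometric mean is the n-th root of the product of the speedups, and the harmonic mean is n divided by the sum of the reciprocal speedups.

```python
import math


def geometric_mean_speedup(speedups):
    """n-th root of the product of the per-benchmark speedups."""
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))


def harmonic_mean_speedup(speedups):
    """n divided by the sum of the reciprocal per-benchmark speedups."""
    return len(speedups) / sum(1.0 / s for s in speedups)


# Hypothetical speedups (baseline time / new time); NOT taken from the paper.
speedups = [0.5, 1.0, 2.0, 8.0]

gm = geometric_mean_speedup(speedups)
hm = harmonic_mean_speedup(speedups)

print(f"geometric mean speedup: {gm:.2f}")
print(f"harmonic mean speedup:  {hm:.2f}")
```

For positive values the harmonic mean never exceeds the geometric mean, so a single large speedup inflates the geometric summary more than the harmonic one. If every benchmark took the same time on the baseline, the harmonic mean speedup equals the ratio of total baseline time to total new time.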