Performance and Cost Comparison of Cloud Services for Deep Learning Workload
Dheeraj Chahal, Mayank Mishra, S. Palepu, Rekha Singhal
Many organizations are migrating their on-premises artificial intelligence workloads to the cloud due to the availability of cost-effective and highly scalable infrastructure, software, and platform services. To ease the migration process, many cloud vendors provide services, frameworks, and tools for deploying applications on cloud infrastructure. Finding the most appropriate service and infrastructure for a given application, one that delivers the desired performance at minimal cost, is a challenge. In this work, we present a methodology for migrating a deep-learning-based recommender system to an ML platform and to a serverless architecture. Furthermore, we present an experimental evaluation of AWS's ML platform, SageMaker, and its serverless compute service, Lambda. We also discuss the performance and cost trade-offs of using cloud infrastructure.
DOI: https://doi.org/10.1145/3447545.3451184
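The abstract above does not include deployment details. Purely as an illustration of what serving such a model on AWS Lambda can look like, here is a minimal, hypothetical Python handler for a TorchScript recommender; the file name model.pt and the payload fields user_ids/item_ids are invented for the sketch, not taken from the paper.

```python
# Hypothetical AWS Lambda handler serving a TorchScript recommender model.
# Assumes model.pt is packaged with the deployment artifact (in practice a container
# image, since DL runtimes easily exceed Lambda's unzipped .zip size limit) and that
# the function is invoked through an API Gateway proxy event with a JSON body.
import json

import torch

# Loaded once per container (cold start); reused across warm invocations.
MODEL = torch.jit.load("model.pt")
MODEL.eval()


def handler(event, context):
    body = json.loads(event["body"])
    users = torch.tensor(body["user_ids"], dtype=torch.long)
    items = torch.tensor(body["item_ids"], dtype=torch.long)
    with torch.no_grad():
        scores = MODEL(users, items)  # model assumed to score (user, item) pairs
    return {
        "statusCode": 200,
        "body": json.dumps({"scores": scores.tolist()}),
    }
```

On the SageMaker side, the analogous client-side call is `boto3.client("sagemaker-runtime").invoke_endpoint(EndpointName=..., ContentType="application/json", Body=...)`, with the model hosted behind a managed endpoint instead of being loaded inside the function.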
SPEC Spotlight on the International Standards Group (ISG)
Norbert Schmitt, K. Lange, Sanjay Sharma, Aaron Cragin, D. Reiner, Samuel Kounev
The driving philosophy of the Standard Performance Evaluation Corporation (SPEC) is to ensure that the marketplace has a fair and useful set of metrics to differentiate systems by providing standardized benchmark suites and international standards. This poster paper gives an overview of SPEC with a focus on the newly founded International Standards Group (ISG).
DOI: https://doi.org/10.1145/3447545.3451171
Demonstration Paper: Monitoring Machine Learning Contracts with QoA4ML
M. Nguyen, Hong Linh Truong
When using machine learning (ML) services, both service customers and providers need to monitor complex contractual constraints of the ML service that are strongly related to ML models and data. Establishing and monitoring comprehensive ML contracts is therefore crucial in ML serving. This paper demonstrates a set of features and utilities of the QoA4ML framework for ML contracts.
DOI: https://doi.org/10.1145/3447545.3451172
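The paper demonstrates QoA4ML itself; the snippet below is not the QoA4ML API but a generic, invented illustration of the kind of contractual constraint check such a framework monitors at runtime: a report of observed quality metrics is validated against agreed bounds.

```python
# Generic illustration of an ML service contract check -- NOT the QoA4ML API.
# A "contract" here is just a dict of metric bounds agreed between provider and customer.
from typing import Dict, List

CONTRACT = {
    "accuracy": {"min": 0.90},            # model quality constraint
    "response_time_ms": {"max": 150.0},   # serving latency constraint
    "data_completeness": {"min": 0.95},   # data quality constraint
}


def check_contract(report: Dict[str, float], contract=CONTRACT) -> List[str]:
    """Return human-readable violations for one monitoring report."""
    violations = []
    for metric, bounds in contract.items():
        value = report.get(metric)
        if value is None:
            violations.append(f"{metric}: no measurement reported")
            continue
        if "min" in bounds and value < bounds["min"]:
            violations.append(f"{metric}={value} below agreed minimum {bounds['min']}")
        if "max" in bounds and value > bounds["max"]:
            violations.append(f"{metric}={value} above agreed maximum {bounds['max']}")
    return violations


print(check_contract({"accuracy": 0.92, "response_time_ms": 210.0, "data_completeness": 0.97}))
```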
On Preventively Minimizing the Performance Impact of Black Swans (Vision Paper)
A. Bondi
Recent episodes of web overloads suggest the need to test system performance under loads that reflect extreme variations in usage patterns, well outside normal anticipated ranges. These loads are sometimes expected or even scheduled. Examples of expected loads include surges in transactions or request submissions when popular rock concert tickets go on sale, when the deadline for the submission of census forms approaches, and when a desperate population is attempting to sign up for a vaccination during a pandemic. Examples of unexpected loads are the surge in unemployment benefit applications in many US states with the onset of COVID-19 lockdowns, and repeated queries about the geographic distribution of signatories on the U.K. Parliament's petition website prior to a Brexit vote in 2019. We consider the software performance ramifications of these examples and the architectural questions they raise. We discuss how modeling, performance testing, and known processes for evaluating architectures and designs can be used to identify potential performance issues caused by sudden increases in load or changes in load patterns.
DOI: https://doi.org/10.1145/3447545.3451204
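To make the kind of testing advocated above concrete, here is a minimal open-loop load-generator sketch that superimposes a sudden "black swan" spike on a baseline Poisson arrival process. The target URL, the request rates, and the spike window are placeholder values, not from the talk.

```python
# Minimal open-loop load generator: baseline Poisson arrivals with a sudden spike.
# All parameters are placeholders chosen for illustration.
import asyncio
import random
import time

import aiohttp

BASE_RPS, SPIKE_RPS = 20, 400      # requests/second before and during the surge
SPIKE_START, SPIKE_END = 60, 120   # seconds into the test


async def fire(session, url, latencies):
    t0 = time.perf_counter()
    try:
        async with session.get(url) as resp:
            await resp.read()
            latencies.append((resp.status, time.perf_counter() - t0))
    except aiohttp.ClientError:
        latencies.append(("error", time.perf_counter() - t0))


async def run(url="http://localhost:8080/signup", duration=180):
    latencies, tasks = [], []
    start = time.perf_counter()
    async with aiohttp.ClientSession() as session:
        while (elapsed := time.perf_counter() - start) < duration:
            rps = SPIKE_RPS if SPIKE_START <= elapsed < SPIKE_END else BASE_RPS
            tasks.append(asyncio.create_task(fire(session, url, latencies)))
            await asyncio.sleep(random.expovariate(rps))  # Poisson inter-arrival times
        await asyncio.gather(*tasks)  # let in-flight requests finish
    return latencies

# asyncio.run(run())
```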
Performance Engineering and Database Development at MongoDB
D. Daly
Performance and the related properties of stability and resilience are essential to MongoDB. We have invested heavily in these areas: involving all development engineers in aspects of performance, building a team of specialized performance engineers to understand issues that do not fit neatly within the scope of individual development teams, and dedicating multiple teams to develop and support tools for performance testing and analysis. We have built an automated infrastructure for performance testing that is integrated with our continuous integration system. Performance tests routinely run against our development branch in order to identify changes in performance as early as possible. We have invested heavily to ensure both that results are low noise and reproducible and that we can detect when performance changes. We continue to invest to make the system better and to make it easier to add new workloads. All development engineers are expected to interact with our performance system: investigating performance changes, fixing regressions, and adding new performance tests. We also expect performance to be considered at project design time. The project design should ensure that there is appropriate performance coverage for the project, which may require repurposing existing tests or adding new ones. Not all performance issues are specific to a team or software module. Some issues emerge from the interaction of multiple modules or interaction with external systems or software. To attack these larger problems, we have a dedicated performance team. Our performance team is responsible for investigating these more complex issues, identifying high value areas for improvement, as well as helping guide the development engineers with their performance tests. We have experience both hiring and training engineers for the performance engineering skills needed to ship a performant database system. In this talk we will cover the skills needed for our performance activities and which skills, if added to undergraduate curricula, would help us the most. We will address the skills we would like all development engineers to have, as well as those for our dedicated performance team.
DOI: https://doi.org/10.1145/3447545.3451199
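As a deliberately simplified stand-in for the automated regression detection described above (MongoDB's production tooling is far more sophisticated), the sketch below flags workloads whose latest CI result falls well below a rolling baseline. The thresholds and data layout are illustrative assumptions.

```python
# Simple stand-in for an automated CI performance check: compare the latest result
# for each workload against a rolling baseline and flag likely regressions.
from statistics import mean, stdev


def flag_regressions(history, latest, min_samples=10, sigmas=3.0, rel_drop=0.05):
    """history: {workload: [past throughputs]}, latest: {workload: throughput}."""
    flagged = {}
    for workload, value in latest.items():
        past = history.get(workload, [])
        if len(past) < min_samples:
            continue  # not enough data for a stable baseline
        mu, sd = mean(past), stdev(past)
        drop = (mu - value) / mu if mu else 0.0
        # Flag only if the drop is both statistically unusual and practically relevant.
        if value < mu - sigmas * sd and drop > rel_drop:
            flagged[workload] = {"baseline": mu, "latest": value, "drop_pct": 100 * drop}
    return flagged
```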
Viability of Azure IoT Hub for Processing High Velocity Large Scale IoT Data
Wajdi Halabi, Daniel N. Smith, J. Hill, Jason W. Anderson, Ken E. Kennedy, Brandon Posey, Linh Ngo, A. Apon
We utilize the Clemson supercomputer to generate a massive workload for testing the performance of Microsoft Azure IoT Hub. The workload emulates sensor data from a large manufacturing facility. We study the effects of message frequency, distribution, and size on round-trip latency for different IoT Hub configurations. Significant variation in latency occurs when the system exceeds IoT Hub specifications. The results are predictable and well-behaved for a well-engineered system and can meet soft real-time deadlines.
DOI: https://doi.org/10.1145/3447545.3451187
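As an illustration of how such a sensor workload can be emulated from a driver node, the sketch below sends timestamped telemetry to Azure IoT Hub with the azure-iot-device Python SDK. Measuring round-trip latency as in the paper would additionally require consuming the messages from the hub's Event Hub-compatible endpoint and matching them by the embedded id, which is omitted here; the connection-string variable, rate, payload size, and duration are placeholders.

```python
# Sketch of emulating one sensor's telemetry stream toward Azure IoT Hub.
# Round-trip measurement (consuming the messages back) is intentionally omitted.
import json
import os
import random
import time
import uuid

from azure.iot.device import IoTHubDeviceClient, Message


def emulate_sensor(rate_hz=10.0, payload_bytes=256, duration_s=60):
    client = IoTHubDeviceClient.create_from_connection_string(
        os.environ["IOTHUB_DEVICE_CONNECTION_STRING"])
    client.connect()
    padding = "x" * payload_bytes          # controls (approximate) message size
    end = time.time() + duration_s
    while time.time() < end:
        body = {"id": str(uuid.uuid4()), "sent_at": time.time(), "data": padding}
        client.send_message(Message(json.dumps(body)))
        time.sleep(random.expovariate(rate_hz))  # Poisson-like message arrivals
    client.disconnect()
```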
An Empirical Evaluation of the Performance of Video Conferencing Systems
Richard Bieringa, Abijith Radhakrishnan, Tavneet Singh, Sophie Vos, Jesse Donkervliet, A. Iosup
The global COVID-19 pandemic forced society to shift to remote education and work. This shift relies on video conferencing systems (VCSs) such as Zoom, Microsoft Teams, and Jitsi, consequently increasing the pressure on their digital service infrastructure. Although understanding the performance of these essential cloud services could lead to better designs and improved service deployments, only limited research on this topic currently exists. Addressing this problem, in this work we propose an experimental method to analyze and compare VCSs. Our method is based on real-world experiments in which the client side is controlled, and focuses on VCS resource requirements and performance. We design and implement a tool to automatically conduct these real-world experiments, and use it to compare three platforms on the client side: Zoom, Microsoft Teams, and Jitsi. Our work shows that there are significant differences between the tested systems in terms of resource usage and performance variability, and provides evidence for a suspected memory leak in Zoom, the system widely regarded as the industry market leader.
DOI: https://doi.org/10.1145/3447545.3451186
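A controlled client-side experiment of this kind needs, at minimum, periodic sampling of the conferencing client's resource usage. The sketch below does that with psutil, matching processes by name; the process-name substrings, sampling interval, and CSV layout are assumptions, not the authors' tooling.

```python
# Minimal client-side resource sampler: periodically record CPU and resident memory
# of running conferencing clients, matched by process name.
import csv
import time

import psutil

CLIENT_NAMES = {"zoom", "teams", "jitsi"}  # substrings to match in process names


def sample(outfile="vcs_usage.csv", interval_s=1.0, duration_s=600):
    procs = [p for p in psutil.process_iter(["name"])
             if any(n in (p.info["name"] or "").lower() for n in CLIENT_NAMES)]
    for p in procs:
        p.cpu_percent(None)  # prime CPU counters; first reading is meaningless
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "pid", "name", "cpu_percent", "rss_mb"])
        end = time.time() + duration_s
        while time.time() < end:
            now = time.time()
            for p in procs:
                try:
                    writer.writerow([now, p.pid, p.info["name"],
                                     p.cpu_percent(None),
                                     p.memory_info().rss / 2**20])
                except psutil.NoSuchProcess:
                    pass  # client exited during the experiment
            time.sleep(interval_s)
```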
Cloud Performance Variability Prediction
Yuxuan Zhao, Dmitry Duplyakin, R. Ricci, Alexandru Uta
Cloud computing plays an essential role in our society nowadays. Many important services are highly dependent on the stable performance of the cloud. However, as prior work has shown, clouds exhibit large degrees of performance variability. Besides the stochastic variation induced by noisy neighbors, an important facet of cloud performance variability is given by changepoints: the instances where the non-stationary performance metrics exhibit persisting changes, which often last until subsequent changepoints occur. Such undesirable artifacts of unstable application performance complicate application performance evaluation and prediction efforts. Thus, characterizing and understanding performance changepoints become important elements of studying application performance in the cloud. In this paper, we showcase and tune two different changepoint detection methods, and demonstrate how the timing of the changepoints they identify can be predicted. We present a gradient-boosting-based prediction method, show that it can achieve good prediction accuracy, and give advice to practitioners on how to use our results.
DOI: https://doi.org/10.1145/3447545.3451182
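In the spirit of the paper (though not its exact methods or tuning), the sketch below detects changepoints in a synthetic performance series with PELT from the ruptures library and then trains scikit-learn's GradientBoostingRegressor to predict how many samples remain until the next changepoint from simple rolling-window features. The penalty value, window size, and feature set are illustrative choices.

```python
# Changepoint detection (PELT via ruptures) plus a gradient-boosting predictor for
# the gap until the next changepoint. Synthetic data and parameters are illustrative.
import numpy as np
import ruptures as rpt
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Synthetic "cloud benchmark" series: piecewise-constant mean plus noise.
signal = np.concatenate(
    [rng.normal(m, 1.0, n) for m, n in [(10, 300), (14, 200), (9, 250), (12, 250)]])

# 1) Detect changepoints (returned as end indices of segments, last one == len(signal)).
breakpoints = rpt.Pelt(model="rbf").fit(signal).predict(pen=10)

# 2) Build features/labels: rolling-window statistics -> samples until next changepoint.
win = 50
X, y = [], []
for start in range(0, len(signal) - win, win):
    w = signal[start:start + win]
    t = start + win
    next_cp = min((b for b in breakpoints if b > t), default=len(signal))
    X.append([w.mean(), w.std(), w.max() - w.min()])
    y.append(next_cp - t)

model = GradientBoostingRegressor().fit(X[:-3], y[:-3])
print("predicted samples to next changepoint:", model.predict(X[-3:]))
```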
Towards Extraction of Message-Based Communication in Mixed-Technology Architectures for Performance Model
Snigdha Singh, Yves Richard Kirschner, A. Koziolek
Software systems architected using multiple technologies are becoming popular. Many developers use these technologies because they offer high service quality that has often been optimized for performance. Although performance is key for such technology-mixed software applications, there is still little research on performance evaluation approaches that explicitly consider architecture extraction for performance modelling and prediction. In this paper, we discuss the opportunities and challenges in applying existing architecture extraction approaches to support model-driven performance prediction for technology-mixed software. Further, we discuss how such extraction can be extended to support message-based systems. We describe how the various technologies from which the architecture is derived can be transformed to create the performance model. To ground the work, we use a case study from the energy system domain as a running example to support our arguments and observations throughout the paper.
DOI: https://doi.org/10.1145/3447545.3451201
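To make the idea of an extraction target concrete, the snippet below shows one possible (invented) intermediate representation for extracted message-based communication: channels with their publishers, subscribers, and observed message rates, from which a performance model could then be generated. This is not the paper's metamodel; the component and channel names merely echo its energy-domain case study.

```python
# Invented intermediate representation for extracted message-based communication.
# Each channel records who publishes, who subscribes, and an observed arrival rate,
# i.e., the ingredients a per-channel queueing model would need.
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    publishers: set = field(default_factory=set)
    subscribers: set = field(default_factory=set)
    msgs_per_sec: float = 0.0  # measured or estimated arrival rate


@dataclass
class ExtractedArchitecture:
    channels: dict = field(default_factory=dict)

    def add_publish(self, component, channel, rate=0.0):
        ch = self.channels.setdefault(channel, Channel(channel))
        ch.publishers.add(component)
        ch.msgs_per_sec += rate

    def add_subscribe(self, component, channel):
        self.channels.setdefault(channel, Channel(channel)).subscribers.add(component)


arch = ExtractedArchitecture()
arch.add_publish("SmartMeterGateway", "meter/readings", rate=120.0)
arch.add_subscribe("LoadForecaster", "meter/readings")
print(arch.channels["meter/readings"])
```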
A New Course on Systems Benchmarking - For Scientists and Engineers
Samuel Kounev
A benchmark is a tool coupled with a methodology for the evaluation and comparison of systems or components with respect to specific characteristics, such as performance, reliability, or security. Benchmarks enable educated purchasing decisions and play an important role as evaluation tools during system design, development, and maintenance. In research, benchmarks play an integral part in evaluation and validation of new approaches and methodologies. Traditional benchmarks have been focused on evaluating performance, typically understood as the amount of useful work accomplished by a system (or component) compared to the time and resources used. Ranging from simple benchmarks, targeting specific hardware or software components, to large and complex benchmarks focusing on entire systems (e.g., information systems, storage systems, cloud platforms), performance benchmarks have contributed significantly to improving successive generations of systems. Beyond traditional performance benchmarking, research on dependability benchmarking has increased in the past two decades. Due to the increasing relevance of security issues, security benchmarking has also become an important research field. Finally, resilience benchmarking faces challenges related to the integration of performance, dependability, and security benchmarking as well as to the adaptive characteristics of the systems under consideration. Each benchmark is characterized by three key aspects: metrics, workloads, and measurement methodology. The metrics determine what values should be derived based on measurements to produce the benchmark results. The workloads determine under which usage scenarios and conditions (e.g., executed programs, induced system load, injected failures/security attacks) measurements should be performed to derive the metrics. Finally, the measurement methodology defines the end-to-end process to execute the benchmark, collect measurements, and produce the benchmark results. The increasing size and complexity of modern systems make the engineering of benchmarks a challenging task. Thus, we see the need for better education on the theoretical and practical foundations necessary for gaining a deep understanding of benchmarking and the benchmark engineering process. In this talk, we present an overview of a new course focused on systems benchmarking, based on our book "Systems Benchmarking - For Scientists and Engineers" (http://benchmarking-book.com/). The course captures the experience we have gained over the past 15 years in teaching a regular graduate course on performance engineering of computing systems. The latter has been taught at four European universities since 2006, including the University of Cambridge, the Technical University of Catalonia, the Karlsruhe Institute of Technology, and the University of Würzburg. The conception, design, and development of benchmarks requires a thorough understanding of the benchmarking fundamentals beyond an understanding of the system under test.
DOI: https://doi.org/10.1145/3447545.3451198
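The metrics/workload/methodology decomposition described above can be made tangible with a toy harness. The sketch below is purely illustrative: the workload is a stand-in CPU-bound task, the methodology is warm-up plus repeated timed runs, and the metrics are derived from the collected samples.

```python
# Toy harness making the three benchmark ingredients explicit:
# workload (what is executed), measurement methodology (warm-up + repeated timed runs),
# and metrics (derived from the measurement samples). Entirely illustrative.
import statistics
import time


def workload():
    """The unit of useful work; here a stand-in CPU-bound task."""
    return sum(i * i for i in range(200_000))


def run_benchmark(workload, warmups=3, repetitions=30):
    for _ in range(warmups):          # methodology: discard warm-up runs
        workload()
    samples = []
    for _ in range(repetitions):      # methodology: repeated measurements
        t0 = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - t0)
    return {                          # metrics: derived from the samples
        "mean_s": statistics.mean(samples),
        "p95_s": statistics.quantiles(samples, n=20)[-1],
        "throughput_ops_per_s": 1.0 / statistics.mean(samples),
        "cv": statistics.stdev(samples) / statistics.mean(samples),
    }


print(run_benchmark(workload))
```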