Performance and Cost Comparison of Cloud Services for Deep Learning Workload
Dheeraj Chahal, Mayank Mishra, S. Palepu, Rekha Singhal
Many organizations are migrating their on-premises artificial intelligence workloads to the cloud due to the availability of cost-effective and highly scalable infrastructure, software, and platform services. To ease the migration process, many cloud vendors provide services, frameworks, and tools for deploying applications on cloud infrastructure. Finding the most appropriate service and infrastructure for a given application, one that delivers the desired performance at minimal cost, is a challenge. In this work, we present a methodology for migrating a deep-learning-based recommender system to an ML platform and to a serverless architecture. Furthermore, we present an experimental evaluation of AWS's ML platform, SageMaker, and its serverless compute service, Lambda. We also discuss the performance and cost trade-offs of using cloud infrastructure.
DOI: https://doi.org/10.1145/3447545.3451184
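The abstract above does not include deployment details. Purely as an illustration of what serving such a model on AWS Lambda can look like, here is a minimal, hypothetical Python handler for a TorchScript recommender; the file name model.pt and the payload fields user_ids/item_ids are invented for the sketch, not taken from the paper.

```python
# Hypothetical AWS Lambda handler serving a TorchScript recommender model.
# Assumes model.pt is packaged with the deployment artifact (in practice a container
# image, since DL runtimes easily exceed Lambda's unzipped .zip size limit) and that
# the function is invoked through an API Gateway proxy event with a JSON body.
import json

import torch

# Loaded once per container (cold start); reused across warm invocations.
MODEL = torch.jit.load("model.pt")
MODEL.eval()


def handler(event, context):
    body = json.loads(event["body"])
    users = torch.tensor(body["user_ids"], dtype=torch.long)
    items = torch.tensor(body["item_ids"], dtype=torch.long)
    with torch.no_grad():
        scores = MODEL(users, items)  # model assumed to score (user, item) pairs
    return {
        "statusCode": 200,
        "body": json.dumps({"scores": scores.tolist()}),
    }
```

On the SageMaker side, the analogous client-side call is `boto3.client("sagemaker-runtime").invoke_endpoint(EndpointName=..., ContentType="application/json", Body=...)`, with the model hosted behind a managed endpoint instead of being loaded inside the function.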
SPEC Spotlight on the International Standards Group (ISG)
Norbert Schmitt, K. Lange, Sanjay Sharma, Aaron Cragin, D. Reiner, Samuel Kounev
The driving philosophy of the Standard Performance Evaluation Corporation (SPEC) is to ensure that the marketplace has a fair and useful set of metrics to differentiate systems by providing standardized benchmark suites and international standards. This poster paper gives an overview of SPEC with a focus on the newly founded International Standards Group (ISG).
DOI: https://doi.org/10.1145/3447545.3451171
Demonstration Paper: Monitoring Machine Learning Contracts with QoA4ML
M. Nguyen, Hong Linh Truong
When using machine learning (ML) services, both service customers and providers need to monitor complex contractual constraints of the ML service that are strongly related to ML models and data. Establishing and monitoring comprehensive ML contracts is therefore crucial in ML serving. This paper demonstrates a set of features and utilities of the QoA4ML framework for ML contracts.
DOI: https://doi.org/10.1145/3447545.3451172
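The paper demonstrates QoA4ML itself; the snippet below is not the QoA4ML API but a generic, invented illustration of the kind of contractual constraint check such a framework monitors at runtime: a report of observed quality metrics is validated against agreed bounds.

```python
# Generic illustration of an ML service contract check -- NOT the QoA4ML API.
# A "contract" here is just a dict of metric bounds agreed between provider and customer.
from typing import Dict, List

CONTRACT = {
    "accuracy": {"min": 0.90},            # model quality constraint
    "response_time_ms": {"max": 150.0},   # serving latency constraint
    "data_completeness": {"min": 0.95},   # data quality constraint
}


def check_contract(report: Dict[str, float], contract=CONTRACT) -> List[str]:
    """Return human-readable violations for one monitoring report."""
    violations = []
    for metric, bounds in contract.items():
        value = report.get(metric)
        if value is None:
            violations.append(f"{metric}: no measurement reported")
            continue
        if "min" in bounds and value < bounds["min"]:
            violations.append(f"{metric}={value} below agreed minimum {bounds['min']}")
        if "max" in bounds and value > bounds["max"]:
            violations.append(f"{metric}={value} above agreed maximum {bounds['max']}")
    return violations


print(check_contract({"accuracy": 0.92, "response_time_ms": 210.0, "data_completeness": 0.97}))
```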
On Preventively Minimizing the Performance Impact of Black Swans (Vision Paper)
A. Bondi
Recent episodes of web overloads suggest the need to test system performance under loads that reflect extreme variations in usage patterns, well outside normal anticipated ranges. These loads are sometimes expected or even scheduled. Examples of expected loads include surges in transactions or request submissions when popular rock concert tickets go on sale, when the deadline for the submission of census forms approaches, and when a desperate population is attempting to sign up for a vaccination during a pandemic. Examples of unexpected loads are the surge in unemployment benefit applications in many US states with the onset of COVID-19 lockdowns, and repeated queries about the geographic distribution of signatories on the U.K. Parliament's petition website prior to a Brexit vote in 2019. We consider the software performance ramifications of these examples and the architectural questions they raise. We discuss how modeling, performance testing, and known processes for evaluating architectures and designs can be used to identify potential performance issues caused by sudden increases in load or changes in load patterns.
DOI: https://doi.org/10.1145/3447545.3451204
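To make the kind of testing advocated above concrete, here is a minimal open-loop load-generator sketch that superimposes a sudden "black swan" spike on a baseline Poisson arrival process. The target URL, the request rates, and the spike window are placeholder values, not from the talk.

```python
# Minimal open-loop load generator: baseline Poisson arrivals with a sudden spike.
# All parameters are placeholders chosen for illustration.
import asyncio
import random
import time

import aiohttp

BASE_RPS, SPIKE_RPS = 20, 400      # requests/second before and during the surge
SPIKE_START, SPIKE_END = 60, 120   # seconds into the test


async def fire(session, url, latencies):
    t0 = time.perf_counter()
    try:
        async with session.get(url) as resp:
            await resp.read()
            latencies.append((resp.status, time.perf_counter() - t0))
    except aiohttp.ClientError:
        latencies.append(("error", time.perf_counter() - t0))


async def run(url="http://localhost:8080/signup", duration=180):
    latencies, tasks = [], []
    start = time.perf_counter()
    async with aiohttp.ClientSession() as session:
        while (elapsed := time.perf_counter() - start) < duration:
            rps = SPIKE_RPS if SPIKE_START <= elapsed < SPIKE_END else BASE_RPS
            tasks.append(asyncio.create_task(fire(session, url, latencies)))
            await asyncio.sleep(random.expovariate(rps))  # Poisson inter-arrival times
        await asyncio.gather(*tasks)  # let in-flight requests finish
    return latencies

# asyncio.run(run())
```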
Performance Engineering and Database Development at MongoDB
D. Daly
Performance and the related properties of stability and resilience are essential to MongoDB. We have invested heavily in these areas: involving all development engineers in aspects of performance, building a team of specialized performance engineers to understand issues that do not fit neatly within the scope of individual development teams, and dedicating multiple teams to develop and support tools for performance testing and analysis. We have built an automated infrastructure for performance testing that is integrated with our continuous integration system. Performance tests routinely run against our development branch in order to identify changes in performance as early as possible. We have invested heavily to ensure both that results are low noise and reproducible and that we can detect when performance changes. We continue to invest to make the system better and to make it easier to add new workloads. All development engineers are expected to interact with our performance system: investigating performance changes, fixing regressions, and adding new performance tests. We also expect performance to be considered at project design time. The project design should ensure that there is appropriate performance coverage for the project, which may require repurposing existing tests or adding new ones. Not all performance issues are specific to a team or software module. Some issues emerge from the interaction of multiple modules or interaction with external systems or software. To attack these larger problems, we have a dedicated performance team. Our performance team is responsible for investigating these more complex issues, identifying high value areas for improvement, as well as helping guide the development engineers with their performance tests. We have experience both hiring and training engineers for the performance engineering skills needed to ship a performant database system. In this talk we will cover the skills needed for our performance activities and which skills, if added to undergraduate curricula, would help us the most. We will address the skills we would like all development engineers to have, as well as those for our dedicated performance team.
DOI: https://doi.org/10.1145/3447545.3451199
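As a deliberately simplified stand-in for the automated regression detection described above (MongoDB's production tooling is far more sophisticated), the sketch below flags workloads whose latest CI result falls well below a rolling baseline. The thresholds and data layout are illustrative assumptions.

```python
# Simple stand-in for an automated CI performance check: compare the latest result
# for each workload against a rolling baseline and flag likely regressions.
from statistics import mean, stdev


def flag_regressions(history, latest, min_samples=10, sigmas=3.0, rel_drop=0.05):
    """history: {workload: [past throughputs]}, latest: {workload: throughput}."""
    flagged = {}
    for workload, value in latest.items():
        past = history.get(workload, [])
        if len(past) < min_samples:
            continue  # not enough data for a stable baseline
        mu, sd = mean(past), stdev(past)
        drop = (mu - value) / mu if mu else 0.0
        # Flag only if the drop is both statistically unusual and practically relevant.
        if value < mu - sigmas * sd and drop > rel_drop:
            flagged[workload] = {"baseline": mu, "latest": value, "drop_pct": 100 * drop}
    return flagged
```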
Viability of Azure IoT Hub for Processing High Velocity Large Scale IoT Data
Wajdi Halabi, Daniel N. Smith, J. Hill, Jason W. Anderson, Ken E. Kennedy, Brandon Posey, Linh Ngo, A. Apon
We utilize the Clemson supercomputer to generate a massive workload for testing the performance of Microsoft Azure IoT Hub. The workload emulates sensor data from a large manufacturing facility. We study the effects of message frequency, distribution, and size on round-trip latency for different IoT Hub configurations. Significant variation in latency occurs when the system exceeds IoT Hub specifications. The results are predictable and well-behaved for a well-engineered system and can meet soft real-time deadlines.
DOI: https://doi.org/10.1145/3447545.3451187
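As an illustration of how such a sensor workload can be emulated from a driver node, the sketch below sends timestamped telemetry to Azure IoT Hub with the azure-iot-device Python SDK. Measuring round-trip latency as in the paper would additionally require consuming the messages from the hub's Event Hub-compatible endpoint and matching them by the embedded id, which is omitted here; the connection-string variable, rate, payload size, and duration are placeholders.

```python
# Sketch of emulating one sensor's telemetry stream toward Azure IoT Hub.
# Round-trip measurement (consuming the messages back) is intentionally omitted.
import json
import os
import random
import time
import uuid

from azure.iot.device import IoTHubDeviceClient, Message


def emulate_sensor(rate_hz=10.0, payload_bytes=256, duration_s=60):
    client = IoTHubDeviceClient.create_from_connection_string(
        os.environ["IOTHUB_DEVICE_CONNECTION_STRING"])
    client.connect()
    padding = "x" * payload_bytes          # controls (approximate) message size
    end = time.time() + duration_s
    while time.time() < end:
        body = {"id": str(uuid.uuid4()), "sent_at": time.time(), "data": padding}
        client.send_message(Message(json.dumps(body)))
        time.sleep(random.expovariate(rate_hz))  # Poisson-like message arrivals
    client.disconnect()
```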
An Empirical Evaluation of the Performance of Video Conferencing Systems
Richard Bieringa, Abijith Radhakrishnan, Tavneet Singh, Sophie Vos, Jesse Donkervliet, A. Iosup
The global COVID-19 pandemic forced society to shift to remote education and work. This shift relies on video conferencing systems (VCSs) such as Zoom, Microsoft Teams, and Jitsi, consequently increasing the pressure on their digital service infrastructure. Although understanding the performance of these essential cloud services could lead to better designs and improved service deployments, only limited research on this topic currently exists. Addressing this problem, in this work we propose an experimental method to analyze and compare VCSs. Our method is based on real-world experiments in which the client side is controlled, and focuses on VCS resource requirements and performance. We design and implement a tool to automatically conduct these real-world experiments, and use it to compare three platforms on the client side: Zoom, Microsoft Teams, and Jitsi. Our work shows that there are significant differences between the tested systems in terms of resource usage and performance variability, and provides evidence for a suspected memory leak in Zoom, the system widely regarded as the industry market leader.
DOI: https://doi.org/10.1145/3447545.3451186
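A controlled client-side experiment of this kind needs, at minimum, periodic sampling of the conferencing client's resource usage. The sketch below does that with psutil, matching processes by name; the process-name substrings, sampling interval, and CSV layout are assumptions, not the authors' tooling.

```python
# Minimal client-side resource sampler: periodically record CPU and resident memory
# of running conferencing clients, matched by process name.
import csv
import time

import psutil

CLIENT_NAMES = {"zoom", "teams", "jitsi"}  # substrings to match in process names


def sample(outfile="vcs_usage.csv", interval_s=1.0, duration_s=600):
    procs = [p for p in psutil.process_iter(["name"])
             if any(n in (p.info["name"] or "").lower() for n in CLIENT_NAMES)]
    for p in procs:
        p.cpu_percent(None)  # prime CPU counters; first reading is meaningless
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "pid", "name", "cpu_percent", "rss_mb"])
        end = time.time() + duration_s
        while time.time() < end:
            now = time.time()
            for p in procs:
                try:
                    writer.writerow([now, p.pid, p.info["name"],
                                     p.cpu_percent(None),
                                     p.memory_info().rss / 2**20])
                except psutil.NoSuchProcess:
                    pass  # client exited during the experiment
            time.sleep(interval_s)
```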
Cloud Performance Variability Prediction
Yuxuan Zhao, Dmitry Duplyakin, R. Ricci, Alexandru Uta
Cloud computing plays an essential role in our society nowadays. Many important services are highly dependent on the stable performance of the cloud. However, as prior work has shown, clouds exhibit large degrees of performance variability. Besides the stochastic variation induced by noisy neighbors, an important facet of cloud performance variability is given by changepoints: the instances where the non-stationary performance metrics exhibit persisting changes, which often last until subsequent changepoints occur. Such undesirable artifacts of unstable application performance complicate application performance evaluation and prediction efforts. Thus, characterizing and understanding performance changepoints become important elements of studying application performance in the cloud. In this paper, we showcase and tune two different changepoint detection methods, and demonstrate how the timing of the changepoints they identify can be predicted. We present a gradient-boosting-based prediction method, show that it can achieve good prediction accuracy, and give advice to practitioners on how to use our results.
DOI: https://doi.org/10.1145/3447545.3451182
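In the spirit of the paper (though not its exact methods or tuning), the sketch below detects changepoints in a synthetic performance series with PELT from the ruptures library and then trains scikit-learn's GradientBoostingRegressor to predict how many samples remain until the next changepoint from simple rolling-window features. The penalty value, window size, and feature set are illustrative choices.

```python
# Changepoint detection (PELT via ruptures) plus a gradient-boosting predictor for
# the gap until the next changepoint. Synthetic data and parameters are illustrative.
import numpy as np
import ruptures as rpt
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Synthetic "cloud benchmark" series: piecewise-constant mean plus noise.
signal = np.concatenate(
    [rng.normal(m, 1.0, n) for m, n in [(10, 300), (14, 200), (9, 250), (12, 250)]])

# 1) Detect changepoints (returned as end indices of segments, last one == len(signal)).
breakpoints = rpt.Pelt(model="rbf").fit(signal).predict(pen=10)

# 2) Build features/labels: rolling-window statistics -> samples until next changepoint.
win = 50
X, y = [], []
for start in range(0, len(signal) - win, win):
    w = signal[start:start + win]
    t = start + win
    next_cp = min((b for b in breakpoints if b > t), default=len(signal))
    X.append([w.mean(), w.std(), w.max() - w.min()])
    y.append(next_cp - t)

model = GradientBoostingRegressor().fit(X[:-3], y[:-3])
print("predicted samples to next changepoint:", model.predict(X[-3:]))
```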
Towards Extraction of Message-Based Communication in Mixed-Technology Architectures for Performance Model
Snigdha Singh, Yves Richard Kirschner, A. Koziolek
Software systems architected using multiple technologies are becoming popular. Many developers use these technologies because they offer high service quality that has often been optimized for performance. Although performance is key for such technology-mixed software applications, there is still little research on performance evaluation approaches that explicitly consider architecture extraction for performance modelling and prediction. In this paper, we discuss the opportunities and challenges in applying existing architecture extraction approaches to support model-driven performance prediction for technology-mixed software. Further, we discuss how such extraction can be extended to support message-based systems. We describe how the various technologies from which the architecture is derived can be transformed to create the performance model. To ground the work, we use a case study from the energy system domain as a running example to support our arguments and observations throughout the paper.
DOI: https://doi.org/10.1145/3447545.3451201
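To make the idea of an extraction target concrete, the snippet below shows one possible (invented) intermediate representation for extracted message-based communication: channels with their publishers, subscribers, and observed message rates, from which a performance model could then be generated. This is not the paper's metamodel; the component and channel names merely echo its energy-domain case study.

```python
# Invented intermediate representation for extracted message-based communication.
# Each channel records who publishes, who subscribes, and an observed arrival rate,
# i.e., the ingredients a per-channel queueing model would need.
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    publishers: set = field(default_factory=set)
    subscribers: set = field(default_factory=set)
    msgs_per_sec: float = 0.0  # measured or estimated arrival rate


@dataclass
class ExtractedArchitecture:
    channels: dict = field(default_factory=dict)

    def add_publish(self, component, channel, rate=0.0):
        ch = self.channels.setdefault(channel, Channel(channel))
        ch.publishers.add(component)
        ch.msgs_per_sec += rate

    def add_subscribe(self, component, channel):
        self.channels.setdefault(channel, Channel(channel)).subscribers.add(component)


arch = ExtractedArchitecture()
arch.add_publish("SmartMeterGateway", "meter/readings", rate=120.0)
arch.add_subscribe("LoadForecaster", "meter/readings")
print(arch.channels["meter/readings"])
```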
A New Course on Systems Benchmarking - For Scientists and Engineers
Samuel Kounev
A benchmark is a tool coupled with a methodology for the evaluation and comparison of systems or components with respect to specific characteristics, such as performance, reliability, or security. Benchmarks enable educated purchasing decisions and play an important role as evaluation tools during system design, development, and maintenance. In research, benchmarks play an integral part in evaluation and validation of new approaches and methodologies. Traditional benchmarks have been focused on evaluating performance, typically understood as the amount of useful work accomplished by a system (or component) compared to the time and resources used. Ranging from simple benchmarks, targeting specific hardware or software components, to large and complex benchmarks focusing on entire systems (e.g., information systems, storage systems, cloud platforms), performance benchmarks have contributed significantly to improving successive generations of systems. Beyond traditional performance benchmarking, research on dependability benchmarking has increased in the past two decades. Due to the increasing relevance of security issues, security benchmarking has also become an important research field. Finally, resilience benchmarking faces challenges related to the integration of performance, dependability, and security benchmarking as well as to the adaptive characteristics of the systems under consideration. Each benchmark is characterized by three key aspects: metrics, workloads, and measurement methodology. The metrics determine what values should be derived based on measurements to produce the benchmark results. The workloads determine under which usage scenarios and conditions (e.g., executed programs, induced system load, injected failures/security attacks) measurements should be performed to derive the metrics. Finally, the measurement methodology defines the end-to-end process to execute the benchmark, collect measurements, and produce the benchmark results. The increasing size and complexity of modern systems make the engineering of benchmarks a challenging task. Thus, we see the need for better education on the theoretical and practical foundations necessary for gaining a deep understanding of benchmarking and the benchmark engineering process. In this talk, we present an overview of a new course focused on systems benchmarking, based on our book "Systems Benchmarking - For Scientists and Engineers" (http://benchmarking-book.com/). The course captures the experience we have gained over the past 15 years in teaching a regular graduate course on performance engineering of computing systems. The latter has been taught at four European universities since 2006, including the University of Cambridge, the Technical University of Catalonia, the Karlsruhe Institute of Technology, and the University of Würzburg. The conception, design, and development of benchmarks requires a thorough understanding of the benchmarking fundamentals beyond an understanding of the system under test.
DOI: https://doi.org/10.1145/3447545.3451198
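The metrics/workload/methodology decomposition described above can be made tangible with a toy harness. The sketch below is purely illustrative: the workload is a stand-in CPU-bound task, the methodology is warm-up plus repeated timed runs, and the metrics are derived from the collected samples.

```python
# Toy harness making the three benchmark ingredients explicit:
# workload (what is executed), measurement methodology (warm-up + repeated timed runs),
# and metrics (derived from the measurement samples). Entirely illustrative.
import statistics
import time


def workload():
    """The unit of useful work; here a stand-in CPU-bound task."""
    return sum(i * i for i in range(200_000))


def run_benchmark(workload, warmups=3, repetitions=30):
    for _ in range(warmups):          # methodology: discard warm-up runs
        workload()
    samples = []
    for _ in range(repetitions):      # methodology: repeated measurements
        t0 = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - t0)
    return {                          # metrics: derived from the samples
        "mean_s": statistics.mean(samples),
        "p95_s": statistics.quantiles(samples, n=20)[-1],
        "throughput_ops_per_s": 1.0 / statistics.mean(samples),
        "cv": statistics.stdev(samples) / statistics.mean(samples),
    }


print(run_benchmark(workload))
```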