
Latest publications from the Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering

Performance Prediction of Explicit ODE Methods on Multi-Core Cluster Systems
M. Scherg, Johannes Seiferth, Matthias Korch, T. Rauber
When migrating a scientific application to a new HPC system, the program code usually has to be re-tuned to achieve the best possible performance. Auto-tuning techniques are a promising approach to support the portability of performance. Often, a large pool of possible implementation variants exists from which the most efficient variant needs to be selected. Ideally, auto-tuning approaches should be capable of undertaking this task in an efficient manner for a new HPC system and new characteristics of the input data by applying suitable analytic models and program transformations. In this article, we discuss a performance prediction methodology for multi-core cluster applications, which can assist this selection process by significantly reducing the selection effort compared to in-depth runtime tests. The methodology proposed is an extension of an analytical performance prediction model for shared-memory applications introduced in our previous work. Our methodology is based on the execution-cache-memory (ECM) performance model and estimations of intra-node and inter-node communication costs, which we apply to numerical solution methods for ordinary differential equations (ODEs). In particular, we investigate whether it is possible to obtain accurate performance predictions for hybrid MPI/OpenMP implementation variants in order to support the variant selection. We demonstrate that our approach is able to reliably select a set of efficient variants for a given configuration (ODE system, solver and hardware platform) and, thus, to narrow down the search space for possible later empirical tuning.
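To make the prediction pipeline concrete, here is a minimal, hypothetical sketch of how an ECM-style node estimate can be combined with a simple latency-bandwidth communication term; the cycle contributions, core count, and network parameters below are invented placeholders and not the model or values from the paper.

```python
# Toy ECM-style prediction for one node plus a simple linear communication model.
# All numbers below are hypothetical placeholders, not values from the paper.

def ecm_single_core_cycles(t_ol, t_nol, t_l1l2, t_l2l3, t_l3mem):
    """ECM estimate (cycles per cache line of work) on one core:
    in-core work overlaps with data transfers through the memory hierarchy."""
    return max(t_ol, t_nol + t_l1l2 + t_l2l3 + t_l3mem)

def node_runtime_cycles(cache_lines, cores, t_contribs, t_mem_bound):
    """Scale across cores until the memory-bandwidth limit is hit."""
    t_single = ecm_single_core_cycles(*t_contribs)
    t_parallel = t_single / cores              # ideal multi-core scaling
    return cache_lines * max(t_parallel, t_mem_bound)

def cluster_runtime_seconds(node_cycles, freq_hz, msg_bytes, latency_s, bandwidth_bps):
    """Add a latency + size/bandwidth term for inter-node communication."""
    t_comp = node_cycles / freq_hz
    t_comm = latency_s + msg_bytes / bandwidth_bps
    return t_comp + t_comm

# Example with made-up parameters for one ODE stage:
contribs = (4.0, 2.0, 3.0, 3.0, 6.0)           # cycles per cache line
cycles = node_runtime_cycles(cache_lines=1_000_000, cores=8,
                             t_contribs=contribs, t_mem_bound=1.5)
print(cluster_runtime_seconds(cycles, freq_hz=2.5e9, msg_bytes=8 * 200_000,
                              latency_s=2e-6, bandwidth_bps=6e9))
```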
{"title":"Performance Prediction of Explicit ODE Methods on Multi-Core Cluster Systems","authors":"M. Scherg, Johannes Seiferth, Matthias Korch, T. Rauber","doi":"10.1145/3297663.3310306","DOIUrl":"https://doi.org/10.1145/3297663.3310306","url":null,"abstract":"When migrating a scientific application to a new HPC system, the program code usually has to be re-tuned to achieve the best possible performance. Auto-tuning techniques are a promising approach to support the portability of performance. Often, a large pool of possible implementation variants exists from which the most efficient variant needs to be selected. Ideally, auto-tuning approaches should be capable of undertaking this task in an efficient manner for a new HPC system and new characteristics of the input data by applying suitable analytic models and program transformations. In this article, we discuss a performance prediction methodology for multi-core cluster applications, which can assist this selection process by significantly reducing the selection effort compared to in-depth runtime tests. The methodology proposed is an extension of an analytical performance prediction model for shared-memory applications introduced in our previous work. Our methodology is based on the execution-cache-memory (ECM) performance model and estimations of intra-node and inter-node communication costs, which we apply to numerical solution methods for ordinary differential equations (ODEs). In particular, we investigate whether it is possible to obtain accurate performance predictions for hybrid MPI/OpenMP implementation variants in order to support the variant selection. We demonstrate that our approach is able to reliably select a set of efficient variants for a given configuration (ODE system, solver and hardware platform) and, thus, to narrow down the search space for possible later empirical tuning.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121878768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
SPEC CPU2017: Performance, Event, and Energy Characterization on the Core i7-8700K
Ranjan Hebbar, A. Milenković
Computer engineers in academia and industry rely on a standardized set of benchmarks to quantitatively evaluate the performance of computer systems and research prototypes. SPEC CPU2017 is the most recent incarnation of standard benchmarks designed to stress a system's processor, memory subsystem, and compiler. This paper describes the results of measurement-based studies focusing on characterization, performance, and energy-efficiency analyses of SPEC CPU2017 on the Intel Core i7-8700K. Intel and GNU compilers are used to create the executable files utilized in the performance studies. The results show that executables produced by the Intel compilers are superior to those produced by the GNU compilers. We characterize all the benchmarks, perform a top-down microarchitectural analysis to identify performance bottlenecks, and test benchmark scalability with respect to performance and energy. Findings from these studies can be used to guide future performance evaluations and computer architecture research.
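As an illustration of the counter-based derivations such a characterization relies on, the sketch below computes IPC and the four top-down level-1 categories from raw event counts, following Intel's published top-down formulas; the counter values are invented sample numbers, and this is not the authors' measurement tooling.

```python
# Derive IPC and the Intel top-down level-1 breakdown from raw counter values.
# The counts below are invented sample numbers, not measurements from the paper.

def top_down_level1(slots, uops_retired, uops_issued, recovery_cycles,
                    frontend_stall_slots, cycles, instructions):
    retiring = uops_retired / slots
    bad_spec = (uops_issued - uops_retired + 4 * recovery_cycles) / slots
    frontend = frontend_stall_slots / slots
    backend = 1.0 - retiring - bad_spec - frontend
    return {
        "IPC": instructions / cycles,
        "Retiring": retiring,
        "Bad speculation": bad_spec,
        "Frontend bound": frontend,
        "Backend bound": backend,
    }

counters = dict(
    cycles=1_000_000_000, instructions=1_450_000_000,
    slots=4 * 1_000_000_000,                     # 4 issue slots per cycle
    uops_retired=1_800_000_000, uops_issued=2_000_000_000,
    recovery_cycles=30_000_000, frontend_stall_slots=600_000_000,
)
for metric, value in top_down_level1(**counters).items():
    print(f"{metric:16s} {value:.3f}")
```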
{"title":"SPEC CPU2017: Performance, Event, and Energy Characterization on the Core i7-8700K","authors":"Ranjan Hebbar, A. Milenković","doi":"10.1145/3297663.3310314","DOIUrl":"https://doi.org/10.1145/3297663.3310314","url":null,"abstract":"Computer engineers in academia and industry rely on a standardized set of benchmarks to quantitatively evaluate the performance of computer systems and research prototypes. SPEC CPU2017 is the most recent incarnation of standard benchmarks designed to stress a system's processor, memory subsystem, and compiler. This paper describes the results of measurement-based studies focusing on characterization, performance, and energy-efficiency analyses of SPEC CPU2017 on the Intel's Core i7-8700K. Intel and GNU compilers are used to create executable files utilized in performance studies. The results show that executables produced by the Intel compilers are superior to those produced by GNU compilers. We characterize all the benchmarks, perform a top-down microarchitectural analysis to identify performance bottlenecks, and test benchmark scalability with respect to performance and energy. Findings from these studies can be used to guide future performance evaluations and computer architecture research","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130625222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Memory Centric Characterization and Analysis of SPEC CPU2017 Suite
Sarabjeet Singh, M. Awasthi
In this paper, we provide a comprehensive, memory-centric characterization of the SPEC CPU2017 benchmark suite, using a number of mechanisms including dynamic binary instrumentation, measurements on native hardware using hardware performance counters and operating system based tools. We present a number of results including working set sizes, memory capacity consumption and memory bandwidth utilization of various workloads. Our experiments reveal that, on the x86_64 ISA, SPEC CPU2017 workloads execute a significant number of memory related instructions, with approximately 50% of all dynamic instructions requiring memory accesses. We also show that there is a large variation in the memory footprint and bandwidth utilization profiles of the entire suite, with some benchmarks using as much as 16 GB of main memory and up to 2.3 GB/s of memory bandwidth. We perform instruction distribution analysis of the benchmark suite and find that the average instruction count for SPEC CPU2017 workloads is an order of magnitude higher than SPEC CPU2006 ones. In addition, we also find that FP benchmarks of the suite have higher compute requirements: on average, FP workloads execute three times the number of compute operations as compared to INT workloads.
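The following sketch shows, under simplifying assumptions, how a working-set size and an average bandwidth figure can be derived from a memory access trace of (address, size) pairs such as one produced by dynamic binary instrumentation; the trace format and numbers are illustrative only, not the paper's methodology.

```python
# Estimate working-set size and average bandwidth from a simplified memory trace.
# The synthetic trace and runtime below are assumptions for illustration only.
import random

CACHE_LINE = 64  # bytes

def characterize(trace, runtime_seconds):
    """trace: iterable of (address, bytes_accessed) pairs."""
    lines = set()
    total_bytes = 0
    for addr, size in trace:
        total_bytes += size
        # every cache line touched by this access counts toward the footprint
        for line in range(addr // CACHE_LINE, (addr + size - 1) // CACHE_LINE + 1):
            lines.add(line)
    return {
        "working_set_MiB": len(lines) * CACHE_LINE / 2**20,
        "bandwidth_GBps": total_bytes / runtime_seconds / 1e9,
    }

# Tiny synthetic example: 100k random 8-byte accesses over a 1 GiB range.
trace = [(random.randrange(0, 1 << 30), 8) for _ in range(100_000)]
print(characterize(trace, runtime_seconds=0.5))
```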
{"title":"Memory Centric Characterization and Analysis of SPEC CPU2017 Suite","authors":"Sarabjeet Singh, M. Awasthi","doi":"10.1145/3297663.3310311","DOIUrl":"https://doi.org/10.1145/3297663.3310311","url":null,"abstract":"In this paper, we provide a comprehensive, memory-centric characterization of the SPEC CPU2017 benchmark suite, using a number of mechanisms including dynamic binary instrumentation, measurements on native hardware using hardware performance counters and operating system based tools. We present a number of results including working set sizes, memory capacity consumption and memory bandwidth utilization of various workloads. Our experiments reveal that, on the x86_64 ISA, SPEC CPU2017 workloads execute a significant number of memory related instructions, with approximately 50% of all dynamic instructions requiring memory accesses. We also show that there is a large variation in the memory footprint and bandwidth utilization profiles of the entire suite, with some benchmarks using as much as 16 GB of main memory and up to 2.3 GB/s of memory bandwidth. We perform instruction distribution analysis of the benchmark suite and find that the average instruction count for SPEC CPU2017 workloads is an order of magnitude higher than SPEC CPU2006 ones. In addition, we also find that FP benchmarks of the suite have higher compute requirements: on average, FP workloads execute three times the number of compute operations as compared to INT workloads.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116187728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26
Performance Oriented Dynamic Bypassing for Intrusion Detection Systems
Lukas Iffländer, Jonathan Stoll, Nishant Rawtani, Veronika Lesch, K. Lange, Samuel Kounev
Attacks on software systems are becoming more frequent, aggressive, and sophisticated. With the changing threat landscape, in 2018 organizations are asking when they will be attacked, not if. Intrusion Detection Systems (IDSs) can help in defending against these attacks. The systems that host IDSs require extensive computing resources, because IDSs tend to produce incorrect detections when operating under overload. With the end of Moore's law and the growing adoption of the Internet of Things, designers of security systems can no longer expect processing power to keep pace with demand. This limitation calls for ways to increase the performance of these systems without adding compute power. In this work, we present two dynamic approaches and a static approach to bypassing the IDS for traffic deemed benign. We provide a prototype implementation and evaluate our solution. Our evaluation shows promising results: performance is increased up to the level of a system without an IDS, and attack detection remains within the margin of error of the 100% rate. However, our findings show that the dynamic approaches perform best when using software switches; the use of a hardware switch reduces the detection rate and performance significantly.
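A toy sketch of the bypass decision described above: once the IDS approaches saturation, flows that a lightweight check deems benign are routed around it. The capacity, threshold, and benign-flow heuristic are hypothetical placeholders, not the paper's prototype.

```python
# Toy dynamic-bypass decision: flows a lightweight classifier deems benign are
# routed around the IDS once the IDS approaches saturation.
# Thresholds and the heuristic are placeholders, not the paper's implementation.

IDS_CAPACITY_PPS = 50_000        # packets/s the IDS can inspect without loss
BYPASS_THRESHOLD = 0.8           # start bypassing at 80% utilisation

def is_benign(flow):
    # placeholder heuristic; a real deployment would use reputation or inspection history
    return flow["dst_port"] in (80, 443) and flow["bytes"] < 1_000_000

def route(flow, current_ids_load_pps):
    """Return the next hop for a flow: through the IDS or a direct (bypass) path."""
    utilisation = current_ids_load_pps / IDS_CAPACITY_PPS
    if utilisation >= BYPASS_THRESHOLD and is_benign(flow):
        return "bypass"
    return "ids"

print(route({"dst_port": 443, "bytes": 20_000}, current_ids_load_pps=45_000))  # bypass
print(route({"dst_port": 22,  "bytes": 20_000}, current_ids_load_pps=45_000))  # ids
```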
{"title":"Performance Oriented Dynamic Bypassing for Intrusion Detection Systems","authors":"Lukas Iffländer, Jonathan Stoll, Nishant Rawtani, Veronika Lesch, K. Lange, Samuel Kounev","doi":"10.1145/3297663.3310313","DOIUrl":"https://doi.org/10.1145/3297663.3310313","url":null,"abstract":"Attacks on software systems are becoming more and more frequent, aggressive and sophisticated. With the changing threat landscape, in 2018, organizations are looking at when they will be attacked, not if. Intrusion Detection Systems (IDSs) can help in defending against these attacks. The systems that host IDSs require extensive computing resources as IDSs tend to detect attacks under overloaded conditions wrongfully. With the end of Moore's law and the growing adoption of Internet of Things, designers of security systems can no longer expect processing power to keep up the pace with them. This limitation requires ways to increase the performance of these systems without adding additional compute power. In this work, we present two dynamic and a static approach to bypass IDS for traffic deemed benign. We provide its prototype implementation and evaluate our solution. Our evaluation shows promising results. Performance is increased up to the level of a system without an IDS. Attack detection is within the margin of error from the 100% rate. However, our findings show that dynamic approaches perform best when using software switches. The use of a hardware switch reduces the detection rate and performance significantly.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121478985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Follower Core: A Model To Simulate Large Multicore SoCs
Tanuj Agarwal, Bill Jones, A. Bhowmik
A cycle-accurate simulator is a critical tool for processor design, but as the complexity and core count of processors increase, simulation becomes extremely time- and resource-consuming and hence impractical. Accurate multi-core performance estimation in realistic time is needed to make the right design choices and produce high-quality performance projections. In this work we present a multi-core simulation model called Follower Core that helps us approximate multi-core simulations by simulating some cores in detail and abstracting out the other cores without reducing the overall activity at the shared resources. This enables us to simulate all the critical shared resources in the multi-core system accurately, so that the detailed cores can provide correct performance estimates. The approach is applied on top of existing simulation models and reduces simulation time significantly, especially for long-running workloads. The Follower Core model provides an average speedup of 3x compared to the baseline, is an accurate approximation of detailed multi-core simulations with a maximum error of 2% relative to the baseline model, and extends our capabilities by improving coverage and providing the flexibility to run mixed workloads.
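The toy sketch below illustrates the stated idea in the simplest possible form: one detailed core is slowed down according to the aggregate request rate that it and the abstracted follower cores place on a shared-memory bandwidth model. All rates and constants are invented for illustration and bear no relation to the actual simulator.

```python
# Toy illustration of the follower-core idea: one core is modelled in detail,
# the remaining cores only replay their recorded request rates against the shared
# memory model so that contention at shared resources stays realistic.

DETAILED_CPI_BASE = 1.0            # cycles per instruction without contention (invented)
SHARED_BW_REQ_PER_CYCLE = 4        # shared-memory requests served per cycle (invented)

def simulate(detailed_instr, detailed_req_rate, follower_req_rates):
    """Return estimated cycles for the detailed core under shared-resource contention."""
    total_rate = detailed_req_rate + sum(follower_req_rates)   # requests per cycle
    # if aggregate demand exceeds shared bandwidth, the detailed core stalls proportionally
    slowdown = max(1.0, total_rate / SHARED_BW_REQ_PER_CYCLE)
    return int(detailed_instr * DETAILED_CPI_BASE * slowdown)

alone     = simulate(10_000_000, 0.5, [])
contended = simulate(10_000_000, 0.5, [0.5] * 7)   # 7 follower cores
print(alone, contended, contended / alone)
```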
{"title":"Follower Core: A Model To Simulate Large Multicore SoCs","authors":"Tanuj Agarwal, Bill Jones, A. Bhowmik","doi":"10.1145/3297663.3309678","DOIUrl":"https://doi.org/10.1145/3297663.3309678","url":null,"abstract":"Cycle accurate simulator is a critical tool for processor design and as the complexity and the core count of the processor increase, the simulation becomes extremely time and resource consuming and hence not very practical. Accurate multi-core performance estimation in realistic time is needed for making the right design choices and make high quality performance projections. In this work we present a multi-core simulation model called Follower Core, that helps us to approximate the multi-core simulations by simulating some cores in detail and abstracting out the other cores without reducing the overall activities at the shared resources. This enables us to simulate all the critical shared resources in the multi-core system accurately and hence the detailed core can provide correct performance estimation. The approach is applied over existing simulation models and it reduces the simulation time significantly, especially for long running workloads. The 'Follower Core' model provides an average speed up of 3x compared to baseline and is an accurate approximation of detailed multi-core simulations with a maximum error of 2% with the baseline model and extends our capabilities by improving our coverage and providing flexibilities to run mixed workloads.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131568458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Performance Evaluation of Multi-Path TCP for Data Center and Cloud Workloads
Lucas Chaufournier, A. Ali-Eldin, Prateek Sharma, P. Shenoy, D. Towsley
Today's cloud data centers host a wide range of applications including data analytics, batch processing, and interactive processing. These applications require high throughput, low latency, and high reliability from the network. Satisfying these requirements in the face of dynamically varying network conditions remains a challenging problem. Multi-Path TCP (MPTCP) is a recently proposed IETF extension to TCP that divides a conventional TCP flow into multiple subflows so as to utilize multiple paths over the network. Despite the theoretical and practical benefits of MPTCP, its effectiveness for cloud applications and environments remains unclear as there has been little work to quantify the benefits of MPTCP for real cloud applications. We present a broad empirical study of the effectiveness and feasibility of MPTCP for data center and cloud applications, under different network conditions. Our results show that while MPTCP provides useful bandwidth aggregation, congestion avoidance, and improved resiliency for some cloud applications, these benefits do not apply uniformly across applications, especially in cloud settings.
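For readers unfamiliar with how an application opts in to MPTCP, the sketch below opens an MPTCP socket on Linux (upstream kernel support since 5.6) and falls back to plain TCP when the protocol is unavailable; it is only an illustration of the socket API, not the measurement setup used in the study, and the host/port in the example are placeholders.

```python
# Open an MPTCP socket on Linux (kernel >= 5.6); fall back to plain TCP if the
# protocol is unavailable. This shows how an application opts in to MPTCP only.
import socket

IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)  # named constant exists in Python 3.10+

def connect(host, port):
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
    except OSError:
        # kernel without MPTCP support: use a conventional single-path TCP socket
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    return sock

# Example (assumes a server is listening on the placeholder address 192.0.2.10:5201):
# conn = connect("192.0.2.10", 5201)
```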
{"title":"Performance Evaluation of Multi-Path TCP for Data Center and Cloud Workloads","authors":"Lucas Chaufournier, A. Ali-Eldin, Prateek Sharma, P. Shenoy, D. Towsley","doi":"10.1145/3297663.3310295","DOIUrl":"https://doi.org/10.1145/3297663.3310295","url":null,"abstract":"Today's cloud data centers host a wide range of applications including data analytics, batch processing, and interactive processing. These applications require high throughput, low latency, and high reliability from the network. Satisfying these requirements in the face of dynamically varying network conditions remains a challenging problem. Multi-Path TCP (MPTCP) is a recently proposed IETF extension to TCP that divides a conventional TCP flow into multiple subflows so as to utilize multiple paths over the network. Despite the theoretical and practical benefits of MPTCP, its effectiveness for cloud applications and environments remains unclear as there has been little work to quantify the benefits of MPTCP for real cloud applications. We present a broad empirical study of the effectiveness and feasibility of MPTCP for data center and cloud applications, under different network conditions. Our results show that while MPTCP provides useful bandwidth aggregation, congestion avoidance, and improved resiliency for some cloud applications, these benefits do not apply uniformly across applications, especially in cloud settings.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113990968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Software Aging and Software Rejuvenation: Keynote
K. Trivedi
The study of software failures has now become more important since it has been recognized that computer system outages are more due to software faults than due to hardware faults. The phenomenon of "software aging", in which the state of the software system degrades with time, has been reported in widely used software and also in high-availability and safety-critical systems. The primary causes of this degradation are the exhaustion of operating system resources, data corruption and numerical error accumulation. This may eventually lead to performance degradation of the software system or crash/hang failure or both. To counteract this phenomenon, a proactive approach to fault management, called "software rejuvenation", has been proposed. This essentially involves gracefully terminating an application or a system and restarting it in a clean internal state. This process removes the accumulated errors and frees up operating system resources. This method therefore avoids or postpones unplanned and potentially expensive system outages due to software aging. In this talk, we discuss methods of evaluating the effectiveness of proactive fault management in operational software systems and determining optimal times to perform rejuvenation.
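One standard way to reason about the optimal rejuvenation time (a generic age-replacement formulation, not necessarily the model used in the keynote) is to minimize expected downtime per unit of uptime under an aging failure distribution; the Weibull parameters and downtime costs below are hypothetical.

```python
# Age-replacement sketch for choosing a rejuvenation interval T: minimise expected
# downtime per unit of uptime, assuming Weibull-distributed time to aging-related
# failure (shape > 1 models aging). All parameters are hypothetical.
import math

ETA, SHAPE = 500.0, 2.5        # Weibull scale (hours) and shape
C_FAIL, C_REJUV = 4.0, 0.25    # hours of downtime per crash vs. per planned restart

def failure_cdf(t):
    return 1.0 - math.exp(-((t / ETA) ** SHAPE))

def expected_uptime(T, steps=1000):
    # E[min(X, T)] = integral of the survival function from 0 to T (left Riemann sum)
    dt = T / steps
    return sum((1 - failure_cdf(i * dt)) * dt for i in range(steps))

def downtime_rate(T):
    expected_downtime = C_FAIL * failure_cdf(T) + C_REJUV * (1 - failure_cdf(T))
    return expected_downtime / expected_uptime(T)

best_T = min(range(50, 2000, 10), key=downtime_rate)
print(f"rejuvenate roughly every {best_T} h (downtime rate {downtime_rate(best_T):.4f})")
```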
{"title":"Software Aging and Software Rejuvenation: Keynote","authors":"K. Trivedi","doi":"10.1145/3297663.3310290","DOIUrl":"https://doi.org/10.1145/3297663.3310290","url":null,"abstract":"The study of software failures has now become more important since it has been recognized that computer system outages are more due to software faults than due to hardware faults. The phenome- non of \"software aging\", in which the state of the software system degrades with time, has been reported in widely used software and also in high-availability and safety-critical systems. The primary causes of this degradation are the exhaustion of operating system resources, data corruption and numerical error accumulation. This may eventually lead to performance degradation of the software system or crash/hang failure or both. To counteract this phenome- non, a proactive approach to fault management, called \"software rejuvenation\" has been proposed. This essentially involves grace- fully terminating an application or a system and restarting it in a clean internal state. This process removes the accumulated errors and frees up operating system resources. This method therefore avoids or postpones unplanned and potentially expensive system outages due to software aging. In this talk, we discuss methods of evaluating the effectiveness of proactive fault management in operational software systems and determining optimal times to perform rejuvenation.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125058580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Yardstick: A Benchmark for Minecraft-like Services
Jerom van der Sar, Jesse Donkervliet, A. Iosup
Online gaming applications entertain hundreds of millions of daily active players and often feature vastly complex architecture. Among online games, Minecraft-like games simulate unique (e.g., modifiable) environments, are virally popular, and are increasingly provided as a service. However, the performance of Minecraft-like services, and in particular their scalability, is not well understood. Moreover, currently no benchmark exists for Minecraft-like games. Addressing this knowledge gap, in this work we design and use the Yardstick benchmark to analyze the performance of Minecraft-like services. Yardstick is based on an operational model that captures salient characteristics of Minecraft-like services. As input workload, Yardstick captures important features, such as the most-popular maps used within the Minecraft community. Yardstick captures system- and application-level metrics, and derives from them service-level metrics such as frequency of game-updates under scalable workload. We implement Yardstick, and, through real-world experiments in our clusters, we explore the performance and scalability of popular Minecraft-like servers, including the official vanilla server, and the community-developed servers Spigot and Glowstone. Our findings indicate the scalability limits of these servers, that Minecraft-like services are poorly parallelized, and that Glowstone is the least viable option among those tested.
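As a flavor of the service-level metric mentioned above, the sketch below measures the game-update frequency (ticks per second) a toy server loop sustains under light and heavy per-tick work; the 50 ms tick budget mirrors Minecraft's nominal 20 ticks/s, while the workload itself is fabricated.

```python
# Measure the game-update frequency (ticks per second) a toy server loop sustains.
# The 50 ms budget matches Minecraft's nominal 20 ticks/s; the per-tick work is fake.
import time

TICK_BUDGET = 0.050   # seconds per tick at the nominal 20 ticks/s

def run_server(seconds, work_per_tick):
    ticks, start = 0, time.perf_counter()
    while time.perf_counter() - start < seconds:
        t0 = time.perf_counter()
        work_per_tick()                               # simulate world + player updates
        elapsed = time.perf_counter() - t0
        if elapsed < TICK_BUDGET:
            time.sleep(TICK_BUDGET - elapsed)         # idle until the next tick
        ticks += 1
    return ticks / seconds                            # achieved ticks per second

light = run_server(2, lambda: sum(range(10_000)))
heavy = run_server(2, lambda: sum(range(3_000_000)))
print(f"light load: {light:.1f} t/s, heavy load: {heavy:.1f} t/s")
```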
{"title":"Yardstick: A Benchmark for Minecraft-like Services","authors":"Jerom van der Sar, Jesse Donkervliet, A. Iosup","doi":"10.1145/3297663.3310307","DOIUrl":"https://doi.org/10.1145/3297663.3310307","url":null,"abstract":"Online gaming applications entertain hundreds of millions of daily active players and often feature vastly complex architecture. Among online games, Minecraft-like games simulate unique (e.g., modifiable) environments, are virally popular, and are increasingly provided as a service. However, the performance of Minecraft-like services, and in particular their scalability, is not well understood. Moreover, currently no benchmark exists for Minecraft-like games. Addressing this knowledge gap, in this work we design and use the Yardstick benchmark to analyze the performance of Minecraft-like services. Yardstick is based on an operational model that captures salient characteristics of Minecraft-like services. As input workload, Yardstick captures important features, such as the most-popular maps used within the Minecraft community. Yardstick captures system- and application-level metrics, and derives from them service-level metrics such as frequency of game-updates under scalable workload. We implement Yardstick, and, through real-world experiments in our clusters, we explore the performance and scalability of popular Minecraft-like servers, including the official vanilla server, and the community-developed servers Spigot and Glowstone. Our findings indicate the scalability limits of these servers, that Minecraft-like services are poorly parallelized, and that Glowstone is the least viable option among those tested.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121130090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Overload Protection of Cloud-IoT Applications by Feedback Control of Smart Devices
Manuel Gotin, Dominik Werle, Felix Lösch, A. Koziolek, Ralf H. Reussner
One of the most common usage scenarios for Cloud-IoT applications is Sensing-as-a-Service, which focuses on the processing of sensor data in order to make it available for other applications. Auto-scaling is a popular runtime management technique for cloud applications to cope with a varying resource demand by provisioning resources in an autonomous manner. However, if an auto-scaling system cannot provide the required resources, e.g., due to cost constraints, the cloud application is overloaded, which impacts its performance and availability. We present a feedback control mechanism to mitigate and recover from overload situations by adapting the send rate of smart devices in consideration of the current processing rate of the cloud application. This mechanism supports a coupling with the widely used threshold-based auto-scaling systems. In a case study, we demonstrate the capability of the approach to cope with overload scenarios in a realistic environment. Overall, we consider this approach as a novel tool for runtime managing cloud applications.
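A minimal sketch of the feedback idea, assuming a simple proportional controller: each control step moves the devices' send rate towards the observed processing rate of the cloud application. The gain and rate limits are placeholders rather than the paper's tuning or its coupling with a threshold-based auto-scaler.

```python
# Proportional feedback on the device send rate, driven by the observed
# processing rate of the cloud application. Gain and limits are placeholders.

KP = 0.5                          # proportional gain (assumed)
MIN_RATE, MAX_RATE = 1.0, 1000.0  # messages/s bounds (assumed)

def next_send_rate(current_send_rate, processing_rate):
    """One control step: move the send rate towards the observed processing rate."""
    error = processing_rate - current_send_rate      # positive => headroom available
    new_rate = current_send_rate + KP * error
    return max(MIN_RATE, min(MAX_RATE, new_rate))

# Overload scenario: devices send 800 msg/s but the service only processes 500 msg/s.
rate = 800.0
for step in range(5):
    rate = next_send_rate(rate, processing_rate=500.0)
    print(f"step {step}: send rate -> {rate:.0f} msg/s")
```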
{"title":"Overload Protection of Cloud-IoT Applications by Feedback Control of Smart Devices","authors":"Manuel Gotin, Dominik Werle, Felix Lösch, A. Koziolek, Ralf H. Reussner","doi":"10.1145/3297663.3309673","DOIUrl":"https://doi.org/10.1145/3297663.3309673","url":null,"abstract":"One of the most common usage scenarios for Cloud-IoT applications is Sensing-as-a-Service, which focuses on the processing of sensor data in order to make it available for other applications. Auto-scaling is a popular runtime management technique for cloud applications to cope with a varying resource demand by provisioning resources in an autonomous manner. However, if an auto-scaling system cannot provide the required resources, e.g., due to cost constraints, the cloud application is overloaded, which impacts its performance and availability. We present a feedback control mechanism to mitigate and recover from overload situations by adapting the send rate of smart devices in consideration of the current processing rate of the cloud application. This mechanism supports a coupling with the widely used threshold-based auto-scaling systems. In a case study, we demonstrate the capability of the approach to cope with overload scenarios in a realistic environment. Overall, we consider this approach as a novel tool for runtime managing cloud applications.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127344097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Behavior-driven Load Testing Using Contextual Knowledge - Approach and Experiences
Henning Schulz, Dusan Okanovic, A. Hoorn, Vincenzo Ferme, C. Pautasso
Load testing is widely considered a meaningful technique for performance quality assurance. However, empirical studies reveal that in practice, load testing is not applied systematically, due to the sound expert knowledge required to specify, implement, and execute load tests. Our Behavior-driven Load Testing (BDLT) approach eases load test specification and execution for users with no or little expert knowledge. It allows a user to describe a load test in a template-based natural language and to rely on an automated framework to execute the test. Utilizing the system's contextual knowledge such as workload-influencing events, the framework automatically determines the workload and test configuration. We investigated the applicability of our approach in an industrial case study, where we were able to express four load test concerns using BDLT and received positive feedback from our industrial partner. They understood the BDLT definitions well and proposed further applications, such as the usage for software quality acceptance criteria.
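To illustrate what a template-based natural-language test description might translate into, the sketch below maps a made-up BDLT-style sentence onto a workload drawn from contextual knowledge; the template wording, fields, and context data are invented and do not reflect the actual BDLT grammar.

```python
# Map a hypothetical template-based natural-language test concern onto a workload
# configuration picked from contextual knowledge. Template and fields are invented.
import re

TEMPLATE = re.compile(
    r"verify that the (?P<service>[\w-]+) service copes with the load of (?P<context>[\w ]+)",
    re.IGNORECASE,
)

def to_load_test(sentence, context_workloads):
    """Turn a matching sentence into a load-test configuration."""
    match = TEMPLATE.match(sentence.strip())
    if not match:
        raise ValueError("sentence does not match the known template")
    context = match.group("context").lower()
    return {
        "target": match.group("service"),
        "workload": context_workloads[context],   # e.g. derived from past events
        "duration_minutes": 30,
    }

known_contexts = {"a marketing campaign": {"users": 5000, "ramp_up_s": 300}}
print(to_load_test(
    "Verify that the checkout service copes with the load of a marketing campaign",
    known_contexts,
))
```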
{"title":"Behavior-driven Load Testing Using Contextual Knowledge - Approach and Experiences","authors":"Henning Schulz, Dusan Okanovic, A. Hoorn, Vincenzo Ferme, C. Pautasso","doi":"10.1145/3297663.3309674","DOIUrl":"https://doi.org/10.1145/3297663.3309674","url":null,"abstract":"Load testing is widely considered a meaningful technique for performance quality assurance. However, empirical studies reveal that in practice, load testing is not applied systematically, due to the sound expert knowledge required to specify, implement, and execute load tests. Our Behavior-driven Load Testing (BDLT) approach eases load test specification and execution for users with no or little expert knowledge. It allows a user to describe a load test in a template-based natural language and to rely on an automated framework to execute the test. Utilizing the system's contextual knowledge such as workload-influencing events, the framework automatically determines the workload and test configuration. We investigated the applicability of our approach in an industrial case study, where we were able to express four load test concerns using BDLT and received positive feedback from our industrial partner. They understood the BDLT definitions well and proposed further applications, such as the usage for software quality acceptance criteria.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132697410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21