
Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering: Latest Publications

Real-time multi-cloud management needs application awareness
J. Chinneck, Marin Litoiu, C. Woodside
Current cloud management systems have limited awareness of the user application, and application managers have no awareness of the state of the cloud. For applications with strong real-time requirements, distributed across new multi-cloud environments, this lack of awareness hampers response-time assurance, efficient deployment and rapid adaptation to changing workloads. This paper considers what forms this awareness may take, how it can be exploited in managing the applications and the clouds, and how it can influence cloud architecture.
Citations: 6
A power-measurement methodology for large-scale, high-performance computing
T. Scogland, C. Steffen, T. Wilde, Florent Parent, S. Coghlan, Natalie J. Bates, Wu-chun Feng, E. Strohmaier
Improvement in the energy efficiency of supercomputers can be accelerated by improving the quality and comparability of efficiency measurements. The ability to generate accurate measurements at extreme scale is just now emerging. The realization of system-level measurement capabilities can be accelerated with a commonly adopted, high-quality measurement methodology for use while running a workload, typically a benchmark. This paper describes a methodology that has been developed collaboratively through the Energy Efficient HPC Working Group to support architectural analysis and comparative measurements for rankings, such as the Top500 and Green500. To support measurements requiring varying amounts of effort and equipment, we present three distinct levels of measurement, which provide increasing levels of accuracy. Level 1 is similar to the Green500 run rules today: a single average power measurement extrapolated from a subset of a machine. Level 2 is more comprehensive, but still widely achievable. Level 3 is the most rigorous of the three methodologies but is only possible at a few sites. However, the Level 3 methodology generates a high-quality result that exposes details that the other methodologies may miss. In addition, we present case studies from the Leibniz Supercomputing Centre (LRZ), Argonne National Laboratory (ANL) and Calcul Québec Université Laval that explore the benefits and difficulties of gathering high-quality, system-level measurements on large-scale machines.
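To make the three-level idea concrete, here is a minimal Python sketch, assuming hypothetical sample values and node counts, of a Level 1 style estimate: a single average power figure measured on a subset of nodes and extrapolated to the full machine.

```python
def average_power_watts(samples):
    """Mean of instantaneous power readings (watts) taken while the workload runs."""
    return sum(samples) / len(samples)

def extrapolate_to_full_machine(subset_avg_watts, measured_nodes, total_nodes):
    """Level 1 style estimate: scale the measured subset's average power linearly
    to the full node count (ignores any non-uniformity between nodes)."""
    return subset_avg_watts * (total_nodes / measured_nodes)

# Hypothetical readings from 16 instrumented nodes of a 512-node system.
readings = [412.0, 420.5, 431.2, 418.7, 425.3]
print(extrapolate_to_full_machine(average_power_watts(readings), 16, 512))
```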
Citations: 32
A meta-controller method for improving run-time self-architecting in SOA systems
J. M. Ewing, D. Menascé
This paper builds on SASSY, a system for automatically generating SOA software architectures that optimize a given utility function of multiple QoS metrics. In SASSY, SOA software systems are automatically re-architected when services fail or degrade. Optimizing both architecture and service provider selection presents a pair of nested NP-hard problems. Here we adapt hill-climbing, beam search, simulated annealing, and evolutionary programming to both architecture optimization and service provider selection. Each of these techniques has several parameters that influence their efficiency. We introduce in this paper a meta-controller that automates the run-time selection of heuristic search techniques and their parameters. We examine two different meta-controller implementations that each use online learning. The first implementation identifies the best heuristic search combination from various prepared combinations. The second implementation analyzes the current self-architecting problem (e.g. changes in performance metrics, service degradations/failures) and looks for similar, previously encountered re-architecting problems to find an effective heuristic search combination for the current problem. A large set of experiments demonstrates the effectiveness of the first meta-controller implementation and indicates opportunities for improving the second meta-controller implementation.
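As a sketch of the kind of heuristic search involved, here is a small hill-climbing loop over service provider assignments with a made-up utility function; this is an illustration only, not the SASSY implementation, and all provider attributes and weights are hypothetical.

```python
import random

def hill_climb_providers(providers_per_service, utility, max_iters=1000):
    """Greedy local search: start from a random provider assignment and accept
    single-provider swaps whenever they improve the utility of the composition."""
    current = [random.choice(options) for options in providers_per_service]
    best_utility = utility(current)
    for _ in range(max_iters):
        i = random.randrange(len(current))
        candidate = current[:]
        candidate[i] = random.choice(providers_per_service[i])
        u = utility(candidate)
        if u > best_utility:
            current, best_utility = candidate, u
    return current, best_utility

# Hypothetical utility: reward low latency, penalize cost (weights are made up).
def utility(assignment):
    return sum(1.0 / p["latency_ms"] - 0.01 * p["cost"] for p in assignment)

services = [
    [{"latency_ms": 20, "cost": 5}, {"latency_ms": 35, "cost": 2}],
    [{"latency_ms": 50, "cost": 1}, {"latency_ms": 15, "cost": 9}],
]
print(hill_climb_providers(services, utility))
```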
Citations: 14
Scalable hybrid stream and Hadoop network analysis system
V. Bumgardner, V. Marek
Collections of network traces have long been used in network traffic analysis. Flow analysis can be used in network anomaly discovery, intrusion detection and, more generally, discovery of actionable events on the network. The data collected during processing may also be used for prediction and avoidance of traffic congestion, network capacity planning, and the development of software-defined networking rules. As network flow rates increase and new network technologies are introduced on existing hardware platforms, many organizations find themselves either technically or financially unable to generate, collect, and/or analyze network flow data. The continued rapid growth of network trace data requires new methods of scalable data collection and analysis. We report on our deployment of a system designed and implemented at the University of Kentucky that supports analysis of network traffic across the enterprise. Our system addresses problems of scale in existing systems by using distributed computing methodologies and is based on a combination of stream and batch processing techniques. In addition to collection, stream processing using Storm is utilized to enrich the data stream with ephemeral environment data. Enriched stream data is then used for event detection and near real-time flow analysis by an in-line complex event processor. Batch processing is performed by the Hadoop MapReduce framework on data stored in HBase BigTable storage. In benchmarks on our 10-node cluster, using actual network data, we were able to stream process over 315k flows/sec. In batch analysis we were able to process over 2.6M flows/sec with a storage compression ratio of 6.7:1.
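A minimal sketch of the basic flow-aggregation step such a pipeline performs, assuming hypothetical packet records keyed by the usual 5-tuple; this stands in for, and does not reproduce, the Storm/Hadoop components described above.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group packet records into flows keyed by the classic 5-tuple and
    accumulate packet and byte counts per flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p["src"], p["dst"], p["sport"], p["dport"], p["proto"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p["len"]
    return dict(flows)

# Hypothetical packet records.
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": "tcp", "len": 1500},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": "tcp", "len": 400},
]
print(aggregate_flows(packets))
```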
Citations: 27
On the limits of modeling generational garbage collector performance
P. Libic, L. Bulej, Vojtech Horký, P. Tůma
Garbage collection is an element of many contemporary software platforms whose performance is determined by complex interactions and is therefore difficult to quantify and model. We investigate the difference between the behavior of a real garbage collector implementation and a simplified model on a selection of workloads, focusing on the accuracy achievable with particular input information (sizes, references, lifetimes). Our work highlights the limits of performance modeling of garbage collection and points out issues of existing evaluation tools that may lead to incorrect experimental conclusions.
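A toy sketch of the kind of simplified model the paper evaluates, assuming only allocation sizes and lifetimes as input and a fixed promotion threshold; this is an illustration of a generational model, not the authors' model.

```python
def simulate_generational_gc(allocations, young_capacity, promote_after=2):
    """Toy model: each allocation is (size, lifetime_in_allocations). A minor
    collection runs when the young generation would overflow; live objects age
    and are promoted to the old generation after `promote_after` collections."""
    young, old = [], []          # young: (size, death_time, age); old: (size, death_time)
    minor_collections = 0
    clock = 0
    for size, lifetime in allocations:
        clock += 1
        if sum(s for s, _, _ in young) + size > young_capacity:
            minor_collections += 1
            survivors = []
            for s, death, age in young:
                if death > clock:                    # object is still live
                    if age + 1 >= promote_after:
                        old.append((s, death))       # promote to old generation
                    else:
                        survivors.append((s, death, age + 1))
            young = survivors
        young.append((size, clock + lifetime, 0))
    return minor_collections, len(old)

# Hypothetical allocation trace: (size in bytes, lifetime in allocations).
trace = [(64, 1), (128, 50), (64, 2), (256, 100), (64, 1)] * 20
print(simulate_generational_gc(trace, young_capacity=1024))
```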
Citations: 19
Test-driving Intel Xeon Phi
Jianbin Fang, H. Sips, Lilun Zhang, Chuanfu Xu, Yonggang Che, A. Varbanescu
Based on Intel's Many Integrated Core (MIC) architecture, Intel Xeon Phi is one of the few truly many-core CPUs, featuring around 60 fairly powerful cores, two levels of caches, and graphics memory, all interconnected by a very fast ring. Given its promised ease of use and high performance, we took Xeon Phi out for a test drive. In this paper, we present this experience at two different levels: (1) the microbenchmark level, where we stress "each nut and bolt" of Phi in the lab, and (2) the application level, where we study Phi's performance response in a real-life environment. At the microbenchmarking level, we show the high performance of five components of the architecture, focusing on their maximum achieved performance and the prerequisites to achieve it. Next, we choose a medical imaging application (Leukocyte Tracking) as a case study. We observed that it is rather easy to get functional code and start benchmarking, but the first performance numbers can be far from satisfying. Our experience indicates that a simple data structure and massive parallelism are critical for Xeon Phi to perform well. When compiler-driven parallelization and/or vectorization fails, programming Xeon Phi for performance can become very challenging.
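A hedged sketch of the microbenchmarking idea, in Python/NumPy rather than the native code one would run on the Phi: a STREAM-like triad whose achieved memory bandwidth can be compared against a platform's stated peak. The array size and repeat count are arbitrary choices for illustration.

```python
import time
import numpy as np

def triad_bandwidth_gbs(n=20_000_000, repeats=5):
    """STREAM-like triad a = b + scalar * c; returns the best observed memory
    bandwidth in GB/s (three arrays of 8-byte doubles are touched per pass)."""
    b = np.random.rand(n)
    c = np.random.rand(n)
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        a = b + 3.0 * c                  # the measured kernel
        elapsed = time.perf_counter() - start
        moved_bytes = 3 * n * 8          # read b, read c, write a
        best = max(best, moved_bytes / elapsed / 1e9)
    return best

print(f"best triad bandwidth: {triad_bandwidth_gbs():.1f} GB/s")
```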
Citations: 99
Server efficiency rating tool (SERT) 1.0.2: an overview
Hansfried Block, J. Arnold, John Beckett, Sanjay Sharma, Michael G. Tricker, Kyle M. Rogers
The Server Efficiency Rating Tool (SERT) has been released by the Standard Performance Evaluation Corporation (SPEC), and in early 2013 the EPA released Version 2.0 of the ENERGY STAR for Computer Servers program, which includes the mandatory use of the SERT. Other governments worldwide that are concerned with the growing power consumption of servers and datacenters are also considering adoption of the SERT. This poster paper provides an overview of the current 1.0.2 release of the SERT.
Citations: 2
Performance queries for architecture-level performance models
F. Gorsler, Fabian Brosig, Samuel Kounev
Over the past few decades, many performance modeling formalisms and prediction techniques for software architectures have been developed in the performance engineering community. However, using a performance model to predict the performance of a software system normally requires extensive experience with the respective modeling formalism and involves a number of complex and time-consuming manual steps. In this paper, we propose a generic declarative interface to performance prediction techniques to simplify and automate the process of using architecture-level software performance models for performance analysis. The proposed Descartes Query Language (DQL) is a language for expressing the performance metrics to be predicted as well as the goals and constraints of the specific prediction scenario. It reduces the manual effort and learning curve in working with performance models by providing a unified interface that is independent of the employed modeling formalism. We evaluate the applicability and benefits of the proposed approach in the context of several representative case studies.
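As an illustration of a declarative performance-query interface, here is a small Python sketch in which a query object names the requested metrics, entities, and constraints and a pluggable backend answers it. The class and field names are hypothetical and do not reproduce the actual DQL syntax defined in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class PerformanceQuery:
    """Declarative request: which metrics to predict, for which entities,
    under which constraints (all names here are hypothetical)."""
    metrics: list                                      # e.g. ["responseTime", "utilization"]
    entities: list                                     # e.g. ["CatalogService"]
    constraints: dict = field(default_factory=dict)    # e.g. {"maxPredictionTimeSec": 60}

class MeanValueBackend:
    """Stand-in prediction backend; a real tool would solve an
    architecture-level performance model here."""
    def solve(self, query):
        return {(e, m): 0.0 for e in query.entities for m in query.metrics}

query = PerformanceQuery(metrics=["responseTime"], entities=["CatalogService"],
                         constraints={"maxPredictionTimeSec": 60})
print(MeanValueBackend().solve(query))
```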
Citations: 15
Extreme big data processing in large-scale graph analytics and billion-scale social simulation
T. Suzumura
This paper introduces example applications that handle extremely big data on supercomputers, such as large-scale network analysis, an X10-based large-scale graph analytics library, the Graph500 benchmark, and billion-scale social simulation.
Citations: 1
Efficient and accurate stack trace sampling in the Java HotSpot virtual machine
Peter Hofer, H. Mössenböck
Sampling is a popular approach to collecting data for profiling and monitoring, because it has a small impact on performance and does not modify the observed application. Sampled stack traces can be merged into a calling context tree that shows where the application spends its time and where performance problems lie. However, Java VM implementations usually rely on safepoints for sampling stack traces. Safepoints can cause inaccuracies and have a considerable performance impact. We present a new approach that does not use safepoints, but instead relies on the operating system to take snapshots of the stack at arbitrary points. These snapshots are then asynchronously decoded into call traces, which are merged into a calling context tree. We show that we are able to decode over 90% of the snapshots, and that our approach has a very small impact on performance even at high sampling rates.
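A minimal sketch of the calling-context-tree merging step described above, independent of HotSpot internals: each sampled stack trace, listed root-first, is folded into a tree whose node counts are inclusive sample counts. The method names in the example are hypothetical.

```python
class CCTNode:
    """One method in one calling context."""
    def __init__(self, method):
        self.method = method
        self.samples = 0        # number of samples in which this context appeared
        self.children = {}

def merge_samples(stack_traces):
    """Fold sampled stack traces (root-first lists of method names) into a
    calling context tree with inclusive sample counts per node."""
    root = CCTNode("<root>")
    for trace in stack_traces:
        node = root
        for method in trace:
            node = node.children.setdefault(method, CCTNode(method))
            node.samples += 1
    return root

# Hypothetical samples.
samples = [["main", "parse", "readLine"], ["main", "compute"], ["main", "parse", "readLine"]]
tree = merge_samples(samples)
print(tree.children["main"].children["parse"].samples)   # 2
```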
Citations: 5