
Proceedings of the 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems: Latest Articles

Performance analysis and optimization of the RAMPAGE metal alloy potential generation software
P. Roth, H. Shan, D. Riegner, Nikolas Antolin, S. Sreepathi, L. Oliker, Samuel Williams, S. Moore, W. Windl
The Rapid Alloy Method for Producing Accurate, General Empirical potential generation toolkit (RAMPAGE) is a program for fitting multicomponent interatomic potential functions for metal alloys. In this paper, we describe a collaborative effort between domain scientists and performance engineers to improve the parallelism, scalability, and maintainability of the code. We modified RAMPAGE to use the Message Passing Interface (MPI) for communication and synchronization, to use more than one MPI process when evaluating candidate potential functions, and to have its MPI processes execute functionality that was previously executed by external non-MPI processes. We ported RAMPAGE to run on the Eos and Titan Cray systems of the United States Department of Energy (DOE)'s Oak Ridge Leadership Computing Facility (OLCF), and the Cori and Edison systems at the DOE's National Energy Research Scientific Computing Center (NERSC). Our modifications resulted in a 7x speedup on 8 Eos system nodes, and scalability up to 2048 processes on the Cori system with Intel Knights Landing processors. To improve maintainability of the RAMPAGE source code, we introduced several software engineering best practices to the RAMPAGE developers' workflow.
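To make the idea of spreading candidate evaluation over several processes concrete, here is a minimal sketch of one common scheme: each rank scores a contiguous block of candidates and the results are gathered in order. It is not RAMPAGE code. The names `block_range`, `fitness`, and `evaluate_all` are hypothetical, the scoring function is a placeholder, and the ranks are simulated in a single process so the example runs without an MPI installation.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open index range [start, stop) owned by `rank`."""
    base, extra = divmod(n_items, n_ranks)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def fitness(candidate):
    """Hypothetical stand-in for scoring one candidate potential."""
    return sum(x * x for x in candidate)


def evaluate_all(candidates, n_ranks):
    """Simulate every rank scoring its block, then gather results in order."""
    gathered = []
    for rank in range(n_ranks):  # in an MPI program, each rank runs this body once
        start, stop = block_range(len(candidates), n_ranks, rank)
        gathered.extend(fitness(c) for c in candidates[start:stop])
    return gathered


# Made-up candidate parameter vectors, for illustration only.
candidates = [(1.0, 2.0), (0.5, 0.5), (3.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
scores = evaluate_all(candidates, n_ranks=2)
```

In a real MPI program the loop over ranks disappears: every process computes its own range from its rank and the communicator size, and a gather operation collects the scores.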
DOI: https://doi.org/10.1145/3141865.3141868. Published 2017-10-23.
Citations: 3
How to test your concurrent software: an approach for the selection of testing techniques
S. Melo, S. Souza, P. L. D. Souza, Jeffrey C. Carver
High-Performance Computing (HPC) applications consist of concurrent programs with multi-process and/or multithreaded models with varying degrees of parallelism. Although their design patterns, models, and principles are similar to those of sequential programs, their non-deterministic behavior makes testing more complex. To address this complexity, several techniques for concurrent software testing have been developed in recent years. However, the transfer of knowledge between academia and industry remains a challenge, mainly due to the lack of a solid evidence base that supports the decision-making process. This paper proposes the construction of a body of evidence for the concurrent programming field that supports the selection of an adequate testing technique for a software project. We propose a characterization schema that supports decision making and is based on relevant information from the technical literature regarding available techniques, attributes, and concepts of concurrent programming that affect the testing process. The schema classified 109 studies that compose the preliminary body of evidence. A survey of specialists was conducted to validate the schema with regard to the adequacy and relevance of the attributes defined. The results indicate that the schema is effective and can support testing teams for concurrent applications.
DOI: https://doi.org/10.1145/3141865.3142468. Published 2017-10-23.
Citations: 6
MALT: a Malloc tracker
S. Valat, Andres S. Charif-Rubial, W. Jalby
In the early days of computing, memory management was a major concern because applications had to fit in the small amount of available memory (a few kilobytes). Hardware evolution has since made this resource cheap, and available memory now approaches a few hundred gigabytes. However, the current multi/many-core era is bringing some of these issues back. Available memory tends not to keep pace with the increasing number of cores, making memory per thread scarce again. We also encounter new issues from having to manage a larger space with many more allocated objects, which increases the probability of both memory leaks and memory management performance issues. With MALT, we provide a tool to track the memory allocated by an application. We then map the extracted metrics onto the source code, just as kcachegrind does with valgrind for CPU performance. Compared to most available tools, MALT can also be used to track potential performance losses due to bad allocation patterns (too many allocations, small allocations, recycling large allocations, short-lived allocations...) thanks to the various metrics it exposes to the user. This paper details the metrics extracted by MALT and how we present them to the user through a web-based graphical interface, which most available Linux tools lack.
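The idea of attributing allocations to source lines can be illustrated with Python's standard `tracemalloc` module. This is only an analogy under stated assumptions: MALT itself tracks `malloc` calls in native applications, and nothing below uses or reproduces MALT's interface. The helper `build_small_objects` is a hypothetical workload chosen to produce many small allocations.

```python
import tracemalloc


def build_small_objects(n):
    """Allocate many tiny lists, a pattern an allocation tracker would flag."""
    return [[i] for i in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()
data = build_small_objects(10_000)
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the growth in allocated memory by source line, largest change first.
stats = after.compare_to(before, "lineno")
top = stats[0]
location = top.traceback[0]  # frame carrying .filename and .lineno
report = (
    f"{location.filename}:{location.lineno} "
    f"+{top.size_diff} B in {top.count_diff} blocks"
)
```

Printing `report` shows the file and line responsible for the largest growth, which is the same kind of per-line view the abstract describes for allocation metrics.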
DOI: https://doi.org/10.1145/3141865.3141867. Published 2017-10-23.
Citations: 3
Transactional actors: communication in transactions
Janwillem Swalens, Joeri De Koster, W. Meuter
Developers often require different concurrency models to fit the various concurrency needs of the different parts of their applications. Many programming languages, such as Clojure, Scala, and Haskell, cater to this need by incorporating different concurrency models. It has been shown that, in practice, developers often combine these concurrency models. However, they are often combined in an ad hoc way and the semantics of the combination is not always well-defined. The starting hypothesis of this paper is that different concurrency models need to be carefully integrated such that the properties of each individual model are still maintained. This paper proposes one such combination, namely the combination of the actor model and software transactional memory. In this paper we show that, while both individual models offer strong safety guarantees, these guarantees are no longer valid when they are combined. The main contribution of this paper is a novel hybrid concurrency model called transactional actors that combines both models while preserving their guarantees. This paper also presents an implementation in Clojure and an experimental evaluation of the performance of the transactional actor model.
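The abstract does not say which guarantees break when the two models are combined, so the following sketch rests on an assumption: it shows one well-known hazard of mixing optimistic transactions with message sending, namely that a message sent from a transaction body is sent again each time the transaction retries. It also shows a simple remedy, buffering sends until commit, which is not claimed to be the paper's mechanism. All names (`Ref`, `Tx`, `atomically`, `concurrent_deposit`) are hypothetical, and the conflict is simulated deterministically rather than produced by real threads.

```python
class Ref:
    """A versioned cell, standing in for a transactional variable."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Tx:
    """One transaction attempt: a snapshot plus buffered effects."""

    def __init__(self, ref):
        self.ref = ref
        self.read_version = ref.version
        self.new_value = ref.value
        self.outbox = []  # messages held back until commit

    def send(self, mailbox, message):
        self.outbox.append((mailbox, message))


def atomically(ref, body, interfere=None):
    """Run `body` until it commits without a conflicting write to `ref`."""
    attempts = 0
    while True:
        attempts += 1
        tx = Tx(ref)
        body(tx)
        if interfere is not None and attempts == 1:
            interfere(ref)  # simulate another thread committing first
        if ref.version == tx.read_version:  # no conflict, so commit
            ref.value = tx.new_value
            ref.version += 1
            for mailbox, message in tx.outbox:
                mailbox.append(message)  # deliver buffered messages once
            return attempts
        # Conflict: discard this attempt, including its outbox, and retry.


def concurrent_deposit(ref):
    """Simulated conflicting write from another thread."""
    ref.value += 100
    ref.version += 1


eager_mailbox, safe_mailbox = [], []
account = Ref(10)


def withdraw(tx):
    tx.new_value = tx.ref.value - 5
    eager_mailbox.append("withdrew 5")   # side effect escapes the transaction
    tx.send(safe_mailbox, "withdrew 5")  # buffered until commit


attempts = atomically(account, withdraw, interfere=concurrent_deposit)
```

After the run, `eager_mailbox` holds the message twice because the body executed twice, while `safe_mailbox` holds it once.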
DOI: https://doi.org/10.1145/3141865.3141866. Published 2017-10-23.
Citations: 6
Facilitating collaboration in high-performance computing projects with an interaction room
Matthias Book, M. Riedel, Helmut Neukirchen, Markus Goetz
The design, development, and deployment of scientific computing applications can be quite complex, as they require scientific, high-performance computing (HPC), and software engineering expertise. However, HPC applications are often developed by end users who are experts in their scientific domain but need support from a supercomputing centre for the engineering and optimization aspects. Cooperation and communication between experts from these quite different disciplines can be difficult. We therefore propose to employ the Interaction Room, a technique that facilitates interdisciplinary collaboration in complex software projects.
DOI: https://doi.org/10.1145/3141865.3142467. Published 2017-10-23.
Citations: 3
Declaring Lua data types for GPU code generation
Paulo Motta
Some effort has been made to allow interpreted languages to take advantage of the computing capabilities of GPUs. Using interpreted languages abstracts the hardware and its specifics away from the user application, making development less complicated. However, due to hardware dependencies, the code needs to be compiled before execution. We want to compile a Lua function into a GPU kernel as transparently as possible, allowing the user to access the underlying hardware without the complexities of traditional GPU programming. This scenario presents a great challenge: how to infer the variables' data types while interfering as little as possible with the user's programming paradigm.
DOI: https://doi.org/10.1145/3141865.3142466. Published 2017-10-23.
Citations: 0
The influence of HPCToolkit and Score-p on hardware performance counters
Jan-Patrick Lehr, Christian Iwainsky, C. Bischof
Performance measurement and analysis are commonly performed tasks for high-performance computing applications. Both sampling and instrumentation approaches for performance measurement can capture hardware performance counter (HWPC) metrics to assess the software's ability to use the functional units of the processor. Since the measurement software usually executes on the same processor, it necessarily competes with the target application for hardware resources. Consequently, the measurement system perturbs the target application, which often results in runtime overhead. While the runtime overhead of different measurement techniques has been studied previously, the extent to which HWPC values are perturbed by the measurement process has not been thoroughly examined. In this paper, we investigate the influence of two widely used performance measurement systems, HPCToolkit (sampling) and Score-P (instrumentation), on HWPC values. Our experiments on the SPEC CPU 2006 C/C++ benchmarks show that, while Score-P's default instrumentation can massively increase runtime, it does not always heavily perturb relevant HWPC. On the other hand, HPCToolkit shows no significant runtime overhead, but significantly influences some relevant HWPC. We conclude that for every performance experiment sufficient baseline measurements are essential to identify the HWPC that remain valid indicators of performance for a given measurement technique. Thus, performance analysis tools need to offer easily accessible means to automate the baseline and validation functionality.
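The baseline check the abstract calls for can be sketched as a comparison of counter readings taken with and without the measurement tool. Everything in this sketch is an assumption made for illustration: the counter names, the readings, and the 5% tolerance are invented and do not come from the paper, and `relative_perturbation` and `valid_counters` are hypothetical helpers rather than part of HPCToolkit or Score-P.

```python
from statistics import mean


def relative_perturbation(baseline_runs, measured_runs):
    """Relative change of the mean reading when the tool is attached."""
    base = mean(baseline_runs)
    return (mean(measured_runs) - base) / base


def valid_counters(baseline, measured, tolerance=0.05):
    """Mark each counter as valid if its perturbation stays within tolerance."""
    return {
        name: abs(relative_perturbation(baseline[name], measured[name])) <= tolerance
        for name in baseline
    }


# Made-up readings from three runs each; not data from the paper.
baseline = {"instructions": [1000, 1010, 990], "l1_misses": [200, 210, 190]}
with_tool = {"instructions": [1020, 1030, 1010], "l1_misses": [260, 250, 270]}

verdict = valid_counters(baseline, with_tool)
```

With these invented numbers the instruction count moves by 2% and stays within tolerance, while the cache-miss count moves by 30% and would be treated as unreliable under this tool.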
DOI: https://doi.org/10.1145/3141865.3141869. Published 2017-10-23.
Citations: 1
Proceedings of the 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems
DOI: https://doi.org/10.1145/3141865.
Citations: 0