
Proceedings of the IEEE/ACM SC95 Conference: Latest Papers

An HPF Compiler for the IBM SP2
Pub Date : 1995-12-08 DOI: 10.1145/224170.224422
Manish Gupta, S. Midkiff, E. Schonberg, V. Seshadri, David Shields, Ko-Yang Wang, Wai-Mee Ching, T. Ngo
We describe pHPF, a research prototype HPF compiler for the IBM SP series parallel machines. The compiler accepts as input Fortran 90 and Fortran 77 programs, augmented with HPF directives; sequential loops are automatically parallelized. The compiler supports symbolic analysis of expressions. This allows parameters such as the number of processors to be unknown at compile time without significantly affecting performance. Communication schedules and computation guards are generated in a parameterized form at compile time. Several novel optimizations and improved versions of well-known optimizations have been implemented in pHPF to exploit parallelism and reduce communication costs. These optimizations include elimination of redundant communication using data-availability analysis; use of collective communication; new techniques for mapping scalar variables; coarse-grain wavefronting; and communication reduction in multi-dimensional shift communications. We present experimental results for some well-known benchmark routines. The results show the effectiveness of the compiler in generating efficient code for HPF programs.
Citations: 104
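The pHPF abstract above mentions computation guards generated in a parameterized form, so that the number of processors need not be known at compile time. A minimal sketch of that idea follows, assuming a one-dimensional HPF-style BLOCK distribution and the owner-computes rule. The function names are hypothetical and this is not the code pHPF emits; it only illustrates how loop bounds can be expressed in terms of a runtime processor count and rank.

```python
def block_bounds(n, nprocs, rank):
    """Half-open index range [lo, hi) owned by `rank` when n elements
    are BLOCK-distributed over nprocs processors (block = ceil(n/nprocs))."""
    block = -(-n // nprocs)              # ceiling division
    lo = min(rank * block, n)
    hi = min(lo + block, n)
    return lo, hi


def local_iterations(n, nprocs, rank, loop_lo, loop_hi):
    """Owner-computes guard: the iterations of `for i in [loop_lo, loop_hi)`
    that this rank executes, i.e. the loop range intersected with its block.
    nprocs and rank are ordinary runtime values, not compile-time constants."""
    lo, hi = block_bounds(n, nprocs, rank)
    return range(max(lo, loop_lo), min(hi, loop_hi))


if __name__ == "__main__":
    # Loop "do i = 1, 9" over a 10-element array on 4 processors.
    for rank in range(4):
        print(rank, list(local_iterations(10, 4, rank, 1, 10)))
```

Because the guard is a closed-form expression in `nprocs` and `rank`, the same generated loop works for any processor count; ranks whose block falls outside the array simply receive an empty range.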
Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler
Pub Date : 1995-12-08 DOI: 10.1145/224170.224337
Mary W. Hall, Saman P. Amarasinghe, Brian R. Murphy, Shih-Wei Liao, M. Lam
This paper presents an extensive empirical evaluation of an interprocedural parallelizing compiler, developed as part of the Stanford SUIF compiler system. The system incorporates a comprehensive and integrated collection of analyses, including privatization and reduction recognition for both array and scalar variables, and symbolic analysis of array subscripts. The interprocedural analysis framework is designed to provide analysis results nearly as precise as full inlining but without its associated costs. Experimentation with this system shows that it is capable of detecting coarser granularity of parallelism than previously possible. Specifically, it can parallelize loops that span numerous procedures and hundreds of lines of code, frequently requiring modifications to array data structures such as privatization and reduction transformations. Measurements from several standard benchmark suites demonstrate that an integrated combination of interprocedural analyses can substantially advance the capability of automatic parallelization technology.
Citations: 181
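The privatization and reduction transformations named in the abstract above can be illustrated with a small hand-written example. This is only a sketch of the general technique, not output of the SUIF compiler; the loop body, the function names, and the use of a thread pool are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(values):
    """Original loop: `tmp` is a scratch array shared by all iterations and
    `total` is a sum reduction, so the loop looks serial as written."""
    total = 0
    tmp = [0, 0]
    for x in values:
        tmp[0] = x * x
        tmp[1] = tmp[0] + 1
        total += tmp[1]
    return total


def chunk_worker(chunk):
    """Transformed loop body for one worker."""
    tmp = [0, 0]        # privatized: each worker owns its scratch array
    partial = 0         # reduction: each worker accumulates a private sum
    for x in chunk:
        tmp[0] = x * x
        tmp[1] = tmp[0] + 1
        partial += tmp[1]
    return partial


def parallel(values, workers=4):
    """Split the iteration space, run the chunks concurrently, then combine
    the private partial sums in a final reduction step."""
    size = max(1, -(-len(values) // workers))
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(chunk_worker, chunks))


if __name__ == "__main__":
    data = list(range(100))
    print(sequential(data), parallel(data))
```

Privatization is legal here because every iteration writes `tmp` before reading it, so no value flows between iterations; the reduction is legal because integer addition is associative and commutative.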
Where is the Supercomputer Software Revolution?
Pub Date : 1995-12-08 DOI: 10.1145/224170.224507
Dennis Gannon, L. Smarr, V. Schuster
Citations: 0
Developing Computational Science Curricula: The EarthVision Experience
Pub Date : 1995-12-08 DOI: 10.1145/224170.224202
Ralph K. Coppola, E. Toth
Technology is used to empower students to go beyond traditional limitations. EarthVision provides the opportunity to participate in an authentic research environment, which enables students to develop a sense of self-worth and esteem, established in the context of a phased curriculum that brings together experts in a variety of disciplines. New techniques such as modeling and scientific visualization are employed to expand the types of phenomena that can be examined at the high school level. The use of concept strands going from simple elements to complicated representations helps to move the teacher/student teams from a highly structured learning environment to one that is highly independent. The scientific method, which employs validation throughout the computational science process, brings rigor and integrity, which stimulate the skill development needed for autonomy. The result is significant cognitive development coupled with a positive affective orientation.
Citations: 0
Performance of a Parallel Global Atmospheric Chemical Tracer Model
Pub Date : 1995-12-08 DOI: 10.1145/224170.224504
J. Demmel, Sharon L. Smith
As part of a NASA HPCC Grand Challenge project, we are designing and implementing a parallel atmospheric chemical tracer model that will be suitable for use in global simulations. To accomplish this goal, our starting point has been an atmospheric pollution model that was originally used to study pollution in the Los Angeles Basin. The model includes gas-phase and aqueous-phase chemistry, radiation, aerosol physics, advection, convection, deposition, visibility and emissions. The potential bottlenecks in the model for parallel implementation are the compute-intensive ODE-solving phase, which has load-balancing problems, and the communication-intensive advection phase. We describe the implementation and performance results on a variety of platforms, with emphasis on a detailed performance model we developed to predict performance, identify bottlenecks, guide our implementation, assess scalability, and evaluate architectures. An atmospheric chemical tracer model such as the one we describe in this paper will be one component of a larger Earth Systems Model (ESM), being developed under the direction of C. R. Mechoso of UCLA, incorporating atmospheric dynamics, atmospheric physics, ocean dynamics, and a database and visualization system.
Citations: 9
Lattice QCD on the IBM Scalable POWERParallel Systems SP2
Pub Date : 1995-12-08 DOI: 10.1145/224170.224307
C. Bernard, C. DeTar, S. Gottlieb, U. Heller, J. Hetrick, N. Ishizuka, L. Kärkkäinen, S. Lantz, K. Rummukainen, R. Sugar, D. Toussaint, M. Wingate
A 512-node IBM Scalable POWERParallel Systems SP2 was installed at the Cornell Theory Center in October 1994. During the past couple of months we have been porting and optimizing code for carrying out lattice QCD calculations. Present performance is far from ideal, however, and optimization efforts are still under way. The rate-limiting step in our code involves a rather generic inversion of a large, sparse system, based on a partial differential equation in a multidimensional space. The insights we have gained so far may be useful in diagnosing performance in a wide class of applications.
Citations: 0
A Parallel Incompressible Flow Solver Package with a Parallel Multigrid Elliptic Kernel
Pub Date : 1995-12-08 DOI: 10.1145/224170.224406
J. Lou, R. Ferraro
A parallel time-dependent incompressible flow solver and a parallel multigrid elliptic kernel are described. The flow solver is based on a second-order projection method applied to a staggered finite-difference grid. The multigrid algorithms implemented in the elliptic kernel, which is needed by the flow solver, are V-cycle and full V-cycle schemes. A grid-partition strategy is used in the parallel implementations of both the flow solver and the multigrid elliptic kernel on all fine and coarse grids. Numerical experiments and parallel performance tests show that the parallel solver package is numerically stable, physically robust and computationally efficient. Both the multigrid elliptic kernel and the flow solver scale very well to a large number of processors on the Intel Paragon and the Cray T3D for computations with moderate granularity. The solver package has been carefully designed and coded so that it can be easily adapted to solving a variety of interesting two- and three-dimensional flow problems. The solver package is portable to parallel systems that support MPI, PVM and Intel NX for interprocessor communications.
Citations: 17
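The V-cycle scheme named in the abstract above can be sketched on a much simpler problem than the paper's. The example below is a serial, one-dimensional illustration, assuming the model problem -u'' = f on [0, 1] with zero boundary values, a weighted Jacobi smoother, full-weighting restriction and linear interpolation. None of these choices, nor the function names, are taken from the paper, which works on staggered two- and three-dimensional grids in parallel.

```python
import math


def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps for (2u[i] - u[i-1] - u[i+1]) / h^2 = f[i]."""
    n = len(u) - 1
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, n):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u


def residual(u, f, h):
    """r = f - A u at interior points; boundary entries stay zero."""
    n = len(u) - 1
    r = [0.0] * (n + 1)
    for i in range(1, n):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def restrict(r):
    """Full-weighting restriction from n intervals to n/2 intervals."""
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for i in range(1, nc):
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    return rc


def prolong(ec):
    """Linear interpolation from the coarse grid to the next finer grid."""
    nc = len(ec) - 1
    e = [0.0] * (2 * nc + 1)
    for i in range(nc + 1):
        e[2 * i] = ec[i]
    for i in range(nc):
        e[2 * i + 1] = 0.5 * (ec[i] + ec[i + 1])
    return e


def v_cycle(u, f, h):
    """One V-cycle; len(u) - 1 must be a power of two."""
    n = len(u) - 1
    if n == 2:                                   # one unknown: solve exactly
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    smooth(u, f, h)                              # pre-smoothing
    coarse_rhs = restrict(residual(u, f, h))
    coarse_err = v_cycle([0.0] * (n // 2 + 1), coarse_rhs, 2.0 * h)
    correction = prolong(coarse_err)
    for i in range(1, n):
        u[i] += correction[i]                    # coarse-grid correction
    smooth(u, f, h)                              # post-smoothing
    return u


def solve_poisson(n=64, cycles=10):
    """Solve -u'' = pi^2 sin(pi x); the exact solution is sin(pi x)."""
    h = 1.0 / n
    xs = [i * h for i in range(n + 1)]
    f = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    u = [0.0] * (n + 1)
    for _ in range(cycles):
        v_cycle(u, f, h)
    err = max(abs(u[i] - math.sin(math.pi * xs[i])) for i in range(n + 1))
    return u, err


if __name__ == "__main__":
    print("max error:", solve_poisson()[1])
```

The recursion visits each coarser grid once on the way down and once on the way up, which gives the scheme its V shape. A full multigrid scheme would additionally start on the coarsest grid and interpolate upward before cycling; that variant is not shown here.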
Computational Approach to the Statistical Mechanics of Protein Folding
Pub Date : 1995-12-08 DOI: 10.1145/224170.224216
M. Hao, H. Scheraga
A statistical mechanical approach to the protein folding problem is developed based on computer simulations. The properties of proteins related to conformation and folding are determined from the density of states of the protein. A new simulation procedure, the Entropy Sampling Monte Carlo method, is used to determine accurately the density of states of the protein. To enhance the efficiency of sampling the conformational space of a protein, two techniques (a conformational-biased chain regrowth procedure and a jump-walking method) were introduced into the simulation. The approach has been applied to a number of model polypeptides and to a small protein, Bovine Pancreatic Trypsin Inhibitor. The results obtained demonstrate that the new approach is more powerful and produces richer information about the thermodynamics and folding behavior of proteins than conventional simulation methods.
Citations: 3
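The idea of estimating a density of states by sampling, as described in the abstract above, can be illustrated on a toy system. The sketch below uses one common form of entropic sampling: configurations are accepted with probability min(1, exp(S(E_old) - S(E_new))), and the running entropy estimate S is updated from the visit histogram after each pass. Whether this matches the authors' procedure in detail is an assumption. The toy model (independent two-state spins whose "energy" is the number of up spins, so the exact density of states is a binomial coefficient) and all names are hypothetical, and the chain regrowth and jump-walking techniques from the abstract are not shown.

```python
import math
import random


def entropic_sampling(n_spins=10, iterations=8, steps=50000, seed=0):
    """Estimate ln g(k) - ln g(0), where g(k) is the number of spin
    configurations with exactly k up spins (exactly C(n_spins, k))."""
    rng = random.Random(seed)
    entropy = [0.0] * (n_spins + 1)      # running estimate of ln g(k)
    spins = [0] * n_spins
    k = 0                                # current number of up spins
    for _ in range(iterations):
        hist = [0] * (n_spins + 1)
        for _ in range(steps):
            i = rng.randrange(n_spins)
            k_new = k + (1 - 2 * spins[i])           # flipping spin i
            delta = entropy[k] - entropy[k_new]
            # Accept with probability min(1, exp(S(k) - S(k_new))).
            if delta >= 0.0 or rng.random() < math.exp(delta):
                spins[i] ^= 1
                k = k_new
            hist[k] += 1
        # Levels visited more often than average had their entropy
        # underestimated; raise them by the log of the visit count.
        for level, count in enumerate(hist):
            if count > 0:
                entropy[level] += math.log(count)
    return [s - entropy[0] for s in entropy]


if __name__ == "__main__":
    estimate = entropic_sampling()
    for k, s in enumerate(estimate):
        exact = math.log(math.comb(10, k))
        print(f"k={k:2d}  estimated={s:6.2f}  exact={exact:6.2f}")
```

Once the entropy estimate is close to ln g, every level is visited about equally often, which is what lets the walk reach rare low-entropy states that ordinary sampling would almost never see.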
A Structured Approach to Instrumentation System Development and Evaluation
Pub Date : 1995-12-08 DOI: 10.1145/224170.224271
A. Waheed, D. Rover
Software instrumentation is a widely used technique for parallel program performance evaluation, debugging, steering, and visualization. With increasing sophistication of parallel tool development technologies and broadening of application areas where these tools are being used, runtime data collection and management activities are growing in importance; we use the term instrumentation system (IS) to refer to components that support these activities in state-of-the-art parallel tool environments. An IS consists of Local Instrumentation Servers, an Instrumentation System Manager, and a Transfer Protocol. The overheads and perturbation effects attributed to an IS must be accounted for to ensure correct and efficient representation of program behavior, especially for on-line and real-time environments. Moreover, an IS is a key facilitator of integration of tools in an environment. In this paper, we define the primary components of an IS and their roles in an integrated environment, and classify ISs according to selected features. We introduce a structured approach to plan, design, model, evaluate, implement, and validate an IS. The approach provides a means to formally address domain-specific requirements. The modeling and evaluation processes are illustrated in the context of three distinctive IS case studies for PICL, Paradyn, and Vista. Valuable feedback on performance effects of IS parameters and policies can assist developers in making design decisions early in the software development cycle. Additionally, use of structured software engineering methods can support the mapping of an abstract IS model to an implementation of the IS.
Citations: 23
Symbolic Array Dataflow Analysis for Array Privatization and Program Parallelization
Pub Date : 1995-12-08 DOI: 10.1145/224170.224318
Junjie Gu, Zhiyuan Li, Gyungho Lee
Array dataflow information plays an important role in the successful automatic parallelization of Fortran programs. This paper proposes a powerful symbolic array dataflow analysis to support array privatization and loop parallelization for programs with arbitrary control flow graphs and acyclic call graphs. Our scheme summarizes array access information using guarded array regions and propagates such regions over a Hierarchical Supergraph (HSG). The use of guards allows us to use the information in IF conditions to sharpen the array dataflow analysis and thereby to handle difficult cases which elude other existing techniques. The guarded array regions retain the simplicity of set operations for regular array regions in common cases, and they enhance regular array regions in complicated cases by using guards to handle complex symbolic expressions and array shapes. Scalar values that appear in array subscripts and loop limits are substituted on the fly during the array information propagation, which disambiguates the symbolic values precisely for set operations. We present efficient algorithms that implement our scheme. Initial experiments of applying our analysis to Perfect Benchmarks show promising results of improved array privatization.
Citations: 61
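The role of guards in the abstract above can be shown with a deliberately simplified data structure. In this sketch a guarded array region is a one-dimensional index interval with concrete integer bounds plus a guard, represented as a set of condition names that must all hold. The class, the function, and the representation are hypothetical; the paper's regions carry symbolic bounds and are propagated over a Hierarchical Supergraph, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardedRegion:
    guard: frozenset    # conjunction of condition names, e.g. {"flag"}
    lo: int             # first index accessed
    hi: int             # last index accessed (inclusive)


def exposed(use, definition):
    """Return the parts of `use` that are NOT covered by an earlier
    `definition` in the same loop iteration (the upward-exposed reads).
    The definition only counts if its guard is implied by the guard of
    the use, i.e. the write is known to have executed whenever the read
    does."""
    if not definition.guard <= use.guard:
        return [use]                     # the write may have been skipped
    pieces = []
    if use.lo < definition.lo:
        pieces.append(GuardedRegion(use.guard, use.lo,
                                    min(use.hi, definition.lo - 1)))
    if use.hi > definition.hi:
        pieces.append(GuardedRegion(use.guard,
                                    max(use.lo, definition.hi + 1), use.hi))
    return pieces


if __name__ == "__main__":
    flag = frozenset({"flag"})
    write = GuardedRegion(flag, 1, 100)      # IF (flag) A(1:100) = ...
    read = GuardedRegion(flag, 1, 100)       # IF (flag) ... = A(1:100)
    # No exposed reads: A can be privatized for this loop.
    print(exposed(read, write))
    # An unguarded read is not covered by the guarded write.
    print(exposed(GuardedRegion(frozenset(), 1, 100), write))
```

An analysis that ignored the IF conditions would have to treat the guarded write as one that may not happen and would report the whole read as exposed; keeping the guard lets it prove that the read is always preceded by the write.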