
Proceedings Scalable High Performance Computing Conference SHPCC-92: Latest Publications

On the influence of programming models on shared memory computer performance
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232630
T. Ngo, L. Snyder
Experiments are presented indicating that on shared-memory machines, programs written in the nonshared-memory programming model generally offer better performance, in addition to being more portable and scalable. The authors study the LU decomposition problem and a molecular dynamics simulation on three shared-memory machines with widely differing architectures, and analyze the results from three perspectives: performance, speedup, and scaling.
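The LU kernel this study benchmarks can be stated compactly; for reference, the following is a minimal sequential Doolittle factorization, my own sketch rather than the authors' code:

```python
# Minimal sequential LU decomposition (Doolittle, no pivoting).
# A sketch of the kernel studied in the paper, not the authors' code.
def lu_decompose(a):
    """Factor a square matrix (list of lists) into L and U with L*U = A."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Row i of U: subtract contributions of already-computed L rows.
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        lower[i][i] = 1.0
        # Column i of L below the diagonal.
        for j in range(i + 1, n):
            lower[j][i] = (a[j][i] - sum(lower[j][k] * upper[k][i] for k in range(i))) / upper[i][i]
    return lower, upper
```

In the parallel variants the papers compare, the outer loop's per-row updates become the unit of distribution across processors.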
Citations: 36
A methodology for visualizing performance of loosely synchronous programs
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232663
S. Sarukkai, D. Kimelman, L. Rudolph
Introduces a new set of views for displaying the progress of loosely synchronous computations involving large numbers of processors on large problems. The authors suggest a methodology for employing these views in succession in order to gain progressively more detail concerning program behavior. At each step, focus is refined to include just those program sections or processors which have been determined to be bottlenecks. The authors present their experience in using this methodology to uncover performance problems in selected applications.
Citations: 18
Synthesizing scalable computations from sequential programs
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232639
R. Govindaraju, B. Szymański
Advocates an approach that supports decomposition and scalable synthesis of a parallel computation. The decomposition is achieved with the aid of annotation languages that enable one to annotate programs written in various programming languages. The authors have implemented annotations for the Equational Programming Language (EPL). The synthesis is achieved with the aid of a simple configuration language that describes the computation in terms of interactions of programs and their fragments created by annotations. The decomposition and synthesis simplify the process of: (1) determining the grain size for efficient parallel processing, (2) data distribution, and (3) run-time optimization. The authors discuss annotations and configurations suitable for parallel programs written in EPL and FORTRAN and their use in scalable synthesis. They first discuss how annotations can define different computational blocks from a single program and how these blocks determine data distributions across processors. They outline a design of the configurator and show how FORTRAN programs can be configured into a hierarchical structure of computational blocks. An example of LU decomposition written in both EPL and FORTRAN illustrates the process of program decomposition and synthesis. The authors discuss code generation for synthesized computations, and some possible extensions.
Citations: 1
METRICS: a tool for the display and analysis of mappings in message-passing multicomputers
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232647
V. Lo, K. Windisch, R. Datta
METRICS is a software tool for the static (compile time) analysis of mappings. METRICS is designed for use in the mapping of parallel computations consisting of a set of communicating parallel processes which communicate through explicit message passing. The target architectures currently supported include the mesh and hypercube as well as user-defined topologies. The underlying routing schemes include store-and-forward, virtual cut-through, and wormhole routing. METRICS is designed to display the mapping in a clear, logical, and intuitive format so that the user can evaluate it quantitatively as well as visually. The contributions of METRICS include its rich underlying formalism, the temporal communication graph, a hybrid between the static task graph and the DAG; its mechanisms for handling massive parallelism using subviews, scrolling, and hierarchical grouping; and the broad spectrum of mapping metrics used in the analysis of each mapping.
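One representative of the "mapping metrics" such a tool reports is dilation: the routing distance a task-graph edge incurs under a mapping. On a hypercube, that distance is the Hamming distance between node labels. A minimal sketch (my own illustration, not METRICS code):

```python
def dilation(edges, mapping):
    """Average hypercube dilation of a task-graph mapping.
    edges: list of (task, task) pairs; mapping: task -> hypercube node label."""
    def hamming(a, b):
        # Number of differing address bits = hop count between hypercube nodes.
        return bin(a ^ b).count("1")
    dists = [hamming(mapping[u], mapping[v]) for u, v in edges]
    return sum(dists) / len(dists)
```

An ideal mapping keeps average dilation at 1.0, meaning every communicating pair of tasks sits on adjacent processors.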
Citations: 2
Evaluating parallel languages for molecular dynamics computations
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232682
T. Clark, R. V. Hanxleden, K. Kennedy, C. Koelbel, L.R. Scott
The paper describes the practicalities of porting a basic molecular dynamics computation to a distributed-memory machine. In the process, it shows how program annotations can aid in parallelizing a moderately complex code. It also argues that algorithm replacement may be necessary in parallelization, a task which cannot be performed automatically. The paper closes with some results from a parallel GROMOS implementation.
Citations: 20
Portable execution traces for parallel program debugging and performance visualization
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232661
A. Couch, D.W. Krumme
There is much interest in defining a standard for event traces collected from parallel architectures. A standard would support free data and tool sharing among researchers working on varied architectures. But defining that standard has proved to be difficult. Any standard must allow user-defined events and avoid or hide event semantics as much as possible. The authors propose a standard based on a declaration language, which describes how the raw event trace is to be translated into a normal form. Analysis tools then share a common interface to a compiler and interpreter which use the declarations to fetch, transform, and augment trace data. This concept is evaluated through construction of a prototype declaration compiler and interpreter.
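The declaration-driven translation can be pictured as a table of per-event-type field declarations that an interpreter applies to raw trace records; the event names and fields below are hypothetical, for illustration only:

```python
# Hypothetical declarations: event type -> ordered field names.
# A real declaration language would also carry types and transformations.
DECLS = {
    "send": ("time", "src", "dest", "bytes"),
    "recv": ("time", "dest", "src", "bytes"),
}

def normalize(raw):
    """Translate a raw trace record (type, *values) into a dict keyed
    by declared field names, i.e. the 'normal form' tools share."""
    kind, *values = raw
    return dict(zip(("event",) + DECLS[kind], (kind,) + tuple(values)))
```

The point of the design is that analysis tools only ever see the normal form; machine-specific raw layouts stay behind the declarations.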
Citations: 7
Preliminary experience in developing a parallel thin-layer Navier Stokes code and implications for parallel language design
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232631
D. Olander, R. Schnabel
Describes preliminary experience in developing a parallel version of a reasonably large, multi-grid based computational fluid dynamics code, and implementing this version on a distributed memory multiprocessor. Creating an efficient parallel code has involved interesting decisions and tradeoffs in the mapping of the key data structures to the processors. It also has involved significant reordering of computations in computational kernels, including the use of pipelining, to achieve good efficiency. The authors discuss these issues and their computational experiences with different alternatives, and briefly discuss the implications of these experiences upon the design of effective languages for distributed parallel computation.
Citations: 17
Vienna Fortran 90
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232688
S. Benkner, Barbara Chapman, H. Zima
Vienna Fortran 90 is a language extension of Fortran 90 which enables the user to write programs for distributed memory multiprocessors using global data references only. Performance of software on such systems is profoundly influenced by the manner in which data is distributed to the processors. Hence, Vienna Fortran 90 provides the user with a wide range of facilities for the mapping of data to processors. It combines the advantages of the shared memory programming paradigm with mechanisms for explicit user control of those aspects of the program which have the greatest impact on efficiency. The paper presents the major features of Vienna Fortran 90 and gives examples of their use.
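The core idea, that the user declares a distribution and every processor can then derive ownership of any global element locally, can be illustrated with a toy BLOCK mapping (a sketch in Python, not Vienna Fortran syntax):

```python
def block_owner(i, n, p):
    """Return the processor owning global index i (0-based) when an
    n-element array is BLOCK-distributed over p processors."""
    block = -(-n // p)  # ceiling division: elements per processor
    return i // block

# Because ownership is a pure function of (i, n, p), the compiler can
# turn a global reference a[i] into either a local access or a message
# without any run-time directory lookup.
```

CYCLIC and user-defined distributions work the same way, with a different ownership function behind the same global-reference syntax.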
Citations: 59
A parallel programming tool for scheduling on distributed memory multiprocessors
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232673
T. Yang, A. Gerasoulis
PYRROS is a tool for scheduling and parallel code generation for distributed memory message passing architectures. In this paper, the authors discuss several compile-time optimization techniques used in PYRROS. The scheduling part of PYRROS optimizes both data and program mapping so that the parallel time is minimized. The communication and storage optimization part facilitates the generation of efficient parallel codes. The related issues of partitioning and 'owner computes rule' are discussed and the importance of program scheduling is demonstrated.
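The 'owner computes rule' mentioned here assigns each loop iteration to the processor that owns the left-hand-side element; a schematic version (my own illustration, with a hypothetical owner_of function) is:

```python
def owner_computes(my_rank, owner_of, x, y):
    """Each processor executes only iterations whose LHS element it owns.
    owner_of: maps an index to the owning processor's rank."""
    for i in range(len(x)):
        if owner_of(i) == my_rank:  # owner-computes filter
            x[i] = 2 * y[i]         # LHS x[i] is local to this processor
    return x
```

With a cyclic ownership function like `lambda i: i % 2` and two processors, rank 0 updates the even indices and rank 1 the odd ones; non-local right-hand-side values would be fetched by message passing before the loop.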
Citations: 27
Data remapping for distributed-memory multicomputers
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232660
C. Chase, A. Reeves
The fragmented memory model of distributed-memory multicomputers, such as the Intel iPSC and Paragon series of computers, and the Thinking Machines CM-5, introduces significant complexity into the compilation process. Since most conventional programming languages provide a model of a global memory, a distributed-memory compiler must translate all data references to correspond to the fragmented memory on the system hardware. This paper describes a technique called array remapping which can automatically be applied to parallel loops containing arbitrary array subscripts. The compile time and runtime aspects of providing support for remapping are described, and the performance of this implementation of remapping is presented.
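The global-to-local index translation such a compiler must generate for a block-distributed array can be sketched as follows (hypothetical helper names; a simple block distribution is assumed):

```python
def global_to_local(i, n, p):
    """Translate global index i into (owner rank, local offset) for an
    n-element array block-distributed over p processors."""
    block = -(-n // p)  # ceiling division: block size per processor
    return i // block, i % block

def local_to_global(rank, offset, n, p):
    """Inverse translation: recover the global index from a local one."""
    block = -(-n // p)
    return rank * block + offset
```

Remapping an array between two distributions amounts to applying this translation under the old layout, exchanging elements, and re-deriving offsets under the new one.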
Citations: 5