Predicting the performance of large programs on scalable multicomputers
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232692
B. Stramm, F. Berman
The paper introduces the retargetable program-sensitive (RPS) model, which predicts the performance of static, data-independent parallel programs mapped to message-passing multicomputers. It shows that the model accurately predicts the performance of mapped programs by comparing RPS predictions to actual execution times in the Poker parallel programming environment. The paper also previews plans for further verification of the model on the NCube2 and other multicomputers.
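The abstract does not give the RPS model's equations, so the following is only a generic illustration of the form such predictors often take, not RPS itself: the completion time of a statically mapped, data-independent program is estimated as the busiest processor's compute time plus its message costs under a startup-plus-bandwidth communication model.

```latex
% Illustrative only -- a generic static-mapping predictor, not the RPS model.
% c_t is the cost of task t, alpha the message startup cost, beta the
% per-byte transfer cost, and |m| the size of message m.
\[
T_{\mathrm{pred}} \;=\; \max_{p}\;\Bigl(\;\sum_{t \in \mathrm{tasks}(p)} c_t
  \;+\; \sum_{m \in \mathrm{msgs}(p)} \bigl(\alpha + \beta\,\lvert m\rvert\bigr)\Bigr)
\]
```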
{"title":"Predicting the performance of large programs on scalable multicomputers","authors":"B. Stramm, F. Berman","doi":"10.1109/SHPCC.1992.232692","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232692","url":null,"abstract":"The paper introduces the retargetable program-sensitive (RPS) model which predicts the performance of static, data-independent parallel programs mapped to message-passing multicomputers. It shows that the model accurately predicts the performance of mapped programs by comparing RPS predictions to actual execution times in the Poker parallel programming environment. The paper also previews plans for further verification of the model on the NCube2 and other multicomputers.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"279 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123149137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An expressive annotation model for generating SPMD programs
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232644
E. Paalvast, L. Breebaart, H. Sips
This paper illustrates two major points. First, the authors discuss a general, conceptual model for SPMD program-generating systems, and demonstrate that this model allows one to capture a broad range of different program semantics. Second, they show that it is possible to fit the concepts of this model into an annotation language that allows an SPMD program-generating system to fully utilize all the possibilities present in the model.
{"title":"An expressive annotation model for generating SPMD programs","authors":"E. Paalvast, L. Breebaart, H. Sips","doi":"10.1109/SHPCC.1992.232644","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232644","url":null,"abstract":"This paper illustrates two major points. First, the authors discuss a general, conceptual model for SPMD program generating systems, and demonstrate that this model allows one to capture a broad range of different program semantics. Second, they show that it is possible to fit the concepts of this model into an annotation language that allows an SPMD program generating system to fully utilize all the possibilities present in the model.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117266247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parallel volume rendering for curvilinear volumes
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232693
J. Challinger
Presents results of investigations into techniques for volume rendering using parallel processing on a multiple-instruction, multiple-data (MIMD) architecture with non-uniform-access shared memory. In particular, two parallel algorithms are given for volume rendering of curvilinear volumes. These two algorithms have been implemented on a BBN TC2000, and their performance has been measured and analyzed.
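The abstract does not describe the two algorithms themselves; the sketch below shows only one common MIMD decomposition for ray-cast volume rendering (dynamically self-scheduled image tiles over shared memory), with sampleRay standing in for the actual curvilinear-volume traversal.

```cpp
// Hypothetical sketch: image-space parallel ray casting over shared memory.
// Dynamically self-scheduled tiles approximate load balancing on a NUMA
// machine such as the TC2000; sampleRay is a placeholder, not the paper's
// curvilinear-volume traversal.
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

constexpr int W = 512, H = 512, TILE = 32;

float sampleRay(int x, int y) {
    return float(x ^ y) / float(W);   // placeholder for ray integration
}

int main() {
    std::vector<float> image(W * H, 0.0f);
    std::atomic<int> nextTile{0};
    const int tilesX = W / TILE, nTiles = tilesX * (H / TILE);

    auto worker = [&] {
        for (int t = nextTile++; t < nTiles; t = nextTile++) {
            const int tx = (t % tilesX) * TILE, ty = (t / tilesX) * TILE;
            for (int y = ty; y < ty + TILE; ++y)      // pixels are independent,
                for (int x = tx; x < tx + TILE; ++x)  // so tiles need no locks
                    image[y * W + x] = sampleRay(x, y);
        }
    };

    const unsigned n = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < n; ++i) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```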
{"title":"Parallel volume rendering for curvilinear volumes","authors":"J. Challinger","doi":"10.1109/SHPCC.1992.232693","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232693","url":null,"abstract":"Presents results of investigations into techniques for volume rendering using parallel processing on a multiple-instruction, multiple-data (MIMD) architecture that has a non-uniform access, shared memory. In particular, two parallel algorithms are given for volume rendering of curvilinear volumes. These two algorithms have been implemented on a BBN TC2000, and their performance has been measured and analyzed.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127455327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Portable parallel Level-3 BLAS in Linda
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232664
B. Ghosh, M. Schultz
Describes an approach to providing an efficient Level-3 BLAS library across a variety of parallel architectures using C-Linda. A blocked linear algebra program calling the sequential Level-3 BLAS can now run in both shared and distributed memory environments (which support Linda) simply by replacing each call with a call to the corresponding parallel Linda Level-3 BLAS. The authors summarise some of the implementation and algorithmic issues related to the matrix multiplication subroutine. Because the matrix algorithms are all block-structured, the authors are particularly interested in parallel computers with hierarchical memory systems. Experimental data for their implementations show substantial speedups on shared memory, disjoint memory and networked configurations of processors. The authors also demonstrate the use of their parallel subroutines in blocked dense LU decomposition and present some preliminary experimental data.
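A minimal sketch of the worker-farm structure behind such a parallel block routine, assuming a square matrix whose order is divisible by the block size. This is not the authors' C-Linda code: in C-Linda the pool of block tasks would live in tuple space, claimed with in() and posted with out(); here an atomic counter over shared memory plays that role.

```cpp
// Hypothetical sketch of the worker-farm behind a parallel Level-3 BLAS call
// (C += A*B for an n x n matrix in square nb x nb blocks). An atomic counter
// stands in for the Linda tuple space of block tasks.
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

void parallel_gemm(const double* A, const double* B, double* C,
                   int n, int nb) {           // assumes nb divides n
    const int blocks = n / nb;
    std::atomic<int> next{0};
    auto worker = [&] {
        // Each task computes one disjoint nb x nb block of C, so no locking
        // is needed on the output.
        for (int t = next++; t < blocks * blocks; t = next++) {
            const int bi = (t / blocks) * nb, bj = (t % blocks) * nb;
            for (int k = 0; k < n; ++k)
                for (int i = bi; i < bi + nb; ++i)
                    for (int j = bj; j < bj + nb; ++j)
                        C[i * n + j] += A[i * n + k] * B[k * n + j];
        }
    };
    const unsigned nthreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned p = 0; p < nthreads; ++p) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```

Swapping such a routine in for the sequential matrix-multiply call inside a blocked algorithm is exactly the drop-in substitution the abstract describes for blocked dense LU.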
{"title":"Portable parallel Level-3 BLAS in Linda","authors":"B. Ghosh, M. Schultz","doi":"10.1109/SHPCC.1992.232664","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232664","url":null,"abstract":"Describes an approach towards providing an efficient Level-3 BLAS library over a variety of parallel architectures using C-Linda. A blocked linear algebra program calling the sequential Level-3 BLAS can now run on both shared and distributed memory environments (which support Linda) by simply replacing each call by a call to the corresponding parallel Linda Level-3 BLAS. The authors summarise some of the implementation and algorithmic issues related to the matrix multiplication subroutine. All the various matrix algorithms being block-structured, they are particularly interested in parallel computers with hierarchical memory systems. Experimental data for their implementations show substantial speedups on shared memory, disjoint memory and networked configurations of processors. The authors also present the use of their parallel subroutines in blocked dense LU decomposition and present some preliminary experimental data.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116390614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Selective monitoring using performance metric predicates
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232655
C. E. Fineman, P. Hontalas
The field of parallel processing is going through an important evolution in technology characterized by a significant increase in the number of processors within such systems. As the number of processors increases, the conventional techniques for monitoring the performance of parallel systems will produce large amounts of data in the form of event trace files. The authors propose one possible solution to this data size problem: performance metric predicates. These predicates permit the user to define performance parameters that control the output of event trace data during the application's execution time. The authors assert that the use of performance metric predicates provides a powerful and useful tool for the control of event trace data output from large, complex systems.
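A hedged sketch of the idea (the types and names are invented for illustration, not the authors' interface): the user supplies a boolean predicate over current performance metrics, and a trace record is emitted only while the predicate holds, cutting trace volume at the source.

```cpp
// Hypothetical sketch of a performance metric predicate. Metrics and
// trace_event are illustrative stand-ins for whatever the monitoring
// runtime actually samples and records.
#include <cstdio>
#include <functional>

struct Metrics {                 // metrics sampled by the runtime
    double msg_rate;             // messages per second on this node
    double idle_fraction;        // fraction of time spent idle
};

using Predicate = std::function<bool(const Metrics&)>;

void trace_event(const Metrics& m, const Predicate& pred,
                 const char* name, double timestamp) {
    if (!pred(m)) return;        // predicate false: suppress the record
    std::printf("%f %s msg_rate=%f idle=%f\n",
                timestamp, name, m.msg_rate, m.idle_fraction);
}

int main() {
    // Only trace while a node is more than 30% idle.
    Predicate p = [](const Metrics& m) { return m.idle_fraction > 0.3; };
    trace_event({1200.0, 0.45}, p, "recv_wait", 0.0017);  // emitted
    trace_event({5000.0, 0.05}, p, "send",      0.0021);  // suppressed
}
```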
{"title":"Selective monitoring using performance metric predicates","authors":"C. E. Fineman, P. Hontalas","doi":"10.1109/SHPCC.1992.232655","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232655","url":null,"abstract":"The field of parallel processing is going through an important evolution in technology characterized by a significant increase in the number of processors within such systems. As the number of processors increases, the conventional techniques for monitoring the performance of parallel systems will produce large amounts of data in the form of event trace files. The authors propose one possible solution to this data size problem: performance metric predicates. These predicates permit the user to define performance parameters that control the output of event trace data during the application's execution time. The authors assert that the use of performance metric predicates provides a powerful and useful tool for the control of event trace data output from large, complex systems.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126667794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Problem specific environments for parallel computing
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232658
L.S. Auvil, C. Ribbens, L. T. Watson
Considers general-purpose and problem-specific tools for parallel problem solving. A comparison is made between the two approaches, in terms of effort and usefulness, for two example problems. The advantages of special-purpose, problem-specific environments are described, and the effort required to construct such environments is seen to be reasonable.
{"title":"Problem specific environments for parallel computing","authors":"L.S. Auvil, C. Ribbens, L. T. Watson","doi":"10.1109/SHPCC.1992.232658","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232658","url":null,"abstract":"Considers general-purpose and problem-specific tools for parallel problem solving. A comparison is made between the two approaches, in terms of effort and usefulness, for two example problems. The advantages of special-purpose, problem-specific environments are described, and the effort required to construct such environments is seen to be reasonable.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124165667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Programming distributed memory parallel computers without explicit message passing
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232683
F. André, T. Priol
The need to program distributed memory parallel computers (DMPCs) with explicit message passing discourages the use of this type of architecture. The objective is to provide a programming environment that hides the message-passing aspects of DMPCs and allows the use of traditional languages as input. The paper describes two different approaches that satisfy this goal: a compiler which translates sequential code into distributed parallel processes, and a shared virtual memory which offers the user a global address space. Examples and results for both mechanisms are given, and the promise and appeal of each approach are outlined.
{"title":"Programming distributed memory parallel computers without explicit message passing","authors":"F. André, T. Priol","doi":"10.1109/SHPCC.1992.232683","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232683","url":null,"abstract":"Programming distributed memory parallel computers with explicit message passing refrains the use of this type of architecture. The objective is to provide a programmed environment which will hide the message passing aspects of DMPCs, and will allow the use of traditional languages as input. The paper describes two different approaches which satisfy this goal: a compiler which translates sequential code into distributed parallel processes and a shared virtual memory which offers to the user a global address space. Examples and results for both mechanisms are given. The hope and the interest of each approach is outlined.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134078236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstractions for parallel N-body simulations
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232690
S. Bhatt, M. Chen, C.-Y. Lin, Peng Liu
Introduces C++ programming abstractions for maintaining load-balanced partitions of irregular and adaptive trees. Such abstractions are useful across a range of applications and MIMD architectures. The use of these abstractions is illustrated for gravitational N-body simulation. The strategy for parallel N-body simulation is based on a technique for implicitly representing a global tree across multiple processors. This substantially reduces the programming complexity and the overhead for distributed memory architectures. The overhead is further reduced by maintaining incremental data structures.
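The abstract names the abstractions but not their interface; the fragment below is a hypothetical illustration of the underlying idea, a tree whose nodes carry a work estimate so that whole subtrees can be handed to processors in roughly equal cost shares, with the global tree remaining implicit in which processor owns which subtrees. All names are invented for illustration.

```cpp
// Hypothetical illustration of a load-balanced tree abstraction (invented
// names, not the paper's interface).
#include <array>
#include <memory>
#include <vector>

struct Body { double x, y, mass; };

struct TreeNode {
    double cx = 0, cy = 0, mass = 0;            // centre of mass, total mass
    long   work = 0;                            // estimated force-evaluation cost
    std::array<std::unique_ptr<TreeNode>, 4> child;
    std::vector<const Body*> bodies;            // bodies stored at a leaf
};

// Peel off subtrees until roughly `target` work is gathered; calling this
// once per processor yields a balanced partition of the (implicit) global tree.
void take_partition(TreeNode* n, long target, long& taken,
                    std::vector<TreeNode*>& mine) {
    if (n == nullptr || taken >= target) return;
    if (n->work <= target - taken) {            // whole subtree fits the budget
        mine.push_back(n);
        taken += n->work;
        return;
    }
    for (auto& c : n->child)                    // otherwise split the subtree
        take_partition(c.get(), target, taken, mine);
}
```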
{"title":"Abstractions for parallel N-body simulations","authors":"S. Bhatt, M. Chen, C.-Y. Lin, Peng Liu","doi":"10.1109/SHPCC.1992.232690","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232690","url":null,"abstract":"Introduces C++ programming abstractions for maintaining load-balanced partitions of irregular and adaptive trees. Such abstractions are useful across a range of applications and MIMD architectures. The use of these abstractions is illustrated for gravitational N-body simulation. The strategy for parallel N-body simulation is based on a technique for implicitly representing a global tree across multiple processors. This substantially reduces the programming complexity and the overhead for distributed memory architectures. The overhead is further reduced by maintaining incremental data structures.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123601967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Solving equality constrained least squares problems
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232669
U. B. Vemulapati
Constrained least squares problems occur often in practice, mostly as sub-problems in many optimization contexts. For solving large and sparse instances of these problems on parallel architectures with distributed memory, the use of static data structures to represent the sparse matrix is preferred during the factorization. However, accurate detection of the rank of the constraint matrix is also critical to the accuracy of the computed solution. The author examines the solution of the constrained problem using a weighting approach. All computations can be carried out using a static data structure that is generated from the symbolic structure of the input matrices, making use of a recently proposed rank detection procedure. The author shows good speed-ups in solving large and sparse equality-constrained least squares problems on hypercubes of up to 128 processors.
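The abstract does not state the formulation, but the standard weighting approach for an equality-constrained least squares problem replaces the constraint with a heavily weighted residual block, so a single sparse factorization with a fixed (static) structure suffices:

```latex
% Weighting method for equality-constrained least squares: as the weight
% omega grows, the solution of the stacked problem approaches the solution
% of the constrained one.
\[
\min_{x} \|Ax-b\|_2 \ \text{ s.t. } \ Bx = d
\qquad\longrightarrow\qquad
\min_{x} \left\| \begin{pmatrix} \omega B \\ A \end{pmatrix} x
 - \begin{pmatrix} \omega d \\ b \end{pmatrix} \right\|_2 ,
\quad \omega \gg 1 .
\]
```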
{"title":"Solving equality constrained least squares problems","authors":"U. B. Vemulapati","doi":"10.1109/SHPCC.1992.232669","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232669","url":null,"abstract":"Constrained least squares problems occur often in practice, mostly as sub-problems in many optimization contexts. For solving large and sparse instances of these problems on parallel architectures with distributed memory, the use of static data structures to represent the sparse matrix is preferred during the factorization. But the accurate detection of the rank of the constraint matrix is also very critical to the accuracy of the computed solution. The author examines the solution of the constrained problem using weighting approach. All computations can be carried out using a static data structure that is generated using the symbolic structure of the input matrices, making use of a recently proposed rank detection procedure. The author shows good speed-ups in solving large and sparse equality conditioned least squares problems on hypercubes of up to 128 processors.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126523961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parameterized memory/processor optimizing FORTRAN compiler for parallel computers
Pub Date: 1992-04-26 · DOI: 10.1109/SHPCC.1992.232645
D. Nosenchuck
A new approach to generating low-conflict parallel instructions for complex applications is introduced in this paper. This method is presented within the context of a FORTRAN compiler. An approximate simulator has been incorporated within a parallel-code/domain-decomposition loop within the compiler. The simulator estimates the performance of candidate instruction segments, and guides the selection of appropriate code transformations, heuristics, and data storage strategies. At present, many aspects of the target machine are parameterized, to permit investigations of a number of parallel-computer architectures. In this paper, the compiler is illustrated for a Navier-Stokes computer target node application.
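As a hedged sketch of the loop the abstract outlines (the names and the toy cost model are invented, not the compiler's actual interface): candidate code variants are scored by an approximate, parameterized machine simulator, and the cheapest variant is selected for code generation instead of executing anything on the target.

```cpp
// Hypothetical sketch of a simulate-and-select compiler loop. A toy cost
// model over counted operations stands in for the approximate simulator.
#include <limits>
#include <vector>

struct MachineParams {            // parameterized target machine
    double flop_ns, mem_ns, conflict_ns;
};

struct Variant {                  // one transformed loop nest + data layout
    long flops, loads, bank_conflicts;
};

// Approximate simulator: estimated run time of a variant on the target.
double estimate_ns(const Variant& v, const MachineParams& m) {
    return m.flop_ns * v.flops + m.mem_ns * v.loads
         + m.conflict_ns * v.bank_conflicts;
}

// Keep the candidate the simulator scores cheapest (assumes a non-empty set).
Variant pick_best(const std::vector<Variant>& cands, const MachineParams& m) {
    Variant best = cands.front();
    double best_t = std::numeric_limits<double>::infinity();
    for (const auto& v : cands) {
        double t = estimate_ns(v, m);
        if (t < best_t) { best_t = t; best = v; }
    }
    return best;
}
```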
{"title":"Parameterized memory/processor optimizing FORTRAN compiler for parallel computers","authors":"D. Nosenchuck","doi":"10.1109/SHPCC.1992.232645","DOIUrl":"https://doi.org/10.1109/SHPCC.1992.232645","url":null,"abstract":"A new approach to generating low-conflict parallel instructions for complex applications is introduced in this paper. This method is presented within the context of a FORTRAN compiler. An approximate simulator has been incorporated within a parallel-code/domain-decomposition loop within the compiler. The simulator estimates the performance of candidate instruction segments, and guides the selection of appropriate code transformations, heuristics, and data storage strategies. At present, many aspects of the target machine are parameterized, to permit investigations of a number of parallel-computer architectures. In this paper, the compiler is illustrated for a Navier-Stokes computer target node application.<<ETX>>","PeriodicalId":254515,"journal":{"name":"Proceedings Scalable High Performance Computing Conference SHPCC-92.","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122234433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}