Proceedings Scalable High Performance Computing Conference SHPCC-92: latest articles

Parallel preconditioning and approximation inverses on the Connection Machine
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232685
M. Grote, H. Simon
The authors present a new approach to preconditioning for very large, sparse, non-symmetric linear systems. It explicitly computes an approximate inverse to the original matrix that can be applied most efficiently for iterative methods on massively parallel machines. The algorithm and its implementation on the Connection Machine CM-2 are discussed in detail and supported by timings obtained from real problem data.
Citations: 63
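The abstract above does not spell out how the approximate inverse is built, so the following is only a minimal sketch under assumptions, not the authors' algorithm. One standard way to compute an explicit approximate inverse M is to minimize ||AM - I||_F over a fixed sparsity pattern; the objective then splits into one independent least-squares problem per column, which is what makes the construction attractive on parallel hardware. The sketch uses the simplest possible pattern (M diagonal), and the function names and the 3x3 example matrix are hypothetical.

```python
def diagonal_approximate_inverse(A):
    """Return d such that M = diag(d) minimizes ||A M - I||_F.

    Column j decouples into min over m of ||m * A[:, j] - e_j||_2,
    whose solution is m = A[j][j] / ||A[:, j]||^2.
    Assumes A is square with no all-zero column.
    """
    n = len(A)
    d = []
    for j in range(n):  # every column is independent, hence parallelizable
        col_norm_sq = sum(A[i][j] ** 2 for i in range(n))
        d.append(A[j][j] / col_norm_sq)
    return d


def frobenius_residual(A, d):
    """Compute ||A diag(d) - I||_F for a diagonal scaling d."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(n):
            entry = A[i][j] * d[j] - (1.0 if i == j else 0.0)
            total += entry * entry
    return total ** 0.5


# Hypothetical small non-symmetric test matrix.
A = [[4.0, 1.0, 0.0],
     [2.0, 5.0, 1.0],
     [0.0, 1.0, 3.0]]
d = diagonal_approximate_inverse(A)
jacobi = [1.0 / A[j][j] for j in range(3)]  # plain diagonal scaling, for comparison
print(frobenius_residual(A, d), frobenius_residual(A, jacobi))
```

Applying such an M inside an iterative method costs only a matrix-vector product, with no triangular solves to serialize. A richer sparsity pattern would replace the scalar formula with a small least-squares solve per column.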
Adaptive methods and rectangular partitioning problem
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232665
C. Ozturan, B. Szymanski, J.E. Flaherty
Partitioning problems for rectangular domains having nonuniform workload for mesh-connected SIMD architectures are discussed. The rectangular workloads considered result from applying adaptive methods to the solution of hyperbolic differential equations on SIMD machines. A new form of the partitioning problem is defined in which sub-meshes of processors are assigned to tasks, each task being a discretized rectangular sub-domain. The work per processor (i.e. the work density) is balanced among the K sub-rectangular meshes of processors. First, a formalization of the 1D problem is given and an O(Kn^3) time and (Kn^2) space optimal algorithm is proposed. A more efficient heuristic algorithm is also given for the 1D problem. Finally, 2D heuristics are developed by projecting the weights onto a 1D array.
Citations: 8
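For orientation only, here is a sketch of the classic and simpler 1D chain-partitioning problem: split an array of weights into K contiguous intervals so that the heaviest interval is as light as possible. This is not the paper's formulation, which balances work density across sub-meshes of processors and has different complexity; the dynamic program below, its function name, and the sample weights are all assumptions made for illustration.

```python
def partition_1d(weights, k):
    """Split weights into k non-empty contiguous parts minimizing the max part sum.

    Returns (bottleneck, bounds) where bounds is a list of half-open
    (start, end) index pairs. Requires 1 <= k <= len(weights) and
    non-negative weights. Runs in O(k * n^2) time.
    """
    n = len(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    inf = float("inf")
    # best[p][i]: smallest achievable bottleneck for the first i weights in p parts
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for p in range(1, k + 1):
        for i in range(p, n + 1):
            for j in range(p - 1, i):  # last part covers weights[j:i]
                cost = max(best[p - 1][j], prefix[i] - prefix[j])
                if cost < best[p][i]:
                    best[p][i] = cost
                    cut[p][i] = j

    # Walk the recorded cut points backwards to recover the intervals.
    bounds = []
    i = n
    for p in range(k, 0, -1):
        j = cut[p][i]
        bounds.append((j, i))
        i = j
    bounds.reverse()
    return best[k][n], bounds


weights = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical per-cell workloads
bottleneck, bounds = partition_1d(weights, 3)
print(bottleneck, bounds)
```

A 2D heuristic in the spirit of the abstract's last sentence could sum a weight grid along one axis and feed the resulting 1D array to a routine like this one.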
Compiler optimizations for distributed-memory programs
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232651
Rajesh K. Gupta
The single-program multiple-data (SPMD) mode of execution is an effective approach for exploiting parallelism in programs written using the shared-memory programming model on distributed-memory machines. However, during SPMD execution one must consider dependencies due to the transfer of data among the processors. Such dependencies can be avoided by reordering the communication operations (sends and receives). However, no formal framework has been developed to explicitly recognize and represent such dependencies. The author identifies two types of dependencies, namely communication dependencies and scheduling dependencies, and proposes to represent these dependencies explicitly in the program dependency graph. Next, he presents program transformations that use this dependency information to transform the program and increase the degree of parallelism exploited. Finally, the author presents program transformations that reduce communication-related run-time overhead.
Citations: 7
A look at scalable dense linear algebra libraries
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232670
J. Dongarra, R. V. D. Geijn, D. Walker
Discusses the essential design features of a library of scalable software for performing dense linear algebra computations on distributed memory concurrent computers. The square block scattered decomposition is proposed as a flexible and general-purpose way of decomposing most, if not all, dense matrix problems. An object-oriented interface to the library permits more portable applications to be written, and is easy to learn and use, since details of the parallel implementation are hidden from the user. Experiments on the Intel Touchstone Delta system with a prototype code that uses the square block scattered decomposition to perform LU factorization are presented and analyzed. It was found that the code was both scalable and efficient, performing at about 14 GFLOPS (double precision) for the largest problem considered.
Citations: 103
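The abstract names the square block scattered decomposition without defining it. The sketch below assumes it corresponds to what is now usually called a 2D block-cyclic layout with square blocks: the matrix is cut into block-by-block tiles, and tiles are dealt out cyclically over a process grid. The function names, parameters, and example sizes are hypothetical and only illustrate the index arithmetic.

```python
from collections import Counter


def owner(i, j, block, prow, pcol):
    """Process-grid coordinates (p, q) that own global matrix entry (i, j)."""
    return ((i // block) % prow, (j // block) % pcol)


def local_index(i, j, block, prow, pcol):
    """Position of global entry (i, j) inside its owner's local array."""
    block_row, block_col = i // block, j // block
    local_i = (block_row // prow) * block + i % block
    local_j = (block_col // pcol) * block + j % block
    return local_i, local_j


# An 8x8 matrix, 2x2 blocks, dealt over a 2x2 process grid:
# every process ends up with the same number of entries.
counts = Counter(owner(i, j, 2, 2, 2) for i in range(8) for j in range(8))
print(sorted(counts.items()))
```

Because consecutive blocks go to different processes, the rows and columns still active late in a factorization remain spread over the whole grid, which is the usual load-balancing argument for scattering blocks rather than assigning each process one contiguous panel.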
Unsteady flow simulation using an MIMD computer
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232675
S. Palaniswamy, S. Chakravarthy
Numerical simulations of unsteady flows require algorithms with a high order of accuracy in both time and space and correspondingly vast computer resources, because flow properties may have to be computed over long times before significant information can be extracted from them. Massively parallel computers are particularly well suited for these simulations if the computational domain can be mapped onto the processors while maintaining high efficiency and time synchronization between the nodes of the MIMD computer. Certain aspects of the implementation of a time-accurate algorithm to solve the Navier-Stokes equations on structured grids, using a massively parallel processor, are presented in this paper along with results for two problems: (1) the changing characteristics of the near-wake flow behind a cylinder as a function of Reynolds number and (2) the dynamics of vortex pairing in free-shear layers.
Citations: 0
Incremental mapping for solution-adaptive multigrid hierarchies
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232666
J. De Keyser, D. Roose
The full multigrid method uses a hierarchy of successively finer grids. In a solution-adaptive grid hierarchy each grid is obtained by adaptive refinement of the grid on the previous level. On a distributed memory multiprocessor, each grid level must be partitioned and mapped so as to minimize the multigrid cycle execution time. In this report, several grid partitioning and load (re)mapping strategies that deal with this problem are compared. The influence of the type of multigrid cycle is examined. Results obtained on an iPSC hypercube are reported.
Citations: 3
Load balancing and parallel implementation of iterative algorithms for row-continuous Markov chains
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232656
M. Colajanni, M. Angelaccio
Presents the first parallel algorithms for solving row-continuous or generalized birth-death (GBD) Markov chains on distributed memory MIMD multiprocessors. These systems are characterized by very large transition probability matrices, decomposable into heterogeneous tridiagonal blocks. The parallelization of three aggregation/disaggregation iterative methods is carried out within a single framework that takes into account the special matrix structure. Great effort has also been devoted to defining a general algorithm for approximating the optimum workload. Various computational experiments show that Vantilborgh's (1985) method is the fastest of the three algorithms for any data set dimension.
Citations: 0
Applications of a parallel pressure-correction algorithm to 3D turbomachinery flows
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232657
M. Braaten
A parallel algorithm for the solution of three-dimensional compressible flows in turbomachinery has been developed and demonstrated on a scalable distributed memory multicomputer. The algorithm solves the compressible form of the Euler or Navier-Stokes equations via a compressible pressure correction formulation. To achieve high accuracy for highly turning blade rows, the computational grid is constructed without requiring strict periodicity of the grid points along the periodic boundaries between the blade passages. The impact of this feature on code parallelization and computational efficiency is described. The algorithm has been demonstrated on up to 128 processors of an Intel iPSC/860. Performance 2.4 times faster than a single Cray Y-MP processor has been achieved for an inviscid turbomachinery calculation on 154,000 grid points with 128 processors of the iPSC/860.
Citations: 0
Data alignment: transformations to reduce communication on distributed memory architectures
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232671
M. O’Boyle, G. A. Hedayat
The relative storage, or alignment, of array data in distributed memory critically determines the amount of communication overhead. This paper expresses data alignment in a linear algebraic framework. Aligned data can be viewed as forming a hyperplane in the iteration space. This allows the quantification of data alignment and the determination of the existence of transformations to reduce nonlocal access. This has led to a new alignment transformation which is applicable to a wider class of problems than existing techniques. The global impact of such transformations is discussed, as is the effect of alignment on partitioning.
Citations: 28
HeNCE: graphical development tools for network-based concurrent computing
Pub Date : 1992-04-26 DOI: 10.1109/SHPCC.1992.232678
A. Beguelin, J. Dongarra, Alexander Geist, R. Manchek, Keith Moore, Reed Wade, V. Sunderam
HeNCE (heterogeneous network computing environment) is an X Window-based graphical parallel programming environment that was created to assist scientists and engineers with the development of parallel programs. HeNCE provides a graphical interface for creating, compiling, executing, and debugging parallel programs, as well as configuring a distributed virtual computer (using PVM). HeNCE programs can be run on a single Unix workstation or over a network of heterogeneous machines. The paper describes the purpose and use of the HeNCE software.
Citations: 52