
Conference on Hypercube Concurrent Computers and Applications: Latest Papers

Task allocation onto a hypercube by recursive mincut bipartitioning
Pub Date : 1990-08-01 DOI: 10.1145/62297.62323
F. Erçal, J. Ramanujam, P. Sadayappan
An efficient recursive task allocation scheme, based on the Kernighan-Lin mincut bisection heuristic, is proposed for the effective mapping of tasks of a parallel program onto a hypercube parallel computer. It is evaluated by comparison with an adaptive, scaled simulated annealing method. The recursive allocation scheme is shown to be effective on a number of large test task graphs - its solution quality is nearly as good as that produced by simulated annealing, and its computation time is several orders of magnitude less.
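The recursive idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the details, so the sketch assumes that each level of bisection fixes one bit of the hypercube node address, and it replaces the Kernighan-Lin heuristic with a much simpler greedy pairwise-swap refinement. The names `cut_size`, `bisect`, and `allocate` are hypothetical.

```python
def cut_size(edges, part_a, part_b):
    """Total weight of edges that have one endpoint in each part."""
    a, b = set(part_a), set(part_b)
    return sum(w for (u, v), w in edges.items()
               if (u in a and v in b) or (u in b and v in a))


def bisect(tasks, edges):
    """Split tasks into two equal halves, then swap pairs while the cut shrinks.

    Simplified stand-in for Kernighan-Lin: it accepts any single swap that
    lowers the cut and has no gain ordering or tentative move sequences.
    """
    half = len(tasks) // 2
    a, b = list(tasks[:half]), list(tasks[half:])
    improved = True
    while improved:
        improved = False
        best = cut_size(edges, a, b)
        for i in range(len(a)):
            for j in range(len(b)):
                a[i], b[j] = b[j], a[i]          # try the swap
                c = cut_size(edges, a, b)
                if c < best:
                    best = c
                    improved = True
                else:
                    a[i], b[j] = b[j], a[i]      # undo the swap
    return a, b


def allocate(tasks, edges, dim):
    """Map tasks to node addresses of a dim-dimensional hypercube."""
    mapping = {}

    def recurse(subset, prefix, bits_left):
        if bits_left == 0:
            for t in subset:
                mapping[t] = prefix
            return
        a, b = bisect(subset, edges)
        recurse(a, prefix << 1, bits_left - 1)        # next address bit = 0
        recurse(b, (prefix << 1) | 1, bits_left - 1)  # next address bit = 1

    recurse(list(tasks), 0, dim)
    return mapping


# Example task graph: edge weights stand for communication volume.
edges = {(0, 2): 5, (1, 3): 5, (0, 1): 1}
mapping = allocate([0, 1, 2, 3], edges, dim=1)
```

In this example the heavily communicating pairs (0, 2) and (1, 3) each end up on a single node, and only the light edge (0, 1) crosses between nodes.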
Citations: 154
Portable programming within a message-passing model: the FFT as an example
Pub Date : 1989-01-03 DOI: 10.1145/63047.63100
D. Walker
This paper describes a portable programming environment for MIMD concurrent processors based on an object-oriented, message-passing paradigm. The basis of this environment is the Virtual Machine Loosely Synchronous Communication System (VMLSCS), which is designed to be used for loosely synchronous problems. VMLSCS is structured to make efficient use of hierarchical memory, and permits communication and calculation to be overlapped on certain concurrent processors. As an example, the use of VMLSCS in performing both one-dimensional and multi-dimensional fast Fourier transforms (FFTs) on concurrent multiprocessors is described. It is shown that all necessary interprocessor communication can be performed by a single routine, vm_index. Thus the construction of a portable concurrent FFT rests on the implementation of vm_index on the target machines. In the multi-dimensional algorithm a strip decomposition is applied to each of the directions in turn so that each of the FFTs performed in a particular direction is done in one processor. This allows fast sequential one-dimensional FFTs to be exploited. The implementation of vm_index on both homogeneous and inhomogeneous hypercubes and on shared-memory multiprocessors is discussed.
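The multi-dimensional scheme described above, in which every one-dimensional FFT runs entirely inside one processor, can be illustrated with a small sequential sketch. It is only an assumption-laden model: each row of the grid plays the role of a strip owned by one processor, and a plain transpose stands in for the interprocessor data exchange. The actual `vm_index` routine and its interface are not modelled, and the names `fft1d`, `transpose`, and `fft2d` are hypothetical.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(m):
    """Swap rows and columns; stands in for the data exchange between strips."""
    return [list(col) for col in zip(*m)]


def fft2d(grid):
    """2-D FFT built from sequential 1-D FFTs applied to one direction at a time."""
    rows_done = [fft1d(row) for row in grid]           # strips along direction 1
    cols_as_rows = transpose(rows_done)                # re-decompose along direction 2
    cols_done = [fft1d(row) for row in cols_as_rows]   # strips along direction 2
    return transpose(cols_done)                        # restore the original layout


# A unit impulse at the origin transforms to a grid of ones.
impulse = [[1, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
spectrum = fft2d(impulse)
```

The point of the design is that the only step touching data from more than one strip is the transpose, so a portable implementation would need to change only that step per target machine.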
Citations: 15
Optimal matrix algorithms on homogeneous hypercubes
Pub Date : 1989-01-03 DOI: 10.1145/63047.63125
G. Fox, W. Furmanski, D. Walker
This paper describes a set of concurrent algorithms for matrix algebra, based on a library of collective communication routines for the hypercube. We show how a systematic application of scattering reduces load imbalance. A number of examples are considered (Gaussian elimination, Gauss-Jordan matrix inversion, the power method for eigenvectors, and tridiagonalisation by Householder's method), and the concurrent efficiencies are discussed.
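The effect of scattering on load balance can be seen in a rough count of row updates during Gaussian elimination. The sketch below assumes that "scattering" means a scattered (cyclic) assignment of matrix rows to processors, as opposed to contiguous blocks. The abstract does not spell this out, and the cost model (one unit of work per updated matrix entry) is a simplification. All function names are hypothetical.

```python
def load_per_processor(n, p, owner):
    """Count entry updates per processor for elimination on an n x n matrix.

    At step k, every row i > k is updated in roughly n - k entries.
    owner(i) returns the processor that holds row i.
    """
    load = [0] * p
    for k in range(n - 1):
        for i in range(k + 1, n):
            load[owner(i)] += n - k
    return load


def block_owner(n, p):
    """Contiguous blocks of n // p rows per processor."""
    size = n // p
    return lambda i: min(i // size, p - 1)


def scattered_owner(p):
    """Rows dealt out cyclically: row i goes to processor i mod p."""
    return lambda i: i % p


def imbalance(load):
    """Ratio of the busiest processor's load to the mean load (1.0 is perfect)."""
    return max(load) / (sum(load) / len(load))


n, p = 16, 4
block = load_per_processor(n, p, block_owner(n, p))
scattered = load_per_processor(n, p, scattered_owner(p))
print(block, imbalance(block))
print(scattered, imbalance(scattered))
```

With block ownership the processor holding the last rows stays busy until the end while the one holding the first rows goes idle early. With cyclic ownership every processor keeps a share of the still-active rows, so the ratio moves closer to 1.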
Citations: 13
An interactive system for seismic velocity analysis
Pub Date : 1989-01-03 DOI: 10.1145/63047.63067
C. Addison, J. M. Cook, L. R. Hagen
Seismic data processing is a time-consuming operation. Two main reasons for this are the large amounts of data, which have to be put in and taken out of the computer regularly, and the need for intervention by an expert at many intermediate stages. A subsidiary cause is the large amount of computation. The recent appearance of multiprocessor computers has created the opportunity to provide an interactive system to ease and speed up the seismic data processing cycle. This paper describes the development of an initial interactive system for the velocity analysis and NMO/stacking stage of seismic processing. Computational power and storage are supplied by two 32-node Intel iPSC/1 Hypercubes, both with 16 memory nodes and 16 vector nodes. Each hypercube has 96 Mbytes of memory and a top processing speed of over 100 Mflops (32 bit). A SUN-3 Workstation is used to display intermediate results and to enable the expert to direct the processing more efficiently.
Citations: 1
Use of the hypercube for symbolic quantum chromodynamics
Pub Date : 1989-01-03 DOI: 10.1145/63047.63097
A. Kolawa, G. Fox
A new numerical approach by Furmanski and Kolawa to quantum chromodynamics is based on diagonalizing the underlying Hamiltonian. This method involves the generation of states by repeated action of a potential operator. This symbolic calculation is dominated by the time it takes to search the database of existing states to verify if a generated state is identical to one previously found. We implement this algorithm on the Caltech/JPL Mark II hypercube and analyze its performance with both a simple database search and one optimized for this application. We show that the hypercube performance can be modelled in a fashion similar to conventional numerical (loosely synchronous) applications.
Citations: 1
Design and implementation of a concurrent image processing workstation based on the Mark III hypercube
Pub Date : 1989-01-03 DOI: 10.1145/63047.63086
S. Groom, M. Lee, A. Mazer, W. Williams
Various image processing algorithms have been implemented on the hypercube architecture and many success stories have been reported. However, the traditional approach to programming the hypercube has been to write programs which perform a single operation or a fixed set of operations upon data items. This approach has several drawbacks when considered for use in an interactive computing environment. First, it is difficult to process data with a sequence of simple programs in the Mark III Hypercube because the Mark III software does not support sharing of data between successive programs. This means that data must be reloaded into the cube for each individual program. It also implies that programs should be fairly large and complete, to minimize the repeated downloading of large data items for multiple programs. However, the entire program must be able to fit within the hypercube node memory, which limits what a program can do by putting a restriction on its size. Furthermore, large programs limit the amount of memory available for data, which must also be present in memory if the communications overhead is to be effectively reduced. The development of an interactive image processing workstation based on the Mark III Hypercube requires satisfactory solutions to these and other problems.
Citations: 1
Hypercube implementation of the simplex algorithm
Pub Date : 1989-01-03 DOI: 10.1145/63047.63104
C. Stunkel, D. Reed
Large, sparse, linear systems of equations arise frequently when constructing mathematical models of natural phenomena. Most often, these linear systems are fully constrained and can be solved via direct or iterative techniques. However, one important problem class requires solutions to underconstrained linear systems that maximize some objective function. These linear optimization problems are natural formulations of many business plans and often contain hundreds of equations with thousands of variables. Historically, linear optimization problems have been solved via the simplex method. Despite the excellent performance of the simplex method, the size of the optimization problems and the frequency of their solution make linear optimization a computationally taxing endeavor. This paper examines the performance of parallel variants of the simplex algorithm on the Intel iPSC, a message-based parallel system. Linear optimization test data are drawn from commercial sources and represent realistic problems. Analysis shows that the speedup obtained is sensitive to both the structure of the underlying data and the data partitioning.
Citations: 27
Finite element solution of thermal convection on a hypercube concurrent computer
Pub Date : 1989-01-03 DOI: 10.1145/63047.63070
M. Gurnis, A. Raefsky, G. Lyzenga, B. Hager
Numerical solutions to thermal convection flow problems are vital to many scientific and engineering problems. One fundamental geophysical problem is the thermal convection responsible for continental drift and sea floor spreading. The earth's interior undergoes slow creeping flow (~cm/yr) in response to the buoyancy forces generated by temperature variations caused by the decay of radioactive elements and secular cooling. Convection in the earth's mantle, the 3000 km thick solid layer between the crust and core, is difficult to model for three reasons: (1) complex rheology -- the effective viscosity depends exponentially on temperature, on pressure (or depth) and on the deviatoric stress; (2) the buoyancy forces driving the flow occur in boundary layers thin in comparison to the total depth; and (3) spherical geometry -- the flow in the interior is fully three-dimensional. Because of these many difficulties, accurate and realistic simulations of this process easily overwhelm current computer speed and memory (including the Cray XMP and Cray 2) and only simplified problems have been attempted [e.g., Christensen and Yuen, 1984; Gurnis, 1988; Jarvis and Peltier, 1982]. As a start in overcoming these difficulties, a number of finite element formulations have been explored on hypercube concurrent computers. Although two coupled equations are required to solve this problem (the momentum or Stokes equation and the energy or advection-diffusion equation), we will concentrate our efforts on the solution to the latter equation in this paper. Solution of the former equation is discussed elsewhere [Lyzenga et al., 1988]. We will demonstrate that linear speedups and efficiencies of 99 percent are achieved for sufficiently large problems.
Citations: 4
Logic fault simulation on a vector hypercube multiprocessor
Pub Date : 1989-01-03 DOI: 10.1145/63047.63064
F. Özgüner, C. Aykanat, O. Khalid
Fault simulation is the process of simulating the response of a logic circuit to input patterns in the presence of all possible single faults and is an essential part of test generation for VLSI circuits. Parallelization of the deductive and parallel simulation methods on a hypercube multiprocessor, and vectorization of the parallel simulation method, are described. Experimental results are presented.
Citations: 25
Implementation and performance analysis of parallel assignment algorithms on a hypercube computer
Pub Date : 1989-01-03 DOI: 10.1145/63047.63077
Barry K. Carpenter, Nathaniel J. Davis IV
The process of effectively coordinating and controlling resources during a military engagement is known as battle management/command, control, and communications (BM/C3). One key task of BM/C3 is allocating weapons to destroy targets. The focus of this research is on developing parallel computation methods to achieve fast and cost-effective assignment of weapons to targets. Using the sequential Hungarian method for solving the assignment problem as a basis, this paper presents the development and the relative performance comparison of four parallel assignment methodologies that have been implemented on the Intel iPSC hypercube computer. The first three approaches are approximations to the optimal assignment solution. Their advantage is that they are computationally fast and have proven to generate assignments that are very close to the optimal assignment in terms of cost. The fourth approach is a parallel implementation of the Hungarian algorithm, where certain subtasks are performed in parallel. This approach produces an optimal assignment, in contrast to the sub-optimal assignments that result from the first three approaches. The relative performance of the four approaches is compared by varying the number of weapons and targets, the number of processors used, and the size of the problem partitions.
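The trade-off between a fast approximate assignment and an optimal one can be shown on a tiny cost matrix. The sketch below is illustrative only. The greedy rule is a generic approximation, not one of the three methods from the paper (the abstract does not describe them), and the optimal solver is an exhaustive search over permutations rather than the Hungarian algorithm, so it is practical only for very small problems. All names are hypothetical.

```python
from itertools import permutations


def total_cost(cost, assignment):
    """Sum of cost[w][t] over weapon w assigned to target t = assignment[w]."""
    return sum(cost[w][t] for w, t in enumerate(assignment))


def greedy_assignment(cost):
    """Each weapon in turn takes the cheapest target that is still free."""
    free = set(range(len(cost)))
    assignment = []
    for w in range(len(cost)):
        t = min(free, key=lambda j: (cost[w][j], j))  # ties broken by index
        free.remove(t)
        assignment.append(t)
    return assignment


def optimal_assignment(cost):
    """Exhaustive search over all permutations (not the Hungarian method)."""
    n = len(cost)
    return list(min(permutations(range(n)),
                    key=lambda perm: total_cost(cost, perm)))


cost = [[1, 2],
        [1, 10]]
greedy = greedy_assignment(cost)
best = optimal_assignment(cost)
print(greedy, total_cost(cost, greedy))   # [0, 1] 11
print(best, total_cost(cost, best))       # [1, 0] 3
```

Here the greedy rule commits weapon 0 to its cheapest target and forces weapon 1 onto an expensive one, while the exhaustive search finds the cheaper pairing. This cost matrix is chosen to make the gap visible; the abstract reports that the paper's approximations come very close to the optimum in practice.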
Citations: 3