Conference on Hypercube Concurrent Computers and Applications: Latest Papers

Implementing finite element software on hypercube machines
Pub Date : 1989-01-03 DOI: 10.1145/63047.63134
G. Lyzenga, A. Raefsky, Bahram Nour-Omid
Citations: 7
An analytic model for parallel Gaussian elimination on a binary N-Cube architecture
Pub Date : 1989-01-03 DOI: 10.1145/63047.63114
Virgílio A. F. Almeida, L. Dowdy, M. Leuze
This paper summarizes an analytical technique which predicts the time required to execute a given parallel program, with given data, on a given parallel architecture. For illustration purposes, the particular parallel program chosen is parallel Gaussian elimination and the particular parallel architecture chosen is a binary n-cube. The analytical technique is based upon a product-form queuing network model which is solved using an iterative method. The technique is validated by comparing performance predictions produced by the model against actual hypercube measurements.
Citations: 2
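The abstract for the Gaussian elimination model mentions a product-form queuing network solved iteratively, but does not say which iterative method the authors use. As a rough illustration only, the sketch below applies exact Mean Value Analysis (MVA), a standard technique for closed single-class product-form networks. The function name, the `demands` list, and the `think_time` parameter are assumptions made for this example and do not come from the paper.

```python
def mean_value_analysis(demands, customers, think_time=0.0):
    """Exact MVA for a closed, single-class product-form queueing network.

    demands[k] is the total service demand at queueing station k.
    Returns (throughput, mean queue length per station) at the given population.
    """
    queue = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, customers + 1):
        # Arrival theorem: an arriving customer sees the network with n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(residence))
        # Little's law applied to each station.
        queue = [throughput * r for r in residence]
    return throughput, queue


if __name__ == "__main__":
    # Hypothetical two-station network with equal demands and two customers.
    x, q = mean_value_analysis([1.0, 1.0], customers=2)
    print(x, q)
```

In a performance model of a parallel program, the stations would stand for processors and communication links and the demands for per-task compute and message costs; that mapping is likewise an assumption here, not a statement of the paper's model.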
A comparison of several methods of integrating stiff ordinary differential equations on parallel computing architectures
Pub Date : 1989-01-03 DOI: 10.1145/63047.63129
A. Bose, I. Nelken, J. Gelfand
Many physical systems lead to initial value problems in which the system of stiff ordinary differential equations is loosely coupled. Thus, in some cases the variables may be mapped directly onto sparsely connected parallel architectures such as the hypercube. This paper investigates various methods of implementing Gear's algorithm on parallel computers. Two conventional corrector methods use either functional or Newton-Raphson iteration. We consider both alternatives and show that they exhibit similar speedups on an n-node hypercube. In addition, a polynomial corrector is investigated. It has the advantage of not requiring the solution of a linear system, as the Newton-Raphson method does, yet it converges faster than functional iteration.
Citations: 2
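To make the two conventional correctors in the stiff ODE abstract concrete, here is a minimal sequential sketch for a single implicit step. It uses backward Euler, the first-order member of the backward differentiation family that Gear's method is built on, applied to a scalar equation. The function names, tolerances, and the test equation y' = -50y are assumptions for illustration; the paper's polynomial corrector and its hypercube mapping are not reproduced here.

```python
def functional_corrector(f, y_n, h, y_guess, tol=1e-12, max_iter=100):
    """Solve y = y_n + h*f(y) by fixed-point (functional) iteration."""
    y = y_guess
    for k in range(1, max_iter + 1):
        y_new = y_n + h * f(y)
        if abs(y_new - y) < tol:
            return y_new, k
        y = y_new
    return y, max_iter


def newton_corrector(f, dfdy, y_n, h, y_guess, tol=1e-12, max_iter=100):
    """Solve y = y_n + h*f(y) by Newton-Raphson iteration on the residual."""
    y = y_guess
    for k in range(1, max_iter + 1):
        residual = y - y_n - h * f(y)
        # In the scalar case the "linear system" is a single division.
        y_new = y - residual / (1.0 - h * dfdy(y))
        if abs(y_new - y) < tol:
            return y_new, k
        y = y_new
    return y, max_iter


if __name__ == "__main__":
    lam = 50.0                      # hypothetical stiffness parameter
    f = lambda y: -lam * y
    dfdy = lambda y: -lam
    y_f, k_f = functional_corrector(f, 1.0, 0.01, 1.0)
    y_n, k_n = newton_corrector(f, dfdy, 1.0, 0.01, 1.0)
    print(y_f, k_f, y_n, k_n)
```

Functional iteration converges only when h times the Lipschitz constant of f is below 1, which is why stiff problems usually push toward Newton-type correctors despite the cost of the linear solve.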
Acoustic wavefield propagation using paraxial extrapolators
Pub Date : 1989-01-03 DOI: 10.1145/63047.63069
R. Clayton, R. Graves
Modeling by paraxial extrapolators is applicable to wave propagation problems in which most of the energy travels within a restricted angular cone about a principal axis of the problem. Frequency-domain finite-difference solutions are readily generated with this technique. Input models can be described by specifying either velocities or appropriate media parameters on a two- or three-dimensional grid of points. For heterogeneous models, transmission and reflection coefficients are determined at structural boundaries within the media. The direct forward-scattered waves are modeled with a single pass of the extrapolator operator in the paraxial direction for each frequency. The first-order backscattered energy can then be modeled by extrapolating (in the opposite direction) the reflected field determined on the first pass. Higher-order scattering can be included by sweeping through the model with more passes.
The chief advantages of the paraxial approach are: 1) active storage is reduced by one dimension compared with solutions that must track both up-going and down-going waves simultaneously, so even realistic three-dimensional problems can fit on today's computers; 2) the decomposition in frequency allows the technique to be implemented on highly parallel machines such as the hypercube; 3) attenuation can be modeled as an arbitrary function of frequency; and 4) only a small number of frequencies are needed to produce movie-like time slices.
Using this method, a wide range of seismological problems can be addressed, including strong-motion analysis of waves in three-dimensional basins, the modeling of VSP reflection data, and the analysis of whole-earth problems such as scattering at the core-mantle boundary or the effect of tectonic boundaries on long-period wave propagation.
Citations: 2
Lattice gauge theory on the hypercube
Pub Date : 1989-01-03 DOI: 10.1145/63047.63081
J. Flower, J. Apostolakis, C. Baillie, H. Ding
Lattice gauge theory, an extremely computationally intensive problem, has been run successfully on hypercubes for a number of years. Herein we give a flavor of this work, discussing both the physics and the computing behind it.
Citations: 4
Parallel branch and bound algorithms on hypercube multiprocessors
Pub Date : 1989-01-03 DOI: 10.1145/63047.63106
Tarek Saad Abdel-Rahman, T. Mudge
Branch and Bound (BB) algorithms are a generalization of many search algorithms used in Artificial Intelligence and Operations Research. This paper presents our work on implementing BB algorithms on hypercube multiprocessors. The 0-1 integer linear programming (ILP) problem is taken as an example because it can be implemented to capture the essence of BB search algorithms without too many distracting problem-specific details. A BB algorithm for the 0-1 ILP problem is discussed. Two parallel implementations of the algorithm on hypercube multiprocessors are presented. The two implementations demonstrate some of the tradeoffs involved in implementing these algorithms on multiprocessors with no shared memory, such as hypercubes. Experimental results from the NCUBE/six show the performance of the two implementations of the algorithm. Future research work is discussed.
Citations: 20
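For readers unfamiliar with the technique in the branch and bound abstract, the following is a minimal sequential sketch of depth-first branch and bound for a 0-1 ILP of the form "maximize c·x subject to Ax ≤ b, x binary". It is not the paper's algorithm or either of its parallel implementations; the function name, the simple additive bound, and the sample data are all assumptions chosen to keep the example short.

```python
def solve_01_ilp(c, A, b):
    """Maximize c.x subject to A x <= b with x in {0,1}^n (depth-first B&B)."""
    n = len(c)
    best_value = float("-inf")
    best_x = None

    def search(i, x, value, lhs):
        nonlocal best_value, best_x
        # Bound: taking every remaining positive coefficient is the best case.
        optimistic = value + sum(max(cj, 0) for cj in c[i:])
        if optimistic <= best_value:
            return
        # Feasibility: prune if even the smallest reachable left-hand side
        # of some constraint already exceeds its limit.
        for row, used, limit in zip(A, lhs, b):
            if used + sum(min(a, 0) for a in row[i:]) > limit:
                return
        if i == n:
            best_value, best_x = value, x[:]
            return
        # Branch: fix variable i to 1 first, then to 0.
        for choice in (1, 0):
            x.append(choice)
            new_lhs = [used + choice * row[i] for row, used in zip(A, lhs)]
            search(i + 1, x, value + choice * c[i], new_lhs)
            x.pop()

    search(0, [], 0, [0] * len(A))
    return best_value, best_x


if __name__ == "__main__":
    # Hypothetical single-constraint (knapsack-like) instance.
    print(solve_01_ilp([10, 13, 7, 8], [[3, 4, 2, 3]], [7]))
```

On a machine without shared memory, the incumbent `best_value` is the piece of state that processors would need to exchange, which is one plausible source of the tradeoffs the abstract alludes to.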
Concurrent multiple target tracking
Pub Date : 1989-01-03 DOI: 10.1145/63047.63079
T. D. Gottschalk
A concurrent algorithm for multiple target tracking is presented. The underlying tracking formalism is first described by way of a sequential program, and the issues in generalizing the tracker for efficient concurrent implementations are discussed in detail. Typical tracking results on the Mark III hypercube are presented.
Citations: 6
Expressing Boolean cube matrix algorithms in shared memory primitives
Pub Date : 1989-01-03 DOI: 10.1145/63047.63121
S. Johnsson, C. T. Ho
The multiplication of (large) matrices allocated evenly on Boolean cube configured multiprocessors poses several interesting trade-offs with respect to communication time, processor utilization, and storage requirements. In [7] we investigated several algorithms for different degrees of parallelization, and showed how the best-performing algorithm depends on the matrix shape and the multiprocessor parameters, and how processors should be allocated optimally to the different loops.
In this paper the focus is on expressing the algorithms in shared memory type primitives. We assume that all processors share the same global address space, and present communication primitives both for nearest-neighbor communication and for global operations such as broadcasting from one processor to a set of processors, the reverse operation of plus-reduction, and matrix transposition (dimension permutation). We consider both the case where communication is restricted to one processor port at a time and the case of concurrent communication on all processor ports. The communication algorithms are provably optimal within a factor of two. We describe both constant-storage algorithms and algorithms with reduced communication time but a storage need proportional to the number of processors and the matrix sizes (for a one-dimensional partitioning of the matrices).
Citations: 6
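One of the global operations named in the Boolean cube abstract is broadcasting from one processor to a set of processors under a one-port restriction. The sketch below simulates the textbook binomial-tree broadcast on a d-dimensional cube, in which every node that already holds the data forwards it across one new dimension per step. This is a generic illustration and may differ from the primitives defined in the paper; the function name and the schedule representation are assumptions.

```python
def hypercube_broadcast_schedule(dim, root=0):
    """Return one list of (sender, receiver) pairs per communication step.

    Each node uses a single port per step, and the broadcast reaches all
    2**dim nodes after dim steps.
    """
    have = {root}
    schedule = []
    for bit in range(dim):
        # Every informed node sends to its neighbor across dimension `bit`.
        step = [(node, node ^ (1 << bit)) for node in sorted(have)]
        have.update(receiver for _, receiver in step)
        schedule.append(step)
    return schedule


if __name__ == "__main__":
    for step_no, step in enumerate(hypercube_broadcast_schedule(3)):
        print(step_no, step)
```

The number of informed nodes doubles at every step, so the one-port broadcast needs exactly as many steps as the cube has dimensions.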
An O(NlogN) hypercube N-body integrator
Pub Date : 1989-01-03 DOI: 10.1145/63047.63051
M. Warren, J. Salmon
The gravitational N-body algorithm of Barnes and Hut [1] has been successfully implemented on a hypercube concurrent processor. The novel approach of their sequential algorithm has proved well suited to hypercube architectures. The sequential code achieves O(N log N) speed by recursively dividing space into subcells, thereby creating a hierarchical grouping of particles. Computing interactions between these groups dramatically reduces the amount of communication between processors, as well as the number of force calculations. Parallelism is achieved through an irregular spatial grid decomposition. Since the decomposition topology is not simple, a general loosely synchronous communication routine has been developed. Operations are simplified if the conventional Gray code decomposition is modified so that the bits are taken alternately from each Cartesian dimension. A speedup of 180 has been achieved for a 500,000-particle two-dimensional calculation on 256 processors. A speedup of 65 has been obtained for a 64,000-particle three-dimensional calculation on 256 processors.
Citations: 6
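The N-body abstract describes a Gray code decomposition whose bits are taken alternately from each Cartesian dimension. One plausible reading, sketched below for two dimensions, is to Gray-code each grid coordinate and then interleave the resulting bits to form the hypercube node number. The exact bit ordering used by the authors is not given in the abstract, so the layout here (x in even bit positions, y in odd ones) and the function names are assumptions.

```python
def gray(i):
    """Binary-reflected Gray code of a non-negative integer."""
    return i ^ (i >> 1)


def interleaved_node_id(ix, iy, bits):
    """Map grid cell (ix, iy) on a 2**bits by 2**bits grid to a hypercube node.

    Bits of gray(ix) go to even positions and bits of gray(iy) to odd ones.
    """
    gx, gy = gray(ix), gray(iy)
    node = 0
    for k in range(bits):
        node |= ((gx >> k) & 1) << (2 * k)
        node |= ((gy >> k) & 1) << (2 * k + 1)
    return node


if __name__ == "__main__":
    bits = 2
    for iy in range(1 << bits):
        print([interleaved_node_id(ix, iy, bits) for ix in range(1 << bits)])
```

Because consecutive Gray codes differ in one bit and interleaving only moves bits around, neighboring grid cells still map to neighboring hypercube nodes, while each successive pair of bits corresponds to one level of spatial subdivision.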
Parallel implementation of domain decomposition techniques on Intel's hypercube
Pub Date : 1989-01-03 DOI: 10.1145/63047.63132
M. Haghoo, W. Proskurowski
Parallel implementation of domain decomposition techniques for elliptic PDEs in rectangular regions is considered. This technique is well suited for parallel processing, since in the solution process the subproblems either are independent or can be easily converted into decoupled problems. More than 80% of execution time is spent on solving these independent and decoupled problems.
The hypercube architecture is used for concurrent execution. The performance of the parallel algorithm is compared against the sequential version. The speed-up, efficiency, and communication factors are studied as functions of the number of processors. Extensive tests are performed to find, for a given mesh size, the number of subregions and nodes that minimize the overall execution time.
Citations: 2
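The domain decomposition abstract studies speed-up and efficiency as functions of the number of processors. Under the usual definitions (speed-up is sequential time divided by parallel time, efficiency is speed-up divided by processor count), these can be computed as in the small sketch below. The paper may define its factors differently, and the timings in the usage lines are hypothetical values, not measurements from the paper.

```python
def speedup(t_sequential, t_parallel):
    """Ratio of sequential run time to parallel run time."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, processors):
    """Speed-up per processor, between 0 and 1 for sub-linear scaling."""
    return speedup(t_sequential, t_parallel) / processors


if __name__ == "__main__":
    # Hypothetical timings in seconds for 16 processors.
    print(speedup(100.0, 12.5), efficiency(100.0, 12.5, 16))
```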