The Computational Fluid Dynamics (CFD) code FL057, which solves the 3-D Euler equations using an explicit, finite-volume, Runge-Kutta algorithm, was implemented on an Intel IPSC-MX parallel processor. Spatial decomposition was applied to the solution grid about a fighter aircraft configuration, and binary reflected Gray codes were used to map the computational domain onto the IPSC, ensuring nearest-neighbor communication. Results and timings of the implementation are presented, along with a comparison of the IPSC against a uniprocessor machine of similar classification, to assess the performance of the IPSC on FL057. Suggested improvements to the current version of the parallelized code are listed to improve load balancing, vectorization, and memory use.
{"title":"Solution of the 3-D Euler equations for the flow about a fighter aircraft configuration using a hypercube parallel processor","authors":"D. Weissbein, J. F. Mangus, M. W. George","doi":"10.1145/63047.63066","DOIUrl":"https://doi.org/10.1145/63047.63066","url":null,"abstract":"The Computational Fluid Dynamics (CFD) code FL057, which solves the 3-D Euler Equations using an explicit, finite volume, Runge-Kutta algorithm, was implemented on an Intel IPSC-MX parallel processor. Spatial decomposition was effected on the solution grid about a fighter aircraft configuration and Binary Reflected Graycodes were used to map the computational domain onto the IPSC insuring nearest neighbor communication. Results and timings of the implementation are presented with a comparison of the IPSC and a uniprocessor machine of similar classification to assess the performance of the IPSC on FL057. Suggested improvements to the current version of the parallelized code are listed to aid load balancing, vectorization, and more efficient memory use.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123253724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
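The Gray-code mapping mentioned above has a simple core: consecutive integers in a binary reflected Gray code differ in exactly one bit, so adjacent slabs of the decomposed grid land on hypercube nodes that are direct neighbors. A minimal sketch (not the paper's code; function names are illustrative):

```python
def gray(i):
    """Binary-reflected Gray code of integer i: i XOR (i >> 1)."""
    return i ^ (i >> 1)

def ring_to_hypercube(n_nodes):
    """Map ring positions 0..n-1 to hypercube node ids so that
    consecutive positions differ in exactly one bit (nearest neighbors)."""
    return [gray(i) for i in range(n_nodes)]

mapping = ring_to_hypercube(8)
# Every consecutive pair (including the wraparound) differs in one bit,
# so neighboring grid slabs communicate over a single hypercube link.
neighbors_ok = all(
    bin(mapping[i] ^ mapping[(i + 1) % 8]).count("1") == 1 for i in range(8)
)
```

Because the wraparound pair also differs in one bit, the same mapping embeds a periodic (ring) decomposition without any multi-hop messages.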
Astronomical data sets are beginning to live up to their name, in both their sizes and the complexity of the analysis required. Here we discuss two astronomical data analysis problems which we have begun to implement on a hypercube concurrent processor environment: The intensive image processing required in an optical interferometry project, and the large scale power spectral analysis required by a search for millisecond-period radio pulsars. In both cases the analysis proceeds largely in the Fourier domain, and we find that the problems are readily adapted to a concurrent environment. In the following report, we outline briefly the astronomical background for each problem, then discuss the general computational requirements, and finally present possible hypercube algorithms and results achieved to date.
{"title":"Hypercube data analysis in astronomy: optical interferometry and millisecond pulsar searches","authors":"P. Gorham, T. Prince, S. Anderson","doi":"10.1145/63047.63049","DOIUrl":"https://doi.org/10.1145/63047.63049","url":null,"abstract":"Astronomical data sets are beginning to live up to their name, in both their sizes and the complexity of the analysis required. Here we discuss two astronomical data analysis problems which we have begun to implement on a hypercube concurrent processor environment: The intensive image processing required in an optical interferometry project, and the large scale power spectral analysis required by a search for millisecond-period radio pulsars. In both cases the analysis proceeds largely in the Fourier domain, and we find that the problems are readily adapted to a concurrent environment. In the following report, we outline briefly the astronomical background for each problem, then discuss the general computational requirements, and finally present possible hypercube algorithms and results achieved to date.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126260391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
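The pulsar-search side of the work above hinges on one Fourier-domain primitive: the power spectrum of a long time series, in which a periodic signal appears as a peak. A minimal single-node sketch using NumPy (the paper's distributed FFT is not reproduced here):

```python
import numpy as np

def power_spectrum(x):
    """One-sided power spectrum of a real time series via FFT --
    the core operation in a Fourier-domain periodicity search."""
    X = np.fft.rfft(x)
    return (X * np.conj(X)).real

# A 50 Hz sinusoid sampled at 1 kHz for 1 s should peak in bin 50.
t = np.arange(1000) / 1000.0
x = np.sin(2 * np.pi * 50 * t)
peak_bin = int(np.argmax(power_spectrum(x)[1:])) + 1  # skip the DC bin
```

On a hypercube the same computation would be split across nodes with a distributed FFT; the detection step (scanning spectrum bins for peaks) is then embarrassingly parallel.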
The region growing paradigm for image segmentation groups neighboring pixels into regions according to a predetermined homogeneity criterion. A parallel method for region growing on an MIMD multiprocessor system is presented. Since the region growing problem exhibits non-uniform and unpredictable load fluctuations, it requires a dynamic load balancing scheme to achieve a balanced load distribution. The results of implementing a parallel region growing algorithm on the Intel iPSC hypercube are discussed.
{"title":"Region growing on a hypercube multiprocessor","authors":"M. Willebeek-LeMair, A. Reeves","doi":"10.1145/63047.63057","DOIUrl":"https://doi.org/10.1145/63047.63057","url":null,"abstract":"The region growing paradigm for image segmentation groups neighboring pixels into regions depending upon a predetermined homogeneity criteria. A parallel method for region growing on an MIMD multiprocessor system is presented. Since the region growing problem exhibits non-uniform and unpredictable load fluctuations, it requires a dynamic load balancing scheme to achieve a balanced load distribution. The results of implementing a parallel region growing algorithm on the Intel-iPSC hypercube are discussed.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122348759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A parallel algorithm for solving elliptic partial differential equations (PDEs) via the finite difference method (FDM) is described in this paper. The Concurrent Preconditioned Conjugate Gradient method is developed to optimize processor load balancing. The algorithm is evaluated on a hypercube-based concurrent machine, the Intel iPSC.
{"title":"The preconditioned conjugate gradient method on the hypercube","authors":"G. Abe, K. Hane","doi":"10.1145/63047.63126","DOIUrl":"https://doi.org/10.1145/63047.63126","url":null,"abstract":"A parallel algorithm for solving the elliptic partial differential equation (PDE) is described in this paper through the finite difference method (FDM) The Concurrent Preconditioned Conjugate Gradient method is developed to optimize processor load balancing. This algorithm is evaluated on a hypercube-based concurrent machine, the Intel iPSC.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116084696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
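For readers unfamiliar with the method named above, a minimal serial sketch of preconditioned conjugate gradient with a Jacobi (diagonal) preconditioner — chosen here because the M⁻¹r solve is pointwise and therefore trivially parallel; the paper's actual preconditioner may differ:

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner.
    Each application of M^-1 is elementwise, so it parallelizes
    across grid points with no communication."""
    M_inv = 1.0 / np.diag(A)      # Jacobi preconditioner M = diag(A)
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# 1-D Poisson matrix (tridiagonal [-1, 2, -1]), a standard FDM test case.
n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
x = pcg(A, np.ones(n))
```

In a hypercube implementation, the matrix-vector product A @ p requires only nearest-neighbor halo exchanges for an FDM stencil, while the dot products become global reductions.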
D. Baxter, J. Saltz, M. Schultz, S. Eisenstat, K. Crowley
High performance multiprocessor architectures differ both in the number of processors and in the delay costs for synchronization and communication. In order to obtain good performance on a given architecture for a given problem, adequate parallelization, good load balance, and an appropriate choice of granularity are essential. We discuss the implementation of a parallel version of PCGPAK for both shared memory architectures and hypercubes. Our parallel implementation is sufficiently efficient to allow us to complete the solution of our test problems on 16 processors of the Encore Multimax/320 in an amount of time that is a small multiple of that required by a single head of a Cray X/MP, despite the fact that the peak performance of the Multimax processors is not even close to the supercomputer range. We illustrate the effectiveness of our approach on a number of model problems from reservoir engineering and mathematics.
{"title":"An experimental study of methods for parallel preconditioned Krylov methods","authors":"D. Baxter, J. Saltz, M. Schultz, S. Eisenstat, K. Crowley","doi":"10.1145/63047.63128","DOIUrl":"https://doi.org/10.1145/63047.63128","url":null,"abstract":"High performance multiprocessor architectures differ both in the number of processors, and in the delay costs for synchronization and communication. In order to obtain good performance on a given architecture for a given problem, adequate parallelization, good balance of load and an appropriate choice of granularity are essential.\u0000We discuss the implementation of parallel version of PCGPAK for both shared memory architectures and hypercubes. Our parallel implementation is sufficiently efficient to allow us to complete the solution of our test problems on 16 processors of the Encore Multimax/320 in an amount of time that is a small multiple of that required by a single head of a Cray X/MP, despite the fact that the peak performance of the Multimax processors is not even close to the supercomputer range. We illustrate the effectiveness of our approach on a number of model problems from reservoir engineering and mathematics.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121588162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wave-equation seismic modeling in two space dimensions is computationally intensive, often requiring hours of supercomputer CPU time to run typical geological models with 500 × 500 grids and 100 sources. This paper analyzes the performance of ACOUS2D, an explicit 4th-order finite-difference program, on Intel's 16-processor vector hypercube computer. The conversion of the sequential version of ACOUS2D to run on the hypercube was straightforward but time-consuming. The key consideration for optimal efficiency is load balancing. On a fairly typical geologic model, the 16-processor Intel vector hypercube computer ran ACOUS2D at 1/3 the speed of a Cray-1S.
{"title":"Hypercube performance for 2-D seismic finite-difference modeling","authors":"L. J. Baker","doi":"10.1145/63047.63068","DOIUrl":"https://doi.org/10.1145/63047.63068","url":null,"abstract":"Wave-equation seismic modeling in two space dimensions is computationally intensive, often requiring hours of supercomputer CPU time to run typical geological models with 500 × 500 grids and 100 sources. This paper analyzes the performance of ACOUS2D, an explicit 4th-order finite-difference program, on Intel's 16-processor vector hypercube computer. The conversion of the sequential version of ACOUS2D to run on hypercube was straightforward, but time-consuming. The key consideration for optimal efficiency is load balancing. On a fairly typical geologic model, the 16-processor Intel vector hypercube computer ran ACOUS2D at 1/3 the speed of a Cray-1S.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123030423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
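The "explicit 4th-order finite-difference" kernel referred to above is built from a five-point second-derivative stencil applied along each axis. A one-dimensional sketch of that spatial core (illustrative only; ACOUS2D itself is a 2-D production code):

```python
import numpy as np

# Fourth-order accurate central stencil for the second derivative,
# the spatial core of an explicit acoustic wave-equation modeler.
C = np.array([-1/12, 4/3, -5/2, 4/3, -1/12])

def d2dx2(f, dx):
    """Apply the 4th-order second-derivative stencil to interior points
    (the two points at each edge need boundary treatment)."""
    out = np.zeros(len(f) - 4)
    for k, c in enumerate(C):
        out += c * f[k : k + len(out)]
    return out / dx**2

# The stencil is exact for low-degree polynomials: (x^2)'' = 2 everywhere.
x = np.linspace(0.0, 1.0, 11)
d2 = d2dx2(x**2, x[1] - x[0])
```

The two-point stencil half-width is what drives the halo exchange in a domain-decomposed run: each node must receive two rows/columns from each neighbor per time step, which is why load balancing and communication overlap dominate the tuning.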
The rule-based system has emerged as an important tool for developers of artificial intelligence programs. Because of the computational resources required to realize the MATCH-SELECT-EXECUTE cycle of rule-based systems, researchers have been trying to introduce parallelism into these systems for some time. We describe a new approach to parallel rule-based systems which exploits fine-grained hypercube hardware, with new algorithms for parallel rule matching and for executing several rules simultaneously. Experimental results from a Connection Machine implementation of BLITZ are presented.
{"title":"Blitz: a rule-based system for massively parallel architectures","authors":"K. Morgan","doi":"10.1145/63047.63091","DOIUrl":"https://doi.org/10.1145/63047.63091","url":null,"abstract":"The rule-based system has emerged as an important tool to developers of artificial intelligence programs. Because of the computational resources required to realize the MATCH-SELECT-EXECUTE cycle of rule-based systems, researchers have been trying to introduce parallelism into these systems for some time. We describe a new approach to parallel rule-based systems which exploits fine-grained hypercube hardware. The new algorithms for parallel rule matching and simultaneous execution of several rules at once are presented. Experimental results using a Connection Machine* implementation of BLITZ are presented.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131190153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
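For context, the MATCH-SELECT-EXECUTE cycle named above can be sketched as a tiny forward-chaining loop. This serial toy (not BLITZ's algorithm; rule representation is invented for illustration) shows the three phases the paper parallelizes:

```python
def run_rules(facts, rules, max_cycles=10):
    """Minimal MATCH-SELECT-EXECUTE loop over rules of the form
    (condition_facts, facts_to_assert), run until quiescence."""
    facts = set(facts)
    for _ in range(max_cycles):
        # MATCH: rules whose premises hold and that would add something new
        matches = [(cond, add) for cond, add in rules
                   if cond <= facts and not add <= facts]
        if not matches:
            break                     # quiescence: nothing left to fire
        # SELECT: trivially take the first match (real systems use
        # conflict-resolution strategies); EXECUTE: assert its facts.
        _, add = matches[0]
        facts |= add
    return facts

rules = [({"bird"}, {"has_wings"}),
         ({"has_wings"}, {"can_fly"})]
out = run_rules({"bird"}, rules)
```

The MATCH phase is the expensive one — every rule is tested against all of working memory each cycle — which is why it is the natural target for fine-grained data parallelism.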
Three sorting algorithms are given for hypercubes with d-port communication. All of these algorithms are based on binsort at the global level. The binsort allows the movement of keys among nodes to be performed by a d-port complete exchange rather than a sequence of 1-port exchanges as in other algorithms. This lowers communication costs by at least a factor of d compared to other sorting algorithms. The first algorithm assumes the keys are uniformly distributed and selects bin boundaries based on the global maximum and minimum keys. The other two algorithms make no assumption about the distribution of keys, and so they sample the keys before the binsort in order to estimate their distribution. Splitting keys based on that estimate reduces the variance among the lengths of the subsequences left in the nodes after the complete exchange of bins, which in turn helps to balance the computational load in each node. The performance of two of these algorithms on an FPS T-40 is given for data of various distributions and is compared to the performance of bitonic sort and hyperquicksort.
{"title":"Binsorting on hypercubes with d-port communication","authors":"S. Seidel, W. George","doi":"10.1145/63047.63102","DOIUrl":"https://doi.org/10.1145/63047.63102","url":null,"abstract":"Three sorting algorithms are given for hypercubes with d-port communication. All of these algorithms are based on binsort at the global level. The binsort allows the movement of keys among nodes to be performed by a d-port complete exchange rather than a sequence of l-port exchanges as in other algorithms. This lowers communication costs by at least a factor of d compared to other sorting algorithms. The first algorithm assumes the keys are uniformly distributed and selects bin boundaries based on the global maximum and minimum keys. The other two algorithms make no assumption about the distribution of keys and so they sample the keys before the binsort in order to estimate their distribution. Splitting keys based on that estimate reduce the variance among the lengths of the subsequences left in the nodes after the complete exchange of bins which in turn helps to balance the computational load in each node. The performance of two of these algorithms on an FPS T-40 is given for data of various distributions and is compared to the performance of bitonic sort and hyperquicksort.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127795040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
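The sample-based splitter idea described above can be illustrated in a few lines: sample the keys, pick splitters at sample quantiles, scatter keys into bins by splitter, and sort each bin locally. A serial sketch (sampling rate and quantile choice are illustrative, not the paper's parameters):

```python
from bisect import bisect_right

def binsort_by_splitters(keys, n_bins):
    """Sample-based binsort: estimate the key distribution from a sample,
    choose splitters at sample quantiles, scatter keys into bins, then
    sort each bin locally. Concatenating the bins yields a global sort."""
    sample = sorted(keys[::max(1, len(keys) // (4 * n_bins))])
    step = max(1, len(sample) // n_bins)
    splitters = [sample[i * step] for i in range(1, n_bins)]
    bins = [[] for _ in range(n_bins)]
    for k in keys:
        bins[bisect_right(splitters, k)].append(k)
    return [sorted(b) for b in bins]

data = [27, 3, 99, 41, 8, 56, 7, 73, 15, 62, 34, 88]
bins = binsort_by_splitters(data, 4)
merged = [k for b in bins for k in b]   # concatenation is globally sorted
```

On a hypercube, the scatter step becomes the d-port complete exchange; quantile-based splitters keep the bin sizes — and hence the per-node local sorts — roughly equal even for skewed key distributions.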
{"title":"Molecular dynamics simulation on an iPSC of defects in crystals","authors":"P. Flinn","doi":"10.1145/63047.63084","DOIUrl":"https://doi.org/10.1145/63047.63084","url":null,"abstract":"Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. TO copy otherwise, or to republish, requires a fee and/or specfic permission.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133057620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Physicists believe that the world is described in terms of gauge theories. A popular technique for investigating these theories is to discretize them onto a lattice and simulate them numerically on a computer, yielding so-called lattice gauge theory. Such computations require at least 10¹⁴ floating-point operations, necessitating the use of advanced architecture supercomputers such as the Connection Machine made by Thinking Machines Corporation. Currently the most important gauge theory to be solved is that describing the sub-nuclear world of high energy physics: Quantum Chromodynamics (QCD). The simplest example of a gauge theory is Quantum Electrodynamics (QED), the theory which describes the interaction of electrons and photons. Simulation of QCD requires computer software very similar to that for the simpler QED problem. Our current QED code achieves a computational rate of 1.6 million lattice site updates per second for a Monte Carlo algorithm, and 7.4 million site updates per second for a microcanonical algorithm. The estimated performance for a Monte Carlo QCD code is 200,000 site updates per second (or 5.6 Gflops/sec).
{"title":"QED on the connection machine","authors":"C. Baillie, S. Johnsson, Luis F. Ortiz, G. Pawley","doi":"10.1145/63047.63082","DOIUrl":"https://doi.org/10.1145/63047.63082","url":null,"abstract":"Physicists believe that the world is described in terms of gauge theories. A popular technique for investigating these theories is to discretize them onto a lattice and simulate numerically by a computer, yielding so-called lattice gauge theory. Such computations require at least 1014 floating-point operations, necessitating the use of advanced architecture supercomputers such as the Connection Machine made by Thinking Machines Corporation. Currently the most important gauge theory to be solved is that describing the sub-nuclear world of high energy physics: Quantum Chromo-dynamics (QCD). The simplest example of a gauge theory is Quantum Electro-dynamics (QED), the theory which describes the interaction of electrons and photons. Simulation of QCD requires computer software very similar to that for the simpler QED problem. Our current QED code achieves a computational rate of 1.6 million lattice site updates per second for a Monte Carlo algorithm, and 7.4 million site updates per second for a microcanonical algorithm. The estimated performance for a Monte Carlo QCD code is 200,000 site updates per second (or 5.6 Gflops/sec).","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114321845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
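The "lattice site updates" that the rates above count are Metropolis Monte Carlo steps: propose a change at one site, accept it with probability min(1, e^(-βΔE)). A deliberately simplified sketch using a 1-D Ising chain rather than a U(1) gauge field (the update structure, not the physics, is the point):

```python
import math, random

def metropolis_sweep(lattice, beta, rng):
    """One Metropolis sweep of a 1-D Ising chain with periodic boundary:
    visit each site, propose a spin flip, accept with probability
    min(1, exp(-beta * dE)). Each accepted/rejected visit is one
    'site update' in the sense the performance figures count."""
    n = len(lattice)
    for i in range(n):
        s = lattice[i]
        nbr = lattice[(i - 1) % n] + lattice[(i + 1) % n]
        dE = 2.0 * s * nbr              # energy change of flipping site i
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            lattice[i] = -s
    return lattice

rng = random.Random(42)
chain = [1] * 16
metropolis_sweep(chain, beta=0.5, rng=rng)
```

On a SIMD machine like the Connection Machine, the sweep is parallelized by checkerboarding: all even sites update simultaneously, then all odd sites, so no site ever updates concurrently with a neighbor it depends on.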