The Sixth Distributed Memory Computing Conference, 1991. Proceedings: Latest Articles

A Flexible Interleaved Memory Design for Generalized Low Conflict Memory Access

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633349

L. S. Kaplan

High bandwidth delivery of data to the processor(s) is critical for good performance in highly parallel computer systems. To increase memory throughput, many systems make use of interleaved parallel memory banks. An implementation must provide uniform throughput with little or no contention at the memory banks for a wide variety of algorithms and access patterns. This paper proposes an implementation for an interleaved memory system that exhibits extremely low contention for the memory banks during virtually all patterned accesses. It also has the advantage that, due to its programmability, it imposes few requirements on the configuration of the machines in which it is used. The hardware to implement the design is discussed along with address space considerations. A variant of this design is currently in use on the BBN TC2000 (tm) parallel computer.
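The contention problem that motivates such a design can be illustrated with a minimal sketch. It assumes plain low-order (modulo) interleaving, which is a textbook baseline and not the scheme proposed in the paper; the function names and parameters are hypothetical.

```python
from math import gcd


def banks_touched(stride, num_banks, accesses=1024):
    """Count distinct banks hit by a strided access stream.

    Assumes simple modulo interleaving (bank = address mod num_banks),
    which is a baseline for illustration, not the paper's design.
    """
    return len({(i * stride) % num_banks for i in range(accesses)})


def predicted_banks(stride, num_banks):
    """Closed form for the same count: num_banks / gcd(stride, num_banks)."""
    return num_banks // gcd(stride, num_banks)


if __name__ == "__main__":
    for stride in (1, 2, 3, 4, 8):
        print(stride, banks_touched(stride, 8), predicted_banks(stride, 8))
```

Under this baseline, a stride that shares a large factor with the bank count concentrates traffic on a few banks, which is the kind of patterned access a low-conflict design has to handle.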

Citations: 4

The Finite Difference Solution of Two- and Three-Dimensional Semiconductor Problems on the Connection Machine

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633216

K. Dalton, E. Hensel, S. Castillo, K. Ng

A study of the finite difference solution of the nonlinear partial differential equations governing two- and three-dimensional semiconductor devices is conducted on a SIMD computer. This nonlinear system is solved using Jacobi iteration and successive-under-relaxation. Row scaling and a zero order regularizer are used to aid in convergence. On a 16K CM-2, problems with up to 16.7 million unknowns have been solved. Problems of this size have not previously been reported. The ability to accurately model larger and more realistic three-dimensional devices is necessary to gain a greater physical understanding of their behavior.
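As a rough illustration of Jacobi iteration combined with under-relaxation, the sketch below applies the update to a small linear system. The paper treats a nonlinear semiconductor system on a SIMD machine; the linear test problem, the relaxation factor `omega`, and the function name here are assumptions chosen only to show the update rule.

```python
def relaxed_jacobi(A, b, omega=0.8, iters=200):
    """Jacobi iteration with under-relaxation (0 < omega < 1).

    Every component is updated from the previous iterate only, so all
    updates in one sweep are independent, which suits SIMD hardware.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        x_new = []
        for i in range(n):
            # Sum of off-diagonal contributions using the old iterate.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            jacobi_value = (b[i] - sigma) / A[i][i]
            # Blend the old value with the Jacobi update.
            x_new.append((1.0 - omega) * x[i] + omega * jacobi_value)
        x = x_new
    return x


if __name__ == "__main__":
    # Small diagonally dominant system, used purely as an illustration.
    print(relaxed_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

A relaxation factor below 1 damps each step, trading convergence speed for stability, which can matter when the underlying problem is nonlinear.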

Citations: 0

Adaptive Optics Calculations Using the Connection Machine

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633209

R. Firestone, Eric N. Opp

The performance of reflecting optical telescopes located on the surface of the earth is subject to distortions due to the force of gravity on the mirror and the turbulence of the atmosphere on the light path. Reflective optics are also planned for use in high-powered laser systems, where the intensity of the light itself is capable of producing distortions in the air within the instrument, thereby affecting the shape of the focused wavefront. A solution proposed by optical designers is the use of adaptive optics: an optical system in which the figure of the mirror is deformable to the extent necessary to correct for the distortions mentioned. An adaptive optical system uses a feedback loop concept, in which the distortions of the optical wavefront are measured, the necessary corrections are computed, and a set of actuators is moved to provide those corrections. The calculation of the corrections is computationally intense. Specifically, the measurement of the distortions provides a collection of phase differences between measuring points corresponding to the actuator positions. This set of phase differences is larger than the number of actuators, leading to an overdetermined problem. As physical systems have some amount of noise present, the technique of least-squares solution serves both to provide the best choice of actuator positions for this overdetermined problem and to suppress the noise in the measurements. The necessary algorithms for solving the computation portion of the adaptive optics problem consist of a matrix generator to derive the computational representation of the physical system, a matrix inversion routine, and a high-speed least-squares solver. In the optical astronomy paradigm, the computational requirement is for a small number of adjustments per second, due to the rate of atmospheric turbulence. For the laser system, with more stringent requirements, we demonstrate an improvement of 11 2 orders of magnitude, made possible only through the use of supercomputer methods. Extrapolation of these results indicates that even greater acceleration is possible if the interprocessor communication is minimized; in other words, supercomputer designers have not yet solved the problem of making interprocessor communication as efficient as that within processors (or, in the present case, between processors on a single chip).
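The overdetermined least-squares step described in the abstract can be sketched on a toy problem. The example assumes three actuators with the first pinned as a phase reference, three noisy phase-difference measurements, and a solution via the normal equations; the matrix, the data values, and the function names are all hypothetical and are not taken from the paper.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares(A, b):
    """Minimize ||A x - b|| via the normal equations (A^T A) x = A^T b."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return solve_linear(AtA, Atb)


if __name__ == "__main__":
    # Unknowns: phases p1, p2 (p0 is the reference, fixed at 0).
    # Rows measure p1 - p0, p2 - p1, and p2 - p0 (illustrative values).
    A = [[1.0, 0.0], [-1.0, 1.0], [0.0, 1.0]]
    b = [1.0, 1.1, 1.9]
    print(least_squares(A, b))
```

The three measurements are mutually inconsistent because of noise, and the least-squares solution spreads the discrepancy across them rather than trusting any single reading. Normal equations are used here for brevity; an orthogonalization-based solver is generally better conditioned.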

Citations: 0

A Comparison of Particle Simulation Implementations on Two Different Parallel Architectures

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633198

J. Mcdonald, L. Dagum

Direct particle simulation is a powerful method for analyzing low density, hypersonic re-entry flows. The method involves following a large sample of representative gas molecules through motion and collision with other molecules or with surfaces in the simulated flow. In this paper, two very different parallel architectures are examined for their suitability in particle simulation computations, namely the Connection Machine CM-2 and the Intel iPSC/860. The difference in architectures has resulted in very different parallel decompositions. The two implementations are described and performance results are given. Both implementations achieve performance comparable to a single Cray-2 CPU; however, this performance is obtained at the cost of greatly increased programming complexity.

Citations: 5

Communication Abstraction and Process Refinement

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633097

J. Yantchev

Concurrent systems are collections of data, processes, and communication channels. Top-down, hierarchical design of concurrent systems needs powerful abstraction facilities provided by the implementation language. While most languages provide some structuring mechanisms for data and process abstraction, none seems to provide any equivalent mechanisms for communication structuring. Communication channels are to communicate data and, therefore, all data structuring mechanisms provided by a programming language must be available to structure channels as well. In order to preserve behaviour through successive levels of design refinement, these means of communication structuring must preserve the abstraction of atomic transfers of values of arbitrary types.

Introduction

Most concurrent programming languages [5, 6, 7, 1] support the abstraction of concurrent systems as collections of data, processes, and communication channels. However, while they provide some structuring mechanisms for data and process abstraction, none seems to provide any equivalent mechanisms for communication structuring. Interprocess communication is almost universally viewed as a synchronised atomic exchange of values between two concurrently active processes. This affects the whole design process and interferes with the freedom and ease in the refinement of the process structure. The design transformation steps may be non-trivial in some cases and, therefore, difficult to arrive at and verify. In addition, the implementation may be less efficient, both in storage and speed, because of unnecessary data copying and context creation for process spawning. The data structuring mechanisms supported by the contemporary programming languages provide a uniform view on data and data types. Structured data types may consist of components of arbitrary types, including themselves, and values of such types are treated as wholes and may be passed as parameters, returned as results of functions, and assigned to variables. The same applies to processes [5, 7]. No distinction of kind need be made between systems with and without substructure and, indeed, a system which at one level of abstraction may be considered to consist of a process and the environment in which it evolves, may be considered as a single system at a higher level of abstraction. A process which for one purpose is taken to be atomic

Citations: 0

The Parallel BFGS Method for Unconstrained Minimization

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633160

C. Still

Citations: 4

Spare Allocation and Reconfiguration in a Fault Tolerant Hypercube with Direct Connect Capability

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633360

B. Izadi, F. Ozguner

This paper investigates hardware reconfiguration schemes to make the hypercube multicomputer fault tolerant. Two schemes are proposed: the Cluster Approach and the Enhanced Cluster Approach. The approaches are shown to be able to tolerate a large number of failures without any performance degradation. It is further demonstrated that no modification to either the existing communication or computational algorithm is needed. Finally, a gracefully degradable approach is presented to reconfigure when the number of faulty nodes is greater than the number of available spares.

Citations: 9

Scatter Scheduling for Problems with Unpredictable Structures

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633106

Minyou Wu, W. Shu

An extended scatter scheduling was applied to problems with unpredictable, asynchronous structures. It has been found that with this simple scheduling strategy, good load balance can be reached without incurring much runtime overhead. This scheduling algorithm has been implemented on hypercube machines, and its performance is compared with other scheduling strategies.

Citations: 5

Fault Tolerance of the Cyclic Buddy Subcube Location Scheme in Hypercubes

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633075

M. Livingston, Q. Stout

This paper examines the problem of locating large fault-free subcubes in multiuser hypercube systems. We analyze a new location strategy, the cyclic buddy system, and compare its performance to the buddy system, the gray-coded buddy system, and several variants of them. We show that the cyclic buddy system gives a striking improvement in expected fault tolerance over the above schemes and, since it can easily be implemented in parallel with little overhead, it provides an attractive alternative to these schemes. We also investigate the behavior of these location systems in the folded, or projective, hypercube, and find that the cyclic buddy system, which adapts naturally to this enhancement, significantly outperforms the other schemes. A combination of analytic techniques and simulation is used to examine both worst case and expected case performance.
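For context, the baseline buddy system that the cyclic scheme is compared against can be sketched as follows. The sketch assumes the usual formulation in which a k-dimensional buddy subcube of an n-cube is an aligned block of 2^k consecutive node addresses; it does not implement the cyclic or Gray-coded variants, and the function name is hypothetical.

```python
def fault_free_buddy_subcubes(n, k, faulty):
    """List base addresses of fault-free k-subcubes under the buddy system.

    Assumes the standard buddy layout: each candidate subcube is the
    aligned address range [base, base + 2**k) inside an n-cube, i.e.
    the low k address bits are free and the high n - k bits are fixed.
    """
    size = 1 << k
    return [
        base
        for base in range(0, 1 << n, size)
        if not any(base <= node < base + size for node in faulty)
    ]


if __name__ == "__main__":
    # In a 3-cube, the buddy system recognizes only two 2-subcubes
    # (nodes 0-3 and 4-7), so two well-placed faults can block both.
    print(fault_free_buddy_subcubes(3, 2, {5}))
    print(fault_free_buddy_subcubes(3, 2, {1, 6}))
```

Because the buddy system considers only a small fraction of all subcubes, a few faults can leave it unable to find a large fault-free subcube even when one exists, which is the weakness that alternative location schemes aim to reduce.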

Citations: 7

Computing over Networks: An Illustrated Example

The Sixth Distributed Memory Computing Conference, 1991. Proceedings

Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633138

Bernd Bruegge, Hiroshi Nishikawa, Peter Steenkiste

With the advances in high-speed networking, partitioning applications over a group of computer systems is becoming an attractive way of exploiting parallelism. Programming general multicomputers is, however, very challenging: nodes are typically heterogeneous and shared with other users, making the availability of computing cycles on the nodes and communication bandwidth on the network unpredictable. This environment often requires users to use a programming model based on dynamic load balancing. In this paper, we use a flow field generation application to look at the problems that come up in a network environment. We use BEE, a monitoring system that allows programmers to interactively monitor their application, to show the behavior of the program under different conditions.

Citations: 6
