An overview of the nCUBE 3 supercomputer
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234880
B. Duzett, R. Buck
nCUBE is developing a new family of massively parallel products, the nCUBE 3 systems. These next-generation supercomputers will be the industry's first implementable multi-TeraFLOPS platforms and will be 100% compatible with previous-generation nCUBE systems. The nCUBE 3 family will carry nCUBE's philosophy of high integration and scalability to new, industry-leading levels, offering systems that scale from low-end, entry-level products to high-end, grand-challenge machines. After introducing the nCUBE 3 system, the authors describe its implementation.
{"title":"An overview of the nCUBE 3 supercomputer","authors":"B. Duzett, R. Buck","doi":"10.1109/FMPC.1992.234880","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234880","url":null,"abstract":"nCUBE is developing a new family of massively parallel products-the nCUBE 3 systems. These next-generation supercomputers will be the industry's first implementable multi-TeraFLOPS platforms and will be 100% compatible with previous-generation nCUBE systems. The nCUBE 3 family will carry nCUBE's philosophy of high integration and scalability to new, industry-leading levels, offering systems that scale from low-end, entry-level products to high-end, grand challenge machines. After introducing the nCUBE 3 system, the authors describe the nCUBE 3 system implementation.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114386414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Petri net modeling and analysis of centralized timeout and batching arbitration units
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234937
P. Garske, V. Narasimhan
The authors consider two novel arbitration techniques, timeout and batching arbitration, and establish the validity of their operation using generalized and deterministic Petri net models. After a brief review of Petri net theory and the fundamentals of generalized and deterministic models, Petri net models for the timeout and batching arbitration schemes are presented, followed by a discussion of the simulation results for both schemes. Both arbitration schemes are found to provide a degree of fairness by reducing the resource allocation time, though at the cost of less than complete resource utilization. A hybrid scheme that combines the key features of the batching and timeout schemes is then presented and proven to operate correctly. Simulation of this scheme suggests that, by varying the arbiter parameters in conjunction with the priority of the processors, efficient allocation of system resources can be achieved.
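The paper validates these schemes with Petri net models, which are not reproduced here. As a rough intuition only, the toy Python sketch below mimics the informal behaviour the abstract describes for a single shared resource: a timeout arbiter revokes a grant after a fixed number of cycles so no request can monopolize the resource, while a batching arbiter serves a closed batch containing at most one request per processor before admitting new work. The request mix, the timeout value, and the one-cycle grant overheads are invented assumptions, not the authors' parameters.

```python
# Toy, hypothetical sketch of timeout vs. batching arbitration (not the
# authors' Petri net models). All numeric parameters are made-up assumptions.
import random

def make_requests(n_procs=4, n_reqs=100, seed=1):
    rng = random.Random(seed)
    # (processor id, cycles of service needed), in FIFO arrival order
    return [(rng.randrange(n_procs), rng.randint(1, 8)) for _ in range(n_reqs)]

def timeout_arbiter(reqs, timeout=3, grant_cost=1):
    """FIFO grants, revoked after `timeout` cycles so no request can hog the
    resource; each grant decision costs `grant_cost` idle cycles."""
    queue, clock, busy, finish = list(reqs), 0, 0, []
    while queue:
        proc, need = queue.pop(0)
        clock += grant_cost
        used = min(need, timeout)
        clock += used
        busy += used
        if need > used:
            queue.append((proc, need - used))   # preempted: requeue the remainder
        else:
            finish.append(clock)                # request fully served
    return max(finish), busy / clock            # worst finish time, utilization

def batching_arbiter(reqs, grant_cost=1):
    """Closed batches with at most one request per processor; a single grant
    decision covers the whole batch, and new work waits for the batch to drain."""
    queue, clock, busy, finish = list(reqs), 0, 0, []
    while queue:
        batch, rest, seen = [], [], set()
        for proc, need in queue:
            if proc in seen:
                rest.append((proc, need))
            else:
                batch.append((proc, need))
                seen.add(proc)
        clock += grant_cost
        for proc, need in batch:
            clock += need
            busy += need
            finish.append(clock)
        queue = rest
    return max(finish), busy / clock

reqs = make_requests()
print("timeout :", timeout_arbiter(reqs))   # bounded holding time, more arbitration overhead
print("batching:", batching_arbiter(reqs))  # per-processor fairness, one decision per batch
```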
{"title":"Petri net modeling and analysis of centralized timeout and batching arbitration units","authors":"P. Garske, V. Narasimhan","doi":"10.1109/FMPC.1992.234937","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234937","url":null,"abstract":"The authors consider two novel arbitration techniques, timeout and batching arbitration, and establish the validity of their operations by using generalized and deterministic Petri net models. After a brief review of Petri net theory and the fundamentals of generalized and deterministic models, Petri net models for the timeout and batching arbitration schemes are presented, followed by a discussion of the simulation results of both of these schemes. It is found that both arbitration schemes provide a degree of fairness in that they reduce the resource allocation time but with the lack of complete resource utilization. A hybrid scheme which combines the key features of batching and timeout schemes is then presented and proven to operate correctly. Simulation of this scheme suggests that, by varying the arbiter parameters in conjunction with the priority of the processors, efficient allocation of system resources can be achieved.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114621502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A hyper-pyramid network topology for image processing
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234955
É. Dujardin, M. Akil
The authors describe a novel network topology for image processing, called the hyper-pyramid network topology. The structure is hierarchical, providing local, inside-region communication at each level as well as upward and downward communication through the whole structure. Intraregion communication is illustrated through the study of an image processing algorithm: the authors present the implementation of a component-labeling algorithm on a hyper-pyramid network with a computational complexity of O(log^2 n), the same complexity as on a hypercube network. It is also shown that the wiring complexity is lower than that of the hypercube network.
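For background only, the sketch below spells out the addressing of a standard quad pyramid, the kind of hierarchical structure a hyper-pyramid builds on: level k is a 2^k x 2^k grid, each node has in-level (in-region) mesh neighbours, one parent above and four children below. The hyper-pyramid's additional inter-region links and its specific wiring are not modelled; this is an assumption-laden illustration, not the authors' topology.

```python
# Minimal sketch of standard quad-pyramid addressing (an assumption; the
# paper's hyper-pyramid adds further links on top of this kind of structure).

def neighbours(level, i, j):
    """Intra-level mesh neighbours of node (i, j) on level `level`."""
    side = 2 ** level
    cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(level, a, b) for a, b in cand if 0 <= a < side and 0 <= b < side]

def parent(level, i, j):
    """Upward link: one node on level `level - 1` covers a 2x2 region below."""
    return None if level == 0 else (level - 1, i // 2, j // 2)

def children(level, i, j):
    """Downward links: the 2x2 region of level `level + 1` nodes covered."""
    return [(level + 1, 2 * i + a, 2 * j + b) for a in (0, 1) for b in (0, 1)]

print(parent(3, 5, 6))      # (2, 2, 3)
print(children(2, 2, 3))    # the four level-3 nodes it covers
print(neighbours(3, 0, 7))  # corner and edge nodes have fewer in-region links
```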
{"title":"A hyper-pyramid network topology for image processing","authors":"É. Dujardin, M. Akil","doi":"10.1109/FMPC.1992.234955","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234955","url":null,"abstract":"The authors describe a novel network topology for image processing, called the hyper-pyramid network topology. This structure is hierarchical and implements local, inside-region communications at each level, and upward/downward communications in the whole structure. Intraregion communications are shown by an image processing algorithm study. The authors display the implementation of a component labeling algorithm onto a hyper-pyramid network with a computational complexity of O(log/sup 2/(n)). This complexity is the same as that of the hypercube network. It is also demonstrated that the wiring complexity is less than that of the hypercube network.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123327800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A rank-two divide and conquer method for the symmetric tridiagonal eigenproblem
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234887
K. Gates
A rank-two divide and conquer algorithm is developed for computing the eigensystem of a symmetric tridiagonal matrix. The algorithm is compared with the LAPACK-recommended path for this problem and with the rank-one divide and conquer algorithm. Timing results on a Sequent Symmetry S81b show that the algorithm has potential as a parallel alternative to the QR algorithm.
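The rank-two splitting itself is not reproduced here. For background, the NumPy sketch below checks the classic Cuppen-style rank-one splitting that such divide-and-conquer eigensolvers start from: the tridiagonal matrix is written as two independent tridiagonal blocks plus a rank-one correction, so the halves can be solved in parallel and the results glued back together. The split index and matrix sizes are arbitrary choices for illustration.

```python
# Minimal sketch (NumPy), assuming the classic Cuppen-style rank-one splitting;
# the paper's rank-two variant is a different correction and is not shown here.
import numpy as np

def tridiag(d, e):
    """Build a symmetric tridiagonal matrix from diagonal d and off-diagonal e."""
    return np.diag(d) + np.diag(e, 1) + np.diag(e, -1)

def rank_one_split(d, e, k):
    """Split T into block-diagonal halves plus a rank-one correction:
    T = diag(T1, T2) + b * v v^T, with v = e_k + e_{k+1} and b = e[k]."""
    b = e[k]
    d1, e1 = d[:k + 1].copy(), e[:k]
    d2, e2 = d[k + 1:].copy(), e[k + 1:]
    d1[-1] -= b          # compensate the two diagonal entries touched by b*v*v^T
    d2[0] -= b
    v = np.zeros(len(d))
    v[k] = v[k + 1] = 1.0
    return (d1, e1), (d2, e2), b, v

# Check that the splitting reproduces T exactly.
rng = np.random.default_rng(0)
d, e = rng.standard_normal(8), rng.standard_normal(7)
(d1, e1), (d2, e2), b, v = rank_one_split(d, e, 3)
T_rebuilt = np.block([
    [tridiag(d1, e1), np.zeros((len(d1), len(d2)))],
    [np.zeros((len(d2), len(d1))), tridiag(d2, e2)],
]) + b * np.outer(v, v)
assert np.allclose(T_rebuilt, tridiag(d, e))
```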
{"title":"A rank-two divide and conquer method for the symmetric tridiagonal eigenproblem","authors":"K. Gates","doi":"10.1109/FMPC.1992.234887","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234887","url":null,"abstract":"A rank-two divide and conquer algorithm is developed for calculating the eigensystem of a symmetric tridiagonal matrix. This algorithm is compared to the LAPACK recommended path for this problem and the rank-one divide and conquer algorithm. The timing results on a Sequent Symmetry S81b show that this algorithm has potential as a parallel alternative to the QR algorithm.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126061217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The LINPACK benchmark on the Fujitsu AP 1000
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234897
Richard P. Brent
The author describes an implementation of the LINPACK benchmark on the Fujitsu AP 1000. Design considerations include communication primitives, data distribution, use of blocking to reduce memory references, and effective use of the cache. The LINPACK benchmark results show that the AP 1000 is a good machine for numerical linear algebra, and that one can consistently achieve close to 80 percent of its theoretical peak performance on moderate to large problems. The main reason for this is the high ratio of communication speed to floating-point speed compared to machines such as the Intel Delta and nCUBE 2. The high-bandwidth hardware row/column broadcast capability of the T-net (xbrd, ybrd) and the low latency of the synchronous communication routines are significant.
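The AP 1000 implementation is not reproduced here. As a rough illustration of the blocking idea mentioned above, the NumPy sketch below performs a right-looking blocked LU factorization without pivoting, keeping most of the flops in a cache-friendly rank-nb update; the benchmark proper uses partial pivoting and distributes the blocks across the cells, so treat this purely as a sequential sketch of why blocking reduces memory references.

```python
# Minimal sketch: right-looking blocked LU without pivoting (an illustrative
# simplification; LINPACK itself requires partial pivoting).
import numpy as np

def blocked_lu(A, nb=32):
    """Blocked LU: returns F with unit-lower L below the diagonal and U above.
    Most of the work lands in the rank-nb trailing update, a matrix multiply."""
    A = A.copy()
    n = A.shape[0]
    for k in range(0, n, nb):
        b = min(nb, n - k)
        # factor the current panel, column by column (unblocked)
        for j in range(k, k + b):
            A[j + 1:, j] /= A[j, j]
            A[j + 1:, j + 1:k + b] -= np.outer(A[j + 1:, j], A[j, j + 1:k + b])
        # block row of U: solve L11 * U12 = A12
        L11 = np.tril(A[k:k + b, k:k + b], -1) + np.eye(b)
        A[k:k + b, k + b:] = np.linalg.solve(L11, A[k:k + b, k + b:])
        # rank-b update of the trailing submatrix (the dominant, cache-friendly step)
        A[k + b:, k + b:] -= A[k + b:, k:k + b] @ A[k:k + b, k + b:]
    return A

n = 256
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n)) + n * np.eye(n)   # diagonally dominant: safe without pivoting
F = blocked_lu(M)
L, U = np.tril(F, -1) + np.eye(n), np.triu(F)
assert np.allclose(L @ U, M)
```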
{"title":"The LINPACK benchmark on the Fujitsu FAP 1000","authors":"Richard P. Brent","doi":"10.1109/FMPC.1992.234897","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234897","url":null,"abstract":"The author describes an implementation of the LINPACK benchmark on the Fujitsu AP 1000. Design considerations include communication primitives, data distribution, use of blocking to reduce memory references, and effective use of the cache. The LINPACK benchmark results show that the AP 1000 is a good machine for numerical linear algebra, and that one can consistently achieve close to 80 percent of its theoretical peak performance on moderate to large problems. The main reason for this is the high ratio of communication speed to floating-point speed compared to machines such as the Intel Delta and nCUBE 2. The high-bandwidth hardware row/column broadcast capability of the T-net (xbrd, ybrd) and the low latency of the synchronous communication routines are significant.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129414465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parallel parsing of spoken language
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234933
R. A. Helzerman, M. Harper, C. Zoltowski
The authors extended H. Maruyama's (1990) constraint dependency grammar (CDG) to process a lattice of sentence hypotheses instead of separate test strings. A postprocessor to a speech recognizer producing N-best hypotheses generates the word lattice representation, which is then augmented with information required for parsing. The authors summarize the CDG parsing algorithm and describe how the algorithm is extended to process the lattice on a single-processor machine. They outline the CRCW P-RAM algorithm for parsing the word lattice, which requires O(n^4) processors to parse in O(k+n) time.
{"title":"Parallel parsing of spoken language","authors":"R. A. Helzerman, M. Harper, C. Zoltowski","doi":"10.1109/FMPC.1992.234933","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234933","url":null,"abstract":"The authors extended H. Maruyama's (1990) constraint dependency grammar (CDG) to process a lattice of sentence hypotheses instead of separate test strings. A postprocessor to a speech recognizer producing N-best hypotheses generates the word lattice representation, which is then augmented with information required for parsing. The authors summarize the CDG parsing algorithm and describe how the algorithm is extended to process the lattice on a single processor machine. They outline the CRCW P-RAM algorithm for parsing the word lattice, which requires O(n/sup 4/) processors to parse in O(k+n) time.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121975474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Massively parallel solution of quantum transport problems
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234870
P. Balasingam, V. Roychowdhury
A numerically intensive program for the simulation of quantum transport in small structures has been implemented on a MasPar MP-1. The high degree of parallelism inherent in numerically intensive sections of the problem has been exploited, and devices with realistic dimensions and operating conditions have been investigated.
{"title":"Massively parallel solution of quantum transport problems","authors":"P. Balasingam, V. Roychowdhury","doi":"10.1109/FMPC.1992.234870","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234870","url":null,"abstract":"A numerically intensive program for the simulation of quantum transport in small structures has been implemented on a MasPar MP-1. The high degree of parallelism inherent in numerically intensive sections of the problem has been exploited, and devices with realistic dimensions and operating conditions have been investigated.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122416199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-routing least common ancestor networks
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234867
Chi-Kai Chien, I.D. Scherson
Fat-trees, KYKLOS, baseline and SW-banyan networks, and the TRAC and CM-5 networks belong to a family of networks called least-common-ancestor networks (LCANs). In this paper, attention is restricted to LCANs with identical switches and a uniform stage interconnect. The least common ancestor of two nodes (PEs), A and B, is the node at greatest depth that counts A and B among its descendants: this node corresponds to an LCA switch. Given a source-destination pair, communication progresses upwards to an LCA switch; the stage it belongs to is called the LCA level. Then, routing returns downwards to the destination. Source-destination pairs are connected using as few stages as their degree of mutual locality permits. Network parameters that facilitate this routing are shown.
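As an illustration of the LCA-level idea, the sketch below assumes the simplest possible LCAN, a complete binary tree with PEs at the leaves and a single switch per internal node. Real LCANs such as fat-trees or the CM-5 data network replicate switches at each level, but the up-then-down shape of a route and the notion of the LCA level are the same; the naming of switches by (level, subtree index) is an assumption made for this sketch.

```python
# Minimal sketch: routing in a complete binary-tree LCAN with PEs at the leaves.
# Real LCANs replicate switches per stage; only the LCA-level idea is shown.

def lca_level(src: int, dst: int) -> int:
    """Number of stages the message must climb: the position of the highest
    bit in which the two PE addresses differ (0 means src == dst)."""
    return (src ^ dst).bit_length()

def route(src: int, dst: int):
    """Climb to the LCA switch, then descend to the destination PE.
    A switch is named (level, subtree index); the PEs sit at level 0."""
    top = lca_level(src, dst)
    up = [(lvl, src >> lvl) for lvl in range(1, top + 1)]
    down = [(lvl, dst >> lvl) for lvl in range(top - 1, 0, -1)]
    return up + down

# PEs 5 (0b0101) and 7 (0b0111) differ only in bit 1, so their LCA sits 2 levels up;
# PEs 5 and 13 (0b1101) differ in bit 3, so traffic must climb all the way to level 4.
print(lca_level(5, 7))    # 2
print(lca_level(5, 13))   # 4
print(route(5, 7))        # [(1, 2), (2, 1), (1, 3)]
```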
{"title":"Self-routing least common ancestor networks","authors":"Chi-Kai Chien, I.D. Scherson","doi":"10.1109/FMPC.1992.234867","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234867","url":null,"abstract":"Fat-trees, KYKLOS, baseline and SW-banyan networks, and the TRAC and CM-5 networks belong to a family of networks called least-common-ancestor networks (LCANs). In this paper, attention is restricted to LCANs with identical switches and a uniform stage interconnect. The least common ancestor of two nodes (PEs), A and B, is the node at greatest depth that counts A and B among its descendants: this node corresponds to an LCA switch. Given a source-destination pair, communication progresses upwards to an LCA switch; the stage that it belongs to is called the LCA level. Then, routing returns downwards to the destination. Source-destination pairs are connected using as few stages as their degree of mutual locality permits. Network parameters that facilitate this routing are shown.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121024626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A CPU utilization limit for massively parallel MIMD computers
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234902
T. Bridges, S. W. Kitchel, R.M. Wehrmeister
Massively parallel computer systems based on off-the-shelf CPU chip-sets have become commercially available. The authors demonstrate a theoretical limit on the silicon (or other circuitry media) utilization of such architectures as the number of processors is scaled up. In addition, case studies of the Thinking Machines Corporation CM-5 and of the Intel Touchstone are presented in order to quantify the maximum utilization of existing machines. Based on this utilization limit, the authors examine whether computer architects' current reliance on the MIMD (multiple-instruction multiple-data) model will be practical in next-generation machines. To facilitate the analysis, they decouple the control-parallel and data-parallel models of computation from MIMD and SIMD (single-instruction multiple-data) target platforms, respectively. The utilization of control-parallel paradigms executing on SIMD platforms is introduced for comparison. The authors also consider how communication overhead relates to machine-size scaling when virtual processing nodes are required.
{"title":"A CPU utilization limit for massively parallel MIMD computers","authors":"T. Bridges, S. W. Kitchel, R.M. Wehrmeister","doi":"10.1109/FMPC.1992.234902","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234902","url":null,"abstract":"Massively parallel computer systems based on off-the-shelf CPU chip-sets have become commercially available. The authors demonstrate a theoretical limit on the silicon (or other circuitry media) utilization of such architectures as the number of processors is scaled up. In addition, case studies of the Thinking Machines Corporation CM-5 and of the Intel Touchstone are presented in order to quantify the maximum utilization on existing machines. Based on this utilization limit, the authors examine whether computer architects' current reliance on the MIMD (multiple-instruction multiple-data) model will be practical in next-generation machines. In order to facilitate the analysis, they decouple the control parallel and data parallel models of computation from MIMD and SIMD (single-instruction multiple-data) target platforms, respectively. Utilization of control parallel paradigms executing on SIMD platforms is introduced for comparison. The authors also consider the relationship of communication overhead to machine size scaling in the presence of the need for virtual processing nodes.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127603650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Routing algorithms on a mesh-connected computer
Pub Date: 1992-10-19 | DOI: 10.1109/FMPC.1992.234863
Q. Gu, J. Gu
The authors present two algorithms for the 1-1 routing problem on a mesh-connected computer. The first algorithm, with a queue size of 28, solves the 1-1 routing problem on an n*n mesh-connected computer in 2n+O(1) steps; this improves on the previous result, which required a queue size of 75. The second algorithm solves the problem in 2n-2 steps with queue size 12*t_s/s, where t_s is the time for sorting an s*s mesh into row-major order, for all s >= 1; this improves on the previous bound of 18.67*t_s/s. Both algorithms have important applications in reducing the hardware cost of a mesh-connected computer.
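The constant-queue algorithms themselves are not reproduced here. For context, the toy sketch below simulates the classic greedy row-then-column routing that such results improve on: each packet first corrects its column, then its row, with at most one packet crossing each directed link per step. For a random permutation the queues usually stay modest, but adversarial permutations can force queues of size Theta(n) at the turning nodes, which is exactly what bounded-queue algorithms like the paper's avoid. The mesh size, seed, and FIFO-style queueing discipline are arbitrary choices for illustration.

```python
# Toy simulation of greedy row-then-column (XY) routing on an n*n mesh;
# not the paper's algorithms, only the baseline scheme they improve on.
import random
from collections import defaultdict

def sign(x):
    return (x > 0) - (x < 0)

def next_hop(pos, dest):
    (r, c), (dr, dc) = pos, dest
    if c != dc:
        return (r, c + sign(dc - c))   # row phase: fix the column first
    return (r + sign(dr - r), c)       # column phase

def greedy_route(n=16, seed=0):
    rng = random.Random(seed)
    sources = [(r, c) for r in range(n) for c in range(n)]
    dests = sources[:]
    rng.shuffle(dests)                              # a random 1-1 routing instance
    here = {s: [d] for s, d in zip(sources, dests) if s != d}
    steps = max_queue = 0
    while here:
        steps += 1
        max_queue = max(max_queue, max(len(q) for q in here.values()))
        nxt = defaultdict(list)
        for node, packets in here.items():
            taken = set()                           # one packet per outgoing link per step
            for d in packets:
                hop = next_hop(node, d)
                if hop in taken:
                    nxt[node].append(d)             # blocked: wait another step
                else:
                    taken.add(hop)
                    if hop != d:
                        nxt[hop].append(d)          # forwarded (delivered packets vanish)
        here = {k: v for k, v in nxt.items() if v}
    return steps, max_queue

print(greedy_route())   # (step count, largest node queue) for one random permutation
```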
{"title":"Routing algorithms on a mesh-connected computer","authors":"Q. Gu, J. Gu","doi":"10.1109/FMPC.1992.234863","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234863","url":null,"abstract":"The authors present two algorithms for the 1-1 routing problems on a mesh-connected computer. The first algorithm, with a queue size of 28, solves the 1-1 routing problem on an n*n mesh-connected computer in 2n+O(1) steps. This improves the previous result of queue size 75. The second algorithm solves the problem in 2n-2 steps with queue size 12t/sub s//s where t/sub s/ is the time for sorting an s*s mesh into a row major order for all s>or=1. This result improves the previous result of size 18.67 t/sub s//s. Both algorithms have important applications in reducing the hardware cost on a mesh-connected computer.<<ETX>>","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132000797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}