[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation: Latest Articles


Efficient masking techniques for large-scale SIMD architectures

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89469

W. Nation, S. Fineberg, M. Allemang, T. Schwederski, T. Casavant, H. Siegel

SIMD (single-instruction-stream, multiple-data-stream) architectures require mechanisms that efficiently enable and disable (mask) processors to support flexible programming. Most current SIMD architectures use local masking. Global processor masks, specified by the control unit, are more efficient for tasks where the masking is data independent. An efficient hybrid masking technique that supports both global and local masking is proposed for SIMD architectures constructed from standard microprocessors. A design for the hybrid mechanism is described, and its performance is examined experimentally on the existing PASM prototype. It is shown that the hybrid masking technique can increase the utilization of processing elements (PEs) and thus increase performance, the degree of improvement being algorithm dependent.
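
The distinction between the two kinds of mask can be illustrated with a minimal Python sketch. It is not the PASM mechanism or the authors' design; the function name `simd_step`, the list-of-PEs model, and the sample data are assumptions made purely for illustration. A global mask is chosen by the control unit without looking at the data, while a local mask is computed by each PE from its own value.

```python
from typing import Callable, List


def simd_step(data: List[int],
              op: Callable[[int], int],
              global_mask: List[bool],
              local_pred: Callable[[int], bool] = lambda x: True) -> List[int]:
    """Apply `op` on every PE enabled by both masks (hypothetical model).

    A disabled PE keeps its old value, mimicking a PE that sits idle
    while the broadcast instruction executes elsewhere.
    """
    out = []
    for value, enabled_by_cu in zip(data, global_mask):
        enabled = enabled_by_cu and local_pred(value)
        out.append(op(value) if enabled else value)
    return out


# Eight PEs, each holding one value.
pe_data = [3, -1, 4, -1, 5, -9, 2, 6]

# Global mask: the control unit enables the even-numbered PEs.
# The choice depends only on the PE index, not on the data.
even_pes = [i % 2 == 0 for i in range(len(pe_data))]
doubled_even = simd_step(pe_data, lambda x: 2 * x, even_pes)

# Local mask: each PE enables itself only if its own value is negative.
all_pes = [True] * len(pe_data)
negated = simd_step(pe_data, lambda x: -x, all_pes, lambda x: x < 0)
```

In this model a hybrid scheme is simply the conjunction of the two masks, which is what `simd_step` computes when both arguments are supplied.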

Citations: 8

The Digital Transform Machine

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89470

W. W. Kirkman

The Digital Transform Machine, a massively parallel computer architecture based on a configurable hardware model of processing, is discussed. Some of the implications of this model of computing are examined, and the cellular structure and interconnection network of a proof-of-concept computer based on it are described. Areas that merit particular attention in future research are identified.

Citations: 7

On bit-serial packet routing for the mesh and the torus

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89475

F. Makedon, A. Simvonis

The bit-serial routing problem, wherein each packet consists of a sequence of k flits and is thus called a snake, is considered. On the basis of the properties of the snake during routing, formal definitions are given for three different packet routing models: the store-and-forward model, the cut-through model, and the wormhole model. The wormhole model, which is most commonly used in practice, is studied. The first algorithms (deterministic and probabilistic) based on the wormhole model for the permutation routing problem on a chain, on a square mesh, and on a square torus are given. A new lower bound is derived for distance-limited permutation routing on a ring of processors, and an algorithm that matches this lower bound if the packets are routed independently is given.

Citations: 23

Vcode: a data-parallel intermediate language

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89498

G. Blelloch, Siddhartha Chatterjee

A description is given of Vcode, a data-parallel intermediate language. Vcode is designed to allow easy porting of data-parallel languages to a wide class of parallel machines and to support experiments in compiling such languages. The design goal was to define a simple language whose primitives can be implemented efficiently but that is still powerful enough to express the features of existing data-parallel languages. Vcode contains about 50 instructions, most of which manipulate arbitrarily long vectors of atomic values, and includes a set of segmented instructions that are crucial for implementing data-parallel languages that permit nested parallelism. The design decisions are discussed, and it is shown how three data-parallel languages (C*, Fortran 8*, and Paralation Lisp) can be mapped onto Vcode. The issues encountered in implementing Vcode on different kinds of parallel machines, as well as specific techniques for implementing it on the Connection Machine, are examined.
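
To give a feel for what a segmented instruction does, here is a minimal Python sketch of a segmented exclusive plus-scan. It is not Vcode syntax and not taken from the paper; the function name `seg_plus_scan` and the choice to describe segments by a list of lengths are assumptions made for illustration. The loop is a sequential reference version; a data-parallel machine would compute the same result across all elements at once.

```python
from typing import List


def seg_plus_scan(values: List[int], seg_lengths: List[int]) -> List[int]:
    """Exclusive prefix sum that restarts at every segment boundary.

    `values` is one flat vector; `seg_lengths` says how it is cut into
    consecutive segments (a hypothetical segment descriptor).
    """
    if any(n < 0 for n in seg_lengths) or sum(seg_lengths) != len(values):
        raise ValueError("segment lengths must be non-negative and cover the vector")
    out: List[int] = []
    pos = 0
    for length in seg_lengths:
        running = 0  # the running sum resets for each segment
        for v in values[pos:pos + length]:
            out.append(running)
            running += v
        pos += length
    return out


# A nested structure [[1, 2], [3], [4, 5, 6]] stored as one flat vector
# plus its segment lengths.
flat = [1, 2, 3, 4, 5, 6]
lengths = [2, 1, 3]
scanned = seg_plus_scan(flat, lengths)
```

Flattening nested data into one long vector plus a segment descriptor is what lets a single vector operation serve every inner sequence, which is the link to nested parallelism mentioned in the abstract.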

Citations: 76

What are the two most important issues facing the design and use of massively parallel computers?

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89507

H.J. Siegel, K. Batcher, C. Brownstein, W. J. Camp, M. Halem, J. Harris, R. Miller, D. Parkinson, A. P. Reeves, J. Reif, A. Rosenfeld, D. Schaefer, I.D. Scherson, P. Schneck, G. Steele, L. Uhr, U. Vishkin

A variety of views is presented by the participants in this panel discussion. Concerns are expressed regarding communication, control, software, programming, cost, and performance measures, among others. The responses reflect the varied backgrounds and perspectives of the panelists.

Citations: 1

A bit-parallel, word-parallel, massively parallel associative processor for scientific computing

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89457

B. Alleyne, D. Kramer, I. Scherson

A simple but powerful parallel architecture based on the classical associative processor model, which allows bit-parallel computation and communication, is proposed. Complex operations such as multiplication execute in O(m) cycles, as opposed to O(m^2) for bit-serial machines. This permits very fast processing of floating-point data. A bit-parallel communication network that exploits associative data location independence is presented. It provides the system with a reconfiguration capability, which improves chip yield, as well as fault tolerance. The simplicity of the architecture lends itself to VLSI implementation and hence allows the construction of a bit-parallel, word-parallel, and massively parallel (P^3) computing system.
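
The gap between O(m) and O(m^2) can be made concrete with a simplified cost model in Python. This is a sketch under stated assumptions, not the proposed architecture: it uses textbook shift-and-add multiplication on m-bit operands and assumes that one m-bit addition costs a single cycle on a bit-parallel datapath and m cycles on a bit-serial one. The function name and the cycle accounting are hypothetical.

```python
from typing import Tuple


def shift_add_multiply(a: int, b: int, m: int, bit_parallel: bool) -> Tuple[int, int]:
    """Multiply two unsigned m-bit integers and count cycles in a toy model.

    Every one of the m steps is charged for an addition, because a SIMD
    machine issues the add to all PEs whether or not a given PE's
    multiplier bit is set.
    """
    if not (0 <= a < (1 << m) and 0 <= b < (1 << m)):
        raise ValueError("operands must fit in m bits")
    product = 0
    cycles = 0
    for i in range(m):
        if (b >> i) & 1:
            product += a << i  # add the shifted multiplicand
        cycles += 1 if bit_parallel else m  # assumed cost of one m-bit add
    return product, cycles


parallel_result = shift_add_multiply(13, 11, 4, bit_parallel=True)
serial_result = shift_add_multiply(13, 11, 4, bit_parallel=False)
```

Under these assumptions the bit-parallel count grows linearly with the word length m, while the bit-serial count grows quadratically.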

Citations: 2

Simulating numerically controlled machining in parallel

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89443

P. Su, S. Drysdale

Several parallel algorithms for simulating numerically controlled machining are presented. Various implementations of these algorithms on the Connection Machine are discussed. These experiments provide information about the various performance tradeoffs involved in writing programs for the Connection Machine. They also show that this particular problem is well suited to parallel solutions, since the algorithms run much faster than previous sequential algorithms.

Citations: 2

Functional and topological relations among banyan multistage networks of differing switch sizes

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89474

A. Youssef, B. Arden

If two N×N networks W and W' have switch sizes r and s, respectively, and if r > s, then W realizes a larger number of permutations than W'. Consequently, the two networks can never be equivalent. However, W may realize all the permutations of W', in which case W is said to functionally cover W' in the strict sense. More generally, W is said to functionally cover W' in the wide sense if the terminals of W can be relabeled so that W realizes all the permutations of W'. Functional covering is topologically characterized, and an optimal algorithm to decide strict functional covering is developed. It is shown that any N×N digit permutation network of switch size r functionally covers in the wide sense any other N×N digit permutation network of switch size s if and only if r is a perfect power of s, where a digit permutation network is a banyan multistage network whose interconnections are permutations that permute digits in a specified manner.
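
The arithmetic condition in the final result is easy to check mechanically. The following Python sketch tests only whether r is a perfect power of s; it does not implement the covering algorithm or model the networks. The function name is hypothetical, and "perfect power" is read here as r = s^k for some integer k >= 1, which is an assumption about the intended definition.

```python
def is_perfect_power(r: int, s: int) -> bool:
    """Return True if r == s**k for some integer k >= 1 (assumed reading)."""
    if s < 2 or r < s:
        return False
    while r % s == 0:
        r //= s  # strip one factor of s at a time
    return r == 1


# Switch size 8 versus 2: 8 = 2**3, so the condition holds.
covers_8_over_2 = is_perfect_power(8, 2)
# Switch size 8 versus 4: 8 is not an integer power of 4.
covers_8_over_4 = is_perfect_power(8, 4)
```

Under this reading, a network of 4×4 switches satisfies the condition with respect to one of 2×2 switches, whereas 8×8 switches do not satisfy it with respect to 4×4 switches.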

Citations: 1

Mapping reusable software components onto the ARC parallel processor

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89502

L. Welch, B. Weide

It is shown how to map the components of a program onto the ARC (Architecture for Reusable Components) processor automatically in a way that exploits its features. Mapping consists of two phases. The first phase determines the maximum amount of parallelism attainable from a program in the model of parallel execution. This is done by mapping program components onto logical processors (of which there are an infinite number). The second phase maps the contents of the logical processors onto physical processors (of which there are a limited number). It is shown how to (1) identify the distributable components of the system, (2) determine the relevant relationships among the components, (3) model the maximum amount of parallelism attainable with the model of parallel execution used, and (4) use the information from steps 1-3 to map components onto the processor nodes of ARC. Previous related work is reviewed.

Citations: 1

Image reconstruction on hypercube computers

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89448

E. Zapata, I. Benavides, F. F. Rivera, J. Bruguera, J. Carazo

The problem of the 3-D reconstruction of an object from its 2-D projection images using filtered backprojection is addressed. The implementation of the filtered backprojection method on hypercube computers is analyzed. It is shown that the parallel algorithm is general in the sense that it does not impose any restriction on the problem space dimensions and is adaptable to any hypercube dimension. The flexibility of the algorithm is rooted in the methodology developed for embedding algorithms into hypercubes. The algorithmic complexity is analyzed. Because the data redundancy associated with the replication of the projection images in all the PEs has allowed the process of simple backprojection to be designed without routing, an optimum algorithmic complexity is obtained.

Citations: 9
