[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation: Latest Articles

Partitioning on the banyan-hypercube networks

[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89480

A. Bellaachia, A. Youssef

Partitioning strategies and supporting data structures for banyan-hypercube networks are proposed and studied. Simulation results for internal, external, and total fragmentation are presented and discussed for uniform and exponential distributions of request sizes. The buddy-system strategy for partitioning the hypercube is also simulated for comparison. The banyan-hypercube (BH) is shown to exhibit better internal fragmentation than the hypercube, and for large request sizes the total fragmentation of the two networks is comparable. Internal fragmentation in the BH is also shown to decrease as the number of levels of the BH increases.
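The buddy-system baseline mentioned in the abstract can be illustrated with a small sketch. It assumes each request is rounded up to the next power-of-two subcube, which is the usual source of internal fragmentation. The class and function names (`BuddyHypercube`, `internal_fragmentation`) are hypothetical and are not taken from the paper, whose own data structures are not described here.

```python
class BuddyHypercube:
    """Buddy allocator over the 2**dim nodes of a hypercube (illustrative only)."""

    def __init__(self, dim):
        self.dim = dim
        # free[k] lists the base addresses of free subcubes of dimension k.
        self.free = {k: [] for k in range(dim + 1)}
        self.free[dim].append(0)

    def allocate(self, request):
        """Return (base, k) for a 2**k-node subcube holding `request` >= 1 nodes, or None."""
        k = (request - 1).bit_length()  # smallest k with 2**k >= request
        j = k
        while j <= self.dim and not self.free[j]:
            j += 1
        if j > self.dim:
            return None  # no free subcube is large enough
        base = self.free[j].pop()
        while j > k:
            # Split the subcube: keep the lower half, free the upper half (its buddy).
            j -= 1
            self.free[j].append(base + (1 << j))
        return base, k

    def release(self, base, k):
        """Free a subcube and merge it with its buddy while the buddy is also free."""
        while k < self.dim:
            buddy = base ^ (1 << k)
            if buddy not in self.free[k]:
                break
            self.free[k].remove(buddy)
            base = min(base, buddy)
            k += 1
        self.free[k].append(base)


def internal_fragmentation(request, k):
    """Nodes allocated but not requested."""
    return (1 << k) - request
```

For example, a request for 5 nodes on a 4-dimensional cube is served by an 8-node subcube, leaving 3 nodes unused inside the allocation.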

Citations: 4

A silicon compiler for massively parallel image processing ASICs

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89427

A. Boubekeur, G. Saucier

A silicon compiler design methodology for massively parallel image processing architectures is introduced. It starts from an algorithmic description of the application in a language comparable to the GAPP NCR language (GAL) and generates an optimized circuit organized as a 2-D array of 1-bit processing elements with minimized resources. The effectiveness of the approach is shown by two examples. The first is an ASIC (application-specific integrated circuit) for two basic mathematical morphology operations, dilation and erosion. The second is an ASIC for convolution. Both have been implemented in a double-aluminium 2-μm CMOS standard cell. In both cases the processing element has been found to be very effective, and considerable area savings have been achieved.
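The two morphology operations named in the abstract can be sketched in software. The sketch below assumes a binary image and a 3x3 structuring element, with pixels outside the image treated as 0. Both choices are assumptions made for illustration, and the code says nothing about the generated circuit itself. Each output pixel depends only on its immediate neighbours, which is what makes the operations a natural fit for a 2-D array of 1-bit processing elements.

```python
def _neighbourhood(img, r, c):
    """Yield the 3x3 neighbourhood of pixel (r, c); outside pixels count as 0."""
    rows, cols = len(img), len(img[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0


def dilate(img):
    """A pixel becomes 1 if any pixel in its 3x3 neighbourhood is 1."""
    return [[int(any(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img):
    """A pixel stays 1 only if every pixel in its 3x3 neighbourhood is 1."""
    return [[int(all(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]
```

Dilating a single set pixel grows it into a 3x3 block, and eroding that block shrinks it back to the single pixel.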

Citations: 2

Exploitation of fine-grain parallelism in a combinator-based functional system

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89500

P. Chu, J. Davis

A scheme is developed to extend the lazy functional language SASL with an eager evaluation operator that lets the programmer selectively identify expressions to be evaluated eagerly. D.A. Turner's (1979) abstraction and optimization algorithms are then modified so that the eagerness information propagates through the combinator instruction set to the run-time parallel graph reducer. Simulation of simple benchmark programs shows this method to be very effective in exploiting fine-grain parallelism, even in irregular and unstructured operations. The evaluation is done on a virtual system. Despite the distributive nature of the combinator scheme, it is still unclear how to map the virtual machine onto a physical architecture efficiently without seriously degrading performance.

Citations: 1

Large integer multiplication on massively parallel processors

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89434

B. Fagin

Results obtained by multiplying large integers using the Fermat number transform are presented. The effectiveness of the approach was previously limited by word-length constraints, which are not a factor with many new computer architectures. A convolution algorithm on a massively parallel processor, based on the Fermat number transform, is presented. Examples of the tradeoffs between modulus, interprocessor communication steps, and input size are given. The application of this algorithm in the multiplication of large integers is then discussed, and performance results on a Connection Machine are reported. The results show multiplication times ranging from about 50 ms for 2-kb integers to 2600 ms for 8-Mb integers.
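The idea of multiplying integers by convolution under a Fermat number transform can be shown at toy scale. The sketch below assumes the modulus F_4 = 2^16 + 1, the root of unity 2 (which has order 32 modulo F_4), base-16 digits, and operands below 2^64. All of these parameters are chosen here for illustration and are not the paper's. The transform is evaluated naively in O(N^2) steps rather than with a fast or parallel algorithm.

```python
F4 = 2**16 + 1   # Fermat number F_4 = 65537 (prime)
N = 32           # transform length: 2**16 = -1 (mod F4), so 2 has order 32
ROOT = 2         # N-th root of unity modulo F4


def fnt(x, root=ROOT):
    """Naive O(N^2) Fermat number transform of a length-N sequence."""
    return [sum(x[j] * pow(root, j * k, F4) for j in range(N)) % F4
            for k in range(N)]


def inverse_fnt(X):
    """Inverse transform: use the inverse root and scale by the inverse of N."""
    inv_root = pow(ROOT, F4 - 2, F4)  # modular inverse via Fermat's little theorem
    inv_n = pow(N, F4 - 2, F4)
    return [(v * inv_n) % F4 for v in fnt(X, inv_root)]


def to_digits(a):
    """Split 0 <= a < 2**64 into 16 base-16 digits, zero-padded to length N."""
    return [(a >> (4 * i)) & 0xF for i in range(16)] + [0] * (N - 16)


def multiply(a, b):
    """Multiply two integers below 2**64 via a cyclic convolution of their digits."""
    A, B = fnt(to_digits(a)), fnt(to_digits(b))
    conv = inverse_fnt([(p * q) % F4 for p, q in zip(A, B)])
    # Each coefficient is at most 16 * 15 * 15 = 3600 < F4, so no value wraps.
    return sum(c << (4 * i) for i, c in enumerate(conv))
```

With 16 digits per operand the linear convolution has 31 terms, so it fits in the length-32 cyclic convolution without aliasing. Larger operands would need a longer transform or a larger modulus, which is the kind of tradeoff the abstract refers to.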

Citations: 5

Massively parallel auction algorithms for the assignment problem

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89444

J. Wein, S. Zenios

Alternative approaches to the massively parallel implementation of D.P. Bertsekas' auction algorithm (see Ann. Oper. Res., vol. 14, pp. 105-123, 1988) on the Connection Machine CM2 are discussed. The most efficient implementation is a hybrid Jacobi/Gauss-Seidel implementation. It exploits two different levels of parallelism and an efficient way of communicating the data between them without the need to perform general router operations across the hypercube network. The implementations are evaluated empirically, solving large, dense problems.
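For orientation, a sequential sketch of the basic auction iteration is given below. It is the plain Gauss-Seidel form, in which one unassigned person bids at a time. It is not the hybrid Jacobi/Gauss-Seidel Connection Machine implementation the abstract describes. The function name `auction` and the default `eps` are assumptions made here. With integer benefits, any `eps` below 1/n is known to yield an optimal assignment.

```python
def auction(benefit, eps=None):
    """Assign n persons to n objects, maximising total benefit.

    benefit[i][j] is the value of object j to person i.
    Returns a list `assigned` with assigned[i] = object given to person i.
    """
    n = len(benefit)
    if eps is None:
        eps = 1.0 / (n + 1)  # below 1/n: optimal for integer benefits
    prices = [0.0] * n
    owner = [None] * n       # owner[j] = person currently holding object j
    assigned = [None] * n
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # Net value of every object to person i at current prices.
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        second = max((values[j] for j in range(n) if j != best),
                     default=values[best])
        # Bid: raise the price until the best object is only eps better
        # than the second-best one.
        prices[best] += values[best] - second + eps
        if owner[best] is not None:
            # The previous holder is outbid and must bid again.
            assigned[owner[best]] = None
            unassigned.append(owner[best])
        owner[best] = i
        assigned[i] = best
    return assigned
```

A Jacobi-style variant would instead let all unassigned persons compute bids simultaneously and award each object to its highest bidder, which exposes more parallelism per iteration.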

Citations: 35

Random number generators with inherent parallel properties

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89433

T. L. Yu, K. W. Yu

By incorporating a spatial variable into a one-dimensional array of numbers, the well-known linear congruential random-number generator (LCG) can be generalized to the spatially coupled random-number generator (SCG) given by $X_i(t+1) = f((X_i(t))) \pmod{m}$, where $i = 1, 2, \ldots, n$ can be regarded as spatial sites and $f$ is a function of $(X_i)$, which denotes a set containing $X_i$ and its neighbors. SCGs were found in general to possess a very long period. Statistical and spectral tests on these SCGs show that they are excellent pseudorandom-number generators. The SCGs also have inherent parallel properties and are particularly efficient when implemented on parallel machines.
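The update rule can be made concrete with a small sketch. The abstract does not specify the coupling function f, so the linear nearest-neighbour form used below, the periodic boundary, and all constants (`a`, `b`, `c`, `m`) are assumptions for illustration only. No claim is made about the period or statistical quality of this particular choice.

```python
def scg_step(state, a=1103515245, b=12345, c=1, m=2**31):
    """Advance every site of a spatially coupled generator by one time step.

    Hypothetical coupling: each site combines its own value with the sum of
    its left and right neighbours (periodic boundary), then reduces mod m.
    """
    n = len(state)
    return [(a * state[i] + b * (state[i - 1] + state[(i + 1) % n]) + c) % m
            for i in range(n)]


def scg_stream(seed_state, steps):
    """Yield `steps` successive states, starting after the seed state."""
    state = list(seed_state)
    for _ in range(steps):
        state = scg_step(state)
        yield state
```

Every site reads only the previous values of itself and its two neighbours, so all n sites can be updated simultaneously, one per processor.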

Citations: 0

Improved mesh algorithms for straight line detection

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89432

Y. Pan, Henry Y. H. Chuang

The problem of detecting lines in an image with $N$ edge pixels on mesh-connected computers with $N$ processors is considered. Four efficient algorithms that detect lines by performing a Hough transform are presented. The first algorithm runs in $O(N^{1/2} + n)$ time on a 2-D mesh, where $n$ is the number of $\theta$ values considered. The second algorithm runs in $O((N/n)^{1/2} + n)$ time on a 3-D mesh. The third algorithm runs in $O(\log(N/n) + n)$ time on an augmented mesh. The fourth algorithm runs in $O(n \log N / \log n)$ time on a mesh with a reconfigurable bus. All of the algorithms have smaller time complexities than algorithms in the literature.
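As a point of reference for these bounds, a sequential Hough transform is sketched below. It is not one of the four mesh algorithms. It assumes the common parameterisation rho = x cos(theta) + y sin(theta) with n equally spaced theta values in [0, pi) and rho rounded to the nearest integer. The function name `hough_votes` is hypothetical.

```python
import math
from collections import Counter


def hough_votes(edge_pixels, n_theta):
    """Accumulate Hough votes for a list of (x, y) edge pixels.

    Returns a Counter mapping (theta_index, rho) to the number of votes.
    Sequential cost is O(N * n) for N pixels and n theta values.
    """
    votes = Counter()
    for x, y in edge_pixels:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, rho)] += 1
    return votes
```

Collinear pixels vote for the same (theta, rho) cell, so lines appear as peaks in the accumulator. For instance, five pixels on the vertical line x = 3 all vote for theta index 0 and rho 3.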

Citations: 1

A framework for efficient execution of array-based languages on SIMD computers

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89497

J. Prins

The author presents a framework for supporting efficient execution of machine-independent, array-based, data-parallel languages, such as Fortran-90 and Parallel Pascal, on distributed-memory SIMD (single-instruction-stream, multiple-data-stream) machines with mesh or hypercube interconnection topologies. The framework supports (1) a wide class of mappings of arrays into machines, (2) the implementation of many data selection and reorganization operations by manipulation of data descriptors instead of data movement, and (3) the decomposition of required data motions into sequences of efficient nearest-neighbor communications on the mesh. Each of these is discussed, and an application example is given. Related work is examined.
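Point (2) can be illustrated with a generic offset-and-strides descriptor. This is a sketch of the general technique only. The `Descriptor` class and its fields are hypothetical, and the paper's actual descriptor format and array-to-machine mappings are not given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """Describes a 2-D view onto a flat buffer without copying it."""
    offset: int
    shape: tuple
    strides: tuple

    def index(self, i, j):
        """Position in the flat buffer of logical element (i, j)."""
        return self.offset + i * self.strides[0] + j * self.strides[1]

    def transpose(self):
        """Transpose by swapping shape and strides; no data moves."""
        return Descriptor(self.offset, self.shape[::-1], self.strides[::-1])

    def every_other_row(self):
        """Select rows 0, 2, 4, ... by doubling the row stride."""
        rows, cols = self.shape
        return Descriptor(self.offset, ((rows + 1) // 2, cols),
                          (2 * self.strides[0], self.strides[1]))


def read(buffer, d):
    """Materialise the view described by d as nested lists."""
    return [[buffer[d.index(i, j)] for j in range(d.shape[1])]
            for i in range(d.shape[0])]
```

Transposition and strided selection change only the descriptor, so the underlying buffer is untouched until an operation genuinely requires communication.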

Citations: 12

A parallel architecture for high speed data compression

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89465

J. Storer, J. Reif

The authors discuss textual substitution methods. They present a massively parallel architecture for textual substitution that is based on a systolic pipe of 3839 identical processing elements. The pipe forms what is essentially an associative memory for strings that can learn new strings on the basis of the text processed thus far. The key to the design of this architecture is the formulation of an inherently top-down serial learning strategy as a bottom-up parallel strategy. A custom VLSI chip for this architecture that is capable of operating at 320 Mb/s has passed all simulations and is being fabricated with 1.2-μm double-metal technology.

Citations: 1

Array processors with pipelined optical busses

Pub Date : 1990-10-08 DOI: 10.1109/FMPC.1990.89479

Zicheng Guo, R. Melhem, R. W. Hall, D. Chiarulli, S. Levitan

A synchronous multiprocessor architecture based on pipelined optical bus interconnections is presented. The processors are placed in a square grid and are interconnected to one another through horizontal and vertical optical buses. This architecture has an effective diameter as small as two owing to its orthogonal bus connections, and it allows all processors to have simultaneous access to the buses owing to its capability for pipelining messages. Although the resulting architecture is meshlike and uses bus connections, it has a substantially higher bandwidth than conventional and bus-augmented mesh computers. Moreover, it has a simple control structure and is universal in that various well-known multiprocessor interconnections can be efficiently embedded in it. This architecture appears to be a good candidate for hybrid optical-electronic systems in the next generation of parallel computers.

Citations: 56
