Proceedings of the Fifth Distributed Memory Computing Conference, 1990.

A Connectionist Technique for Data Smoothing
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555377
R. Daniel, K. Teague
Filtering data to remove noise is an important operation in image processing. While linear filters are common, they have serious drawbacks since they cannot discriminate between large and small discontinuities. This is especially serious since large discontinuities are frequently important edges in the scene. However, if the smoothing action is reduced to preserve the large discontinuities, very little noise will be removed from the data. This paper discusses the parallel implementation of a connectionist network that attempts to smooth data without blurring edges. The network operates by iteratively minimizing a non-linear error measure which explicitly models image edges. We discuss the origin of the network and its simulation on an iPSC/2. We also discuss its performance versus the number of nodes, the SNR of the data, and compare its performance with a linear Gaussian filter and a median filter.
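The iterative minimization the abstract describes can be illustrated with a small 1-D sketch: a quadratic data-fidelity term plus a smoothness term that is switched off wherever an explicit edge variable turns on. The energy, parameter names, and thresholding rule below are illustrative assumptions, not the authors' exact network:

```python
import numpy as np

def edge_preserving_smooth(d, lam=2.0, alpha=2.0, iters=300, step=0.05):
    """Gradient descent on E(u) = sum (u_i - d_i)^2 + sum_i w_i (u_{i+1} - u_i)^2,
    where the weight w_i is set to zero across any pair flagged as an edge
    (lam * diff^2 > alpha). Illustrative stand-in for the paper's network."""
    u = d.astype(float).copy()
    for _ in range(iters):
        diff = np.diff(u)                    # neighbor differences u_{i+1} - u_i
        lines = lam * diff**2 > alpha        # edge "on" where smoothing costs more than alpha
        w = lam * (~lines)                   # smoothing weight, 0 across detected edges
        grad = 2.0 * (u - d)                 # data-fidelity term
        grad[:-1] -= 2.0 * w * diff          # smoothness term, left node of each pair
        grad[1:]  += 2.0 * w * diff          # smoothness term, right node of each pair
        u = u - step * grad
    return u
```

On a noisy step signal this removes in-segment noise while the large discontinuity, flagged as an edge, is left unsmoothed.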
Citations: 0
Massively Parallel Fokker-Planck Calculations
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555416
A. Mirin
The package FPPAC [1,2], which computes the nonlinear multispecies Fokker-Planck collision operator for a plasma in two-dimensional velocity space, has been rewritten for the Connection Machine 2. This has involved allocation of variables either to the front end or the CM2, minimization of data flow, and replacement of Cray-optimized algorithms with ones suitable for a massively parallel architecture. Coding has been done utilizing Connection Machine Fortran. Calculations have been carried out on various Connection Machines throughout the country. Results and timings on these machines have been compared to each other and to those on the static memory Cray-2 at the National Magnetic Fusion Energy Computer Center. For large problem size, the Connection Machine 2 is found to be cost-efficient.
Citations: 1
Embedding Meshes into Small Boolean Cubes
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556398
Ching-Tien Ho, S. Johnsson
The embedding of arrays in Boolean cubes, when there are more array elements than nodes in the cube, can always be made with optimal load-factor by reshaping the array to a one-dimensional array. We show the dilation of such an embedding of an ℓ0 × ℓ1 × ⋯ × ℓd−1 array in an n-cube. Dilation-one embeddings can be obtained by splitting each axis into segments and assigning segments to nodes in the cube by a Gray code. The load-factor is optimal if the axis lengths contain sufficiently many powers of two. The congestion is minimized, if the segment lengths along the different axes are as equal as possible, for the cube configured with at most as many axes as the array. A further decrease in the congestion is possible if the array is partitioned into subarrays, and corresponding axes of different subarrays make use of edge-disjoint Hamiltonian cycles within subcubes. The congestion can also be reduced by using multiple paths between pairs of cube nodes, i.e., by using "fat" edges.
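The Gray-code assignment of axis segments can be sketched as follows; `assign_segments` and its layout are hypothetical helper names, but the reflected-binary code is the standard one, so consecutive segments land on cube nodes that differ in a single bit (i.e., on cube neighbors):

```python
def gray(i):
    # standard reflected-binary Gray code of integer i
    return i ^ (i >> 1)

def assign_segments(axis_len, n_segments):
    """Split one array axis into contiguous segments and map segment j to
    cube node gray(j). Adjacent segments then sit on adjacent cube nodes,
    which is what gives a dilation-one embedding along this axis.
    Hypothetical helper for illustration."""
    base, extra = divmod(axis_len, n_segments)
    mapping, start = {}, 0
    for j in range(n_segments):
        length = base + (1 if j < extra else 0)   # near-equal segment lengths
        mapping[gray(j)] = range(start, start + length)
        start += length
    return mapping
```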
Citations: 5
An Input/Output Algorithm for M-Dimensional Rectangular Domain Decompositions on N-Dimensional Hypercube Multicomputers
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556294
H. Embrechts, J.P. Jones
Hypercube-topology concurrent multicomputers owe at least part of their popularity to the fact that it is relatively simple to decompose rectangularly-shaped M-dimensional domains into subdomains and assign these subdomains to processors (PEs) in a manner which preserves the adjacencies of the subdomains. However, this decomposition involves some rearrangement of the data during input/output operations to (linear memory) data acquisition, display, or mass storage devices. We show that this rearrangement can be done efficiently, in parallel. The main consequence of this algorithm is that M-dimensional data can be stored in a simple, general format and yet be communicated efficiently, independent of the dimension of the hypercube or the number of these dimensions assigned to the dimensions of the domain. This algorithm is also relevant to applications with mixed domain decompositions, and to parallel mass storage media such as disk farms.
Citations: 4
A Task Mapping Method for a Hypercube by Combining Subcubes
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556298
S. Horiike
This paper presents a new algorithm for mapping tasks onto a hypercube. Given a weighted task graph, the algorithm finds a good mapping in a reasonable computation time. When the target computer is an n-dimensional cube (n-cube), the proposed algorithm is composed of n stages. It starts from an initial state in which the tasks are mapped onto 2^n 0-cubes. At each stage k (k = 1, 2, ..., n), the task graph is mapped onto 2^(n-k) k-cubes: at the beginning of stage k the tasks have already been mapped onto 2^(n-k+1) (k-1)-cubes, and the tasks are mapped onto k-cubes by combining pairs of (k-1)-cubes. The 2^(n-k) pairs are determined among the (k-1)-cubes, and they are combined so that the mapping onto the k-cubes makes the communication cost as low as possible.
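The pairing step at each stage can be sketched with a greedy heuristic: pair up the current (k-1)-cube assignments so that heavily-communicating pairs end up in the same k-cube. The greedy choice below is an illustrative stand-in for the paper's actual pairing procedure:

```python
def combine_stage(assignments, cost):
    """One stage of the subcube-combining idea. `assignments` is a list of
    task sets (one per (k-1)-cube); `cost(a, b)` is the communication weight
    between two sets. Greedily pairs each set with its heaviest-communicating
    partner, so that traffic stays inside the combined k-cube."""
    remaining = list(range(len(assignments)))
    pairs = []
    while remaining:
        i = remaining.pop(0)
        # pick the partner with the highest inter-set communication weight
        j = max(remaining, key=lambda r: cost(assignments[i], assignments[r]))
        remaining.remove(j)
        pairs.append(assignments[i] | assignments[j])
    return pairs
```

Running the stage n times, halving the number of cubes each time, mirrors the n-stage structure described above.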
Citations: 2
Parallel Discrete Event Simulation Using Synchronized Event Schedulers
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555367
W. Bain
This paper describes a new algorithm for the synchronization of a class of parallel discrete event simulations on distributed-memory parallel computers. Unlike previous algorithms, which synchronize on a per-process basis, this algorithm synchronizes on a per-processor basis. The algorithm allows full generality in the simulation model by allowing dynamic process creation and destruction and full inter-process interconnections, and it is shown to be deadlock- and livelock-free. It has been used to simulate very large parallel computer architectures.
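A minimal sketch of per-processor (rather than per-process) event scheduling: all processes on a processor share one event heap, and processors advance through agreed-upon safe time windows. The window/lookahead rule here is an assumption of this sketch, not necessarily the paper's synchronization algorithm:

```python
import heapq

LOOKAHEAD = 1.0  # assumed minimum delay of any newly generated event

def run(queues, until):
    """Each element of `queues` is one processor's heap of (time, action)
    events; every process on that processor shares the one heap. Processors
    repeatedly agree on a safe window [now, now + LOOKAHEAD) and execute
    their local events inside it; any event an action schedules must be at
    least LOOKAHEAD in the future, so no processor receives a stale event."""
    now = 0.0
    while now < until:
        horizon = now + LOOKAHEAD
        for q in queues:
            while q and q[0][0] < horizon:
                t, action = heapq.heappop(q)
                action(t, queues)   # may heappush new events at times >= t + LOOKAHEAD
        now = horizon
```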
Citations: 1
Basic Matrix Subprograms for Distributed Memory Systems
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555399
A. Elster
Parallel systems are in general complicated to utilize efficiently. As they evolve in complexity, it hence becomes increasingly important to provide libraries and language features that can spare users the knowledge of low-level system details. Our effort in this direction is to develop a set of basic matrix algorithms for distributed memory systems such as the hypercube. The goal is to provide for distributed memory systems an environment similar to that which the Level-3 Basic Linear Algebra Subprograms (BLAS3) provide for sequential and shared memory environments. These subprograms facilitate the development of efficient and portable algorithms that are rich in matrix-matrix multiplication, on which major software efforts such as LAPACK have been built. To demonstrate the concept, some of these Level-3 algorithms are being developed on the Intel iPSC/2 hypercube. Central to this effort is the General Matrix-Matrix Multiplication routine PGEMM. The symmetric and triangular multiplications, rank-k updates (symmetric case), and the solution of triangular systems with multiple right-hand sides are also discussed.
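The blocked structure behind a routine like PGEMM can be illustrated locally: C[i][j] is a sum of block products A[i][k] @ B[k][j], where on a real machine each block would reside on one node and the k-loop would drive the communication. The layout below is an illustrative assumption, not PGEMM's actual data distribution:

```python
import numpy as np

def block_gemm(A_blocks, B_blocks):
    """Blocked matrix-matrix multiply: C[i][j] = sum_k A[i][k] @ B[k][j].
    `A_blocks` and `B_blocks` are 2-D lists of NumPy blocks; here the
    "nodes" are just list entries, standing in for the distributed layout."""
    p = len(A_blocks)        # block rows of A (and C)
    q = len(B_blocks[0])     # block columns of B (and C)
    r = len(B_blocks)        # inner block dimension
    return [[sum(A_blocks[i][k] @ B_blocks[k][j] for k in range(r))
             for j in range(q)] for i in range(p)]
```

Assembling the blocks with `np.block` reproduces the full product, which is a quick way to check any chosen blocking against the unblocked computation.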
Citations: 11
A Re-Configurable Reduced-Bus Multiprocessor Interconnection Network
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556274
T. Ramesh, S. Ganesan
Multiple-bus multiprocessor interconnection networks are still considered a cost-effective and easily expandable processor-memory interconnection. But a fully connected multiple-bus network requires all busses to be connected to each processor and memory module, thus increasing the physical connection and the bus load on each memory. Reduced-bus connections with different connection topologies, such as rhombus, trapezoidal, etc., were presented in [1]. In this paper a general single network topology is presented that can be reconfigured to any one of the reduced-bus connection schemes. The re-configurability is achieved through the arbitration of combinations of simple link switches in a ring structure. The motive behind developing this reconfigurable structure is to offer flexibility in matching the reduced-bus connection schemes to the structure of parallel algorithms. The paper presents a mapping scheme to arbitrate the link switches for various connection patterns. A comparison of the effective memory bandwidth of each connection scheme is shown, and expandability of the system to larger sizes is addressed.
Citations: 2
The 600 Megaflops Performance of the QCD Code on the Mark IIIfp Hypercube
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556389
H. Ding
Citations: 0
Complexity Of Scattering On A Ring Of Processors
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556395
P. Fraigniaud, S. Miguet, Y. Robert
In this paper, we prove that the complexity of scattering in an oriented ring of p processors is (p-1)(β + Lτ), where L is the length of the messages, β the communication startup time, and τ the elemental propagation time.
1. SCATTERING. In a recent paper, Saad and Schultz [SS] study various basic communication kernels in parallel architectures. They point out that interprocessor communication is often one of the main obstacles to increasing the performance of parallel algorithms for multiprocessors. They consider the following data exchange operations: (1) One-to-one: moving data from one processor to another.
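The (p-1)-step claim can be checked with a toy round-counting simulation of a pipelined scatter on an oriented ring, where the source sends the farthest destination's message first (that ordering is an assumption of this sketch); each round then costs β + Lτ:

```python
def simulate_ring_scatter(p):
    """Count the rounds of a pipelined scatter on an oriented ring of p
    processors. Node 0 holds one message per destination 1..p-1, queued
    farthest-first; each round, every node forwards the head of its queue
    one hop to the right. The pipeline drains in exactly p-1 rounds."""
    queues = [[] for _ in range(p)]
    queues[0] = list(range(p - 1, 0, -1))   # destinations, farthest first
    delivered, rounds = set(), 0
    while len(delivered) < p - 1:
        # collect this round's moves first, so all hops happen simultaneously
        moves = [(i, queues[i].pop(0)) for i in range(p) if queues[i]]
        for i, dest in moves:
            nxt = (i + 1) % p
            if nxt == dest:
                delivered.add(dest)
            else:
                queues[nxt].append(dest)
        rounds += 1
    return rounds
```

Multiplying the returned round count by a per-hop cost of β + Lτ reproduces the stated (p-1)(β + Lτ) bound.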
Citations: 6