
[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation: Latest Papers

Concurrent processing with result sharing: model, architecture, and performance analysis
S. Krishnaprasad, B. Shirazi
An efficient computing model, called concurrent processing with result sharing, is introduced. An architecture suitable for executing programs under this model is developed. A performance analysis of this architecture, based on a queuing network model, is presented to investigate the effect of problem dynamics on the speed of problem solving and on the resource requirements. The analysis indicates that, for both coarse- and fine-grain computations, as the amount of recomputation increases, the number of function units needed decreases and the delay at the processor element decreases significantly. For fine-grain computation, a bottleneck at either the matching unit or the instruction store significantly degrades system performance. This can only be avoided by using a more expensive multiple-ring architecture.
DOI: https://doi.org/10.1109/FMPC.1990.89499; published 1990-10-08
Citations: 3
A silicon compiler for massively parallel image processing ASICs
A. Boubekeur, G. Saucier
A silicon compiler design methodology for massively parallel image-processing architectures is introduced. It starts from an algorithmic description of the application in a language comparable to the GAPP NCR language (GAL) and generates an optimized circuit organized as a 2-D array of 1-bit processing elements with minimized resources. The effectiveness of the approach is shown by two examples. The first is an ASIC (application-specific integrated circuit) for two basic mathematical morphology operations, dilation and erosion. The second is an ASIC for convolution. Both have been implemented in a double-aluminium 2-μm CMOS standard cell. In both cases the processor element has been found to be very effective. Considerable area savings have been achieved.
DOI: https://doi.org/10.1109/FMPC.1990.89427; published 1990-10-08
Citations: 2
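The two morphology operations named in the abstract above, dilation and erosion, can be illustrated in software. The following is only a minimal sequential sketch on a binary image, not the generated circuit or the GAL-like source language: the function names, the cross-shaped structuring element `CROSS`, and the rule that out-of-bounds pixels count as 0 are all assumptions made for illustration. Each output pixel depends only on a small neighborhood, which is what makes a 2-D array of 1-bit processing elements a natural fit.

```python
# Hypothetical names; a symmetric cross-shaped structuring element is assumed.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def _hits(img, r, c, se):
    """Yield the pixel value under each structuring-element offset (0 if outside)."""
    h, w = len(img), len(img[0])
    for dr, dc in se:
        rr, cc = r + dr, c + dc
        yield img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0


def dilate(img, se=CROSS):
    """Binary dilation: a pixel becomes 1 if any offset lands on a 1."""
    return [[int(any(_hits(img, r, c, se))) for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img, se=CROSS):
    """Binary erosion: a pixel stays 1 only if every offset lands on a 1."""
    return [[int(all(_hits(img, r, c, se))) for c in range(len(img[0]))]
            for r in range(len(img))]
```

In a per-pixel processor array, the loop over offsets would correspond to nearest-neighbor exchanges, with every pixel evaluated at the same time.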
Large integer multiplication on massively parallel processors
B. Fagin
Results obtained by multiplying large integers using the Fermat number transform are presented. The effectiveness of the approach was previously limited by word-length constraints, which are not a factor with many new computer architectures. A convolution algorithm on a massively parallel processor, based on the Fermat number transform, is presented. Examples of the tradeoffs between modulus, interprocessor communication steps, and input size are given. The application of this algorithm in the multiplication of large integers is then discussed, and performance results on a Connection Machine are reported. The results show multiplication times ranging from about 50 ms for 2-kb integers to 2600 ms for 8-Mb integers.
DOI: https://doi.org/10.1109/FMPC.1990.89434; published 1990-10-08
Citations: 5
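To make the idea in the abstract above concrete, here is a small sketch of integer multiplication through a Fermat number transform. It is an illustration under stated assumptions, not the paper's Connection Machine algorithm: the modulus is fixed to the Fermat prime F_4 = 2^16 + 1, the transform length is 32 (the multiplicative order of 2 modulo F_4), operands are limited to 64 bits split into 4-bit digits, and the transform is written in direct O(N^2) form rather than as fast butterflies. The names `fnt` and `multiply` are hypothetical.

```python
F = 2**16 + 1   # Fermat number F_4, which is prime
N = 32          # transform length: 2**16 = -1 (mod F), so 2 has order 32
BITS = 4        # bits per digit; 16 * 15 * 15 = 3600 < F, so no overflow mod F


def fnt(a, root):
    """Length-N number-theoretic transform mod F (direct O(N^2) form).

    With root = 2, every twiddle factor is a power of two, so hardware
    could replace these multiplications with shifts.
    """
    return [sum(a[j] * pow(root, j * k, F) for j in range(N)) % F
            for k in range(N)]


def multiply(x, y):
    """Multiply two non-negative integers below 2**64 via convolution mod F."""
    assert 0 <= x < 2**64 and 0 <= y < 2**64
    mask = (1 << BITS) - 1
    half = N // 2
    # 16 digits each, zero-padded to 32 so cyclic convolution equals linear.
    dx = [(x >> (BITS * i)) & mask for i in range(half)] + [0] * half
    dy = [(y >> (BITS * i)) & mask for i in range(half)] + [0] * half
    pointwise = [u * v % F for u, v in zip(fnt(dx, 2), fnt(dy, 2))]
    inv_root = pow(2, F - 2, F)   # inverse of 2 mod F (Fermat's little theorem)
    inv_n = pow(N, F - 2, F)      # inverse of N mod F
    conv = [c * inv_n % F for c in fnt(pointwise, inv_root)]
    # Recombine digit products; carries are resolved by the integer sum.
    return sum(c << (BITS * i) for i, c in enumerate(conv))
```

The choice of digit size and length shows the tradeoff the abstract mentions: a larger modulus permits longer transforms or wider digits, and therefore larger inputs.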
Massively parallel auction algorithms for the assignment problem
J. Wein, S. Zenios
Alternative approaches to the massively parallel implementation of D.P. Bertsekas' auction algorithm (see Ann. Oper. Res., vol. 14, pp. 105-123, 1988) on the Connection Machine CM2 are discussed. The most efficient implementation is a hybrid Jacobi/Gauss-Seidel implementation. It exploits two different levels of parallelism and an efficient way of communicating the data between them without the need to perform general router operations across the hypercube network. The implementations are evaluated empirically, solving large, dense problems.
DOI: https://doi.org/10.1109/FMPC.1990.89444; published 1990-10-08
Citations: 35
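For readers unfamiliar with the auction algorithm referenced in the abstract above, the following is a minimal sequential sketch of its Gauss-Seidel form, in which one unassigned person bids at a time. It is not the hybrid Jacobi/Gauss-Seidel CM2 implementation the paper evaluates. The function name, the dense `benefit` matrix layout, and the default `eps = 1/(n+1)` are assumptions; with integer benefits, any `eps` below `1/n` is the standard condition for the final assignment to be optimal.

```python
def auction(benefit, eps=None):
    """Assign n persons to n objects, maximizing total benefit.

    benefit[i][j] is the value of object j to person i (hypothetical layout).
    Returns a list where entry i is the object assigned to person i.
    """
    n = len(benefit)
    if eps is None:
        eps = 1.0 / (n + 1)
    price = [0.0] * n
    owner = [None] * n       # owner[j]: person currently holding object j
    assigned = [None] * n    # assigned[i]: object currently held by person i
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # Net value of each object to person i at current prices.
        values = [benefit[i][j] - price[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        second = max((values[j] for j in range(n) if j != best),
                     default=values[best])
        # Bid: raise the price by the margin over the runner-up, plus eps.
        price[best] += values[best] - second + eps
        if owner[best] is not None:
            assigned[owner[best]] = None
            unassigned.append(owner[best])
        owner[best] = i
        assigned[i] = best
    return assigned
```

In a Jacobi variant, all unassigned persons would compute their bids at once from the same prices, and each object would accept the highest bid; the inner `values` computation is the part that parallelizes across objects.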
Exploitation of fine-grain parallelism in a combinator-based functional system
P. Chu, J. Davis
A scheme to extend the lazy functional language SASL with an eager evaluation operator that allows the programmer to selectively identify expressions to be evaluated eagerly is developed. D.A. Turner's (1979) abstraction and optimization algorithms are then modified so that the eagerness information will propagate through the combinator instruction set to the run-time parallel graph reducer. Simulation of simple benchmark programs shows this method to be very effective in exploiting fine-grain parallelism, even in irregular and unstructured operation. The evaluation is done on a virtual system. Despite the distributive nature of the combinator scheme, it is still unclear how to map the virtual machine into a physical architecture efficiently without seriously degrading the performance.
DOI: https://doi.org/10.1109/FMPC.1990.89500; published 1990-10-08
Citations: 1
Random number generators with inherent parallel properties
T. L. Yu, K. W. Yu
By incorporating the spatial variable into a one-dimensional array of numbers, it is possible to generalize the well-known linear congruential random-number generator (LCG) to the spatially coupled random-number generator (SCG) given by $X_i(t+1) = f(\{X_i(t)\}) \pmod{m}$, where $i = 1, 2, \ldots, n$ can be regarded as spatial sites and $f$ is a function of $\{X_i\}$, which denotes a set containing $X_i$ and its neighbors. It was found that SCGs in general possess a very long period. Statistical and spectral tests on these SCGs show that they are excellent pseudorandom-number generators. The SCGs also have inherent parallel properties and are particularly efficient when implemented on parallel machines.
DOI: https://doi.org/10.1109/FMPC.1990.89433; published 1990-10-08
Citations: 0
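The update rule in the abstract above leaves the coupling function f unspecified. The sketch below picks one purely hypothetical choice so the structure can be seen: f is an affine function of the sum of a site and its two nearest neighbors on a ring. The constants `a`, `c`, `m`, the periodic boundary, and the name `scg_step` are assumptions, and no claim is made that this particular f has the period or statistical quality reported in the paper.

```python
def scg_step(state, a=1103515245, c=12345, m=2**31):
    """One synchronous update of a spatially coupled generator (assumed f).

    Every site i is computed from the OLD values of sites i-1, i, i+1
    (periodic boundary), so all n sites can be updated in parallel.
    """
    n = len(state)
    return [(a * (state[i - 1] + state[i] + state[(i + 1) % n]) + c) % m
            for i in range(n)]
```

Because each new value reads only the previous time step, the list comprehension maps directly onto one processor per site with nearest-neighbor communication.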
A framework for efficient execution of array-based languages on SIMD computers
J. Prins
The author presents a framework for supporting efficient execution of machine-independent, array-based, data-parallel languages, such as Fortran-90 and Parallel Pascal, on distributed-memory SIMD (single-instruction-stream, multiple-data-stream) machines with mesh or hypercube interconnection topologies. The framework supports (1) a wide class of mappings of arrays into machines, (2) the implementation of many data selection and reorganization operations by manipulation of data descriptors instead of data movement, and (3) the decomposition of required data motions into sequences of efficient nearest-neighbor communications on the mesh. Each of these is discussed, and an application example is given. Related work is examined.
DOI: https://doi.org/10.1109/FMPC.1990.89497; published 1990-10-08
Citations: 12
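Point (2) in the abstract above, performing selection and reorganization by manipulating descriptors rather than moving data, can be shown with a small sketch. This is a generic strided-descriptor illustration on a flat Python list, not the framework's actual descriptor format; the class `Descriptor`, its fields, and the helper `materialize` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """Describes a 2-D view onto flat storage (hypothetical layout)."""
    shape: tuple
    strides: tuple
    offset: int = 0

    def index(self, i, j):
        """Flat storage position of logical element (i, j)."""
        return self.offset + i * self.strides[0] + j * self.strides[1]

    def transpose(self):
        """Swap axes by swapping shape and strides; no element is moved."""
        return Descriptor(self.shape[::-1], self.strides[::-1], self.offset)

    def rows(self, start, stop, step=1):
        """Select rows start, start+step, ... below stop; again no data moves."""
        count = len(range(start, stop, step))
        return Descriptor((count, self.shape[1]),
                          (self.strides[0] * step, self.strides[1]),
                          self.offset + start * self.strides[0])


def materialize(data, d):
    """Read the view out as nested lists (only needed when data must move)."""
    return [[data[d.index(i, j)] for j in range(d.shape[1])]
            for i in range(d.shape[0])]
```

For a 3x4 array stored row-major, `Descriptor((3, 4), (4, 1)).transpose()` yields a 4x3 view over the same storage, so a transpose followed by a row selection costs two descriptor updates and zero copies.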
Improved mesh algorithms for straight line detection
Y. Pan, Henry Y. H. Chuang
The problem of detecting lines in an image with $N$ edge pixels on mesh-connected computers with $N$ processors is considered. Four efficient algorithms that detect lines by performing a Hough transform are presented. The first algorithm runs in $O(N^{1/2} + n)$ time on a 2-D mesh, where $n$ is the number of $\theta$ values considered. The second algorithm runs in $O((N/n)^{1/2} + n)$ time on a 3-D mesh. The third algorithm runs in $O(\log(N/n) + n)$ time on an augmented mesh. The fourth algorithm runs in $O(n \log N / \log n)$ time on a mesh with a reconfigurable bus. All of the algorithms have smaller time complexities than algorithms in the literature.
DOI: https://doi.org/10.1109/FMPC.1990.89432; published 1990-10-08
Citations: 1
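As background for the bounds quoted above, here is the Hough transform in its plain sequential form, which casts one vote per edge pixel per angle (N times n votes in total). It is a baseline sketch only and does not reproduce any of the four mesh algorithms. The function name, the rounding of rho to the nearest integer, and the angle grid `theta_k = pi * k / n_theta` are assumptions.

```python
import math
from collections import Counter


def hough_lines(edge_pixels, n_theta=180):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta).

    edge_pixels is an iterable of (x, y) pairs; the result maps
    (rho, k) to a vote count, where theta = pi * k / n_theta.
    """
    votes = Counter()
    for x, y in edge_pixels:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, k)] += 1
    return votes
```

A bin whose count is close to the number of pixels on a line marks a detected line; for example, ten pixels on the horizontal line y = 5 all vote for `(5, 90)` when `n_theta` is 180.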
A parallel architecture for high speed data compression
J. Storer, J. Reif
The authors discuss textual substitution methods. They present a massively parallel architecture for textual substitution that is based on a systolic pipe of 3839 identical processing elements that forms what is essentially an associative memory for strings that can learn new strings on the basis of the text processed thus far. The key to the design of this architecture is the formulation of an inherently top-down serial learning strategy as a bottom-up parallel strategy. A custom VLSI chip for this architecture that is capable of operating at 320 Mb/s has passed all simulations and is being fabricated with 1.2-μm double-metal technology.
DOI: https://doi.org/10.1109/FMPC.1990.89465; published 1990-10-08
Citations: 1
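The abstract above describes a dictionary of strings that grows from the text processed so far. The paper's exact learning rule is not given here, so the sketch below substitutes the well-known LZ78 scheme as a stand-in for dynamic-dictionary substitution in general; it is serial software, not the systolic design. The function names are hypothetical.

```python
def lz78_compress(text):
    """Encode text as (dictionary index, next character) pairs (LZ78)."""
    dictionary = {"": 0}
    out = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch          # keep extending the longest known match
        else:
            out.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)   # learn a new string
            current = ""
    if current:
        out.append((dictionary[current], ""))            # flush the tail
    return out


def lz78_decompress(pairs):
    """Rebuild the text; the decoder learns the same strings in the same order."""
    entries = [""]
    parts = []
    for idx, ch in pairs:
        s = entries[idx] + ch
        parts.append(s)
        entries.append(s)
    return "".join(parts)
```

The lookup `current + ch in dictionary` is the associative-memory step; in hardware, each processing element could hold one dictionary entry and compare it against the input stream as it passes.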
Array processors with pipelined optical busses
Zicheng Guo, R. Melhem, R. W. Hall, D. Chiarulli, S. Levitan
A synchronous multiprocessor architecture based on pipelined optical bus interconnections is presented. The processors are placed in a square grid and are interconnected to one another through horizontal and vertical optical buses. This architecture has an effective diameter as small as two owing to its orthogonal bus connections, and it allows all processors to have simultaneous access to the buses owing to its capability for pipelining messages. Although the resulting architecture is meshlike and uses bus connections, it has a substantially higher bandwidth than conventional and bus-augmented mesh computers. Moreover, it has a simple control structure and is universal in that various well-known multiprocessor interconnections can be efficiently embedded in it. This architecture appears to be a good candidate for hybrid optical-electronic systems in the next generation of parallel computers.
DOI: https://doi.org/10.1109/FMPC.1990.89479; published 1990-10-08
Citations: 56
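The "effective diameter as small as two" claim above follows from the orthogonal buses: any destination can be reached with at most one row-bus transfer followed by one column-bus transfer. The sketch below only counts those transfers. It does not model optical pipelining, timing, or bus arbitration, and the function name and the row-then-column order are assumptions.

```python
def route(src, dst):
    """Return the bus transfers needed to move a message from src to dst.

    src and dst are (row, column) grid positions. Each transfer is
    (bus kind, bus index, position reached).
    """
    (r1, c1), (r2, c2) = src, dst
    hops = []
    if c1 != c2:
        hops.append(("row", r1, (r1, c2)))   # ride row bus r1 to column c2
    if r1 != r2:
        hops.append(("col", c2, (r2, c2)))   # ride column bus c2 to row r2
    return hops
```

For comparison, a conventional mesh without buses needs a number of nearest-neighbor steps equal to the sum of the row and column distances, which grows with the grid size.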