
Proceedings 11th International Parallel Processing Symposium: Latest Articles

Alias analysis for Fortran90 array slices
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580967
K. Gopinath, R. Seshadri
Most alias analyses produce approximate results in the presence of array slices. This can lead to inefficient code, a particular concern in languages like Fortran90. The authors present an overview of a static alias analysis that gives accurate results in the presence of array slices in Fortran90.
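To make the precision issue concrete, here is a minimal Python sketch of why slice-aware reasoning matters. It is not the authors' analysis: it simply enumerates the elements of two Fortran90 triplet sections (`lo:hi:stride`) of the same array and checks whether they share an element. The function names and the example sections are hypothetical.

```python
def section_indices(lo, hi, stride=1):
    """Indices selected by a triplet lo:hi:stride (inclusive upper bound, positive stride assumed)."""
    return range(lo, hi + 1, stride)


def sections_overlap(a, b):
    """True if two triplet sections of the same array share at least one element."""
    return not set(section_indices(*a)).isdisjoint(section_indices(*b))


# A(1:9:2) selects odd elements and A(2:10:2) selects even elements.
# Their bounding ranges 1..9 and 2..10 intersect, so a bounds-only test
# would conservatively report an alias, yet no element is shared.
odd = (1, 9, 2)
even = (2, 10, 2)
block = (4, 6, 1)  # A(4:6) contains element 5, which A(1:9:2) also selects

print(sections_overlap(odd, even))
print(sections_overlap(odd, block))
```

A static analysis cannot enumerate elements when bounds are symbolic, so this brute-force check only illustrates the question being asked, not how to answer it at compile time.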
Citations: 1
The sparse cyclic distribution against its dense counterparts
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580969
G. Bandera, M. Ujaldón, M. A. Trenas, E. Zapata
Several methods have been proposed in the literature for the distribution of data on distributed memory machines, oriented to either dense or sparse structures. Many real applications, however, deal with both kinds of data jointly. The paper presents techniques for integrating dense and sparse array accesses in a way that optimizes locality and further allows an efficient loop partitioning within a data-parallel compiler. The approach is evaluated through an experimental survey with several compilers and parallel platforms. The results prove the benefits of the BRS sparse distribution when combined with CYCLIC in mixed algorithms and the poor efficiency achieved by well-known distribution schemes when sparse elements arise in the source code.
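The locality argument can be illustrated with a small Python sketch of a cyclic (round-robin) mapping. This is an assumption-laden illustration, not the paper's BRS scheme, whose details the abstract does not give: it only shows that if the nonzeros of a sparse matrix are placed by the same cyclic rule as a dense vector, row `i` of the sparse data and element `i` of the dense data land on the same processor. All names and the sample data are hypothetical.

```python
def cyclic_owner(i, num_procs):
    """Owner of 0-based index i under a cyclic (round-robin) distribution."""
    return i % num_procs


def distribute_nonzeros(nonzeros, num_procs):
    """Group (row, col, value) triples by the cyclic owner of their row index."""
    local = {p: [] for p in range(num_procs)}
    for row, col, value in nonzeros:
        local[cyclic_owner(row, num_procs)].append((row, col, value))
    return local


# Dense vector of length 6 on 3 processors.
dense_owner = [cyclic_owner(i, 3) for i in range(6)]

# Hypothetical sparse matrix given as coordinate triples.
sparse = [(0, 4, 1.5), (1, 1, 2.0), (3, 0, -1.0), (5, 2, 4.0)]
local = distribute_nonzeros(sparse, 3)

for proc, entries in local.items():
    print(proc, entries)
```

With this placement, a loop that combines row `i` of the sparse matrix with element `i` of the dense vector needs no communication for that pairing.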
Citations: 6
On privatization of variables for data-parallel execution
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580952
Manish Gupta
Privatization of data is an important technique that has been used by compilers to parallelize loops by eliminating storage-related dependences. When a compiler partitions computations based on the ownership of data, selecting a proper mapping of privatizable data is crucial to obtaining the benefits of privatization. This paper presents a novel framework for privatizing scalar and array variables in the context of a data-driven approach to parallelization. We show that there are numerous alternatives available for mapping privatized variables and the choice of mapping can significantly affect the performance of the program. We present an algorithm that attempts to preserve parallelism and minimize communication overheads. We also introduce the concept of partial privatization of arrays that combines data partitioning and privatization, and enables efficient handling of a class of codes with multi-dimensional data distribution that was not previously possible. Finally, we show how the ideas of privatization apply to the execution of control flow statements as well. An implementation of these ideas in the pHPF prototype compiler for High Performance Fortran on the IBM SP2 machine has shown impressive results.
Citations: 24
A BSP approach to the scheduling of tightly-nested loops
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580954
R. Calinescu
This paper addresses the scheduling of uniform-dependence loop nests within the framework of the bulk-synchronous parallel (BSP) model. Two broad classes of tightly-nested loops are identified in the paper and scheduled according to the BSP discipline, and the resulting schedules are analysed in terms of the BSP cost model.
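For readers unfamiliar with the BSP cost model mentioned above, the following Python sketch evaluates the commonly used form of it: a superstep costs `w + h*g + l`, where `w` is the largest local computation, `h` the largest number of words any processor sends or receives, `g` the communication gap and `l` the barrier latency. Variants of this formula exist and the abstract does not say which one the paper uses, so treat this as an assumption. The function names and the sample numbers are hypothetical.

```python
def superstep_cost(work, sent, received, g, l):
    """Cost of one superstep: max local work + h-relation cost + barrier latency."""
    w = max(work)
    h = max(max(sent), max(received))  # largest fan-out or fan-in of any processor
    return w + h * g + l


def program_cost(supersteps, g, l):
    """Total cost of a BSP program as the sum of its superstep costs."""
    return sum(
        superstep_cost(s["work"], s["sent"], s["received"], g, l)
        for s in supersteps
    )


# Hypothetical two-processor schedule with two supersteps.
schedule = [
    {"work": [100, 120], "sent": [10, 5], "received": [5, 10]},
    {"work": [80, 80], "sent": [0, 20], "received": [20, 0]},
]

print(program_cost(schedule, g=4, l=50))
```

Comparing two candidate schedules of the same loop nest then amounts to comparing their `program_cost` values for the target machine's `g` and `l`.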
Citations: 8
A parallel priority data structure with applications
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580979
G. Brodal, J. Träff, C. Zaroliagis
Presents a parallel priority data structure that improves the running time of certain algorithms for problems that lack a fast and work-efficient parallel solution. As a main application, we give a parallel implementation of Dijkstra's (1959) algorithm which runs in O(n) time while performing O(m log n) work on a CREW PRAM. This is a logarithmic factor improvement for the running time compared with previous approaches. The main feature of our data structure is that the operations needed in each iteration of Dijkstra's algorithm can be supported in O(1) time.
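As background for the bounds quoted above, here is the textbook sequential form of Dijkstra's algorithm in Python, using a binary heap. It is not the parallel data structure from the paper. It only shows the per-iteration operations (select the closest unsettled vertex, then relax its outgoing edges) that the paper's structure is said to support in O(1) time each, so that the n iterations take O(n) time overall. The graph and its weights are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source; graph maps a vertex to (neighbor, weight) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)  # select the closest unsettled vertex
        if u in settled:
            continue  # stale heap entry left over from an earlier relaxation
        settled.add(u)
        for v, w in graph.get(u, []):  # relax every edge leaving u
            candidate = d + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 7)],
    "d": [],
}

print(dijkstra(graph, "a"))
```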
Citations: 25
Empirical evaluation of distributed mutual exclusion algorithms
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580904
S. Fu, N. Tzeng, Zhiyuan Li
We evaluate various distributed mutual exclusion algorithms on the IBM SP2 machine and the Intel iPSC/860 system. The empirical results are compared in terms of such criteria as the number of message exchanges and the response time. Our results indicate that the Star algorithm (M.L. Neilsen and M. Mizuno, 1991) achieves the shortest response time in most cases among all the algorithms on a small to medium-sized system, when processors request the critical section many times before involving any barrier synchronization. On the other hand, if every processor enters the critical section only once before encountering a barrier, the improved Ring algorithm (S.S. Fu and N.-F. Tzeng, 1995) is found to outperform others under a heavy load; but the Star algorithm and the CSL algorithm (Y.I. Chang et al., 1990) prevail when the request rate becomes light. The best solution to mutual exclusion in distributed memory systems is determined by how participating sites generate their mutual exclusion requests.
Citations: 16
Evaluating the performance of software distributed shared memory as a target for parallelizing compilers
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580943
A. Cox, S. Dwarkadas, Honghui Lu, W. Zwaenepoel
In this paper we evaluate the use of software distributed shared memory (DSM) on a message passing machine as the target for a parallelizing compiler. We compare this approach to compiler-generated message passing, hand-coded software DSM and hand-coded message passing. For this comparison, we use six applications: four that are regular and two that are irregular. Our results are gathered on an 8-node IBM SP/2 using the TreadMarks software DSM system. We use the APR shared-memory (SPF) compiler to generate the shared-memory programs and the APR XHPF compiler to generate message passing programs. The hand-coded message passing programs run with the IBM PVMe optimized message passing library. On the regular programs, both the compiler-generated and the hand-coded message passing outperform the SPF/TreadMarks combination: the compiler-generated message passing by 5.5% to 40%, and the hand-coded message passing by 7.5% to 49%. On the irregular programs, the SPF/TreadMarks combination outperforms the compiler-generated message passing by 38% and 89%, and only slightly underperforms the hand-coded message passing, differing by 4.4% and 16%. We also identify the factors that account for the performance differences, estimate their relative importance, and describe methods to improve the performance.
Citations: 44
Geometric data structures on a reconfigurable mesh, with applications
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580983
A. Datta
We present several geometric data structures and algorithms for problems for a planar set of rectangles and bipartitioning problems for a point set in two dimensions on a reconfigurable mesh of size $n \times n$. The problems for rectangles include computing the measure, contour perimeter and maximum clique for the union of a set of rectangles. The bipartitioning problems for a two dimensional point set are solved in the $L_\infty$ and $L_1$ metrics. We solve all these problems in O(log n) time.
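To clarify what "the measure" of a union of rectangles refers to, the following Python sketch computes it sequentially, assuming the usual meaning: the total area covered by at least one rectangle, with overlaps counted once. It uses simple coordinate compression and is unrelated to the reconfigurable-mesh algorithm of the paper. The rectangle format `(x1, y1, x2, y2)` and the sample data are hypothetical.

```python
def union_area(rects):
    """Area covered by axis-parallel rectangles given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    xs = sorted({x for x1, _, x2, _ in rects for x in (x1, x2)})
    ys = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})
    area = 0
    # Every elementary cell between consecutive coordinates is either fully
    # covered or fully uncovered, so it suffices to test each cell once.
    for xa, xb in zip(xs, xs[1:]):
        for ya, yb in zip(ys, ys[1:]):
            covered = any(
                x1 <= xa and xb <= x2 and y1 <= ya and yb <= y2
                for x1, y1, x2, y2 in rects
            )
            if covered:
                area += (xb - xa) * (yb - ya)
    return area


# Two 2x2 squares overlapping in a 1x1 square.
rects = [(0, 0, 2, 2), (1, 1, 3, 3)]
print(union_area(rects))
```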
Citations: 2
Adaptive fault-tolerant wormhole routing algorithms for hypercube and mesh interconnection networks
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580923
Jau-Der Shih
The author presents adaptive fault-tolerant deadlock-free routing algorithms for hypercubes and meshes by using only 3 virtual channels and 2 virtual channels respectively. Based on the concept of unsafe nodes, the author designs a routing algorithm for hypercubes that can tolerate at least n-1 node faults and can route a message via a path of length no more than the Hamming distance between the source and destination plus four. The author also develops a routing algorithm for meshes that can tolerate any block faults, as long as the distance between any two nodes in different faulty blocks is at least 2 in each dimension.
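The path-length guarantee for hypercubes can be read off node labels directly. The Python sketch below assumes nodes of an n-dimensional hypercube are labeled with n-bit integers (the abstract does not spell this out), computes the Hamming distance between two labels, and derives the stated upper bound of that distance plus four. It also builds a fault-free minimal route by correcting differing bits in ascending order, only as a baseline. It does not implement the fault-tolerant algorithm itself, and all names are hypothetical.

```python
def hamming_distance(src, dst):
    """Number of bit positions in which two node labels differ."""
    return bin(src ^ dst).count("1")


def path_length_bound(src, dst):
    """Upper bound on route length stated in the abstract: Hamming distance plus four."""
    return hamming_distance(src, dst) + 4


def minimal_path(src, dst, n):
    """Fault-free minimal route in an n-cube: flip differing bits from dimension 0 upward."""
    path = [src]
    current = src
    for bit in range(n):
        if ((current ^ dst) >> bit) & 1:
            current ^= 1 << bit
            path.append(current)
    return path


src, dst = 0b0000, 0b1011
print(hamming_distance(src, dst))
print(path_length_bound(src, dst))
print(minimal_path(src, dst, n=4))
```

In a fault-free cube the minimal route uses exactly as many hops as the Hamming distance, so the bound says that detours around faults cost at most four extra hops.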
Citations: 16
Wide-sense nonblocking Clos networks under packing strategy
Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580844
Yuanyuan Yang, Jianchao Wang
In this paper, we study wide-sense nonblocking conditions under packing strategy for the three-stage Clos network, or $v(m, n, r)$ network. Wide-sense nonblocking networks are generally believed to have lower network cost than strictly nonblocking networks. However, the analysis of the wide-sense nonblocking conditions is usually more difficult. Moore proved that a $v(m, n, 2)$ network is nonblocking under packing strategy if the number of middle stage switches $m \geq [\frac{3}{2} n]$. This result has been widely cited in the literature, and is even considered as the wide-sense nonblocking condition under packing strategy for the general $v(m, n, r)$ networks in some papers. In fact, it is still not known whether the condition $m \geq [\frac{3}{2} n]$ holds for $v(m, n, r)$ networks when $r \geq 3$. In this paper, we introduce a systematic approach to the analysis of wide-sense nonblocking conditions under packing strategy for general $v(m, n, r)$ networks with any $r$ values. We first translate the problem of finding the necessary and sufficient nonblocking conditions for $v(m, n, r)$ networks to a set of linear programming problems. We then solve this special type of linear programming problems and obtain an elegant closed form optimum solution. We prove that the necessary and sufficient condition for a $v(m, n, r)$ network to be nonblocking under packing strategy is $m \geq [(2 - 1/F_{2r-1}) n]$, where $F_{2r-1}$ is the Fibonacci number. We believe that the systematic approach developed in this paper can be used for analyzing other wide-sense nonblocking control strategies as well.
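The closed-form condition is easy to tabulate. The Python sketch below computes the coefficient 2 - 1/F(2r-1) exactly, assuming the Fibonacci indexing F(1) = F(2) = 1. That indexing is an assumption, chosen because it makes the r = 2 case reproduce the 3/2 coefficient quoted for v(m, n, 2) networks. The bracket around the bound in the abstract (some form of rounding) is deliberately not applied here, since the abstract does not make its meaning clear. Function names are hypothetical.

```python
from fractions import Fraction


def fibonacci(k):
    """k-th Fibonacci number with F(1) = F(2) = 1 (indexing assumed, see lead-in)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def packing_coefficient(r):
    """Exact value of 2 - 1/F(2r-1), the factor multiplying n in the stated condition."""
    return 2 - Fraction(1, fibonacci(2 * r - 1))


def unrounded_bound(n, r):
    """(2 - 1/F(2r-1)) * n before any rounding is applied."""
    return packing_coefficient(r) * n


for r in range(2, 6):
    print(r, packing_coefficient(r), float(packing_coefficient(r)))
```

The coefficient rises from 3/2 toward 2 as r grows, which suggests that the packing strategy saves fewer middle-stage switches in networks with more input and output switches.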
Citations: 45