
[1993] Proceedings Seventh International Parallel Processing Symposium: latest articles

Autonomous parallel heuristic combinatorial search
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262789
Chao-Chun Wang, L. Jamieson
Heuristic search is the process of searching a state space under the guidance of an evaluation function. Most research on parallelizing heuristic search algorithms has emphasized system problems such as load balancing and reduction in memory use. A theoretical analysis of a new autonomous parallel heuristic search algorithm is introduced. Rather than simply dividing the search space among the processors, the processors share information that monitors the progress of the search and use consensus to limit the amount of time spent in expanding nodes that are not on the optimal path. Each processor uses a different admissible heuristic function, and it is shown that the expected number of nodes generated by each processor in the course of the search is reduced by a factor that reflects the consensus among the processors. The asynchronous behavior of the algorithm eliminates synchronization delays.
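To make the terms in this abstract concrete, here is a minimal single-processor A* sketch in Python. It is a textbook illustration of search guided by an evaluation function f = g + h with an admissible heuristic h; it is not the authors' parallel algorithm, and the grid world, the function names, and the two heuristics are assumptions introduced only for this example.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Return (cost of a cheapest path or None, number of nodes expanded)."""
    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, state)
    best_g = {start: 0}
    expanded = 0
    while open_heap:
        _, g, state = heapq.heappop(open_heap)
        if g > best_g.get(state, float("inf")):
            continue  # stale heap entry, a cheaper route was found later
        if state == goal:
            return g, expanded
        expanded += 1
        for nxt, step in neighbors(state):
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None, expanded

def grid_neighbors(size):
    """Hypothetical state space: a size x size grid with unit-cost moves."""
    def neighbors(state):
        x, y = state
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                yield (nx, ny), 1
    return neighbors

def manhattan(goal):
    """Admissible on this grid: never overestimates the remaining cost."""
    return lambda s: abs(s[0] - goal[0]) + abs(s[1] - goal[1])

def zero(_state):
    """Trivially admissible heuristic; A* then behaves like uniform-cost search."""
    return 0

goal = (4, 4)
for name, h in (("manhattan", manhattan(goal)), ("zero", zero)):
    cost, expanded = a_star((0, 0), goal, grid_neighbors(5), h)
    print(name, cost, expanded)
```

Both heuristics are admissible, so both runs return the same optimal cost; how many nodes each expands depends on the heuristic and on tie-breaking.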
Citations: 1
Multiple message broadcasting in the postal model
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262831
A. Bar-Noy, S. Kipnis
Broadcasting is a widely used operation in many message-passing systems. Most existing broadcasting algorithms, however, do not address several emerging trends in distributed-memory parallel computers and high-speed communication networks. These trends include (i) treating the system as a fully connected collection of processors, (ii) packetizing large data into sequences of messages, and (iii) tolerating communication latencies. This paper explores the broadcasting problem in the postal model that addresses these issues. The authors provide two algorithms for broadcasting $m$ messages in a message-passing system with $n$ processors and communication latency $\lambda$. A lower bound on the time for this problem is $(m-1)+f_\lambda(n)$, where $f_\lambda(n)$ is the optimal time for broadcasting one message. They present algorithm PARTITION that takes at most $2m+f_\lambda(n)+O(\lambda)$ time, and algorithm D-D-TREES that takes at most $m+2f_\lambda(n)+O(\lambda)$ time.
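The bounds quoted in this abstract can be evaluated numerically once $f_\lambda(n)$ is known. The Python sketch below assumes the usual postal-model recurrence for integer $\lambda \ge 1$, which is not stated in the abstract: the number of informed processors is 1 for $t < \lambda$ and $N(t) = N(t-1) + N(t-\lambda)$ afterwards, with $f_\lambda(n)$ the first $t$ at which $N(t) \ge n$. The function names are hypothetical, and the $O(\lambda)$ terms are left out because their constants are not given.

```python
def single_broadcast_time(lam, n):
    """f_lam(n): first time t at which at least n processors hold the message.

    Assumes integer latency lam >= 1 and the recurrence
    N(t) = 1 for t < lam, N(t) = N(t-1) + N(t-lam) otherwise.
    """
    counts = [1]  # counts[t] = processors informed by time t
    while counts[-1] < n:
        t = len(counts)
        counts.append(1 if t < lam else counts[t - 1] + counts[t - lam])
    return len(counts) - 1

def broadcast_bounds(m, n, lam):
    """Leading terms of the bounds in the abstract (O(lam) terms omitted)."""
    f = single_broadcast_time(lam, n)
    return {
        "lower_bound": (m - 1) + f,
        "partition_leading": 2 * m + f,
        "d_d_trees_leading": m + 2 * f,
    }

print(broadcast_bounds(m=10, n=8, lam=2))
```

With $\lambda = 1$ the recurrence doubles the informed set every step, so $f_1(n)$ reduces to $\lceil \log_2 n \rceil$. Comparing the leading terms, PARTITION looks preferable when $f_\lambda(n)$ is large relative to $m$, and D-D-TREES when $m$ dominates.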
Citations: 34
Parallel algorithms for hypercube allocation
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262864
Yeimkuan Chang, L. Bhuyan
Parallel algorithms for hypercube allocation strategies are considered. Although the sequential algorithms of various hypercube allocation strategies are easier to implement, their worst-case time complexities increase exponentially with the dimension of the hypercube. The authors show that the free processors can be used to perform the allocation jobs in parallel to improve the efficiency of the hypercube allocation algorithms. A modified parallel algorithm for the single Gray-Code (GC) strategy is proposed and is shown to recognize more subcubes than the single GC strategy by using the binary reflected Gray code and the inverse binary reflected Gray code, without increasing the execution time. Two algorithms for a complete subcube recognition system are also presented and shown to be more efficient and attractive than the sequential one currently used in the hypercube multiprocessor.
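As background for the Gray-code strategy named in this abstract, the Python sketch below generates the binary reflected Gray code and lists the candidate subcubes a Gray-code scan would inspect. The window rule used here (a run of $2^k$ cyclically consecutive codes starting at a multiple of $2^{k-1}$) is the commonly described form of the GC strategy and is an assumption, not something the abstract states; the function names are hypothetical, and the authors' parallel and inverse-Gray-code variants are not reproduced.

```python
def gray(i):
    """i-th codeword of the binary reflected Gray code."""
    return i ^ (i >> 1)

def is_subcube(nodes):
    """True if the node addresses fill a subcube: 2**j distinct nodes varying in exactly j bits."""
    nodes = set(nodes)
    all_and = all_or = next(iter(nodes))
    for v in nodes:
        all_and &= v
        all_or |= v
    varying_bits = bin(all_and ^ all_or).count("1")
    return len(nodes) == 1 << varying_bits

def gc_windows(n, k):
    """Candidate k-subcubes of an n-cube under the assumed GC window rule (1 <= k <= n)."""
    size = 1 << n
    step = 1 << (k - 1)
    return [
        [gray((start + j) % size) for j in range(1 << k)]
        for start in range(0, size, step)
    ]

# Every window in a 4-cube is a genuine subcube, for every subcube dimension k.
for k in range(1, 5):
    assert all(is_subcube(w) for w in gc_windows(4, k))
print([format(gray(i), "03b") for i in range(8)])
```

Under this rule a scan recognizes $2^{n-k+1}$ windows of dimension $k$, which is fewer than the total number of $k$-subcubes in the cube; that gap is the reason strategies that combine several codes can recognize more subcubes.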
Citations: 5
The connection cubes: symmetric, low diameter interconnection networks with low node degree
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262892
Nitin K. Singhvi
The enhanced connection cube (ECC) and the minimal connection cube (MCC), proposed in this paper, are regular and symmetric static interconnection networks for large-scale, loosely coupled systems. The ECC connects $2^{2n+1}$ processing nodes with only $n+2$ links per node, almost half the number used in a comparable hypercube. Yet its diameter is only $n+2$, almost half that of the hypercube. The MCC connects $2^{2n+1}$ nodes using only $n+1$ links per node, has about the same diameter as a hypercube and is scalable like the hypercube. The MCC can be converted into the ECC by adding one more link per node. Both networks can emulate all the connections present in a hypercube of the same size, with no increase in routing complexity, so that typical parallel applications run on both types of CCs with the same time complexity as on a hypercube.
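The "almost half" comparison in this abstract is simple arithmetic, sketched below in Python. The ECC and MCC figures are taken directly from the abstract; the hypercube figures use the standard fact that a hypercube with $2^d$ nodes has degree $d$ and diameter $d$. The function name is hypothetical, and the MCC diameter is omitted because the abstract gives only "about the same" as the hypercube.

```python
def compare_networks(n):
    """Degree and diameter for 2**(2n+1) nodes, per the abstract's figures."""
    dim = 2 * n + 1  # dimension of the hypercube with the same node count
    return {
        "nodes": 2 ** dim,
        "hypercube": {"degree": dim, "diameter": dim},
        "ECC": {"degree": n + 2, "diameter": n + 2},
        "MCC": {"degree": n + 1},  # exact diameter not given in the abstract
    }

for n in (3, 5, 10):
    print(n, compare_networks(n))
```

For $n = 10$ this gives about two million nodes with 12 links per node in the ECC, against 21 in the hypercube of the same size.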
Citations: 3
An analytical model for wormhole routing in multicomputer interconnection networks
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262804
W. Guan, W. Tsai, D. Blough
The communication performance of the interconnection network is critical in a multicomputer system. Wormhole routing is known to be more efficient than traditional circuit switching and packet switching. To evaluate wormhole routing, a queueing-theoretic analysis is used. This paper presents a general analytical model for wormhole routing based on very basic assumptions. The model is used to evaluate the routing delays in hypercubes and meshes. The calculated delays are compared against those obtained from simulations, and these comparisons show that the model is reasonably accurate.
Citations: 49
Selection, routing, and sorting on the star graph
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262802
S. Rajasekaran, David S. L. Wei
The authors consider the problems of selection, routing and sorting on an $n$-star graph (with $n!$ nodes), an interconnection network which has been proven to possess many special properties. They identify a tree-like subgraph (a '$(k, 1, k)$ chain network') of the star graph which enables them to design efficient algorithms for these problems. They present an algorithm that performs a sequence of $n$ prefix computations in $O(n^2)$ time. This algorithm is used as a subroutine in other algorithms. In addition they offer an efficient deterministic sorting algorithm that runs in $(n^3 \log n)/2$ steps. They also show that sorting can be performed on the $n$-star graph in time $O(n^3)$ and that selection of a set of uniformly distributed $n$ keys can be performed in $O(n^2)$ time with high probability. Finally, they also present a deterministic (non-oblivious) routing algorithm that realizes any permutation in $O(n^3)$ steps on the $n$-star graph.
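For readers unfamiliar with the topology in this abstract, the Python sketch below builds the $n$-star graph under its standard definition, which the abstract does not spell out: nodes are the permutations of $n$ symbols, and two nodes are adjacent when one is obtained from the other by swapping the first symbol with the symbol in some other position. The function names are hypothetical. The breadth-first search confirms the $n!$ node count and measures the diameter for small $n$; it does not implement any of the paper's selection, sorting, or routing algorithms.

```python
from collections import deque

def star_neighbors(perm):
    """Neighbors of a node: swap the first symbol with the symbol at position i."""
    out = []
    for i in range(1, len(perm)):
        p = list(perm)
        p[0], p[i] = p[i], p[0]
        out.append(tuple(p))
    return out

def reach_and_eccentricity(n):
    """BFS from the identity permutation; returns (nodes reached, largest distance)."""
    source = tuple(range(n))
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in star_neighbors(node):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return len(dist), max(dist.values())

for n in (3, 4, 5):
    print(n, reach_and_eccentricity(n))
```

Each node has degree $n-1$, and because the star graph is vertex-symmetric, the eccentricity of the identity permutation equals the diameter.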
Citations: 36
Complexity of intensive communications on balanced generalized hypercubes
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262914
J. Antonio, L. Lin, R. C. Metzger
Lower bound complexities are derived for three intensive communication patterns assuming a balanced generalized hypercube (BGHC) topology. The BGHC is a generalized hypercube that has exactly $w$ nodes along each of the $d$ dimensions for a total of $w^d$ nodes. A BGHC is said to be dense if the $w$ nodes along each dimension form a complete directed graph. A BGHC is said to be sparse if the $w$ nodes along each dimension form a unidirectional ring. It is shown that a dense $N$-node BGHC with a node degree equal to $K\log_2 N$, where $K \ge 2$, can process certain intensive communication patterns $K(K-1)$ times faster than an $N$-node binary hypercube (which has a node degree equal to $\log_2 N$). Furthermore, a sparse $N$-node BGHC with a node degree equal to $\frac{1}{L}\log_2 N$, where $L \ge 2$, is $2^L$ times slower at processing certain intensive communication patterns than an $N$-node binary hypercube.
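The dense and sparse definitions in this abstract can be made concrete by enumerating a node's outgoing links. The Python sketch below labels each node with $d$ base-$w$ digits; that addressing, the function names, and the choice to count only outgoing links as the node degree are assumptions made for illustration and may differ from the paper's conventions.

```python
import math

def bghc_out_neighbors(node, w, dense):
    """Outgoing neighbors of a node given as a tuple of d digits in range(w)."""
    out = []
    for dim, digit in enumerate(node):
        if dense:
            # complete directed graph along the dimension: link to every other digit
            targets = [x for x in range(w) if x != digit]
        else:
            # unidirectional ring along the dimension: link to the successor only
            targets = [(digit + 1) % w]
        for x in targets:
            out.append(node[:dim] + (x,) + node[dim + 1:])
    return out

def degree_ratio(w, d, dense):
    """Out-degree divided by log2(N), where N = w**d."""
    degree = len(bghc_out_neighbors((0,) * d, w, dense))
    return degree / (d * math.log2(w))

print(degree_ratio(w=4, d=2, dense=False))  # corresponds to 1/L in the abstract
print(degree_ratio(w=8, d=2, dense=True))   # corresponds to K in the abstract
```

Under these assumptions the sparse out-degree is $d = \log_2 N / \log_2 w$, so $L = \log_2 w$ and the $2^L$ slowdown quoted in the abstract equals $w$; the dense out-degree is $d(w-1)$, which gives $K = (w-1)/\log_2 w$.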
Citations: 7
Delay analysis in synchronous circuit-switched delta networks
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262801
A. Bhattacharya, R. R. Rao, Ting-Ting Y. Lin
Multistage interconnection networks (MINs) provide a cost-effective alternative to a full crossbar connection for processor-processor or processor-memory communication in a tightly coupled multiprocessor system. Delta networks, a class of blocking type MIN with unique path property, have been studied extensively for their self-routing capability. A probabilistic analysis of the blocking and its effect on the delay is presented here, for such a network operated in a synchronous circuit-switched mode. Under the assumption of uniformly distributed access requests independently generated at each unblocked source, an upper bound on the expected latency has been established. The bound has been compared with simulation results.
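The unique-path and self-routing properties mentioned in this abstract, and the blocking they lead to, can be illustrated on a small example. The Python sketch below uses an omega network built from 2x2 switches, one well-known member of the delta class, with destination-tag routing; the choice of network, the function names, and the first-come-first-granted arbitration are assumptions for illustration, not the model analyzed in the paper.

```python
def route(src, dst, n):
    """Output line used after each of the n stages on the unique src -> dst path.

    Each stage applies a perfect shuffle (rotate the n-bit label left) and then
    the switch sets the low bit to the next destination bit, most significant first.
    """
    mask = (1 << n) - 1
    pos = src
    path = []
    for stage in range(n):
        bit = (dst >> (n - 1 - stage)) & 1
        pos = ((pos << 1) | bit) & mask
        path.append(pos)
    return path

def circuit_switch(requests, n):
    """Grant requests in order; a request is blocked if any link on its path is held.

    Assumes every source appears at most once in `requests`.
    """
    busy = set()
    granted = []
    for src, dst in requests:
        links = list(enumerate(route(src, dst, n)))  # (stage, output line) pairs
        if any(link in busy for link in links):
            continue  # blocked: circuit switching needs the whole path at once
        busy.update(links)
        granted.append((src, dst))
    return granted

print(route(5, 3, 3))
print(circuit_switch([(0, 0), (4, 1), (1, 7)], 3))
```

Sources 0 and 4 enter the same first-stage switch, and both requests need its upper output, so the second one is blocked even though the two destinations differ. Blocking of this kind is what a delay analysis of such a network has to account for.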
Citations: 5
Scheduling independent tasks on partitionable hypercube multiprocessors
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262866
B. Narahari, Ramesh Krishnamurti
A partitionable hypercube allows simultaneous execution of multiple tasks, where each task can be executed on a choice of subcubes. This paper considers the problem of static nonpreemptive scheduling of $w$ independent tasks on an $n$-processor partitionable hypercube system to minimize the overall finishing time of the $w$ tasks. Each task can be executed on subcubes of different sizes, with smaller execution times on larger subcubes. A schedule determines the size of the subcube to be assigned to each task and schedules these tasks on the processors in the hypercube system. The problem of finding the optimal schedule, with minimum finishing time, is known to be NP-hard. This paper presents a fast polynomial-time approximation algorithm for the problem, and derives a tight worst-case performance bound of 2 for the algorithm.
Citations: 6
KSR1 multiprocessor: analysis of latency hiding techniques in a sparse solver
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262832
D. Windheiser, E. Boyd, E. Hao, S. Abraham, E. Davidson
This paper analyzes and evaluates some novel latency hiding features of the KSR1 multiprocessor: prefetch and poststore instructions and automatic updates. As a case study, the authors analyze the performance of an iterative sparse solver which generates irregular communications. They show that automatic updates significantly reduce the amount of communication. Although prefetch and poststore instructions reduce the coherence miss ratios, they do not significantly improve the sparse solver performance due to the overhead in executing these instructions.
Citations: 30